Nutch 0.9 not loading plugins (sorry very long)

2006-11-08 Thread zzcgiacomini
Hi everybody, sorry to come back to this issue with such a long mail, but I really can't get my plugin loaded. I have read and applied the suggestions given in various previous postings on this list, but I still have not got any results. Well, basically I have used part of the code written for the

Re: implement thai lanaguage analyzer in nutch

2006-11-08 Thread Arun Kaundal
Sanjeev, You have implemented the Thai language, right? What other changes have you made to the original code? Do I need to make the same changes for, say, the Hindi and Punjabi languages? If you have a bit of time to explain things, it will be of great help to me. Thank you ./Arun On 11/8/06, sanjeev

Re: Nutch 0.9 not loading plugins (sorry very long)

2006-11-08 Thread zzcgiacomini
Sorry, in my previous posting the output of nutch readseg -get was wrong. Here is the actual output: -Corrado SegmentReader: get 'http://testmachine.test.net/index.html' Content:: Version: 2 url: http://testmachine.test.net/index.html base: http://testmachine.test.net/index.html contentType:

Re: implement thai lanaguage analyzer in nutch

2006-11-08 Thread sanjeev
Arun, I tried implementing Thai search for Nutch. I followed the steps outlined in this tutorial for Chinese: http://issues.apache.org/jira/browse/NUTCH-36?page=comments#action_62153 So sorry - I am not able to help much. How urgent is your requirement? Mine is very urgent as I have to get

RE: implement thai lanaguage analyzer in nutch

2006-11-08 Thread Teruhiko Kurosaka
Sanjay, I don't think you should follow the Chinese example and extend the CJK range. This was needed because Chinese and Japanese don't use spaces to separate words. I believe Thai uses spaces, right? If so, you should extend the LETTER range to include Thai characters rather than CJK. Another place
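
As a rough illustration of what "Thai characters" means in terms of code points: the Unicode Thai block runs from U+0E00 to U+0E7F, so that is the range a LETTER definition would have to cover. The sketch below only checks block membership using the JDK; the class and method names are made up for this example and are not part of Nutch or Lucene.

```java
public class ThaiLetterCheck {
    // True if the character lies in the Unicode Thai block (U+0E00..U+0E7F).
    static boolean isThai(char c) {
        return Character.UnicodeBlock.of(c) == Character.UnicodeBlock.THAI;
    }

    // Hypothetical helper: true if any character of the text is in the Thai block.
    static boolean containsThai(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (isThai(text.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Plain ASCII text contains no Thai characters.
        System.out.println(containsThai("nutch"));            // false
        // Thai text written with Unicode escapes to avoid source-encoding issues.
        System.out.println(containsThai("\u0E44\u0E17\u0E22")); // true
    }
}
```

A check like this only says which characters belong to the script; it does not decide where one word ends and the next begins.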

Re: implement thai lanaguage analyzer in nutch

2006-11-08 Thread ogjunk-nutch
Regarding Thai, there is a Thai Analyzer in Lucene already: $ ll contrib/analyzers/src/java/org/apache/lucene/analysis/th/ total 24 drwxrwxr-x 7 otis otis 4096 Oct 27 02:08 .svn/ -rw-rw-r-- 1 otis otis 1528 Jun 5 14:27 ThaiAnalyzer.java -rw-rw-r-- 1 otis otis 2437 Jun 5 14:27

why can't build in the Linux with ant

2006-11-08 Thread kauu
Hi, I have a problem now: I can't build Nutch on Linux with ant. My ant version is Apache Ant version 1.5.2-20 compiled on September 25 2003. The error is below. Has anyone had the same problem? I need your help. Buildfile: build.xml BUILD FAILED

Re: implement thai lanaguage analyzer in nutch

2006-11-08 Thread sanjeev
ok. I downloaded the LuceneInAction code examples from the book and found there were some analyzers and tests/demos which included Chinese. But these analyzers were standalone Java programs with a main method. My question is how to integrate them into Nutch so the index created by the crawl process can

Re: implement thai lanaguage analyzer in nutch

2006-11-08 Thread sanjeev
ok Kuro - you are wrong about the Thai language having spaces between words. Thai doesn't have spaces between words, and segmenting Thai is a bit tricky, methinks. I will appreciate any/all help you can give me. Cheers, sanjeev sanjeev wrote: ok. I downloaded the LuceneInAction code examples
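
Because Thai is written without spaces between words, a whitespace tokenizer is not enough. One option available in the JDK itself is java.text.BreakIterator with a Thai locale. The sketch below is only an illustration under that assumption: the class and method names are invented for this example, the quality of the boundaries depends on the Thai data shipped with the JDK in use, and nothing here is Nutch's own mechanism.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ThaiSegmentSketch {
    // Splits text at the word boundaries reported by the JDK's BreakIterator
    // for the Thai locale. Every character of the input ends up in some segment.
    static List<String> segment(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.forLanguageTag("th"));
        it.setText(text);
        List<String> parts = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            parts.add(text.substring(start, end));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Thai sample text written with Unicode escapes; the exact segments
        // printed depend on the JDK's Thai break data.
        String sample = "\u0E20\u0E32\u0E29\u0E32\u0E44\u0E17\u0E22";
        for (String part : segment(sample)) {
            System.out.println(part);
        }
    }
}
```

The segments could then be fed to an indexer as tokens, though whitespace and punctuation segments would normally be filtered out first.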

Re: implement thai lanaguage analyzer in nutch

2006-11-08 Thread sanjeev
Arun, No, I haven't come anywhere near the solution. I am a little confused myself. From what I've learnt, one approach is to use NutchAnalysis.jj and compile it using javacc. Another is to download the dev version of Nutch and try to use the patches for the language analyzer and identifier. I failed