date:20061110

Nutch and Lucene

2006-11-10 Thread hzhong

Hello, this is what I want to do: given a document, find all of its terms and their frequencies. I understand that Nutch is built on top of Lucene. In Lucene, I can access a document's terms and their frequencies through the IndexReader. In Nutch, however, I am not sure whether there is an equivalent.
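To make the goal in this question concrete, here is a minimal sketch of the result being asked for: a map from each term of one document to its frequency. It is plain Java and does not call Lucene or Nutch; the class name `TermFrequencies`, the method `count`, and the lowercase, split-on-non-alphanumerics tokenization are all assumptions made for illustration, not what either project's analyzers do.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene or Nutch.
class TermFrequencies {

    /** Returns each term of the text with the number of times it occurs. */
    static Map<String, Integer> count(String text) {
        // TreeMap keeps the terms sorted, which makes the output easy to read.
        Map<String, Integer> freqs = new TreeMap<>();
        // Assumed tokenization: lowercase, split on anything that is not a letter or digit.
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (term.isEmpty()) {
                continue; // split() yields an empty first element when the text starts with a delimiter
            }
            freqs.merge(term, 1, Integer::sum);
        }
        return freqs;
    }

    public static void main(String[] args) {
        String doc = "Nutch is built on top of Lucene. Lucene indexes terms.";
        count(doc).forEach((term, n) -> System.out.println(term + "\t" + n));
    }
}
```

An index-backed approach would read these counts from the index instead of re-tokenizing the text, which is the part the question asks about.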

Re: Nutch and Lucene

2006-11-10 Thread Andrzej Bialecki

hzhong wrote: Hello, this is what I want to do: given a document, find all of its terms and their frequencies. I understand that Nutch is built on top of Lucene. In Lucene, I can access a document's terms and their frequencies through the IndexReader. In Nutch, however, I am not sure if there's

[jira] Commented: (NUTCH-395) Increase fetching speed

2006-11-10 Thread Sami Siren (JIRA)

[ http://issues.apache.org/jira/browse/NUTCH-395?page=comments#action_12448795 ] Sami Siren commented on NUTCH-395: Have you measured what made the biggest impact on performance: changes to Metadata, or changes to IO in FetcherOutput? did

RE: implement thai language analyzer in nutch

2006-11-10 Thread Teruhiko Kurosaka

Oh, Thai words are not space-delimited? OK, in that case, you'd need to study how ThaiAnalyzer works and then modify the rules in NutchAnalysis.jj (if you are going to use the web search GUI from Nutch). This is because the search expressions are parsed by the parser generated from
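As a rough illustration of why text that is not space-delimited needs its own segmentation step, the sketch below splits a string at the word boundaries reported by the JDK's `java.text.BreakIterator` for the Thai locale. The class name `ThaiSegments` and the sample string are assumptions for illustration; this is not the ThaiAnalyzer code or the NutchAnalysis.jj grammar, and the boundaries found depend on the locale data shipped with the Java runtime.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene or Nutch.
class ThaiSegments {

    /** Splits text at the word boundaries the JDK reports for the Thai locale. */
    static List<String> segments(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.forLanguageTag("th"));
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        // Walk consecutive boundaries; each [start, end) span is one segment.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample Thai text with no spaces between words (assumed example input).
        for (String s : segments("ภาษาไทยไม่มีช่องว่างระหว่างคำ")) {
            System.out.println("[" + s + "]");
        }
    }
}
```

Splitting on whitespace alone would return this whole sample as a single token, which is the problem the reply describes.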

