Modifying Nutch Indexer

2006-11-07 Thread Javier P. L.
Hi, I need to modify the Nutch Indexer class because it would be very useful for me to add some fields to the generated Lucene index. I experimented and found that it is possible to add fields to the Document with doc.addField() in the reduce function. My point is that for those fields I need the

Re: implement thai language analyzer in nutch

2006-11-07 Thread kauu
I think you should learn JavaCC, then understand NutchAnalysis.jj, and then the Thai problem will be resolved soon. Just try it. On 11/7/06, sanjeev [EMAIL PROTECTED] wrote: Hello, After playing around with Nutch for a few months I was trying to implement the Thai language analyzer for Nutch.

Re: Modifying Nutch Indexer

2006-11-07 Thread Enis Soztutar
Javier P. L. wrote: Hi, I need to modify the Nutch Indexer class because it would be very useful for me to add some fields to the generated Lucene index. I experimented and found that it is possible to add fields to the Document with doc.addField() in the reduce function. My point is that for

[jira] Updated: (NUTCH-389) a url tokenizer implementation for tokenizing index fields : url and host

2006-11-07 Thread Enis Soztutar (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-389?page=all ] Enis Soztutar updated NUTCH-389: Attachment: urlTokenizer-improved.diff This is an improvement and a minor bug fix over the previous url tokenizer. This version first replaces characters,

[jira] Commented: (NUTCH-393) Indexer doesn't handle null documents returned by filters

2006-11-07 Thread Enis Soztutar (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-393?page=comments#action_12447787 ] Enis Soztutar commented on NUTCH-393: Also, IndexingException is caught by the Indexer, in which case the whole document is not added to the writer (the

[jira] Created: (NUTCH-397) porting clustering-carrot2 plugin to carrot2 v2.0

2006-11-07 Thread JIRA
porting clustering-carrot2 plugin to carrot2 v2.0. Key: NUTCH-397 URL: http://issues.apache.org/jira/browse/NUTCH-397 Project: Nutch Issue Type: Improvement Reporter: Doğacan

[jira] Commented: (NUTCH-393) Indexer doesn't handle null documents returned by filters

2006-11-07 Thread Eelco Lempsink (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-393?page=comments#action_12447939 ] Eelco Lempsink commented on NUTCH-393: I'm not sure I agree with that. After running a document through a set of filters you'd expect all filters to have run. If

Re: implement thai language analyzer in nutch

2006-11-07 Thread sanjeev
Oh btw - I followed the Chinese tutorial and was able to compile, and everything was fine. Let me just test whether it is working properly - however, I didn't make any changes to NutchAnalysis.jj. I need more information, please. Thanks a bunch.

Re: implement thai language analyzer in nutch

2006-11-07 Thread Arun Kaundal
Hi sanjeev and Kauu, I want to support Hindi, a language widely spoken in India. Can you guide me on what else I need to modify? I think there is no support for searching and indexing Hindi. I want to work on this, but I need some information about what to modify and where exactly