Re: The best way forward

jt oob Tue, 04 Nov 2003 04:14:01 -0800

Thank you for the replies!

My indexes currently look like they will be about 12GB when the
current run finishes.


I have spotted a tool on the Lucene site for listing the most
frequently occurring words in the index. Currently I am using the
default Analyzer stoplist; I should probably use a more comprehensive
list.

Is there a way of applying a stoplist after the index has been
created, removing all occurrences of the new stoplist words?
I could then write a new Analyzer with the new stoplist for adding new
documents to the index.
Am I doomed to reindexing with a better stoplist?
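
To make the idea concrete, here is a minimal sketch, independent of
Lucene, of building a stoplist from the most frequent terms and then
filtering new text with it. It is plain Python rather than Lucene code;
the function names (tokenize, stoplist_candidates, remove_stopwords) and
the sample sentences are hypothetical, chosen only for illustration, and
a real Analyzer would tokenize differently.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stoplist_candidates(texts, top_n):
    """Return the top_n most frequent terms across the given texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return [term for term, _ in counts.most_common(top_n)]


def remove_stopwords(text, stoplist):
    """Tokenize text and drop every token that is in the stoplist."""
    stop = set(stoplist)
    return [tok for tok in tokenize(text) if tok not in stop]


# Hypothetical sample documents standing in for already-indexed text.
sample = [
    "the index is on the disk",
    "the kernel caches the index",
]

# The two most frequent terms become the candidate stoplist.
candidates = stoplist_candidates(sample, 2)

# New text is filtered with that stoplist before it would be indexed.
filtered = remove_stopwords("the kernel reads the index", candidates)
```

This only covers documents added from now on; it says nothing about
stripping terms from an index that already exists, which is the part I
am unsure about.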


In view of the index size, I am going to see how well the kernel
caching performs, as the index probably won't fit entirely into memory
once the operating system and other system processes have taken their
bite of the available memory.

Eventually I am going to try to implement something similar to Google
Groups, indexing lots of NNTP traffic. Has anyone done this before with
Lucene?


Thanks again,
jt


