Doug Cutting wrote:

Kevin A. Burton wrote:

So is it possible to fix this index now? Can I just delete the most recent segment that was created? I can find this by ls -alt


Sorry, I forgot to answer your question: this should work fine. I don't think you should even have to delete that segment.

I'm worried about duplicate or missing content from the original index. I'd rather rebuild the index and waste another 6 hours (I've probably blown 100 hours of CPU time on this already) and have a correct index :)


During an optimize I assume Lucene starts writing to a new segment and leaves all others in place until everything is done and THEN deletes them?

Also, to elaborate on my previous comment, a mergeFactor of 5000 not only delays the work until the end, but it also makes the disk workload more seek-dominated, which is not optimal.

The only settings I uses are:

targetIndex.mergeFactor=10;
targetIndex.minMergeDocs=1000;

the resulting index has 230k files in it :-/

I assume this is contributing to all the disk seeks.

So I suspect a smaller merge factor, together with a larger minMergeDocs, will be much faster overall, including the final optimize(). Please tell us how it goes.

This is what I did for this last round but then I ended up with the highly fragmented index.

hm...

Thanks for all the help btw!

Kevin

--

Please reply using PGP.

http://peerfear.org/pubkey.asc NewsMonster - http://www.newsmonster.org/
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster



--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to