Hello Luke,
Have you changed the default values of the indexing-related parameters? It helped in my case - yesterday I indexed a ~3.5 million page segment; indexing took 3.5 hours and optimization took 10 minutes. I am using Linux (ext3) on an AMD Opteron 2.2GHz with SCSI drives.
I am using (probably not the best values - but they work well enough for me):
indexer.mergeFactor=30 indexer.minMergeDocs=10000 indexer.maxMergeDocs=10000000
I am passing -Xmx2g to the JVM (Java 1.5.02, 64-bit). P.
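For reference, a sketch of how those three overrides would look in a nutch-site.xml file (assuming the <nutch-conf> root element used by the Nutch 0.x configuration layout; the property names and values are the ones quoted above):

```xml
<?xml version="1.0"?>
<!-- Sketch: local overrides in conf/nutch-site.xml.
     Higher mergeFactor / minMergeDocs trade memory for fewer,
     larger merges, which speeds up the final optimize. -->
<nutch-conf>
  <property>
    <name>indexer.mergeFactor</name>
    <value>30</value>
  </property>
  <property>
    <name>indexer.minMergeDocs</name>
    <value>10000</value>
  </property>
  <property>
    <name>indexer.maxMergeDocs</name>
    <value>10000000</value>
  </property>
</nutch-conf>
```

Note that raising minMergeDocs buffers more documents in RAM before flushing a segment, so a larger JVM heap (e.g. the -Xmx setting mentioned above) is usually needed alongside it.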
Luke Baker wrote:
Hey,
Is there some sort of optimal or maximum segment size? I have a segment with 3.9 million records and it appears to be taking a really long time to index. The index process has been optimizing the index for over a week. The server I'm running it on is a dual Xeon 3.0 GHz with 2GB of RAM. I've done 2 million page segments before, and the optimizing has taken about 48 hours.
Would a truncated segment cause the optimizing process to take a really long time? I would guess that the optimizing process would just be manipulating the index that already has been created and that nothing in the segment itself would cause the optimizing part to take a really long time.
I have confirmed that the process is still running and modifying files in the index directory. Would the underlying filesystem play any role in all this? I'm using ext3.
Thanks,
Luke
