Tuning DirectUpdateHandler2.addDoc

Trym R. Møller Tue, 02 Oct 2012 05:03:30 -0700

Hi

I have been profiling SolrCloud while indexing into a sharded, non-replicated collection, because indexing slows down when the index files (*.fdt) grow to a couple of GB (the largest is about 3.5GB).

When profiling for a couple of minutes I see that most of the time is spent in the DirectUpdateHandler2.addDoc method (called about 8000 times). Its time is spent in UpdateLog.lookupVersion, VersionInfo.getVersionFromIndex and SolrIndexSearcher.lookupId (called about 6000 times), which in turn spends its time in AtomicReader.termDocsEnums, which is called about 530.000 times and takes about 770.000 ms.

Is it true that the reason "AtomicReader.termDocsEnums" is being called 530.000/6000 =~ 90 times per "SolrIndexSearcher.lookupId" call is that I have on average 90 "term"-files?
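
As a quick check of the arithmetic behind this question, here is a minimal sketch in Python that uses only the profiler figures quoted above. It reads "530.000" and "770.000" as 530 thousand and 770 thousand, and it assumes the calls are spread evenly across lookups, which the profile does not confirm. The variable names are illustrative only.

```python
# Profiler figures quoted in the post (thousands written out in full).
lookup_id_calls = 6_000            # SolrIndexSearcher.lookupId
term_docs_enum_calls = 530_000     # AtomicReader.termDocsEnums
term_docs_enum_total_ms = 770_000  # total time reported for those calls

# Average number of termDocsEnums calls made per lookupId call.
enum_calls_per_lookup = term_docs_enum_calls / lookup_id_calls

# Average cost of a single termDocsEnums call, in milliseconds.
ms_per_enum_call = term_docs_enum_total_ms / term_docs_enum_calls

# Average time one lookupId call spends inside termDocsEnums.
ms_per_lookup = term_docs_enum_total_ms / lookup_id_calls

print(f"termDocsEnums calls per lookupId: {enum_calls_per_lookup:.1f}")
print(f"ms per termDocsEnums call:        {ms_per_enum_call:.2f}")
print(f"ms per lookupId call:             {ms_per_lookup:.1f}")
```

With these figures the averages come out at roughly 88 calls per lookup, about 1.45 ms per call, and about 128 ms per lookupId call, so both the number of calls and the cost of each call contribute to the slowdown.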

Can I do anything to lower this number of "term"-files?

I'm running several cores on my SolrCloud instance. Is there any way I can lower the time spent in each "AtomicReader.termDocsEnums" method call (this seems to be much faster when I don't have so many documents in my collection/shard)?


Thanks as always.

Best regards Trym
