Hi

I have been profiling SolrCloud while indexing into a sharded, non-replicated collection, because indexing slows down when the index files (*.fdt) grow to a couple of GB (the largest is about 3.5 GB).

When profiling for a couple of minutes, I see that most of the time is spent in the DirectUpdateHandler2.addDoc method (called about 8,000 times). Its time is spent in UpdateLog.lookupVersion, VersionInfo.getVersionFromIndex and SolrIndexSearcher.lookupId (called about 6,000 times), which in turn spends its time in AtomicReader.termDocsEnums, called about 530,000 times and taking about 770,000 ms in total.

Is it true that the reason "AtomicReader.termDocsEnums" is called 530,000 / 6,000 =~ 90 times per "SolrIndexSearcher.lookupId" call is that I have on average 90 "term"-files?
Can I do anything to lower this number of "term"-files?
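
To make the arithmetic explicit, here is a rough sketch in Python using the approximate figures above. The variable and function names are my own, and it assumes the calls are spread evenly over the lookups, which may not hold in practice.

```python
# Approximate figures read off the profiler run described above.
LOOKUP_ID_CALLS = 6_000            # SolrIndexSearcher.lookupId
TERM_DOCS_ENUM_CALLS = 530_000     # AtomicReader.termDocsEnums
TERM_DOCS_ENUM_TOTAL_MS = 770_000  # total time reported for those calls


def per_call_stats(lookups, enum_calls, enum_total_ms):
    """Return (enum calls per lookup, ms per enum call, ms per lookup).

    Assumes the enum calls are evenly distributed over the lookups.
    """
    calls_per_lookup = enum_calls / lookups
    ms_per_enum_call = enum_total_ms / enum_calls
    ms_per_lookup = enum_total_ms / lookups
    return calls_per_lookup, ms_per_enum_call, ms_per_lookup


calls_per_lookup, ms_per_enum_call, ms_per_lookup = per_call_stats(
    LOOKUP_ID_CALLS, TERM_DOCS_ENUM_CALLS, TERM_DOCS_ENUM_TOTAL_MS
)

print(f"termDocsEnums calls per lookupId: {calls_per_lookup:.1f}")
print(f"ms per termDocsEnums call:        {ms_per_enum_call:.2f}")
print(f"ms per lookupId call:             {ms_per_lookup:.1f}")
```

So the question is really about two factors: the roughly 90 calls per lookup, and the time each of those calls takes.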

I'm running several cores on my SolrCloud instance. Is there any way I can lower the time spent in each "AtomicReader.termDocsEnums" method call (this seems to be much faster when I don't have so many documents in my collection/shard)?

Thanks as always.

Best regards,
Trym
