Re: slow solr facet processing

Ere Maijala Mon, 04 Sep 2017 07:00:38 -0700

Toke Eskildsen wrote on 4.9.2017 at 13.38:

On Mon, 2017-09-04 at 13:21 +0300, Ere Maijala wrote:

Thanks for the insight, Yonik. I can confirm that #2 is true. I ran


<optimize maxSegments="1" waitSearcher="true"/>

and after it completed I was able to retrieve 2000 values in 17ms.


Very interesting. Is this on spinning disks or SSD? Is your index data
cached in memory? What I am aiming at is if this is primarily a "many
relatively slow random access"-thing or more due to the way DocValues
are represented in the segments (the codec).

I indexed a few million new/changed records, and the performance is back to slow. The upside is that I can test again with a slow server.

It's spinning disks on a SAN, and the full index doesn't fit into memory. I don't see any IO wait, and repeated attempts are just as slow even though I would have thought the relevant parts would be cached in memory. During testing and reporting the results I've always discarded the very first requests since they're always slower than subsequent repeats due to there being another test index on the same server. Maybe worth noting is that while there's no IO wait, there is fairly high CPU usage for Solr's Java process, hovering around 100% if I repeat the request in a loop.
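A minimal sketch of this kind of repeat-and-discard measurement, assuming Python. The names `time_requests`, `summarize`, `warmup` and `fetch` are hypothetical and not taken from the actual test setup; `fetch` stands in for whatever callable issues the facet request to Solr.

```python
import statistics
import time


def time_requests(fetch, repeats=10, warmup=2):
    """Call fetch() warmup + repeats times and return the kept timings in ms.

    The first `warmup` calls are executed but their timings are discarded,
    mirroring the habit of ignoring the slow initial requests.
    """
    timings_ms = []
    for i in range(warmup + repeats):
        start = time.perf_counter()
        fetch()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if i >= warmup:
            timings_ms.append(elapsed_ms)
    return timings_ms


def summarize(timings_ms):
    """Return (min, median, max) of the kept timings in milliseconds."""
    return min(timings_ms), statistics.median(timings_ms), max(timings_ms)
```

If the median stays high across repeats while IO wait is absent, that would be consistent with the cost being CPU-bound rather than disk-bound.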


I took a quick sample with VisualVM, and the top hotspots are:

org.apache.solr.search.facet.UnInvertedField.getCounts()
    32.079956   7,356 ms (32.1%)   7,356 ms   7,655 ms   7,655 ms
org.apache.lucene.util.PriorityQueue.downHeap()
    30.232546   6,932 ms (30.2%)   6,932 ms   6,932 ms   6,932 ms
org.apache.lucene.index.MultiTermsEnum.pushTop()
    11.628195   2,666 ms (11.6%)   2,666 ms   11,177 ms   11,177 ms
org.apache.lucene.index.MultiTermsEnum$TermMergeQueue.fillTop()
    9.079571   2,082 ms (9.1%)   2,082 ms   2,082 ms   2,082 ms
org.apache.lucene.store.ByteBufferGuard.getBytes()
    4.176216   957 ms (4.2%)   957 ms   957 ms   957 ms
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.next()
    2.6867974   616 ms (2.7%)   616 ms   616 ms   616 ms
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermLeaf()
    1.393562   319 ms (1.4%)   319 ms   319 ms   319 ms
org.apache.lucene.util.fst.ByteSequenceOutputs.read()
    1.2111844   277 ms (1.2%)   277 ms   277 ms   277 ms


(sorry if that looks bad in the email)

I'm building another index on a higher-end server that can load the full index to memory and will retest with that. But note that this index has docValues disabled, as facet.method=uif seems to only cause trouble if docValues are enabled.
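For illustration only, a facet request using facet.method=uif could be assembled along these lines in Python. The base URL, the field name and the helper `facet_url` are hypothetical, and facet.limit=2000 is an assumption based on the 2000 values mentioned earlier in the thread; the exact parameters of the real test requests are not given here.

```python
from urllib.parse import urlencode


def facet_url(base_url, field, limit=2000, method="uif"):
    """Build a select URL that only asks for facet counts on one field."""
    params = {
        "q": "*:*",              # match all documents
        "rows": 0,               # no documents, only facet counts
        "facet": "true",
        "facet.field": field,    # hypothetical field name supplied by caller
        "facet.limit": limit,    # assumed: number of facet values to return
        "facet.method": method,  # "uif" as discussed in this thread
        "wt": "json",
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Example with placeholder values (not the real server or field):
url = facet_url("http://localhost:8983/solr/mycollection", "topic_facet")
```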


--Ere

--
Ere Maijala
Kansalliskirjasto / The National Library of Finland
