These are the tests I re-ran:

1) Boolean queries with filters (range queries) on a tdouble field (precisionStep=8). Solr 4.10.4 and 5.2.1 are the official releases, 5.3.0-Exitable is a snapshot of the 5.x branch, and 5.3.0-Not-Exitable is the same snapshot but without wrapping the reader in ExitableDirectoryReader [1]:
http://snag.gy/CClfF.jpg

2) The same queries, but with the filters on "double" fields instead (precisionStep=0):
http://snag.gy/jKsrV.jpg

I created https://issues.apache.org/jira/browse/SOLR-7875 and https://issues.apache.org/jira/browse/SOLR-7876

[1] The code I changed in SolrIndexSearcher for the test:

  private static DirectoryReader wrapReader(SolrCore core, DirectoryReader reader) throws IOException {
    assert reader != null;
    DirectoryReader wrapped = UninvertingReader.wrap(reader,
        core.getLatestSchema().getUninversionMap(reader));
    // Test-only switch: skip the ExitableDirectoryReader wrapping when the
    // solr.disableQueryTimeout system property is set to true.
    boolean disableExitableDirectory = Boolean.getBoolean("solr.disableQueryTimeout");
    if (disableExitableDirectory) {
      return wrapped;
    }
    return ExitableDirectoryReader.wrap(wrapped, SolrQueryTimeoutImpl.getInstance());
  }
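(To get the "Not-Exitable" behavior with this patch, the property just needs to reach the JVM as a system property, e.g. "bin/solr start -Dsolr.disableQueryTimeout=true", or a line like SOLR_OPTS="$SOLR_OPTS -Dsolr.disableQueryTimeout=true" in solr.in.sh. The property only exists because of the wrapReader change above; it is not a stock Solr option.)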
On Tue, Aug 4, 2015 at 10:13 AM, Tomás Fernández Löbbe <[email protected]> wrote:
> The stack traces I shared are also from the EC2 instances, using Amazon Linux. I can also include a test on Mac OS if you are interested in the results.
> These are some more tests I ran using 5.3 (r1693977). "Exitable" is with the ExitableDirectoryReader wrap, "Not-Exitable" skips the wrapping of the reader completely in SolrIndexSearcher.
>
> Boolean queries with filters on the tdouble fields "tlongitude" and "tlatitude":
> http://snag.gy/1zjUy.jpg
> The difference between the two is very small. 5.3 looks better in performance than 5.2.1 (probably the fix that Adrien previously mentioned), but I ran the test from a different network/location, so the network latency may be different. I'll run the test against 5.2.1 and 4.10.4 again and update those results. Here is that preliminary graph:
> http://snag.gy/jR43Z.jpg
>
> The "ExitableDirectory" vs. "Not ExitableDirectory" comparison did have a huge difference when running the range queries with the double field (precisionStep=0):
> http://snag.gy/frW8f.jpg
>
> On Tue, Aug 4, 2015 at 1:43 AM, Uwe Schindler <[email protected]> wrote:
>> Hi,
>>
>> I just asked if maybe the stack dumps were from his MacOSX local development machine. Otherwise the time difference between 4.10 and 5.2 on MacOSX should be much larger - just as a cross check!
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: [email protected]
>>
>> > -----Original Message-----
>> > From: Adrien Grand [mailto:[email protected]]
>> > Sent: Tuesday, August 04, 2015 10:35 AM
>> > To: [email protected]
>> > Subject: Re: Understanding performance degradation in range queries between Solr 5.2.1 and 4.10.4
>> >
>> > Tomás said Solr was running on EC2 in his initial email, so this can't be MacOSX.
>> >
>> > On Tue, Aug 4, 2015 at 10:21 AM, Uwe Schindler <[email protected]> wrote:
>> > > Hi,
>> > >
>> > > Just related: Do you run tests on MacOSX? NanoTime is damn slow there! And it's also not monotonic! There was a discussion on hotspot-dev about that one year ago, but no solution until now.
>> > >
>> > > MacOS had no real nanotime impl in the kernel, so Java falls back to wall clock, which is slow and non-monotonic with ntp or vmware... (see test failures in the past on Policeman when ntp adjusted the clock).
>> > >
>> > > If you run on Linux this could also be a good test: on OSX the difference to the previous version should be even larger.
>> > >
>> > > Uwe
>> > >
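(Side note on the nanoTime cost question: a very rough way to get a ballpark figure for System.nanoTime() on a given box is a loop like the sketch below. This is only illustrative, not a proper benchmark; something like JMH would give more trustworthy numbers.)

  // Rough, illustrative estimate of the cost of a System.nanoTime() call.
  // Not a rigorous benchmark: no warmup control, and loop overhead is included.
  public class NanoTimeCost {
    public static void main(String[] args) {
      final int iterations = 50_000_000;
      long sink = 0; // keep a data dependency so the JIT cannot drop the calls
      long start = System.nanoTime();
      for (int i = 0; i < iterations; i++) {
        sink += System.nanoTime();
      }
      long elapsed = System.nanoTime() - start;
      System.out.println(String.format("~%.1f ns per call (sink=%d)", elapsed / (double) iterations, sink));
    }
  }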
>> > > Am 4. August 2015 03:07:45 MESZ, schrieb Anshum Gupta <[email protected]>:
>> > >>
>> > >> Guess bypassing that check and always returning true instead would help confirm that. If that's the case, we should just initialize the timeout to Long.MAX_VALUE and check for that to short-circuit?
>> > >>
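If I understand that suggestion correctly, the short-circuit would look roughly like the sketch below. This is not the actual SolrQueryTimeoutImpl code, just the shape of the idea: treat Long.MAX_VALUE as "no timeout requested" and return before ever touching System.nanoTime().

  import org.apache.lucene.index.QueryTimeout;

  // Sketch only: a QueryTimeout that avoids the per-call System.nanoTime()
  // overhead entirely when no timeout was set for the current query.
  class NoTimeoutShortCircuitSketch implements QueryTimeout {
    static final long NO_TIMEOUT = Long.MAX_VALUE;
    private final long timeoutAtNanos;

    NoTimeoutShortCircuitSketch(long timeoutAtNanos) {
      this.timeoutAtNanos = timeoutAtNanos;
    }

    @Override
    public boolean shouldExit() {
      if (timeoutAtNanos == NO_TIMEOUT) {
        return false; // short-circuit: no System.nanoTime() per TermsEnum.next()
      }
      return timeoutAtNanos - System.nanoTime() < 0; // overflow-safe deadline check
    }
  }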
>> > >> On Mon, Aug 3, 2015 at 3:50 PM, Tomás Fernández Löbbe <[email protected]> wrote:
>> > >>>
>> > >>> Yes, I saw that, but thought it could be the underlying implementation, not the "ExitableTermsEnum" wrapper. Maybe it's related to the calls to System.nanoTime then...
>> > >>>
>> > >>> On Mon, Aug 3, 2015 at 3:11 PM, Adrien Grand <[email protected]> wrote:
>> > >>>>
>> > >>>> Thanks for sharing the traces, it looks like my intuition was wrong. :) They seem to point to ExitableDirectoryReader$ExitableTermsEnum.next, which checks whether the time is out before delegating.
>> > >>>>
>> > >>>> On Mon, Aug 3, 2015 at 7:21 PM, Tomás Fernández Löbbe <[email protected]> wrote:
>> > >>>> > Thanks Adrien,
>> > >>>> > I'll run the tests with a 5.3 snapshot and post the results here.
>> > >>>> > In case this helps, this is the hprof samples output (-Xrunhprof:cpu=samples,depth=3,file=/home/ec2-user/hprof_output.txt) for 4.10.4 and 5.2.1 in my test:
>> > >>>> >
>> > >>>> > Solr 4.10.4:
>> > >>>> > CPU SAMPLES BEGIN (total = 243525) Fri Jul 31 22:29:06 2015
>> > >>>> > rank   self  accum   count  trace method
>> > >>>> >    1 75.07% 75.07% 182812 300523 java.net.PlainSocketImpl.socketAccept
>> > >>>> >    2  4.27% 79.34%  10408 301576 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.decodeMetaData
>> > >>>> >    3  4.15% 83.49%  10108 301585 org.apache.lucene.index.FilteredTermsEnum.docs
>> > >>>> >    4  3.46% 86.95%   8419 301582 org.apache.lucene.index.FilteredTermsEnum.next
>> > >>>> >    5  2.49% 89.44%   6070 301573 java.net.SocketInputStream.socketRead0
>> > >>>> >    6  1.99% 91.43%   4848 301599 org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet
>> > >>>> >    7  1.98% 93.42%   4824 301583 org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet
>> > >>>> >    8  1.57% 94.99%   3824 301589 org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll
>> > >>>> >    9  1.44% 96.42%   3504 301594 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.refillDocs
>> > >>>> >   10  1.09% 97.51%   2655 301581 java.nio.Bits.copyToArray
>> > >>>> >   11  0.98% 98.50%   2388 301598 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.nextDoc
>> > >>>> >   12  0.62% 99.12%   1522 301600 org.apache.lucene.store.DataInput.readVInt
>> > >>>> >   13  0.21% 99.33%    500 301612 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.docs
>> > >>>> >   14  0.07% 99.39%    167 301601 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.next
>> > >>>> >   15  0.06% 99.45%    139 301619 java.lang.System.identityHashCode
>> > >>>> >   16  0.05% 99.50%    114 301632 org.apache.lucene.codecs.lucene41.ForUtil.readBlock
>> > >>>> >   17  0.04% 99.54%     92 300708 java.util.zip.Inflater.inflateBytes
>> > >>>> >   18  0.03% 99.57%     76 301624 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadNextFloorBlock
>> > >>>> >   19  0.03% 99.59%     68 300613 java.lang.ClassLoader.defineClass1
>> > >>>> >   20  0.03% 99.62%     68 301615 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
>> > >>>> >   21  0.03% 99.65%     62 301635 org.apache.solr.search.SolrIndexSearcher.getDocSetNC
>> > >>>> >   22  0.02% 99.66%     41 301664 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
>> > >>>> >   23  0.01% 99.68%     31 301629 org.apache.lucene.util.FixedBitSet.<init>
>> > >>>> > CPU SAMPLES END
>> > >>>> >
>> > >>>> > Solr 5.2.1:
>> > >>>> > CPU SAMPLES BEGIN (total = 235415) Fri Jul 31 22:42:06 2015
>> > >>>> > rank   self  accum   count  trace method
>> > >>>> >    1 51.38% 51.38% 120954 301291 sun.nio.ch.EPollArrayWrapper.epollWait
>> > >>>> >    2 25.69% 77.07%  60477 301292 sun.nio.ch.ServerSocketChannelImpl.accept0
>> > >>>> >    3 10.59% 87.66%  24923 301369 org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.next
>> > >>>> >    4  2.20% 89.86%   5182 301414 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.postings
>> > >>>> >    5  2.01% 91.87%   4742 301384 org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.postings
>> > >>>> >    6  1.25% 93.12%   2944 301434 java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss
>> > >>>> >    7  1.11% 94.23%   2612 301367 org.apache.lucene.search.MultiTermQueryConstantScoreWrapper$1.rewrite
>> > >>>> >    8  0.94% 95.17%   2204 301390 org.apache.lucene.util.BitSet.or
>> > >>>> >    9  0.93% 96.10%   2190 301383 java.nio.Bits.copyToArray
>> > >>>> >   10  0.78% 96.87%   1825 301449 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.refillDocs
>> > >>>> >   11  0.73% 97.60%   1717 301378 org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll
>> > >>>> >   12  0.73% 98.33%   1715 301374 org.apache.lucene.util.BitSet.or
>> > >>>> >   13  0.33% 98.66%    787 301387 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
>> > >>>> >   14  0.16% 98.82%    374 301426 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
>> > >>>> >   15  0.10% 98.93%    245 301382 org.apache.lucene.util.BitSet.or
>> > >>>> >   16  0.09% 99.02%    219 301381 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.next
>> > >>>> >   17  0.09% 99.11%    207 301370 org.apache.lucene.util.BitSet.or
>> > >>>> >   18  0.06% 99.17%    153 301416 org.apache.lucene.util.BitSet.or
>> > >>>> >   19  0.06% 99.24%    151 301427 org.apache.lucene.util.BitSet.or
>> > >>>> >   20  0.06% 99.30%    151 301441 org.apache.lucene.store.DataInput.readVInt
>> > >>>> >   21  0.06% 99.36%    147 301389 java.lang.System.identityHashCode
>> > >>>> >   22  0.06% 99.42%    140 301375 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
>> > >>>> >   23  0.04% 99.47%    104 301425 org.apache.lucene.util.BitSet.or
>> > >>>> >   24  0.03% 99.50%     76 301423 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
>> > >>>> >   25  0.03% 99.53%     74 301454 org.apache.lucene.search.MultiTermQueryConstantScoreWrapper$1.rewrite
>> > >>>> >   26  0.03% 99.56%     65 301432 org.apache.lucene.util.BitDocIdSet$Builder.or
>> > >>>> >   27  0.02% 99.58%     53 301456 org.apache.lucene.util.FixedBitSet.or
>> > >>>> >   28  0.02% 99.60%     52 300077 java.lang.ClassLoader.defineClass1
>> > >>>> >   29  0.02% 99.63%     50 301464 org.apache.lucene.codecs.lucene50.ForUtil.readBlock
>> > >>>> >   30  0.02% 99.64%     39 301438 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
>> > >>>> >   31  0.02% 99.66%     37 301465 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadNextFloorBlock
>> > >>>> >   32  0.02% 99.67%     36 301419 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
>> > >>>> > CPU SAMPLES END
>> > >>>> >
>> > >>>> > On Fri, Jul 31, 2015 at 4:23 PM, Adrien Grand <[email protected]> wrote:
>> > >>>> >>
>> > >>>> >> Hi Tomás,
>> > >>>> >>
>> > >>>> >> I suspect this might be related to LUCENE-5938. We changed the default rewrite method for multi-term queries to load documents into a sparse bit set first, and only upgrade to a dense bit set when we know many documents match. When there are lots of terms to intersect, then we end up spending significant cpu time to build the sparse bit set to eventually upgrade to a dense bit set like before. This might be what you are seeing.
>> > >>>> >>
>> > >>>> >> You might see the issue less with the population field because it has fewer unique values, so postings lists are longer and the DocIdSet building logic can upgrade quicker to a dense bit set.
>> > >>>> >>
>> > >>>> >> Mike noticed this slowness when working on BKD trees and we changed this first phase to use a plain int[] array that we sort and deduplicate instead of a more fancy sparse bit set (LUCENE-6645), which seemed to make things faster. Would it be possible for you to also check a 5.3 snapshot?
>> > >>>> >>
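To make the LUCENE-6645 idea above concrete, a toy version of the "plain int[] array that we sort and deduplicate" approach looks roughly like the sketch below. It is illustrative only, not the actual Lucene code, which is more involved and can still switch to a dense bit set when many documents match.

  import java.util.Arrays;

  // Toy illustration: buffer candidate doc IDs per matching term, then sort
  // and deduplicate once at the end instead of maintaining a sparse bit set.
  final class DocIdBufferSketch {
    private int[] docs = new int[64];
    private int size = 0;

    void add(int docId) {
      if (size == docs.length) {
        docs = Arrays.copyOf(docs, size * 2); // grow geometrically
      }
      docs[size++] = docId;
    }

    /** Returns the distinct doc IDs in increasing order. */
    int[] sortedDistinct() {
      Arrays.sort(docs, 0, size);
      int unique = 0;
      for (int i = 0; i < size; i++) {
        if (unique == 0 || docs[i] != docs[unique - 1]) {
          docs[unique++] = docs[i];
        }
      }
      return Arrays.copyOf(docs, unique);
    }
  }

The appeal is that the per-document work is just an array append, with a single sort at the end, which stays cheap when each of the many intersected terms only contributes a few documents.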
>> > >>>> >> On Fri, Jul 31, 2015 at 10:51 PM, Tomás Fernández Löbbe <[email protected]> wrote:
>> > >>>> >> > Hi, I'm seeing some query performance degradation between 4.10.4 and 5.2.1. It doesn't happen with all the queries, but for queries like range queries on fields with many different values, the average time in 5.2.1 is worse than in 4.10.4. Is anyone seeing something similar?
>> > >>>> >> >
>> > >>>> >> > Test details:
>> > >>>> >> > * Single thread running queries continuously. I run the test twice for each Solr version.
>> > >>>> >> > * JMeter running on my laptop, Solr running on EC2, on an m3.xlarge instance with all the defaults but with a 5G heap. Index on local disk (SSD).
>> > >>>> >> > * Plain Solr releases, nothing custom. Single Solr core, not in SolrCloud mode, no distributed search.
>> > >>>> >> > * "allCountries" geonames dataset (~8M small docs). No updates during the test. Index size is around 1.1GB for Solr 5.2.1 and 1.3GB for Solr 4.10.4 (fits entirely in RAM).
>> > >>>> >> > * jdk1.8.0_45
>> > >>>> >> >
>> > >>>> >> > Queries: 3k boolean queries (generated with terms from the dataset) with range queries as filters on the "tlongitude" and "tlatitude" fields with randomly generated bounds, e.g.
>> > >>>> >> > q=name:foo OR name:bar&fq=tlongitude:[W TO X]&fq=tlatitude:[Y TO Z]
>> > >>>> >> >
>> > >>>> >> > Fields are:
>> > >>>> >> > <field name="tlatitude" type="tdouble"/>
>> > >>>> >> > <field name="tlongitude" type="tdouble"/>
>> > >>>> >> > Field type:
>> > >>>> >> > <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0"/>
>> > >>>> >> >
>> > >>>> >> > In this case, Solr 4.10.4 was between 20% and 30% faster than 5.2.1 on average.
>> > >>>> >> >
>> > >>>> >> > http://snag.gy/2yPPM.jpg
>> > >>>> >> >
>> > >>>> >> > Doing only the boolean queries shows no performance difference between 4.10 and 5.2, same thing if I do filters on a string field instead of the range queries.
>> > >>>> >> >
>> > >>>> >> > When using the "double" field type (precisionStep="0"), the difference was bigger:
>> > >>>> >> >
>> > >>>> >> > longitude/latitude fields:
>> > >>>> >> > <field name="longitude" type="double" docValues="true"/>
>> > >>>> >> > <field name="latitude" type="double" docValues="true"/>
>> > >>>> >> > <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/>
>> > >>>> >> >
>> > >>>> >> > http://snag.gy/Vi5uk.jpg
>> > >>>> >> > I understand this is not the best field type definition for range queries, I'm just trying to understand the difference between the two versions and why.
>> > >>>> >> >
>> > >>>> >> > Performance was OK when doing range queries on the "population" field (long), but that field doesn't have many different values, only 300k out of the 8.3M docs have the population field with a value different from 0. On the other hand, doing range queries on the _version_ field did show a graph similar to the previous one:
>> > >>>> >> >
>> > >>>> >> > <field name="_version_" type="long" indexed="true" stored="true"/>
>> > >>>> >> > <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
>> > >>>> >> >
>> > >>>> >> > http://snag.gy/4tc7e.jpg
>> > >>>> >> >
>> > >>>> >> > Any idea what could be causing this? Is this expected after some known change?
>> > >>>> >> >
>> > >>>> >> > With Solr 4.10, a single CPU core remains high during the test (close to 100%), but with Solr 5.2, different cores go up and down in utilization continuously. That's probably because of the different Jetty version, I suppose. GC pattern looks similar in both.
>> > >>>> >> > For both Solr versions I'm using the settings that ship with Solr (in solr.in.sh) except for Xmx and Xms.
>> > >>>> >>
>> > >>>> >> --
>> > >>>> >> Adrien
>> > >>>>
>> > >>>> --
>> > >>>> Adrien
>> > >>
>> > >> --
>> > >> Anshum Gupta
>> > >
>> > > --
>> > > Uwe Schindler
>> > > H.-H.-Meier-Allee 63, 28213 Bremen
>> > > http://www.thetaphi.de
>> >
>> > --
>> > Adrien
