Tomás said Solr was running on EC2 in his initial email, so this can't be MacOSX.
On Tue, Aug 4, 2015 at 10:21 AM, Uwe Schindler <[email protected]> wrote:
> Hi,
>
> Just related: Do you run tests on MacOSX? NanoTime is damn slow there! And
> it's also not monotonic! There was a discussion on hotspot-dev about that
> one year ago, but no solution until now.
>
> MacOSX has no real nanotime implementation in the kernel, so Java falls back
> to the wall clock, which is slow and non-monotonic with NTP or VMware...
> (see test failures in the past on Policeman when NTP adjusted the clock).
>
> If you run on Linux, this could also be a good test: on OSX the difference
> to the previous version should be even larger.
>
> Uwe
>
>
> On 4 August 2015 03:07:45 MESZ, Anshum Gupta <[email protected]> wrote:
>>
>> Guess bypassing that check and always returning true instead would help
>> confirm that. If that's the case, we should just initialize the timeout to
>> Long.MAX_VALUE and check for that to short-circuit?
>>
>> On Mon, Aug 3, 2015 at 3:50 PM, Tomás Fernández Löbbe
>> <[email protected]> wrote:
>>>
>>> Yes, I saw that, but thought it could be the underlying implementation,
>>> not the "ExitableTermsEnum" wrapper. Maybe it's related to the calls to
>>> System.nanoTime then...
>>>
>>> On Mon, Aug 3, 2015 at 3:11 PM, Adrien Grand <[email protected]> wrote:
>>>>
>>>> Thanks for sharing the traces, it looks like my intuition was wrong. :)
>>>> They seem to point to ExitableDirectoryReader$ExitableTermsEnum.next,
>>>> which checks whether the time is out before delegating.
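(To make the suggestion above concrete: below is a minimal, self-contained
sketch of the pattern being discussed -- a wrapper that consults the clock on
every next() call, plus the Long.MAX_VALUE short-circuit Anshum proposes for
the "no timeout configured" case. It is illustrative only, not the actual
ExitableDirectoryReader source; all class and method names are made up.)

// Illustrative sketch only -- not the actual Lucene ExitableDirectoryReader code.
// Invented names (TimeLimitedTermsIterator, NO_TIMEOUT, checkTimeout) are used to
// show the pattern: a wall-clock check on every next() call, and a sentinel that
// lets the wrapper skip the clock entirely when no timeout is configured.
import java.util.Iterator;

public class TimeLimitedTermsIterator {

    public static final long NO_TIMEOUT = Long.MAX_VALUE; // sentinel: timeouts disabled

    private final Iterator<String> delegate; // stands in for the wrapped terms enumeration
    private final long deadlineNanos;        // absolute deadline, or NO_TIMEOUT

    public TimeLimitedTermsIterator(Iterator<String> delegate, long timeoutMillis) {
        this.delegate = delegate;
        this.deadlineNanos = timeoutMillis <= 0
                ? NO_TIMEOUT
                : System.nanoTime() + timeoutMillis * 1_000_000L;
    }

    private void checkTimeout() {
        // The short-circuit: when no timeout is configured, return immediately
        // instead of paying for a System.nanoTime() call once per term.
        if (deadlineNanos == NO_TIMEOUT) {
            return;
        }
        if (System.nanoTime() - deadlineNanos > 0) { // overflow-safe comparison
            throw new RuntimeException("Time limit exceeded while iterating terms");
        }
    }

    /** Checks the clock (unless disabled), then delegates, like the wrapper in the traces. */
    public String next() {
        checkTimeout();
        return delegate.hasNext() ? delegate.next() : null;
    }
}

Whether this helps depends on whether the per-call clock read or the extra
delegation itself is the expensive part, which is exactly what bypassing the
check (as suggested above) would reveal.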
>>>>
>>>> On Mon, Aug 3, 2015 at 7:21 PM, Tomás Fernández Löbbe
>>>> <[email protected]> wrote:
>>>> > Thanks Adrien,
>>>> > I'll run the tests with 5.3 snapshot and post the results here. In case
>>>> > this helps, this is the hprof samples output
>>>> > (-Xrunhprof:cpu=samples,depth=3,file=/home/ec2-user/hprof_output.txt)
>>>> > for 4.10.4 and 5.2.1 in my test:
>>>> >
>>>> > Solr 4.10.4:
>>>> > CPU SAMPLES BEGIN (total = 243525) Fri Jul 31 22:29:06 2015
>>>> > rank   self  accum   count  trace method
>>>> >    1 75.07% 75.07%  182812 300523 java.net.PlainSocketImpl.socketAccept
>>>> >    2  4.27% 79.34%   10408 301576 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.decodeMetaData
>>>> >    3  4.15% 83.49%   10108 301585 org.apache.lucene.index.FilteredTermsEnum.docs
>>>> >    4  3.46% 86.95%    8419 301582 org.apache.lucene.index.FilteredTermsEnum.next
>>>> >    5  2.49% 89.44%    6070 301573 java.net.SocketInputStream.socketRead0
>>>> >    6  1.99% 91.43%    4848 301599 org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet
>>>> >    7  1.98% 93.42%    4824 301583 org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet
>>>> >    8  1.57% 94.99%    3824 301589 org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll
>>>> >    9  1.44% 96.42%    3504 301594 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.refillDocs
>>>> >   10  1.09% 97.51%    2655 301581 java.nio.Bits.copyToArray
>>>> >   11  0.98% 98.50%    2388 301598 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.nextDoc
>>>> >   12  0.62% 99.12%    1522 301600 org.apache.lucene.store.DataInput.readVInt
>>>> >   13  0.21% 99.33%     500 301612 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.docs
>>>> >   14  0.07% 99.39%     167 301601 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.next
>>>> >   15  0.06% 99.45%     139 301619 java.lang.System.identityHashCode
>>>> >   16  0.05% 99.50%     114 301632 org.apache.lucene.codecs.lucene41.ForUtil.readBlock
>>>> >   17  0.04% 99.54%      92 300708 java.util.zip.Inflater.inflateBytes
>>>> >   18  0.03% 99.57%      76 301624 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadNextFloorBlock
>>>> >   19  0.03% 99.59%      68 300613 java.lang.ClassLoader.defineClass1
>>>> >   20  0.03% 99.62%      68 301615 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
>>>> >   21  0.03% 99.65%      62 301635 org.apache.solr.search.SolrIndexSearcher.getDocSetNC
>>>> >   22  0.02% 99.66%      41 301664 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
>>>> >   23  0.01% 99.68%      31 301629 org.apache.lucene.util.FixedBitSet.<init>
>>>> > CPU SAMPLES END
>>>> >
>>>> > Solr 5.2.1:
>>>> > CPU SAMPLES BEGIN (total = 235415) Fri Jul 31 22:42:06 2015
>>>> > rank   self  accum   count  trace method
>>>> >    1 51.38% 51.38%  120954 301291 sun.nio.ch.EPollArrayWrapper.epollWait
>>>> >    2 25.69% 77.07%   60477 301292 sun.nio.ch.ServerSocketChannelImpl.accept0
>>>> >    3 10.59% 87.66%   24923 301369 org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.next
>>>> >    4  2.20% 89.86%    5182 301414 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.postings
>>>> >    5  2.01% 91.87%    4742 301384 org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.postings
>>>> >    6  1.25% 93.12%    2944 301434 java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss
>>>> >    7  1.11% 94.23%    2612 301367 org.apache.lucene.search.MultiTermQueryConstantScoreWrapper$1.rewrite
>>>> >    8  0.94% 95.17%    2204 301390 org.apache.lucene.util.BitSet.or
>>>> >    9  0.93% 96.10%    2190 301383 java.nio.Bits.copyToArray
>>>> >   10  0.78% 96.87%    1825 301449 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.refillDocs
>>>> >   11  0.73% 97.60%    1717 301378 org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll
>>>> >   12  0.73% 98.33%    1715 301374 org.apache.lucene.util.BitSet.or
>>>> >   13  0.33% 98.66%     787 301387 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
>>>> >   14  0.16% 98.82%     374 301426 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
>>>> >   15  0.10% 98.93%     245 301382 org.apache.lucene.util.BitSet.or
>>>> >   16  0.09% 99.02%     219 301381 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.next
>>>> >   17  0.09% 99.11%     207 301370 org.apache.lucene.util.BitSet.or
>>>> >   18  0.06% 99.17%     153 301416 org.apache.lucene.util.BitSet.or
>>>> >   19  0.06% 99.24%     151 301427 org.apache.lucene.util.BitSet.or
>>>> >   20  0.06% 99.30%     151 301441 org.apache.lucene.store.DataInput.readVInt
>>>> >   21  0.06% 99.36%     147 301389 java.lang.System.identityHashCode
>>>> >   22  0.06% 99.42%     140 301375 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
>>>> >   23  0.04% 99.47%     104 301425 org.apache.lucene.util.BitSet.or
>>>> >   24  0.03% 99.50%      76 301423 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
>>>> >   25  0.03% 99.53%      74 301454 org.apache.lucene.search.MultiTermQueryConstantScoreWrapper$1.rewrite
>>>> >   26  0.03% 99.56%      65 301432 org.apache.lucene.util.BitDocIdSet$Builder.or
>>>> >   27  0.02% 99.58%      53 301456 org.apache.lucene.util.FixedBitSet.or
>>>> >   28  0.02% 99.60%      52 300077 java.lang.ClassLoader.defineClass1
>>>> >   29  0.02% 99.63%      50 301464 org.apache.lucene.codecs.lucene50.ForUtil.readBlock
>>>> >   30  0.02% 99.64%      39 301438 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
>>>> >   31  0.02% 99.66%      37 301465 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadNextFloorBlock
>>>> >   32  0.02% 99.67%      36 301419 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
>>>> > CPU SAMPLES END
>>>> >
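(Side note on the nanoTime angle: given how prominently
ExitableDirectoryReader$ExitableTermsEnum.next shows up in the 5.2.1 samples, a
rough throwaway loop like the one below gives a ballpark figure for the per-call
cost of System.nanoTime() on the machine running Solr. It is only a sketch; a
proper measurement would use JMH.)

// Throwaway microbenchmark sketch, illustrative only (JMH would be more reliable):
// estimates the average per-call cost of System.nanoTime() on this machine.
public class NanoTimeCost {
    public static void main(String[] args) {
        final int iterations = 50_000_000;
        long sink = 0;

        // Warm-up so the JIT has compiled the loop before we measure.
        for (int i = 0; i < 1_000_000; i++) {
            sink += System.nanoTime();
        }

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += System.nanoTime();
        }
        long elapsed = System.nanoTime() - start;

        // Print sink so the JIT cannot eliminate the calls as dead code.
        System.out.printf("~%.1f ns per System.nanoTime() call (sink=%d)%n",
                (double) elapsed / iterations, sink);
    }
}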
>>>> > On Fri, Jul 31, 2015 at 4:23 PM, Adrien Grand <[email protected]>
>>>> > wrote:
>>>> >>
>>>> >> Hi Tomás,
>>>> >>
>>>> >> I suspect this might be related to LUCENE-5938. We changed the default
>>>> >> rewrite method for multi-term queries to load documents into a sparse
>>>> >> bit set first, and only upgrade to a dense bit set when we know many
>>>> >> documents match. When there are lots of terms to intersect, we end up
>>>> >> spending significant CPU time building the sparse bit set only to
>>>> >> eventually upgrade to a dense bit set like before. This might be what
>>>> >> you are seeing.
>>>> >>
>>>> >> You might see the issue less with the population field because it has
>>>> >> fewer unique values, so postings lists are longer and the DocIdSet
>>>> >> building logic can upgrade to a dense bit set more quickly.
>>>> >>
>>>> >> Mike noticed this slowness when working on BKD trees, and we changed
>>>> >> this first phase to use a plain int[] array that we sort and
>>>> >> deduplicate instead of a fancier sparse bit set (LUCENE-6645), which
>>>> >> seemed to make things faster. Would it be possible for you to also
>>>> >> check a 5.3 snapshot?
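(For reference, a simplified sketch of the "plain int[] that we sort and
deduplicate" idea behind LUCENE-6645. The actual Lucene implementation is
considerably more involved; this only illustrates why buffering doc IDs in a
flat array, then sorting and deduplicating once, avoids maintaining a sparse
structure that may end up being upgraded to a dense bit set anyway. Class and
method names below are invented for the example.)

import java.util.Arrays;

// Illustrative sketch, not the actual Lucene code: collect candidate doc IDs
// from many terms into a growing int[], then sort and deduplicate once at the
// end to obtain an ordered, duplicate-free doc ID list.
public class IntBufferDocIdCollector {
    private int[] buffer = new int[64];
    private int size = 0;

    public void add(int docId) {
        if (size == buffer.length) {
            buffer = Arrays.copyOf(buffer, buffer.length << 1); // amortized growth
        }
        buffer[size++] = docId;
    }

    /** Returns the collected doc IDs in increasing order, without duplicates. */
    public int[] buildSortedUnique() {
        Arrays.sort(buffer, 0, size);
        int unique = 0;
        for (int i = 0; i < size; i++) {
            // Keep a value only if it differs from the last one we kept.
            if (unique == 0 || buffer[i] != buffer[unique - 1]) {
                buffer[unique++] = buffer[i];
            }
        }
        return Arrays.copyOf(buffer, unique);
    }
}

The appeal of this shape is that appending to an int[] is cheap and
cache-friendly per matching document, and the sort/dedup cost is paid once at
the end instead of on every insertion into a sparse set.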
>>>> >>
>>>> >> On Fri, Jul 31, 2015 at 10:51 PM, Tomás Fernández Löbbe
>>>> >> <[email protected]> wrote:
>>>> >> > Hi, I'm seeing some query performance degradation between 4.10.4 and
>>>> >> > 5.2.1. It doesn't happen with all the queries, but for queries like
>>>> >> > range queries on fields with many different values, the average time
>>>> >> > in 5.2.1 is worse than in 4.10.4. Is anyone seeing something similar?
>>>> >> >
>>>> >> > Test details:
>>>> >> > * Single thread running queries continuously. I run the test twice
>>>> >> >   for each Solr version.
>>>> >> > * JMeter running on my laptop, Solr running on EC2 on an m3.xlarge
>>>> >> >   instance with all the defaults but with a 5G heap. Index on local
>>>> >> >   disk (SSD).
>>>> >> > * Plain Solr releases, nothing custom. Single Solr core, not in
>>>> >> >   SolrCloud mode, no distributed search.
>>>> >> > * "allCountries" geonames dataset (~8M small docs). No updates during
>>>> >> >   the test. Index size is around 1.1GB for Solr 5.2.1 and 1.3GB for
>>>> >> >   Solr 4.10.4 (fits entirely in RAM).
>>>> >> > * jdk1.8.0_45
>>>> >> >
>>>> >> > Queries: 3k boolean queries (generated with terms from the dataset)
>>>> >> > with range queries as filters on the "tlongitude" and "tlatitude"
>>>> >> > fields with randomly generated bounds, e.g.
>>>> >> > q=name:foo OR name:bar&fq=tlongitude:[W TO X]&fq=tlatitude:[Y TO Z]
>>>> >> >
>>>> >> > Fields are:
>>>> >> > <field name="tlatitude" type="tdouble"/>
>>>> >> > <field name="tlongitude" type="tdouble"/>
>>>> >> > Field type:
>>>> >> > <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8"
>>>> >> > positionIncrementGap="0"/>
>>>> >> >
>>>> >> > In this case, Solr 4.10.4 was between 20% and 30% faster than 5.2.1
>>>> >> > on average.
>>>> >> >
>>>> >> > http://snag.gy/2yPPM.jpg
>>>> >> >
>>>> >> > Doing only the boolean queries shows no performance difference
>>>> >> > between 4.10 and 5.2; same thing if I filter on a string field
>>>> >> > instead of using the range queries.
>>>> >> >
>>>> >> > When using the "double" field type (precisionStep="0"), the
>>>> >> > difference was bigger:
>>>> >> >
>>>> >> > longitude/latitude fields:
>>>> >> > <field name="longitude" type="double" docValues="true"/>
>>>> >> > <field name="latitude" type="double" docValues="true"/>
>>>> >> > <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0"
>>>> >> > positionIncrementGap="0"/>
>>>> >> >
>>>> >> > http://snag.gy/Vi5uk.jpg
>>>> >> >
>>>> >> > I understand this is not the best field type definition for range
>>>> >> > queries; I'm just trying to understand the difference between the
>>>> >> > two versions and why.
>>>> >> >
>>>> >> > Performance was OK when doing range queries on the "population"
>>>> >> > field (long), but that field doesn't have many different values:
>>>> >> > only 300k out of the 8.3M docs have a population value different
>>>> >> > from 0. On the other hand, doing range queries on the _version_
>>>> >> > field did show a graph similar to the previous one:
>>>> >> >
>>>> >> > <field name="_version_" type="long" indexed="true" stored="true"/>
>>>> >> > <fieldType name="long" class="solr.TrieLongField" precisionStep="0"
>>>> >> > positionIncrementGap="0"/>
>>>> >> >
>>>> >> > http://snag.gy/4tc7e.jpg
>>>> >> >
>>>> >> > Any idea what could be causing this? Is this expected after some
>>>> >> > known change?
>>>> >> >
>>>> >> > With Solr 4.10, a single CPU core stays busy during the test (close
>>>> >> > to 100%), but with Solr 5.2, different cores go up and down in
>>>> >> > utilization continuously; that's probably because of the different
>>>> >> > Jetty version, I suppose. The GC pattern looks similar in both. For
>>>> >> > both Solr versions I'm using the settings that ship with Solr (in
>>>> >> > solr.in.sh), except for Xmx and Xms.
>>>> >>
>>>> >> --
>>>> >> Adrien
>>>> >>
>>>>
>>>> --
>>>> Adrien
>>>>
>>>
>>
>>
>> --
>> Anshum Gupta
>
>
> --
> Uwe Schindler
> H.-H.-Meier-Allee 63, 28213 Bremen
> http://www.thetaphi.de

--
Adrien

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
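(For anyone who wants to reproduce the comparison without JMeter: below is a
rough SolrJ sketch of the query loop described in the original report -- a
single thread issuing a boolean query plus two random range filters on
tlongitude/tlatitude. The field names and query shape come from the e-mail;
the core URL, term list, bound generation, and iteration count are placeholders
invented for the example, and averaging the Solr-reported QTime is only a crude
stand-in for proper client-side latency measurement.)

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

import java.util.Random;

// Sketch of the single-threaded benchmark loop described above (not the original
// JMeter test plan). The URL and terms below are placeholders.
public class RangeFilterBenchSketch {
    public static void main(String[] args) throws Exception {
        Random random = new Random(42);
        String[] terms = {"foo", "bar", "baz"}; // placeholder terms from the dataset

        try (HttpSolrClient client =
                     new HttpSolrClient("http://localhost:8983/solr/geonames")) {
            final int iterations = 3000;
            long totalQTimeMs = 0;

            for (int i = 0; i < iterations; i++) {
                // Randomly generated bounds, as in the reported test.
                double lonMin = -180 + 360 * random.nextDouble();
                double lonMax = Math.min(180, lonMin + 10 * random.nextDouble());
                double latMin = -90 + 180 * random.nextDouble();
                double latMax = Math.min(90, latMin + 10 * random.nextDouble());

                SolrQuery q = new SolrQuery("name:" + terms[i % terms.length]
                        + " OR name:" + terms[(i + 1) % terms.length]);
                q.addFilterQuery("tlongitude:[" + lonMin + " TO " + lonMax + "]");
                q.addFilterQuery("tlatitude:[" + latMin + " TO " + latMax + "]");

                QueryResponse rsp = client.query(q);
                totalQTimeMs += rsp.getQTime();
            }

            System.out.println("avg QTime: "
                    + (totalQTimeMs / (double) iterations) + " ms");
        }
    }
}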
