So after re-feeding our data with a new boolean field that is true when
data exists and false when it doesn't, our search times have gone from an
average of about 20s to around 150ms, a pretty amazing change in perf. It
seems like https://issues.apache.org/jira/browse/SOLR-5093 might alleviate
many people's pain in doing this kind of query (if I have some time I may
take a look at it).
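
For anyone wanting to try the same thing, here is a rough sketch of the
feed-side change, assuming the slow query is a field-existence check. The
field names ("payload", "has_payload") and the helper class are hypothetical,
not taken from our schema, and the code is plain Java with no Solr
dependency; it only shows how the flag is derived before a document is sent.

```java
import java.util.HashMap;
import java.util.Map;

final class ExistenceFlag {
    // Returns a copy of the document with a boolean flag recording whether
    // sourceField carries a non-empty value. Field names are hypothetical.
    static Map<String, Object> withFlag(Map<String, Object> doc,
                                        String sourceField, String flagField) {
        Map<String, Object> out = new HashMap<>(doc);
        Object value = doc.get(sourceField);
        boolean present = value != null && !value.toString().isEmpty();
        out.put(flagField, present);
        return out;
    }

    // Builds the filter query string that matches on the flag value.
    static String filterFor(String flagField, boolean present) {
        return flagField + ":" + present;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("id", "1");
        doc.put("payload", "some data");
        System.out.println(withFlag(doc, "payload", "has_payload"));
        System.out.println(filterFor("has_payload", true));
    }
}
```

The query side then filters on a single term (fq=has_payload:true), which
should only need to read one postings list rather than walk every term in
the field.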

Anyway, we are in pretty good shape at this point. The only remaining issue
is that the first queries after commits are taking 5-6s. This is caused by
the loading (uninverting) of 2 FieldCaches (one long and one int) that are
used for sorting. I suspect that DocValues will greatly help this load
performance?
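
To make the cost concrete, here is a toy model of what uninverting means.
It is not Lucene's actual code, and the class and method names are made up
for illustration: the index stores value -> documents, while sorting needs
document -> value, so the first sorted query has to walk every term and
every posting to build that array.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

final class UninvertSketch {
    // Builds a docId -> value array from a value -> docIds map (the inverted
    // view). Every term and every posting is visited once.
    static long[] uninvert(Map<Long, int[]> postingsByValue, int maxDoc) {
        long[] valueByDoc = new long[maxDoc]; // docs without a value stay 0
        for (Map.Entry<Long, int[]> entry : postingsByValue.entrySet()) {
            long value = entry.getKey();
            for (int docId : entry.getValue()) {
                valueByDoc[docId] = value;
            }
        }
        return valueByDoc;
    }

    public static void main(String[] args) {
        Map<Long, int[]> postings = new TreeMap<>();
        postings.put(10L, new int[] {0, 2});
        postings.put(20L, new int[] {1});
        // prints [10, 20, 10, 0]
        System.out.println(Arrays.toString(uninvert(postings, 4)));
    }
}
```

DocValues write the per-document values in that column layout at index
time, so my expectation (still to be verified) is that a new searcher can
read them directly instead of rebuilding the array after each commit.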

thanks,

steve


On Wed, Jul 31, 2013 at 4:32 PM, Steven Bower <smb-apa...@alcyon.net> wrote:

> the list of IDs does change relatively frequently, but this doesn't seem
> to have very much impact on the performance of the query as far as I can
> tell.
>
> attached are the stacks
>
> thanks,
>
> steve
>
>
> On Wed, Jul 31, 2013 at 6:33 AM, Mikhail Khludnev <
> mkhlud...@griddynamics.com> wrote:
>
>> On Wed, Jul 31, 2013 at 1:10 AM, Steven Bower <sbo...@alcyon.net> wrote:
>>
>> >
>> > not sure what you mean by good hit ratio?
>> >
>>
>> I mean such queries are really expensive (even on a cache hit), so if the
>> list of ids changes every time, the filter never hits the cache and these
>> heavy queries are executed every time. It's a well-known performance
>> problem.
>>
>>
>> > Here are the stacks...
>> >
>> They seem like hotspots, and they show index reading, which is
>> reasonable. But I can't see what caused these reads; to find that out I
>> need the whole stack of the hot thread.
>>
>>
>> >
>> > Name  Time (ms)  Own Time (ms)
>> >
>> > org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(AtomicReaderContext, Bits)  300879  203478
>> > org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.nextDoc()  45539  19
>> > org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.refillDocs()  45519  40
>> > org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.readVIntBlock(IndexInput, int[], int[], int, boolean)  24352  0
>> > org.apache.lucene.store.DataInput.readVInt()  24352  24352
>> > org.apache.lucene.codecs.lucene41.ForUtil.readBlock(IndexInput, byte[], int[])  21126  14976
>> > org.apache.lucene.store.ByteBufferIndexInput.readBytes(byte[], int, int)  6150  0
>> > java.nio.DirectByteBuffer.get(byte[], int, int)  6150  0
>> > java.nio.Bits.copyToArray(long, Object, long, long, long)  6150  6150
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.docs(Bits, DocsEnum, int)  35342  421
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.decodeMetaData()  34920  27939
>> > org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.nextTerm(FieldInfo, BlockTermState)  6980  6980
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.next()  14129  1053
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadNextFloorBlock()  5948  261
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock()  5686  199
>> > org.apache.lucene.store.ByteBufferIndexInput.readBytes(byte[], int, int)  3606  0
>> > java.nio.DirectByteBuffer.get(byte[], int, int)  3606  0
>> > java.nio.Bits.copyToArray(long, Object, long, long, long)  3606  3606
>> > org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.readTermsBlock(IndexInput, FieldInfo, BlockTermState)  1879  80
>> > org.apache.lucene.store.ByteBufferIndexInput.readBytes(byte[], int, int)  1798  0
>> > java.nio.DirectByteBuffer.get(byte[], int, int)  1798  0
>> > java.nio.Bits.copyToArray(long, Object, long, long, long)  1798  1798
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.next()  4010  3324
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.nextNonLeaf()  685  685
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock()  3117  144
>> > org.apache.lucene.store.ByteBufferIndexInput.readBytes(byte[], int, int)  1861  0
>> > java.nio.DirectByteBuffer.get(byte[], int, int)  1861  0
>> > java.nio.Bits.copyToArray(long, Object, long, long, long)  1861  1861
>> > org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.readTermsBlock(IndexInput, FieldInfo, BlockTermState)  1090  19
>> > org.apache.lucene.store.ByteBufferIndexInput.readBytes(byte[], int, int)  1070  0
>> > java.nio.DirectByteBuffer.get(byte[], int, int)  1070  0
>> > java.nio.Bits.copyToArray(long, Object, long, long, long)  1070  1070
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.initIndexInput()  20  0
>> > org.apache.lucene.store.ByteBufferIndexInput.clone()  20  0
>> > org.apache.lucene.store.ByteBufferIndexInput.clone()  20  0
>> > org.apache.lucene.store.ByteBufferIndexInput.buildSlice(long, long)  20  0
>> > org.apache.lucene.util.WeakIdentityMap.put(Object, Object)  20  0
>> > org.apache.lucene.util.WeakIdentityMap$IdentityWeakReference.<init>(Object, ReferenceQueue)  20  0
>> > java.lang.System.identityHashCode(Object)  20  20
>> > org.apache.lucene.index.FilteredTermsEnum.docs(Bits, DocsEnum, int)  1485  527
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.docs(Bits, DocsEnum, int)  957  0
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.decodeMetaData()  957  513
>> > org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.nextTerm(FieldInfo, BlockTermState)  443  443
>> > org.apache.lucene.index.FilteredTermsEnum.next()  874  324
>> > org.apache.lucene.search.NumericRangeQuery$NumericRangeTermsEnum.accept(BytesRef)  368  0
>> > org.apache.lucene.util.BytesRef$UTF8SortedAsUnicodeComparator.compare(Object, Object)  368  368
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.next()  160  0
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadNextFloorBlock()  160  0
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock()  160  0
>> > org.apache.lucene.store.ByteBufferIndexInput.readBytes(byte[], int, int)  120  0
>> > org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.readTermsBlock(IndexInput, FieldInfo, BlockTermState)  39  0
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekCeil(BytesRef, boolean)  19  0
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock()  19  0
>> > org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.initIndexInput()  19  0
>> > org.apache.lucene.store.ByteBufferIndexInput.clone()  19  0
>> > org.apache.lucene.store.ByteBufferIndexInput.clone()  19  0
>> > org.apache.lucene.store.ByteBufferIndexInput.buildSlice(long, long)  19  0
>> > org.apache.lucene.util.WeakIdentityMap.put(Object, Object)  19  0
>> > org.apache.lucene.util.WeakIdentityMap$IdentityWeakReference.<init>(Object, ReferenceQueue)  19  0
>> > java.lang.System.identityHashCode(Object)  19  19
>> > org.apache.lucene.util.FixedBitSet.<init>(int)  28  28
>> >
>> >
>> > On Tue, Jul 30, 2013 at 4:18 PM, Mikhail Khludnev <
>> > mkhlud...@griddynamics.com> wrote:
>> >
>> > > On Tue, Jul 30, 2013 at 12:45 AM, Steven Bower <smb-apa...@alcyon.net
>> > > >wrote:
>> > >
>> > > >
>> > > > - Most of my time (98%) is being spent in
>> > > > java.nio.Bits.copyToByteArray(long,Object,long,long) which is being
>> > >
>> > >
>> > > Steven, please see
>> > > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>> > > My benchmarking experience shows that NIO is a turtle, absolutely.
>> > >
>> > > Also, are you sure that fq=(vid:86XXX73 OR vid:86XXX20 ..... has a good
>> > > hit ratio? Otherwise it's a well-known beast.
>> > >
>> > > Could you also show a deeper stack, to make sure what causes the
>> > > excessive reading?
>> > >
>> > >
>> > >
>> > > --
>> > > Sincerely yours
>> > > Mikhail Khludnev
>> > > Principal Engineer,
>> > > Grid Dynamics
>> > >
>> > > <http://www.griddynamics.com>
>> > >  <mkhlud...@griddynamics.com>
>> > >
>> >
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> Principal Engineer,
>> Grid Dynamics
>>
>> <http://www.griddynamics.com>
>>  <mkhlud...@griddynamics.com>
>>
>
>
