FWIW, a micro-benchmark shows a 4% gain from reusing the incoming BytesRef.bytes for short binary doc values, measured with Test2BBinaryDocValues.testVariableBinary() on an mmap directory. I wonder why it doesn't read into the incoming bytes: https://github.com/apache/lucene-solr/blame/trunk/lucene/core/src/java/org/apache/lucene/codecs/lucene45/Lucene45DocValuesProducer.java#L401
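
To make the idea concrete, here is a rough, self-contained sketch of the two variants. It is not the Lucene45DocValuesProducer code: Slice is a hypothetical stand-in for BytesRef (bytes, offset, length), VarBinaryStore is a hypothetical stand-in for the variable-length binary reader, and a plain byte[] plays the role of the mmap'ed data.

```java
/** Hypothetical stand-in for BytesRef: a slice of a byte[] owned by the caller. */
final class Slice {
    byte[] bytes = new byte[0];
    int offset;
    int length;
}

/** Hypothetical store: all values concatenated, addressed by their end offsets. */
final class VarBinaryStore {
    private final byte[] data; // stand-in for the memory-mapped data file
    private final int[] ends;  // ends[i] = end offset (exclusive) of value i

    VarBinaryStore(byte[][] values) {
        ends = new int[values.length];
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i].length;
            ends[i] = total;
        }
        data = new byte[total];
        int pos = 0;
        for (byte[] v : values) {
            System.arraycopy(v, 0, data, pos, v.length);
            pos += v.length;
        }
    }

    private int start(int docId) {
        return docId == 0 ? 0 : ends[docId - 1];
    }

    /** Allocating variant: a fresh byte[] for every lookup. */
    void getAllocating(int docId, Slice result) {
        int start = start(docId);
        int len = ends[docId] - start;
        byte[] buf = new byte[len];
        System.arraycopy(data, start, buf, 0, len);
        result.bytes = buf;
        result.offset = 0;
        result.length = len;
    }

    /** Reusing variant: copies into the caller's array, growing it only when it is too small. */
    void getReusing(int docId, Slice result) {
        int start = start(docId);
        int len = ends[docId] - start;
        if (result.bytes.length < len) {
            // Grow geometrically so a run of increasing lengths does not reallocate each time.
            result.bytes = new byte[Math.max(len, result.bytes.length * 2)];
        }
        System.arraycopy(data, start, result.bytes, 0, len);
        result.offset = 0;
        result.length = len;
    }
}
```

The trade-off in this sketch is that the reusing variant overwrites the previous value on each call, so a caller that wants to keep a value across lookups has to copy it out first.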

On Wed, Jan 8, 2014 at 12:53 AM, Michael McCandless <[email protected]> wrote:

> Going sequentially should help, if the pages are not hot (in the OS's IO
> cache).
>
> You can also use a different DVFormat, e.g. Direct, but this holds all
> bytes in RAM.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Jan 7, 2014 at 1:09 PM, Mikhail Khludnev
> <[email protected]> wrote:
> > Joel,
> >
> > I tried to hack it straightforwardly, but found no free gain there. The only
> > attempt I can suggest is to try to reuse bytes in
> > https://github.com/apache/lucene-solr/blame/trunk/lucene/core/src/java/org/apache/lucene/codecs/lucene45/Lucene45DocValuesProducer.java#L401
> > right now it allocates bytes every time, which beside of GC can also impact
> > memory access locality. Could you try fix memory waste and repeat
> > performance test?
> >
> > Have a good hack!
> >
> >
> > On Mon, Dec 23, 2013 at 9:51 PM, Joel Bernstein <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> I'm looking for a faster way to perform large scale docId -> bytesRef
> >> lookups for BinaryDocValues.
> >>
> >> I'm finding that I can't get the performance that I need from the random
> >> access seek in the BinaryDocValues interface.
> >>
> >> I'm wondering if sequentially scanning the docValues would be a faster
> >> approach. I have a BitSet of matching docs, so if I sequentially moved
> >> through the docValues I could test each one against that bitset.
> >>
> >> Wondering if that approach would be faster for bulk extracts and how
> >> tricky it would be to add an iterator to the BinaryDocValues interface?
> >>
> >> Thanks,
> >> Joel
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> > Principal Engineer,
> > Grid Dynamics
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]

--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<[email protected]>
