Hi,

I was investigating some performance issues, and during profiling I
noticed that a significant amount of time is spent decompressing fields
that are unrelated to the field I'm actually trying to load from the
Lucene documents. In our benchmark, which mostly runs a simple
full-text search, 40% of the time was lost in these code paths.

My code does the following: reader.document(id, Set(":path")).get(":path"),
and this is where the fun begins :)
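
Spelled out in Java, that one-liner is roughly the following (the
helper method is just for illustration; :path is a field from our
index):

  import java.io.IOException;
  import java.util.Collections;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.index.IndexReader;

  // Load only the ":path" stored field of one document and return its
  // value (or null if the document has no such field).
  static String loadPath(IndexReader reader, int docId) throws IOException {
    Document doc = reader.document(docId, Collections.singleton(":path"));
    return doc.get(":path");
  }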
I noticed 2 things; please excuse my ignorance if some of the things I
write here are not 100% correct:

 - all the fields in the document are decompressed before the field
filter is applied. We noticed this because we have a lot of content
stored in the index, so a significant amount of time is lost
decompressing data we never use. At one point I tried adding the field
first when building the document, thinking this would save some work,
but it doesn't look like it helps much. Reference code: the visitor is
only used at the very end. [0]

 - second, and probably of smaller impact: the
DocumentStoredFieldVisitor could return STOP once there are no more
fields left for the visitor to visit. I only have one field, and it
looks like the reader will #skip through a bunch of other fields before
finishing the document (see the sketch after this list). [1]
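
Here is that sketch; the class name is made up for this example, and
the signatures are the ones I see on trunk:

  import java.io.IOException;
  import org.apache.lucene.index.FieldInfo;
  import org.apache.lucene.index.StoredFieldVisitor;

  // Single-field visitor that answers STOP once its one field has been
  // seen, so the reader can stop skipping through the remaining stored
  // fields of the document.
  class SingleStringFieldVisitor extends StoredFieldVisitor {
    private final String fieldName;
    private String value;
    private boolean seen;

    SingleStringFieldVisitor(String fieldName) {
      this.fieldName = fieldName;
    }

    @Override
    public Status needsField(FieldInfo fieldInfo) throws IOException {
      if (seen) {
        return Status.STOP; // nothing left to collect for this document
      }
      return fieldName.equals(fieldInfo.name) ? Status.YES : Status.NO;
    }

    @Override
    public void stringField(FieldInfo fieldInfo, String value) throws IOException {
      this.value = value;
      this.seen = true;
    }

    String getValue() {
      return value;
    }
  }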

thanks in advance,
alex


[0]
https://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressingStoredFieldsReader.java?view=markup#l364

[1]
https://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/document/DocumentStoredFieldVisitor.java?view=markup#l100
