On Mon, 2019-08-26 at 16:01 +0000, Wittenberg, Lucas wrote:
>       @Override
>       public void collect(int docNumber) throws IOException {
>               if (null != this.reader &&
> isValid(this.reader.document(docNumber).get("customid")))
>               {
>                       super.collect(docNumber);
>               }
>       }
...
>       - It runs pretty fast on SOLR 4, with average QTime equals to 30.
>       - But now on SOLR 7, it is awfully slow with average QTime around 25000!

Lucene 4.0 did not compress stored fields by default and, as far as I
remember, Solr 7 forces compression across documents in 16KB blocks, so
reading a stored field for one document means decompressing (at least
part of) the block it lives in. My guess is that you see the effect of a
lot of decompression (although 25 seconds still seems excessive).
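
To make the guess above concrete, here is a toy model of block-compressed
storage. It is only a sketch and not Lucene's actual codec: the class
BlockStoreDemo and its methods are made up for illustration,
java.util.zip stands in for whatever compression Lucene uses, and the toy
always inflates the whole block where a real codec may be able to stop
early.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Toy model of block-compressed stored fields (hypothetical, not Lucene code). */
final class BlockStoreDemo {
    static final int BLOCK_SIZE = 16 * 1024; // raw bytes per block

    private final List<byte[]> blocks = new ArrayList<>();    // compressed blocks
    private final List<Integer> rawSizes = new ArrayList<>(); // uncompressed size per block
    private final List<int[]> docs = new ArrayList<>();       // per doc: {block, offset, length}
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    long bytesInflated = 0; // total bytes decompressed by get()

    /** Appends a document, starting a new block when the current one is full. */
    void add(String doc) {
        byte[] raw = doc.getBytes(StandardCharsets.UTF_8);
        if (pending.size() > 0 && pending.size() + raw.length > BLOCK_SIZE) {
            flush();
        }
        docs.add(new int[] {blocks.size(), pending.size(), raw.length});
        pending.write(raw, 0, raw.length);
    }

    /** Compresses the pending documents into one block. Call before get(). */
    void flush() {
        if (pending.size() == 0) {
            return;
        }
        byte[] raw = pending.toByteArray();
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        blocks.add(out.toByteArray());
        rawSizes.add(raw.length);
        pending.reset();
    }

    /** Returns one document; the whole block containing it is inflated first. */
    String get(int docId) {
        int[] loc = docs.get(docId);
        byte[] raw = new byte[rawSizes.get(loc[0])];
        Inflater inflater = new Inflater();
        inflater.setInput(blocks.get(loc[0]));
        try {
            int total = 0;
            while (total < raw.length) {
                int n = inflater.inflate(raw, total, raw.length - total);
                if (n == 0) {
                    break;
                }
                total += n;
            }
            bytesInflated += total;
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return new String(raw, loc[1], loc[2], StandardCharsets.UTF_8);
    }
}
```

In this toy, asking for a single document of about a hundred bytes costs
the inflation of a block of up to 16KB, and a collector that does so for
every hit pays that price once per hit.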

There are at least two things you can try:

1) State which field you need.

Instead of
  this.reader.document(docNumber).get("customid")
have a Set<String> containing only "customid" (called wantedFields
here) and call
  this.reader.document(docNumber, wantedFields).get("customid")


https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/index/IndexReader.html#document(int,%20java.util.Set)

I'm not sure if it will be much better with Solr 7 though, due to the
block-oriented compression.


2) Switch to using DocValues, as Erick suggests.

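You will have to add something like
  dv = DocValues.getSorted(context.reader(), "customid");
to your doSetNextReader method (with dv declared as a SortedDocValues
field of the collector, so that collect can see it) and
  dv.advanceExact(docNumber) &&
    isValid(dv.binaryValue().utf8ToString())
in your collect method. This requires customid to have docValues enabled
in the schema, which means a re-index if it does not already.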


https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/index/DocValues.html#getSorted-org.apache.lucene.index.LeafReader-java.lang.String-

If you want to speed it up further, you can use BytesRefs as keys in
your customMap instead of Strings, and avoid the .utf8ToString() call.
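
A rough sketch of that idea, kept free of Lucene so it runs on its own:
java.nio.ByteBuffer stands in for BytesRef here, since both compare and
hash by content. The class ByteKeyLookup, its methods and the Boolean
value type are assumptions for illustration; I do not know what your
customMap actually holds.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lookup keyed on raw UTF-8 bytes instead of Strings. */
final class ByteKeyLookup {
    // ByteBuffer is a stand-in for BytesRef: equals/hashCode use the byte content.
    private final Map<ByteBuffer, Boolean> customMap = new HashMap<>();

    /** Done once at setup: encode the id to UTF-8 and store it as the key. */
    void put(String customId, boolean valid) {
        byte[] utf8 = customId.getBytes(StandardCharsets.UTF_8);
        customMap.put(ByteBuffer.wrap(utf8), valid);
    }

    /** Done per document: look up the bytes directly, no String is created. */
    boolean isValid(byte[] bytes, int offset, int length) {
        Boolean valid = customMap.get(ByteBuffer.wrap(bytes, offset, length));
        return valid != null && valid;
    }
}
```

With Lucene on the classpath the map would be keyed on BytesRef and the
lookup would take dv.binaryValue() directly. Only look up with that
value; if you ever need to keep it as a key, copy it first (for example
with BytesRef.deepCopyOf), as the returned instance may be reused between
calls.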

- Toke Eskildsen, Royal Danish Library

