[ 
https://issues.apache.org/jira/browse/LUCENE-8585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16747950#comment-16747950
 ] 

Adrien Grand commented on LUCENE-8585:
--------------------------------------

[~toke] I was just playing with this new doc-value format and while it seems to 
perform generally as well as the previous format or better, it seems a bit 
slower when doing lots of calls to advanceExact by small gaps. I looked at how 
things get compiled and was able to fix the regression by moving early exits in 
rankSkip at the call site: because rankSkip is too large to be inlined, if you 
keep calling advanceExact with small gaps then you will always perform a method 
call only to figure out than rank skipping doesn't help. Moving the exit 
conditions to the call site avoids the method call. Here is what the patch 
looks like (I also removed private modifiers from members of this class as this 
would create synthetic accessors when used from the Method constants):

{noformat}
diff --git 
a/lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java 
b/lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
index 8ddb93e..520d1d4 100644
--- a/lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
+++ b/lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
@@ -254,13 +254,15 @@ final class IndexedDISI extends DocIdSetIterator {
     return (short)blockCount;
   }
 
+  // Members are pkg-private to avoid synthetic accessors when accessed from 
the `Method` enum
+
   /** The slice that stores the {@link DocIdSetIterator}. */
-  private final IndexInput slice;
-  private final int jumpTableEntryCount;
-  private final byte denseRankPower;
-  private final RandomAccessInput jumpTable; // Skip blocks of 64K bits
-  private final byte[] denseRankTable;
-  private final long cost;
+  final IndexInput slice;
+  final int jumpTableEntryCount;
+  final byte denseRankPower;
+  final RandomAccessInput jumpTable; // Skip blocks of 64K bits
+  final byte[] denseRankTable;
+  final long cost;
 
   /**
    * This constructor always creates a new blockSlice and a new jumpTable from 
in, to ensure that operations are
@@ -344,28 +346,28 @@ final class IndexedDISI extends DocIdSetIterator {
     }
   }
 
-  private int block = -1;
-  private long blockEnd;
-  private long denseBitmapOffset = -1; // Only used for DENSE blocks
-  private int nextBlockIndex = -1;
+  int block = -1;
+  long blockEnd;
+  long denseBitmapOffset = -1; // Only used for DENSE blocks
+  int nextBlockIndex = -1;
   Method method;
 
-  private int doc = -1;
-  private int index = -1;
+  int doc = -1;
+  int index = -1;
 
   // SPARSE variables
-  private boolean exists;
+  boolean exists;
 
   // DENSE variables
-  private long word;
-  private int wordIndex = -1;
+  long word;
+  int wordIndex = -1;
   // number of one bits encountered so far, including those of `word`
-  private int numberOfOnes;
+  int numberOfOnes;
   // Used with rank for jumps inside of DENSE as they are absolute instead of 
relative
-  private int denseOrigoIndex;
+  int denseOrigoIndex;
 
   // ALL variables
-  private int gap;
+  int gap;
 
   @Override
   public int docID() {
@@ -514,7 +516,11 @@ final class IndexedDISI extends DocIdSetIterator {
         final int targetWordIndex = targetInBlock >>> 6;
 
         // If possible, skip ahead using the rank cache
-        rankSkip(disi, target);
+        // If the distance between the current position and the target is < 
rank-longs
+        // there is no sense in using rank
+        if (disi.denseRankPower != -1 && targetWordIndex - disi.wordIndex >= 
(1 << (disi.denseRankPower-6) )) {
+          rankSkip(disi, targetInBlock);
+        }
 
         for (int i = disi.wordIndex + 1; i <= targetWordIndex; ++i) {
           disi.word = disi.slice.readLong();
@@ -550,7 +556,12 @@ final class IndexedDISI extends DocIdSetIterator {
         final int targetInBlock = target & 0xFFFF;
         final int targetWordIndex = targetInBlock >>> 6;
 
-        rankSkip(disi, target);
+        // If possible, skip ahead using the rank cache
+        // If the distance between the current position and the target is < 
rank-longs
+        // there is no sense in using rank
+        if (disi.denseRankPower != -1 && targetWordIndex - disi.wordIndex >= 
(1 << (disi.denseRankPower-6) )) {
+          rankSkip(disi, targetInBlock);
+        }
 
         for (int i = disi.wordIndex + 1; i <= targetWordIndex; ++i) {
           disi.word = disi.slice.readLong();
@@ -594,23 +605,11 @@ final class IndexedDISI extends DocIdSetIterator {
    * Note: This does not guarantee a skip up to target, only up to nearest 
rank boundary. It is the
    * responsibility of the caller to iterate further to reach target.
    * @param disi standard DISI.
-   * @param target the wanted docID for which to calculate set-flag and index.
+   * @param targetInBlock lower 16 bits of the target
    * @throws IOException if a DISI seek failed.
    */
-  private static void rankSkip(IndexedDISI disi, int target) throws 
IOException {
-    if (disi.denseRankPower == -1) { // No rank for the current structure
-      return;
-    }
-
-    final int targetInBlock = target & 0xFFFF;       // Lower 16 bits
-    final int targetWordIndex = targetInBlock >>> 6; // long: 2^6 = 64
-
-    // If the distance between the current position and the target is < 
rank-longs
-    // there is no sense in using rank
-    if (targetWordIndex - disi.wordIndex < (1 << (disi.denseRankPower-6) )) {
-      return;
-    }
-
+  private static void rankSkip(IndexedDISI disi, int targetInBlock) throws 
IOException {
+    assert disi.denseRankPower >= 0 : disi.denseRankPower;
     // Resolve the rank as close to targetInBlock as possible (maximum 
distance is 8 longs)
     // Note: rankOrigoOffset is tracked on block open, so it is absolute (e.g. 
don't add origo)
     final int rankIndex = targetInBlock >> disi.denseRankPower; // Default is 
9 (8 longs: 2^3 * 2^6 = 512 docIDs)
{noformat}

> Create jump-tables for DocValues at index-time
> ----------------------------------------------
>
>                 Key: LUCENE-8585
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8585
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: 8.0
>            Reporter: Toke Eskildsen
>            Priority: Minor
>              Labels: performance
>         Attachments: LUCENE-8585.patch, LUCENE-8585.patch, 
> make_patch_lucene8585.sh
>
>          Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> As noted in LUCENE-7589, lookup of DocValues should use jump-tables to avoid 
> long iterative walks. This is implemented in LUCENE-8374 at search-time 
> (first request for DocValues from a field in a segment), with the benefit of 
> working without changes to existing Lucene 7 indexes and the downside of 
> introducing a startup time penalty and a memory overhead.
> As discussed in LUCENE-8374, the codec should be updated to create these 
> jump-tables at index time. This eliminates the segment-open time & memory 
> penalties, with the potential downside of increasing index-time for DocValues.
> The three elements of LUCENE-8374 should be transferable to index-time 
> without much alteration of the core structures:
>  * {{IndexedDISI}} block offset and index skips: A {{long}} (64 bits) for 
> every 65536 documents, containing the offset of the block in 33 bits and the 
> index (number of set bits) up to the block in 31 bits.
>  It can be build sequentially and should be stored as a simple sequence of 
> consecutive longs for caching of lookups.
>  As it is fairly small, relative to document count, it might be better to 
> simply memory cache it?
>  * {{IndexedDISI}} DENSE (> 4095, < 65536 set bits) blocks: A {{short}} (16 
> bits) for every 8 {{longs}} (512 bits) for a total of 256 bytes/DENSE_block. 
> Each {{short}} represents the number of set bits up to right before the 
> corresponding sub-block of 512 docIDs.
>  The \{{shorts}} can be computed sequentially or when the DENSE block is 
> flushed (probably the easiest). They should be stored as a simple sequence of 
> consecutive shorts for caching of lookups, one logically independent sequence 
> for each DENSE block. The logical position would be one sequence at the start 
> of every DENSE block.
>  Whether it is best to read all the 16 {{shorts}} up front when a DENSE block 
> is accessed or whether it is best to only read any individual {{short}} when 
> needed is not clear at this point.
>  * Variable Bits Per Value: A {{long}} (64 bits) for every 16384 numeric 
> values. Each {{long}} holds the offset to the corresponding block of values.
>  The offsets can be computed sequentially and should be stored as a simple 
> sequence of consecutive {{longs}} for caching of lookups.
>  The vBPV-offsets has the largest space overhead og the 3 jump-tables and a 
> lot of the 64 bits in each long are not used for most indexes. They could be 
> represented as a simple {{PackedInts}} sequence or {{MonotonicLongValues}}, 
> with the downsides of a potential lookup-time overhead and the need for doing 
> the compression after all offsets has been determined.
> I have no experience with the codec-parts responsible for creating 
> index-structures. I'm quite willing to take a stab at this, although I 
> probably won't do much about it before January 2019. Should anyone else wish 
> to adopt this JIRA-issue or co-work on it, I'll be happy to share.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to