[GitHub] [lucene] uschindler commented on a change in pull request #327: LUCENE-10125: Speed up DirectWriter.

GitBox Thu, 30 Sep 2021 00:22:11 -0700


uschindler commented on a change in pull request #327:
URL: https://github.com/apache/lucene/pull/327#discussion_r719127442




##########
File path: lucene/core/src/java/org/apache/lucene/util/packed/DirectWriter.java
##########
@@ -94,38 +91,54 @@ private void flush() throws IOException {
     }
     // Avoid writing bits from values that are outside of the range we need to encode
     Arrays.fill(nextValues, off, nextValues.length, 0L);
-    encode(nextValues, 0, nextBlocks, 0, iterations);
+    encode(nextValues, off, nextBlocks, bitsPerValue);
     final int blockCount =
         (int) PackedInts.Format.PACKED.byteCount(PackedInts.VERSION_CURRENT, off, bitsPerValue);
     output.writeBytes(nextBlocks, blockCount);
     off = 0;
   }
 
-  public void encode(
-      long[] values, int valuesOffset, byte[] blocks, int blocksOffset, int iterations) {
-    int nextBlock = 0;
-    int bitsUsed = 0;
-    for (int i = 0; i < byteValueCount * iterations; ++i) {
-      final long v = values[valuesOffset++];
-      assert PackedInts.unsignedBitsRequired(v) <= bitsPerValue;
-      if (bitsUsed < byteOffset) {
-        // just buffer
-        nextBlock |= v << bitsUsed;
-        bitsUsed += bitsPerValue;
-      } else {
-        // flush as many blocks as possible
-        blocks[blocksOffset++] = (byte) (nextBlock | (v << bitsUsed));
-        int bits = 8 - bitsUsed;
-        while (bits <= bitsUsedOffset) {
-          blocks[blocksOffset++] = (byte) (v >> bits);
-          bits += 8;
+  private static void encode(long[] nextValues, int upTo, byte[] nextBlocks, int bitsPerValue) {
+    if ((bitsPerValue & 7) == 0) {
+      // bitsPerValue is a multiple of 8: 8, 16, 24, 32, 40, 48, 56, 64
+      final int bytesPerValue = bitsPerValue / Byte.SIZE;
+      for (int i = 0, o = 0; i < upTo; ++i, o += bytesPerValue) {

Review comment:
   Hi,
   It was too late yesterday to run any tests. I just drafted my idea and went to sleep.
   
   I don't know which benchmark you used. Was it lucenebench, and how did you invoke it? The taxidriver bench was completely new to me, and I have no idea how to start it. I can quickly run a benchmark, but I want comparable numbers, so it should be the same one you used.
   
   > It should hoist it as a loop constant.
   
   I hope so; I am just afraid that the code is too complex (4 branches). In addition: how many iterations does the loop have? If encode() is called often for loops with only a few elements, I am not sure it helps, because whenever the bit size changes between calls to encode(), the code will be deoptimized and optimization starts over again.
   
   Because of this I made the two drafts. I would now prefer #333 because of its cleaner code. Both should behave similarly.
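
For illustration only, here is a minimal sketch of what checking `bitsPerValue` once outside the loop can look like for the byte-aligned case. It is not the code from either draft or from the pull request: the class name `HoistedEncodeSketch`, the method name `encodeByteAligned`, and the little-endian byte order are assumptions made for this example.

```java
import java.util.Arrays;

public class HoistedEncodeSketch {
  // Hypothetical helper, not the Lucene implementation. It writes each value
  // as bitsPerValue / 8 bytes, least significant byte first (assumed order).
  // The bitsPerValue check runs once, before the loop, so the loop body does
  // not branch on it.
  static void encodeByteAligned(long[] values, int upTo, byte[] blocks, int bitsPerValue) {
    if ((bitsPerValue & 7) != 0) {
      throw new IllegalArgumentException("bitsPerValue must be a multiple of 8");
    }
    final int bytesPerValue = bitsPerValue / Byte.SIZE;
    for (int i = 0, o = 0; i < upTo; ++i, o += bytesPerValue) {
      final long v = values[i];
      for (int b = 0; b < bytesPerValue; ++b) {
        // Narrowing to byte keeps only the low 8 bits of the shifted value.
        blocks[o + b] = (byte) (v >>> (b * Byte.SIZE));
      }
    }
  }

  public static void main(String[] args) {
    long[] values = {0x0102L, 0xA0B0L};
    byte[] blocks = new byte[4];
    encodeByteAligned(values, values.length, blocks, 16);
    System.out.println(Arrays.toString(blocks)); // prints [2, 1, -80, -96]
  }
}
```

Whether the JIT performs the same hoisting on its own for the four-branch version is exactly the open question above, so a sketch like this is only useful as a baseline to benchmark against.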




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
