[GitHub] [lucene] vigyasharma commented on a diff in pull request #966: LUCENE-10619: Optimize the writeBytes in TermsHashPerField

GitBox Sat, 18 Jun 2022 17:54:32 -0700


vigyasharma commented on code in PR #966:
URL: https://github.com/apache/lucene/pull/966#discussion_r901031392



##########
lucene/core/src/java/org/apache/lucene/index/TermsHashPerField.java:
##########
@@ -230,9 +230,29 @@ final void writeByte(int stream, byte b) {
   }
 
   final void writeBytes(int stream, byte[] b, int offset, int len) {
-    // TODO: optimize
     final int end = offset + len;
-    for (int i = offset; i < end; i++) writeByte(stream, b[i]);
+    int streamAddress = streamAddressOffset + stream;
+    int upto = termStreamAddressBuffer[streamAddress];
+    byte[] slice = bytePool.buffers[upto >> ByteBlockPool.BYTE_BLOCK_SHIFT];
+    assert slice != null;
+    int sliceOffset = upto & ByteBlockPool.BYTE_BLOCK_MASK;
+
+    while (slice[sliceOffset] == 0 && offset < end) {
+      slice[sliceOffset++] = b[offset++];
+      (termStreamAddressBuffer[streamAddress])++;
+    }
+
+    while (offset < end) {
+      int offsetAndLength = bytePool.allocKnownSizeSlice(slice, sliceOffset);
+      sliceOffset = offsetAndLength >> 8;
+      int sliceLength = offsetAndLength & 0xff;
+      slice = bytePool.buffer;
+      int writeLength = Math.min(sliceLength - 1, end - offset);
+      System.arraycopy(b, offset, slice, sliceOffset, writeLength);
+      sliceOffset += writeLength;
+      offset += writeLength;
+      termStreamAddressBuffer[streamAddress] = sliceOffset + 
bytePool.byteOffset;
+    }

Review Comment:
   Wow, this is an interesting change! Can you help me better understand this 
optimization? 
   
   Is `System.arraycopy()` much faster than manually copying like in your while 
loop above. I would guess it is similar since there are no extra checks for 
primitive or object type [[ref: fast array copy 
comparison](https://www.javaspecialists.eu/archive/Issue124-Copying-Arrays-Fast.html)],
 but I'm not sure.
   This is probably better than calling `writeByte()` in a loop because we 
don't need to fetch all the pointers every time. 
   
   A follow up idea: I think `allocSlice()` (or `allocKnownSizeSlice()` here), 
always returns the next higher sized buffer slice.. Maybe `allocKnownSizeSlice` 
could instead return a slice appropriate for the length of pending items we 
need to write `(end - offset)`? We could then avoid having to call alloc 
multiple times for larger writes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [lucene] vigyasharma commented on a diff in pull request #966: LUCENE-10619: Optimize the writeBytes in TermsHashPerField

Reply via email to