jpountz commented on code in PR #966:
URL: https://github.com/apache/lucene/pull/966#discussion_r918804129


##########
lucene/core/src/java/org/apache/lucene/index/TermsHashPerField.java:
##########
@@ -230,9 +230,29 @@ final void writeByte(int stream, byte b) {
   }
 
   final void writeBytes(int stream, byte[] b, int offset, int len) {
-    // TODO: optimize
     final int end = offset + len;
-    for (int i = offset; i < end; i++) writeByte(stream, b[i]);
+    int streamAddress = streamAddressOffset + stream;
+    int upto = termStreamAddressBuffer[streamAddress];
+    byte[] slice = bytePool.buffers[upto >> ByteBlockPool.BYTE_BLOCK_SHIFT];
+    assert slice != null;
+    int sliceOffset = upto & ByteBlockPool.BYTE_BLOCK_MASK;
+
+    while (slice[sliceOffset] == 0 && offset < end) {
+      slice[sliceOffset++] = b[offset++];
+      (termStreamAddressBuffer[streamAddress])++;
+    }

Review Comment:
   Maybe in the future we could optimize this loop a bit too, by using 
`Arrays#mismatch` against an array that is full of zeroes to find how many 
free (zero) bytes remain in the slice, rather than testing them one at a time.
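
A rough sketch of that idea, not code from this PR: it assumes free slice bytes are zero and that the first non-zero byte marks the end of the writable run, as the `while` loop above implies. `ZeroRunSketch`, `ZEROES`, `zeroRun` and `copyIntoFreeRun` are hypothetical names, and the size of the zero array is an arbitrary choice.

```java
import java.util.Arrays;

final class ZeroRunSketch {
  // Hypothetical shared all-zero array; 1024 is an arbitrary size for this sketch.
  private static final byte[] ZEROES = new byte[1024];

  /**
   * Returns the number of consecutive zero bytes starting at slice[from],
   * looking at most max bytes ahead (and never more than ZEROES.length).
   */
  static int zeroRun(byte[] slice, int from, int max) {
    int len = Math.min(max, Math.min(slice.length - from, ZEROES.length));
    // mismatch returns the relative index of the first differing byte,
    // or -1 when the two ranges are identical (here: all zeroes).
    int mismatch = Arrays.mismatch(slice, from, from + len, ZEROES, 0, len);
    return mismatch == -1 ? len : mismatch;
  }

  /** Copies as many bytes of src as fit into the zero run; returns the count copied. */
  static int copyIntoFreeRun(byte[] slice, int sliceOffset, byte[] src, int offset, int len) {
    int n = zeroRun(slice, sliceOffset, len);
    System.arraycopy(src, offset, slice, sliceOffset, n);
    return n;
  }
}
```

Because the run length is capped at `ZEROES.length`, a caller would still need to loop, and to advance `termStreamAddressBuffer[streamAddress]` by the returned count. Whether this beats the byte-by-byte loop for typical slice sizes would need a benchmark.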



##########
lucene/core/src/test/org/apache/lucene/index/TestTermsHashPerField.java:
##########
@@ -298,4 +299,23 @@ class Posting {
       assertTrue("the last posting must be EOF on the reader", eof);
     }
   }
+
+  public void testWriteBytes() throws IOException {
+    for (int i = 0; i < 100; i++) {
+      AtomicInteger newCalled = new AtomicInteger(0);
+      AtomicInteger addCalled = new AtomicInteger(0);
+      TermsHashPerField hash = createNewHash(newCalled, addCalled);
+      hash.start(null, true);
+      hash.add(newBytesRef("start"), 0); // tid = 0;
+      int size = TestUtil.nextInt(random(), 50000, 100000);
+      byte[] randomData = new byte[size];
+      random().nextBytes(randomData);
+      hash.writeBytes(0, randomData, 0, randomData.length);

Review Comment:
   Maybe change this to write small chunks at a time, to better exercise the 
case where a write starts in the middle or at the end of a slice?
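
Something along these lines, as a sketch only: `ChunkedWriteSketch`, `ByteSink` and `writeInChunks` are hypothetical names, with `ByteSink` standing in for a call such as `hash.writeBytes(0, b, offset, len)`. It uses `java.util.Random` so it stays self-contained; the real test would use `random()` from the test framework.

```java
import java.util.Random;

final class ChunkedWriteSketch {
  /** Hypothetical stand-in for a call such as hash.writeBytes(0, b, offset, len). */
  interface ByteSink {
    void writeBytes(byte[] b, int offset, int len);
  }

  /**
   * Feeds data to the sink in chunks of 1..maxChunk bytes, so that successive
   * writes start at varying positions. Returns the number of chunks written.
   */
  static int writeInChunks(ByteSink sink, byte[] data, int maxChunk, Random random) {
    int offset = 0;
    int chunks = 0;
    while (offset < data.length) {
      // Chunk length is at least 1 and never runs past the end of data.
      int len = Math.min(1 + random.nextInt(maxChunk), data.length - offset);
      sink.writeBytes(data, offset, len);
      offset += len;
      chunks++;
    }
    return chunks;
  }
}
```

In the test this could replace the single call with `writeInChunks((b, o, l) -> hash.writeBytes(0, b, o, l), randomData, maxChunk, random())`, with `maxChunk` chosen small enough that many writes begin mid-slice or right at a slice boundary.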



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

