vigyasharma commented on code in PR #966:
URL: https://github.com/apache/lucene/pull/966#discussion_r901050635
##########
lucene/core/src/java/org/apache/lucene/index/TermsHashPerField.java:
##########
@@ -230,9 +230,29 @@ final void writeByte(int stream, byte b) {
}
final void writeBytes(int stream, byte[] b, int offset, int len) {
- // TODO: optimize
final int end = offset + len;
- for (int i = offset; i < end; i++) writeByte(stream, b[i]);
+ int streamAddress = streamAddressOffset + stream;
+ int upto = termStreamAddressBuffer[streamAddress];
+ byte[] slice = bytePool.buffers[upto >> ByteBlockPool.BYTE_BLOCK_SHIFT];
+ assert slice != null;
+ int sliceOffset = upto & ByteBlockPool.BYTE_BLOCK_MASK;
+
+ while (slice[sliceOffset] == 0 && offset < end) {
+ slice[sliceOffset++] = b[offset++];
+ (termStreamAddressBuffer[streamAddress])++;
+ }
+
+ while (offset < end) {
+ int offsetAndLength = bytePool.allocKnownSizeSlice(slice, sliceOffset);
+ sliceOffset = offsetAndLength >> 8;
+ int sliceLength = offsetAndLength & 0xff;
+ slice = bytePool.buffer;
+ int writeLength = Math.min(sliceLength - 1, end - offset);
+ System.arraycopy(b, offset, slice, sliceOffset, writeLength);
+ sliceOffset += writeLength;
+ offset += writeLength;
+ termStreamAddressBuffer[streamAddress] = sliceOffset +
bytePool.byteOffset;
+ }
Review Comment:
Ah, I see what you are saying. This is some deep coupling between the
`ByteSliceReader` and the `ByteBlockPool`.
`ByteSliceReader.nextSlice()` just assumes that the slice is the next higher
level in `ByteBlockPool.NEXT_LEVEL_ARRAY`. Wonder if we could set the
`ByteSliceReader.level` to the right value before calling `nextSlice()`. The
buffer seems to contain the next level value [in its last few
bits](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/ByteBlockPool.java#L268-L269).
`allocKnownSizeSlice()` would then return the next higher slice size for
`(end-offset)` in `LEVEL_SIZE_ARRAY`.
It might be a more intrusive change though. ByteBlockPool docstring calls
out that it allocates *"slices of increasing lengths"*, so there might be other
places that hold this expectation. Maybe worth playing around in a separate
spin off PR.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]