mayankshriv commented on a change in pull request #4321: #4317 Feature/variable
length bytes offline dictionary for indexing bytes and string dicts.
URL: https://github.com/apache/incubator-pinot/pull/4321#discussion_r295596950
##########
File path:
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/SegmentDictionaryCreator.java
##########
@@ -211,6 +203,39 @@ public void build()
}
}
+ private void writeBytesValueDictionary(int numValues, byte[][] sortedByteArrays) throws IOException {
+
+ if (_useVarLengthDictionary) {
+ // Backward-compatible: index file is always big-endian
+ long size = VarLengthBytesValueReaderWriter.getRequiredSize(sortedByteArrays);
+ try (PinotDataBuffer dataBuffer = PinotDataBuffer
+ .mapFile(_dictionaryFile, false, 0, size, ByteOrder.BIG_ENDIAN,
+ getClass().getSimpleName());
+ VarLengthBytesValueReaderWriter writer = new VarLengthBytesValueReaderWriter(
+ dataBuffer)) {
+ writer.init(sortedByteArrays);
Review comment:
Is there a reason `init` also performs the write (unlike the existing
implementation)? It makes the caller-side code harder to read: readers will
wonder why only `init` is called, and will have to open `init` to discover
that it also writes the values.
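
To illustrate the concern, here is a minimal sketch of a writer whose method
name states that it writes, so the call site reads as `writer.write(values)`
rather than `writer.init(values)`. It is not the Pinot implementation: the
class `VarLengthWriterSketch`, its methods, and the buffer layout (a count,
then offsets, then the raw bytes) are all hypothetical, and a plain
`java.nio.ByteBuffer` stands in for `PinotDataBuffer`.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for a variable-length bytes dictionary writer.
// Assumed layout: [count][offset_0 .. offset_n][value bytes], big-endian ints.
final class VarLengthWriterSketch {
  private final ByteBuffer _buffer;

  VarLengthWriterSketch(ByteBuffer buffer) {
    // order() returns the same buffer with the byte order set.
    _buffer = buffer.order(ByteOrder.BIG_ENDIAN);
  }

  // Header holds the count plus (n + 1) offsets; the payload follows it.
  static int getRequiredSize(byte[][] values) {
    int size = Integer.BYTES * (values.length + 2);
    for (byte[] value : values) {
      size += value.length;
    }
    return size;
  }

  // Named for what it does: writes the header and then every value.
  void write(byte[][] values) {
    int offset = Integer.BYTES * (values.length + 2);
    _buffer.putInt(0, values.length);
    int pos = Integer.BYTES;
    for (byte[] value : values) {
      _buffer.putInt(pos, offset);
      pos += Integer.BYTES;
      offset += value.length;
    }
    _buffer.putInt(pos, offset); // end offset of the last value
    pos += Integer.BYTES;
    for (byte[] value : values) {
      for (byte b : value) {
        _buffer.put(pos++, b);
      }
    }
  }

  // Reads a value back using two adjacent offsets.
  byte[] get(int index) {
    int start = _buffer.getInt(Integer.BYTES * (index + 1));
    int end = _buffer.getInt(Integer.BYTES * (index + 2));
    byte[] out = new byte[end - start];
    for (int i = 0; i < out.length; i++) {
      out[i] = _buffer.get(start + i);
    }
    return out;
  }
}
```

With this shape the caller allocates a buffer of `getRequiredSize(values)`
bytes and calls `write(values)`, and the intent is visible without opening the
method. Renaming is only one option; splitting the header setup and the value
writing into two calls would address the same readability concern.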