Jackie-Jiang commented on code in PR #18470:
URL: https://github.com/apache/pinot/pull/18470#discussion_r3223642654
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/InvertedIndexAndDictionaryBasedForwardIndexCreator.java:
##########
@@ -357,38 +330,42 @@ private Map<String, String> createForwardIndexForMVColumn()
}
// Construct the forward index values buffer from the inverted index using the length buffer for index tracking
+ DataType storedType = _columnMetadata.getStoredType();
+ boolean isFixedWidth = storedType.isFixedWidth();
+ int maxRowLengthInBytes = isFixedWidth ? maxNumberOfMultiValues * storedType.size() : 0;
for (int dictId = 0; dictId < _cardinality; dictId++) {
ImmutableRoaringBitmap docIdsBitmap = invertedIndexReader.getDocIds(dictId);
- int finalDictId = dictId;
- docIdsBitmap.stream().forEach(docId -> {
+ PeekableIntIterator intIterator = docIdsBitmap.getIntIterator();
+ while (intIterator.hasNext()) {
+ int docId = intIterator.next();
int index = getInt(_forwardIndexLengthBuffer, docId);
- putInt(_forwardIndexValueBuffer, index, finalDictId);
+ putInt(_forwardIndexValueBuffer, index, dictId);
putInt(_forwardIndexLengthBuffer, docId, index + 1);
if (!isFixedWidth) {
- trackMaxRowLengthInBytes(dictionary, maxRowLengthInBytes, docId, finalDictId);
+ int currentRowLength = getInt(_forwardIndexMaxSizeBuffer, docId);
+ int newRowLength = currentRowLength + dictionary.getValueSize(dictId);
+ putInt(_forwardIndexMaxSizeBuffer, docId, newRowLength);
+ maxRowLengthInBytes = Math.max(maxRowLengthInBytes, newRowLength);
}
- });
+ }
}
- IndexCreationContext context = IndexCreationContext.builder()
- .withIndexDir(_segmentMetadata.getIndexDir())
- .withColumnMetadata(_columnMetadata)
- .withForwardIndexDisabled(false)
- .withDictionary(_dictionaryPresent)
+ // When the duplicate-detection branch above fires, `_nextValueId` and `maxNumberOfMultiValues[0]` are the new
Review Comment:
Updated the comment to drop the `[0]`.
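
For context, a minimal sketch of why the `[0]` went stale. It assumes the old lambda-based loop kept its running values in one-element arrays, because a lambda can only capture effectively final locals (the same reason the removed `finalDictId` copy existed). The class and method names below are hypothetical and use plain JDK streams as a stand-in for the Roaring bitmap iteration; this is not the Pinot code.

```java
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

// Hypothetical illustration only; not part of the PR.
public class MaxRowLengthSketch {

  // Lambda style: a captured local must be effectively final, so the
  // running maximum is held in a one-element array and read as max[0].
  static int maxWithLambda(int[] rowLengths) {
    int[] max = {0};
    IntStream.of(rowLengths).forEach(len -> max[0] = Math.max(max[0], len));
    return max[0];
  }

  // Iterator style: an ordinary loop can reassign a plain local, so the
  // array wrapper (and any "[0]" in comments) is no longer needed.
  static int maxWithIterator(int[] rowLengths) {
    int max = 0;
    PrimitiveIterator.OfInt it = IntStream.of(rowLengths).iterator();
    while (it.hasNext()) {
      max = Math.max(max, it.nextInt());
    }
    return max;
  }

  public static void main(String[] args) {
    int[] rowLengths = {3, 7, 5};
    System.out.println(maxWithLambda(rowLengths));
    System.out.println(maxWithIterator(rowLengths));
  }
}
```

Both methods return the same maximum; only the second lets the value be a plain `int`, which matches how `maxNumberOfMultiValues` and `maxRowLengthInBytes` are used in the hunk above.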
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]