mcvsubbu commented on a change in pull request #4791: Support STRING and BYTES
for no dictionary columns in realtime consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4791#discussion_r343787566
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
##########
@@ -187,12 +188,16 @@ public long getLatestIngestionTimestamp() {
// Only support generating raw index on single-value non-string columns that do not have inverted index while
// consuming. After consumption completes and the segment is built, all single-value columns can have raw index
FieldSpec.DataType dataType = fieldSpec.getDataType();
- int indexColumnSize = FieldSpec.DataType.INT.size();
+ int forwardIndexColumnSize;
if (noDictionaryColumns.contains(column) && fieldSpec.isSingleValueField()
- && dataType != FieldSpec.DataType.STRING && !invertedIndexColumns.contains(column)) {
- // No dictionary
- indexColumnSize = dataType.size();
+ && !invertedIndexColumns.contains(column)) {
+ // No dictionary -- size will be equal to size of data
Review comment:
This means that we now support no dictionary for all column types. That seems
OK, but I am worried that if another column type is added that needs some
special logic, it may be hard to locate this place to change it. Short of
introducing a method such as isNoDictionarySupportedForColumnType(), I am not
sure what else can be done. We can add the isSingleValueField() check inside
the new method, though.
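
For illustration only, a minimal sketch of what such a helper might look like.
Everything here is hypothetical: the class name NoDictionarySupport, the
stand-in DataType enum (not Pinot's FieldSpec.DataType), and the method
signature are assumptions, not code from this PR. The idea is that supported
types are listed explicitly, so a newly added type falls through to the
default branch until someone decides how to handle it.

```java
// Hypothetical sketch: not part of Pinot. DataType is a stand-in enum,
// used so the example compiles on its own.
final class NoDictionarySupport {
  enum DataType { INT, LONG, FLOAT, DOUBLE, STRING, BYTES }

  private NoDictionarySupport() {
  }

  // Returns true if a column of this type can be stored without a dictionary
  // while the segment is consuming. The single-value check lives here too,
  // as suggested in the comment above.
  static boolean isNoDictionarySupportedForColumnType(DataType dataType, boolean isSingleValue) {
    if (!isSingleValue) {
      return false;
    }
    switch (dataType) {
      case INT:
      case LONG:
      case FLOAT:
      case DOUBLE:
      case STRING:
      case BYTES:
        return true;
      default:
        // A type added later lands here until it is listed explicitly.
        return false;
    }
  }
}
```

The call site would then presumably combine this with the existing
noDictionaryColumns and invertedIndexColumns checks, which are left out of
the sketch.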