mcvsubbu commented on a change in pull request #4791: Support STRING and BYTES 
for no dictionary columns in realtime consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4791#discussion_r343797508
 
 

 ##########
 File path: pinot-core/src/main/java/org/apache/pinot/core/realtime/converter/stats/RealtimeNoDictionaryColStatistics.java
 ##########
 @@ -63,12 +68,52 @@ public int getCardinality() {
 
   @Override
   public int getLengthOfShortestElement() {
-    return lengthOfDataType(); // Only fixed length data types supported.
+    FieldSpec.DataType dataType = _blockValSet.getValueType();
+    if (dataType == FieldSpec.DataType.STRING || dataType == FieldSpec.DataType.BYTES) {
+      // variable width no dictionary columns
+      int minLength = Integer.MAX_VALUE;
+      BaseSingleColumnSingleValueReaderWriter readerWriter = (BaseSingleColumnSingleValueReaderWriter)_forwardIndex;
 
 Review comment:
   Can we keep track of the shortest and longest elements in the fwd index and just 
read them here? That would save time as well as garbage generation during segment build.
   
   We can definitely compute min and max in one pass, instead of walking over 
the fwd index separately for each of them.
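
   A minimal sketch of what this suggestion could look like, assuming the fwd 
index can call a small tracker every time it appends a value. `LengthTracker` 
and its method names are hypothetical and not part of the Pinot code base; 
returning 0 for the shortest length of an empty column is also an assumption.

```java
/**
 * Hypothetical helper (not an existing Pinot class): records the shortest and
 * longest value lengths as values are appended, so the stats collector can
 * read them later without re-scanning the forward index.
 */
final class LengthTracker {
  private int minLength = Integer.MAX_VALUE;
  private int maxLength = 0;
  private int numValues = 0;

  /** Call once per appended value; updates both bounds in a single pass. */
  void record(byte[] value) {
    int length = value.length;
    if (length < minLength) {
      minLength = length;
    }
    if (length > maxLength) {
      maxLength = length;
    }
    numValues++;
  }

  /** Shortest length seen so far; 0 if nothing was recorded (assumption). */
  int getLengthOfShortestElement() {
    return numValues == 0 ? 0 : minLength;
  }

  /** Longest length seen so far; 0 if nothing was recorded. */
  int getLengthOfLongestElement() {
    return maxLength;
  }
}
```

   With something like this, getLengthOfShortestElement() and 
getLengthOfLongestElement() in the stats class would become plain reads, with 
no per-value allocation at segment build time.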

