kfaraz commented on code in PR #13133:
URL: https://github.com/apache/druid/pull/13133#discussion_r976604210


##########
processing/src/main/java/org/apache/druid/segment/data/GenericIndexed.java:
##########
@@ -826,4 +862,28 @@ public void inspectRuntimeShape(RuntimeShapeInspector inspector)
       }
     };
   }
+
+  public class ValueWithIndex

Review Comment:
   I guess it would be cleaner to just use a `ListIterator`, which provides `nextIndex()`.
   You wouldn't be able to peek the next index though, and you might have to work around that.
   (That could be easier to do if we go with @FrankChen021's suggestion to separate the two kinds of searches into two different iterables.)
   
   Another alternative could be to just use `Pair`, but I am not a fan of it.
   
   If you do decide to use this class, however, I would suggest making it a top-level class in `druid-core/org.apache.druid.java.util.common`, as other parts of the code might have similar requirements.
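
A minimal sketch of the `ListIterator` idea, using plain `java.util` collections rather than Druid's `GenericIndexed`. The class name `ListIteratorIndexSketch` and the helpers `indexedValues` and `peekNext` are hypothetical and only illustrate the approach; whether the iterators in this PR can support `previous()` is an assumption that would need checking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class ListIteratorIndexSketch
{
  // Hypothetical helper: pairs each value with its index without a wrapper class.
  public static List<String> indexedValues(List<String> values)
  {
    List<String> out = new ArrayList<>();
    ListIterator<String> it = values.listIterator();
    while (it.hasNext()) {
      // nextIndex() must be read before next(), since next() advances the cursor.
      int index = it.nextIndex();
      String value = it.next();
      out.add(index + "=" + value);
    }
    return out;
  }

  // Hypothetical workaround for peeking: advance, then step back.
  // Assumes the iterator supports previous(), as java.util list iterators do.
  public static <T> T peekNext(ListIterator<T> it)
  {
    T value = it.next();
    it.previous();
    return value;
  }

  public static void main(String[] args)
  {
    List<String> values = Arrays.asList("a", "b", "c");
    System.out.println(indexedValues(values));

    ListIterator<String> it = values.listIterator();
    // The cursor is unchanged after peekNext, so nextIndex() still refers to the peeked element.
    System.out.println(peekNext(it) + " at index " + it.nextIndex());
  }
}
```

The step-back in `peekNext` is the kind of workaround mentioned above; an iterator that cannot move backwards would need a one-element buffer instead.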



##########
processing/src/main/java/org/apache/druid/segment/serde/DictionaryEncodedStringIndexSupplier.java:
##########
@@ -280,15 +287,35 @@ public ImmutableBitmap next()
 
             private void findNext()
             {
-              while (next < 0 && iterator.hasNext()) {
-                ByteBuffer nextValue = iterator.next();
-                next = dictionary.indexOf(nextValue);
-
-                if (next == -dictionarySize - 1) {
-                  // nextValue is past the end of the dictionary.
-                  // Note: we can rely on indexOf returning (-(insertion point) - 1), even though Indexed doesn't
-                  // guarantee it, because "dictionary" comes from GenericIndexed singleThreaded().
-                  break;
+              // if the size of in-filter values is less than the threshold percentage of dictionary size, then use binary search
+              // based lookup per value. The algorithm works well for smaller number of values.
+              if (size < SORTED_MERGE_RATIO_THRESHOLD * dictionary.size()) {

Review Comment:
   Yes, that would be much more readable.
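
For context, a rough sketch of how the two lookup strategies selected by the threshold check in the diff above could be split into separate methods. This is not the PR's code: it uses plain sorted `List<String>` inputs instead of Druid's dictionary types, the names `LookupStrategySketch`, `findByBinarySearch`, and `findBySortedMerge` are hypothetical, and the threshold value is a placeholder because the real constant is not shown in this thread. Both inputs are assumed to be sorted and free of duplicates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LookupStrategySketch
{
  // Placeholder value (assumption); the actual constant is not visible in this thread.
  static final double SORTED_MERGE_RATIO_THRESHOLD = 0.12;

  // Picks a strategy based on how many values there are relative to the dictionary size.
  public static List<Integer> findIndexes(List<String> values, List<String> dictionary)
  {
    if (values.size() < SORTED_MERGE_RATIO_THRESHOLD * dictionary.size()) {
      return findByBinarySearch(values, dictionary);
    }
    return findBySortedMerge(values, dictionary);
  }

  // One binary search per value: O(m log n) for m values and n dictionary entries.
  public static List<Integer> findByBinarySearch(List<String> values, List<String> dictionary)
  {
    List<Integer> out = new ArrayList<>();
    for (String value : values) {
      int index = Collections.binarySearch(dictionary, value);
      if (index >= 0) {
        out.add(index);
      }
    }
    return out;
  }

  // Single pass over both sorted inputs: O(m + n).
  public static List<Integer> findBySortedMerge(List<String> values, List<String> dictionary)
  {
    List<Integer> out = new ArrayList<>();
    int i = 0;
    int j = 0;
    while (i < values.size() && j < dictionary.size()) {
      int cmp = values.get(i).compareTo(dictionary.get(j));
      if (cmp == 0) {
        out.add(j);
        i++;
        j++;
      } else if (cmp < 0) {
        i++; // value is absent from the dictionary
      } else {
        j++; // dictionary entry was not requested
      }
    }
    return out;
  }

  public static void main(String[] args)
  {
    List<String> dictionary = Arrays.asList("a", "b", "c", "d", "e");
    List<String> values = Arrays.asList("b", "d", "x");
    System.out.println(findIndexes(values, dictionary));
  }
}
```

Under the stated assumptions both methods return the same indexes, so the threshold only affects cost, not results.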



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

