abhishekagarwal87 commented on a change in pull request #11068:
URL: https://github.com/apache/druid/pull/11068#discussion_r607595447



##########
File path: 
processing/src/main/java/org/apache/druid/query/lookup/LookupExtractor.java
##########
@@ -105,13 +106,25 @@
    */
   public abstract boolean canIterate();
 
+  /**
+   * Returns true if this lookup extractor's {@link #keySet()} method will 
return a valid set.
+   */
+  public abstract boolean canGetKeySet();
+
   /**
    * Returns an Iterable that iterates over the keys and values in this lookup 
extractor.
    *
    * @throws UnsupportedOperationException if {@link #canIterate()} returns 
false.
    */
   public abstract Iterable<Map.Entry<String, String>> iterable();
 
+  /**
+   * Returns a Set of all keys in this lookup extractor. The returned Set will 
not change.
+   *
+   * @throws UnsupportedOperationException if {@link #canIterate()} returns 
false.

Review comment:
       ```suggestion
      * @throws UnsupportedOperationException if {@link #canGetKeySet()} 
returns false.
   ```

##########
File path: 
processing/src/main/java/org/apache/druid/segment/join/table/IndexedTableJoinable.java
##########
@@ -88,6 +90,42 @@ public JoinMatcher makeJoinMatcher(
     );
   }
 
+  @Override
+  public Optional<Set<String>> getNonNullColumnValuesIfAllUnique(final String 
columnName, final int maxNumValues)
+  {
+    final int columnPosition = table.rowSignature().indexOf(columnName);
+
+    if (columnPosition < 0) {
+      return Optional.empty();
+    }
+
+    try (final IndexedTable.Reader reader = 
table.columnReader(columnPosition)) {
+      // Sorted set to encourage "in" filters that result from this method to 
do dictionary lookups in order.
+      // The hopes are that this will improve locality and therefore improve 
performance.
+      final Set<String> allValues = new TreeSet<>();
+
+      for (int i = 0; i < table.numRows(); i++) {
+        final String s = 
DimensionHandlerUtils.convertObjectToString(reader.read(i));
+
+        if (s != null) {

Review comment:
       do we need not exclude empty strings here? 

##########
File path: 
processing/src/main/java/org/apache/druid/segment/join/JoinableFactoryWrapper.java
##########
@@ -84,22 +100,48 @@ public JoinableFactoryWrapper(final JoinableFactory 
joinableFactory)
             return Function.identity();
           } else {
             final JoinableClauses joinableClauses = 
JoinableClauses.createClauses(clauses, joinableFactory);
+            final JoinFilterRewriteConfig filterRewriteConfig = 
JoinFilterRewriteConfig.forQuery(query);
+
+            // Pick off any join clauses that can be converted into filters.
+            final Set<String> requiredColumns = query.getRequiredColumns();
+            final Filter baseFilterToUse;
+            final List<JoinableClause> clausesToUse;
+
+            if (requiredColumns != null && 
filterRewriteConfig.isEnableRewriteJoinToFilter()) {
+              final Pair<List<Filter>, List<JoinableClause>> conversionResult 
= convertJoinsToFilters(
+                  joinableClauses.getJoinableClauses(),
+                  requiredColumns,
+                  
Ints.checkedCast(Math.min(filterRewriteConfig.getFilterRewriteMaxSize(), 
Integer.MAX_VALUE))

Review comment:
       maybe not here but should we cap the max size so that it can't result in 
an OOM?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to