cryptoe commented on code in PR #16068:
URL: https://github.com/apache/druid/pull/16068#discussion_r1568682683


##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteGroupByQueryTest.java:
##########
@@ -0,0 +1,24 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite;

Review Comment:
   Why is this class required ?



##########
processing/src/main/java/org/apache/druid/query/dimension/ColumnSelectorStrategyFactory.java:
##########
@@ -24,5 +24,8 @@
 
 public interface ColumnSelectorStrategyFactory<ColumnSelectorStrategyClass 
extends ColumnSelectorStrategy>
 {
-  ColumnSelectorStrategyClass makeColumnSelectorStrategy(ColumnCapabilities 
capabilities, ColumnValueSelector selector);
+  ColumnSelectorStrategyClass makeColumnSelectorStrategy(ColumnCapabilities 
capabilities, ColumnValueSelector selector, String dimension);
+
+  // TODO(laksh): Javadoc

Review Comment:
   Looks like we need java docs here. 



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/column/DictionaryBuildingGroupByColumnSelectorStrategy.java:
##########
@@ -0,0 +1,203 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.column;
+
+import it.unimi.dsi.fastutil.objects.Object2IntMap;
+import org.apache.druid.error.DruidException;
+import org.apache.druid.query.groupby.epinephelinae.DictionaryBuildingUtils;
+import org.apache.druid.segment.column.ColumnType;
+import org.apache.druid.segment.column.NullableTypeStrategy;
+
+import javax.annotation.concurrent.NotThreadSafe;
+import java.util.List;
+
+/**
+ * Strategy for grouping dimensions which can have variable-width objects, and 
aren't backed by prebuilt dictionaries. It
+ * encapsulates the dictionary building logic, along with providing the 
implementations for dimension to dictionary id
+ * encoding-decoding.
+ * <p>
+ * This strategy can handle any dimension that can be addressed on a 
reverse-dictionary. Reverse dictionary uses
+ * a sorted map, rather than a hashmap.
+ * <p>
+ * This is the most expensive of all the strategies, and hence must be used 
only when other strategies aren't valid.
+ */
+@NotThreadSafe
+public class DictionaryBuildingGroupByColumnSelectorStrategy<DimensionType>
+    extends KeyMappingGroupByColumnSelectorStrategy<DimensionType>
+{
+
+  /**
+   * Dictionary for mapping the dimension value to an index. i-th position in 
the dictionary holds the value represented
+   * by the dictionaryId "i".
+   * Therefore, if a value has a dictionary id "i", dictionary.get(i) = value

Review Comment:
   Can you add an example for this as well. 



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/column/MemoryEstimate.java:
##########
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.column;
+
+/**
+ * Holder for a value and the memory increase in the internal dictionary 
associated with the increase
+ */
+public class MemoryEstimate<T>
+{
+  private final T value;
+  private final int memoryIncrease;
+
+  // Reduced visibility
+  MemoryEstimate(T value, int memoryIncrease)
+  {
+    this.value = value;
+    this.memoryIncrease = memoryIncrease;

Review Comment:
   lets cal it memory footprint. 



##########
processing/src/main/java/org/apache/druid/segment/column/TypeStrategies.java:
##########
@@ -544,5 +622,47 @@ public int compare(@Nullable Object o1Obj, @Nullable 
Object o2Obj)
       }
       return Integer.compare(o1.length, o2.length);
     }
+
+    /**
+     * Implements {@link Arrays#hashCode(Object[])} but the element hashing 
uses the element's type strategy
+     */
+    @Override
+    public int hashCode(Object[] o)
+    {
+      if (o == null) {
+        return 0;
+      } else {
+        int result = 1;
+        for (Object element : o) {
+          result = 31 * result + (element == null ? 0 : 
elementStrategy.hashCode(element));
+        }
+        return result;
+      }
+    }
+    /**
+     * Implements {@link Arrays#equals} but the element equality uses the 
element's type strategy
+     */
+    @Override
+    public boolean equals(@Nullable Object[] a, @Nullable Object[] b)

Review Comment:
   Lets have Ut's for this. 



##########
processing/src/main/java/org/apache/druid/segment/column/TypeStrategy.java:
##########
@@ -170,4 +172,30 @@ default T fromBytes(byte[] value)
   {
     throw new IllegalStateException("Not supported");
   }
+
+  /**
+   * Whether the type is groupable or not. This is always true for all the 
primitive types, arrays, and nested arrays
+   * therefore the SQL and the native layer might ignore this flag for those 
types. For complex types, this flag can be
+   * true or false, depending on whether the semantics and implementation of 
the type naturally leads to groupability
+   * or not. For example, it makes sense for JSON columns to be groupable, 
however there is little sense in grouping
+   * sketches (before finalizing).
+   *
+   * If a type is groupable, it MUST implement the {@link #hashCode} and 
{@link #equals} correctly
+   */
+  default boolean groupable()
+  {
+    return false;
+  }
+
+  @Override
+  default int hashCode(T o)

Review Comment:
   Lets add java doc mentioning that most implementation have groupable=true 
while overriding this method. 



##########
processing/src/main/java/org/apache/druid/query/filter/ArrayContainsElementFilter.java:
##########
@@ -108,7 +109,10 @@ public byte[] getCacheKey()
     final NullableTypeStrategy<Object> typeStrategy = 
elementMatchValueEval.type().getNullableStrategy();
     final int size = 
typeStrategy.estimateSizeBytes(elementMatchValueEval.value());
     final ByteBuffer valueBuffer = ByteBuffer.allocate(size);
-    typeStrategy.write(valueBuffer, elementMatchValueEval.value(), size);
+    if (typeStrategy.write(valueBuffer, elementMatchValueEval.value(), size) < 
0) {
+      // Defensive check, since the size had already been estimated from the 
same type strategy
+      throw DruidException.defensive("Unable to write the value");

Review Comment:
   We should also add the the colName and the size.  The reason I say this is 
because colValue can have sensitive data . 



##########
.github/workflows/static-checks.yml:
##########
@@ -163,9 +163,17 @@ jobs:
           ${MVN} install -q -ff -pl 'distribution' ${MAVEN_SKIP} 
${MAVEN_SKIP_TESTS}
 
       - name: rewrite:dryRun
+        id: rewrite-dryRun
         run: |
           ${MVN} rewrite:dryRun ${MAVEN_SKIP}
 
+      - name: Upload open rewrite patch

Review Comment:
   I think this is not related to this PR :)



##########
processing/src/main/java/org/apache/druid/query/dimension/ColumnSelectorStrategyFactory.java:
##########
@@ -24,5 +24,8 @@
 
 public interface ColumnSelectorStrategyFactory<ColumnSelectorStrategyClass 
extends ColumnSelectorStrategy>
 {
-  ColumnSelectorStrategyClass makeColumnSelectorStrategy(ColumnCapabilities 
capabilities, ColumnValueSelector selector);
+  ColumnSelectorStrategyClass makeColumnSelectorStrategy(ColumnCapabilities 
capabilities, ColumnValueSelector selector, String dimension);
+
+  // TODO(laksh): Javadoc
+  boolean supportsNestedArraysAndComplexTypes();

Review Comment:
   This shoudl be renamed supportsComplexTypes() or should return a list of 
columns types to suppport . 



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/column/DictionaryBuildingGroupByColumnSelectorStrategy.java:
##########
@@ -0,0 +1,203 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.column;
+
+import it.unimi.dsi.fastutil.objects.Object2IntMap;
+import org.apache.druid.error.DruidException;
+import org.apache.druid.query.groupby.epinephelinae.DictionaryBuildingUtils;
+import org.apache.druid.segment.column.ColumnType;
+import org.apache.druid.segment.column.NullableTypeStrategy;
+
+import javax.annotation.concurrent.NotThreadSafe;
+import java.util.List;
+
+/**
+ * Strategy for grouping dimensions which can have variable-width objects, and 
aren't backed by prebuilt dictionaries. It
+ * encapsulates the dictionary building logic, along with providing the 
implementations for dimension to dictionary id
+ * encoding-decoding.
+ * <p>
+ * This strategy can handle any dimension that can be addressed on a 
reverse-dictionary. Reverse dictionary uses
+ * a sorted map, rather than a hashmap.
+ * <p>
+ * This is the most expensive of all the strategies, and hence must be used 
only when other strategies aren't valid.
+ */
+@NotThreadSafe
+public class DictionaryBuildingGroupByColumnSelectorStrategy<DimensionType>
+    extends KeyMappingGroupByColumnSelectorStrategy<DimensionType>
+{
+
+  /**
+   * Dictionary for mapping the dimension value to an index. i-th position in 
the dictionary holds the value represented
+   * by the dictionaryId "i".
+   * Therefore, if a value has a dictionary id "i", dictionary.get(i) = value
+   */
+  private final List<DimensionType> dictionary;
+
+  /**
+   * Reverse dictionary for faster lookup into the dictionary, and reusing 
pre-existing dictionary ids.
+   * <p>
+   * An entry of form (value, i) in the reverse dictionary represents that 
"value" is present at the i-th location in the
+   * {@link #dictionary}.
+   * Absence of mapping of a "value" (denoted by returning {@link 
GroupByColumnSelectorStrategy#GROUP_BY_MISSING_VALUE})
+   * represents that the value is absent in the dictionary
+   */
+  private final Object2IntMap<DimensionType> reverseDictionary;
+
+  private DictionaryBuildingGroupByColumnSelectorStrategy(
+      DimensionToIdConverter<DimensionType> dimensionToIdConverter,
+      ColumnType columnType,
+      NullableTypeStrategy<DimensionType> nullableTypeStrategy,
+      DimensionType defaultValue,
+      IdToDimensionConverter<DimensionType> idToDimensionConverter,
+      List<DimensionType> dictionary,
+      Object2IntMap<DimensionType> reverseDictionary
+  )
+  {
+    super(dimensionToIdConverter, columnType, nullableTypeStrategy, 
defaultValue, idToDimensionConverter);
+    this.dictionary = dictionary;
+    this.reverseDictionary = reverseDictionary;
+  }
+
+  /**
+   * Creates an implementation of the strategy for the given type
+   */
+  public static GroupByColumnSelectorStrategy forType(final ColumnType 
columnType)
+  {
+    if (columnType.equals(ColumnType.STRING)) {
+      // String types are handled specially because they can have multi-value 
dimensions
+      throw DruidException.defensive("Should use special variant which handles 
multi-value dimensions");
+    } else if (
+      // Defensive check, primitives should be using a faster fixed-width 
strategy
+        columnType.equals(ColumnType.DOUBLE)
+        || columnType.equals(ColumnType.FLOAT)
+        || columnType.equals(ColumnType.LONG)) {
+      throw DruidException.defensive("Could used a fixed width strategy");
+    }
+
+    // Catch-all for all other types, that can only have single-valued 
dimensions
+    return forArrayAndComplexTypes(columnType);
+  }
+
+  /**
+   * Implemenatation of dictionary building strategy for types other than 
strings (since they can be multi-valued and need
+   * to be handled separately) and numeric primitives (since they can be 
handled by fixed-width strategy).
+   * This also means that we handle array and complex types here, which 
simplifies the generics a lot, as everything can be
+   * treated as Object in this class.
+   * <p>
+   * Also, there isn't any concept of multi-values here, therefore Dimension 
== DimensionHolderType == Object. We still
+   * homogenize rogue selectors which can return non-standard implementation 
of arrays (like Long[] for long arrays instead of
+   * Object[]) to what the callers would expect (i.e. Object[] in this case).
+   */
+  private static GroupByColumnSelectorStrategy forArrayAndComplexTypes(final 
ColumnType columnType)
+  {
+    final List<Object> dictionary = DictionaryBuildingUtils.createDictionary();
+    final Object2IntMap<Object> reverseDictionary =
+        
DictionaryBuildingUtils.createReverseDictionary(columnType.getNullableStrategy());
+    return new DictionaryBuildingGroupByColumnSelectorStrategy<>(

Review Comment:
   Also what would happen in Array<Strings> earlier we had an optimization 
where each string element of the array of dictionary encoded and the whole 
array was made of of Array<dictionaryInts> which gave us best compression. In 
your case we would have only one level of dictionary. 
   



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/column/DictionaryBuildingGroupByColumnSelectorStrategy.java:
##########
@@ -0,0 +1,203 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.column;
+
+import it.unimi.dsi.fastutil.objects.Object2IntMap;
+import org.apache.druid.error.DruidException;
+import org.apache.druid.query.groupby.epinephelinae.DictionaryBuildingUtils;
+import org.apache.druid.segment.column.ColumnType;
+import org.apache.druid.segment.column.NullableTypeStrategy;
+
+import javax.annotation.concurrent.NotThreadSafe;
+import java.util.List;
+
+/**
+ * Strategy for grouping dimensions which can have variable-width objects, and 
aren't backed by prebuilt dictionaries. It
+ * encapsulates the dictionary building logic, along with providing the 
implementations for dimension to dictionary id
+ * encoding-decoding.
+ * <p>
+ * This strategy can handle any dimension that can be addressed on a 
reverse-dictionary. Reverse dictionary uses
+ * a sorted map, rather than a hashmap.
+ * <p>
+ * This is the most expensive of all the strategies, and hence must be used 
only when other strategies aren't valid.
+ */
+@NotThreadSafe
+public class DictionaryBuildingGroupByColumnSelectorStrategy<DimensionType>
+    extends KeyMappingGroupByColumnSelectorStrategy<DimensionType>
+{
+
+  /**
+   * Dictionary for mapping the dimension value to an index. i-th position in 
the dictionary holds the value represented
+   * by the dictionaryId "i".
+   * Therefore, if a value has a dictionary id "i", dictionary.get(i) = value
+   */
+  private final List<DimensionType> dictionary;
+
+  /**
+   * Reverse dictionary for faster lookup into the dictionary, and reusing 
pre-existing dictionary ids.
+   * <p>
+   * An entry of form (value, i) in the reverse dictionary represents that 
"value" is present at the i-th location in the
+   * {@link #dictionary}.
+   * Absence of mapping of a "value" (denoted by returning {@link 
GroupByColumnSelectorStrategy#GROUP_BY_MISSING_VALUE})
+   * represents that the value is absent in the dictionary
+   */
+  private final Object2IntMap<DimensionType> reverseDictionary;
+
+  private DictionaryBuildingGroupByColumnSelectorStrategy(
+      DimensionToIdConverter<DimensionType> dimensionToIdConverter,
+      ColumnType columnType,
+      NullableTypeStrategy<DimensionType> nullableTypeStrategy,
+      DimensionType defaultValue,
+      IdToDimensionConverter<DimensionType> idToDimensionConverter,
+      List<DimensionType> dictionary,
+      Object2IntMap<DimensionType> reverseDictionary
+  )
+  {
+    super(dimensionToIdConverter, columnType, nullableTypeStrategy, 
defaultValue, idToDimensionConverter);
+    this.dictionary = dictionary;
+    this.reverseDictionary = reverseDictionary;
+  }
+
+  /**
+   * Creates an implementation of the strategy for the given type
+   */
+  public static GroupByColumnSelectorStrategy forType(final ColumnType 
columnType)
+  {
+    if (columnType.equals(ColumnType.STRING)) {
+      // String types are handled specially because they can have multi-value 
dimensions
+      throw DruidException.defensive("Should use special variant which handles 
multi-value dimensions");
+    } else if (
+      // Defensive check, primitives should be using a faster fixed-width 
strategy
+        columnType.equals(ColumnType.DOUBLE)
+        || columnType.equals(ColumnType.FLOAT)
+        || columnType.equals(ColumnType.LONG)) {
+      throw DruidException.defensive("Could used a fixed width strategy");
+    }
+
+    // Catch-all for all other types, that can only have single-valued 
dimensions
+    return forArrayAndComplexTypes(columnType);
+  }
+
+  /**
+   * Implemenatation of dictionary building strategy for types other than 
strings (since they can be multi-valued and need
+   * to be handled separately) and numeric primitives (since they can be 
handled by fixed-width strategy).
+   * This also means that we handle array and complex types here, which 
simplifies the generics a lot, as everything can be
+   * treated as Object in this class.
+   * <p>
+   * Also, there isn't any concept of multi-values here, therefore Dimension 
== DimensionHolderType == Object. We still
+   * homogenize rogue selectors which can return non-standard implementation 
of arrays (like Long[] for long arrays instead of
+   * Object[]) to what the callers would expect (i.e. Object[] in this case).
+   */
+  private static GroupByColumnSelectorStrategy forArrayAndComplexTypes(final 
ColumnType columnType)
+  {
+    final List<Object> dictionary = DictionaryBuildingUtils.createDictionary();
+    final Object2IntMap<Object> reverseDictionary =
+        
DictionaryBuildingUtils.createReverseDictionary(columnType.getNullableStrategy());

Review Comment:
   You are always passing a Object dictionar where as on line 51 : `  private 
final List<DimensionType> dictionary;` we have a templated dictionary. Seems 
odd to me. 
   



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/vector/DictionaryBuildingSingleValueStringGroupByVectorColumnSelector.java:
##########
@@ -82,7 +83,7 @@ public int writeKeys(
 
         // Use same ROUGH_OVERHEAD_PER_DICTIONARY_ENTRY as the nonvectorized 
version; dictionary structure is the same.
         stateFootprintIncrease +=
-            DictionaryBuilding.estimateEntryFootprint((value == null ? 0 : 
value.length()) * Character.BYTES);
+            DictionaryBuildingUtils.estimateEntryFootprint((value == null ? 0 
: value.length()) * Character.BYTES);

Review Comment:
   Should we push this stringLogic inside the footPrintMethod or maybe have a 
wrapper string method inside the same class ?



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/column/MemoryEstimate.java:
##########
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.column;
+
+/**
+ * Holder for a value and the memory increase in the internal dictionary 
associated with the increase
+ */
+public class MemoryEstimate<T>

Review Comment:
   This is more like memoryFootprint. Lets rename it to such. 



##########
processing/src/main/java/org/apache/druid/segment/nested/NestedDataComplexTypeSerde.java:
##########
@@ -157,6 +160,34 @@ public byte[] toBytes(@Nullable Object val)
     };
   }
 
+  @Override
+  public <T extends Comparable<T>> TypeStrategy<T> getTypeStrategy()
+  {
+    return new ObjectStrategyComplexTypeStrategy<>(
+        getObjectStrategy(),
+        ColumnType.ofComplex(TYPE_NAME),
+        true,
+        new Hash.Strategy<Object>()
+        {
+          @Override
+          public int hashCode(Object o)
+          {
+            // TODO(laksh): VET, Check if StructuredData.wrap(o).hashCode() 
makes sense, given that most of the objects inside

Review Comment:
   Is this todo still there ?
   cc @clintropolis I think this is not a blocker for this PR. We can always 
improve the implementation of the StructuredData 
   



##########
processing/src/main/java/org/apache/druid/segment/column/ObjectStrategyComplexTypeStrategy.java:
##########
@@ -27,19 +29,35 @@
 /**
  * Default implementation of {@link TypeStrategy} for all {@link 
org.apache.druid.segment.serde.ComplexMetricSerde}
  * implementations that just wraps the {@link ObjectStrategy} they are 
required to implement.
- *
+ * <p>
  * This is not likely to be the most efficient way to do things, especially 
since writing must first produce a byte
  * array before it can be written to the buffer, but it is cheap and should 
work correctly, which is important.
  */
 public class ObjectStrategyComplexTypeStrategy<T> implements TypeStrategy<T>
 {
   private final ObjectStrategy<T> objectStrategy;
   private final TypeSignature<?> typeSignature;
+  private final boolean groupable;

Review Comment:
   We should just store hashStrategy here. 
   and just do hashStrategey.groupable()



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/column/DictionaryBuildingGroupByColumnSelectorStrategy.java:
##########
@@ -0,0 +1,203 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.column;
+
+import it.unimi.dsi.fastutil.objects.Object2IntMap;
+import org.apache.druid.error.DruidException;
+import org.apache.druid.query.groupby.epinephelinae.DictionaryBuildingUtils;
+import org.apache.druid.segment.column.ColumnType;
+import org.apache.druid.segment.column.NullableTypeStrategy;
+
+import javax.annotation.concurrent.NotThreadSafe;
+import java.util.List;
+
+/**
+ * Strategy for grouping dimensions which can have variable-width objects, and 
aren't backed by prebuilt dictionaries. It
+ * encapsulates the dictionary building logic, along with providing the 
implementations for dimension to dictionary id
+ * encoding-decoding.
+ * <p>
+ * This strategy can handle any dimension that can be addressed on a 
reverse-dictionary. Reverse dictionary uses
+ * a sorted map, rather than a hashmap.
+ * <p>
+ * This is the most expensive of all the strategies, and hence must be used 
only when other strategies aren't valid.
+ */
+@NotThreadSafe
+public class DictionaryBuildingGroupByColumnSelectorStrategy<DimensionType>
+    extends KeyMappingGroupByColumnSelectorStrategy<DimensionType>

Review Comment:
    I also think the abstraction is an overkill since we only have one impl for 
now. 



##########
processing/src/main/java/org/apache/druid/segment/nested/StructuredData.java:
##########
@@ -185,6 +186,13 @@ public int hashCode()
     return Objects.hash(value);
   }
 
+  // hashCode that relies on the object equality. Translates the hashcode to 
an integer as well
+  // TODO(laksh): better name

Review Comment:
   Lets remove the todo we can always make this better later. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to