clintropolis commented on code in PR #16068:
URL: https://github.com/apache/druid/pull/16068#discussion_r1543788649


##########
processing/src/main/java/org/apache/druid/segment/column/TypeStrategies.java:
##########
@@ -518,6 +590,12 @@ public int write(ByteBuffer buffer, Object[] value, int maxSizeBytes)
       return extraNeeded < 0 ? extraNeeded : sizeBytes;
     }
 
+    @Override
+    public boolean groupable()
+    {
+      return true;

Review Comment:
   This should check whether the element type strategy is groupable, rather than always returning `true`.
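
   A minimal sketch of the delegation suggested here, using simplified stand-in types rather than Druid's actual `TypeStrategy` classes. The names `Strategy`, `LongStrategy`, `OpaqueStrategy`, `ArrayStrategy`, and the `elementStrategy` field are all hypothetical and chosen only for illustration:

```java
// Hypothetical, simplified stand-ins; these are NOT Druid's real classes.
interface Strategy<T>
{
  boolean groupable();
}

// An element type that can be grouped on.
final class LongStrategy implements Strategy<Long>
{
  @Override
  public boolean groupable()
  {
    return true;
  }
}

// An element type that cannot be grouped on.
final class OpaqueStrategy implements Strategy<Object>
{
  @Override
  public boolean groupable()
  {
    return false;
  }
}

// The array strategy is groupable only if its element strategy is.
final class ArrayStrategy implements Strategy<Object[]>
{
  private final Strategy<?> elementStrategy; // assumed field name

  ArrayStrategy(Strategy<?> elementStrategy)
  {
    this.elementStrategy = elementStrategy;
  }

  @Override
  public boolean groupable()
  {
    // Delegate instead of unconditionally returning true.
    return elementStrategy.groupable();
  }
}
```

   With this shape, an array of a non-groupable element type reports itself as non-groupable instead of failing later during grouping.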



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/column/DictionaryBuildingGroupByColumnSelectorStrategy.java:
##########
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.column;
+
+import it.unimi.dsi.fastutil.objects.Object2IntMap;
+import org.apache.druid.error.DruidException;
+import org.apache.druid.query.groupby.epinephelinae.DictionaryBuildingUtils;
+import org.apache.druid.segment.column.ColumnType;
+import org.apache.druid.segment.column.NullableTypeStrategy;
+
+import javax.annotation.concurrent.NotThreadSafe;
+import java.util.List;
+
+/**
+ * Strategy for grouping dimensions which can have variable-width objects, and aren't backed by prebuilt dictionaries. It
+ * encapsulates the dictionary building logic, along with providing the implementations for dimension to dictionary id
+ * encoding-decoding.
+ * <p>
+ * This strategy can handle any dimension that can be addressed on a reverse-dictionary. Reverse dictionary uses
+ * a sorted map, rather than a hashmap.
+ * <p>
+ * This is the most expensive of all the strategies, and hence must be used only when other strategies aren't valid.
+ */
+@NotThreadSafe
+public class DictionaryBuildingGroupByColumnSelectorStrategy<DimensionType>
+    extends KeyMappingGroupByColumnSelectorStrategy<DimensionType>
+{
+
+  /**
+   * Dictionary for mapping the dimension value to an index. i-th position in the dictionary holds the value represented
+   * by the dictionaryId "i".
+   * Therefore, if a value has a dictionary id "i", dictionary.get(i) = value
+   */
+  private final List<DimensionType> dictionary;
+
+  /**
+   * Reverse dictionary for faster lookup into the dictionary, and reusing pre-existing dictionary ids.
+   * <p>
+   * An entry of form (value, i) in the reverse dictionary represents that "value" is present at the i-th location in the
+   * {@link #dictionary}.
+   * Absence of mapping of a "value" (denoted by returning {@link GroupByColumnSelectorStrategy#GROUP_BY_MISSING_VALUE})
+   * represents that the value is absent in the dictionary
+   */
+  private final Object2IntMap<DimensionType> reverseDictionary;
+
+  private DictionaryBuildingGroupByColumnSelectorStrategy(
+      DimensionToIdConverter<DimensionType> dimensionToIdConverter,
+      ColumnType columnType,
+      NullableTypeStrategy<DimensionType> nullableTypeStrategy,
+      DimensionType defaultValue,
+      IdToDimensionConverter<DimensionType> idToDimensionConverter,
+      List<DimensionType> dictionary,
+      Object2IntMap<DimensionType> reverseDictionary
+  )
+  {
+    super(dimensionToIdConverter, columnType, nullableTypeStrategy, defaultValue, idToDimensionConverter);
+    this.dictionary = dictionary;
+    this.reverseDictionary = reverseDictionary;
+  }
+
+  /**
+   * Creates an implementation of the strategy for the given type
+   */
+  public static GroupByColumnSelectorStrategy forType(final ColumnType columnType)
+  {
+    if (columnType.equals(ColumnType.STRING)) {
+      // String types are handled specially because they can have multi-value dimensions
+      throw DruidException.defensive("Should use special variant which handles multi-value dimensions");

Review Comment:
   This can probably handle regular strings too: if we know that a string
column definitely isn't multi-value, it is probably more efficient not to have
to check every value?
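
   To illustrate why a known single-value string could go through the same dictionary-building path, here is a rough sketch of the dictionary and reverse-dictionary encoding described in the Javadoc above. It is not the PR's implementation: `SingleValueDictionary`, `getOrAssignId`, and `lookup` are hypothetical names, and a plain `java.util.HashMap` stands in for the fastutil `Object2IntMap` so the example stays self-contained:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Druid.
final class SingleValueDictionary<T>
{
  // dictionary.get(i) is the value whose dictionary id is i.
  private final List<T> dictionary = new ArrayList<>();

  // Maps a value back to its id so that repeated values reuse the same id.
  private final Map<T, Integer> reverseDictionary = new HashMap<>();

  // Returns the existing id for the value, or assigns the next free id.
  int getOrAssignId(T value)
  {
    Integer existing = reverseDictionary.get(value);
    if (existing != null) {
      return existing;
    }
    int id = dictionary.size();
    dictionary.add(value);
    reverseDictionary.put(value, id);
    return id;
  }

  // Decodes a dictionary id back to the original value.
  T lookup(int id)
  {
    return dictionary.get(id);
  }
}
```

   Each row contributes exactly one value here, so there is no per-row check for multiple values, which is the potential saving when the column is known to be single-value.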



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

