gianm commented on code in PR #14025:
URL: https://github.com/apache/druid/pull/14025#discussion_r1164385083
##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java:
##########
@@ -1856,35 +1873,38 @@ private static Pair<List<DimensionSchema>,
List<AggregatorFactory>> makeDimensio
// Each column can be of either time, dimension, aggregator. For this
method. we can ignore the time column.
// For non-complex columns, If the aggregator factory of the column is not
available, we treat the column as
// a dimension. For complex columns, certains hacks are in place.
- for (final String outputColumn : outputColumnsInOrder) {
- final String queryColumn =
columnMappings.getQueryColumnForOutputColumn(outputColumn);
+ for (final String outputColumnName : outputColumnsInOrder) {
+ // Iterables.getOnlyElement because this method is only called during
ingestion, where we require
+ // that output names be unique.
+ final int outputColumn =
Iterables.getOnlyElement(columnMappings.getOutputColumnsByName(outputColumnName));
Review Comment:
Same comment from me as well 🙂
##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/ColumnMappings.java:
##########
@@ -36,23 +35,32 @@
import java.util.Set;
import java.util.stream.Collectors;
+/**
+ * Maps column names from {@link MSQSpec#getQuery()} to output names desired
by the user, in the order
+ * desired by the user.
+ *
+ * The {@link MSQSpec#getQuery()} is translated by {@link
org.apache.druid.msq.querykit.QueryKit} into
+ * a {@link org.apache.druid.msq.kernel.QueryDefinition}. So, this class also
represents mappings from
+ * {@link
org.apache.druid.msq.kernel.QueryDefinition#getFinalStageDefinition()} into the
output names desired
+ * by the user.
+ */
public class ColumnMappings
{
private final List<ColumnMapping> mappings;
- private final Map<String, String> outputToQueryColumnMap;
- private final Map<String, List<String>> queryToOutputColumnsMap;
+ private final Map<String, IntList> outputNameToPositionMap;
+ private final Map<String, IntList> queryNameToPositionMap;
Review Comment:
I don't really like abbreviations in names, so I went with the even longer
`queryColumnnameToPositionMap`.
##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java:
##########
@@ -1841,13 +1856,15 @@ private static Pair<List<DimensionSchema>,
List<AggregatorFactory>> makeDimensio
if (isRollupQuery) {
// Populate aggregators from the native query when doing an ingest in
rollup mode.
for (AggregatorFactory aggregatorFactory : ((GroupByQuery)
query).getAggregatorSpecs()) {
- String outputColumn =
Iterables.getOnlyElement(columnMappings.getOutputColumnsForQueryColumn(aggregatorFactory.getName()));
- if (outputColumnAggregatorFactories.containsKey(outputColumn)) {
- throw new ISE("There can only be one aggregator factory for column
[%s].", outputColumn);
+ final int outputColumn =
+
Iterables.getOnlyElement(columnMappings.getOutputColumnsForQueryColumn(aggregatorFactory.getName()));
Review Comment:
I tried doing this, but I felt it complexified the code too much, especially
since this case "can't happen" (unless there is some kind of coding bug). I
decided to do it by adding a method `CollectionUtils.getOnlyElement` that makes
this pattern (`getOnlyElement` + custom error message`) a little easier,
requiring less code at the call site.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]