gianm commented on a change in pull request #9638:
URL: https://github.com/apache/druid/pull/9638#discussion_r417756200
##########
File path:
processing/src/main/java/org/apache/druid/query/aggregation/PostAggregator.java
##########
@@ -43,6 +44,8 @@
@Nullable
String getName();
+ ValueType getType();
Review comment:
Please add javadocs.
##########
File path:
processing/src/main/java/org/apache/druid/query/aggregation/AggregatorFactory.java
##########
@@ -210,22 +212,36 @@ public AggregatorFactory
getMergingFactory(AggregatorFactory other) throws Aggre
public abstract List<String> requiredFields();
/**
- * Get the type name of the intermediate type for this aggregator. This is
the same as the type returned by
+ * Get the "intermediate" {@link ValueType} for this aggregator. This is the
same as the type returned by
* {@link #deserialize} and the type accepted by {@link #combine}. However,
it is *not* necessarily the same type
* returned by {@link #finalizeComputation}.
+ */
+ public abstract ValueType getType();
+
+ /**
+ * Get the type for the final form of this this aggregator, i.e. the type of
the value returned by
+ * {@link #finalizeComputation}. This may be the same as or different than
the types expected in {@link #deserialize}
+ * and {@link #combine}.
+ * @return
+ */
+ public ValueType getFinalizedType()
+ {
+ return getType();
+ }
+
+ /**
+ * Get the complex type name of the intermediate type for this aggregator.
*
- * If the type is complex (i.e. not a simple, numeric {@link
org.apache.druid.segment.column.ValueType}) then there
+ * This should ONLY be implemented if the type is complex (i.e. not a
simple, numeric {@link ValueType}), and there
* must be a corresponding {@link
org.apache.druid.segment.serde.ComplexMetricSerde} which was registered with
* {@link org.apache.druid.segment.serde.ComplexMetrics#registerSerde} using
this type name.
*
- * If you need a ValueType enum corresponding to this aggregator, a good way
to do that is:
- *
- * <pre>
- * Optional.ofNullable(GuavaUtils.getEnumIfPresent(ValueType.class,
aggregator.getTypeName()))
- * .orElse(ValueType.COMPLEX);
- * </pre>
+ * If you need a ValueType enum corresponding to this aggregator, use {@link
#getTypeName} instead.
Review comment:
Add a `@throws IllegalStateException if getType() != ValueType.COMPLEX`
##########
File path:
processing/src/main/java/org/apache/druid/query/aggregation/AggregatorFactory.java
##########
@@ -210,22 +212,36 @@ public AggregatorFactory
getMergingFactory(AggregatorFactory other) throws Aggre
public abstract List<String> requiredFields();
/**
- * Get the type name of the intermediate type for this aggregator. This is
the same as the type returned by
+ * Get the "intermediate" {@link ValueType} for this aggregator. This is the
same as the type returned by
* {@link #deserialize} and the type accepted by {@link #combine}. However,
it is *not* necessarily the same type
* returned by {@link #finalizeComputation}.
+ */
+ public abstract ValueType getType();
+
+ /**
+ * Get the type for the final form of this this aggregator, i.e. the type of
the value returned by
+ * {@link #finalizeComputation}. This may be the same as or different than
the types expected in {@link #deserialize}
+ * and {@link #combine}.
+ * @return
+ */
+ public ValueType getFinalizedType()
+ {
+ return getType();
Review comment:
IMO, it would be better for this to be abstract, since this makes it too
easy for people to forget to override it.
##########
File path:
processing/src/main/java/org/apache/druid/query/aggregation/post/ExpressionPostAggregator.java
##########
@@ -179,6 +169,14 @@ public String getName()
return name;
}
+ @Override
+ public ValueType getType()
+ {
+ // this is wrong, replace with Expr output type based on the input types
once it is available
+ // but treat as string for now
+ return ValueType.STRING;
Review comment:
Why STRING? (Instead of COMPLEX or whatever.)
What bad things could potentially happen if the type returned by this method
is wrong? (The javadoc should explain this, ideally.)
##########
File path: core/src/main/java/org/apache/druid/segment/column/ValueType.java
##########
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.segment.column;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import org.apache.druid.java.util.common.StringUtils;
+
+import javax.annotation.Nullable;
+
+public enum ValueType
+{
+ DOUBLE,
+ FLOAT,
+ LONG,
+ STRING,
+ COMPLEX,
+ DOUBLE_ARRAY,
+ LONG_ARRAY,
+ STRING_ARRAY;
Review comment:
Please add javadocs documenting these types.
Especially important things to document include:
- When do we use `STRING` and when do we use `STRING_ARRAY`? (Multivalue
strings are type STRING, even though they behave kind of like arrays. How do we
explain this coherently?)
- What does COMPLEX mean and how can you get more information about
something that is COMPLEX?
##########
File path:
processing/src/main/java/org/apache/druid/segment/column/RowSignature.java
##########
@@ -237,21 +235,14 @@ public Builder addDimensions(final List<DimensionSpec>
dimensions)
public Builder addAggregators(final List<AggregatorFactory> aggregators)
{
for (final AggregatorFactory aggregator : aggregators) {
- final ValueType type = GuavaUtils.getEnumIfPresent(
- ValueType.class,
- StringUtils.toUpperCase(aggregator.getTypeName())
- );
-
- // Use null instead of COMPLEX for nonnumeric types, since in that
case, the type depends on whether or not
- // the aggregator is finalized, and we don't know (a) if it will be
finalized, or even (b) what the type would
- // be if it were finalized. So null (i.e. unknown) is the proper thing
to do.
- //
- // Another note: technically, we don't know what the finalized type
will be even if the type here is numeric,
- // but we're assuming that it doesn't change upon finalization. All
builtin aggregators work this way.
+ final ValueType type = aggregator.getType();
- if (type != null && type.isNumeric()) {
+ if (type.equals(aggregator.getFinalizedType())) {
add(aggregator.getName(), type);
} else {
+ // Use null if the type depends on whether or not the aggregator is
finalized, since
Review comment:
Instead of using null here, we should have different signatures for
finalized and nonfinalized rows. Perhaps the row signature builder should have
`addAggregators(List<AggregatorFactory> aggregators, boolean finalize)` and the
things that call it should be given knowledge about whether they're going to be
finalizing or not.
##########
File path: core/src/main/java/org/apache/druid/segment/column/ValueType.java
##########
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.segment.column;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import org.apache.druid.java.util.common.StringUtils;
+
+import javax.annotation.Nullable;
+
+public enum ValueType
+{
+ DOUBLE,
+ FLOAT,
+ LONG,
+ STRING,
+ COMPLEX,
+ DOUBLE_ARRAY,
+ LONG_ARRAY,
+ STRING_ARRAY;
+
+
+ public boolean isNumeric()
+ {
+ return isNumeric(this);
+ }
+
+ public boolean isPrimitiveScalar()
+ {
+ return this.equals(ValueType.STRING) || isNumeric(this);
+ }
+
+ public boolean isComplex()
Review comment:
With this logic, `DOUBLE_ARRAY.isComplex()` is true. This seems weird. I
would think only COMPLEX is complex. Please add javadocs to the method and
maybe consider renaming it.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]