gianm commented on code in PR #12914:
URL: https://github.com/apache/druid/pull/12914#discussion_r953237605
##########
core/src/main/java/org/apache/druid/math/expr/ExprEval.java:
##########
@@ -429,18 +443,15 @@ public static ExprEval ofType(@Nullable ExpressionType
type, @Nullable Object va
case STRING:
// not all who claim to be "STRING" are always a String, prepare
ourselves...
if (value instanceof String[]) {
- return new ArrayExprEval(ExpressionType.STRING_ARRAY, (String[])
value);
+ return new ArrayExprEval(ExpressionType.STRING_ARRAY,
Arrays.stream((String[]) value).toArray());
Review Comment:
Stream construction in hot paths can be expensive. May be better to avoid it.
##########
core/src/main/java/org/apache/druid/math/expr/ExprEval.java:
##########
@@ -388,16 +388,30 @@ public static ExprEval bestEffortOf(@Nullable Object val)
return new StringExprEval(String.valueOf(val));
}
if (val instanceof Long[]) {
- return new ArrayExprEval(ExpressionType.LONG_ARRAY, (Long[]) val);
+ return new ArrayExprEval(ExpressionType.LONG_ARRAY,
Arrays.stream((Long[]) val).toArray());
+ }
+ if (val instanceof long[]) {
+ return new ArrayExprEval(ExpressionType.LONG_ARRAY,
Arrays.stream((long[]) val).boxed().toArray());
}
if (val instanceof Double[]) {
- return new ArrayExprEval(ExpressionType.DOUBLE_ARRAY, (Double[]) val);
+ return new ArrayExprEval(ExpressionType.DOUBLE_ARRAY,
Arrays.stream((Double[]) val).toArray());
+ }
+ if (val instanceof double[]) {
+ return new ArrayExprEval(ExpressionType.DOUBLE_ARRAY,
Arrays.stream((double[]) val).boxed().toArray());
}
if (val instanceof Float[]) {
return new ArrayExprEval(ExpressionType.DOUBLE_ARRAY,
Arrays.stream((Float[]) val).map(Float::doubleValue).toArray());
}
+ if (val instanceof float[]) {
+ float[] floats = (float[]) val;
+ Object[] boxed = new Object[floats.length];
+ for (int i = 0; i < floats.length; i++) {
+ boxed[i] = floats[i];
+ }
+ return new ArrayExprEval(ExpressionType.DOUBLE_ARRAY, boxed);
+ }
if (val instanceof String[]) {
- return new ArrayExprEval(ExpressionType.STRING_ARRAY, (String[]) val);
+ return new ArrayExprEval(ExpressionType.STRING_ARRAY,
Arrays.stream((String[]) val).toArray());
Review Comment:
Stream construction in hot paths can be expensive. May be better to avoid it.
##########
processing/src/main/java/org/apache/druid/segment/filter/ExpressionFilter.java:
##########
@@ -141,26 +141,26 @@ public boolean matches()
if (eval.type().isArray()) {
switch (eval.elementType().getType()) {
case LONG:
- final Long[] lResult = eval.asLongArray();
+ final Object[] lResult = eval.asArray();
if (lResult == null) {
return false;
}
- return
Arrays.stream(lResult).filter(Objects::nonNull).anyMatch(Evals::asBoolean);
+ return
Arrays.stream(lResult).filter(Objects::nonNull).anyMatch(o ->
Evals.asBoolean((long) o));
Review Comment:
I know this code was pre-existing, but, may be a good idea to move it away
from streams since it's a hot path.
##########
core/src/main/java/org/apache/druid/segment/column/ColumnType.java:
##########
@@ -26,18 +26,71 @@
import javax.annotation.Nullable;
+/**
+ * Native Druid types.
+ *
+ * @see TypeSignature
+ */
@JsonSerialize(using = ToStringSerializer.class)
public class ColumnType extends BaseTypeSignature<ValueType>
{
+ /**
+ * Druid string type. Values will be represented as {@link String} or {@link
java.util.List<String>} in the case
+ * of multi-value string columns. {@link ColumnType} has insufficient
information to distinguish between single
+ * and multi-value strings, this requires a type 'ColumnCapabilities' which
is available at a higher layer.
+ *
+ * @see ValueType#STRING
+ */
public static final ColumnType STRING = new ColumnType(ValueType.STRING,
null, null);
+
+ /**
+ * Druid 64-bit integer number primitve type. Values will be represented as
Java long or {@link Long}.
+ *
+ * @see ValueType#LONG
+ */
public static final ColumnType LONG = new ColumnType(ValueType.LONG, null,
null);
+ /**
+ * Druid 64-bit double precision floating point number primitive type.
Values will be represented as Java double or
+ * {@link Double}.
+ *
+ * @see ValueType#DOUBLE
+ */
public static final ColumnType DOUBLE = new ColumnType(ValueType.DOUBLE,
null, null);
+ /**
+ * Druid 32-bit single precision floating point number primitive type.
Values will be represented as Java float or
+ * {@link Float}.
+ *
+ * @see ValueType#FLOAT
+ */
public static final ColumnType FLOAT = new ColumnType(ValueType.FLOAT, null,
null);
// currently, arrays only come from expressions or aggregators
// and there are no native float expressions (or aggs which produce float
arrays)
+ /**
+ * An array of Strings. Values will reprsented as Object[]
Review Comment:
represented (spelling)
##########
core/src/main/java/org/apache/druid/segment/column/ValueType.java:
##########
@@ -63,23 +63,37 @@ public enum ValueType implements TypeDescriptor
* String object type. This type may be used as a grouping key, an input to
certain types of complex sketch
* aggregators, and as an input to expression virtual columns. String types
might potentially be 'multi-valued' when
* stored in segments, and contextually at various layers of query
processing, but this information is not available
- * through this enum alone, and must be accompany this type indicator to
properly handle.
+ * at this level.
+ *
+ * Strings are typically represented as {@link String}, but multi-value
strings might also appear as a
+ * {@link java.util.List<String>}.
*/
STRING,
/**
* Placeholder for arbitrary 'complex' types, which have a corresponding
serializer/deserializer implementation. Note
* that knowing a type is complex alone isn't enough information to work
with it directly, and additional information
- * in the form of a type name that is registered in the complex type
registry must be available to make this type
- * meaningful. This type is not currently supported as a grouping key for
aggregations, and may not be used as an
- * input to expression virtual columns, and might only be supported by the
specific aggregators crafted to handle
- * this complex type.
+ * in the form of a type name which must be registered in the complex type
registry. This type is not currently
+ * supported as a grouping key for aggregations, and might only be supported
by the specific aggregators crafted to
Review Comment:
"might"? How about:
> Complex types are not currently supported as a grouping key for
aggregations. Complex types can be used as inputs to aggregators, in cases
where the specific aggregator supports the specific complex type.
##########
core/src/main/java/org/apache/druid/segment/column/ColumnType.java:
##########
@@ -26,18 +26,71 @@
import javax.annotation.Nullable;
+/**
+ * Native Druid types.
+ *
+ * @see TypeSignature
+ */
@JsonSerialize(using = ToStringSerializer.class)
public class ColumnType extends BaseTypeSignature<ValueType>
{
+ /**
+ * Druid string type. Values will be represented as {@link String} or {@link
java.util.List<String>} in the case
+ * of multi-value string columns. {@link ColumnType} has insufficient
information to distinguish between single
+ * and multi-value strings, this requires a type 'ColumnCapabilities' which
is available at a higher layer.
+ *
+ * @see ValueType#STRING
+ */
public static final ColumnType STRING = new ColumnType(ValueType.STRING,
null, null);
+
+ /**
+ * Druid 64-bit integer number primitve type. Values will be represented as
Java long or {@link Long}.
+ *
+ * @see ValueType#LONG
+ */
public static final ColumnType LONG = new ColumnType(ValueType.LONG, null,
null);
+ /**
+ * Druid 64-bit double precision floating point number primitive type.
Values will be represented as Java double or
+ * {@link Double}.
+ *
+ * @see ValueType#DOUBLE
+ */
public static final ColumnType DOUBLE = new ColumnType(ValueType.DOUBLE,
null, null);
+ /**
+ * Druid 32-bit single precision floating point number primitive type.
Values will be represented as Java float or
+ * {@link Float}.
+ *
+ * @see ValueType#FLOAT
+ */
public static final ColumnType FLOAT = new ColumnType(ValueType.FLOAT, null,
null);
// currently, arrays only come from expressions or aggregators
// and there are no native float expressions (or aggs which produce float
arrays)
+ /**
+ * An array of Strings. Values will reprsented as Object[]
+ * @see ValueType#ARRAY
+ * @see ValueType#STRING
+ */
public static final ColumnType STRING_ARRAY = new
ColumnType(ValueType.ARRAY, null, STRING);
+ /**
+ * An array of Longs. Values may be represented as Object[] or long[]
Review Comment:
May? Where did the certainty go? The others are "will"!
##########
core/src/main/java/org/apache/druid/segment/column/ValueType.java:
##########
@@ -63,23 +63,37 @@ public enum ValueType implements TypeDescriptor
* String object type. This type may be used as a grouping key, an input to
certain types of complex sketch
* aggregators, and as an input to expression virtual columns. String types
might potentially be 'multi-valued' when
* stored in segments, and contextually at various layers of query
processing, but this information is not available
- * through this enum alone, and must be accompany this type indicator to
properly handle.
+ * at this level.
+ *
+ * Strings are typically represented as {@link String}, but multi-value
strings might also appear as a
+ * {@link java.util.List<String>}.
*/
STRING,
/**
* Placeholder for arbitrary 'complex' types, which have a corresponding
serializer/deserializer implementation. Note
* that knowing a type is complex alone isn't enough information to work
with it directly, and additional information
- * in the form of a type name that is registered in the complex type
registry must be available to make this type
- * meaningful. This type is not currently supported as a grouping key for
aggregations, and may not be used as an
- * input to expression virtual columns, and might only be supported by the
specific aggregators crafted to handle
- * this complex type.
+ * in the form of a type name which must be registered in the complex type
registry. This type is not currently
+ * supported as a grouping key for aggregations, and might only be supported
by the specific aggregators crafted to
+ * handle this complex type. Filtering on this type with standard filters
will most likely have limited support, and
+ * may match as a "null" value.
+ *
+ * These types are represented by the individual Java type associated with
the complex type name as defined in the
+ * type registry.
*/
COMPLEX,
/**
- * Placeholder for arbitrary arrays of other {@link ValueType}. This type is
not currently supported as a grouping
+ * Placeholder for arbitrary arrays of other {@link ValueType}. This type
has limited support as a grouping
Review Comment:
What does "limited support" mean?
##########
core/src/main/java/org/apache/druid/segment/column/ValueType.java:
##########
@@ -63,23 +63,37 @@ public enum ValueType implements TypeDescriptor
* String object type. This type may be used as a grouping key, an input to
certain types of complex sketch
* aggregators, and as an input to expression virtual columns. String types
might potentially be 'multi-valued' when
* stored in segments, and contextually at various layers of query
processing, but this information is not available
- * through this enum alone, and must be accompany this type indicator to
properly handle.
+ * at this level.
+ *
+ * Strings are typically represented as {@link String}, but multi-value
strings might also appear as a
Review Comment:
"Might"?
How about: "multi-value strings appear as {@code java.util.List<String>}
when necessary to represent multiple values."
##########
core/src/main/java/org/apache/druid/math/expr/Evals.java:
##########
@@ -20,21 +20,22 @@
package org.apache.druid.math.expr;
import org.apache.druid.common.config.NullHandling;
-import org.apache.druid.java.util.common.logger.Logger;
import javax.annotation.Nullable;
-import java.util.Arrays;
import java.util.List;
/**
*/
public class Evals
{
- private static final Logger log = new Logger(Evals.class);
-
public static boolean isAllConstants(Expr... exprs)
{
- return isAllConstants(Arrays.asList(exprs));
+ for (Expr expr : exprs) {
Review Comment:
Why duplicate the code? To avoid wrapping the array?
##########
core/src/main/java/org/apache/druid/segment/column/ColumnType.java:
##########
@@ -26,18 +26,71 @@
import javax.annotation.Nullable;
+/**
+ * Native Druid types.
+ *
+ * @see TypeSignature
+ */
@JsonSerialize(using = ToStringSerializer.class)
public class ColumnType extends BaseTypeSignature<ValueType>
{
+ /**
+ * Druid string type. Values will be represented as {@link String} or {@link
java.util.List<String>} in the case
+ * of multi-value string columns. {@link ColumnType} has insufficient
information to distinguish between single
+ * and multi-value strings, this requires a type 'ColumnCapabilities' which
is available at a higher layer.
Review Comment:
Link to the specific hasMultipleValues function? Or mention, if it's not
available from this package.
##########
core/src/main/java/org/apache/druid/segment/column/ValueType.java:
##########
@@ -63,23 +63,37 @@ public enum ValueType implements TypeDescriptor
* String object type. This type may be used as a grouping key, an input to
certain types of complex sketch
* aggregators, and as an input to expression virtual columns. String types
might potentially be 'multi-valued' when
* stored in segments, and contextually at various layers of query
processing, but this information is not available
- * through this enum alone, and must be accompany this type indicator to
properly handle.
+ * at this level.
+ *
+ * Strings are typically represented as {@link String}, but multi-value
strings might also appear as a
+ * {@link java.util.List<String>}.
*/
STRING,
/**
* Placeholder for arbitrary 'complex' types, which have a corresponding
serializer/deserializer implementation. Note
* that knowing a type is complex alone isn't enough information to work
with it directly, and additional information
- * in the form of a type name that is registered in the complex type
registry must be available to make this type
- * meaningful. This type is not currently supported as a grouping key for
aggregations, and may not be used as an
- * input to expression virtual columns, and might only be supported by the
specific aggregators crafted to handle
- * this complex type.
+ * in the form of a type name which must be registered in the complex type
registry. This type is not currently
+ * supported as a grouping key for aggregations, and might only be supported
by the specific aggregators crafted to
+ * handle this complex type. Filtering on this type with standard filters
will most likely have limited support, and
+ * may match as a "null" value.
+ *
+ * These types are represented by the individual Java type associated with
the complex type name as defined in the
+ * type registry.
*/
COMPLEX,
/**
- * Placeholder for arbitrary arrays of other {@link ValueType}. This type is
not currently supported as a grouping
+ * Placeholder for arbitrary arrays of other {@link ValueType}. This type
has limited support as a grouping
* key for aggregations, cannot be used as an input for numerical primitive
aggregations such as sums, and may have
* limited support as an input among complex type sketch aggregators.
+ *
+ * There are currently no native ARRAY typed columns, but they may be
produced by expression virtual columns,
+ * aggregators, and post-aggregators.
+ *
+ * Arrays are typically represented as Object[], but may also be Java
primitive arrays such as long[], double[], or
Review Comment:
Love the idea of this paragraph. How do you feel about this adjustment? The
idea is to firm up the language and give people clearer expectations.
> Arrays are represented as Object[], long[], double[], or float[]. The
preferred type is Object[], since the expression system is the main consumer of
arrays, and the expression system uses Object[] internally. Some code
represents arrays in other ways; in particular the groupBy engine and SQL
result layer. Over time we expect these usages to migrate to Object[], long[],
double[], and float[].
##########
core/src/main/java/org/apache/druid/segment/column/ValueType.java:
##########
@@ -63,23 +63,37 @@ public enum ValueType implements TypeDescriptor
* String object type. This type may be used as a grouping key, an input to
certain types of complex sketch
* aggregators, and as an input to expression virtual columns. String types
might potentially be 'multi-valued' when
* stored in segments, and contextually at various layers of query
processing, but this information is not available
- * through this enum alone, and must be accompany this type indicator to
properly handle.
+ * at this level.
+ *
+ * Strings are typically represented as {@link String}, but multi-value
strings might also appear as a
+ * {@link java.util.List<String>}.
*/
STRING,
/**
* Placeholder for arbitrary 'complex' types, which have a corresponding
serializer/deserializer implementation. Note
* that knowing a type is complex alone isn't enough information to work
with it directly, and additional information
- * in the form of a type name that is registered in the complex type
registry must be available to make this type
- * meaningful. This type is not currently supported as a grouping key for
aggregations, and may not be used as an
- * input to expression virtual columns, and might only be supported by the
specific aggregators crafted to handle
- * this complex type.
+ * in the form of a type name which must be registered in the complex type
registry. This type is not currently
+ * supported as a grouping key for aggregations, and might only be supported
by the specific aggregators crafted to
+ * handle this complex type. Filtering on this type with standard filters
will most likely have limited support, and
Review Comment:
"most likely"? Are there cases where they are treated as anything other than
null?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]