siddharthteotia commented on a change in pull request #4535: Implement DISTINCT clause
URL: https://github.com/apache/incubator-pinot/pull/4535#discussion_r319217612
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/plan/TransformPlanNode.java
##########
@@ -124,6 +143,14 @@ public TransformOperator run() {
return new TransformOperator(_projectionPlanNode.run(), new ArrayList<>(_expressionTrees));
}
+ FieldSpec.DataType[] getProjectedColumnTypes() {
Review comment:
Getting this information from the projection block would mean fetching it
every time a new transform block is passed to the aggregation executor. We
need it exactly once, when the operator is created.
The DISTINCT function needs this information because the other functions have
DOUBLE as their result column type; for DISTINCT this is not true. Similarly,
for the other functions the result column is named SUM_COL, AVG_COL, etc.
Again, this does not hold for DISTINCT: we have to project each column name
as it is.
On top of that, since the output of the DISTINCT function is a single object,
we need to preserve the name and type information in that object so that we
can serialize all of it when we send the result to the broker.
I can point to where this is used in the code, but it is indeed necessary.
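
To make the argument concrete, here is a minimal sketch of the idea, not the
code in this PR. It assumes a hypothetical DistinctResult holder and a stand-in
DataType enum (in place of FieldSpec.DataType); the serialize() format is
invented purely for illustration. The column names and types are handed over
once at construction, every incoming block only contributes rows, and the
single result object carries its own schema to the serialization step.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DistinctSketch {
  // Stand-in for FieldSpec.DataType (assumption: the real enum has more members).
  enum DataType { INT, LONG, FLOAT, DOUBLE, STRING }

  // Hypothetical holder: one object that carries both the schema and the rows.
  static final class DistinctResult {
    private final String[] columnNames;
    private final DataType[] columnTypes;
    // LinkedHashSet de-duplicates rows and keeps first-seen order.
    private final Set<List<Object>> rows = new LinkedHashSet<>();

    // Names and types are supplied exactly once, when the holder is created.
    DistinctResult(String[] columnNames, DataType[] columnTypes) {
      if (columnNames.length != columnTypes.length) {
        throw new IllegalArgumentException("names and types must have the same length");
      }
      this.columnNames = columnNames.clone();
      this.columnTypes = columnTypes.clone();
    }

    // Called once per block; no schema lookup is needed here.
    void addBlock(Object[][] block) {
      for (Object[] row : block) {
        rows.add(Arrays.asList(row.clone()));
      }
    }

    int size() {
      return rows.size();
    }

    // Illustrative text format: a "name:TYPE" header line, then one line per row.
    String serialize() {
      StringBuilder sb = new StringBuilder();
      for (int i = 0; i < columnNames.length; i++) {
        if (i > 0) sb.append(',');
        sb.append(columnNames[i]).append(':').append(columnTypes[i]);
      }
      for (List<Object> row : rows) {
        sb.append('\n');
        for (int i = 0; i < row.size(); i++) {
          if (i > 0) sb.append(',');
          sb.append(row.get(i));
        }
      }
      return sb.toString();
    }
  }

  public static void main(String[] args) {
    DistinctResult result = new DistinctResult(
        new String[] {"city", "year"},
        new DataType[] {DataType.STRING, DataType.INT});
    result.addBlock(new Object[][] {{"NYC", 2019}, {"SF", 2019}});
    result.addBlock(new Object[][] {{"NYC", 2019}}); // duplicate row, ignored
    System.out.println(result.serialize());
  }
}
```

In this sketch the header line uses the original column names and their own
types rather than a synthesized name such as SUM_COL with type DOUBLE, which
is the difference described above.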