[
https://issues.apache.org/jira/browse/ARROW-17484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606681#comment-17606681
]
Weston Pace edited comment on ARROW-17484 at 9/19/22 5:19 PM:
--------------------------------------------------------------
Aggregate functions typically produce outputs that are very small compared to
their inputs (e.g. the sum of 1 million rows is a single value), so it very
often makes sense for the output type to be wider than the input type: the
wider result costs almost nothing and is far less likely to overflow.
One could argue that you can simply cast beforehand. However, you would have
to cast the entire array of inputs (e.g. the 1 million rows) and this could be
rather costly.
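To make the two points above concrete, here is a minimal sketch in plain C++. It is not Arrow's kernel code; the helper names (SumWidened, CastToInt64, SumSameType) are hypothetical and exist only for illustration. Widening inside the aggregation needs a single wide accumulator, while casting beforehand materializes a wide copy of every input value.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: sum int8 inputs directly into a 64-bit accumulator.
// Only the accumulator is wide; the input is read as-is and never copied.
int64_t SumWidened(const std::vector<int8_t>& values) {
  int64_t total = 0;
  for (int8_t v : values) total += v;  // each value is promoted on the fly
  return total;
}

// Hypothetical helper: the "cast beforehand" alternative. This allocates
// a new array that is 8x the size of the int8 input.
std::vector<int64_t> CastToInt64(const std::vector<int8_t>& values) {
  return std::vector<int64_t>(values.begin(), values.end());
}

// Hypothetical helper: an aggregate whose output type equals its input type,
// so the caller has to widen the whole input first to avoid overflow.
int64_t SumSameType(const std::vector<int64_t>& values) {
  int64_t total = 0;
  for (int64_t v : values) total += v;
  return total;
}
```

For 1 million int8 inputs both paths give the same sum, but the cast path first allocates roughly 8 MB for the widened copy, whereas the widening path allocates nothing beyond the 8-byte accumulator.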
Finally, we are mirroring SQL here (which is not, by itself, necessarily a good
thing, but it is worth noting). According to the [PostgreSQL
docs|https://www.postgresql.org/docs/8.2/functions-aggregate.html], the return
type of sum is:
{quote}
bigint for smallint or int arguments, numeric for bigint arguments, double
precision for floating-point arguments, otherwise the same as the argument data
type
{quote}
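The quoted rule can be read as a small type-mapping function. The sketch below encodes only the PostgreSQL wording quoted above; it is not a description of Arrow's or Substrait's actual type resolution, and the SqlType enum and SumReturnType function are hypothetical names introduced for illustration.

```cpp
// Hypothetical enum covering the argument types relevant to the quoted rule.
enum class SqlType {
  kSmallInt,
  kInt,
  kBigInt,
  kReal,
  kDoublePrecision,
  kNumeric,
  kInterval
};

// Hypothetical helper: return type of sum() as described in the quoted docs.
SqlType SumReturnType(SqlType arg) {
  switch (arg) {
    case SqlType::kSmallInt:
    case SqlType::kInt:
      return SqlType::kBigInt;           // small integers widen to bigint
    case SqlType::kBigInt:
      return SqlType::kNumeric;          // bigint widens to numeric
    case SqlType::kReal:
    case SqlType::kDoublePrecision:
      return SqlType::kDoublePrecision;  // floating point sums as double
    default:
      return arg;                        // otherwise same as the argument
  }
}
```

Note that only the narrower types are widened; types that are already wide enough, such as numeric, are returned unchanged.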
> [C++] Substrait to Arrow Aggregate doesn't take the provided Output Type for
> aggregates
> ---------------------------------------------------------------------------------------
>
> Key: ARROW-17484
> URL: https://issues.apache.org/jira/browse/ARROW-17484
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Vibhatha Lakmal Abeykoon
> Assignee: Vibhatha Lakmal Abeykoon
> Priority: Major
>
> The current Substrait to Aggregate deserializer doesn't take the
> plan-provided output type as the output type of the execution plan.