[
https://issues.apache.org/jira/browse/ARROW-18403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Felipe Oliveira reassigned ARROW-18403:
---------------------------------------
Assignee: Felipe Oliveira
> [C++] Error consuming Substrait plan which uses count function: "only unary
> aggregate functions are currently supported"
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: ARROW-18403
> URL: https://issues.apache.org/jira/browse/ARROW-18403
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Nicola Crane
> Assignee: Felipe Oliveira
> Priority: Major
> Labels: pull-request-available, substrait
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> ARROW-17523 added support for the Substrait extension function "count", but
> when I generate a Substrait plan that calls it and then try to run that plan
> in Acero, I get an error.
> The plan:
> {code:r}
> message of type 'substrait.Plan' with 3 fields set
> extension_uris {
> extension_uri_anchor: 1
> uri:
> "https://github.com/substrait-io/substrait/blob/main/extensions/functions_arithmetic.yaml"
> }
> extension_uris {
> extension_uri_anchor: 2
> uri:
> "https://github.com/substrait-io/substrait/blob/main/extensions/functions_comparison.yaml"
> }
> extension_uris {
> extension_uri_anchor: 3
> uri:
> "https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml"
> }
> extensions {
> extension_function {
> extension_uri_reference: 3
> function_anchor: 2
> name: "count"
> }
> }
> relations {
> rel {
> aggregate {
> input {
> project {
> common {
> emit {
> output_mapping: 9
> output_mapping: 10
> output_mapping: 11
> output_mapping: 12
> output_mapping: 13
> output_mapping: 14
> output_mapping: 15
> output_mapping: 16
> output_mapping: 17
> }
> }
> input {
> read {
> base_schema {
> names: "int"
> names: "dbl"
> names: "dbl2"
> names: "lgl"
> names: "false"
> names: "chr"
> names: "verses"
> names: "padded_strings"
> names: "some_negative"
> struct_ {
> types {
> i32 {
> nullability: NULLABILITY_NULLABLE
> }
> }
> types {
> fp64 {
> nullability: NULLABILITY_NULLABLE
> }
> }
> types {
> fp64 {
> nullability: NULLABILITY_NULLABLE
> }
> }
> types {
> bool_ {
> nullability: NULLABILITY_NULLABLE
> }
> }
> types {
> bool_ {
> nullability: NULLABILITY_NULLABLE
> }
> }
> types {
> string {
> nullability: NULLABILITY_NULLABLE
> }
> }
> types {
> string {
> nullability: NULLABILITY_NULLABLE
> }
> }
> types {
> string {
> nullability: NULLABILITY_NULLABLE
> }
> }
> types {
> fp64 {
> nullability: NULLABILITY_NULLABLE
> }
> }
> }
> }
> local_files {
> items {
> uri_file: "file:///tmp/RtmpsBsoZJ/file1915f604cff4a"
> parquet {
> }
> }
> }
> }
> }
> expressions {
> selection {
> direct_reference {
> struct_field {
> }
> }
> root_reference {
> }
> }
> }
> expressions {
> selection {
> direct_reference {
> struct_field {
> field: 1
> }
> }
> root_reference {
> }
> }
> }
> expressions {
> selection {
> direct_reference {
> struct_field {
> field: 2
> }
> }
> root_reference {
> }
> }
> }
> expressions {
> selection {
> direct_reference {
> struct_field {
> field: 3
> }
> }
> root_reference {
> }
> }
> }
> expressions {
> selection {
> direct_reference {
> struct_field {
> field: 4
> }
> }
> root_reference {
> }
> }
> }
> expressions {
> selection {
> direct_reference {
> struct_field {
> field: 5
> }
> }
> root_reference {
> }
> }
> }
> expressions {
> selection {
> direct_reference {
> struct_field {
> field: 6
> }
> }
> root_reference {
> }
> }
> }
> expressions {
> selection {
> direct_reference {
> struct_field {
> field: 7
> }
> }
> root_reference {
> }
> }
> }
> expressions {
> selection {
> direct_reference {
> struct_field {
> field: 8
> }
> }
> root_reference {
> }
> }
> }
> }
> }
> groupings {
> grouping_expressions {
> selection {
> direct_reference {
> struct_field {
> field: 3
> }
> }
> root_reference {
> }
> }
> }
> }
> measures {
> measure {
> function_reference: 2
> phase: AGGREGATION_PHASE_INITIAL_TO_RESULT
> output_type {
> i64 {
> nullability: NULLABILITY_NULLABLE
> }
> }
> invocation: AGGREGATION_INVOCATION_ALL
> }
> }
> }
> }
> }
> {code}
> The error:
> {code:java}
> Error: NotImplemented: Only unary aggregate functions are currently supported
> /home/nic2/arrow/cpp/src/arrow/engine/substrait/relation_internal.cc:587
> converter(aggregate_call)
> /home/nic2/arrow/cpp/src/arrow/engine/substrait/serde.cc:153
> FromProto(plan_rel.has_root() ? plan_rel.root().input() : plan_rel.rel(),
> ext_set, conversion_options)
> {code}
> I don't know what the "phase" and "invocation" fields above do. Earlier
> attempts to get Acero to consume this plan failed because I had left those
> fields at their default values (e.g. "Not Implemented: Unsupported
> aggregation phase 'AGGREGATION_PHASE_UNSPECIFIED'"), so I set them to the
> values shown in the plan to see if it helped.
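
One detail that may explain the error: the measure in the plan above carries no arguments at all, while the message suggests the consumer accepts only aggregates with exactly one argument. The Python sketch below is a hypothetical model of such an arity check, not Arrow's actual code. The names Measure, check_unary and ConsumerNotImplemented are invented for illustration, and the "lgl" argument in the second measure is an assumed variant of the plan, not something taken from it.

```python
from dataclasses import dataclass, field
from typing import List


class ConsumerNotImplemented(Exception):
    """Hypothetical stand-in for the NotImplemented status in the report."""


@dataclass
class Measure:
    """Simplified, hypothetical view of one aggregate measure in a plan."""
    function_name: str
    phase: str
    invocation: str
    arguments: List[str] = field(default_factory=list)


def check_unary(measure: Measure) -> str:
    """Return the single argument, or fail the way the reported error does."""
    if len(measure.arguments) != 1:
        raise ConsumerNotImplemented(
            "Only unary aggregate functions are currently supported"
        )
    return measure.arguments[0]


# The measure as it appears in the plan: "count" with no arguments.
count_no_args = Measure(
    "count",
    "AGGREGATION_PHASE_INITIAL_TO_RESULT",
    "AGGREGATION_INVOCATION_ALL",
)

# Assumed variant: the same call with one column argument.
count_one_arg = Measure(
    "count",
    "AGGREGATION_PHASE_INITIAL_TO_RESULT",
    "AGGREGATION_INVOCATION_ALL",
    arguments=["lgl"],
)

try:
    check_unary(count_no_args)
    zero_arg_result = "accepted"
except ConsumerNotImplemented as exc:
    zero_arg_result = str(exc)

one_arg_result = check_unary(count_one_arg)
```

Under this model the zero-argument measure is rejected with the same message as in the report, and the one-argument variant passes. Whether the real consumer behaves this way for "count" would need to be confirmed against relation_internal.cc.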
--
This message was sent by Atlassian Jira
(v8.20.10#820010)