[ 
https://issues.apache.org/jira/browse/ARROW-18403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe Oliveira reassigned ARROW-18403:
---------------------------------------

    Assignee: Felipe Oliveira

> [C++] Error consuming Substrait plan which uses count function: "only unary 
> aggregate functions are currently supported"
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-18403
>                 URL: https://issues.apache.org/jira/browse/ARROW-18403
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Nicola Crane
>            Assignee: Felipe Oliveira
>            Priority: Major
>              Labels: pull-request-available, substrait
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> ARROW-17523 added support for the Substrait extension function "count", but 
> when I write code which produces a Substrait plan which calls it, and then 
> try to run it in Acero, I get an error.
> The plan:
> {code:r}
> message of type 'substrait.Plan' with 3 fields set
> extension_uris {
>   extension_uri_anchor: 1
>   uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_arithmetic.yaml"
> }
> extension_uris {
>   extension_uri_anchor: 2
>   uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_comparison.yaml"
> }
> extension_uris {
>   extension_uri_anchor: 3
>   uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml"
> }
> extensions {
>   extension_function {
>     extension_uri_reference: 3
>     function_anchor: 2
>     name: "count"
>   }
> }
> relations {
>   rel {
>     aggregate {
>       input {
>         project {
>           common {
>             emit {
>               output_mapping: 9
>               output_mapping: 10
>               output_mapping: 11
>               output_mapping: 12
>               output_mapping: 13
>               output_mapping: 14
>               output_mapping: 15
>               output_mapping: 16
>               output_mapping: 17
>             }
>           }
>           input {
>             read {
>               base_schema {
>                 names: "int"
>                 names: "dbl"
>                 names: "dbl2"
>                 names: "lgl"
>                 names: "false"
>                 names: "chr"
>                 names: "verses"
>                 names: "padded_strings"
>                 names: "some_negative"
>                 struct_ {
>                   types {
>                     i32 {
>                       nullability: NULLABILITY_NULLABLE
>                     }
>                   }
>                   types {
>                     fp64 {
>                       nullability: NULLABILITY_NULLABLE
>                     }
>                   }
>                   types {
>                     fp64 {
>                       nullability: NULLABILITY_NULLABLE
>                     }
>                   }
>                   types {
>                     bool_ {
>                       nullability: NULLABILITY_NULLABLE
>                     }
>                   }
>                   types {
>                     bool_ {
>                       nullability: NULLABILITY_NULLABLE
>                     }
>                   }
>                   types {
>                     string {
>                       nullability: NULLABILITY_NULLABLE
>                     }
>                   }
>                   types {
>                     string {
>                       nullability: NULLABILITY_NULLABLE
>                     }
>                   }
>                   types {
>                     string {
>                       nullability: NULLABILITY_NULLABLE
>                     }
>                   }
>                   types {
>                     fp64 {
>                       nullability: NULLABILITY_NULLABLE
>                     }
>                   }
>                 }
>               }
>               local_files {
>                 items {
>                   uri_file: "file:///tmp/RtmpsBsoZJ/file1915f604cff4a"
>                   parquet {
>                   }
>                 }
>               }
>             }
>           }
>           expressions {
>             selection {
>               direct_reference {
>                 struct_field {
>                 }
>               }
>               root_reference {
>               }
>             }
>           }
>           expressions {
>             selection {
>               direct_reference {
>                 struct_field {
>                   field: 1
>                 }
>               }
>               root_reference {
>               }
>             }
>           }
>           expressions {
>             selection {
>               direct_reference {
>                 struct_field {
>                   field: 2
>                 }
>               }
>               root_reference {
>               }
>             }
>           }
>           expressions {
>             selection {
>               direct_reference {
>                 struct_field {
>                   field: 3
>                 }
>               }
>               root_reference {
>               }
>             }
>           }
>           expressions {
>             selection {
>               direct_reference {
>                 struct_field {
>                   field: 4
>                 }
>               }
>               root_reference {
>               }
>             }
>           }
>           expressions {
>             selection {
>               direct_reference {
>                 struct_field {
>                   field: 5
>                 }
>               }
>               root_reference {
>               }
>             }
>           }
>           expressions {
>             selection {
>               direct_reference {
>                 struct_field {
>                   field: 6
>                 }
>               }
>               root_reference {
>               }
>             }
>           }
>           expressions {
>             selection {
>               direct_reference {
>                 struct_field {
>                   field: 7
>                 }
>               }
>               root_reference {
>               }
>             }
>           }
>           expressions {
>             selection {
>               direct_reference {
>                 struct_field {
>                   field: 8
>                 }
>               }
>               root_reference {
>               }
>             }
>           }
>         }
>       }
>       groupings {
>         grouping_expressions {
>           selection {
>             direct_reference {
>               struct_field {
>                 field: 3
>               }
>             }
>             root_reference {
>             }
>           }
>         }
>       }
>       measures {
>         measure {
>           function_reference: 2
>           phase: AGGREGATION_PHASE_INITIAL_TO_RESULT
>           output_type {
>             i64 {
>               nullability: NULLABILITY_NULLABLE
>             }
>           }
>           invocation: AGGREGATION_INVOCATION_ALL
>         }
>       }
>     }
>   }
> }
> {code}
> The error:
> {code:java}
> Error: NotImplemented: Only unary aggregate functions are currently supported
> /home/nic2/arrow/cpp/src/arrow/engine/substrait/relation_internal.cc:587  converter(aggregate_call)
> /home/nic2/arrow/cpp/src/arrow/engine/substrait/serde.cc:153  FromProto(plan_rel.has_root() ? plan_rel.root().input() : plan_rel.rel(), ext_set, conversion_options)
> {code}
> I don't know what the "phase" and "invocation" fields above do. Earlier 
> attempts to get Acero to consume this plan failed because I had left them at 
> their default values (e.g. "Not Implemented: Unsupported aggregation phase 
> 'AGGREGATION_PHASE_UNSPECIFIED'"), so I set them to the values shown above 
> to see if it helped.
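
For readers following the report: the measure in the plan above carries no
arguments entries, so it is a zero-argument count (a count of rows), while the
error text suggests the consumer accepted only one-argument aggregates at the
time. The sketch below is a hypothetical Python model of that situation, not
Arrow's code. The Measure class, the check_measure function, and the assumption
that AGGREGATION_PHASE_INITIAL_TO_RESULT is the only accepted phase are all
illustrative; only the two error messages are taken from the report.

```python
# Hypothetical model of the measure in the reported plan; not Arrow's code.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Measure:
    function_name: str
    phase: str
    invocation: str
    # Names of the argument columns; empty for a zero-argument count.
    arguments: List[str] = field(default_factory=list)


def check_measure(measure: Measure) -> str:
    """Return "ok" or an error modelled on the messages quoted in the report."""
    # Assumption: only this phase is accepted by the consumer.
    if measure.phase != "AGGREGATION_PHASE_INITIAL_TO_RESULT":
        return f"NotImplemented: Unsupported aggregation phase '{measure.phase}'"
    # Assumption: the consumer requires exactly one argument per aggregate.
    if len(measure.arguments) != 1:
        return "NotImplemented: Only unary aggregate functions are currently supported"
    return "ok"


# The measure as it appears in the plan: count with no arguments.
count_rows = Measure(
    "count",
    "AGGREGATION_PHASE_INITIAL_TO_RESULT",
    "AGGREGATION_INVOCATION_ALL",
)

# A hypothetical unary variant that counts a single column.
count_lgl = Measure(
    "count",
    "AGGREGATION_PHASE_INITIAL_TO_RESULT",
    "AGGREGATION_INVOCATION_ALL",
    ["lgl"],
)

print(check_measure(count_rows))
print(check_measure(count_lgl))
```

Under these assumptions the zero-argument measure reproduces the reported
message, and the one-argument variant passes the check.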



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
