[ 
https://issues.apache.org/jira/browse/ARROW-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17604965#comment-17604965
 ] 

Richard Tia commented on ARROW-17061:
-------------------------------------

I tried again using the example from the issue and now get a different error:

{code:java}
>   ???
E   pyarrow.lib.ArrowNotImplementedError: Only unary aggregate functions are currently supported
{code}
Here's the plan:
{code:java}
{
  "extensionUris": [{
    "extensionUriAnchor": 1,
    "uri": "AGGREGATE_URI_PLACEHOLDER"
  }],
  "extensions": [{
    "extensionFunction": {
      "extensionUriReference": 1,
      "functionAnchor": 0,
      "name": "count"
    }
  }],
  "relations": [{
    "root": {
      "input": {
        "aggregate": {
          "common": {
            "direct": {
            }
          },
          "input": {
            "project": {
              "common": {
                "emit": {
                  "outputMapping": [9]
                }
              },
              "input": {
                "read": {
                  "common": {
                    "direct": {
                    }
                  },
                  "baseSchema": {
                    "names": ["O_ORDERKEY", "O_CUSTKEY", "O_ORDERSTATUS", "O_TOTALPRICE", "O_ORDERDATE", "O_ORDERPRIORITY", "O_CLERK", "O_SHIPPRIORITY", "O_COMMENT"],
                    "struct": {
                      "types": [{
                        "i32": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "i32": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "string": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "decimal": {
                          "scale": 2,
                          "precision": 15,
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "date": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "string": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "string": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "i32": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "string": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }],
                      "typeVariationReference": 0,
                      "nullability": "NULLABILITY_REQUIRED"
                    }
                  },
                  "local_files": {
                    "items": [
                      {
                        "uri_file": "file://FILENAME_PLACEHOLDER_0",
                        "parquet": {}
                      }
                    ]
                  }
                }
              },
              "expressions": [{
                "selection": {
                  "directReference": {
                    "structField": {
                      "field": 5
                    }
                  },
                  "rootReference": {
                  }
                }
              }]
            }
          },
          "groupings": [{
            "groupingExpressions": [{
              "selection": {
                "directReference": {
                  "structField": {
                    "field": 0
                  }
                },
                "rootReference": {
                }
              }
            }]
          }],
          "measures": [{
            "measure": {
              "functionReference": 0,
              "args": [],
              "sorts": [],
              "phase": "AGGREGATION_PHASE_INITIAL_TO_RESULT",
              "outputType": {
                "i64": {
                  "typeVariationReference": 0,
                  "nullability": "NULLABILITY_REQUIRED"
                }
              },
              "invocation": "AGGREGATION_INVOCATION_ALL",
              "arguments": []
            }
          }]
        }
      },
      "names": ["O_ORDERPRIORITY", "ORDER_COUNT"]
    }
  }],
  "expectedTypeUrls": []
} {code}
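
To make the mismatch with the "unary" error easier to see, here is a minimal sketch that lists each aggregate measure in a plan together with its argument count. It uses only the Python standard library and a trimmed copy of the plan above. The helper names (function_names, iter_measures, measure_arities) are hypothetical and are not part of pyarrow or Substrait. I am also assuming that "args" and "arguments" are two spellings of the same argument list, so the sketch reads whichever one is populated.

```python
import json

# Trimmed copy of the plan above: only the parts needed to inspect measures.
PLAN_JSON = """
{
  "extensions": [{
    "extensionFunction": {
      "extensionUriReference": 1,
      "functionAnchor": 0,
      "name": "count"
    }
  }],
  "relations": [{
    "root": {
      "input": {
        "aggregate": {
          "measures": [{
            "measure": {
              "functionReference": 0,
              "args": [],
              "arguments": []
            }
          }]
        }
      }
    }
  }]
}
"""


def function_names(plan):
    """Map each functionAnchor to its declared extension function name."""
    names = {}
    for ext in plan.get("extensions", []):
        fn = ext.get("extensionFunction")
        if fn is not None:
            names[fn.get("functionAnchor", 0)] = fn["name"]
    return names


def iter_measures(node):
    """Yield every "measure" object found anywhere in the plan tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "measures":
                for entry in value:
                    yield entry["measure"]
            else:
                yield from iter_measures(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_measures(item)


def measure_arities(plan):
    """Return (function name, argument count) for each aggregate measure."""
    names = function_names(plan)
    result = []
    for measure in iter_measures(plan):
        ref = measure.get("functionReference", 0)
        # Assumption: "args" and "arguments" hold the same list under two
        # field names, so use whichever one is non-empty.
        arguments = measure.get("arguments") or measure.get("args") or []
        result.append((names.get(ref, "<unknown>"), len(arguments)))
    return result


plan = json.loads(PLAN_JSON)
for name, arity in measure_arities(plan):
    print(f"{name}: {arity} argument(s)")
```

For this plan the sketch reports count with 0 arguments, i.e. the count(*) measure is nullary rather than unary, which is consistent with the ArrowNotImplementedError above.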

> [Python][Substrait] Acero consumer is unable to consume count function from 
> substrait query plan
> ------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-17061
>                 URL: https://issues.apache.org/jira/browse/ARROW-17061
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Richard Tia
>            Assignee: Vibhatha Lakmal Abeykoon
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> SQL
> {code:java}
> SELECT
>     o_orderpriority,
>     count(*) AS order_count
> FROM
>     orders
> GROUP BY
>     o_orderpriority{code}
> The Substrait plan was generated from the SQL above using Isthmus.
>  
> substrait count: 
> [https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml]
>  
> Running the substrait plan with Acero returns this error:
> {code:java}
> E   pyarrow.lib.ArrowInvalid: JsonToBinaryStream returned INVALID_ARGUMENT:(relations[0].root.input.aggregate.measures[0].measure) arguments: Cannot find field.
> {code}
>  
> From substrait query plan:
> relations[0].root.input.aggregate.measures[0].measure
> {code:java}
> "measure": {
>   "functionReference": 0,
>   "args": [],
>   "sorts": [],
>   "phase": "AGGREGATION_PHASE_INITIAL_TO_RESULT",
>   "outputType": {
>     "i64": {
>       "typeVariationReference": 0,
>       "nullability": "NULLABILITY_REQUIRED"
>     }
>   },
>   "invocation": "AGGREGATION_INVOCATION_ALL",
>   "arguments": []
> }{code}
> {code:java}
> "extensions": [{
>   "extensionFunction": {
>     "extensionUriReference": 1,
>     "functionAnchor": 0,
>     "name": "count:opt"
>   }
> }],{code}
> Count is a unary function and should be consumable, but isn't in this case.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
