[
https://issues.apache.org/jira/browse/ARROW-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17604965#comment-17604965
]
Richard Tia commented on ARROW-17061:
-------------------------------------
I tried again using the example from the issue, and it now fails with a different error:
{code:java}
> ???
E pyarrow.lib.ArrowNotImplementedError: Only unary aggregate functions are currently supported
{code}
Here's the plan:
{code:java}
{
  "extensionUris": [{
    "extensionUriAnchor": 1,
    "uri": "AGGREGATE_URI_PLACEHOLDER"
  }],
  "extensions": [{
    "extensionFunction": {
      "extensionUriReference": 1,
      "functionAnchor": 0,
      "name": "count"
    }
  }],
  "relations": [{
    "root": {
      "input": {
        "aggregate": {
          "common": {
            "direct": {
            }
          },
          "input": {
            "project": {
              "common": {
                "emit": {
                  "outputMapping": [9]
                }
              },
              "input": {
                "read": {
                  "common": {
                    "direct": {
                    }
                  },
                  "baseSchema": {
                    "names": ["O_ORDERKEY", "O_CUSTKEY", "O_ORDERSTATUS",
                      "O_TOTALPRICE", "O_ORDERDATE", "O_ORDERPRIORITY", "O_CLERK", "O_SHIPPRIORITY",
                      "O_COMMENT"],
                    "struct": {
                      "types": [{
                        "i32": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "i32": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "string": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "decimal": {
                          "scale": 2,
                          "precision": 15,
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "date": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "string": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "string": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "i32": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }, {
                        "string": {
                          "typeVariationReference": 0,
                          "nullability": "NULLABILITY_REQUIRED"
                        }
                      }],
                      "typeVariationReference": 0,
                      "nullability": "NULLABILITY_REQUIRED"
                    }
                  },
                  "local_files": {
                    "items": [
                      {
                        "uri_file": "file://FILENAME_PLACEHOLDER_0",
                        "parquet": {}
                      }
                    ]
                  }
                }
              },
              "expressions": [{
                "selection": {
                  "directReference": {
                    "structField": {
                      "field": 5
                    }
                  },
                  "rootReference": {
                  }
                }
              }]
            }
          },
          "groupings": [{
            "groupingExpressions": [{
              "selection": {
                "directReference": {
                  "structField": {
                    "field": 0
                  }
                },
                "rootReference": {
                }
              }
            }]
          }],
          "measures": [{
            "measure": {
              "functionReference": 0,
              "args": [],
              "sorts": [],
              "phase": "AGGREGATION_PHASE_INITIAL_TO_RESULT",
              "outputType": {
                "i64": {
                  "typeVariationReference": 0,
                  "nullability": "NULLABILITY_REQUIRED"
                }
              },
              "invocation": "AGGREGATION_INVOCATION_ALL",
              "arguments": []
            }
          }]
        }
      },
      "names": ["O_ORDERPRIORITY", "ORDER_COUNT"]
    }
  }],
  "expectedTypeUrls": []
}
{code}
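One possible reading of the new error, which I have not confirmed against the Acero source: the measure in this plan has empty "args" and "arguments" lists, so count is invoked with zero arguments rather than one. The sketch below is a quick way to check the argument count of every aggregate measure in a plan JSON. The names measure_arities and trimmed are made up for illustration, and the trimmed plan keeps only the fields that the check reads.

```python
import json


def measure_arities(plan):
    """Return (function name, argument count) for every aggregate measure in the plan."""
    # Map each declared function anchor to its name.
    names = {
        ext["extensionFunction"]["functionAnchor"]: ext["extensionFunction"]["name"]
        for ext in plan.get("extensions", [])
        if "extensionFunction" in ext
    }
    found = []

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "measures":
                    for item in value:
                        measure = item["measure"]
                        # The plan carries both "args" and "arguments";
                        # count whichever list is longer.
                        argc = max(
                            len(measure.get("args", [])),
                            len(measure.get("arguments", [])),
                        )
                        anchor = measure.get("functionReference", 0)
                        found.append((names.get(anchor, "?"), argc))
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(plan)
    return found


# Trimmed copy of the plan above: only the extension declaration and the measure.
trimmed = json.loads("""
{
  "extensions": [{"extensionFunction": {
    "extensionUriReference": 1, "functionAnchor": 0, "name": "count"}}],
  "relations": [{"root": {"input": {"aggregate": {
    "measures": [{"measure": {
      "functionReference": 0, "args": [], "arguments": []}}]
  }}}}]
}
""")

for name, argc in measure_arities(trimmed):
    print(f"{name}: {argc} argument(s)")  # prints "count: 0 argument(s)"
```

If that reading is right, the plan Isthmus produces for count(*) is a zero-argument call, which would explain why a consumer that only accepts one-argument aggregates rejects it.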
> [Python][Substrait] Acero consumer is unable to consume count function from substrait query plan
> ------------------------------------------------------------------------------------------------
>
> Key: ARROW-17061
> URL: https://issues.apache.org/jira/browse/ARROW-17061
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: Richard Tia
> Assignee: Vibhatha Lakmal Abeykoon
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> SQL
> {code:java}
> SELECT
>   o_orderpriority,
>   count(*) AS order_count
> FROM
>   orders
> GROUP BY
>   o_orderpriority
> {code}
> The Substrait plan was generated from the SQL above using Isthmus.
>
> substrait count:
> [https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml]
>
> Running the substrait plan with Acero returns this error:
> {code:java}
> E pyarrow.lib.ArrowInvalid: JsonToBinaryStream returned INVALID_ARGUMENT:(relations[0].root.input.aggregate.measures[0].measure) arguments: Cannot find field.
> {code}
>
> From substrait query plan:
> relations[0].root.input.aggregate.measures[0].measure
> {code:java}
> "measure": {
>   "functionReference": 0,
>   "args": [],
>   "sorts": [],
>   "phase": "AGGREGATION_PHASE_INITIAL_TO_RESULT",
>   "outputType": {
>     "i64": {
>       "typeVariationReference": 0,
>       "nullability": "NULLABILITY_REQUIRED"
>     }
>   },
>   "invocation": "AGGREGATION_INVOCATION_ALL",
>   "arguments": []
> }
> {code}
> {code:java}
> "extensions": [{
>   "extensionFunction": {
>     "extensionUriReference": 1,
>     "functionAnchor": 0,
>     "name": "count:opt"
>   }
> }],
> {code}
> Count is a unary function and should be consumable, but isn't in this case.