FauxFaux opened a new issue #1516:
URL: https://github.com/apache/arrow-datafusion/issues/1516
**Describe the bug**
An aggregation query against a `FixedSizeBinary` column returns an internal
error:
`Error: Arrow error: External error: Execution error: Arrow error: External
error: Internal error: Unsupported data type in hasher: FixedSizeBinary(16).`
**To Reproduce**
Steps to reproduce the behavior:
`ctx.sql("select fsb, count(*) from tbl group by fsb")` for some column `fsb`
of type `FixedSizeBinary`.
**Expected behavior**
I expect this column to be treated as if it were a `Binary`: two values are
equal when they have the same length and the same bytes.
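To illustrate the semantics I mean, here is a minimal standalone sketch using only the Rust standard library. It is not DataFusion's hasher and does not use any Arrow types; the function name `group_count` and the flat-buffer layout are assumptions made for the example. It groups fixed-width values by their raw bytes, relying on the fact that `&[u8]` hashes and compares by length and contents.

```rust
use std::collections::HashMap;

/// Count occurrences of each fixed-width value stored back to back in `values`.
/// `width` plays the role of N in `FixedSizeBinary(N)` (hypothetical layout,
/// chosen for illustration only).
fn group_count(values: &[u8], width: usize) -> HashMap<&[u8], usize> {
    assert!(
        width > 0 && values.len() % width == 0,
        "buffer must contain a whole number of values"
    );
    let mut counts: HashMap<&[u8], usize> = HashMap::new();
    // Each chunk is one value; byte slices hash and compare by length and bytes.
    for key in values.chunks_exact(width) {
        *counts.entry(key).or_insert(0) += 1;
    }
    counts
}
```

For a buffer holding three 16-byte keys where the first and third are identical, this yields two groups, one of them with a count of 2, which is the result I would expect from the `group by` above.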
**Additional context**
I realise there's very little support for `FixedSizeBinary` columns anywhere
else; I have built my own equality as a UDF. Not sure if there's a wider plan
here, like "always treat them as `Binary`".
I'm pulling these from Parquet; they are legitimately binary opaque keys,
exactly like a (v4) UUID, and I believe a FixedSizeBinary is the right type for
them here. They have duplicates in a way that Parquet's compression handles
well.