sidneymau opened a new issue, #47902:
URL: https://github.com/apache/arrow/issues/47902
### Describe the enhancement requested
As in #43716, it would be useful for joins to support fields with struct
types.
The following pyarrow code, adapted from the example Anja shared in that
issue, illustrates the case:
```
import pyarrow as pa
import pyarrow.acero as acero

# ---
table_1 = pa.table({"a": [1, 2, 3], "b": ["x", "y", "z"]})
table_1_node = acero.Declaration(
    "table_source", options=acero.TableSourceNodeOptions(table_1)
)

table_2 = pa.table(
    {"a": [1, 2, 3], "c": [{"x": 1, "y": 2}, {"x": 2, "y": 2}, {"x": 3, "y": 2}]}
)
table_2_node = acero.Declaration(
    "table_source", options=acero.TableSourceNodeOptions(table_2)
)

expected = pa.table(
    {
        "a": [1, 2, 3],
        "b": ["x", "y", "z"],
        "c": [{"x": 1, "y": 2}, {"x": 2, "y": 2}, {"x": 3, "y": 2}],
    }
)

# ---
hash_join_options = acero.HashJoinNodeOptions(
    "left outer", left_keys=["a"], right_keys=["a"]
)
join_node = acero.Declaration(
    "hashjoin", options=hash_join_options, inputs=[table_1_node, table_2_node]
)

result = join_node.to_table()
assert result == expected
```
When run, the join operation raises the following error:
```
result = join_node.to_table()
File "pyarrow/_acero.pyx", line 592, in pyarrow._acero.Declaration.to_table
File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Data type struct<x: int64, y: int64> is not supported in join non-key field c
```
I _think_ addressing this may depend on #45001, but I am not positive.
### Component(s)
C++