alamb commented on issue #15162:
URL: https://github.com/apache/datafusion/issues/15162#issuecomment-2715232608
Thank you for bringing this up @comphead -- I think we have struggled with
this issue for a while downstream in DataFusion
I think the core fix for this issue is not in how `DataType::List` is
constructed, but in how such types are compared
As @tustvold points out, the field name is arbitrary and not consistent
across arrow implementations. Plumbing some way to change it through might
work, but we'll be forever chasing corner cases.
Thus in my opinion, rather than trying to control the name of the field, a
better approach is to change the places where **`DataType::List`s** are
compared so that they ignore the field name unless it is actually important
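To make that concrete, here is a minimal sketch of such a comparison (the
helper name is hypothetical, not an existing arrow-rs API, and only list
types are handled -- a real implementation would cover Struct, Map, etc. the
same way):
```rust
use arrow_schema::DataType;

/// Hypothetical helper: compare two DataTypes, ignoring the arbitrary
/// name (and metadata) of the child field in list types
fn datatype_loosely_equal(a: &DataType, b: &DataType) -> bool {
    match (a, b) {
        (DataType::List(fa), DataType::List(fb))
        | (DataType::LargeList(fa), DataType::LargeList(fb)) => {
            // Compare only what affects the physical layout of the data
            fa.is_nullable() == fb.is_nullable()
                && datatype_loosely_equal(fa.data_type(), fb.data_type())
        }
        _ => a == b,
    }
}
```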
For example, the specific error that @comphead posted in this issue is
```
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 309.0 failed 1 times, most recent failure: Lost task 0.0 in stage 309.0 (TID 797) (Mac-1741305812954.local executor driver):
org.apache.comet.CometNativeException: Invalid argument error: column types must match schema types,
expected List(Field { name: "element", data_type: Int8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} })
but found List(Field { name: "item", data_type: Int8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} })
at column index 0
```
It seems that the error actually comes from `RecordBatch` construction
within arrow-rs:
https://github.com/apache/arrow-rs/blob/f4fde769ab6e1a9b75f890b7f8b47bc22800830b/arrow-array/src/record_batch.rs#L333
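For reference, here is a small reproduction of that check as I understand it
(column and field names are made up for illustration; assumes a recent
arrow-rs with `ListArray::new` and `OffsetBuffer::from_lengths`):
```rust
use std::sync::Arc;

use arrow_array::{ArrayRef, Int8Array, ListArray, RecordBatch};
use arrow_buffer::OffsetBuffer;
use arrow_schema::{DataType, Field, Schema};

fn main() {
    // The array's child field is named "item" (the arrow-rs default) ...
    let item_field = Arc::new(Field::new("item", DataType::Int8, true));
    let list = ListArray::new(
        item_field,
        OffsetBuffer::from_lengths([3]),
        Arc::new(Int8Array::from(vec![1i8, 2, 3])) as ArrayRef,
        None,
    );

    // ... while the schema expects a child named "element" (as Spark writes)
    let element_field = Arc::new(Field::new("element", DataType::Int8, true));
    let schema = Arc::new(Schema::new(vec![Field::new(
        "col",
        DataType::List(element_field),
        true,
    )]));

    // The strict DataType equality check rejects the batch, even though
    // both list types describe the same physical layout
    let result = RecordBatch::try_new(schema, vec![Arc::new(list) as ArrayRef]);
    assert!(result.is_err());
}
```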
Perhaps we can relax this check, or update `RecordBatch::try_new()` to align
incoming `DataType::List`s with the schema 🤔
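Until then, a user-side workaround (and one possible shape for the "align to
the schema" idea) is to cast mismatched columns to the schema's type before
building the batch -- casting between two list types whose child types
already match is effectively a field rename. A sketch, with a hypothetical
helper name:
```rust
use arrow_array::{Array, ArrayRef, RecordBatch};
use arrow_cast::cast;
use arrow_schema::{ArrowError, SchemaRef};

/// Hypothetical helper: rewrap each column so its DataType matches the
/// schema exactly (field names included) before building the RecordBatch
fn try_new_aligned(
    schema: SchemaRef,
    columns: Vec<ArrayRef>,
) -> Result<RecordBatch, ArrowError> {
    let aligned = schema
        .fields()
        .iter()
        .zip(columns)
        .map(|(field, col)| {
            if col.data_type() == field.data_type() {
                Ok(col)
            } else {
                // e.g. List("item", Int8) -> List("element", Int8)
                cast(col.as_ref(), field.data_type())
            }
        })
        .collect::<Result<Vec<_>, _>>()?;
    RecordBatch::try_new(schema, aligned)
}
```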