Blizzara opened a new pull request, #13522:
URL: https://github.com/apache/datafusion/pull/13522

   ## Which issue does this PR close?
   
   Closes #13437
   
   ## Rationale for this change
   
   Explained in more detail in the issue, but in short: in the Substrait 
consumer we compare the schema of the input dataset against the schema of the 
Substrait input relation using `logically_equivalent_names_and_types(..)`. 
This in turn calls `datatype_is_logically_equal(..)` on every field, which can 
fail if the technical inner fields of a list or map have differing names. That 
is the case when reading lists from parquet: the parquet reader names the 
inner field "element", whereas DF (incl. the Substrait consumer) mostly uses 
"item".
   
   ## What changes are included in this PR?
   
   Ignore technical inner fields' names when comparing data types for logical 
equivalence.
   
   Arguably this should apply to all equivalence testing, since Arrow 
doesn't mandate any specific names for these fields, so the names shouldn't 
matter. That would be a bigger change, and it might be hard to even enumerate 
all the places to check, so I only made the minimal change I need here. If 
preferred, I can try to expand it to other cases as well - at least 
`datatype_is_semantically_equal`.
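
To illustrate the idea of ignoring inner field names, here is a minimal, self-contained sketch. It is not the code in this PR: `DataType`, `Field`, and `logically_equal` are simplified, hypothetical stand-ins for the Arrow types and for `datatype_is_logically_equal(..)`, and only lists are modelled (maps would be handled analogously).

```rust
// Simplified stand-ins for Arrow's `DataType` and `Field` (assumption:
// these are NOT the real arrow-rs types, only a model for illustration).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    // A list wraps a single "technical" inner field whose name carries
    // no meaning ("item" in most of DF, "element" from the parquet reader).
    List(Box<Field>),
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

impl Field {
    fn new(name: &str, data_type: DataType) -> Self {
        Field { name: name.to_string(), data_type }
    }
}

// Hypothetical helper: compares two data types for logical equivalence,
// ignoring the name of a list's inner field but still comparing the
// names of struct fields, which are user-visible.
fn logically_equal(a: &DataType, b: &DataType) -> bool {
    match (a, b) {
        // Only the inner types matter for lists; the inner names are skipped.
        (DataType::List(x), DataType::List(y)) => {
            logically_equal(&x.data_type, &y.data_type)
        }
        // Struct fields must match pairwise in both name and type.
        (DataType::Struct(xs), DataType::Struct(ys)) => {
            xs.len() == ys.len()
                && xs.iter().zip(ys).all(|(x, y)| {
                    x.name == y.name && logically_equal(&x.data_type, &y.data_type)
                })
        }
        // Everything else falls back to plain equality.
        _ => a == b,
    }
}
```

With this model, `List(Field::new("element", Int32))` and `List(Field::new("item", Int32))` compare as logically equal even though strict `==` treats them as different types, which mirrors the parquet vs. DF naming mismatch described above.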
   
   ## Are these changes tested?
   
   Yes, added a unit test.
   
   ## Are there any user-facing changes?
   
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
