neilconway opened a new pull request, #20439: URL: https://github.com/apache/datafusion/pull/20439
## Which issue does this PR close?

- Closes #20438.

## Rationale for this change

The substrait consumer parsed field references in correlated subqueries incorrectly. Field references were always resolved relative to the schema of the current (innermost) subquery, even when they referred to a column of an enclosing query, leading to incorrect results.

## What changes are included in this PR?

We now maintain a stack of outer query schemas, pushing and popping elements as we traverse subqueries. When resolving field references, we use `FieldReference.root_type` to detect outer query field references and resolve them against the appropriate schema.

This commit updates the expected results for parsing TPC-H queries, because several of them were parsed incorrectly. The misparsing was probably not detected because, by sheer luck, the incorrect parse didn't result in any illegal queries.

## Are these changes tested?

Yes. Test results have been updated to reflect the new, correct behavior, and new unit tests have been added.

## Are there any user-facing changes?

The behavior of the substrait consumer has changed, although the previous behavior was wrong and it seems unlikely that anyone depended on it.

The `DefaultSubstraitConsumer` API has changed slightly (new private field).
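
The schema-stack idea described under "What changes are included in this PR?" can be illustrated with a minimal, self-contained sketch. This is not the code from the PR: `Schema`, `RootType`, `Resolved`, `Consumer`, `with_outer_schema`, and `resolve` are all hypothetical stand-ins invented for illustration, and the `steps_out` field is assumed to behave like the outer-reference depth in Substrait (1 meaning the immediately enclosing query).

```rust
/// Simplified stand-in for a query schema: an ordered list of column names.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    columns: Vec<String>,
}

impl Schema {
    fn new(columns: &[&str]) -> Self {
        Schema {
            columns: columns.iter().map(|c| c.to_string()).collect(),
        }
    }
}

/// Hypothetical, reduced model of a field reference's root type.
#[derive(Debug, Clone, Copy)]
enum RootType {
    /// Refers to the schema of the current (innermost) query.
    Root,
    /// Refers to an enclosing query; assumed to be 1-based.
    Outer { steps_out: usize },
}

/// Result of resolving a field reference in this sketch.
#[derive(Debug, PartialEq)]
enum Resolved {
    Column(String),
    OuterColumn(String),
}

/// Hypothetical consumer holding a stack of outer query schemas.
#[derive(Default)]
struct Consumer {
    outer_schemas: Vec<Schema>,
}

impl Consumer {
    /// Push `outer` while a subquery is processed by `f`, then pop it again.
    fn with_outer_schema<T>(&mut self, outer: Schema, f: impl FnOnce(&mut Self) -> T) -> T {
        self.outer_schemas.push(outer);
        let result = f(self);
        self.outer_schemas.pop();
        result
    }

    /// Resolve column `index` against the current schema or an outer one,
    /// depending on the root type. Returns `None` if the reference is invalid.
    fn resolve(&self, current: &Schema, root: RootType, index: usize) -> Option<Resolved> {
        match root {
            RootType::Root => current.columns.get(index).cloned().map(Resolved::Column),
            RootType::Outer { steps_out } => {
                if steps_out == 0 {
                    return None; // an outer reference must cross at least one boundary
                }
                // steps_out == 1 selects the top of the stack.
                let depth = self.outer_schemas.len().checked_sub(steps_out)?;
                self.outer_schemas
                    .get(depth)?
                    .columns
                    .get(index)
                    .cloned()
                    .map(Resolved::OuterColumn)
            }
        }
    }
}
```

In this sketch, a caller would wrap the traversal of each subquery in `with_outer_schema`, so that the same field index resolves to a different column depending on whether the reference is a root reference or an outer reference. Resolving every reference against `current` alone reproduces the kind of misparse this PR describes.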
