Re: [PR] fix: add missing columns into list directly [datafusion]

via GitHub Sat, 25 Jan 2025 00:01:03 -0800


lichuang commented on PR #14180:
URL: https://github.com/apache/datafusion/pull/14180#issuecomment-2613832659


   > > @jonahgao in [#10234 (comment)](https://github.com/apache/datafusion/pull/10234#issuecomment-2087760241) comment:
   > > > I think that we should handle ORDER BY similarly to HAVING, use the 
merged schema, add the missing columns directly in the select list, instead of 
traversing the plan looking for projection node.
   > > 
   > > 
   > > If we don't traverse the plan looking for the projection node, how do we run the `ambiguous_distinct_check`? That check is only called when handling the projection node under distinct; the `test_distinct_on_sort_by_unprojected` test covers this case.
   > 
   > My thought is to handle it in [select_to_plan](https://github.com/apache/datafusion/blob/274e5356ceb4c559ab4105478e75817a302d2f13/datafusion/sql/src/select.rs#L51), where the original select list and the [distinct](https://github.com/apache/datafusion/blob/274e5356ceb4c559ab4105478e75817a302d2f13/datafusion/sql/src/select.rs#L240) flag are both available.
   
   @jonahgao `select_to_plan` only runs when the query comes in as SQL, but people sometimes use the `DataFrame` API directly. `test_distinct_sort_by_unprojected` is such a case, so a check that lives only in `select_to_plan` would not cover the `DataFrame` API.
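
To make the concern concrete, here is a minimal toy sketch using only the standard library. It does not use DataFusion's real types: `ToyPlan`, `PlanError`, `toy_ambiguous_distinct_check`, `sql_entry`, `dataframe_entry` and `example_plan` are all hypothetical names invented for illustration. It assumes the check rejects a distinct query that sorts by a column missing from the select list, and shows that a check placed only on the SQL entry point is skipped when the plan is built through another path.

```rust
// Toy model only: none of these types or functions are DataFusion APIs.

#[derive(Debug, PartialEq)]
enum PlanError {
    /// A DISTINCT query sorts by a column that is not in the select list.
    AmbiguousDistinct(String),
}

/// Stand-in for a plan: SELECT [DISTINCT] <projected> ORDER BY <sort_by>.
#[derive(Debug)]
struct ToyPlan {
    projected: Vec<String>,
    distinct: bool,
    sort_by: Vec<String>,
}

/// Hypothetical version of the check: under DISTINCT, every sort column
/// must also appear in the projected columns.
fn toy_ambiguous_distinct_check(plan: &ToyPlan) -> Result<(), PlanError> {
    if !plan.distinct {
        return Ok(());
    }
    for column in &plan.sort_by {
        if !plan.projected.contains(column) {
            return Err(PlanError::AmbiguousDistinct(column.clone()));
        }
    }
    Ok(())
}

/// SQL-style entry point: the check lives here and nowhere else.
fn sql_entry(plan: ToyPlan) -> Result<ToyPlan, PlanError> {
    toy_ambiguous_distinct_check(&plan)?;
    Ok(plan)
}

/// DataFrame-style entry point: builds the plan without going through
/// `sql_entry`, so the check above never runs.
fn dataframe_entry(plan: ToyPlan) -> Result<ToyPlan, PlanError> {
    Ok(plan)
}

/// SELECT DISTINCT a ... ORDER BY b, where `b` is not projected.
fn example_plan() -> ToyPlan {
    ToyPlan {
        projected: vec!["a".to_string()],
        distinct: true,
        sort_by: vec!["b".to_string()],
    }
}
```

In this sketch `sql_entry(example_plan())` returns an error while `dataframe_entry(example_plan())` succeeds, which is the gap described above: the check has to sit somewhere both paths reach.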
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


