xanderbailey commented on code in PR #17299:
URL: https://github.com/apache/datafusion/pull/17299#discussion_r2325802938


##########
datafusion/substrait/src/logical_plan/consumer/rel/project_rel.rs:
##########
@@ -62,7 +62,17 @@ pub async fn from_project_rel(
                 // to transform it into a column reference
                 window_exprs.insert(e.clone());
             }
-            explicit_exprs.push(name_tracker.get_uniquely_named_expr(e)?);
+            // Since substrait removes aliases, we need to assign literals with a UUID alias to avoid
+            // ambiguous names when the same literal is used before and after a join.
+            // The name tracker will ensure that two literals in the same project would have
+            // unique names but, it does not ensure that if a literal column exists in a previous
+            // project say before a join that it is deduplicated with respect to those columns.

Review Comment:
   I think you're right, Arttu, about the other cases. I think `now()`, for example, might also suffer from this.
   
   If the name tracker took in the current DF schema, then I think that would work. I looked at doing that first, but there were a number of call sites for the name tracker that looked a little scary to touch. What do you think?
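
   For illustration only, here is a rough sketch of the schema-aware idea. It assumes a hypothetical `SchemaAwareNameTracker` type, a hypothetical `with_existing_names` constructor, and a made-up `__dedup_N` suffix; none of these are DataFusion's actual name tracker API. The point is only that seeding the set of seen names from the input schema makes a repeated literal such as `Int64(1)` get a fresh name even when the earlier occurrence came from a previous project.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the consumer's name tracker.
/// This is NOT DataFusion's real type; it only illustrates the idea.
struct SchemaAwareNameTracker {
    seen: HashSet<String>,
}

impl SchemaAwareNameTracker {
    /// Seed the tracker with the column names already present in the
    /// input schema (for example, the output columns of a join).
    fn with_existing_names<I, S>(names: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        Self {
            seen: names.into_iter().map(Into::into).collect(),
        }
    }

    /// Return `candidate` if it has not been seen yet; otherwise append a
    /// numeric suffix (an assumed naming scheme) until the name is unique.
    fn unique_name(&mut self, candidate: &str) -> String {
        if self.seen.insert(candidate.to_string()) {
            return candidate.to_string();
        }
        let mut n = 1usize;
        loop {
            let renamed = format!("{}__dedup_{}", candidate, n);
            if self.seen.insert(renamed.clone()) {
                return renamed;
            }
            n += 1;
        }
    }
}

/// Usage: the literal `Int64(1)` already exists in the input schema, so
/// projecting it again yields a renamed column instead of a clash.
fn example() -> Vec<String> {
    let mut tracker = SchemaAwareNameTracker::with_existing_names(["Int64(1)", "a"]);
    vec![
        tracker.unique_name("Int64(1)"),
        tracker.unique_name("b"),
    ]
}
```

   With something along these lines, the same mechanism would cover literals and calls like `now()` alike, at the cost of threading the schema through every call site that constructs the tracker.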



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

