eejbyfeldt commented on code in PR #13134:
URL: https://github.com/apache/datafusion/pull/13134#discussion_r1819669249
##########
datafusion/expr/src/logical_plan/builder.rs:
##########
@@ -1324,6 +1325,25 @@ pub fn change_redundant_column(fields: &Fields) ->
Vec<Field> {
})
.collect()
}
+
+fn mark_field(schema: &DFSchema) -> (Option<TableReference>, Arc<Field>) {
+ let mut table_references = schema
+ .iter()
+ .filter_map(|(qualifier, _)| qualifier)
+ .collect::<Vec<_>>();
+ table_references.dedup();
+ let table_reference = if table_references.len() == 1 {
+ table_references.pop().cloned()
+ } else {
+ None
+ };
+
+ (
+ table_reference,
+ Arc::new(Field::new("mark", DataType::Boolean, false)),
Review Comment:
Given how the join is used when decorrelating subqueries, it will never conflict,
because that path uses a subquery alias (prefixed with `__`) and the new code will
then use that alias for the mark column as well.
But if someone uses `LeftMark` in a query without an alias, the name could
conflict. I don't think just adding `__` would be a perfect fix, though: it can
still conflict with itself if you have multiple joins. Also, if you are
using a `LeftMark` join you will probably want to refer to the mark column, and
naming it `__mark` makes it look like an internal name.
One option could be to make the output column name part of the
`JoinType`, e.g. `LeftMark(Column)`, and then use that for the output. Each
user would then need to make sure that name is sufficiently unique.
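
To make the suggestion concrete, here is a rough, standalone sketch of the idea. It does not use the real DataFusion types: this `JoinType` is a simplified stand-in, the mark column name is a plain `String` rather than a `Column`, and `output_columns` is a hypothetical helper that exists only for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for DataFusion's `JoinType`.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
enum JoinType {
    Inner,
    Left,
    /// Mark join that carries the caller-chosen name of its boolean output column.
    /// (Assumption: a `String` is used here instead of a `Column`.)
    LeftMark(String),
}

/// Hypothetical helper: computes the output column names of a join and
/// rejects the plan if two output columns end up with the same name.
fn output_columns(
    left: &[&str],
    right: &[&str],
    join_type: &JoinType,
) -> Result<Vec<String>, String> {
    let mut out: Vec<String> = left.iter().map(|s| s.to_string()).collect();
    match join_type {
        // A mark join emits the left columns plus the single mark column.
        JoinType::LeftMark(mark) => out.push(mark.clone()),
        // Other joins in this sketch emit both sides.
        _ => out.extend(right.iter().map(|s| s.to_string())),
    }
    {
        // Detect name clashes, e.g. two stacked mark joins using the same name.
        let mut seen: HashSet<&str> = HashSet::new();
        for name in &out {
            if !seen.insert(name.as_str()) {
                return Err(format!("duplicate output column: {}", name));
            }
        }
    }
    Ok(out)
}
```

With a shape like this, two stacked mark joins that both ask for `mark` would be reported as a conflict, while giving the second one a different name (say `mark2`) would go through. The uniqueness is then the caller's responsibility rather than something hidden behind a fixed name.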
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]