[GitHub] [arrow-datafusion] Jefffrey commented on a diff in pull request #4840: Support wildcard select on multiple column using joins

GitBox Sat, 07 Jan 2023 00:56:49 -0800


Jefffrey commented on code in PR #4840:
URL: https://github.com/apache/arrow-datafusion/pull/4840#discussion_r1063978289



##########
datafusion/expr/src/utils.rs:
##########
@@ -198,9 +207,14 @@ pub fn expand_qualified_wildcard(
             "Invalid qualifier {qualifier}"
         )));
     }
-    let qualifier_schema =
+    let qualified_schema =
         DFSchema::new_with_metadata(qualified_fields, schema.metadata().clone())?;
-    expand_wildcard(&qualifier_schema, plan)
+    // if qualified, allow all columns in output (i.e. ignore using column check)
+    Ok(qualified_schema
+        .fields()
+        .iter()
+        .map(|f| Expr::Column(f.qualified_column()))
+        .collect::<Vec<Expr>>())

Review Comment:
   This is an extra fix. I observed that in PostgreSQL, if you have the
   following query:
   
   ```sql
   select a.*, b.*, c.*
   from categories a
        join categories b using (category_id)
        join categories c using (category_id)
   ;
   ```
   
   then `a.category_id`, `b.category_id` and `c.category_id` are all included
   in the output instead of being omitted for being part of the USING join.
   They have been explicitly qualified through their wildcards, so we
   shouldn't try to deduplicate them the way we do in the non-qualified
   wildcard case.
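
A minimal sketch of the distinction, for illustration only. It does not use DataFusion's types: `Field`, `sample_fields`, and the `name` column are hypothetical stand-ins, and the two functions only mimic the intent of `expand_wildcard` and `expand_qualified_wildcard` on plain strings. It also assumes the unqualified case keeps the first occurrence of a USING column.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a qualified schema field (not DataFusion's DFField).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    qualifier: &'static str,
    name: &'static str,
}

/// Unqualified `*`: a USING column is emitted once (first occurrence wins),
/// every other column is emitted as-is.
fn expand_wildcard(fields: &[Field], using_columns: &[&str]) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    fields
        .iter()
        .filter(|f| !using_columns.contains(&f.name) || seen.insert(f.name))
        .map(|f| format!("{}.{}", f.qualifier, f.name))
        .collect()
}

/// Qualified `q.*`: every column of that relation is emitted,
/// with no USING-column check at all.
fn expand_qualified_wildcard(fields: &[Field], qualifier: &str) -> Vec<String> {
    fields
        .iter()
        .filter(|f| f.qualifier == qualifier)
        .map(|f| format!("{}.{}", f.qualifier, f.name))
        .collect()
}

/// Schema of `categories a JOIN categories b USING (category_id)
/// JOIN categories c USING (category_id)`, assuming two columns per table.
fn sample_fields() -> Vec<Field> {
    let mut fields = Vec::new();
    for qualifier in ["a", "b", "c"] {
        fields.push(Field { qualifier, name: "category_id" });
        fields.push(Field { qualifier, name: "name" });
    }
    fields
}
```

Under these assumptions, `select *` would produce one `category_id` column plus the three `name` columns, while `select a.*, b.*, c.*` would produce all six columns, including `category_id` three times.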



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
