[
https://issues.apache.org/jira/browse/ARROW-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240001#comment-17240001
]
Andy Grove commented on ARROW-10732:
------------------------------------
[~alamb] [~jorgecarleitao] [~nevime] In order to implement this I need to make
a design decision that impacts the core Arrow crate and potentially IPC, so I
thought it would be good to discuss it here first. I'm not sure whether it
warrants its own Google design doc, but I am happy to create one if you think
that would be helpful.
The issue is that when translating a SQL AST into a query plan, we need to be
able to reference columns using compound keys such as "customer.id". When we
support structs in SQL, we will also need this to represent projections into
structs, e.g. "customer.address.street". I can easily update Expr::Column to
support compound names, or even add a new Expr::CompoundColumnReference, but
the problem is that we represent the schema of each LogicalPlan using the
Arrow Schema and Field structs, and Field does not currently support compound
names:
{code:java}
pub struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
    dict_id: i64,
    dict_is_ordered: bool,
}
{code}
I can work (hack) around this by storing the fully qualified name in the
string, in the form "table.column", and then adding logic to the SQL planner
to look up a field either by its fully qualified name or by its simple name
(while also checking that the simple name is not an ambiguous reference). The
other option would be to change Field to support compound names and/or to add
metadata that we can use.
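To make the workaround concrete, here is a rough sketch of the lookup logic, assuming field names are stored as "table.column". The Field below is a simplified stand-in for the Arrow struct, and find_field and LookupError are hypothetical names used only for illustration, not existing Arrow or DataFusion APIs.

```rust
// Simplified stand-in for arrow's Field; only the name matters for this sketch.
#[derive(Debug, Clone)]
struct Field {
    name: String,
}

// Hypothetical error type for failed column resolution.
#[derive(Debug, PartialEq)]
enum LookupError {
    NotFound(String),
    Ambiguous(String),
}

/// Hypothetical helper: resolve a column reference against fields whose
/// names are stored fully qualified, e.g. "t1.id". Returns the field index.
fn find_field(fields: &[Field], reference: &str) -> Result<usize, LookupError> {
    // A reference that matches a stored name exactly wins outright.
    if let Some(i) = fields.iter().position(|f| f.name == reference) {
        return Ok(i);
    }
    // Otherwise treat the reference as a simple name and compare it with
    // the last dot-separated segment of each stored name.
    let matches: Vec<usize> = fields
        .iter()
        .enumerate()
        .filter(|(_, f)| f.name.rsplit('.').next() == Some(reference))
        .map(|(i, _)| i)
        .collect();
    match matches.as_slice() {
        [i] => Ok(*i),
        [] => Err(LookupError::NotFound(reference.to_string())),
        // More than one table exposes this simple name.
        _ => Err(LookupError::Ambiguous(reference.to_string())),
    }
}
```

With fields named "t1.id", "t1.name" and "t2.name", a reference to "id" or "t2.name" resolves to a single field, while a bare "name" is rejected as ambiguous. The downside of this approach is that the qualifier is only a naming convention inside a string, so a column name that itself contains a dot would be misread.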
What do you think?
> [Rust] [DataFusion] Add SQL support for table/relation aliases and compound
> identifiers
> ---------------------------------------------------------------------------------------
>
> Key: ARROW-10732
> URL: https://issues.apache.org/jira/browse/ARROW-10732
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Rust - DataFusion
> Reporter: Andy Grove
> Assignee: Andy Grove
> Priority: Major
>
> We need to support referencing columns in queries using table name and/or
> alias prefixes so that we can support use cases such as joins between tables
> that have duplicate column names.
> For example:
> {code:sql}
> SELECT t1.id, t1.name, t2.name FROM t1 JOIN t2 ON t1.id = t2.id
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)