Re: [Discuss] Converting SELECT Fields to STAR may cause duplicate column exception when executing ctas

Julian Hyde Thu, 24 Feb 2022 14:07:31 -0800

Which code is generating this "SELECT *" query? It seems wrong to
generate "SELECT *". It should generate a list of columns with unique
names instead.


On Thu, Feb 24, 2022 at 3:41 AM Yanjing Wang <[email protected]> wrote:
>
> Hi community,
>
> I'm trying to convert a plan to a sql of SPARK dialect, but sometimes the
> fields will be converted to star, see the following example
>
> *create table users(id int);*
> *create table depts(id int);*
>
> *select a.id <http://a.id>, b.id <http://b.id> as id0 from users1 a, depts1
> b*
>
> converting to plan results
> LogicalProject(id=[$0], id0=[$1])
>   LogicalJoin(condition=[true], joinType=[inner])
>     HiveTableScan(table=[[default, users1]])
>     HiveTableScan(table=[[default, depts1]])
>
> converting the plan to SPARK sql results
>
>
> *SELECT *FROM `default`.`users1`CROSS JOIN `default`.`depts1`*
>
> because PROJECT and JOIN row type are identical, the *PROJECT* has been
> converted to *SELECT ** .
>
> the problem is the SPARK sql can't be used to create table, such as
>
>
> *CREATE TABLE tmp AS SELECT *FROM `default`.`users1`CROSS JOIN
> `default`.`depts1`*
>
> this ctas will throw  '*Found duplicate column(s) in the table definition'*
>
> I wonder if we can add a config to indicate the *PROJECT *couldn't be
> converted to star.
>
> how do you think?

Re: [Discuss] Converting SELECT Fields to STAR may cause duplicate column exception when executing ctas

Reply via email to