[ https://issues.apache.org/jira/browse/SPARK-32131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17149108#comment-17149108 ]
Dongjoon Hyun commented on SPARK-32131:
---------------------------------------

I also verified that this bug exists at 2.1.3 ~ 2.3.7 and updated the affected versions.

> union and set operations have wrong exception infomation
> --------------------------------------------------------
>
>                 Key: SPARK-32131
>                 URL: https://issues.apache.org/jira/browse/SPARK-32131
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.3, 2.2.3, 2.3.4, 2.4.6, 3.0.0
>            Reporter: philipse
>            Priority: Minor
>
> Union and set operations can only be performed on tables with compatible
> column types. However, when the tables have more than two columns, the error
> message reports the wrong column index. Steps to reproduce:
>
> Step 1: prepare the test data
> {code:java}
> drop table if exists test1;
> drop table if exists test2;
> drop table if exists test3;
> create table if not exists test1(id int, age int, name timestamp);
> create table if not exists test2(id int, age timestamp, name timestamp);
> create table if not exists test3(id int, age int, name int);
> insert into test1 select 1,2,'2020-01-01 01:01:01';
> insert into test2 select 1,'2020-01-01 01:01:01','2020-01-01 01:01:01';
> insert into test3 select 1,3,4;
> {code}
> Step 2: run the queries
> {code:java}
> Query1:
> select * from test1 except select * from test2;
> Result1:
> Error: org.apache.spark.sql.AnalysisException: Except can only be performed on tables with the compatible column types. timestamp <> int at the second column of the second table;;
> 'Except false
> :- Project [id#620, age#621, name#622]
> :  +- SubqueryAlias `default`.`test1`
> :     +- HiveTableRelation `default`.`test1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#620, age#621, name#622]
> +- Project [id#623, age#624, name#625]
>    +- SubqueryAlias `default`.`test2`
>       +- HiveTableRelation `default`.`test2`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#623, age#624, name#625] (state=,code=0)
> Query2:
> select * from test1 except select * from test3;
> Result2:
> Error: org.apache.spark.sql.AnalysisException: Except can only be performed on tables with the compatible column types. int <> timestamp at the 2th column of the second table;;
> 'Except false
> :- Project [id#632, age#633, name#634]
> :  +- SubqueryAlias `default`.`test1`
> :     +- HiveTableRelation `default`.`test1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#632, age#633, name#634]
> +- Project [id#635, age#636, name#637]
>    +- SubqueryAlias `default`.`test3`
>       +- HiveTableRelation `default`.`test3`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#635, age#636, name#637] (state=,code=0)
> {code}
> The result of Query1 is correct, but the error for Query2 names the wrong
> column: test1 and test3 differ in the third column (name), not the second.
> The current message has the wrong column index:
> +Error: org.apache.spark.sql.AnalysisException: Except can only be performed
> on tables with the compatible column types. int <> timestamp at the *2th*
> column of the second table+
> We may need to change it to the following:
> +Error: org.apache.spark.sql.AnalysisException: Except can only be performed
> on tables with the compatible column types. int <> timestamp at the *third*
> column of the second table+
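
The "2th" in Result2 looks like a zero-based column index printed without being converted to a one-based position. That is an inference from the message text, not something confirmed in this thread. The following is a minimal Python sketch of the expected wording. It is not Spark's implementation, and the names `ordinal_number` and `first_mismatch` are hypothetical helpers introduced only for illustration.

```python
# Illustration only: not Spark's code. Column types mirror the tables
# test1, test2 and test3 from the reproduction steps.
test1 = ["int", "int", "timestamp"]
test2 = ["int", "timestamp", "timestamp"]
test3 = ["int", "int", "int"]


def ordinal_number(index: int) -> str:
    """Return an English ordinal for a zero-based column index."""
    position = index + 1  # convert to a one-based position first
    words = {1: "first", 2: "second", 3: "third"}
    if position in words:
        return words[position]
    # 11th, 12th and 13th take "th" even though they end in 1, 2, 3.
    if 10 <= position % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(position % 10, "th")
    return f"{position}{suffix}"


def first_mismatch(left_types, right_types):
    """Return the zero-based index of the first differing column type, or None."""
    for i, (left, right) in enumerate(zip(left_types, right_types)):
        if left != right:
            return i
    return None


# Query1 compares test1 with test2, Query2 compares test1 with test3.
print(ordinal_number(first_mismatch(test1, test2)))
print(ordinal_number(first_mismatch(test1, test3)))
```

With this conversion, the mismatch between test1 and test3 is reported at the third column, which matches the wording proposed in the issue.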