[
https://issues.apache.org/jira/browse/SPARK-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Armbrust reassigned SPARK-6450:
---------------------------------------
Assignee: Cheng Lian
> Self joining query failure
> --------------------------
>
> Key: SPARK-6450
> URL: https://issues.apache.org/jira/browse/SPARK-6450
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.3.0
> Reporter: Anand Mohan Tumuluri
> Assignee: Cheng Lian
> Priority: Critical
>
> The query below worked up to 1.3 commit
> 9a151ce58b3e756f205c9f3ebbbf3ab0ba5b33fd. (It definitely works at this
> commit, although the commit itself is completely unrelated.)
> It broke in the 1.3.0 release with an AnalysisException: resolved attributes
> ... missing from .... (even though the list does contain the attributes
> it reports as missing). The stack trace comes first, then the query:
> {code}
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:189)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:218)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
> at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
> at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
> at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
> at com.sun.proxy.$Proxy17.executeStatementAsync(Unknown Source)
> at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:233)
> at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:344)
> at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
> at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> {code}
> select Orders.Country, Orders.ProductCategory, count(1)
> from Orders
> join (select Orders.Country, count(1) CountryOrderCount
>       from Orders
>       where to_date(Orders.PlacedDate) > '2015-01-01'
>       group by Orders.Country
>       order by CountryOrderCount DESC
>       LIMIT 5) Top5Countries
>   on Top5Countries.Country = Orders.Country
> where to_date(Orders.PlacedDate) > '2015-01-01'
> group by Orders.Country, Orders.ProductCategory;
> {code}
> A temporary workaround is to give each reference to the Orders table an
> explicit alias:
> {code}
> select o.Country, o.ProductCategory, count(1)
> from Orders o
> join (select r.Country, count(1) CountryOrderCount
>       from Orders r
>       where to_date(r.PlacedDate) > '2015-01-01'
>       group by r.Country
>       order by CountryOrderCount DESC
>       LIMIT 5) Top5Countries
>   on Top5Countries.Country = o.Country
> where to_date(o.PlacedDate) > '2015-01-01'
> group by o.Country, o.ProductCategory;
> {code}
> However, this change affects more than self joins. It also seems to affect
> union queries: the query below worked before (at commit 9a151ce) and is now
> broken.
> {code}
> select Orders.Country,null,count(1) OrderCount from Orders group by
> Orders.Country,null
> union all
> select null,Orders.ProductCategory,count(1) OrderCount from Orders group by
> null, Orders.ProductCategory
> {code}
> This query also fails with an AnalysisException.
> The workaround is again to give the two references to Orders different
> aliases.
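
The aliased rewrite in the description should be a pure renaming, so it ought to return the same rows as the original query. A minimal way to sanity-check that before adopting the workaround is to run both forms against a small in-memory table. The sketch below makes several assumptions that are not in the report: SQLite stands in for Spark SQL (so it does not reproduce the AnalysisException), SQLite's date() replaces to_date(), and the Orders schema and sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical sample rows; the real Orders schema is not given in the report.
ROWS = [
    ("US", "Books", "2015-02-01"),
    ("US", "Books", "2015-03-01"),
    ("US", "Toys", "2015-02-10"),
    ("IN", "Books", "2015-01-15"),
    ("IN", "Toys", "2014-12-31"),  # excluded by the date predicate
    ("DE", "Toys", "2015-04-01"),
]

# Original form: every reference uses the bare table name Orders.
# date() is SQLite's stand-in for Spark's to_date() (assumption).
UNALIASED = """
select Orders.Country, Orders.ProductCategory, count(1)
from Orders
join (select Orders.Country, count(1) CountryOrderCount
      from Orders
      where date(Orders.PlacedDate) > '2015-01-01'
      group by Orders.Country
      order by CountryOrderCount DESC
      LIMIT 5) Top5Countries
  on Top5Countries.Country = Orders.Country
where date(Orders.PlacedDate) > '2015-01-01'
group by Orders.Country, Orders.ProductCategory
"""

# Workaround form: outer reference aliased as o, inner reference as r.
ALIASED = """
select o.Country, o.ProductCategory, count(1)
from Orders o
join (select r.Country, count(1) CountryOrderCount
      from Orders r
      where date(r.PlacedDate) > '2015-01-01'
      group by r.Country
      order by CountryOrderCount DESC
      LIMIT 5) Top5Countries
  on Top5Countries.Country = o.Country
where date(o.PlacedDate) > '2015-01-01'
group by o.Country, o.ProductCategory
"""


def run(conn, sql):
    """Execute sql and return its rows sorted, so row order does not matter."""
    return sorted(conn.execute(sql).fetchall())


def compare():
    """Return (unaliased_rows, aliased_rows) from a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table Orders "
            "(Country text, ProductCategory text, PlacedDate text)"
        )
        conn.executemany("insert into Orders values (?, ?, ?)", ROWS)
        return run(conn, UNALIASED), run(conn, ALIASED)
    finally:
        conn.close()


if __name__ == "__main__":
    unaliased, aliased = compare()
    print("same result:", unaliased == aliased)
    for row in aliased:
        print(row)
```

Both forms returning identical rows here only shows that the rewrite preserves the query's meaning in an engine that resolves the unaliased names; whether the aliased form avoids the failure on an affected Spark build still has to be checked on that build.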