[
https://issues.apache.org/jira/browse/SPARK-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233443#comment-14233443
]
Yana Kadiyska commented on SPARK-4702:
--------------------------------------
Michael, just wanted to point out that the workaround you suggested did indeed
help when the partition is missing -- I get a count of 0. But it did break the
otherwise working case when a partition is present:
java.lang.IllegalStateException: All the offsets listed in the split should be found in the file.
expected: [4, 4]
found: {my schema dumped out here}
out of: [4, 121017555, 242333553, 363518600] in range 0, 134217728
This may well be a corner case -- we've added columns to our schema, so the
parquet files likely don't all have the same schema (I'm not quite sure what
convertMetastoreParquet does under the hood). But I wanted to point out that
in our case the bug is truly a blocker. I'm hoping the fix makes it into 1.2;
I don't care whether it lands in the next RC or later.
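
For readers trying to interpret the IllegalStateException above, here is a
minimal sketch of the kind of consistency check the message suggests. It is
an assumption about the shape of the check, not the Parquet library's code:
the function name select_row_groups and the use of ValueError are
hypothetical, and only the offsets are taken from the error text. The idea
is that every offset listed in a split must match a distinct row-group
offset in the file, so a split listing [4, 4] can match at most one row
group and is rejected.

```python
def select_row_groups(split_offsets, file_offsets):
    """Return the file's row-group offsets named by the split.

    Raises ValueError when the number of matching row groups differs
    from the number of offsets the split lists (hypothetical model).
    """
    wanted = set(split_offsets)  # duplicate offsets collapse here
    found = [off for off in file_offsets if off in wanted]
    if len(found) != len(split_offsets):
        raise ValueError(
            "All the offsets listed in the split should be found in the file. "
            f"expected: {split_offsets} found: {found} out of: {file_offsets}"
        )
    return found


# Row-group offsets copied from the "out of:" part of the error message.
file_offsets = [4, 121017555, 242333553, 363518600]

# A split that names two distinct row groups passes the check.
print(select_row_groups([4, 121017555], file_offsets))

# A split that names the same offset twice fails it.
try:
    select_row_groups([4, 4], file_offsets)
except ValueError as err:
    print("rejected:", err)
```

Under this model the duplicated offset in "expected: [4, 4]" is what trips
the check, which would point at how the split was built rather than at the
file contents. Whether that is the actual cause here is not confirmed.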
> Querying non-existent partition produces exception in v1.2.0-rc1
> -----------------------------------------------------------------
>
> Key: SPARK-4702
> URL: https://issues.apache.org/jira/browse/SPARK-4702
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.2.0
> Reporter: Yana Kadiyska
>
> Using HiveThriftServer2, when querying a non-existent partition I get an
> exception rather than an empty result set. This seems to be a regression -- I
> had an older build of master branch where this works. Build off of RC1.2 tag
> produces the following:
> 14/12/02 20:04:12 WARN ThriftCLIService: Error executing statement:
> org.apache.hive.service.cli.HiveSQLException: java.lang.IllegalArgumentException: Can not create a Path from an empty string
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:192)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:218)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
> at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
> at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
> at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
> at com.sun.proxy.$Proxy19.executeStatementAsync(Unknown Source)
> at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:233)
> at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:344)
> at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
> at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)