[ https://issues.apache.org/jira/browse/SPARK-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233514#comment-14233514 ]

Yana Kadiyska commented on SPARK-4702:
--------------------------------------

Michael, I do not have a 1.1 build. In October I built master manually, I believe 
from commit d2987e8f7a2cb3bf971f381399d8efdccb51d3d2. At that time both types of 
queries worked without setting spark.sql.hive.convertMetastoreParquet=false (I 
tested on a smaller cluster; I will deploy that build on this cluster now to make 
sure there is no data weirdness).

If you meant to say "When convertMetastoreParquet is FALSE, there is currently no 
support for heterogeneous schemas," then we are saying the same thing -- I did not 
have to set this flag, as the missing partitions were handled fine. Now missing 
partitions are broken, but setting spark.sql.hive.convertMetastoreParquet=false 
breaks the "99% case" because my files have different numbers of columns.
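
For concreteness, here is roughly the kind of session I run against the Thrift 
server (the table and column names below are made up for illustration; only the 
flag and the shape of the queries match what we actually do):

    -- Turning the conversion off for the session, as suggested:
    SET spark.sql.hive.convertMetastoreParquet=false;

    -- Hypothetical table/column names; the query shape is what matters.
    -- A partition that exists:
    SELECT COUNT(*) FROM logs WHERE pkey = '2014-10';
    -- A partition that does not exist; this used to return an empty result
    -- set and now fails with the exception quoted in the issue description:
    SELECT COUNT(*) FROM logs WHERE pkey = '2054-10';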

I have not tried the PR you mentioned; I will try it now. In my case the issue is 
not an empty file, it is a missing directory -- our query filters on a 
partition="YYYY-mm" value, and the Parquet files are laid out under YYYY-mm 
directories representing partitions. I will see whether that PR helps.
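
To make the layout concrete (the warehouse path and partition column name below 
are hypothetical; only the YYYY-mm partition structure matches ours):

    -- Hypothetical directory layout for the table's Parquet data:
    --   /warehouse/logs/pkey=2014-10/part-00000.parquet
    --   /warehouse/logs/pkey=2014-11/part-00000.parquet  (different # of columns)
    --
    -- There is no directory at all for 2015-01, yet the query below should
    -- return zero rows rather than "Can not create a Path from an empty string":
    SELECT COUNT(*) FROM logs WHERE pkey = '2015-01';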

In any case, I am just hoping this works before the final release; there is no 
particular rush.

> Querying  non-existent partition produces exception in v1.2.0-rc1
> -----------------------------------------------------------------
>
>                 Key: SPARK-4702
>                 URL: https://issues.apache.org/jira/browse/SPARK-4702
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.2.0
>            Reporter: Yana Kadiyska
>
> Using HiveThriftServer2, when querying a non-existent partition I get an 
> exception rather than an empty result set. This seems to be a regression -- I 
> had an older build of the master branch where this worked. A build off of the 
> v1.2.0-rc1 tag produces the following:
> 14/12/02 20:04:12 WARN ThriftCLIService: Error executing statement:
> org.apache.hive.service.cli.HiveSQLException: java.lang.IllegalArgumentException: Can not create a Path from an empty string
>         at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:192)
>         at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
>         at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:218)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
>         at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
>         at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>         at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
>         at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
>         at com.sun.proxy.$Proxy19.executeStatementAsync(Unknown Source)
>         at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:233)
>         at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:344)
>         at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
>         at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)



