[jira] [Commented] (SPARK-18681) Throw Filtering is supported only on partition keys of type string exception
[ https://issues.apache.org/jira/browse/SPARK-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074361#comment-17074361 ] philipse commented on SPARK-18681: -- [~michael] any news for this issue ? i meet the same issue on spark2.4.5 > Throw Filtering is supported only on partition keys of type string exception > > > Key: SPARK-18681 > URL: https://issues.apache.org/jira/browse/SPARK-18681 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Yuming Wang >Assignee: Yuming Wang >Priority: Major > Fix For: 2.1.0 > > > Cloudera put > {{/var/run/cloudera-scm-agent/process/15000-hive-HIVEMETASTORE/hive-site.xml}} > as the configuration file for the Hive Metastore Server, where > {{hive.metastore.try.direct.sql=false}}. But Spark reading the gateway > configuration file and get default value > {{hive.metastore.try.direct.sql=true}}. we should use {{getMetaConf}} or > {{getMSC.getConfigValue}} method to obtain the original configuration from > Hive Metastore Server. > {noformat} > spark-sql> CREATE TABLE test (value INT) PARTITIONED BY (part INT); > Time taken: 0.221 seconds > spark-sql> select * from test where part=1 limit 10; > 16/12/02 08:33:45 ERROR thriftserver.SparkSQLDriver: Failed in [select * from > test where part=1 limit 10] > java.lang.RuntimeException: Caught Hive MetaException attempting to get > partition metadata by filter from Hive. You can set the Spark configuration > setting spark.sql.hive.manageFilesourcePartitions to false to work around > this problem, however this will result in degraded performance. Please report > a bug: https://issues.apache.org/jira/browse/SPARK > at > org.apache.spark.sql.hive.client.Shim_v0_13.getPartitionsByFilter(HiveShim.scala:610) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:549) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:547) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:229) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:228) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:271) > at > org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionsByFilter(HiveClientImpl.scala:547) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:954) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:938) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:91) > at > org.apache.spark.sql.hive.HiveExternalCatalog.listPartitionsByFilter(HiveExternalCatalog.scala:938) > at > org.apache.spark.sql.hive.MetastoreRelation.getHiveQlPartitions(MetastoreRelation.scala:156) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$10.apply(HiveTableScanExec.scala:151) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$10.apply(HiveTableScanExec.scala:150) > at org.apache.spark.util.Utils$.withDummyCallSite(Utils.scala:2435) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec.doExecute(HiveTableScanExec.scala:149) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132) > at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113) > at > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:225) > at > org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:308) > at > org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38) > at > org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:295) > at > org.apache.spark.sql.execution.QueryExecution$$anonfun$hiveResultString$4.apply(QueryExecution.scala:134) > at > org.apache.spark.sql.execution.QueryExecution$$anonfun$hiveResultString$4.apply(QueryExecution.scala:133) > at >
[jira] [Commented] (SPARK-18681) Throw Filtering is supported only on partition keys of type string exception
[ https://issues.apache.org/jira/browse/SPARK-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523049#comment-16523049 ] Harish commented on SPARK-18681: [~michael] I see the same issue in 2.3.1. I have hive partitioned table with integer as partition key. I set ("spark.sql.hive.manageFilesourcePartitions","false") in my pyspark code. Caused by: java.lang.RuntimeException: Caught Hive MetaException attempting to get partition metadata by filter from Hive. You can set the Spark configuration setting spark.sql.hive.manageFilesourcePartitions to false to work around this problem, however this will result in degraded performance. Please report a bug: https://issues.apache.org/jira/browse/SPARK at org.apache.spark.sql.hive.client.Shim_v0_13.getPartitionsByFilter(HiveShim.scala:741) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:655) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:653) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:272) at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:210) at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:209) at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:255) at org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionsByFilter(HiveClientImpl.scala:653) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:1218) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:1211) at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) at org.apache.spark.sql.hive.HiveExternalCatalog.listPartitionsByFilter(HiveExternalCatalog.scala:1211) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.listPartitionsByFilter(SessionCatalog.scala:925) at org.apache.spark.sql.hive.execution.HiveTableScanExec.rawPartitions$lzycompute(HiveTableScanExec.scala:172) at org.apache.spark.sql.hive.execution.HiveTableScanExec.rawPartitions(HiveTableScanExec.scala:164) at org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$11.apply(HiveTableScanExec.scala:190) at org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$11.apply(HiveTableScanExec.scala:190) at org.apache.spark.util.Utils$.withDummyCallSite(Utils.scala:2515) at org.apache.spark.sql.hive.execution.HiveTableScanExec.doExecute(HiveTableScanExec.scala:189) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:371) at org.apache.spark.sql.execution.FilterExec.inputRDDs(basicPhysicalOperators.scala:121) at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.inputRDDs(BroadcastHashJoinExec.scala:76) at org.apache.spark.sql.execution.ProjectExec.inputRDDs(basicPhysicalOperators.scala:41) at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.inputRDDs(BroadcastHashJoinExec.scala:76) at org.apache.spark.sql.execution.ProjectExec.inputRDDs(basicPhysicalOperators.scala:41) at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:605) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.prepareShuffleDependency(ShuffleExchangeExec.scala:92) at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$doExecute$1.apply(ShuffleExchangeExec.scala:128) at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$doExecute$1.apply(ShuffleExchangeExec.scala:119) at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52) ... 38 more Caused by: java.lang.reflect.InvocationTargetException at
[jira] [Commented] (SPARK-18681) Throw Filtering is supported only on partition keys of type string exception
[ https://issues.apache.org/jira/browse/SPARK-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726464#comment-15726464 ] Michael Allman commented on SPARK-18681: [~rxin] I think this should be a blocker for 2.1. This introduces a major regression for some common table partition schemes for a default configuration for many deployments of CDH that will fail with the (new) default Spark configuration. I think we have a fix in the PR, but the unit tests are inadequate. I'm going to work on that. > Throw Filtering is supported only on partition keys of type string exception > > > Key: SPARK-18681 > URL: https://issues.apache.org/jira/browse/SPARK-18681 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Yuming Wang > > Cloudera put > {{/var/run/cloudera-scm-agent/process/15000-hive-HIVEMETASTORE/hive-site.xml}} > as the configuration file for the Hive Metastore Server, where > {{hive.metastore.try.direct.sql=false}}. But Spark isn't reading this > configuration file and get default value > {{hive.metastore.try.direct.sql=true}}. we should use {{getMetaConf}} or > {{getMSC.getConfigValue}} method to obtain the original configuration from > Hive Metastore Server. > {noformat} > spark-sql> CREATE TABLE test (value INT) PARTITIONED BY (part INT); > Time taken: 0.221 seconds > spark-sql> select * from test where part=1 limit 10; > 16/12/02 08:33:45 ERROR thriftserver.SparkSQLDriver: Failed in [select * from > test where part=1 limit 10] > java.lang.RuntimeException: Caught Hive MetaException attempting to get > partition metadata by filter from Hive. You can set the Spark configuration > setting spark.sql.hive.manageFilesourcePartitions to false to work around > this problem, however this will result in degraded performance. Please report > a bug: https://issues.apache.org/jira/browse/SPARK > at > org.apache.spark.sql.hive.client.Shim_v0_13.getPartitionsByFilter(HiveShim.scala:610) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:549) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:547) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:229) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:228) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:271) > at > org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionsByFilter(HiveClientImpl.scala:547) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:954) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:938) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:91) > at > org.apache.spark.sql.hive.HiveExternalCatalog.listPartitionsByFilter(HiveExternalCatalog.scala:938) > at > org.apache.spark.sql.hive.MetastoreRelation.getHiveQlPartitions(MetastoreRelation.scala:156) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$10.apply(HiveTableScanExec.scala:151) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$10.apply(HiveTableScanExec.scala:150) > at org.apache.spark.util.Utils$.withDummyCallSite(Utils.scala:2435) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec.doExecute(HiveTableScanExec.scala:149) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132) > at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113) > at > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:225) > at > org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:308) > at > org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38) > at > org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:295) > at >
[jira] [Commented] (SPARK-18681) Throw Filtering is supported only on partition keys of type string exception
[ https://issues.apache.org/jira/browse/SPARK-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15715860#comment-15715860 ] Apache Spark commented on SPARK-18681: -- User 'wangyum' has created a pull request for this issue: https://github.com/apache/spark/pull/16122 > Throw Filtering is supported only on partition keys of type string exception > > > Key: SPARK-18681 > URL: https://issues.apache.org/jira/browse/SPARK-18681 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Yuming Wang > > I'm using MySQL for the Hive Metastore. > {{hive.metastore.try.direct.sql=true}} and > {{hive.metastore.integral.jdo.pushdown=false}} . It will throw following > exception when filtering on a *int* partition column after > [SPARK-17992|https://issues.apache.org/jira/browse/SPARK-17992]. > {noformat} > spark-sql> CREATE TABLE test (value INT) PARTITIONED BY (part INT); > Time taken: 0.221 seconds > spark-sql> select * from test where part=1 limit 10; > 16/12/02 08:33:45 ERROR thriftserver.SparkSQLDriver: Failed in [select * from > test where part=1 limit 10] > java.lang.RuntimeException: Caught Hive MetaException attempting to get > partition metadata by filter from Hive. You can set the Spark configuration > setting spark.sql.hive.manageFilesourcePartitions to false to work around > this problem, however this will result in degraded performance. Please report > a bug: https://issues.apache.org/jira/browse/SPARK > at > org.apache.spark.sql.hive.client.Shim_v0_13.getPartitionsByFilter(HiveShim.scala:610) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:549) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:547) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:229) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:228) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:271) > at > org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionsByFilter(HiveClientImpl.scala:547) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:954) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:938) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:91) > at > org.apache.spark.sql.hive.HiveExternalCatalog.listPartitionsByFilter(HiveExternalCatalog.scala:938) > at > org.apache.spark.sql.hive.MetastoreRelation.getHiveQlPartitions(MetastoreRelation.scala:156) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$10.apply(HiveTableScanExec.scala:151) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$10.apply(HiveTableScanExec.scala:150) > at org.apache.spark.util.Utils$.withDummyCallSite(Utils.scala:2435) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec.doExecute(HiveTableScanExec.scala:149) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132) > at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113) > at > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:225) > at > org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:308) > at > org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38) > at > org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:295) > at > org.apache.spark.sql.execution.QueryExecution$$anonfun$hiveResultString$4.apply(QueryExecution.scala:134) > at > org.apache.spark.sql.execution.QueryExecution$$anonfun$hiveResultString$4.apply(QueryExecution.scala:133) > at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57) > at > org.apache.spark.sql.execution.QueryExecution.hiveResultString(QueryExecution.scala:133) > at >
[jira] [Commented] (SPARK-18681) Throw Filtering is supported only on partition keys of type string exception
[ https://issues.apache.org/jira/browse/SPARK-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713871#comment-15713871 ] Liang-Chi Hsieh commented on SPARK-18681: - Looks like you create two Jiras (SPARK-18680, SPARK-18681) for the same issue. Mind close one of them? > Throw Filtering is supported only on partition keys of type string exception > > > Key: SPARK-18681 > URL: https://issues.apache.org/jira/browse/SPARK-18681 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Yuming Wang > > I'm using MySQL for the Hive Metastore. > {{hive.metastore.try.direct.sql=true}} and > {{hive.metastore.integral.jdo.pushdown=false}} . It will throw following > exception when filtering on a *int* partition column after > [SPARK-17992|https://issues.apache.org/jira/browse/SPARK-17992]. > {noformat} > spark-sql> CREATE TABLE test (value INT) PARTITIONED BY (part INT); > Time taken: 0.221 seconds > spark-sql> select * from test where part=1 limit 10; > 16/12/02 08:33:45 ERROR thriftserver.SparkSQLDriver: Failed in [select * from > test where part=1 limit 10] > java.lang.RuntimeException: Caught Hive MetaException attempting to get > partition metadata by filter from Hive. You can set the Spark configuration > setting spark.sql.hive.manageFilesourcePartitions to false to work around > this problem, however this will result in degraded performance. Please report > a bug: https://issues.apache.org/jira/browse/SPARK > at > org.apache.spark.sql.hive.client.Shim_v0_13.getPartitionsByFilter(HiveShim.scala:610) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:549) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:547) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:229) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:228) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:271) > at > org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionsByFilter(HiveClientImpl.scala:547) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:954) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:938) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:91) > at > org.apache.spark.sql.hive.HiveExternalCatalog.listPartitionsByFilter(HiveExternalCatalog.scala:938) > at > org.apache.spark.sql.hive.MetastoreRelation.getHiveQlPartitions(MetastoreRelation.scala:156) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$10.apply(HiveTableScanExec.scala:151) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$10.apply(HiveTableScanExec.scala:150) > at org.apache.spark.util.Utils$.withDummyCallSite(Utils.scala:2435) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec.doExecute(HiveTableScanExec.scala:149) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132) > at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113) > at > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:225) > at > org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:308) > at > org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38) > at > org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:295) > at > org.apache.spark.sql.execution.QueryExecution$$anonfun$hiveResultString$4.apply(QueryExecution.scala:134) > at > org.apache.spark.sql.execution.QueryExecution$$anonfun$hiveResultString$4.apply(QueryExecution.scala:133) > at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57) > at > org.apache.spark.sql.execution.QueryExecution.hiveResultString(QueryExecution.scala:133) > at >
[jira] [Commented] (SPARK-18681) Throw Filtering is supported only on partition keys of type string exception
[ https://issues.apache.org/jira/browse/SPARK-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713624#comment-15713624 ] Yuming Wang commented on SPARK-18681: - I will pull request for this issue later. > Throw Filtering is supported only on partition keys of type string exception > > > Key: SPARK-18681 > URL: https://issues.apache.org/jira/browse/SPARK-18681 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Yuming Wang > > I'm using MySQL for the Hive Metastore. > {{hive.metastore.try.direct.sql=true}} and > {{hive.metastore.integral.jdo.pushdown=false}} . It will throw following > exception when filtering on a *int* partition column after > [SPARK-17992|https://issues.apache.org/jira/browse/SPARK-17992]. > {noformat} > spark-sql> CREATE TABLE test (value INT) PARTITIONED BY (part INT); > Time taken: 0.221 seconds > spark-sql> select * from test where part=1 limit 10; > 16/12/02 08:33:45 ERROR thriftserver.SparkSQLDriver: Failed in [select * from > test where part=1 limit 10] > java.lang.RuntimeException: Caught Hive MetaException attempting to get > partition metadata by filter from Hive. You can set the Spark configuration > setting spark.sql.hive.manageFilesourcePartitions to false to work around > this problem, however this will result in degraded performance. Please report > a bug: https://issues.apache.org/jira/browse/SPARK > at > org.apache.spark.sql.hive.client.Shim_v0_13.getPartitionsByFilter(HiveShim.scala:610) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:549) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:547) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:229) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:228) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:271) > at > org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionsByFilter(HiveClientImpl.scala:547) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:954) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:938) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:91) > at > org.apache.spark.sql.hive.HiveExternalCatalog.listPartitionsByFilter(HiveExternalCatalog.scala:938) > at > org.apache.spark.sql.hive.MetastoreRelation.getHiveQlPartitions(MetastoreRelation.scala:156) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$10.apply(HiveTableScanExec.scala:151) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$10.apply(HiveTableScanExec.scala:150) > at org.apache.spark.util.Utils$.withDummyCallSite(Utils.scala:2435) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec.doExecute(HiveTableScanExec.scala:149) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132) > at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113) > at > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:225) > at > org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:308) > at > org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38) > at > org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:295) > at > org.apache.spark.sql.execution.QueryExecution$$anonfun$hiveResultString$4.apply(QueryExecution.scala:134) > at > org.apache.spark.sql.execution.QueryExecution$$anonfun$hiveResultString$4.apply(QueryExecution.scala:133) > at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57) > at > org.apache.spark.sql.execution.QueryExecution.hiveResultString(QueryExecution.scala:133) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:63)