[
https://issues.apache.org/jira/browse/SPARK-30709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean R. Owen deleted SPARK-30709:
---------------------------------
> Spark 2.3 to Spark 2.4 Upgrade. Problems reading HIVE partitioned tables.
> -------------------------------------------------------------------------
>
> Key: SPARK-30709
> URL: https://issues.apache.org/jira/browse/SPARK-30709
> Project: Spark
> Issue Type: Question
> Environment: Pre-production
> Reporter: Carlos Mario
> Priority: Major
> Labels: SQL, Spark
>
> Hello,
> We recently upgraded our pre-production environment from Spark 2.3 to Spark
> 2.4.0.
> Over time we have created a large number of tables in the Hive Metastore,
> partitioned by two fields, one of type String and the other of type BigInt.
> We were reading these tables with Spark 2.3 without problems, but after
> upgrading to Spark 2.4 we get the following log every time we run our software:
> <log>
> log_filterBIGINT.out:
> Caused by: MetaException(message:Filtering is supported only on partition
> keys of type string) Caused by: MetaException(message:Filtering is supported
> only on partition keys of type string) Caused by:
> MetaException(message:Filtering is supported only on partition keys of type
> string)
>
> hadoop-cmf-hive-HIVEMETASTORE-isblcsmsttc0001.scisb.isban.corp.log.out.1:
>
> 2020-01-10 09:36:05,781 ERROR
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-138]:
> MetaException(message:Filtering is supported only on partition keys of type
> string)
> 2020-01-10 11:19:19,208 ERROR
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-187]:
> MetaException(message:Filtering is supported only on partition keys of type
> string)
> 2020-01-10 11:19:54,780 ERROR
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-167]:
> MetaException(message:Filtering is supported only on partition keys of type
> string)
> </log>
>
> We know the best practice from the Spark point of view is to use the 'STRING'
> type for partition columns, but we need a solution that we can deploy
> easily, given the large number of tables already created with a bigint
> partition column.
>
> As a first attempt we set the
> spark.sql.hive.manageFilesourcePartitions parameter to false in
> spark-submit, but after rerunning the software the error persisted.
>
> Has anyone in the community experienced the same problem? What was
> the solution?
>
> Kind Regards and thanks in advance.
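
For reference, the report mentions setting spark.sql.hive.manageFilesourcePartitions to false at submit time. The sketch below shows one way such a setting could be passed through spark-submit's `--conf key=value` option. The helper name `build_spark_submit_args` and the application file `our_job.py` are hypothetical and do not come from the report. This only illustrates how the flag is supplied; the report states that this setting did not remove the error.

```python
import shlex
from typing import Dict, List


def build_spark_submit_args(app: str, conf: Dict[str, str]) -> List[str]:
    """Build a spark-submit argument list with one --conf flag per setting."""
    args = ["spark-submit"]
    for key, value in conf.items():
        # spark-submit accepts arbitrary Spark properties as --conf key=value
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


# "our_job.py" is a hypothetical application name, not taken from the report.
args = build_spark_submit_args(
    "our_job.py",
    {"spark.sql.hive.manageFilesourcePartitions": "false"},
)

# Print the command as a single shell-safe line for inspection.
print(shlex.join(args))
```

Building the command as a list keeps each property in its own argument, so it can be handed to `subprocess.run` without extra shell quoting.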
--
This message was sent by Atlassian Jira
(v8.20.10#820010)