[ 
https://issues.apache.org/jira/browse/SPARK-42292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuanzhiang resolved SPARK-42292.
--------------------------------
    Resolution: Fixed

When I set spark.sql.hive.convertMetastoreParquet=true, Spark 3 uses its
built-in Parquet reader.

> Spark SQL not use hive partition info
> -------------------------------------
>
>                 Key: SPARK-42292
>                 URL: https://issues.apache.org/jira/browse/SPARK-42292
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.1
>            Reporter: xuanzhiang
>            Priority: Major
>
> I use Spark 3 to count the number of partitions, for example:
> table a is an external Parquet table with 3 partition columns (year, month,
> day).
> Query: "select distinct month, day from a where year = '2022'"
> I expected Spark to look up the Hive metadata and use the partition info,
> but it loads all the data in the "year = '2022'" partitions.
> In Spark 2.4 it uses LocalTableScanExec, but Spark 3 uses HiveTableRelation
> and scans the Hive Parquet files.
>  
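
As a rough illustration of the behaviour the report expects (not Spark's
actual implementation): the query touches only partition columns, so in
principle it can be answered from the partition specs held in the metastore
without opening any Parquet file. The partition list and the helper
distinct_from_partitions below are hypothetical and exist only for this
sketch.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical partition specs, as a metastore might list them for table `a`.
PARTITIONS: List[Dict[str, str]] = [
    {"year": "2021", "month": "12", "day": "31"},
    {"year": "2022", "month": "01", "day": "01"},
    {"year": "2022", "month": "01", "day": "02"},
    {"year": "2022", "month": "02", "day": "01"},
]


def distinct_from_partitions(
    partitions: List[Dict[str, str]],
    select: List[str],
    where: Dict[str, str],
) -> Set[Tuple[str, ...]]:
    """Answer SELECT DISTINCT <select> WHERE <where> from partition specs only.

    Only equality predicates on partition columns are supported.
    """
    result: Set[Tuple[str, ...]] = set()
    for spec in partitions:
        # Keep the partition only if every predicate matches its spec.
        if all(spec.get(col) == val for col, val in where.items()):
            result.add(tuple(spec[col] for col in select))
    return result


# Equivalent of: select distinct month, day from a where year = '2022'
rows = distinct_from_partitions(PARTITIONS, ["month", "day"], {"year": "2022"})
print(sorted(rows))  # [('01', '01'), ('01', '02'), ('02', '01')]
```

An answer built this way matches a real scan only if every listed partition
contains at least one row, because an empty partition would still contribute
its values here.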



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
