[ https://issues.apache.org/jira/browse/HIVE-26888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Indhumathi Muthumurugesh updated HIVE-26888:
--------------------------------------------
Description:
1. From Spark SQL, create a partitioned Parquet table and insert one row:
create table test(a int, b string) partitioned by (c string) stored as parquet;
insert into test select 1,'abc','part1';

2. Use a Spark DataFrame to generate a new Parquet file:
val df = spark.sql("select * from test");
df.write.mode("overwrite").parquet("/Users/indhu/Downloads/part=part1");

3. From Hive, create an external table in Parquet format and add a partition
with that location:
create external table test(a int, b string) partitioned by (c string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
alter table test add partition(c='part1') location
'/Users/indhu/Downloads/part=part1';
select * from test where c='part1';
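
A possible diagnostic for step 2, not part of the original report: because df comes from "select * from test", the files written under the partition directory may carry the partition column c alongside a and b. The sketch below reads that directory back and prints its schema and rows so this can be checked. It assumes a local Spark session; the object name InspectWrittenParquet and the application name are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper: inspects the Parquet files produced in step 2.
object InspectWrittenParquet {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough to read the local path.
    val spark = SparkSession.builder()
      .appName("hive-26888-inspect")
      .master("local[*]")
      .getOrCreate()

    // Same directory that step 2 writes and step 3 registers as a partition.
    val path = "/Users/indhu/Downloads/part=part1"

    val written = spark.read.parquet(path)

    // Shows which columns are physically stored in the data files.
    written.printSchema()

    // Shows the stored rows, to confirm the data itself was written.
    written.show()

    spark.stop()
  }
}
```

If c appears in the printed schema, the data files duplicate a column that Hive treats as a partition column, which may be relevant when comparing the filtered and unfiltered queries.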
> Hive gives empty results with partition column filter for hive parquet table
> when data loaded through spark dataframe
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-26888
> URL: https://issues.apache.org/jira/browse/HIVE-26888
> Project: Hive
> Issue Type: Bug
> Reporter: Indhumathi Muthumurugesh
> Priority: Major