[ 
https://issues.apache.org/jira/browse/HIVE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Lian updated HIVE-11625:
------------------------------
    Description: 
Hive allows maps with null keys:
{code:sql}
hive> select map(null, 'foo', 1, 'bar', null, 'baz');
{null:"baz",1:"bar"}
{code}
However, when such a map is written into a Parquet table, entries with null 
keys are dropped:
{code:sql}
hive> CREATE TABLE map_test STORED AS PARQUET
    > AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
...
hive> SELECT * from map_test;
{1:"bar"}
{code}
This is because entries with null keys are explicitly skipped in 
{{DataWritableWriter}}, [see 
here|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].

This issue can be fixed by moving [the value-writing 
block|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L230-L236]
 out of [the key-writing 
block|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237],
so that a value is still written when its key is null.
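
For illustration only, the nesting described above can be modelled in a few 
lines of Java. This is a simplified sketch, not the actual 
{{DataWritableWriter}} code: the class {{NullKeyMapSketch}}, its methods, and 
the string "events" standing in for record-consumer calls are all 
hypothetical. It only shows why a null key suppresses its value when the value 
block is nested inside the key block, and how un-nesting changes that.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of a map writer; not the Hive source.
final class NullKeyMapSketch {

    // Value writing nested inside the key != null check:
    // a null key means neither key nor value is emitted.
    static List<String> writeNested(Map<Integer, String> map) {
        List<String> events = new ArrayList<>();
        for (Map.Entry<Integer, String> e : map.entrySet()) {
            events.add("startGroup");
            if (e.getKey() != null) {
                events.add("key=" + e.getKey());
                if (e.getValue() != null) {
                    events.add("value=" + e.getValue());
                }
            }
            events.add("endGroup");
        }
        return events;
    }

    // Value writing moved out of the key block:
    // the value is emitted whether or not the key is null.
    static List<String> writeSeparated(Map<Integer, String> map) {
        List<String> events = new ArrayList<>();
        for (Map.Entry<Integer, String> e : map.entrySet()) {
            events.add("startGroup");
            if (e.getKey() != null) {
                events.add("key=" + e.getKey());
            }
            if (e.getValue() != null) {
                events.add("value=" + e.getValue());
            }
            events.add("endGroup");
        }
        return events;
    }

    // Mirrors map(null, 'foo', 1, 'bar', null, 'baz'):
    // the second null key overwrites the first.
    static Map<Integer, String> sampleMap() {
        Map<Integer, String> m = new LinkedHashMap<>();
        m.put(null, "foo");
        m.put(1, "bar");
        m.put(null, "baz");
        return m;
    }

    public static void main(String[] args) {
        System.out.println("nested:    " + writeNested(sampleMap()));
        System.out.println("separated: " + writeSeparated(sampleMap()));
    }
}
```

With the sample map, the nested variant emits an empty group for the null-key 
entry, while the separated variant still emits {{value=baz}}. Whether a reader 
can reconstruct the null key from such a group is outside what this sketch 
models.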

  was:
Hive allows maps with null keys:
{code:sql}
hive> select map(null, 'foo', 1, 'bar', null, 'baz');
{null:"baz",1:"bar"}
{code}
However, when written into Parquet tables, map entries with null as keys are 
dropped:
{code:sql}
hive> CREATE TABLE map_test STORED AS PARQUET
    > AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
...
hive> SELECT * from map_test;
{1:"bar"}
{code}
This is because entries with null keys are explicitly skipped in 
{{DataWritableWriter}}, [see 
here|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].


> Map instances with null keys are not written to Parquet tables
> --------------------------------------------------------------
>
>                 Key: HIVE-11625
>                 URL: https://issues.apache.org/jira/browse/HIVE-11625
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 0.14.0, 0.13.1, 1.0.1, 1.1.1, 1.2.1
>            Reporter: Cheng Lian
>
> Hive allows maps with null keys:
> {code:sql}
> hive> select map(null, 'foo', 1, 'bar', null, 'baz');
> {null:"baz",1:"bar"}
> {code}
> However, when such a map is written into a Parquet table, entries with null 
> keys are dropped:
> {code:sql}
> hive> CREATE TABLE map_test STORED AS PARQUET
>     > AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
> ...
> hive> SELECT * from map_test;
> {1:"bar"}
> {code}
> This is because entries with null keys are explicitly skipped in 
> {{DataWritableWriter}}, [see 
> here|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].
> This issue can be fixed by moving [the value-writing 
> block|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L230-L236]
>  out of [the key-writing 
> block|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237],
> so that a value is still written when its key is null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)