[ https://issues.apache.org/jira/browse/HIVE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cheng Lian updated HIVE-11625:
------------------------------
    Description:
Hive allows maps with null keys:
{code:sql}
hive> select map(null, 'foo', 1, 'bar', null, 'baz');
{null:"baz",1:"bar"}
{code}
However, when written into Parquet tables, map entries with null as keys are dropped:
{code:sql}
hive> CREATE TABLE map_test STORED AS PARQUET
    > AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
...
hive> SELECT * from map_test;
{1:"bar"}
{code}
This is because entries with null keys are explicitly skipped in {{DataWritableWriter}}, [see here|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].

This issue can be fixed by moving [the value writing block|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L230-L236] out of [the key writing block|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].

  was:
Hive allows maps with null keys:
{code:sql}
hive> select map(null, 'foo', 1, 'bar', null, 'baz');
{null:"baz",1:"bar"}
{code}
However, when written into Parquet tables, map entries with null as keys are dropped:
{code:sql}
hive> CREATE TABLE map_test STORED AS PARQUET
    > AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
...
hive> SELECT * from map_test;
{1:"bar"}
{code}
This is because entries with null keys are explicitly skipped in {{DataWritableWriter}}, [see here|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].

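The control-flow change proposed in the updated description (writing the value outside the key's null check) can be illustrated with a small standalone sketch. This is not the actual {{DataWritableWriter}} code and it uses no Hive or Parquet APIs: the class {{NullKeyMapWriterSketch}} and the methods {{writeNested}} and {{writeFixed}} are hypothetical names, and plain strings stand in for the fields a real writer would emit per map entry.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; strings stand in for the fields written per map entry.
public class NullKeyMapWriterSketch {

    // Shape of the reported behaviour: the value block is nested inside the
    // key's null check, so an entry with a null key writes neither field.
    public static List<String> writeNested(Map<Integer, String> map) {
        List<String> groups = new ArrayList<>();
        for (Map.Entry<Integer, String> e : map.entrySet()) {
            List<String> fields = new ArrayList<>();
            if (e.getKey() != null) {
                fields.add("key=" + e.getKey());
                if (e.getValue() != null) {
                    fields.add("value=" + e.getValue());
                }
            }
            groups.add("{" + String.join(",", fields) + "}");
        }
        return groups;
    }

    // Shape of the proposed fix: the value block sits next to the key block
    // instead of inside it, so the value survives when the key is null.
    public static List<String> writeFixed(Map<Integer, String> map) {
        List<String> groups = new ArrayList<>();
        for (Map.Entry<Integer, String> e : map.entrySet()) {
            List<String> fields = new ArrayList<>();
            if (e.getKey() != null) {
                fields.add("key=" + e.getKey());
            }
            if (e.getValue() != null) {
                fields.add("value=" + e.getValue());
            }
            groups.add("{" + String.join(",", fields) + "}");
        }
        return groups;
    }

    public static void main(String[] args) {
        // Mirrors map(null, 'foo', 1, 'bar', null, 'baz'): the later null key wins.
        Map<Integer, String> map = new LinkedHashMap<>();
        map.put(null, "foo");
        map.put(1, "bar");
        map.put(null, "baz");

        System.out.println("nested: " + writeNested(map));
        System.out.println("fixed:  " + writeFixed(map));
    }
}
```

In this model the nested variant emits an empty group for the null-key entry, which corresponds to the entry being lost on read, while the fixed variant still emits the value {{baz}}. How a null key is then represented and read back in a real Parquet file is outside the scope of this sketch.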
> Map instances with null keys are not written to Parquet tables
> --------------------------------------------------------------
>
>                 Key: HIVE-11625
>                 URL: https://issues.apache.org/jira/browse/HIVE-11625
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 0.14.0, 0.13.1, 1.0.1, 1.1.1, 1.2.1
>            Reporter: Cheng Lian