Skye Wanderman-Milne created HIVE-9303:
------------------------------------------

             Summary: Parquet files are written with incorrect definition levels
                 Key: HIVE-9303
                 URL: https://issues.apache.org/jira/browse/HIVE-9303
             Project: Hive
          Issue Type: Bug
    Affects Versions: 0.13.1
            Reporter: Skye Wanderman-Milne


The definition level, which records how many optional fields along a column's 
path are non-NULL and so determines which level of nesting is NULL, appears to 
always be written as n or n-1, where n is the column's maximum definition 
level. As a result, only the innermost level of nesting can be read back as 
NULL. Only tables stored as Parquet are affected. For example:

{code:sql}
CREATE TABLE text_tbl (a STRUCT<b:STRUCT<c:INT>>)
STORED AS TEXTFILE;

-- tbl is assumed to be any existing table with at least one row
INSERT OVERWRITE TABLE text_tbl
SELECT IF(false, named_struct("b", named_struct("c", 1)), NULL)
FROM tbl LIMIT 1;

CREATE TABLE parq_tbl
STORED AS PARQUET
AS SELECT * FROM text_tbl;

SELECT * FROM text_tbl;
-- => NULL                 (right: the outer struct a is NULL)

SELECT * FROM parq_tbl;
-- => {"b":{"c":null}}     (wrong: only the innermost field c is NULL)
{code}
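
To make the expected levels concrete, here is a minimal Python sketch of how 
definition levels would be assigned for the column a.b.c above, assuming every 
field on the path is optional (so n = 3). This is not Hive or Parquet code: the 
function name definition_level and the dict-based row representation are 
hypothetical and used only for illustration.

```python
# Illustrative only: models definition levels for the optional path a.b.c.
# Nested structs are represented as dicts; None stands for NULL.

PATH = ["a", "b", "c"]  # every field on the path is optional, so max level n = 3


def definition_level(row, path=PATH):
    """Count the non-NULL fields along `path`, stopping at the first NULL."""
    level = 0
    current = row
    for name in path:
        current = current.get(name)
        if current is None:
            break
        level += 1
    return level


rows = [
    {"a": None},                # a is NULL
    {"a": {"b": None}},         # a.b is NULL
    {"a": {"b": {"c": None}}},  # a.b.c is NULL
    {"a": {"b": {"c": 1}}},     # fully defined
]
levels = [definition_level(r) for r in rows]
print(levels)  # [0, 1, 2, 3]
```

Under this model the row in text_tbl (a is NULL) should be written with level 
0. The value read back from parq_tbl, {"b":{"c":null}}, is what a reader would 
reconstruct from level 2 (n-1), which is consistent with the behavior described 
above.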



