[ https://issues.apache.org/jira/browse/HIVE-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212643#comment-15212643 ]
Carl Steinbach commented on HIVE-13330:
---------------------------------------
Please change the name of the test from "vector_string_reader_empty_dict.q" to
"orc_string_reader_empty_dict.q"
> ORC vectorized string dictionary reader does not differentiate null vs empty
> string dictionary
> ----------------------------------------------------------------------------------------------
>
> Key: HIVE-13330
> URL: https://issues.apache.org/jira/browse/HIVE-13330
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.3.0, 2.0.0, 2.1.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Priority: Critical
> Labels: CorrectnessBug
> Attachments: HIVE-13330.1.patch, HIVE-13330.2.patch
>
>
> The vectorized string dictionary reader cannot differentiate between the case
> where all dictionary entries are null and the case where the dictionary has a
> single entry containing the empty string. This causes wrong results when
> reading data from such files.
> {code:title=Vectorization On}
> SET hive.vectorized.execution.enabled=true;
> SET hive.fetch.task.conversion=none;
> select vcol from testnullorc3 limit 1;
> OK
> NULL
> {code}
> {code:title=Vectorization Off}
> SET hive.vectorized.execution.enabled=false;
> SET hive.fetch.task.conversion=none;
> select vcol from testnullorc3 limit 1;
> OK
> {code}
> The input table testnullorc3 contains a varchar column vcol with a few empty
> strings and a few nulls. For this table, the non-vectorized reader returns an
> empty string as the first row, but the vectorized reader returns NULL.
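
The ambiguity described in the issue can be illustrated with a minimal Java sketch. This is not Hive's or ORC's actual reader code: the class DictionaryNullVsEmpty, the Dict type, and all method names are hypothetical, and the sketch assumes a simplified dictionary made of concatenated entry bytes plus one length per entry. Under that assumption, a dictionary with no entries and a dictionary with a single empty-string entry both carry zero bytes, so a check based on the byte count alone cannot tell them apart, while a check based on the entry count can.

```java
public class DictionaryNullVsEmpty {

    /** Hypothetical, simplified dictionary: entry bytes back to back, plus one length per entry. */
    public static final class Dict {
        final byte[] blob;    // concatenated bytes of all dictionary entries
        final int[] lengths;  // lengths[i] is the byte length of entry i

        Dict(byte[] blob, int[] lengths) {
            this.blob = blob;
            this.lengths = lengths;
        }
    }

    /** Column where every value is null: no entries, no bytes. */
    public static Dict allNullDict() {
        return new Dict(new byte[0], new int[0]);
    }

    /** Column whose only non-null value is "": one entry of length 0, still no bytes. */
    public static Dict singleEmptyStringDict() {
        return new Dict(new byte[0], new int[] {0});
    }

    /** Ambiguous check: equates "no bytes" with "no entries". */
    public static boolean looksAllNullByBytes(Dict d) {
        return d.blob.length == 0;
    }

    /** Unambiguous check in this model: only zero entries means no non-null values. */
    public static boolean looksAllNullByEntryCount(Dict d) {
        return d.lengths.length == 0;
    }

    public static void main(String[] args) {
        Dict allNull = allNullDict();
        Dict singleEmpty = singleEmptyStringDict();

        System.out.println("by bytes, all-null dict:       " + looksAllNullByBytes(allNull));
        System.out.println("by bytes, empty-string dict:   " + looksAllNullByBytes(singleEmpty));
        System.out.println("by entries, all-null dict:     " + looksAllNullByEntryCount(allNull));
        System.out.println("by entries, empty-string dict: " + looksAllNullByEntryCount(singleEmpty));
    }
}
```

In this model the byte-based check reports both dictionaries as all-null, which mirrors the NULL returned with vectorization on, whereas the entry-count check distinguishes them.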