[
https://issues.apache.org/jira/browse/HIVE-27662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raghav Aggarwal updated HIVE-27662:
-----------------------------------
Description:
When reading a text table with vectorization enabled and
hive.fetch.task.conversion set to none, delimiters are parsed incorrectly in
nested complex types that contain a map. For example, if a column's schema is
map<string,struct<id:string,name:string>>, the \u0004 character appears in the
output. Here is an example:
Sample q file:
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;
create EXTERNAL table `table6` as
select
'bob' as name,
MAP(
"Key1",
ARRAY(
1,
2,
3
),
"Key2",
ARRAY(
4,
5,
6
)
) as testmarks;
select * from table6;
set hive.vectorized.execution.enabled=false;
select * from table6; {code}
Output of 1st select statement:
{code:java}
bob· {"Key1":null,"Key2":null} {code}
Output of 2nd select statement:
{code:java}
bob· {"Key1":[1,2,3],"Key2":[4,5,6]} {code}
The MAP complex type does not handle the case where it contains a nested
complex type such as STRUCT, ARRAY, or UNION.
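For context on where \u0004 comes from, here is a minimal Python sketch of how
Hive's default text serialization chooses a separator per nesting level, as I
understand LazySimpleSerDe: \u0001 between columns, then one further control
character per level, with a map consuming two levels (one for entries, one for
key/value). The SEPARATORS table and the function names are hypothetical and
only illustrate the layout; this is not Hive's code.

```python
# Assumed default separators by nesting depth: \x01 (columns), \x02, \x03, \x04, ...
SEPARATORS = [chr(i) for i in range(1, 9)]


def serialize(value, level):
    """Serialize a nested value; `level` indexes the separator used by this value."""
    if isinstance(value, dict):
        # A map uses one separator between entries and the next between key and value.
        entry_sep, kv_sep = SEPARATORS[level], SEPARATORS[level + 1]
        return entry_sep.join(
            serialize(k, level + 2) + kv_sep + serialize(v, level + 2)
            for k, v in value.items()
        )
    if isinstance(value, (list, tuple)):
        # An array uses a single separator between its items.
        return SEPARATORS[level].join(serialize(v, level + 1) for v in value)
    return str(value)


def serialize_row(columns):
    """Join top-level columns with \x01; each column starts at nesting level 1."""
    return SEPARATORS[0].join(serialize(c, 1) for c in columns)


def parse_map_of_int_arrays(text, level):
    """Parse a map<string,array<int>> column, splitting the nested array as well."""
    entry_sep = SEPARATORS[level]
    kv_sep = SEPARATORS[level + 1]
    item_sep = SEPARATORS[level + 2]  # \x04 for a top-level map column
    result = {}
    for entry in text.split(entry_sep):
        key, raw_value = entry.split(kv_sep, 1)
        result[key] = [int(item) for item in raw_value.split(item_sep)]
    return result


if __name__ == "__main__":
    row = serialize_row(["bob", {"Key1": [1, 2, 3], "Key2": [4, 5, 6]}])
    print(repr(row))
    print(parse_map_of_int_arrays(row.split(SEPARATORS[0])[1], 1))
```

Under these assumptions the array items inside the map values are separated by
\u0004, so a reader that stops splitting at the key/value level would leave
that character in the value instead of producing the array elements.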
was:
When reading a text table with vectorization enabled and
hive.fetch.task.conversion set to none, delimiters are parsed incorrectly in
nested complex types that contain a map. For example, if a column's schema is
map<string,struct<id:string,name:string>>, the \u0004 character appears in the
output. Here is an example:
Sample q file:
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;
create EXTERNAL table `table6` as
select
'bob' as name,
MAP(
"Key1",
ARRAY(
1,
2,
3
),
"Key2",
ARRAY(
4,
5,
6
)
) as testmarks;
select * from table6;
set hive.vectorized.execution.enabled=false;
select * from table6; {code}
Output of 1st select statement:
{code:java}
bob· {"Key1":null,"Key2":null} {code}
Output of 2nd select statement:
{code:java}
bob· {"Key1":[1,2,3],"Key2":[4,5,6]} {code}
> Incorrect parsing of nested complex types containing map during vectorized
> text processing
> ------------------------------------------------------------------------------------------
>
> Key: HIVE-27662
> URL: https://issues.apache.org/jira/browse/HIVE-27662
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Raghav Aggarwal
> Assignee: Raghav Aggarwal
> Priority: Major
>
> When reading a text table with vectorization enabled and
> hive.fetch.task.conversion set to none, delimiters are parsed incorrectly in
> nested complex types that contain a map. For example, if a column's schema is
> map<string,struct<id:string,name:string>>, the \u0004 character appears in
> the output. Here is an example:
> Sample q file:
>
> {code:java}
> set hive.fetch.task.conversion=none;
> set hive.vectorized.execution.enabled=true;
> create EXTERNAL table `table6` as
> select
> 'bob' as name,
> MAP(
> "Key1",
> ARRAY(
> 1,
> 2,
> 3
> ),
> "Key2",
> ARRAY(
> 4,
> 5,
> 6
> )
> ) as testmarks;
> select * from table6;
> set hive.vectorized.execution.enabled=false;
> select * from table6; {code}
> Output of 1st select statement:
> {code:java}
> bob· {"Key1":null,"Key2":null} {code}
> Output of 2nd select statement:
> {code:java}
> bob· {"Key1":[1,2,3],"Key2":[4,5,6]} {code}
>
> The MAP complex type does not handle the case where it contains a nested
> complex type such as STRUCT, ARRAY, or UNION.