[
https://issues.apache.org/jira/browse/HIVE-27662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raghav Aggarwal updated HIVE-27662:
-----------------------------------
Description:
When reading a text table with vectorization enabled and
hive.fetch.task.conversion set to none, delimiters are parsed incorrectly in
nested complex types that contain a map. For example, if a column's schema is
map<string,struct<id:string,name:string>>, the \u0004 character appears in the
output. Here is an example:
was:When reading data from the text file format (with vectorization on) that
contains multiple delimiters such as ^A, ^B, ^C, ^D (i.e. \u0001, \u0002,
\u0003, \u0004), the data is parsed incorrectly, which leads to incorrect
results.
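For readers unfamiliar with how the delimiters listed above nest, here is a
rough Python sketch. It is not the reporter's example and not Hive's code. It
assumes the default text-format separators, assigned by nesting depth, and a
top-level column of type map<string,struct<id:string,name:string>>, where the
map uses \u0002 between entries and \u0003 between key and value, leaving
\u0004 for the struct fields. The function name and sample data are
hypothetical.

```python
# Illustrative sketch only; an assumption-laden model, not Hive's implementation.
# Assumed default separators, indexed by nesting depth (index 0 separates columns).
SEPARATORS = ["\u0001", "\u0002", "\u0003", "\u0004"]


def parse_map_of_struct(raw, level=1):
    """Parse one map<string,struct<id:string,name:string>> column value.

    A map consumes two separator levels (entries, then key/value),
    so the struct inside the map value starts two levels deeper.
    """
    entry_sep = SEPARATORS[level]      # between map entries
    kv_sep = SEPARATORS[level + 1]     # between a key and its value
    field_sep = SEPARATORS[level + 2]  # between struct fields in the value
    result = {}
    for entry in raw.split(entry_sep):
        key, value = entry.split(kv_sep, 1)
        struct_id, name = value.split(field_sep, 1)
        result[key] = {"id": struct_id, "name": name}
    return result


# Hypothetical sample value with two map entries.
raw = "k1\u00031\u0004alice\u0002k2\u00032\u0004bob"
parsed = parse_map_of_struct(raw)

# If the struct level is never split, the raw \u0004 stays inside the value,
# which resembles the symptom described in this issue.
unsplit_value = raw.split("\u0002")[0].split("\u0003", 1)[1]
```

In this model, a reader that picks the wrong separator for the struct inside
the map value would leave \u0004 embedded in the field text. Whether that is
the actual root cause is not stated in this message.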
> Incorrect parsing of nested complex types containing map during vectorized
> text processing
> ------------------------------------------------------------------------------------------
>
> Key: HIVE-27662
> URL: https://issues.apache.org/jira/browse/HIVE-27662
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Raghav Aggarwal
> Assignee: Raghav Aggarwal
> Priority: Major
>
> When reading a text table with vectorization enabled and
> hive.fetch.task.conversion set to none, delimiters are parsed incorrectly
> in nested complex types that contain a map. For example, if a column's
> schema is map<string,struct<id:string,name:string>>, the \u0004 character
> appears in the output. Here is an example:
>