[ https://issues.apache.org/jira/browse/HIVE-16332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhizhen Hou updated HIVE-16332:
-------------------------------
Description:
## Steps to reproduce the result
1. First, create a text-format table with an array-type field in Hive.
```
create table test_text_orc (
col_int bigint,
col_text string,
col_array array<string>,
col_map map<string, string>
)
PARTITIONED BY (
day string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
collection items TERMINATED BY ']'
map keys TERMINATED BY ':'
;
```
2. Create a new text file named hive-orc-text-file-array-error-test.txt.
```
1,text_value1,array_value1]array_value2]array_value3, map_key1:map_value1,map_key2:map_value2
2,text_value2,array_value4, map_key1:map_value3
,text_value3,, map_key1:]map_key3:map_value3
```
3. Load the data into one partition.
```
LOAD DATA local INPATH './hive-orc-text-file-array-error-test.txt' overwrite
into table test_text_orc partition(day=20170329);
```
4. Select the data to verify the result.
```
hive> select * from test.test_text_orc;
OK
1 text_value1 ["array_value1","array_value2","array_value3"] {" map_key1":"map_value1","map_key2":"map_value2"} 20170329
2 text_value2 ["array_value4"] {"map_key1":"map_value3"} 20170329
NULL text_value3 [] {" map_key1":"","map_key3":"map_value3"} 20170329
```
5. Change the file format of the table to ORC.
```
alter table test_text_orc set fileformat orc;
```
6. Query the table again. The array column is now wrong: rows 2 and 3 return elements left over from earlier rows.
```
hive> select * from test.test_text_orc;
OK
1 text_value1 ["array_value1","array_value2","array_value3"] {" map_key1":"map_value1","map_key2":"map_value2"} 20170329
2 text_value2 ["array_value4","array_value2","array_value3"] {"map_key1":"map_value3"} 20170329
NULL text_value3 ["array_value4","array_value2","array_value3"] {"map_key3":"map_value3"," map_key1":""} 20170329
```
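When narrowing this down, it may help to compare the table-level and partition-level storage metadata after step 5. The sketch below relies on Hive's documented behaviour that `ALTER TABLE ... SET FILEFORMAT` changes metadata only and does not rewrite existing data files. Whether the partition loaded in step 3 still reports the text input format was not checked in this report, so treat that as an assumption to verify. The table name is unqualified here, as in the `create table` statement.

```sql
-- Table-level storage descriptor: expected to show the ORC input/output
-- formats after step 5.
DESCRIBE FORMATTED test_text_orc;

-- Partition-level storage descriptor for the partition loaded in step 3.
-- Assumption to verify: it may still list the text input format, since the
-- underlying file is the original delimited text file.
DESCRIBE FORMATTED test_text_orc PARTITION (day='20170329');
```

If the two descriptors differ, the wrong array values would be appearing while text-format partition data is read under an ORC table definition, which is the situation the steps above set up.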
> After a partitioned text-format table with one partition is changed to ORC
> format, array-type fields may return incorrect values.
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-16332
> URL: https://issues.apache.org/jira/browse/HIVE-16332
> Project: Hive
> Issue Type: Bug
> Components: ORC
> Affects Versions: 2.1.1
> Reporter: Zhizhen Hou
> Priority: Critical
>
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)