Mohammad Kamrul Islam created HIVE-5803:
-------------------------------------------
Summary: Support CTAS from a non-avro table to an avro table
Key: HIVE-5803
URL: https://issues.apache.org/jira/browse/HIVE-5803
Project: Hive
Issue Type: Task
Reporter: Mohammad Kamrul Islam
Hive currently does not fully support HQL like:
CREATE TABLE <AVRO-BASED-TABLE> as SELECT * from <NON_AVRO_TABLE>;
Actually, the CTAS statement itself completes successfully, but a subsequent
"SELECT * from <AVRO-BASED-TABLE> .." fails.
This JIRA depends on HIVE-3159, which translates TypeInfo to an Avro schema.
Findings so far: CTAS uses internal column names (instead of the column names
provided in the select) when creating the Avro data file. In other words, the
Avro data file has column names of the form _col0, _col1, whereas the table
column names are different.
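The finding above can be illustrated with a small model. This is only a sketch, not Hive or Avro code: plain dicts stand in for Avro records, the helper resolve_by_name is hypothetical, and the row values are made-up samples. It assumes the reader matches fields by name, as Avro's schema resolution rules do, so a file written with _col0/_col1 cannot satisfy a table that declares key/value.

```python
# Illustrative model only: plain dicts stand in for Avro records and schemas.
# This is NOT Hive's or Avro's actual code; resolve_by_name is a hypothetical helper.

def resolve_by_name(writer_record, reader_fields):
    """Project a written record onto the reader's fields, matching by name."""
    missing = [name for name in reader_fields if name not in writer_record]
    if missing:
        raise KeyError(f"reader fields not found in written data: {missing}")
    return {name: writer_record[name] for name in reader_fields}


# Column names the table declares (taken from the SELECT list).
table_columns = ["key", "value"]

# What CTAS is reported to write: internal names instead of the aliases.
# The values are sample data, assumed for illustration.
ctas_record = {"_col0": "238", "_col1": "val_238"}

try:
    resolve_by_name(ctas_record, table_columns)
    read_ok = True
except KeyError as err:
    read_ok = False
    print("read fails:", err)

# If the file carried the table's column names, the same read would succeed.
renamed_record = dict(zip(table_columns, ctas_record.values()))
row = resolve_by_name(renamed_record, table_columns)
print(row)
```

Under these assumptions, the data itself is intact; only the field names written into the file disagree with the table definition, which is consistent with the CTAS succeeding and the later SELECT failing.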
I tested with the following test case, which failed:
- verify 1) can create table using create table as select from non-avro table
         2) LOAD avro data into new table and read data from the new table
CREATE TABLE simple_kv_txt (key STRING, value STRING) STORED AS TEXTFILE;
DESCRIBE simple_kv_txt;
LOAD DATA LOCAL INPATH '../data/files/kv1.txt' INTO TABLE simple_kv_txt;
SELECT * FROM simple_kv_txt ORDER BY KEY;
CREATE TABLE copy_doctors
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
AS SELECT key AS key, value AS value FROM simple_kv_txt;
DESCRIBE copy_doctors;
SELECT * FROM copy_doctors;