(Updated Feb. 19, 2014, 9:56 p.m.)
Review request for hive.
Changes
-------
Incorporated review feedback.
Updated more test-case results for EXPLAIN of CTAS.
It seems that the test table srcbucket, being a bucketed (multi-file) table,
returns rows in nondeterministic order from a select query, so the test now
first inserts into a staging table using sort by.
Bugs: HIVE-6375
https://issues.apache.org/jira/browse/HIVE-6375
Repository: hive-git
Description
-------
There is a Hive bug in SemanticAnalyzer: it chooses different names for the
same columns in the CreateTable task and the FileSink task.
columnInfo.getInternalName() was used in one place, while fieldSchema still
used columnInfo.getAlias() when it was available. This change makes both
consistent, favoring columnInfo.getAlias() when it is available.
This was not revealed before because other file formats such as RCFile seem to
use column ordinal position, and Avro files store the schema separately
altogether.
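The naming rule described above can be sketched roughly as follows. This is a
minimal illustration and not the actual patch: the class ColumnNameSketch, the
nested Column type, and the methods outputName and outputNames are hypothetical
stand-ins for Hive's ColumnInfo and the code in SemanticAnalyzer, and the
"_col0"-style internal names are used only as sample values. The point is that
one helper decides the name (alias if available, otherwise the internal name),
so the table schema and the file writer cannot disagree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnNameSketch {

    /** Hypothetical stand-in for Hive's ColumnInfo; only the two names matter here. */
    static final class Column {
        final String internalName; // sample value such as "_col0"
        final String alias;        // null when no alias is available

        Column(String internalName, String alias) {
            this.internalName = internalName;
            this.alias = alias;
        }
    }

    /** Prefer the alias when it is available, otherwise fall back to the internal name. */
    static String outputName(Column c) {
        return c.alias != null ? c.alias : c.internalName;
    }

    /** Single source of names, intended to be shared by the schema and the writer. */
    static List<String> outputNames(List<Column> columns) {
        List<String> names = new ArrayList<>();
        for (Column c : columns) {
            names.add(outputName(c));
        }
        return names;
    }

    public static void main(String[] args) {
        List<Column> cols = Arrays.asList(
            new Column("_col0", "key"),    // name taken from the input table
            new Column("_col1", null),     // no alias, so the name is auto-generated
            new Column("_col2", "total")); // name specified as an explicit alias
        System.out.println(outputNames(cols));
    }
}
```

The three sample columns loosely mirror the cases listed under Testing: a name
taken from the input table, an auto-generated name, and an explicit alias.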
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 77388dd
ql/src/test/queries/clientpositive/parquet_ctas.q PRE-CREATION
ql/src/test/results/clientpositive/ctas.q.out 9668855
ql/src/test/results/clientpositive/ctas_hadoop20.q.out 0ec0af5
ql/src/test/results/clientpositive/merge3.q.out 3df75b7
ql/src/test/results/clientpositive/parquet_ctas.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/18254/diff/
Testing
-------
Added parquet_ctas.q. It covers cases where the column name is taken directly
from the input table (implied alias), where the name is auto-generated, where
the name is specified as an alias, and a mix of the three.
Thanks,
Szehon Ho