[jira] [Commented] (HIVE-6131) New columns after table alter result in null values despite data

Hive QA (JIRA) Thu, 08 Dec 2016 05:38:23 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732234#comment-15732234
 ]


Hive QA commented on HIVE-6131:
-------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12637927/HIVE-6131.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 10723 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=143)
        
[vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,cbo_rp_subq_not_in.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q]
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=148)
        
[auto_sortmerge_join_13.q,union_top_level.q,vector_left_outer_join2.q,schema_evol_text_vecrow_part_all_primitive.q,constprog_semijoin.q,update_where_partitioned.q,drop_partition_with_stats.q,smb_mapjoin_14.q,skiphf_aggr.q,vectorized_ptf.q,auto_join_filters.q,join0.q,insert_orig_table.q,mergejoin.q,join_filters.q,orc_split_elimination.q,subquery_in.q,vector_outer_join0.q,schema_evol_text_vec_part_all_primitive.q,vector_complex_all.q,auto_sortmerge_join_4.q,bucket_many.q,vectorization_15.q,union3.q,vectorization_nested_udf.q,windowing_windowspec2.q,auto_smb_mapjoin_14.q,vector_mr_diff_schema_alias.q,vector_join_filters.q,reduce_deduplicate_extended.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_change_col]
 (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_cascade] 
(batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_data_after_schema_update]
 (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat11]
 (batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat12]
 (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat13]
 (batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat14]
 (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part]
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_complex]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_primitive]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_complex]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2487/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2487/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2487/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12637927 - PreCommit-HIVE-Build

> New columns after table alter result in null values despite data
> ----------------------------------------------------------------
>
>                 Key: HIVE-6131
>                 URL: https://issues.apache.org/jira/browse/HIVE-6131
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.11.0, 0.12.0, 0.13.0, 1.2.1
>            Reporter: James Vaughan
>            Priority: Critical
>         Attachments: HIVE-6131.1.patch
>
>
> Hi folks,
> I found and verified a bug on our CDH 4.0.3 install of Hive when adding 
> columns to tables with Partitions using 'REPLACE COLUMNS'.  I dug through the 
> Jira a little bit and didn't see anything for it so hopefully this isn't just 
> noise on the radar.
> Basically, when you alter a table with partitions and then reupload data to 
> that partition, it doesn't seem to recognize the extra data that actually 
> exists in HDFS- as in, returns NULL values on the new column despite having 
> the data and recognizing the new column in the metadata.
> Here's some steps to reproduce using a basic table:
> 1.  Run this hive command:  CREATE TABLE jvaughan_test (col1 string) 
> partitioned by (day string);
> 2.  Create a simple file on the system with a couple of entries, something 
> like "hi" and "hi2" separated by newlines.
> 3.  Run this hive command, pointing it at the file:  LOAD DATA LOCAL INPATH 
> '<FILEDIR>' OVERWRITE INTO TABLE jvaughan_test PARTITION (day = '2014-01-02');
> 4.  Confirm the data with:  SELECT * FROM jvaughan_test WHERE day = 
> '2014-01-02';
> 5.  Alter the column definitions:  ALTER TABLE jvaughan_test REPLACE COLUMNS 
> (col1 string, col2 string);
> 6.  Edit your file and add a second column using the default separator 
> (ctrl+v, then ctrl+a in Vim) and add two more entries, such as "hi3" on the 
> first row and "hi4" on the second
> 7.  Run step 3 again
> 8.  Check the data again like in step 4
> For me, this is the results that get returned:
> hive> select * from jvaughan_test where day = '2014-01-01';
> OK
> hi    NULL    2014-01-02
> hi2   NULL    2014-01-02
> This is despite the fact that there is data in the file stored by the 
> partition in HDFS.
> Let me know if you need any other information.  The only workaround for me 
> currently is to drop partitions for any I'm replacing data in and THEN 
> reupload the new data file.
> Thanks,
> -James



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-6131) New columns after table alter result in null values despite data

Reply via email to