[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-03-23 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412005#comment-16412005
 ] 

Vihang Karajgaonkar commented on HIVE-17843:


Patch merged to master branch. Thanks for your contribution, [~janulatha].

> UINT32 Parquet columns are handled as signed INT32-s, silently reading 
> incorrect data
> -
>
> Key: HIVE-17843
> URL: https://issues.apache.org/jira/browse/HIVE-17843
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Ivanfi
> Assignee: Janaki Lahorani
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-17843.1.patch, HIVE-17843.1.patch, 
> HIVE-17843.2.patch, HIVE-17843.3.patch, HIVE-17843.4.patch, 
> data_including_invalid_values.parquet, data_with_valid_values.parquet, 
> test_uint.parquet
>
>
> An unsigned 32-bit Parquet column, such as
> {noformat}
> optional int32 uint_32_col (UINT_32)
> {noformat}
> is read by Hive as if it were signed, leading to incorrect results.
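
To illustrate the symptom described above, here is a minimal Java sketch. It is not taken from the Hive patch: the class name Uint32Demo and the helper asUnsigned are hypothetical, and the only assumption is that the UINT_32 value arrives as a plain Java int holding the stored 32 bits. It shows how the same bit pattern yields different numbers under signed and unsigned interpretation.

```java
// Hypothetical illustration only; not code from Hive or from the attached patches.
class Uint32Demo {
    // Reinterpret the 32 stored bits as an unsigned value, widened to a long.
    static long asUnsigned(int stored) {
        return Integer.toUnsignedLong(stored);
    }

    public static void main(String[] args) {
        // Bit pattern 0xFFFFFFFF, the largest value a UINT_32 column can hold.
        int stored = (int) 4294967295L;
        System.out.println(stored);             // signed interpretation: -1
        System.out.println(asUnsigned(stored)); // unsigned interpretation: 4294967295
    }
}
```

Any UINT_32 value above 2147483647 has its top bit set, so a reader that treats the column as a signed INT32 reports it as a negative number.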



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-03-21 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408964#comment-16408964
 ] 

Vihang Karajgaonkar commented on HIVE-17843:


+1, the patch looks good to me.






[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-03-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405679#comment-16405679
 ] 

Hive QA commented on HIVE-17843:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12915223/HIVE-17843.4.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 29 failed/errored test(s), 13020 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=92)

[infer_bucket_sort_num_buckets.q,infer_bucket_sort_reducers_power_two.q,parallel_orderby.q,bucket_num_reducers_acid.q,infer_bucket_sort_map_operators.q,infer_bucket_sort_merge.q,root_dir_external_table.q,infer_bucket_sort_dyn_part.q,udf_using.q,bucket_num_reducers_acid2.q]
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=94)

[nopart_insert.q,insert_into_with_schema.q,input41.q,having1.q,create_table_failure3.q,default_constraint_invalid_default_value.q,database_drop_not_empty_restrict.q,windowing_after_orderby.q,orderbysortby.q,subquery_select_distinct2.q,authorization_uri_alterpart_loc.q,udf_last_day_error_1.q,constraint_duplicate_name.q,create_table_failure4.q,alter_tableprops_external_with_notnull_constraint.q,semijoin5.q,udf_format_number_wrong4.q,deletejar.q,exim_11_nonpart_noncompat_sorting.q,show_tables_bad_db2.q,drop_func_nonexistent.q,nopart_load.q,alter_table_non_partitioned_table_cascade.q,load_wrong_fileformat.q,lockneg_try_db_lock_conflict.q,udf_field_wrong_args_len.q,create_table_failure2.q,create_with_fk_constraints_enforced.q,groupby2_map_skew_multi_distinct.q,udf_min.q,authorization_update_noupdatepriv.q,show_columns2.q,authorization_insert_noselectpriv.q,orc_replace_columns3_acid.q,compare_double_bigint.q,authorization_set_nonexistent_conf.q,alter_rename_partition_failure3.q,split_sample_wrong_format2.q,create_with_fk_pk_same_tab.q,compare_double_bigint_2.q,authorization_show_roles_no_admin.q,materialized_view_authorization_rebuild_no_grant.q,unionLimit.q,authorization_revoke_table_fail2.q,authorization_insert_noinspriv.q,duplicate_insert3.q,authorization_desc_table_nosel.q,stats_noscan_non_native.q,orc_change_serde_acid.q,create_or_replace_view7.q,exim_07_nonpart_noncompat_ifof.q,create_with_unique_constraints_enforced.q,udf_concat_ws_wrong2.q,fileformat_bad_class.q,merge_negative_2.q,exim_15_part_nonpart.q,authorization_not_owner_drop_view.q,external1.q,authorization_uri_insert.q,create_with_fk_wrong_ref.q,columnstats_tbllvl_incorrect_column.q,authorization_show_parts_nosel.q,authorization_not_owner_drop_tab.q,external2.q,authorization_deletejar.q,temp_table_create_like_partitions.q,udf_greatest_error_1.q,ptf_negative_AggrFuncsWithNoGBYNoPartDef.q,alter_view_as_select_not_exist.q,touch1.q,groupby3_map_skew_multi_distinct.q,insert_into_notnull_constraint.q,exchange_pa
rtition_neg_partition_missing.q,groupby_cube_multi_gby.q,columnstats_tbllvl.q,drop_invalid_constraint2.q,alter_table_add_partition.q,update_not_acid.q,archive5.q,alter_table_constraint_invalid_pk_col.q,ivyDownload.q,udf_instr_wrong_type.q,bad_sample_clause.q,authorization_not_owner_drop_tab2.q,authorization_alter_db_owner.q,show_columns1.q,orc_type_promotion3.q,create_view_failure8.q,strict_join.q,udf_add_months_error_1.q,groupby_cube2.q,groupby_cube1.q,groupby_rollup1.q,genericFileFormat.q,invalid_cast_from_binary_4.q,drop_invalid_constraint1.q,serde_regex.q,show_partitions1.q,invalid_cast_from_binary_6.q,create_with_multi_pk_constraint.q,udf_field_wrong_type.q,groupby_grouping_sets4.q,groupby_grouping_sets3.q,insertsel_fail.q,udf_locate_wrong_type.q,orc_type_promotion1_acid.q,set_table_property.q,create_or_replace_view2.q,groupby_grouping_sets2.q,alter_view_failure.q,distinct_windowing_failure1.q,invalid_t_alter2.q,alter_table_constraint_invalid_fk_col1.q,invalid_varchar_length_2.q,authorization_show_grant_otheruser_alltabs.q,subquery_windowing_corr.q,compact_non_acid_table.q,authorization_view_4.q,authorization_disallow_transform.q,materialized_view_authorization_rebuild_other.q,authorization_fail_4.q,dbtxnmgr_nodblock.q,set_hiveconf_internal_variable1.q,input_part0_neg.q,udf_printf_wrong3.q,load_orc_negative2.q,druid_buckets.q,archive2.q,authorization_addjar.q,invalid_sum_syntax.q,insert_into_with_schema1.q,udf_add_months_error_2.q,dyn_part_max_per_node.q,authorization_revoke_table_fail1.q,udf_printf_wrong2.q,archive_multi3.q,udf_printf_wrong1.q,subquery_subquery_chain.q,authorization_view_disable_cbo_4.q,no_matching_udf.q,create_view_failure7.q,drop_native_udf.q,truncate_column_list_bucketing.q,authorization_uri_add_partition.q,authorization_view_disable_cbo_3.q,bad_exec_hooks.q,authorization_view_disable_cbo_2.q,fetchtask_ioexception.q,char_pad_convert_fail2.q,authorization_set_role_neg1.q,serde_regex3.q,authorization_delete

[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-03-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405666#comment-16405666
 ] 

Hive QA commented on HIVE-17843:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m  2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 32s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 42s{color} | {color:red} root: The patch generated 1 new + 17 unchanged - 2 fixed = 18 total (was 19) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 37s{color} | {color:red} ql: The patch generated 1 new + 17 unchanged - 2 fixed = 18 total (was 19) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 13s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m 46s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-9713/dev-support/hive-personality.sh |
| git revision | master / 26c0ab6 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9713/yetus/diff-checkstyle-root.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9713/yetus/diff-checkstyle-ql.txt |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-9713/yetus/patch-asflicense-problems.txt |
| modules | C: . ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9713/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.








[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-03-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405537#comment-16405537
 ] 

Hive QA commented on HIVE-17843:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12915158/HIVE-17843.3.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 30 failed/errored test(s), 13416 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=92)

[infer_bucket_sort_num_buckets.q,infer_bucket_sort_reducers_power_two.q,parallel_orderby.q,bucket_num_reducers_acid.q,infer_bucket_sort_map_operators.q,infer_bucket_sort_merge.q,root_dir_external_table.q,infer_bucket_sort_dyn_part.q,udf_using.q,bucket_num_reducers_acid2.q]
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=95)

[udf_invalid.q,authorization_uri_export.q,default_constraint_complex_default_value.q,druid_datasource2.q,view_update.q,default_partition_name.q,authorization_public_create.q,load_wrong_fileformat_rc_seq.q,default_constraint_invalid_type.q,altern1.q,describe_xpath1.q,drop_view_failure2.q,temp_table_rename.q,invalid_select_column_with_subquery.q,udf_trunc_error1.q,insert_view_failure.q,dbtxnmgr_nodbunlock.q,authorization_show_columns.q,cte_recursion.q,merge_constraint_notnull.q,load_part_nospec.q,clusterbyorderby.q,orc_type_promotion2.q,ctas_noperm_loc.q,udf_instr_wrong_args_len.q,invalid_create_tbl2.q,part_col_complex_type.q,authorization_drop_db_empty.q,smb_mapjoin_14.q,subquery_scalar_multi_rows.q,alter_partition_coltype_2columns.q,subquery_corr_in_agg.q,insert_overwrite_notnull_constraint.q,authorization_show_grant_otheruser_wtab.q,regex_col_groupby.q,udaf_collect_set_unsupported.q,ptf_negative_DuplicateWindowAlias.q,exim_22_export_authfail.q,udf_likeany_wrong1.q,groupby_key.q,ambiguous_col.q,groupby3_multi_distinct.q,authorization_alter_drop_ptn.q,invalid_cast_from_binary_5.q,show_create_table_does_not_exist.q,invalid_select_column.q,exim_20_managed_location_over_existing.q,interval_3.q,authorization_compile.q,join35.q,merge_negative_3.q,udf_concat_ws_wrong3.q,create_or_replace_view8.q,create_external_with_notnull_constraint.q,split_sample_out_of_range.q,materialized_view_no_transactional_rewrite.q,authorization_show_grant_otherrole.q,create_with_constraints_duplicate_name.q,invalid_stddev_samp_syntax.q,authorization_view_disable_cbo_7.q,autolocal1.q,avro_non_nullable_union.q,load_orc_negative_part.q,drop_view_failure1.q,columnstats_partlvl_invalid_values_autogather.q,exim_13_nonnative_import.q,alter_table_wrong_regex.q,add_partition_with_whitelist.q,udf_next_day_error_2.q,authorization_select.q,udf_trunc_error2.q,authorization_view_7.q,udf_format_number_wrong5.q,touch2.q,exim_03_nonpart_noncompat_colschema.q,orc_type_promotion1.q,lateral_view_alias.q,show_tables
_bad_db1.q,unset_table_property.q,alter_non_native.q,nvl_mismatch_type.q,load_orc_negative3.q,authorization_create_role_no_admin.q,invalid_distinct1.q,authorization_grant_server.q,orc_type_promotion3_acid.q,show_tables_bad1.q,macro_unused_parameter.q,drop_invalid_constraint3.q,drop_partition_filter_failure.q,char_pad_convert_fail3.q,exim_23_import_exist_authfail.q,drop_invalid_constraint4.q,authorization_create_macro1.q,archive1.q,subquery_multiple_cols_in_select.q,change_hive_hdfs_session_path.q,udf_trunc_error3.q,invalid_variance_syntax.q,authorization_truncate_2.q,invalid_avg_syntax.q,invalid_select_column_with_tablename.q,mm_truncate_cols.q,groupby_grouping_sets1.q,druid_location.q,groupby2_multi_distinct.q,authorization_sba_drop_table.q,dynamic_partitions_with_whitelist.q,compare_string_bigint_2.q,udf_greatest_error_2.q,authorization_view_6.q,show_tablestatus.q,duplicate_alias_in_transform_schema.q,create_with_fk_uk_same_tab.q,udtf_not_supported3.q,alter_table_constraint_invalid_fk_col2.q,udtf_not_supported1.q,dbtxnmgr_notableunlock.q,ptf_negative_InvalidValueBoundary.q,alter_table_constraint_duplicate_pk.q,udf_printf_wrong4.q,create_view_failure9.q,udf_elt_wrong_type.q,selectDistinctStarNeg_1.q,invalid_mapjoin1.q,load_stored_as_dirs.q,input1.q,udf_sort_array_wrong1.q,invalid_distinct2.q,invalid_select_fn.q,authorization_role_grant_otherrole.q,archive4.q,load_nonpart_authfail.q,recursive_view.q,authorization_view_disable_cbo_1.q,desc_failure4.q,create_not_acid.q,udf_sort_array_wrong3.q,char_pad_convert_fail0.q,udf_map_values_arg_type.q,alter_view_failure6_2.q,alter_partition_change_col_nonexist.q,update_non_acid_table.q,authorization_view_disable_cbo_5.q,ct_noperm_loc.q,interval_1.q,authorization_show_grant_otheruser_all.q,authorization_view_2.q,show_tables_bad2.q,groupby_rollup2.q,truncate_column_seqfile.q,create_view_failure5.q,authorization_create_view.q,ptf_window_boundaries.q,ctasnullcol.q,input_part0_neg_2.q,create_or_r

[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-03-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405498#comment-16405498
 ] 

Hive QA commented on HIVE-17843:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 40s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m  1s{color} | {color:green} root: The patch generated 0 new + 17 unchanged - 2 fixed = 17 total (was 19) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} ql: The patch generated 0 new + 17 unchanged - 2 fixed = 17 total (was 19) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m  7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 15s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m  5s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-9712/dev-support/hive-personality.sh |
| git revision | master / 26c0ab6 |
| Default Java | 1.8.0_111 |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-9712/yetus/patch-asflicense-problems.txt |
| modules | C: . ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9712/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.








[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-03-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16404246#comment-16404246
 ] 

Hive QA commented on HIVE-17843:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12915068/HIVE-17843.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9700/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9700/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9700/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-03-18 23:31:30.232
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9700/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-03-18 23:31:30.235
+ cd apache-github-source-source
+ git fetch origin
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
+ git reset --hard HEAD
HEAD is now at d2d50e6 HIVE-18886: ACID: NPE on unexplained mysql exceptions (Gopal V, reviewed by Eugene Koifman)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at d2d50e6 HIVE-18886: ACID: NPE on unexplained mysql exceptions (Gopal V, reviewed by Eugene Koifman)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-03-18 23:31:33.878
+ rm -rf ../yetus_PreCommit-HIVE-Build-9700
+ mkdir ../yetus_PreCommit-HIVE-Build-9700
+ git gc
+ sleep 1s
+ git gc
+ sleep 1s
+ git gc
+ sleep 1s
+ git gc
+ sleep 1s
+ git gc
+ sleep 1s
+ git gc
+ sleep 1s
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-9700
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9700/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: missing binary patch data for 'data/files/data_including_invalid_values.parquet'
error: binary patch does not apply to 'data/files/data_including_invalid_values.parquet'
Falling back to three-way merge...
error: missing binary patch data for 'data/files/data_including_invalid_values.parquet'
error: binary patch does not apply to 'data/files/data_including_invalid_values.parquet'
error: data/files/data_including_invalid_values.parquet: patch does not apply
error: missing binary patch data for 'data/files/data_with_valid_values.parquet'
error: binary patch does not apply to 'data/files/data_with_valid_values.parquet'
Falling back to three-way merge...
error: missing binary patch data for 'data/files/data_with_valid_values.parquet'
error: binary patch does not apply to 'data/files/data_with_valid_values.parquet'
error: data/files/data_with_valid_values.parquet: patch does not apply
error: missing binary patch data for 'data/files/test_uint.parquet'
error: binary patch does not apply to 'data/files/test_uint.parquet'
Falling back to three-way merge...
error: missing binary patch data for 'data/files/test_uint.parquet'
error: binary patch does not apply to 'data/files/test_uint.parquet'
error: data/files/test_uint.parquet: patch does not apply
error: missing binary patch data for 'files/data_including_invalid_values.parquet'
error: binary patch does not apply to 'files/data_including_invalid_values.parquet'
Falling back to three-way merge...
error: missing binary patch data for 'files/data_including_invalid_values.parquet'
error: binary patch does not apply to 'files/data_including_invalid_values.parquet'
error: files/data_including_invalid_values.parquet: patch does not apply
error: missing binary patch data for 'files/data_with_valid_valu

[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-03-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403805#comment-16403805
 ] 

Hive QA commented on HIVE-17843:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12915028/HIVE-17843.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9693/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9693/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9693/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-03-18 00:13:24.802
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9693/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-03-18 00:13:24.805
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at d2d50e6 HIVE-18886: ACID: NPE on unexplained mysql exceptions (Gopal V, reviewed by Eugene Koifman)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at d2d50e6 HIVE-18886: ACID: NPE on unexplained mysql exceptions (Gopal V, reviewed by Eugene Koifman)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-03-18 00:13:28.070
+ rm -rf ../yetus_PreCommit-HIVE-Build-9693
+ mkdir ../yetus_PreCommit-HIVE-Build-9693
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-9693
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9693/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: missing binary patch data for 'data/files/data_including_invalid_values.parquet'
error: binary patch does not apply to 'data/files/data_including_invalid_values.parquet'
Falling back to three-way merge...
error: missing binary patch data for 'data/files/data_including_invalid_values.parquet'
error: binary patch does not apply to 'data/files/data_including_invalid_values.parquet'
error: data/files/data_including_invalid_values.parquet: patch does not apply
error: missing binary patch data for 'data/files/data_with_valid_values.parquet'
error: binary patch does not apply to 'data/files/data_with_valid_values.parquet'
Falling back to three-way merge...
error: missing binary patch data for 'data/files/data_with_valid_values.parquet'
error: binary patch does not apply to 'data/files/data_with_valid_values.parquet'
error: data/files/data_with_valid_values.parquet: patch does not apply
error: missing binary patch data for 'data/files/test_uint.parquet'
error: binary patch does not apply to 'data/files/test_uint.parquet'
Falling back to three-way merge...
error: missing binary patch data for 'data/files/test_uint.parquet'
error: binary patch does not apply to 'data/files/test_uint.parquet'
error: data/files/test_uint.parquet: patch does not apply
error: missing binary patch data for 'files/data_including_invalid_values.parquet'
error: binary patch does not apply to 'files/data_including_invalid_values.parquet'
Falling back to three-way merge...
error: missing binary patch data for 'files/data_including_invalid_values.parquet'
error: binary patch does not apply to 'files/data_including_invalid_values.parquet'
error: files/data_including_invalid_values.parquet: patch does not apply
error: missing binary patch data for 'files/data_with_valid_values.parquet'
error: binary patch does not apply to 'files/data_with_valid_values.parquet'
Falling back to three-way merge...
error: missing binary patch data for 'files/data_with_valid_values.parquet'
error: binary patch does not a

[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-03-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403705#comment-16403705
 ] 

Hive QA commented on HIVE-17843:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12915021/HIVE-17843.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9689/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9689/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9689/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-03-17 19:57:20.431
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9689/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-03-17 19:57:20.434
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at d2d50e6 HIVE-18886: ACID: NPE on unexplained mysql exceptions 
(Gopal V, reviewed by Eugene Koifman)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at d2d50e6 HIVE-18886: ACID: NPE on unexplained mysql exceptions 
(Gopal V, reviewed by Eugene Koifman)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-03-17 19:57:24.354
+ rm -rf ../yetus_PreCommit-HIVE-Build-9689
+ mkdir ../yetus_PreCommit-HIVE-Build-9689
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-9689
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9689/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: missing binary patch data for 
'data/files/data_including_invalid_values.parquet'
error: binary patch does not apply to 
'data/files/data_including_invalid_values.parquet'
Falling back to three-way merge...
error: missing binary patch data for 
'data/files/data_including_invalid_values.parquet'
error: binary patch does not apply to 
'data/files/data_including_invalid_values.parquet'
error: data/files/data_including_invalid_values.parquet: patch does not apply
error: missing binary patch data for 'data/files/data_with_valid_values.parquet'
error: binary patch does not apply to 
'data/files/data_with_valid_values.parquet'
Falling back to three-way merge...
error: missing binary patch data for 'data/files/data_with_valid_values.parquet'
error: binary patch does not apply to 
'data/files/data_with_valid_values.parquet'
error: data/files/data_with_valid_values.parquet: patch does not apply
error: missing binary patch data for 'data/files/test_uint.parquet'
error: binary patch does not apply to 'data/files/test_uint.parquet'
Falling back to three-way merge...
error: missing binary patch data for 'data/files/test_uint.parquet'
error: binary patch does not apply to 'data/files/test_uint.parquet'
error: data/files/test_uint.parquet: patch does not apply
error: missing binary patch data for 
'files/data_including_invalid_values.parquet'
error: binary patch does not apply to 
'files/data_including_invalid_values.parquet'
Falling back to three-way merge...
error: missing binary patch data for 
'files/data_including_invalid_values.parquet'
error: binary patch does not apply to 
'files/data_including_invalid_values.parquet'
error: files/data_including_invalid_values.parquet: patch does not apply
error: missing binary patch data for 'files/data_with_valid_values.parquet'
error: binary patch does not apply to 'files/data_with_valid_values.parquet'
Falling back to three-way merge...
error: missing binary patch data for 'files/data_with_valid_values.parquet'
error: binary patch does not a

[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-02-14 Thread Gabor Szadovszky (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364347#comment-16364347
 ] 

Gabor Szadovszky commented on HIVE-17843:
-

I've created a parquet file but I am unable to upload/attach it.

> UINT32 Parquet columns are handled as signed INT32-s, silently reading 
> incorrect data
> -
>
> Key: HIVE-17843
> URL: https://issues.apache.org/jira/browse/HIVE-17843
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Ivanfi
>Assignee: Janaki Lahorani
>Priority: Major
>
> An unsigned 32 bit Parquet column, such as
> {noformat}
> optional int32 uint_32_col (UINT_32)
> {noformat}
> is read by Hive as if it were signed, leading to incorrect results.





[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-02-14 Thread Zoltan Ivanfi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364282#comment-16364282
 ] 

Zoltan Ivanfi commented on HIVE-17843:
--

Sorry for the late answer. The simplest query suffices, e.g., a SELECT * on a 
table that contains a single column and a single row. But the parquet file has 
to have an unsigned integer in it and Hive does not write unsigned ints. 
[~gszadovszky] could you supply an example parquet file with an unsigned int 
that has its first bit set? Thanks!
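
To illustrate why the first bit matters, here is a minimal sketch of the misreading described in this issue. It is not Hive's reader code; the class and method names are hypothetical and only show how the same 32 bits yield different values under a signed and an unsigned interpretation.

```java
public class Uint32Demo {
    // Hypothetical helper: treats the raw 32 bits as a signed Java int,
    // which is the incorrect reading for a UINT_32 column.
    public static long readAsSigned(int rawBits) {
        return rawBits;
    }

    // Hypothetical helper: treats the same 32 bits as unsigned, which
    // needs a wider type (long) to hold values above Integer.MAX_VALUE.
    public static long readAsUnsigned(int rawBits) {
        return Integer.toUnsignedLong(rawBits);
    }

    public static void main(String[] args) {
        // 3000000000 does not fit in a signed int32; narrowing keeps the
        // low 32 bits, so the top bit of the stored pattern is set.
        int rawBits = (int) 3_000_000_000L;
        System.out.println(readAsSigned(rawBits));   // -1294967296
        System.out.println(readAsUnsigned(rawBits)); // 3000000000
    }
}
```

Values whose first bit is clear (up to 2147483647) read the same either way, which is why a test file needs at least one value above that limit to expose the problem.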

> UINT32 Parquet columns are handled as signed INT32-s, silently reading 
> incorrect data
> -
>
> Key: HIVE-17843
> URL: https://issues.apache.org/jira/browse/HIVE-17843
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Ivanfi
>Assignee: Janaki Lahorani
>Priority: Major
>
> An unsigned 32 bit Parquet column, such as
> {noformat}
> optional int32 uint_32_col (UINT_32)
> {noformat}
> is read by Hive as if it were signed, leading to incorrect results.





[jira] [Commented] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

2018-02-09 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358900#comment-16358900
 ] 

Vihang Karajgaonkar commented on HIVE-17843:


[~zi] Can you give an example query that can lead to incorrect results? Thanks 
for reporting this.

> UINT32 Parquet columns are handled as signed INT32-s, silently reading 
> incorrect data
> -
>
> Key: HIVE-17843
> URL: https://issues.apache.org/jira/browse/HIVE-17843
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Ivanfi
>Priority: Major
>
> An unsigned 32 bit Parquet column, such as
> {noformat}
> optional int32 uint_32_col (UINT_32)
> {noformat}
> is read by Hive as if it were signed, leading to incorrect results.


