[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386762#comment-15386762 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-13708: -- [~ashutoshc] My .1 patch does not support non string column types with OpenCSVSerde. Rather it throws an error when non-string columns are used. The change for HIVE-13709 might be to replace the below code with the ones corresponding to the field type and make the corresponding changes everywhere else affected : {code} for (int i = 0; i < numCols; i++) { columnOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector); } {code} Thanks > Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > Attachments: HIVE-13708.1.patch, HIVE-13708.2.patch, > HIVE-13708.3.patch, HIVE-13708.4.patch > > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. > Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383315#comment-15383315 ] Ashutosh Chauhan commented on HIVE-13708: - [~hsubramaniyan] you may want to upload your .1 patch on HIVE-13709 which makes sense for that issue. > Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > Attachments: HIVE-13708.1.patch, HIVE-13708.2.patch, > HIVE-13708.3.patch, HIVE-13708.4.patch > > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. > Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316833#comment-15316833 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-13708: -- Retouching this jira after some test runs: 1. Making this support generic as in patch#4 is still not entirely correct as it still throws errors too aggressively in case of certain serdes where it is ok to bypass the datatype check. 2. I would be an ok for going ahead with patch#1, which is a fix specific to the serde. 3. Making the check generic will need to bypass some scenarios(for e.g. in patch#4, actualColumnTypes.size() and expectedColumnTypes.size() to be non-zero values) which effectively doesnt make the check generic and nullifies the overall purpose of this patch. I dont see a direct way to verify the datatypes supported by the serde since there was no such API to take care of this while the serde was originally designed. [~ashutoshc] Please let me know if I am missing something here. Thanks Hari > Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > Attachments: HIVE-13708.1.patch, HIVE-13708.2.patch, > HIVE-13708.3.patch, HIVE-13708.4.patch > > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. > Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15312049#comment-15312049 ] Hive QA commented on HIVE-13708: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12807290/HIVE-13708.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 65 failed/errored test(s), 10198 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestJdbcWithMiniHA - did not produce a TEST-*.xml file TestJdbcWithMiniMr - did not produce a TEST-*.xml file TestOperationLoggingAPIWithTez - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_change_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_cascade org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_admin_almighty2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_cli_nonsql org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_remove_cols org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_vs_table_metadata org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat16 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_stats org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_nonvec_mapwork_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_nonvec_mapwork_part_all_complex org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_nonvec_mapwork_part_all_primitive org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_vec_mapwork_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_vec_mapwork_part_all_complex org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_vec_mapwork_part_all_primitive org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_vecrow_mapwork_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_vecrow_mapwork_part_all_complex org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_vecrow_mapwork_part_all_primitive org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_serde_opencsv org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_stats org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_nonvec_mapwork_part org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_nonvec_mapwork_part_all_complex org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_nonvec_mapwork_part_all_primitive org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_vec_mapwork_part org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_vec_mapwork_part_all_complex org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_vec_mapwork_part_all_primitive org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_vecrow_mapwork_part org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_vecrow_mapwork_part_all_complex org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_vecrow_mapwork_part_all_primitive org.apache.hadoop.hive.ql.metadata.TestHive.testIndex org.apache.hadoop.hive.ql.metadata.TestHiveMetaStoreChecker.testDataDeletion org.apache.hadoop.hive.ql.metadata.TestHiveMetaStoreChecker.testPartitionsCheck org.apache.hadoop.hive.ql.metadata.TestHiveMetaStoreChecker.testTableCheck
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294941#comment-15294941 ] Hive QA commented on HIVE-13708: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12804836/HIVE-13708.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 855 failed/errored test(s), 9327 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-mapjoin_mapjoin.q-insert_into1.q-vector_decimal_2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-script_pipe.q-vector_decimal_aggregate.q-vector_data_types.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-tez_union_group_by.q-vector_auto_smb_mapjoin_14.q-union_fast_stats.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vectorized_parquet.q-insert_values_non_partitioned.q-schema_evol_orc_nonvec_mapwork_part.q-and-12-more - did not produce a TEST-*.xml file TestNegativeCliDriver-udf_invalid.q-nopart_insert.q-insert_into_with_schema.q-and-735-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_grouping_id2.q-vectorization_13.q-auto_sortmerge_join_13.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_project org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_exist org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_index org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_change_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_cascade org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_view_as_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_admin_almighty2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_cli_nonsql org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_explain org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_show_grant org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19_inclause org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_literal org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_case_sensitivity org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_const org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby_empty org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_input26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_limit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_auto_join0
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292163#comment-15292163 ] Hive QA commented on HIVE-13708: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12804836/HIVE-13708.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 962 failed/errored test(s), 10030 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-auto_join30.q-vector_decimal_10_0.q-acid_globallimit.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-tez_union_group_by.q-vector_auto_smb_mapjoin_14.q-union_fast_stats.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-update_orig_table.q-union2.q-bucket4.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_grouping_sets.q-update_all_partitioned.q-cte_5.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_project org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_exist org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_index org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_change_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_cascade org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_view_as_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_admin_almighty2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_cli_nonsql org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_explain org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_show_grant org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19_inclause org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_literal org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_case_sensitivity org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_const org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby_empty org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_input26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_limit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_auto_join0 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_gby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_gby_empty org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_join0
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283743#comment-15283743 ] Hive QA commented on HIVE-13708: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12803764/HIVE-13708.1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/276/testReport Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/276/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-276/ Messages: {noformat} This message was trimmed, see log for full details [INFO] Excluding org.apache.spark:spark-core_2.10:jar:1.6.0 from the shaded jar. [INFO] Excluding com.twitter:chill_2.10:jar:0.5.0 from the shaded jar. [INFO] Excluding com.twitter:chill-java:jar:0.5.0 from the shaded jar. [INFO] Excluding org.apache.xbean:xbean-asm5-shaded:jar:4.4 from the shaded jar. [INFO] Excluding org.apache.hadoop:hadoop-client:jar:2.6.0 from the shaded jar. [INFO] Excluding org.apache.hadoop:hadoop-mapreduce-client-app:jar:2.6.0 from the shaded jar. [INFO] Excluding org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:2.6.0 from the shaded jar. [INFO] Excluding org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:2.6.0 from the shaded jar. [INFO] Excluding org.apache.spark:spark-launcher_2.10:jar:1.6.0 from the shaded jar. [INFO] Excluding org.apache.spark:spark-network-common_2.10:jar:1.6.0 from the shaded jar. [INFO] Excluding org.apache.spark:spark-network-shuffle_2.10:jar:1.6.0 from the shaded jar. [INFO] Excluding org.apache.spark:spark-unsafe_2.10:jar:1.6.0 from the shaded jar. [INFO] Excluding org.slf4j:jul-to-slf4j:jar:1.7.10 from the shaded jar. [INFO] Excluding org.slf4j:jcl-over-slf4j:jar:1.7.10 from the shaded jar. [INFO] Excluding com.ning:compress-lzf:jar:1.0.3 from the shaded jar. [INFO] Excluding net.jpountz.lz4:lz4:jar:1.3.0 from the shaded jar. [INFO] Excluding com.typesafe.akka:akka-remote_2.10:jar:2.3.11 from the shaded jar. [INFO] Excluding com.typesafe.akka:akka-actor_2.10:jar:2.3.11 from the shaded jar. [INFO] Excluding com.typesafe:config:jar:1.2.1 from the shaded jar. [INFO] Excluding org.uncommons.maths:uncommons-maths:jar:1.2.2a from the shaded jar. [INFO] Excluding com.typesafe.akka:akka-slf4j_2.10:jar:2.3.11 from the shaded jar. [INFO] Excluding org.scala-lang:scala-library:jar:2.10.4 from the shaded jar. [INFO] Excluding org.json4s:json4s-jackson_2.10:jar:3.2.10 from the shaded jar. [INFO] Excluding org.json4s:json4s-core_2.10:jar:3.2.10 from the shaded jar. [INFO] Excluding org.json4s:json4s-ast_2.10:jar:3.2.10 from the shaded jar. [INFO] Excluding org.scala-lang:scalap:jar:2.10.0 from the shaded jar. [INFO] Excluding org.scala-lang:scala-compiler:jar:2.10.0 from the shaded jar. [INFO] Excluding org.apache.mesos:mesos:jar:shaded-protobuf:0.21.1 from the shaded jar. [INFO] Excluding com.clearspring.analytics:stream:jar:2.7.0 from the shaded jar. [INFO] Excluding io.dropwizard.metrics:metrics-graphite:jar:3.1.2 from the shaded jar. [INFO] Excluding com.fasterxml.jackson.module:jackson-module-scala_2.10:jar:2.4.4 from the shaded jar. [INFO] Excluding org.scala-lang:scala-reflect:jar:2.10.4 from the shaded jar. [INFO] Excluding oro:oro:jar:2.0.8 from the shaded jar. [INFO] Excluding org.tachyonproject:tachyon-client:jar:0.8.2 from the shaded jar. [INFO] Excluding org.tachyonproject:tachyon-underfs-hdfs:jar:0.8.2 from the shaded jar. [INFO] Excluding org.tachyonproject:tachyon-underfs-s3:jar:0.8.2 from the shaded jar. [INFO] Excluding org.tachyonproject:tachyon-underfs-local:jar:0.8.2 from the shaded jar. [INFO] Excluding net.razorvine:pyrolite:jar:4.9 from the shaded jar. [INFO] Excluding net.sf.py4j:py4j:jar:0.9 from the shaded jar. [INFO] Excluding org.spark-project.spark:unused:jar:1.0.0 from the shaded jar. [INFO] Excluding org.slf4j:slf4j-api:jar:1.7.10 from the shaded jar. [INFO] Replacing original artifact with shaded artifact. [INFO] Replacing /data/hive-ptest/working/apache-github-source-source/ql/target/hive-exec-2.1.0-SNAPSHOT.jar with /data/hive-ptest/working/apache-github-source-source/ql/target/hive-exec-2.1.0-SNAPSHOT-shaded.jar [INFO] Dependency-reduced POM written at: /data/hive-ptest/working/apache-github-source-source/ql/dependency-reduced-pom.xml [INFO] Dependency-reduced POM written at: /data/hive-ptest/working/apache-github-source-source/ql/dependency-reduced-pom.xml [INFO] Dependency-reduced POM written at: /data/hive-ptest/working/apache-github-source-source/ql/dependency-reduced-pom.xml [INFO] Dependency-reduced POM written at: /data/hive-ptest/working/apache-github-source-source/ql/dependency-reduced-pom.xml [INFO] Dependency-reduced POM written at:
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282303#comment-15282303 ] Thejas M Nair commented on HIVE-13708: -- This change to CSVSerde breaks backward compatbility for anyone who had a scripted create table command with a non string column. Those statements would fail now. If we consider CSVSerde in isolation, the best thing to do about it is to address HIVE-13709, ie support other types as supported by LazySimpleSerde. That would lead to correct results and also be backward compatible. Regarding the generic change applicable to any such serde - It is a difficult choice between allowing logically incorrect results and backward compatibility. I think if we also make the changes in HIVE-13709, only users who use custom serde with same limitations (but without error checks) and also use unsupported types for that serde would be affected. That set is likely to be very small. I would vote for making this incompatible change and fix the logical correctness issue. > Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > Attachments: HIVE-13708.1.patch > > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. > Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282291#comment-15282291 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-13708: -- [~thejas] I checked whether we could do this in a generic way. As you mentioned, we can perform a deep check of the object inspector after initialize() and see if the types will match the column type in the table definition. My concern here is if it is backward compatible or will it break things that used to work previously. If we haven't enforced this rule previously, how will we expect the custom serde developer henceforth to know that this is an enforced rule in Hive. Also, it looked cleaner to implement this check in the actual serde itself (like for e.g. RegexSerDe has done a similar check in initialize()) since it seems that it is the responsibility of the Serde to interpret the data correctly and not the query processor. Let me know your feedback. Thanks Hari > Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > Attachments: HIVE-13708.1.patch > > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. > Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282269#comment-15282269 ] Thejas M Nair commented on HIVE-13708: -- [~hsubramaniyan] Can we do this in a generic way so that it is applicable to any serde that doesn't support the types being specified in create-table ? There are possibly other user created serdes that could also have this issue. I haven't looked deeper into the object inspectors. How about checking the objectinspector after initialize ? It seems like the types it will return can be determined from that. cc [~ashutoshc] I didn't mean this to be specifically about CSVSerde, but more general about the hive serde interaction. The specific change to that serde that I would like to see is in HIVE-13709 . > Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > Attachments: HIVE-13708.1.patch > > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. > Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282266#comment-15282266 ] Ashutosh Chauhan commented on HIVE-13708: - Patch is addressing different issue than the description. Would you like to update description of jira to reflect your patch? > Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > Attachments: HIVE-13708.1.patch > > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. > Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282227#comment-15282227 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-13708: -- Couple of points : 1. The original document says : CREATE TABLE my_table(a string, b string, ...) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' I believe this supports strictly string columns and not anything else(not even variants like varchar). Please correct me if this is wrong. 2. The description for this jira says: CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with There is not much we can do for 'com.bizo.hive.serde.csv.CSVSerde' in Hive. I will upload a patch that will fix for 'org.apache.hadoop.hive.serde2.OpenCSVSerde'. Thanks Hari > Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. > Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)