[ https://issues.apache.org/jira/browse/HIVE-8207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chao updated HIVE-8207:
-----------------------
    Description: 
Now that multi-table insertion is committed to the branch, we should enable the related qtests.
Here is a list of qfiles that should be activated (some of them may already be activated).
The list may not be comprehensive.

{noformat}
add_part_multiple.q
auto_smb_mapjoin_14.q
bucket5.q
column_access_stats.q
date_udf.q
groupby10.q
groupby11.q
groupby3_map_multi_distinct.q
groupby3_map.q
groupby3_map_skew.q
groupby3_noskew_multi_distinct.q
groupby3_noskew.q
groupby7_map_multi_single_reducer.q
groupby7_map.q
groupby7_map_skew.q
groupby7_noskew_multi_single_reducer.q
groupby7_noskew.q
groupby7.q
groupby8_map.q
groupby8_map_skew.q
groupby8_noskew.q
groupby8.q
groupby9.q
groupby_complex_types_multi_single_reducer.q
groupby_complex_types.q
groupby_cube1.q
groupby_map_ppr_multi_distinct.q
groupby_map_ppr.q
groupby_multi_insert_common_distinct.q
groupby_multi_single_reducer2.q
groupby_multi_single_reducer3.q
groupby_multi_single_reducer.q
groupby_position.q
groupby_ppr.q
groupby_rollup1.q
groupby_sort_1_23.q
groupby_sort_1.q
groupby_sort_skew_1_23.q
infer_bucket_sort_multi_insert.q
innerjoin.q
input12_hadoop20.q
input12.q
input13.q
input14.q
input17.q
input18.q
input1_limit.q
input_part2.q
insert_into3.q
join_nullsafe.q
load_dyn_part8.q
metadata_only_queries_with_filters.q
multigroupby_singlemr.q
multi_insert_gby2.q
multi_insert_gby3.q
multi_insert_gby.q
multi_insert_lateral_view.q
multi_insert_move_tasks_share_dependencies.q
multi_insert.q
parallel.q
partition_date2.q
pcr.q
ppd_multi_insert.q
ppd_transform.q
smb_mapjoin_11.q
smb_mapjoin_12.q
smb_mapjoin_13.q
smb_mapjoin_15.q
smb_mapjoin_16.q
stats4.q
subquery_multiinsert.q
table_access_keys_stats.q
tez_dml.q
udaf_percentile_approx_20.q
udaf_percentile_approx_23.q
union17.q
union18.q
union19.q
{noformat}

Some tests cannot be enabled right now, for various reasons:

1. ForwardOperator issue, including:
{noformat}
groupby7_noskew_multi_single_reducer.q
groupby8_map.q
groupby8_map_skew.q
groupby8_noskew.q
groupby8.q
groupby9.q
groupby10.q
groupby_complex_types_multi_single_reducer.q
groupby_multi_insert_common_distinct.q
union17.q
{noformat}
*Reason*: currently, if the node to break in the operator tree is a ForwardOperator, we simply do nothing. However, we may have the following case:
{noformat}
...... FOR -> RS_0 -> RS_1
                  \-> RS_2
{noformat}
Here, {{RS_0}} leads to both {{RS_1}} and {{RS_2}}, and because of the issue in HIVE-7731 and HIVE-8118, both downstream branches will get duplicated results.

2. Stats issue, including:
{noformat}
bucket5.q
infer_bucket_sort_multi_insert.q
stats4.q
smb_mapjoin_13.q
smb_mapjoin_15.q
{noformat}
*Reason*: In these tests, I get diff errors because {{numRows}} and {{rawDataSize}} are -1, but they are expected to be positive values. I don't think this is related to multi-insertion.

3. Join/SMB Join issue, including:
{noformat}
auto_smb_mapjoin_14.q
auto_sortmerge_join_13.q
smb_mapjoin_11.q
smb_mapjoin_12.q
smb_mapjoin_13.q
smb_mapjoin_15.q
smb_mapjoin_16.q
{noformat}
*Reason*: These tests failed either with an exception or with a diff. I think it's because SMB Join (HIVE-8202) isn't supported right now.

4. Results don't match, including:
{noformat}
groupby3_map_skew.q
groupby_map_ppr_multi_distinct.q
groupby_map_ppr.q
partition_date2.q
udaf_percentile_approx_23.q
{noformat}
*Reason*: The results from these tests differ from MR's. For instance, the test for groupby3_map_skew.q failed because:
{noformat}
< 130091.0 260.182 256.10355987055016 98.0 0.0 142.92680950752379 143.06995106518903 20428.07288 20469.0109
---
> 130091.0 260.182 256.10355987055016 98.0 0.0 142.9268095075238 143.06995106518906 20428.07288 20469.0109
{noformat}
I don't know why this happens, but I think it may not be related to multi-insertion.
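The two sides of the groupby3_map_skew.q diff differ only in the last printed digits of two columns. One possible explanation, which this issue does not confirm, is that the two engines add the same floating-point values in a different order. The sketch below (Python, purely illustrative, not Hive code) shows that regrouping a sum can change the last digit, and that the values from the diff agree within a small relative tolerance. The names {{sums_in_two_orders}} and {{agree_within}} and the tolerance {{1e-12}} are assumptions made for this example.

```python
import math

# Values copied from the two sides of the groupby3_map_skew.q diff.
LEFT_VALUES = (142.92680950752379, 143.06995106518903)
RIGHT_VALUES = (142.9268095075238, 143.06995106518906)


def sums_in_two_orders(a, b, c):
    """Return the same three-term sum evaluated with two different groupings."""
    return (a + b) + c, a + (b + c)


def agree_within(left, right, rel_tol=1e-12):
    """Check that paired values match up to a relative tolerance.

    rel_tol is an assumed threshold chosen for this illustration.
    """
    return all(math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(left, right))


if __name__ == "__main__":
    first, second = sums_in_two_orders(0.1, 0.2, 0.3)
    # Floating-point addition is not associative, so the groupings disagree.
    print(first == second)  # False
    # The values from the diff are equal up to the assumed tolerance.
    print(agree_within(LEFT_VALUES, RIGHT_VALUES))  # True
```

If this is indeed the cause, the mismatch would come from evaluation order rather than from multi-insertion, which is consistent with the guess in the description.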
> Add .q tests for multi-table insertion [Spark Branch]
> -----------------------------------------------------
>
>                 Key: HIVE-8207
>                 URL: https://issues.apache.org/jira/browse/HIVE-8207
>             Project: Hive
>          Issue Type: Test
>          Components: Spark
>            Reporter: Chao
>            Assignee: Chao
>         Attachments: HIVE-8207.1-spark.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)