[ https://issues.apache.org/jira/browse/HIVE-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174641#comment-14174641 ]
Hive QA commented on HIVE-8202:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12675394/HIVE-8202.1-spark.patch

{color:red}ERROR:{color} -1 due to 130 failed/errored test(s), 6786 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_tez_smb_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_count
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ctas
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_escape_clusterby1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_escape_distributeby1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_escape_orderby1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_escape_sortby1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby3_map
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby3_map_multi_distinct
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby3_noskew
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby3_noskew_multi_distinct
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby7_map
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby7_map_multi_single_reducer
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby7_map_skew
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby7_noskew
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_cube1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_multi_single_reducer
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_multi_single_reducer2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_multi_single_reducer3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_position
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_ppr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_rollup1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_sort_1_23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_sort_skew_1_23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_having
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input17
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input18
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input1_limit
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_insert_into1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_insert_into2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_insert_into3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join0
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join18
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join20
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join21
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapreduce1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapreduce2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_merge1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_merge2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_metadata_only_queries
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_lateral_view
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_move_tasks_share_dependencies
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multigroupby_singlemr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_order
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parallel
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parallel_join0
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_transform
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_script_pipe
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sort
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_temp_table
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_transform_ppr1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_transform_ppr2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union11
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union18
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union19
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union25
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union28
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union30
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union33
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_ppr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_18
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_19
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_20
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_21
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_24
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_25
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_8
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_9
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_cast_constant
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_data_types
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_decimal_aggregate
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_left_outer_join
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_9
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_part_project
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_mapjoin
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_nested_mapjoin
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_shufflejoin
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_timestamp_funcs
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/226/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/226/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-226/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 130 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12675394

> Support SMB Join for Hive on Spark [Spark Branch]
> -------------------------------------------------
>
>                 Key: HIVE-8202
>                 URL: https://issues.apache.org/jira/browse/HIVE-8202
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Szehon Ho
>         Attachments: HIVE-8202.1-spark.patch, Hive on Spark SMB Join.docx,
> Hive on Spark SMB Join.pdf
>
>
> SMB joins are used wherever the tables are sorted and bucketed. It's a
> map-side join. The join boils down to just merging the already sorted
> tables, allowing this operation to be faster than an ordinary map-join.
> However, if the tables are partitioned, there could be a slowdown as each
> mapper would need to get a very small chunk of a partition which has a single
> key. Thus, in some scenarios it's beneficial to convert SMB join to SMB map
> join as well.
> The task is to research and support the conversion from regular SMB join to
> SMB map join for the Spark execution engine.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)