[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182669#comment-15182669 ]

Hive QA commented on HIVE-13096:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12791499/HIVE-13096.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9770 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7182/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7182/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7182/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with:
TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12791499 - PreCommit-HIVE-TRUNK-Build

> Cost to choose side table in MapJoin conversion based on cumulative cardinality
> ---
>
> Key: HIVE-13096
> URL: https://issues.apache.org/jira/browse/HIVE-13096
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 2.0.0, 2.1.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, HIVE-13096.03.patch, HIVE-13096.04.patch, HIVE-13096.patch
>
> HIVE-11954 changed the logic to choose the side table in the MapJoin conversion algorithm. The initial heuristic for the cost was based on the number of heavyweight operators.
> This extends that work so the heuristic is based on accumulated cardinality.
> In the future, we should choose the side based on total latency for the input.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15180164#comment-15180164 ] Ashutosh Chauhan commented on HIVE-13096: Ok, that makes sense. +1
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179951#comment-15179951 ]

Jesus Camacho Rodriguez commented on HIVE-13096:

[~ashutoshc], I could finally take another look at this one.

The heuristic change affects the selection of the table chosen for streaming, and it can also change the shape of the DAG, e.g. in the presence of GB + Join.

For instance, consider {{bucket_map_join_tez1.q}}.
- Previously, the shape was:
{noformat}
Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (CUSTOM_SIMPLE_EDGE)
{noformat}
Reducer 2 contains a GB on the input from Map 1 (TS on table1), followed by a Join. In this case, Map 3 (TS on table2) is broadcast for the Join, which is executed in Reducer 2.
- With the patch, the shape is:
{noformat}
Map 3 <- Reducer 2 (CUSTOM_EDGE)
Reducer 2 <- Map 1 (SIMPLE_EDGE)
{noformat}
Reducer 2 contains a GB on the input from Map 1 (TS on table1). In this case, the output of the GB is broadcast for the Join, which is executed in Map 3.
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174841#comment-15174841 ] Ashutosh Chauhan commented on HIVE-13096: This heuristic change was supposed to affect only the selection of the table chosen for streaming, not the shape of the Tez DAG. Is the change in DAG shape expected?
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169587#comment-15169587 ]

Hive QA commented on HIVE-13096:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789935/HIVE-13096.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9828 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7101/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7101/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7101/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789935 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167372#comment-15167372 ] Jesus Camacho Rodriguez commented on HIVE-13096: I just did. Thanks
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167363#comment-15167363 ] Ashutosh Chauhan commented on HIVE-13096: Can you create a RB entry?
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166465#comment-15166465 ]

Hive QA commented on HIVE-13096:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789388/HIVE-13096.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9790 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-schema_evol_text_nonvec_mapwork_table.q-orc_vectorization_ppd.q-vector_left_outer_join2.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7083/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7083/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7083/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789388 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160221#comment-15160221 ]

Jesus Camacho Rodriguez commented on HIVE-13096:

[~ashutoshc], thanks for checking.

In fact, I had missed that the {{getCumulativeCost}} method is overridden for Join operators (the override is part of _HiveRelMdDistinctRowCount_), thanks for catching that. However, the default {{getCumulativeCost}} is still applied to the rest of the operators (it is in _RelMdPercentageOriginalRows_). Hence, to mimic the CBO cumulative cardinality estimation, we should combine both.

I have uploaded a new patch with the updated method.
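
For illustration, "combining both" could look roughly like the Java sketch below. It is not the code from the patch: the `Node` class, the `rows` estimates and the `cumulative` method are hypothetical names introduced here, and the two rules (a join adds its own estimate plus only its largest input subtree; any other operator adds its own estimate plus all of its input subtrees) are one reading of the two {{getCumulativeCost}} variants mentioned in the comment above.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: Node is a hypothetical stand-in,
// not a Hive or Calcite class.
final class CombinedCardinality {

    static final class Node {
        final boolean isJoin;
        final long rows;          // assumed row-count estimate for this operator's output
        final List<Node> inputs;

        Node(boolean isJoin, long rows, Node... inputs) {
            this.isJoin = isJoin;
            this.rows = rows;
            this.inputs = Arrays.asList(inputs);
        }
    }

    // Joins add only their most expensive input subtree; every other
    // operator adds the subtrees of all of its inputs.
    static long cumulative(Node op) {
        long fromInputs = 0;
        for (Node input : op.inputs) {
            long c = cumulative(input);
            fromInputs = op.isJoin ? Math.max(fromInputs, c) : fromInputs + c;
        }
        return op.rows + fromInputs;
    }

    public static void main(String[] args) {
        Node scan1 = new Node(false, 1000);               // TS on table1 (made-up estimate)
        Node groupBy = new Node(false, 100, scan1);       // GB over table1
        Node scan2 = new Node(false, 500);                // TS on table2
        Node join = new Node(true, 200, groupBy, scan2);

        System.out.println(cumulative(groupBy)); // 100 + 1000 = 1100
        System.out.println(cumulative(join));    // 200 + max(1100, 500) = 1300
    }
}
```

With these made-up estimates, the GB branch accumulates 1100 rows against 500 for the plain scan, so the two join inputs can be compared on cumulative cardinality rather than on the number of heavyweight operators.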
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159948#comment-15159948 ]

Ashutosh Chauhan commented on HIVE-13096:

Instead of recursively adding the cardinality of the tree for each operator, I think the following heuristic might be better:
{code}
getCummCardinality (Operator op) {
  if (op.type == join) {
    cummCardinality += maxCardinality from all branches;
  } else {
    return cummCardinality;
  }
}
{code}
That is to say, the cardinality of any operator other than a join does not contribute to the cumulative cardinality. For a join, the max cardinality among its inputs contributes to the cumulative cardinality of the tree. This is akin to what we have on the logical side, where getCumulativeCost() is overridden only for join, in the manner suggested here.
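
The pseudocode above leaves some details open, so the following Java sketch is only one possible interpretation, not Hive code. It assumes that non-join operators pass their input's cumulative value through unchanged (a leaf yields 0) and that a join contributes the row count of its largest branch on top of that branch's own cumulative value. `Node`, `rows` and `cumulative` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: one reading of the join-only heuristic,
// using a hypothetical Node type rather than Hive's Operator classes.
final class JoinOnlyCardinality {

    static final class Node {
        final boolean isJoin;
        final long rows;          // assumed row-count estimate for this operator's output
        final List<Node> inputs;

        Node(boolean isJoin, long rows, Node... inputs) {
            this.isJoin = isJoin;
            this.rows = rows;
            this.inputs = Arrays.asList(inputs);
        }
    }

    // Only joins add anything: for each branch, a join counts the branch's
    // row count plus what that branch has accumulated, and keeps the maximum.
    // Non-join operators just forward the value of their input(s).
    static long cumulative(Node op) {
        long best = 0;
        for (Node input : op.inputs) {
            long branch = cumulative(input) + (op.isJoin ? input.rows : 0);
            best = Math.max(best, branch);
        }
        return best;
    }

    public static void main(String[] args) {
        Node scan1 = new Node(false, 1000);
        Node groupBy = new Node(false, 100, scan1);
        Node scan2 = new Node(false, 500);
        Node join1 = new Node(true, 200, groupBy, scan2);
        Node scan3 = new Node(false, 50);
        Node join2 = new Node(true, 80, join1, scan3);

        System.out.println(cumulative(groupBy)); // 0: no join below
        System.out.println(cumulative(join1));   // max(100 + 0, 500 + 0) = 500
        System.out.println(cumulative(join2));   // max(200 + 500, 50 + 0) = 700
    }
}
```

Under this reading, a scan or GB branch without joins accumulates nothing, and only the number and size of joins below an operator influence the result.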
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156800#comment-15156800 ] Jesus Camacho Rodriguez commented on HIVE-13096: [~ashutoshc], could you take a look? Thanks
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156471#comment-15156471 ]

Hive QA commented on HIVE-13096:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12788905/HIVE-13096.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9800 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7055/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7055/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7055/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12788905 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality
[ https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155661#comment-15155661 ]

Hive QA commented on HIVE-13096:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12788703/HIVE-13096.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 9815 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.TestTxnCommands2.testInitiatorWithMultipleFailedCompactions
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7039/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7039/console
Test logs:
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7039/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12788703 - PreCommit-HIVE-TRUNK-Build