[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-03-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182669#comment-15182669
 ] 

Hive QA commented on HIVE-13096:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12791499/HIVE-13096.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9770 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7182/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7182/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7182/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12791499 - PreCommit-HIVE-TRUNK-Build

> Cost to choose side table in MapJoin conversion based on cumulative cardinality
> ---
>
> Key: HIVE-13096
> URL: https://issues.apache.org/jira/browse/HIVE-13096
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 2.0.0, 2.1.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13096.01.patch, HIVE-13096.02.patch, HIVE-13096.03.patch, HIVE-13096.04.patch, HIVE-13096.patch
>
>
> HIVE-11954 changed the logic for choosing the side table in the MapJoin 
> conversion algorithm. The initial cost heuristic was based on the number of 
> heavyweight operators.
> This issue extends that work so that the heuristic is based on cumulative 
> cardinality. In the future, we should choose the side based on the total 
> latency for the input.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-03-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15180164#comment-15180164
 ] 

Ashutosh Chauhan commented on HIVE-13096:
-

Ok, that makes sense. +1



[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-03-04 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179951#comment-15179951
 ] 

Jesus Camacho Rodriguez commented on HIVE-13096:


[~ashutoshc], I was finally able to take another look at this one.

The heuristic change affects which table is chosen for streaming, and it can 
also change the shape of the DAG, e.g. in the presence of GB + Join.

For instance, consider {{bucket_map_join_tez1.q}}.

- Previously, the shape was:
{noformat}
Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (CUSTOM_SIMPLE_EDGE)
{noformat}
Reducer 2 contains a GB on the input from Map 1 (TS on table1), followed by a 
Join.
In this case, Map 3 (TS on table2) is broadcast for the Join, which is 
executed in Reducer 2.

- With the patch, the shape is:
{noformat}
Map 3 <- Reducer 2 (CUSTOM_EDGE)
Reducer 2 <- Map 1 (SIMPLE_EDGE)
{noformat}
Reducer 2 contains a GB on the input from Map 1 (TS on table1).
In this case, the output of the GB is broadcast for the Join, which is 
executed in Map 3.
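
To make the side selection concrete, here is a minimal Java sketch of picking the streamed input of a join from per-input cumulative cardinalities. It is an illustration under assumptions, not the code in the patch: {{JoinInput}}, {{StreamSideChooser}} and {{chooseStreamedInput}} are hypothetical names, the tie-break on data size is an assumption, and constraints such as whether the broadcast inputs fit in memory are ignored.

```java
import java.util.List;

/** Hypothetical summary of one join input; not a Hive class. */
final class JoinInput {
  final String name;
  final long cumulativeCardinality; // rows accumulated over the subtree feeding this input
  final long dataSize;              // estimated bytes; used here only to break ties (assumption)

  JoinInput(String name, long cumulativeCardinality, long dataSize) {
    this.name = name;
    this.cumulativeCardinality = cumulativeCardinality;
    this.dataSize = dataSize;
  }
}

final class StreamSideChooser {
  /**
   * Returns the index of the input to stream, or -1 for an empty list.
   * The remaining inputs would be the ones broadcast to the join.
   */
  static int chooseStreamedInput(List<JoinInput> inputs) {
    int best = -1;
    for (int i = 0; i < inputs.size(); i++) {
      if (best < 0 || isBetter(inputs.get(i), inputs.get(best))) {
        best = i;
      }
    }
    return best;
  }

  /** Larger cumulative cardinality wins; equal values fall back to larger data size. */
  private static boolean isBetter(JoinInput a, JoinInput b) {
    if (a.cumulativeCardinality != b.cumulativeCardinality) {
      return a.cumulativeCardinality > b.cumulativeCardinality;
    }
    return a.dataSize > b.dataSize;
  }
}
```

With two inputs whose cumulative cardinalities are 120 (the GB branch) and 500 (the plain scan), both made-up numbers, the sketch returns index 1, so the scan is streamed and the GB output is the side that gets broadcast.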




[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-03-01 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174841#comment-15174841
 ] 

Ashutosh Chauhan commented on HIVE-13096:
-

This heuristic change was supposed to affect only which table is chosen for 
streaming, not the shape of the Tez DAG. Is that expected?



[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169587#comment-15169587
 ] 

Hive QA commented on HIVE-13096:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789935/HIVE-13096.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9828 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7101/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7101/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7101/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789935 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-25 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167372#comment-15167372
 ] 

Jesus Camacho Rodriguez commented on HIVE-13096:


I just did. Thanks



[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167363#comment-15167363
 ] 

Ashutosh Chauhan commented on HIVE-13096:
-

Can you create an RB entry?



[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166465#comment-15166465
 ] 

Hive QA commented on HIVE-13096:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789388/HIVE-13096.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9790 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-schema_evol_text_nonvec_mapwork_table.q-orc_vectorization_ppd.q-vector_left_outer_join2.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7083/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7083/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7083/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789388 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-23 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160221#comment-15160221
 ] 

Jesus Camacho Rodriguez commented on HIVE-13096:


[~ashutoshc], thanks for checking.

In fact, I had missed that the {{getCumulativeCost}} method is overridden for 
Join operators (the override is part of _HiveRelMdDistinctRowCount_); thanks 
for catching that. However, the default {{getCumulativeCost}} still applies to 
the rest of the operators (it is in _RelMdPercentageOriginalRows_).

Hence, to mimic the CBO cumulative cardinality estimation, we should combine 
both. I have uploaded a new patch with the updated method.
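
As a rough illustration of what combining the two rules could look like, the Java sketch below applies a max-over-inputs rule at joins and a sum-over-inputs rule everywhere else, with each operator adding its own row count. It is a sketch under assumptions, not the patch: {{Op}} and {{CombinedCumulativeCardinality}} are hypothetical stand-ins for the real operator and metadata classes, and plain row counts are used in place of a cost object.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified plan operator; not Hive's Operator or Calcite's RelNode. */
final class Op {
  final boolean isJoin;
  final long rows;        // estimated output cardinality of this operator alone
  final List<Op> inputs;

  Op(boolean isJoin, long rows, Op... inputs) {
    this.isJoin = isJoin;
    this.rows = rows;
    this.inputs = Arrays.asList(inputs);
  }
}

final class CombinedCumulativeCardinality {
  /** Own rows plus the inputs' cumulative values: max over inputs for joins, sum otherwise. */
  static long of(Op op) {
    long fromInputs = 0L;
    for (Op input : op.inputs) {
      long inputCumulative = of(input);
      if (op.isJoin) {
        // Join rule: only the most expensive branch counts.
        fromInputs = Math.max(fromInputs, inputCumulative);
      } else {
        // Default rule: every input's cumulative value is added.
        fromInputs += inputCumulative;
      }
    }
    return op.rows + fromInputs;
  }
}
```

For example, a join producing 200 rows over two scans of 100 and 300 rows evaluates to 200 + max(100, 300) = 500, and a group-by producing 50 rows on top of that join evaluates to 550. The numbers are made up for illustration.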


[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159948#comment-15159948
 ] 

Ashutosh Chauhan commented on HIVE-13096:
-

Instead of recursively adding the cardinality of the tree for each operator, I 
think the following heuristic might be better:

{code}
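getCumulativeCardinality(Operator op) {
  if (op.type == JOIN) {
    // A join adds the largest cardinality among its input branches.
    cumulativeCardinality += max cardinality over all branches;
  }
  // Any other operator contributes nothing.
  return cumulativeCardinality;
}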
{code}

That is to say, the cardinality of any operator other than a join does not 
contribute to the cumulative cardinality. For a join, the maximum cardinality 
among its inputs contributes to the cumulative cardinality of the tree. This 
is akin to what we have on the logical side, where getCumulativeCost() is 
overridden only for join, in the manner suggested here.
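
The pseudocode above leaves some details open, so here is one possible reading written out as self-contained Java. The assumptions are: {{Node}} and {{JoinOnlyCardinality}} are hypothetical names rather than Hive classes, a join branch is valued as its own output rows plus whatever joins below it already accumulated, and a non-join operator simply passes through the value of its inputs.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified operator node; not Hive's Operator class. */
final class Node {
  final boolean isJoin;
  final long rows;          // estimated output cardinality of this operator
  final List<Node> inputs;

  Node(boolean isJoin, long rows, Node... inputs) {
    this.isJoin = isJoin;
    this.rows = rows;
    this.inputs = Arrays.asList(inputs);
  }
}

final class JoinOnlyCardinality {
  /** Only joins contribute; each join keeps the largest value among its branches. */
  static long of(Node op) {
    long result = 0L;
    for (Node input : op.inputs) {
      long below = of(input);
      if (op.isJoin) {
        // Branch value = rows entering the join from this branch + joins below it.
        result = Math.max(result, below + input.rows);
      } else {
        // Non-join operators add nothing of their own.
        result += below;
      }
    }
    return result;
  }
}
```

Under this reading, a join over scans of 100 and 300 rows evaluates to 300, a filter on top of that join still evaluates to 300, and a plan without any join evaluates to 0. The numbers are illustrative only.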


[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-22 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156800#comment-15156800
 ] 

Jesus Camacho Rodriguez commented on HIVE-13096:


[~ashutoshc], could you take a look? Thanks


[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156471#comment-15156471
 ] 

Hive QA commented on HIVE-13096:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12788905/HIVE-13096.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9800 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7055/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7055/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7055/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12788905 - PreCommit-HIVE-TRUNK-Build


[jira] [Commented] (HIVE-13096) Cost to choose side table in MapJoin conversion based on cumulative cardinality

2016-02-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155661#comment-15155661
 ] 

Hive QA commented on HIVE-13096:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12788703/HIVE-13096.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 9815 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.TestTxnCommands2.testInitiatorWithMultipleFailedCompactions
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7039/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7039/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7039/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12788703 - PreCommit-HIVE-TRUNK-Build
