[ https://issues.apache.org/jira/browse/HIVE-12491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035613#comment-15035613 ]
Hive QA commented on HIVE-12491:
--------------------------------
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12775220/HIVE-12491.5.patch
{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 9869 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_llap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union17
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_multiinsert
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union17
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}
Test results:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6195/testReport
Console output:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6195/console
Test logs:
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6195/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12775220 - PreCommit-HIVE-TRUNK-Build
> Column Statistics: 3 attribute join on a 2-source table is off
> --------------------------------------------------------------
>
> Key: HIVE-12491
> URL: https://issues.apache.org/jira/browse/HIVE-12491
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Affects Versions: 1.3.0, 2.0.0
> Reporter: Gopal V
> Assignee: Ashutosh Chauhan
> Attachments: HIVE-12491.2.patch, HIVE-12491.3.patch,
> HIVE-12491.4.patch, HIVE-12491.5.patch, HIVE-12491.WIP.patch, HIVE-12491.patch
>
>
> The eased out denominator has to detect duplicate row-stats from different
> attributes.
> {code}
> select account_id from customers c, customer_activation ca
> where c.customer_id = ca.customer_id
> and year(ca.dt) = year(c.dt) and month(ca.dt) = month(c.dt)
> and year(ca.dt) between year('2013-12-26') and year('2013-12-26')
> {code}
> {code}
> private Long getEasedOutDenominator(List<Long> distinctVals) {
>   // Exponential back-off for NDVs.
>   // 1) Descending order sort of NDVs
>   // 2) denominator = NDV1 * (NDV2 ^ (1/2)) * (NDV3 ^ (1/4)) * ....
>   Collections.sort(distinctVals, Collections.reverseOrder());
>   long denom = distinctVals.get(0);
>   for (int i = 1; i < distinctVals.size(); i++) {
>     denom = (long) (denom * Math.pow(distinctVals.get(i), 1.0 / (1 << i)));
>   }
>   return denom;
> }
> {code}
> This gets {{[8007986, 821974390, 821974390]}}, which is actually three columns,
> two of which are derived from the same column.
> {code}
> Reduce Output Operator (RS_12)
>   key expressions: _col0 (type: bigint), year(_col2) (type: int), month(_col2) (type: int)
>   sort order: +++
>   Map-reduce partition columns: _col0 (type: bigint), year(_col2) (type: int), month(_col2) (type: int)
>   value expressions: _col1 (type: bigint)
> Join Operator (JOIN_13)
>   condition map:
>     Inner Join 0 to 1
>   keys:
>     0 _col0 (type: bigint), year(_col1) (type: int), month(_col1) (type: int)
>     1 _col0 (type: bigint), year(_col2) (type: int), month(_col2) (type: int)
>   outputColumnNames: _col3
> So the eased out denominator is off by a factor of 30,000 or so, causing OOMs
> in map-joins.
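
To make the arithmetic in the description concrete, here is a minimal standalone sketch that applies the same exponential back-off to the quoted NDV list, once as given and once after dropping repeated NDV values. The class name, the `easedOut` helper, and the `dropDuplicateNdvs` step are hypothetical and exist only for illustration; deduplicating by equal NDV value is an assumption and not necessarily how the attached patches detect columns derived from the same source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

class EasedOutDenominatorSketch {

    // Same back-off as the quoted method, but computed on a copy and in
    // double precision so the intermediate products are not truncated.
    static double easedOut(List<Long> distinctVals) {
        List<Long> sorted = new ArrayList<>(distinctVals);
        Collections.sort(sorted, Collections.reverseOrder());
        double denom = sorted.get(0);
        for (int i = 1; i < sorted.size(); i++) {
            denom *= Math.pow(sorted.get(i), 1.0 / (1 << i));
        }
        return denom;
    }

    // Hypothetical heuristic (an assumption for this sketch): treat equal
    // NDVs as coming from the same underlying column and keep one of each.
    static List<Long> dropDuplicateNdvs(List<Long> distinctVals) {
        return new ArrayList<>(new LinkedHashSet<>(distinctVals));
    }

    public static void main(String[] args) {
        List<Long> ndvs = Arrays.asList(8007986L, 821974390L, 821974390L);
        double raw = easedOut(ndvs);
        double deduped = easedOut(dropDuplicateNdvs(ndvs));
        System.out.printf("raw=%.3e deduped=%.3e ratio=%.1f%n",
                raw, deduped, raw / deduped);
    }
}
```

In the raw list the repeated 821974390 lands in the second position after sorting, so it contributes its square root (roughly 28,670) as an extra factor, which is in line with the "30,000 or so" figure above. How large the overall error is depends on what the corrected list looks like, so the ratio printed by this sketch should be read as one possible comparison rather than the exact error in the plan.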