[ https://issues.apache.org/jira/browse/HIVE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092577#comment-16092577 ]
Hive QA commented on HIVE-17114:
--------------------------------
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12877923/HIVE-17114.1.patch
{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 43 failed/errored test(s), 11073 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[constprog_semijoin] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_2] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_use_op_stats] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_use_ts_stats_for_mapjoin] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=233)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby3] (batchId=102)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join_filter] (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_10] (batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_13] (batchId=137)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_15] (batchId=136)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_16] (batchId=131)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_7] (batchId=101)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_8] (batchId=121)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_9] (batchId=131)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_cast_constant] (batchId=104)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_count_distinct] (batchId=111)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_decimal_aggregate] (batchId=108)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_distinct_2] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_groupby_3] (batchId=127)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_mapjoin_reduce] (batchId=134)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_orderby_5] (batchId=118)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_string_concat] (batchId=114)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_12] (batchId=104)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_13] (batchId=121)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_14] (batchId=106)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_15] (batchId=127)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_16] (batchId=118)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_9] (batchId=100)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_short_regress] (batchId=121)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_ptf] (batchId=127)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_shufflejoin] (batchId=131)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178)
{noformat}
Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6082/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6082/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6082/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 43 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12877923 - PreCommit-HIVE-Build
> HoS: Possible skew in shuffling when data is not really skewed
> --------------------------------------------------------------
>
> Key: HIVE-17114
> URL: https://issues.apache.org/jira/browse/HIVE-17114
> Project: Hive
> Issue Type: Bug
> Reporter: Rui Li
> Assignee: Rui Li
> Priority: Minor
> Attachments: HIVE-17114.1.patch
>
>
> Observed in HoS and may apply to other engines as well.
> When we join 2 tables on a single int key, we use the key itself as the hash
> code in {{ObjectInspectorUtils.hashCode}}:
> {code}
>   case INT:
>     return ((IntObjectInspector) poi).get(o);
> {code}
> Suppose the keys are distinct but are all multiples of 10. If we then choose
> 10 as the number of reducers, the shuffle will be skewed.
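The skew described above can be reproduced with a small standalone sketch. It assumes the reducer is chosen as `(hashCode & Integer.MAX_VALUE) % numReducers`, which is a common hash-partitioning scheme but is not stated in the issue; the class and method names (`ShuffleSkewDemo`, `partition`, `countPerReducer`) are hypothetical and are not Hive APIs.

```java
import java.util.Arrays;

public class ShuffleSkewDemo {

    // Assumed partitioning rule: mask the sign bit, then take the remainder.
    // This is an illustration, not the actual Hive or Spark code path.
    static int partition(int hashCode, int numReducers) {
        return (hashCode & Integer.MAX_VALUE) % numReducers;
    }

    // Counts how many keys land on each reducer when the int key itself
    // is used as the hash code, as in the INT case quoted above.
    static int[] countPerReducer(int[] keys, int numReducers) {
        int[] counts = new int[numReducers];
        for (int key : keys) {
            counts[partition(key, numReducers)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 100 distinct keys, all multiples of 10: 0, 10, 20, ..., 990.
        int[] keys = new int[100];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = i * 10;
        }
        // With 10 reducers every key maps to reducer 0.
        System.out.println(Arrays.toString(countPerReducer(keys, 10)));
        // With a reducer count that shares no factor with 10, the keys spread out.
        System.out.println(Arrays.toString(countPerReducer(keys, 7)));
    }
}
```

Under these assumptions, the first line printed shows all 100 keys on reducer 0, even though the keys themselves are distinct and evenly spaced.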
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)