[jira] [Created] (HIVE-23685) Removing user's extra resources when executing File Merge Task
Qiang.Kang created HIVE-23685:

Summary: Removing user's extra resources when executing File Merge Task
Key: HIVE-23685
URL: https://issues.apache.org/jira/browse/HIVE-23685
Project: Hive
Issue Type: Bug
Components: Physical Optimizer, Query Planning
Reporter: Qiang.Kang
Assignee: Qiang.Kang

Hi, we found that the map containers of MapReduce file merge jobs download the user's extra resources (such as added jars, files, and archives) before launching the task. When these resources are large or the network is busy, file merge jobs time out, causing the query to fail. A file merge task runs correctly with just hive-exec.jar and the MapReduce framework, so there is no need to download the user's resources. The patch below prevents setting `tmpjars` for the FileMerge task.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Github PR Pre Commit Build Error
Yeah, I also saw it a few days ago. I've already increased it, but it needs a Jenkins pod restart, so I'll do it over the weekend when nothing is running.

https://github.com/kgyrtkirk/hive-test-kube/blob/ae4bc1567051630f642d8c4c791f0fcb7ae38eef/htk-jenkins/entrypoint#L7

On 6/12/20 3:56 PM, David Mollitor wrote:

Hey Zoltan,

A build just failed with:

Timed out waiting for websocket connection. You should increase the value of system property org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator.websocketConnectionTimeout currently set at 60 seconds

http://130.211.9.232/blue/organizations/jenkins/hive-precommit/detail/PR-1082/5/pipeline/94

Not sure if this needs to be increased.

Thanks.
Github PR Pre Commit Build Error
Hey Zoltan,

A build just failed with:

Timed out waiting for websocket connection. You should increase the value of system property org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator.websocketConnectionTimeout currently set at 60 seconds

http://130.211.9.232/blue/organizations/jenkins/hive-precommit/detail/PR-1082/5/pipeline/94

Not sure if this needs to be increased.

Thanks.
[jira] [Created] (HIVE-23684) Large underestimation in NDV stats when input and join cardinality ratio is big
Stamatis Zampetakis created HIVE-23684:

Summary: Large underestimation in NDV stats when input and join cardinality ratio is big
Key: HIVE-23684
URL: https://issues.apache.org/jira/browse/HIVE-23684
Project: Hive
Issue Type: Bug
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis

Large underestimations of NDV values may occur after a join operation, since the current logic decreases the original NDV values proportionally.

The [code|https://github.com/apache/hive/blob/1271d08a3c51c021fa710449f8748b8cdb12b70f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2558] compares the number of rows of each relation before the join with the number of rows after the join and extracts a ratio for each side. Based on this ratio it adapts (reduces) the NDV accordingly.

Consider for instance the following query:

{code:sql}
select inv_warehouse_sk
     , inv_item_sk
     , stddev_samp(inv_quantity_on_hand) stdev
     , avg(inv_quantity_on_hand) mean
from inventory
   , date_dim
where inv_date_sk = d_date_sk
  and d_year = 1999
  and d_moy = 2
group by inv_warehouse_sk, inv_item_sk;
{code}

For the sake of the discussion, I outline below some relevant stats (from TPCDS30tb):

T(inventory) = 1627857000
T(date_dim) = 73049
T(inventory JOIN date_dim[d_year=1999 AND d_moy=2]) = 24948000
V(inventory, inv_date_sk) = 261
V(inventory, inv_item_sk) = 42
V(inventory, inv_warehouse_sk) = 27
V(date_dim, d_date_sk) = 73049

In this query the join between inventory and date_dim has ~24M rows while inventory has ~1.6B, so the NDVs of the columns coming from inventory are reduced by a factor of ~65 (1627857000 / 24948000). We end up with V(JOIN, inv_item_sk) = ~6K, while the real value is 231000.
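To make the proportional reduction described above concrete, here is a minimal sketch in Python. It is not the Hive implementation: the function name `scale_ndv_proportionally`, the ceiling rounding, and the cap at the original NDV are assumptions made for illustration. Only the row counts and the two NDV values come from the stats listed in the report.

```python
def scale_ndv_proportionally(ndv, input_rows, join_rows):
    """Shrink a column's NDV by the ratio join_rows / input_rows.

    Hypothetical helper illustrating proportional scaling; integer
    ceiling division avoids floating-point rounding surprises.
    """
    if input_rows <= 0:
        return ndv
    scaled = (ndv * join_rows + input_rows - 1) // input_rows  # ceil(ndv * ratio)
    # Never grow the NDV and never drop below one distinct value.
    return max(1, min(ndv, scaled))


# Row counts taken from the stats listed in the report.
inventory_rows = 1_627_857_000
join_rows = 24_948_000

# inventory_rows / join_rows is 65.25, so each NDV shrinks about 65x.
print(scale_ndv_proportionally(27, inventory_rows, join_rows))   # inv_warehouse_sk
print(scale_ndv_proportionally(261, inventory_rows, join_rows))  # inv_date_sk
```

Under these assumptions the 27 warehouses collapse to an estimate of 1. The date filter does not remove any warehouses, which shows how scaling every column by the same row ratio underestimates columns that are unrelated to the join key.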
[jira] [Created] (HIVE-23683) Add queue time to compaction
Peter Vary created HIVE-23683:

Summary: Add queue time to compaction
Key: HIVE-23683
URL: https://issues.apache.org/jira/browse/HIVE-23683
Project: Hive
Issue Type: Improvement
Components: Transactions
Reporter: Peter Vary
Assignee: Peter Vary

It would be good to report to the user when the transaction is initiated. This info can be used when considering the health status of the compaction system.
[jira] [Created] (HIVE-23682) TestMetrics is flaky
Zoltan Haindrich created HIVE-23682:

Summary: TestMetrics is flaky
Key: HIVE-23682
URL: https://issues.apache.org/jira/browse/HIVE-23682
Project: Hive
Issue Type: Sub-task
Reporter: Zoltan Haindrich

http://34.66.156.144:8080/job/hive-precommit/job/master/31/testReport/
[jira] [Created] (HIVE-23681) TestTriggersMoveWorkloadManager is unstable
Zoltan Haindrich created HIVE-23681:

Summary: TestTriggersMoveWorkloadManager is unstable
Key: HIVE-23681
URL: https://issues.apache.org/jira/browse/HIVE-23681
Project: Hive
Issue Type: Sub-task
Reporter: Zoltan Haindrich

http://34.66.156.144:8080/job/hive-precommit/job/master/37/testReport/
[jira] [Created] (HIVE-23680) TestDbNotificationListener is unstable
Zoltan Haindrich created HIVE-23680:

Summary: TestDbNotificationListener is unstable
Key: HIVE-23680
URL: https://issues.apache.org/jira/browse/HIVE-23680
Project: Hive
Issue Type: Sub-task
Reporter: Zoltan Haindrich

http://34.66.156.144:8080/job/hive-precommit/job/master/35/testReport/
http://130.211.9.232/job/hive-flaky-check/24/
[jira] [Created] (HIVE-23679) TestSparkClient is flaky
Zoltan Haindrich created HIVE-23679:

Summary: TestSparkClient is flaky
Key: HIVE-23679
URL: https://issues.apache.org/jira/browse/HIVE-23679
Project: Hive
Issue Type: Sub-task
Reporter: Zoltan Haindrich

http://130.211.9.232/job/hive-precommit/job/master/34/testReport/
[jira] [Created] (HIVE-23678) Don't enforce ASF license headers on target files
Karen Coppage created HIVE-23678:

Summary: Don't enforce ASF license headers on target files
Key: HIVE-23678
URL: https://issues.apache.org/jira/browse/HIVE-23678
Project: Hive
Issue Type: Bug
Reporter: Karen Coppage
Assignee: Karen Coppage