[jira] [Commented] (TEZ-3168) Provide a more predictable approach for total resource guidance for wave/split calculation
[ https://issues.apache.org/jira/browse/TEZ-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324067#comment-15324067 ] TezQA commented on TEZ-3168: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12809381/TEZ-3168.wip.3.patch against master revision 8985969. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.common.TestTezYARNUtils Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1787//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1787//console This message is automatically generated. > Provide a more predictable approach for total resource guidance for > wave/split calculation > --- > > Key: TEZ-3168 > URL: https://issues.apache.org/jira/browse/TEZ-3168 > Project: Apache Tez > Issue Type: Bug >Reporter: Hitesh Shah >Assignee: Hitesh Shah > Attachments: TEZ-3168.wip.2.patch, TEZ-3168.wip.3.patch, > TEZ-3168.wip.patch > > > Currently, Tez uses headroom for checking total available resources. This is > flaky as it ends up causing the split count to be determined by a point in > time lookup at what is available in the cluster. A better approach would be > either the queue size or even cluster size to get a more predictable count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-3168) Provide a more predictable approach for total resource guidance for wave/split calculation
[ https://issues.apache.org/jira/browse/TEZ-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198476#comment-15198476 ] TezQA commented on TEZ-3168: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12793832/TEZ-3168.wip.patch against master revision 42b61f4. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 34 javac compiler warnings (more than the master's current 33 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.common.TestTezYARNUtils Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1566//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1566//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1566//console This message is automatically generated.
[jira] [Commented] (TEZ-3168) Provide a more predictable approach for total resource guidance for wave/split calculation
[ https://issues.apache.org/jira/browse/TEZ-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200952#comment-15200952 ] TezQA commented on TEZ-3168: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12794087/TEZ-3168.wip.2.patch against master revision 44c660a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.test.TestFaultTolerance org.apache.tez.dag.app.rm.TestContainerReuse org.apache.tez.common.TestTezYARNUtils The following test timeouts occurred in : org.apache.tez.dag.app.dag.impl.TestCommit Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1576//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1576//console This message is automatically generated.
[jira] [Commented] (TEZ-3168) Provide a more predictable approach for total resource guidance for wave/split calculation
[ https://issues.apache.org/jira/browse/TEZ-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200446#comment-15200446 ] Siddharth Seth commented on TEZ-3168:

bq. Looking at the queue capacity could be very wrong in cases where the user limits only allow the user a tiny fraction of the queue. The Tez AM will think it has access to a lot more than it really does.

Do you know if headroom factors in the user limits?

The additional options are definitely better. One of the main problems right now is that on a busy cluster, an app may end up thinking it has very little capacity available, thus generating large splits. Even if a job were to complete, the additional capacity will not be used. We've seen scenarios where it's better to kill and restart such jobs so that they take up additional capacity. Queue capacity, in that respect, would be consistent and allow for capacity utilization. However, it has the downside of a large number of waves.

Comments on the patch:
- getAdditionalTokens - not used yet. Assuming this will be used in YarnClient at some point?
- Getting the RM delegation token, renewer etc. I don't think YARN has a public library to figure this out - that would be useful. In case of the YARN delegation token, I'm not sure why the API even exposes a renewer. This may need some changes to account for HA - that differs in the MapReduce getDelegationToken call.
- tez.am.total.resource.calculator - rename to something like tez.am.total.resource.reporting.mechanism? (calculator sounds like a plugin/class)
- There's a mismatch in the default between the documentation (headroom) and constant (cluster).
- getMaxAvailableResources - Is this being deleted? Will be an incompatible change. If so, could you please defer it to a separate jira which can be committed just before the next 0.8.3 release.
- RMDelegationTokenIdentifier.KIND_NAME - would this token end up being part of the dag credentials, after it is fetched by the AM?
- TaskSchedulerService - headroom = Resources.add(allocatedResources, getAvailableResources()) - this changes behaviour to some extent. However, I don't think it matters since the old code would set this value only once, before any containers have been allocated (allocatedResources = 0).
- Resources are updated only once at startup, and on the first invocation of getResources. Given that InputInitializers within a DAG can run at different times, and multiple DAGs can run in an AM, I think it's better to update these values more often, e.g. on a nodeReport change for cluster and queue resources, on dagComplete in general, and on a timed interval for queue and headroom.
- We could move YarnClient creation into the shim itself - managing its lifetime becomes problematic though.
- Timeout missing on the new test. Also not sure what it's doing by checking the DEFAULT constant against all possible values. Is that to future-proof the test?
- Nit: Unused import in TezYarnClient
- Nit: Avoid config lookup - TotalResourceCalculatorType.lookup(conf.get ...
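As an illustration of two of the review points above (resolving the configured mechanism once rather than on every call, and computing the headroom-based total as allocated plus available), here is a minimal Java sketch. It is not the patch's code: ResourceSource, TotalResourceSketch and the method names are hypothetical stand-ins, and only the three options (headroom, queue, cluster) and the allocated-plus-available sum come from the thread.

```java
/** Hypothetical stand-in for the mechanism chosen via tez.am.total.resource.calculator. */
enum ResourceSource {
    HEADROOM, QUEUE, CLUSTER;

    /** Parses a config string; null or unknown values fall back to the given default. */
    static ResourceSource lookup(String value, ResourceSource fallback) {
        if (value == null) {
            return fallback;
        }
        try {
            return valueOf(value.trim().toUpperCase(java.util.Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }
}

/** Hypothetical helper; memory is in MB and all names are illustrative only. */
final class TotalResourceSketch {
    private TotalResourceSketch() {
    }

    /**
     * Returns the total the AM would report for the already-resolved source.
     * In headroom mode the total is what is allocated plus what is still available.
     */
    static long totalMemoryMb(ResourceSource source, long allocatedMb, long availableMb,
                              long queueMb, long clusterMb) {
        switch (source) {
            case QUEUE:
                return queueMb;
            case CLUSTER:
                return clusterMb;
            default:
                return allocatedMb + availableMb;
        }
    }
}
```

A caller would resolve the source once at startup, for example ResourceSource.lookup(configuredValue, ResourceSource.HEADROOM), keep the result in a field, and pass it to totalMemoryMb whenever fresh numbers arrive.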
[jira] [Commented] (TEZ-3168) Provide a more predictable approach for total resource guidance for wave/split calculation
[ https://issues.apache.org/jira/browse/TEZ-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200619#comment-15200619 ] Bikas Saha commented on TEZ-3168: Given all of the problems with queue capacity, IMO cluster capacity is a more stable metric to look at. Logically, the data is distributed across the cluster, so it makes sense to account for that dispersion while calculating splits. This also solves the current immediate problem of creating too small splits. Essentially the job wants to run tasks across all cluster nodes. The queue capacity determines how the job gets waves/windows of tasks that move around the cluster to read that data locally.
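The relationship described in the comment above can be made concrete with a little arithmetic: size the split count from the cluster, then let the queue decide how many tasks run per wave. The sketch below is only an illustration under simplifying assumptions (memory as the single resource, uniform task size); WaveEstimate and its methods are hypothetical names, not Tez APIs.

```java
/** Hypothetical arithmetic helper; memory in MB is the only resource modelled. */
final class WaveEstimate {
    private WaveEstimate() {
    }

    /** Tasks that fit into the given capacity at once (integer division, at least 1). */
    static int parallelTasks(long capacityMb, long taskMb) {
        return (int) Math.max(1L, capacityMb / taskMb);
    }

    /** Split count sized so that one task per split could cover the whole cluster. */
    static int splitsForCluster(long clusterMb, long taskMb) {
        return parallelTasks(clusterMb, taskMb);
    }

    /** Waves needed when only queueMb can be used at any one time (ceiling division). */
    static int waves(int splits, long queueMb, long taskMb) {
        int perWave = parallelTasks(queueMb, taskMb);
        return (splits + perWave - 1) / perWave;
    }
}
```

For example, with a 409600 MB cluster, a 102400 MB queue and 2048 MB tasks, splitsForCluster gives 200 splits and waves(200, 102400, 2048) gives 4 waves of 50 tasks each. The split count stays the same however busy the cluster is at submission time; only the number of waves depends on the queue.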
[jira] [Commented] (TEZ-3168) Provide a more predictable approach for total resource guidance for wave/split calculation
[ https://issues.apache.org/jira/browse/TEZ-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199712#comment-15199712 ] Hitesh Shah commented on TEZ-3168: Should have clarified - still need to test and verify on a secure cluster hence the wip tag. For the most part, the approach and most of the code should likely remain unchanged.
[jira] [Commented] (TEZ-3168) Provide a more predictable approach for total resource guidance for wave/split calculation
[ https://issues.apache.org/jira/browse/TEZ-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201554#comment-15201554 ] Jason Lowe commented on TEZ-3168:

Forgot to mention that node labels could also mess with this in a significant way -- reported capacity of a queue may be completely misrepresented when node labels are partitioning the cluster and restricting what the app can access in a particular queue.

I can see the case where we want to over-split the data if we can run far more tasks in parallel than splits. With container reuse to mitigate a large portion of the per-task overhead, it should be better to overestimate rather than underestimate what we're capable of running. However, if we excessively overestimate what we can run simultaneously, it will amplify the per-task overhead. We already see many detrimental effects of over-partitioning due to the pressure it puts on the AM to manage that many tasks and events and the extra overhead in the shuffle and merge for each task. Underestimating the number of splits is definitely a concern, but excessive overestimating could be really bad as well, ultimately destroying the AM if its heap can't accommodate.

I suppose it's no worse than today given it keeps the same behavior by default. It's a user- or admin-driven decision to choose a different scheduling heuristic, and they would need to be aware of the cluster setup assumptions those heuristics are making.
[jira] [Commented] (TEZ-3168) Provide a more predictable approach for total resource guidance for wave/split calculation
[ https://issues.apache.org/jira/browse/TEZ-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198088#comment-15198088 ] Hitesh Shah commented on TEZ-3168: [~vinodkv] [~jianhe] Mind taking a look at this patch to see whether we are using the APIs for queue and cluster resources correctly. Also, if we are accidentally using any private APIs, let us know.
[jira] [Commented] (TEZ-3168) Provide a more predictable approach for total resource guidance for wave/split calculation
[ https://issues.apache.org/jira/browse/TEZ-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198874#comment-15198874 ] Siddharth Seth commented on TEZ-3168: Is the patch ready for review (the name says wip) ? Quickly scanned through it, and I think it looks good for the most part. Will look some more if this is ready to go in.
[jira] [Commented] (TEZ-3168) Provide a more predictable approach for total resource guidance for wave/split calculation
[ https://issues.apache.org/jira/browse/TEZ-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200257#comment-15200257 ] Jason Lowe commented on TEZ-3168:

Couple of points after a quick review:

Looking at the queue capacity could be very wrong in cases where the user limits only allow the user a tiny fraction of the queue. The Tez AM will think it has access to a lot more than it really does.

Apps can be moved between queues, so if someone moved the Tez AM from one queue to another it could be looking at the wrong queue when it makes decisions.

It's unfortunate that the root queue metrics aren't conveyed in the metrics returned from getYarnClusterMetrics. They are tracked in the RM ClusterMetrics but for some reason not conveyed to the client. That would be cheaper for both the RM and the Tez AM since both could avoid looping over root-level queues. But that would couple this with a pending YARN change.
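To see how far off a queue-based estimate can be when user limits apply, the following Java sketch compares the queue's capacity with what a single user could actually obtain. It assumes a deliberately simplified model in which the user is capped at a fixed percentage of the queue; that is not the scheduler's actual user-limit formula, and QueueVsUserLimit and userSharePercent are hypothetical names.

```java
/** Hypothetical model: a user may use at most userSharePercent of the queue. */
final class QueueVsUserLimit {
    private QueueVsUserLimit() {
    }

    /** Memory (MB) a single user can really obtain under the simplified model. */
    static long usableMb(long queueMb, int userSharePercent) {
        return queueMb * userSharePercent / 100;
    }

    /** How many times too large a queue-based estimate is (rounded down). */
    static long overestimateFactor(long queueMb, int userSharePercent) {
        long usable = Math.max(1L, usableMb(queueMb, userSharePercent));
        return queueMb / usable;
    }
}
```

With a 102400 MB queue and a 5 percent user share, usableMb returns 5120 and overestimateFactor returns 20, so an AM that sized its splits from queue capacity alone would plan for twenty times the parallelism it can actually get.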