[jira] [Commented] (YARN-1959) Fix headroom calculation in Fair Scheduler
[ https://issues.apache.org/jira/browse/YARN-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974467#comment-13974467 ] Jason Lowe commented on YARN-1959: -- Yes, over-reporting of the headroom in the CapacityScheduler is a known issue. See YARN-1857. I think the calculation for the CapacityScheduler should be more like min((userLimit-userConsumed), (queueMax-queueConsumed)). The idea is that one can't go over the user limit, but one also can't go over what the queue has free. Fix headroom calculation in Fair Scheduler -- Key: YARN-1959 URL: https://issues.apache.org/jira/browse/YARN-1959 Project: Hadoop YARN Issue Type: Bug Reporter: Sandy Ryza The Fair Scheduler currently always sets the headroom to 0. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1959) Fix headroom calculation in Fair Scheduler
[ https://issues.apache.org/jira/browse/YARN-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974550#comment-13974550 ] Jason Lowe commented on YARN-1959: -- Good point, it would also need to min against the available cluster resources to cover the case of cross-queue contention. Fix headroom calculation in Fair Scheduler -- Key: YARN-1959 URL: https://issues.apache.org/jira/browse/YARN-1959 Project: Hadoop YARN Issue Type: Bug Reporter: Sandy Ryza The Fair Scheduler currently always sets the headroom to 0. -- This message was sent by Atlassian JIRA (v6.2#6252)
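The calculation described across these two comments can be sketched as follows. The class and method names, and the use of plain ints rather than YARN's Resource objects, are illustrative assumptions, not the actual scheduler code:

```java
public class HeadroomCalc {
    // Headroom sketch per the discussion above:
    // min(userLimit - userConsumed, queueMax - queueConsumed, clusterAvailable),
    // clamped so we never report negative headroom.
    static int headroom(int userLimit, int userConsumed,
                        int queueMax, int queueConsumed,
                        int clusterAvailable) {
        int h = Math.min(userLimit - userConsumed, queueMax - queueConsumed);
        h = Math.min(h, clusterAvailable);   // cross-queue contention cap
        return Math.max(h, 0);               // clamp negative values to zero
    }

    public static void main(String[] args) {
        // Queue capacity is the binding constraint here: min(8, 5, 100) = 5
        System.out.println(headroom(20, 12, 50, 45, 100));
    }
}
```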
[jira] [Resolved] (YARN-1966) Capacity Scheduler acl_submit_applications in Leaf Queue finally considers root queue default always
[ https://issues.apache.org/jira/browse/YARN-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-1966. -- Resolution: Duplicate This is a duplicate of YARN-1269 and related to YARN-1941 and YARN-1951. In any case I don't think we want to special-case the root queue, as the same issue could exist in a subtree where access to the subtree root allows access to any queue within the subtree. Actually I believe this is by design. It allows admins to configure access to an entire subtree of queues by giving access to the root of the subtree rather than having to add the access to each leaf queue. So for your example above you'll want to set the root queue's ACLs to be empty so that one must have access to the leaf queue in order to submit. Capacity Scheduler acl_submit_applications in Leaf Queue finally considers root queue default always Key: YARN-1966 URL: https://issues.apache.org/jira/browse/YARN-1966 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Sunil G Attachments: Yarn-1966.1.patch Given the configuration below:

<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>fast,medium</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.fast.acl_submit_applications</name>
  <value>hadoop</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.slow.acl_submit_applications</name>
  <value>hadoop</value>
</property>

In this case, the expectation is that only the hadoop user can submit jobs to the fast or slow queue. But currently any user can submit jobs to these queues. -- This message was sent by Atlassian JIRA (v6.2#6252)
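Following the suggestion above, the root queue's submit ACL can be tightened so that only the leaf-queue ACLs grant submission. A hedged sketch for capacity-scheduler.xml; verify the single-space convention against the CapacityScheduler documentation for your version:

```xml
<!-- Deny submit at the root so access must come from a leaf queue's own
     ACL; in CapacityScheduler ACLs a single-space value means "no one". -->
<property>
  <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
  <value> </value>
</property>
```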
[jira] [Commented] (YARN-1978) TestLogAggregationService#testLocalFileDeletionAfterUpload fails sometimes
[ https://issues.apache.org/jira/browse/YARN-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978982#comment-13978982 ] Jason Lowe commented on YARN-1978: -- I've recently seen this happen on my single-node cluster setup on Linux as well. TestLogAggregationService#testLocalFileDeletionAfterUpload fails sometimes -- Key: YARN-1978 URL: https://issues.apache.org/jira/browse/YARN-1978 Project: Hadoop YARN Issue Type: Test Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli This happens in a Windows VM, though the issue isn't related to Windows. {code} --- Test set: org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService --- Tests run: 11, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.859 sec FAILURE! - in org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService testLocalFileDeletionAfterUpload(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService) Time elapsed: 0.906 sec FAILURE! junit.framework.AssertionFailedError: check Y:\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\target\TestLogAggregationService-localLogDir\application_1234_0001\container_1234_0001_01_01\stdout at junit.framework.Assert.fail(Assert.java:50) at junit.framework.Assert.assertTrue(Assert.java:20) at junit.framework.Assert.assertFalse(Assert.java:34) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLocalFileDeletionAfterUpload(TestLogAggregationService.java:201) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1978) TestLogAggregationService#testLocalFileDeletionAfterUpload fails sometimes
[ https://issues.apache.org/jira/browse/YARN-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979004#comment-13979004 ] Jason Lowe commented on YARN-1978: -- Is calling sched.awaitTermination(10, SECONDS) without ever calling sched.shutdown() or sched.shutdownNow() equivalent to Thread.sleep(10*1000)? I'm wondering if this change is going to always cause the NM to take 10 seconds to shutdown which isn't ideal. TestLogAggregationService#testLocalFileDeletionAfterUpload fails sometimes -- Key: YARN-1978 URL: https://issues.apache.org/jira/browse/YARN-1978 Project: Hadoop YARN Issue Type: Test Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: YARN-1978.txt This happens in a Windows VM, though the issue isn't related to Windows. {code} --- Test set: org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService --- Tests run: 11, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.859 sec FAILURE! - in org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService testLocalFileDeletionAfterUpload(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService) Time elapsed: 0.906 sec FAILURE! junit.framework.AssertionFailedError: check Y:\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\target\TestLogAggregationService-localLogDir\application_1234_0001\container_1234_0001_01_01\stdout at junit.framework.Assert.fail(Assert.java:50) at junit.framework.Assert.assertTrue(Assert.java:20) at junit.framework.Assert.assertFalse(Assert.java:34) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLocalFileDeletionAfterUpload(TestLogAggregationService.java:201) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
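The concern raised in this comment can be demonstrated with a plain ExecutorService (a standalone sketch, not the NM code): awaitTermination never returns early unless shutdown() or shutdownNow() has already been called, so on an idle pool it degenerates into a sleep for the full timeout.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AwaitDemo {
    // Millis spent in awaitTermination on an idle single-thread pool.
    static long waitMillis(boolean callShutdownFirst) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        if (callShutdownFirst) {
            pool.shutdown();             // pool can now reach TERMINATED
        }
        long start = System.nanoTime();
        pool.awaitTermination(500, TimeUnit.MILLISECONDS);
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        pool.shutdownNow();              // clean up either way
        return elapsed;
    }

    public static void main(String[] args) throws InterruptedException {
        // Without shutdown() the pool never terminates, so the call
        // blocks for the full timeout -- effectively a sleep.
        System.out.println("no shutdown(): ~" + waitMillis(false) + " ms");
        // With shutdown() called first, an idle pool terminates at once.
        System.out.println("with shutdown(): ~" + waitMillis(true) + " ms");
    }
}
```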
[jira] [Commented] (YARN-1975) Used resources shows escaped html in CapacityScheduler and FairScheduler page
[ https://issues.apache.org/jira/browse/YARN-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980176#comment-13980176 ] Jason Lowe commented on YARN-1975: -- +1, committing this. Used resources shows escaped html in CapacityScheduler and FairScheduler page - Key: YARN-1975 URL: https://issues.apache.org/jira/browse/YARN-1975 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 3.0.0, 2.4.0 Reporter: Nathan Roberts Assignee: Mit Desai Attachments: YARN-1975.patch, screenshot-1975.png Used resources displays as the escaped string &lt;memory:, vCores&gt; with the capacity scheduler -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-1981) Nodemanager version is not updated when a node reconnects
Jason Lowe created YARN-1981: Summary: Nodemanager version is not updated when a node reconnects Key: YARN-1981 URL: https://issues.apache.org/jira/browse/YARN-1981 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Jason Lowe Assignee: Jason Lowe When a nodemanager is quickly restarted and happens to change versions during the restart (e.g.: rolling upgrade scenario) the NM version as reported by the RM is not updated. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1981) Nodemanager version is not updated when a node reconnects
[ https://issues.apache.org/jira/browse/YARN-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1981: - Attachment: YARN-1981.patch Patch that updates the nodemanager version when a node reconnects. Nodemanager version is not updated when a node reconnects - Key: YARN-1981 URL: https://issues.apache.org/jira/browse/YARN-1981 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1981.patch When a nodemanager is quickly restarted and happens to change versions during the restart (e.g.: rolling upgrade scenario) the NM version as reported by the RM is not updated. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1354) Recover applications upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980360#comment-13980360 ] Jason Lowe commented on YARN-1354: -- Yes, we can't rely on any active containers to tell us which apps are active. I stumbled across YARN-1421, and I think that's the best way to solve the lost FINISH_APPS event. We can already lose them in the RM restart scenario, and the proposed fix in that JIRA (having the NM heartbeat the active applications along with active containers) would solve it for the NM restart case as well. As for nmStore.start() being called during serviceInit, that's because we're recovering the secret manager states during init and the store needs to be started in order to do that. We might be able to postpone the recovery until start but I thought it was safer to recover during init to avoid any racing between component startups and when they touched other components relative to when those components recover. I need to update the patch to handle the runtime DBException issue that was pointed out in the review for MAPREDUCE-5652. I hope to have that updated patch posted shortly. Recover applications upon nodemanager restart - Key: YARN-1354 URL: https://issues.apache.org/jira/browse/YARN-1354 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1354-v1.patch The set of active applications in the nodemanager context need to be recovered for work-preserving nodemanager restart -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-1984) LeveldbTimelineStore does not handle db exceptions properly
Jason Lowe created YARN-1984: Summary: LeveldbTimelineStore does not handle db exceptions properly Key: YARN-1984 URL: https://issues.apache.org/jira/browse/YARN-1984 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Jason Lowe The org.iq80.leveldb.DB and DBIterator methods throw runtime exceptions rather than IOException which can easily leak up the stack and kill threads (e.g.: the deletion thread). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1984) LeveldbTimelineStore does not handle db exceptions properly
[ https://issues.apache.org/jira/browse/YARN-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981185#comment-13981185 ] Jason Lowe commented on YARN-1984: -- Ran across this while working with leveldb as part of MAPREDUCE-5652 and YARN-1336. There are two DBExceptions, NativeDB.DBException and leveldb.DBException. The former is derived from IOException and raised by the low-level JNI code, while the latter is derived from RuntimeException and is thrown by the JniDB wrapper code. To make matters worse, DBIterator throws _raw_ RuntimeException rather than the runtime DBException from its methods, so database errors can leak up the stack even if code is expecting the runtime DBException. The timeline store should handle the runtime exceptions and treat them like I/O errors, at least to keep them from tearing down the deletion thread (if not other cases). We may want to create a wrapper utility class for DBIterator in YARN as a workaround so interacting with the database only requires handling of leveldb.DBException rather than also trying to wrestle with the raw RuntimeExceptions from the iterator. See the DBIterator wrapper class in https://issues.apache.org/jira/secure/attachment/12641927/MAPREDUCE-5652-v8.patch as a rough example. LeveldbTimelineStore does not handle db exceptions properly --- Key: YARN-1984 URL: https://issues.apache.org/jira/browse/YARN-1984 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Jason Lowe The org.iq80.leveldb.DB and DBIterator methods throw runtime exceptions rather than IOException which can easily leak up the stack and kill threads (e.g.: the deletion thread). -- This message was sent by Atlassian JIRA (v6.2#6252)
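The wrapper idea described here can be sketched generically with only the JDK: delegate to an underlying iterator and translate any raw RuntimeException into a single, well-known exception type so callers cannot overlook database errors. The real YARN-1987 wrapper targets leveldb's DBIterator and its DBException; the names below are illustrative stand-ins:

```java
import java.util.Iterator;

// Illustrative stand-in for the leveldb iterator wrapper: translate raw
// RuntimeExceptions from the delegate into one dedicated exception type.
public class TranslatingIterator<T> implements Iterator<T> {
    /** Stand-in for leveldb's runtime DBException. */
    public static class IterationException extends RuntimeException {
        IterationException(Throwable cause) { super(cause); }
    }

    private final Iterator<T> delegate;

    public TranslatingIterator(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    @Override public boolean hasNext() {
        try { return delegate.hasNext(); }
        catch (RuntimeException e) { throw new IterationException(e); }
    }

    @Override public T next() {
        try { return delegate.next(); }
        catch (RuntimeException e) { throw new IterationException(e); }
    }
}
```

Callers then need only catch IterationException (analogous to catching leveldb.DBException) rather than guarding every iterator call against arbitrary RuntimeExceptions.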
[jira] [Commented] (YARN-1985) YARN issues wrong state when running beyond virtual memory limits
[ https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981296#comment-13981296 ] Jason Lowe commented on YARN-1985: -- Do you have the relevant portions of the RM log for these 4 containers showing it has marked them completed? If these all occurred on the same node, the relevant NM log would be great as well. YARN issues wrong state when running beyond virtual memory limits --- Key: YARN-1985 URL: https://issues.apache.org/jira/browse/YARN-1985 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Oleg Zhurakousky When deploying a YARN application with multiple containers and the AM determines that the resource limits have been reached (e.g., virtual memory), it starts killing *all* containers while reporting a *single* COMPLETED status, essentially hanging the AM waiting for more containers to report their state. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1985) YARN issues wrong state when running beyond virtual memory limits
[ https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981312#comment-13981312 ] Jason Lowe commented on YARN-1985: -- There are only three states for a container: NEW, RUNNING, or COMPLETED. Note that COMPLETED does not imply success; rather, it means the container is no longer running. In order to discern success or failure from a completed container one must examine the exit code of the container (i.e.: the ContainerStatus#getExitStatus method). Are both containers running over their memory limits, or is only one running over and somehow both are being killed? That's where the RM/NM logs would help. YARN issues wrong state when running beyond virtual memory limits --- Key: YARN-1985 URL: https://issues.apache.org/jira/browse/YARN-1985 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Oleg Zhurakousky When deploying a YARN application with multiple containers and the AM determines that the resource limits have been reached (e.g., virtual memory), it starts killing *all* containers while reporting a *single* COMPLETED status, essentially hanging the AM waiting for more containers to report their state. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1985) YARN issues wrong state when running beyond virtual memory limits
[ https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981497#comment-13981497 ] Jason Lowe commented on YARN-1985: -- The exit status should be whatever exit status came from the process when it exited. When a container is killed the NM first sends a SIGTERM and then a short time later (250 msec IIRC) it sends SIGKILL. A process that exits with a status code of 0 despite receiving SIGTERM could explain the behavior. It could also happen if the container exited on its own after the NM logged that it was going to kill it but before it actually tried to kill it. Looking at the DefaultContainerExecutor code, it certainly appears that the process being killed must have returned an exit code of zero, unless you are seeing messages such as Exit code from container container_1398429077682_0006_02_05 is : in the NM logs. I'm not sure exactly what's being run in the container, but checking whether that will return an exit code of 0 despite being killed by SIGTERM seems like the next best place to look. YARN issues wrong state when running beyond virtual memory limits --- Key: YARN-1985 URL: https://issues.apache.org/jira/browse/YARN-1985 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Oleg Zhurakousky Priority: Minor When deploying a YARN application with multiple containers and the AM determines that the resource limits have been reached (e.g., virtual memory), it starts killing *all* containers while reporting a *single* COMPLETED status, essentially hanging the AM waiting for more containers to report their state. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-1987) Wrapper for leveldb DBIterator to aid in handling database exceptions
Jason Lowe created YARN-1987: Summary: Wrapper for leveldb DBIterator to aid in handling database exceptions Key: YARN-1987 URL: https://issues.apache.org/jira/browse/YARN-1987 Project: Hadoop YARN Issue Type: Improvement Reporter: Jason Lowe Assignee: Jason Lowe Per discussions in YARN-1984 and MAPREDUCE-5652, it would be nice to have a utility wrapper around leveldb's DBIterator to translate the raw RuntimeExceptions it can throw into DBExceptions to make it easier to handle database errors while iterating. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1987) Wrapper for leveldb DBIterator to aid in handling database exceptions
[ https://issues.apache.org/jira/browse/YARN-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1987: - Attachment: YARN-1987.patch Wrapper for leveldb DBIterator to aid in handling database exceptions - Key: YARN-1987 URL: https://issues.apache.org/jira/browse/YARN-1987 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1987.patch Per discussions in YARN-1984 and MAPREDUCE-5652, it would be nice to have a utility wrapper around leveldb's DBIterator to translate the raw RuntimeExceptions it can throw into DBExceptions to make it easier to handle database errors while iterating. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1362) Distinguish between nodemanager shutdown for decommission vs shutdown for restart
[ https://issues.apache.org/jira/browse/YARN-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1362: - Attachment: YARN-1362.patch Small patch that enhances the NM context to provide get/set for a decommission flag. This allows code to query whether the NM has been told to decommission and act accordingly during shutdown. Distinguish between nodemanager shutdown for decommission vs shutdown for restart - Key: YARN-1362 URL: https://issues.apache.org/jira/browse/YARN-1362 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Attachments: YARN-1362.patch When a nodemanager shuts down it needs to determine if it is likely to be restarted. If a restart is likely then it needs to preserve container directories, logs, distributed cache entries, etc. If it is being shutdown more permanently (e.g.: like a decommission) then the nodemanager should cleanup directories and logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2002) Support for passing Job priority through Application Submission Context in Mapreduce Side
[ https://issues.apache.org/jira/browse/YARN-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985485#comment-13985485 ] Jason Lowe commented on YARN-2002: -- Moving this to MAPREDUCE since that's where the changes need to be made. Will link this issue back to YARN-1963. I think a small unit test should be added as part of this change to verify that when a priority is set the resulting submission context from YARNRunner has the appropriate priority setting. I suspect the tests in YARN-2004 will be more of an integration test rather than a unit test. Support for passing Job priority through Application Submission Context in Mapreduce Side - Key: YARN-2002 URL: https://issues.apache.org/jira/browse/YARN-2002 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Reporter: Sunil G Attachments: Yarn-2002.1.patch Job priority can be set from the client side as below [configuration and API]. a. JobConf.getJobPriority() and Job.setPriority(JobPriority priority) b. We can also use the configuration mapreduce.job.priority. Now this job priority can be passed in the application submission context from the client side. Here we can reuse the MRJobConfig.PRIORITY configuration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2002) Support for passing Job priority through Application Submission Context in Mapreduce Side
[ https://issues.apache.org/jira/browse/YARN-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-2002: - Issue Type: Improvement (was: Sub-task) Parent: (was: YARN-1963) Support for passing Job priority through Application Submission Context in Mapreduce Side - Key: YARN-2002 URL: https://issues.apache.org/jira/browse/YARN-2002 Project: Hadoop YARN Issue Type: Improvement Components: api, resourcemanager Reporter: Sunil G Attachments: Yarn-2002.1.patch Job priority can be set from the client side as below [configuration and API]. a. JobConf.getJobPriority() and Job.setPriority(JobPriority priority) b. We can also use the configuration mapreduce.job.priority. Now this job priority can be passed in the application submission context from the client side. Here we can reuse the MRJobConfig.PRIORITY configuration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2005) Blacklisting support for scheduling AMs
Jason Lowe created YARN-2005: Summary: Blacklisting support for scheduling AMs Key: YARN-2005 URL: https://issues.apache.org/jira/browse/YARN-2005 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.4.0, 0.23.10 Reporter: Jason Lowe It would be nice if the RM supported blacklisting a node for an AM launch after the same node fails a configurable number of AM attempts. This would be similar to the blacklisting support for scheduling task attempts in the MapReduce AM but for scheduling AM attempts on the RM side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13985575#comment-13985575 ] Jason Lowe commented on YARN-2005: -- This is particularly helpful on a busy cluster where one node happens to be in a state where it can't launch containers for some reason but hasn't self-declared an UNHEALTHY state. In that scenario the only place with spare capacity is a node that fails every container attempt, and apps can fail due to the RM not realizing that repeated AM attempts on the same node aren't working. In that sense a fix for YARN-1073 could help quite a bit, but there could still be scenarios where a particular app's AMs end up failing on certain nodes but other containers run just fine. Blacklisting support for scheduling AMs --- Key: YARN-2005 URL: https://issues.apache.org/jira/browse/YARN-2005 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 0.23.10, 2.4.0 Reporter: Jason Lowe It would be nice if the RM supported blacklisting a node for an AM launch after the same node fails a configurable number of AM attempts. This would be similar to the blacklisting support for scheduling task attempts in the MapReduce AM but for scheduling AM attempts on the RM side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1338) Recover localized resource cache state upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1338: - Attachment: YARN-1338v3-and-YARN-1987.patch Updating the patch to address the DBException handling that was brought up in the MAPREDUCE-5652 review and applies here. Note that this now depends upon YARN-1987 as that provides the utility wrapper for the leveldb iterator to translate raw RuntimeException to the more helpful DBException so we can act accordingly when errors occur. The other notable change in the patch is renaming LevelDB to Leveldb for consistency with the existing LeveldbTimelineStore naming convention. This latest patch includes the necessary pieces of YARN-1987 so it can compile and Jenkins can comment. Recover localized resource cache state upon nodemanager restart --- Key: YARN-1338 URL: https://issues.apache.org/jira/browse/YARN-1338 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1338.patch, YARN-1338v2.patch, YARN-1338v3-and-YARN-1987.patch Today when node manager restarts we clean up all the distributed cache files from disk. This is definitely not ideal from 2 aspects. * For work preserving restart we definitely want them as running containers are using them * For even non work preserving restart this will be useful in the sense that we don't have to download them again if needed by future tasks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1341: - Attachment: YARN-1341v4-and-YARN-1987.patch Updating the patch to address the DBException handling that was brought up in the MAPREDUCE-5652 review and applies here. Note that this now depends upon YARN-1987 as that provides the utility wrapper for the leveldb iterator to translate raw RuntimeException to the more helpful DBException so we can act accordingly when errors occur. The other notable change in the patch is renaming LevelDB to Leveldb for consistency with the existing LeveldbTimelineStore naming convention. This latest patch includes the necessary pieces of YARN-1987 so it can compile and Jenkins can comment. Recover NMTokens upon nodemanager restart - Key: YARN-1341 URL: https://issues.apache.org/jira/browse/YARN-1341 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1341.patch, YARN-1341v2.patch, YARN-1341v3.patch, YARN-1341v4-and-YARN-1987.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1987) Wrapper for leveldb DBIterator to aid in handling database exceptions
[ https://issues.apache.org/jira/browse/YARN-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13985977#comment-13985977 ] Jason Lowe commented on YARN-1987: -- Thanks for the feedback, Ming! bq. LeveldbIterator.close rethrows IOException instead of DBException. Just wonder which is better, given JniDBFactory.factory.open throws DBException. JniDBFactory.factory.open throws NativeDB.DBException which is an IOException rather than the runtime DBException. Also since close() already declares that it can throw IOException which callers either have to handle or propagate it seemed better to leverage that declared exception than a runtime exception which callers can easily overlook. bq. It seems store open via JniDBFactory.factory.open can also be useful to put into a wrapper class, to take care of catch the exception if the store doesn't exist and create a new one. If all one cares about is to make sure the database is created even if it doesn't exist then that's already covered by the leveldb interfaces by calling createIfMissing() on the options passed to the open call. In the NM restart case I wanted to know when the database was being created so the code can either check the existing schema version or set the schema version, respectively. If that's something that needs to be put in a utility method then I agree it's a separate JIRA. Wrapper for leveldb DBIterator to aid in handling database exceptions - Key: YARN-1987 URL: https://issues.apache.org/jira/browse/YARN-1987 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1987.patch Per discussions in YARN-1984 and MAPREDUCE-5652, it would be nice to have a utility wrapper around leveldb's DBIterator to translate the raw RuntimeExceptions it can throw into DBExceptions to make it easier to handle database errors while iterating. -- This message was sent by Atlassian JIRA (v6.2#6252)
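The "know when the store was created" pattern described in this comment can be sketched generically with only the JDK: on first creation, stamp the schema version; on a later open, verify it. This is a file-based analogy, not the actual NM code, which applies the same idea to a leveldb store; the path layout and version string are assumptions for illustration:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SchemaVersionedStore {
    static final String CURRENT_VERSION = "1.0";

    // Returns "created" on first open, "opened" on subsequent opens,
    // and fails if an existing store has an unexpected schema version.
    static String open(Path dir) throws IOException {
        Path versionFile = dir.resolve("schema-version");
        if (Files.notExists(dir)) {
            // First open: create the store and stamp the schema version.
            Files.createDirectories(dir);
            Files.write(versionFile,
                CURRENT_VERSION.getBytes(StandardCharsets.UTF_8));
            return "created";
        }
        // Existing store: check that its schema is one we understand.
        String found = new String(Files.readAllBytes(versionFile),
            StandardCharsets.UTF_8);
        if (!CURRENT_VERSION.equals(found)) {
            throw new IOException("incompatible schema version: " + found);
        }
        return "opened";
    }
}
```

This mirrors why leveldb's createIfMissing() alone isn't enough for the NM restart case: silently creating the database skips the create-vs-open distinction needed to decide whether to set or verify the schema version.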
[jira] [Updated] (YARN-1342) Recover container tokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1342: - Attachment: YARN-1342v3-and-YARN-1987.patch Updating the patch to address the DBException handling that was brought up in the MAPREDUCE-5652 review and applies here. Note that this now depends upon YARN-1987 as that provides the utility wrapper for the leveldb iterator to translate raw RuntimeException to the more helpful DBException so we can act accordingly when errors occur. The other notable change in the patch is renaming LevelDB to Leveldb for consistency with the existing LeveldbTimelineStore naming convention. This latest patch includes the necessary pieces of YARN-1987 so it can compile and Jenkins can comment. Recover container tokens upon nodemanager restart - Key: YARN-1342 URL: https://issues.apache.org/jira/browse/YARN-1342 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1342.patch, YARN-1342v2.patch, YARN-1342v3-and-YARN-1987.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1339) Recover DeletionService state upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1339: - Attachment: YARN-1339v3-and-YARN-1987.patch Updating the patch to address the DBException handling that was brought up in the MAPREDUCE-5652 review and applies here. Note that this now depends upon YARN-1987 as that provides the utility wrapper for the leveldb iterator to translate raw RuntimeException to the more helpful DBException so we can act accordingly when errors occur. The other notable change in the patch is renaming LevelDB to Leveldb for consistency with the existing LeveldbTimelineStore naming convention. This latest patch includes the necessary pieces of YARN-1987 so it can compile and Jenkins can comment. Recover DeletionService state upon nodemanager restart -- Key: YARN-1339 URL: https://issues.apache.org/jira/browse/YARN-1339 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1339.patch, YARN-1339v2.patch, YARN-1339v3-and-YARN-1987.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1354) Recover applications upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1354: - Attachment: YARN-1354-v2-and-YARN-1987-and-YARN-1362.patch Updating the patch to address the DBException handling that was brought up in the MAPREDUCE-5652 review and applies here. Note that this now depends upon YARN-1987 as that provides the utility wrapper for the leveldb iterator to translate raw RuntimeException to the more helpful DBException so we can act accordingly when errors occur. This patch also addresses the issue where apps were being cleaned up on shutdown. This leverages YARN-1362 so we can distinguish a decommission shutdown, and it will avoid cleaning up applications if the state store can recover and we are not being decommissioned. The other notable change in the patch is renaming LevelDB to Leveldb for consistency with the existing LeveldbTimelineStore naming convention. This latest patch includes the necessary pieces of YARN-1987 and YARN-1362 so it can compile and Jenkins can comment. Recover applications upon nodemanager restart - Key: YARN-1354 URL: https://issues.apache.org/jira/browse/YARN-1354 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1354-v1.patch, YARN-1354-v2-and-YARN-1987-and-YARN-1362.patch The set of active applications in the nodemanager context need to be recovered for work-preserving nodemanager restart -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1987) Wrapper for leveldb DBIterator to aid in handling database exceptions
[ https://issues.apache.org/jira/browse/YARN-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1987: - Attachment: YARN-1987v2.patch Updated the patch to add the Evolving annotation. Wrapper for leveldb DBIterator to aid in handling database exceptions - Key: YARN-1987 URL: https://issues.apache.org/jira/browse/YARN-1987 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1987.patch, YARN-1987v2.patch Per discussions in YARN-1984 and MAPREDUCE-5652, it would be nice to have a utility wrapper around leveldb's DBIterator to translate the raw RuntimeExceptions it can throw into DBExceptions to make it easier to handle database errors while iterating. -- This message was sent by Atlassian JIRA (v6.2#6252)
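The translation idea behind YARN-1987 can be sketched without the leveldb dependency. Below is a minimal, self-contained illustration — the `DBException` and `LeveldbIterator` classes here are hypothetical stand-ins (the real patch wraps org.iq80.leveldb.DBIterator and uses leveldb's own DBException), but the pattern is the same: forward every call and rethrow any raw RuntimeException as the more specific database-error type.

```java
import java.util.Iterator;

// Hypothetical stand-in; the actual patch uses org.iq80.leveldb.DBException.
class DBException extends RuntimeException {
    DBException(Throwable cause) { super(cause); }
}

// Sketch of the wrapper idea: forward each call and translate any raw
// RuntimeException into the more specific DBException so callers can tell
// database errors apart from ordinary runtime failures.
class LeveldbIterator<T> implements Iterator<T> {
    private final Iterator<T> iter;

    LeveldbIterator(Iterator<T> iter) {
        this.iter = iter;
    }

    @Override
    public boolean hasNext() {
        try {
            return iter.hasNext();
        } catch (DBException e) {
            throw e; // already the translated type, pass through unchanged
        } catch (RuntimeException e) {
            throw new DBException(e);
        }
    }

    @Override
    public T next() {
        try {
            return iter.next();
        } catch (DBException e) {
            throw e;
        } catch (RuntimeException e) {
            throw new DBException(e);
        }
    }
}
```

With this in place, recovery code can catch DBException specifically and decide whether to fall back to a fresh state store rather than treating every RuntimeException as fatal.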
[jira] [Created] (YARN-2046) Out of band heartbeats are sent only on container kill and possibly too early
Jason Lowe created YARN-2046: Summary: Out of band heartbeats are sent only on container kill and possibly too early Key: YARN-2046 URL: https://issues.apache.org/jira/browse/YARN-2046 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.4.0, 0.23.10 Reporter: Jason Lowe [~mingma] pointed out in the review discussion for MAPREDUCE-5465 that the NM is currently sending out of band heartbeats only when stopContainer is called. In addition those heartbeats might be sent too early because the container kill event is asynchronously posted then the heartbeat monitor is notified. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2046) Out of band heartbeats are sent only on container kill and possibly too early
[ https://issues.apache.org/jira/browse/YARN-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995147#comment-13995147 ] Jason Lowe commented on YARN-2046: -- We should consider sending out of band heartbeats after a container completes rather than when a container is killed. For a cluster running MapReduce this should be almost equivalent in terms of number of OOB heartbeats sent since the MR AM always kills completed task attempts until MAPREDUCE-5465 is addressed. Out of band heartbeats are sent only on container kill and possibly too early - Key: YARN-2046 URL: https://issues.apache.org/jira/browse/YARN-2046 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 0.23.10, 2.4.0 Reporter: Jason Lowe [~mingma] pointed out in the review discussion for MAPREDUCE-5465 that the NM is currently sending out of band heartbeats only when stopContainer is called. In addition those heartbeats might be sent too early because the container kill event is asynchronously posted then the heartbeat monitor is notified. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (YARN-2040) Recover information about finished containers
[ https://issues.apache.org/jira/browse/YARN-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-2040. -- Resolution: Duplicate This will be covered by YARN-1337. Recover information about finished containers - Key: YARN-2040 URL: https://issues.apache.org/jira/browse/YARN-2040 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla The NM should store and recover information about finished containers as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1515) Ability to dump the container threads and stop the containers in a single RPC
[ https://issues.apache.org/jira/browse/YARN-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993762#comment-13993762 ] Jason Lowe commented on YARN-1515: -- I apologize for the long delay in reviewing and the resulting upmerge it caused. Patch looks good to me with just some minor comments: - StopContainerRequest#getDumpThreads should have javadocs and interface annotations like the other methods - Why is StopContainersRequest#getStopRequests marked Unstable but setStopRequests is Stable? - Nit: dumpThreads is an event-specific field, would be nice to have an AMLauncherCleanupEvent that takes just the app attempt in the constructor and derives from AMLauncherEvent. Ability to dump the container threads and stop the containers in a single RPC - Key: YARN-1515 URL: https://issues.apache.org/jira/browse/YARN-1515 Project: Hadoop YARN Issue Type: New Feature Components: api, nodemanager Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: YARN-1515.v01.patch, YARN-1515.v02.patch, YARN-1515.v03.patch, YARN-1515.v04.patch, YARN-1515.v05.patch, YARN-1515.v06.patch This is needed to implement MAPREDUCE-5044 to enable thread diagnostics for timed-out task attempts.
[jira] [Commented] (YARN-182) Unnecessary Container killed by the ApplicationMaster message for successful containers
[ https://issues.apache.org/jira/browse/YARN-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996398#comment-13996398 ] Jason Lowe commented on YARN-182: - bq. In my case the reducers were moved to COMPLETED state after 22 mins, they had reached 100% progress at 15 mins. Having progress reach 100% but the task not completing for 7 more minutes is an unrelated issue. Check your reducer logs and/or the input format which is responsible for setting the progress. This is probably a question better suited for the u...@hadoop.apache.org mailing list. Unnecessary Container killed by the ApplicationMaster message for successful containers - Key: YARN-182 URL: https://issues.apache.org/jira/browse/YARN-182 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.0.1-alpha Reporter: zhengqiu cai Assignee: Omkar Vinit Joshi Labels: hadoop, usability Attachments: Log.txt I was running wordcount and the resourcemanager web UI shown the status as FINISHED SUCCEEDED, but the log shown Container killed by the ApplicationMaster -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1362) Distinguish between nodemanager shutdown for decommission vs shutdown for restart
[ https://issues.apache.org/jira/browse/YARN-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996395#comment-13996395 ] Jason Lowe commented on YARN-1362: -- Yes, that's the intended behavior. If ops is shutting down the NM and not expecting it to return anytime soon then it should be decommissioned from the RM. Distinguish between nodemanager shutdown for decommission vs shutdown for restart - Key: YARN-1362 URL: https://issues.apache.org/jira/browse/YARN-1362 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1362.patch When a nodemanager shuts down it needs to determine if it is likely to be restarted. If a restart is likely then it needs to preserve container directories, logs, distributed cache entries, etc. If it is being shutdown more permanently (e.g.: like a decommission) then the nodemanager should cleanup directories and logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1751) Improve MiniYarnCluster for log aggregation testing
[ https://issues.apache.org/jira/browse/YARN-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996587#comment-13996587 ] Jason Lowe commented on YARN-1751: -- +1, committing this. Improve MiniYarnCluster for log aggregation testing --- Key: YARN-1751 URL: https://issues.apache.org/jira/browse/YARN-1751 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Ming Ma Assignee: Ming Ma Attachments: YARN-1751-trunk.patch, YARN-1751.patch MiniYarnCluster specifies individual remote log aggregation root dir for each NM. Test code that uses MiniYarnCluster won't be able to get the value of log aggregation root dir. The following code isn't necessary in MiniYarnCluster. File remoteLogDir = new File(testWorkDir, MiniYARNCluster.this.getName() + "-remoteLogDir-nm-" + index); remoteLogDir.mkdir(); config.set(YarnConfiguration.NM_REMOTE_APP_LOG_DIR, remoteLogDir.getAbsolutePath()); In LogCLIHelpers.java, dumpAllContainersLogs should pass its conf object to the FileContext.getFileContext() call.
[jira] [Updated] (YARN-1337) Recover containers upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1337: - Description: To support work-preserving NM restart we need to recover the state of the containers when the nodemanager went down. This includes informing the RM of containers that have exited in the interim and a strategy for dealing with the exit codes from those containers along with how to reacquire the active containers and determine their exit codes when they terminate. The state of finished containers also needs to be recovered. (was: To support work-preserving NM restart we need to recover the state of the containers that were active when the nodemanager went down. This includes informing the RM of containers that have exited in the interim and a strategy for dealing with the exit codes from those containers along with how to reacquire the active containers and determine their exit codes when they terminate.) Summary: Recover containers upon nodemanager restart (was: Recover active container state upon nodemanager restart) Updating headline and description to note that this task also includes recovering the state of finished containers as well. Recover containers upon nodemanager restart --- Key: YARN-1337 URL: https://issues.apache.org/jira/browse/YARN-1337 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe To support work-preserving NM restart we need to recover the state of the containers when the nodemanager went down. This includes informing the RM of containers that have exited in the interim and a strategy for dealing with the exit codes from those containers along with how to reacquire the active containers and determine their exit codes when they terminate. The state of finished containers also needs to be recovered. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2014) Performance: AM scaleability is 10% slower in 2.4 compared to 0.23.9
[ https://issues.apache.org/jira/browse/YARN-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996965#comment-13996965 ] Jason Lowe commented on YARN-2014: -- HADOOP-7549 added service loading of filesystems, and HADOOP-7350 added service loading of compression codecs. I'll see if I have some time to disable the service loading of unnecessary classes. Performance: AM scaleability is 10% slower in 2.4 compared to 0.23.9 Key: YARN-2014 URL: https://issues.apache.org/jira/browse/YARN-2014 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: patrick white Assignee: Jason Lowe Performance comparison benchmarks from 2.x against 0.23 shows AM scalability benchmark's runtime is approximately 10% slower in 2.4.0. The trend is consistent across later releases in both lines, latest release numbers are: 2.4.0.0 runtime 255.6 seconds (avg 5 passes) 0.23.9.12 runtime 230.4 seconds (avg 5 passes) Diff: -9.9% AM Scalability test is essentially a sleep job that measures time to launch and complete a large number of mappers. The diff is consistent and has been reproduced in both a larger (350 node, 100,000 mappers) perf environment, as well as a small (10 node, 2,900 mappers) demo cluster. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2014) Performance: AM scaleability is 10% slower in 2.4 compared to 0.23.9
[ https://issues.apache.org/jira/browse/YARN-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996494#comment-13996494 ] Jason Lowe commented on YARN-2014: -- I did a bit of investigation on this, and the problem appears to be around the duration of the tasks. In 2.4 the sleep job tasks are taking about 660 msec longer to execute than they do in 0.23. I didn't nail down exactly where this extra delay was coming from, but I did notice that the tasks in 2.4 are loading over 800 more classes than they do in 0.23. I think most of these are coming from the service loader for FileSystem schemas, as the 2.4 tasks load every FileSystem available and 0.23 does not. In 0.23 FileSystem schemas are declared in configs, but in 2.4 they are dynamically detected and loaded via a service loader. The ~0.5s delay in the task appears to be a fixed startup cost and is amplified by the AM scalability test since it runs very short tasks (the main portion of the map task lasts 1 second) and multiple tasks are run per map slot on the cluster, serializing the task startup delays. Performance: AM scaleability is 10% slower in 2.4 compared to 0.23.9 Key: YARN-2014 URL: https://issues.apache.org/jira/browse/YARN-2014 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: patrick white Performance comparison benchmarks from 2.x against 0.23 shows AM scalability benchmark's runtime is approximately 10% slower in 2.4.0. The trend is consistent across later releases in both lines, latest release numbers are: 2.4.0.0 runtime 255.6 seconds (avg 5 passes) 0.23.9.12 runtime 230.4 seconds (avg 5 passes) Diff: -9.9% AM Scalability test is essentially a sleep job that measures time to launch and complete a large number of mappers. The diff is consistent and has been reproduced in both a larger (350 node, 100,000 mappers) perf environment, as well as a small (10 node, 2,900 mappers) demo cluster.
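The dynamic detection described above is java.util.ServiceLoader scanning META-INF/services entries on the classpath. The generic helper below is not Hadoop code, just a sketch of the mechanism: every registered provider is class-loaded and instantiated during iteration, which is where a fixed per-JVM startup cost comes from when many implementations are on the classpath.

```java
import java.util.ServiceLoader;

public class ServiceLoaderDemo {
    // Iterating a ServiceLoader class-loads and instantiates every provider
    // registered under META-INF/services/<interface-name> on the classpath.
    // For FileSystem in Hadoop 2.4 this pulls in every available scheme,
    // whereas 0.23 only loaded the classes named in the config.
    static <T> int countProviders(Class<T> service) {
        int n = 0;
        for (T provider : ServiceLoader.load(service)) {
            n++; // each step may trigger class loading of another provider
        }
        return n;
    }

    public static void main(String[] args) {
        // On a bare classpath nothing registers providers for Runnable,
        // so the loader finds none and no extra classes are loaded.
        System.out.println(countProviders(Runnable.class));
    }
}
```

This is also why the cost shows up per task rather than per job: each task runs in a fresh JVM, so the provider scan and the resulting class loading are repeated for every container.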
[jira] [Created] (YARN-2034) Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect
Jason Lowe created YARN-2034: Summary: Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect Key: YARN-2034 URL: https://issues.apache.org/jira/browse/YARN-2034 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.4.0, 0.23.10 Reporter: Jason Lowe Priority: Minor The description for yarn.nodemanager.localizer.cache.target-size-mb says that it is a setting per local directory, but according to the code it's a setting for the entire node. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1751) Improve MiniYarnCluster and LogCLIHelpers for log aggregation testing
[ https://issues.apache.org/jira/browse/YARN-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993565#comment-13993565 ] Jason Lowe commented on YARN-1751: -- Despite them both being small changes, I think these should be separate JIRAs since they're otherwise unrelated changes for different problems and can stand on their own. We can morph this JIRA into one of them and file a new one to cover the other. For the LogCLIHelpers change, I think it should be calling FileContext.getFileContext(remoteAppLogDir.toUri(), conf) in case the remoteAppLogDir is not on the default filesystem. There's also the question of whether it should guard against a null conf, since oddly, despite LogCLIHelpers being Configurable, it isn't using the config until after this change. I think I'm leaning towards leaving it null and letting the NPE occur so callers will fix it. We've had lots of performance problems and other weirdness in the past when code forgot to pass down a custom config and things sorta worked with the default one. +1 for the MiniYarnCluster change. Improve MiniYarnCluster and LogCLIHelpers for log aggregation testing - Key: YARN-1751 URL: https://issues.apache.org/jira/browse/YARN-1751 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Ming Ma Assignee: Ming Ma Attachments: YARN-1751-trunk.patch MiniYarnCluster specifies individual remote log aggregation root dir for each NM. Test code that uses MiniYarnCluster won't be able to get the value of log aggregation root dir. The following code isn't necessary in MiniYarnCluster. File remoteLogDir = new File(testWorkDir, MiniYARNCluster.this.getName() + "-remoteLogDir-nm-" + index); remoteLogDir.mkdir(); config.set(YarnConfiguration.NM_REMOTE_APP_LOG_DIR, remoteLogDir.getAbsolutePath()); In LogCLIHelpers.java, dumpAllContainersLogs should pass its conf object to the FileContext.getFileContext() call.
[jira] [Updated] (YARN-2034) Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect
[ https://issues.apache.org/jira/browse/YARN-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-2034: - Description: The description in yarn-default.xml for yarn.nodemanager.localizer.cache.target-size-mb says that it is a setting per local directory, but according to the code it's a setting for the entire node. (was: The description for yarn.nodemanager.localizer.cache.target-size-mb says that it is a setting per local directory, but according to the code it's a setting for the entire node.) Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect Key: YARN-2034 URL: https://issues.apache.org/jira/browse/YARN-2034 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 0.23.10, 2.4.0 Reporter: Jason Lowe Priority: Minor The description in yarn-default.xml for yarn.nodemanager.localizer.cache.target-size-mb says that it is a setting per local directory, but according to the code it's a setting for the entire node. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2050) Fix LogCLIHelpers to create the correct FileContext
[ https://issues.apache.org/jira/browse/YARN-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996685#comment-13996685 ] Jason Lowe commented on YARN-2050: -- bq. remoteAppLogDir.toUri().getScheme() returns null and AbstractFileSystem.createFileSystem doesn't like it if dumpAllContainersLogs calls FileContext.getFileContext(remoteAppLogDir.toUri()) Argh right, I forgot that FileContext is less-than-helpful in this regard. It needs to be something like this: {code} Path qualifiedLogDir = FileContext.getFileContext(getConf()).makeQualified(remoteAppLogDir); FileContext fc = FileContext.getFileContext(qualifiedLogDir.toUri(), getConf()); nodeFiles = fc.listStatus(qualifiedLogDir); {code} This allows the code to handle cases where the remote log dir has been configured to be a different filesystem than the default filesystem. Fix LogCLIHelpers to create the correct FileContext --- Key: YARN-2050 URL: https://issues.apache.org/jira/browse/YARN-2050 Project: Hadoop YARN Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Attachments: YARN-2050.patch LogCLIHelpers calls FileContext.getFileContext() without any parameters. Thus the FileContext created isn't necessarily the FileContext for remote log. -- This message was sent by Atlassian JIRA (v6.2#6252)
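The null-scheme pitfall behind this fix is easy to reproduce with plain java.net.URI — a simplification of what Path#toUri() returns for an unqualified path, used here only to illustrate why the makeQualified step matters:

```java
import java.net.URI;

public class SchemeDemo {
    public static void main(String[] args) {
        // An unqualified path carries no scheme, which is why handing its URI
        // straight to FileContext.getFileContext(uri) fails: the filesystem
        // cannot be selected until the path has been qualified.
        URI bare = URI.create("/tmp/logs");
        System.out.println(bare.getScheme());      // null

        // After qualifying against the configured filesystem the scheme is
        // explicit, so the matching FileContext can be created.
        URI qualified = URI.create("hdfs://nn.example.com:8020/tmp/logs");
        System.out.println(qualified.getScheme()); // hdfs
    }
}
```

The hostname above is made up for the example; the point is only that qualification turns a scheme-less path into one that names its filesystem.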
[jira] [Commented] (YARN-182) Unnecessary Container killed by the ApplicationMaster message for successful containers
[ https://issues.apache.org/jira/browse/YARN-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995087#comment-13995087 ] Jason Lowe commented on YARN-182: - I don't believe this is related to YARN-903, rather it seems more likely to be related to MAPREDUCE-5465. The MapReduce ApplicationMaster kills tasks as soon as they report success via the umbilical connection, and sometimes that kill arrives before the task exits on its own. In those cases the containers will be marked as killed by the ApplicationMaster. Unnecessary Container killed by the ApplicationMaster message for successful containers - Key: YARN-182 URL: https://issues.apache.org/jira/browse/YARN-182 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.0.1-alpha Reporter: zhengqiu cai Assignee: Omkar Vinit Joshi Labels: hadoop, usability Attachments: Log.txt I was running wordcount and the resourcemanager web UI shown the status as FINISHED SUCCEEDED, but the log shown Container killed by the ApplicationMaster -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2034) Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect
[ https://issues.apache.org/jira/browse/YARN-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993020#comment-13993020 ] Jason Lowe commented on YARN-2034: -- While updating it we may also want to clarify that it is a target retention size that only includes resources with PUBLIC and PRIVATE visibility and excludes resources with APPLICATION visibility. Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect Key: YARN-2034 URL: https://issues.apache.org/jira/browse/YARN-2034 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 0.23.10, 2.4.0 Reporter: Jason Lowe Priority: Minor The description in yarn-default.xml for yarn.nodemanager.localizer.cache.target-size-mb says that it is a setting per local directory, but according to the code it's a setting for the entire node. -- This message was sent by Atlassian JIRA (v6.2#6252)
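For reference, a yarn-site.xml entry for this property might look like the following; the value shown is purely illustrative, not a recommendation:

```xml
<!-- Node-wide retention target for the local resource cache (not per local
     directory). Per the discussion above, it covers only PUBLIC and PRIVATE
     resources; APPLICATION-visibility resources are excluded. -->
<property>
  <name>yarn.nodemanager.localizer.cache.target-size-mb</name>
  <value>10240</value>
</property>
```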
[jira] [Updated] (YARN-1338) Recover localized resource cache state upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1338: - Attachment: YARN-1338v4.patch Updating patch to trunk. Recover localized resource cache state upon nodemanager restart --- Key: YARN-1338 URL: https://issues.apache.org/jira/browse/YARN-1338 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1338.patch, YARN-1338v2.patch, YARN-1338v3-and-YARN-1987.patch, YARN-1338v4.patch Today when node manager restarts we clean up all the distributed cache files from disk. This is definitely not ideal from 2 aspects. * For work preserving restart we definitely want them as running containers are using them * For even non work preserving restart this will be useful in the sense that we don't have to download them again if needed by future tasks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1354) Recover applications upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1354: - Attachment: YARN-1354-v3.patch Updated patch now that YARN-1987 and YARN-1362 have been committed. Recover applications upon nodemanager restart - Key: YARN-1354 URL: https://issues.apache.org/jira/browse/YARN-1354 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1354-v1.patch, YARN-1354-v2-and-YARN-1987-and-YARN-1362.patch, YARN-1354-v3.patch The set of active applications in the nodemanager context need to be recovered for work-preserving nodemanager restart -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1962) Timeline server is enabled by default
[ https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994006#comment-13994006 ] Jason Lowe commented on YARN-1962: -- +1 lgtm. Will commit this early next week to give [~zjshen] a chance to comment. Timeline server is enabled by default - Key: YARN-1962 URL: https://issues.apache.org/jira/browse/YARN-1962 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: YARN-1962.1.patch, YARN-1962.2.patch Since the Timeline server is not yet mature or secured, enabling it by default might create some confusion. We were playing with 2.4.0 and found a lot of exceptions for the distributed shell example related to connection refused errors. Btw, we didn't run the TS because it is not secured yet. Although it is possible to explicitly turn it off through the yarn-site config, in my opinion this extra change for this new service is not worthwhile at this point. This JIRA is to turn it off by default. If there is an agreement, I can put up a simple patch for this. {noformat} 14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response from the timeline server. 
com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149) at com.sun.jersey.api.client.Client.handle(Client.java:648) at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670) at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74) at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563) at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131) at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104) at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072) at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515) at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281) Caused by: java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at java.net.Socket.connect(Socket.java:528) at sun.net.NetworkClient.doConnect(NetworkClient.java:180) at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) at sun.net.www.http.HttpClient.in
{noformat}
[jira] [Updated] (YARN-1339) Recover DeletionService state upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1339: - Attachment: YARN-1339v4.patch Updated patch now that YARN-1987 has been committed. Recover DeletionService state upon nodemanager restart -- Key: YARN-1339 URL: https://issues.apache.org/jira/browse/YARN-1339 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1339.patch, YARN-1339v2.patch, YARN-1339v3-and-YARN-1987.patch, YARN-1339v4.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2079) Recover NonAggregatingLogHandler state upon nodemanager restart
Jason Lowe created YARN-2079: Summary: Recover NonAggregatingLogHandler state upon nodemanager restart Key: YARN-2079 URL: https://issues.apache.org/jira/browse/YARN-2079 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.4.0 Reporter: Jason Lowe The state of NonAggregatingLogHandler needs to be persisted so logs are properly deleted across a nodemanager restart. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2050) Fix LogCLIHelpers to create the correct FileContext
[ https://issues.apache.org/jira/browse/YARN-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14003548#comment-14003548 ] Jason Lowe commented on YARN-2050: -- +1 lgtm. Committing this. Fix LogCLIHelpers to create the correct FileContext --- Key: YARN-2050 URL: https://issues.apache.org/jira/browse/YARN-2050 Project: Hadoop YARN Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Attachments: YARN-2050-2.patch, YARN-2050.patch LogCLIHelpers calls FileContext.getFileContext() without any parameters. Thus the FileContext created isn't necessarily the FileContext for remote log. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1338) Recover localized resource cache state upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1338: - Attachment: YARN-1338v5.patch Thanks for the review, Junping! Attaching a patch to address your comments with specific responses below. bq. beside null store and a leveled store, I saw a memory store implemented there but no usage so far. Does it helps in some scenario or only for test purpose? It's only for use in unit tests which is why it's located under src/test/. It stores state in the memory of the JVM itself, so it's not very useful for real-world recovery scenarios. The state is lost when the NM crashes/exits. bq. Can we abstract code since if block into a method, something like: initializeNMStore(conf)? which can make NodeManager#serviceInit() simpler. Done. bq. Does size here represent for size of local resource? If so, may be duplicated with the size within LocalResourceProto? As I understand it they are slightly different. The size in the LocalResourceProto is the size of the resource that will be downloaded, while the size in LocalizedResource (and also persisted in LocalizedResourceProto) is the size of the resource on the local disk. These can be different if the resource is uncompressed/unarchived after downloading (e.g.: a .tar.gz resource). bq. May be we should check appResourceState(appEntry.getValue)’s localizedResources and inProgressResources is not empty before recover it as we check for userResourceState? Done. I also added a LocalResourceTrackerState#isEmpty method to make the code a bit cleaner. bq. May be even in case tk.appId !=null, we should load private resource state as well? No, if tk.appId is not null then this is state for an app-specific resource tracker and not for a private resource tracker. 
See the javadoc for NMStateStoreService#startResourceLocalization or NMStateStoreService#finishResourceLocalization for some hints, and I also added some comments to the NMMemoryStateStoreService to clarify how the user and appId are used to discern public vs. private vs. app-specific trackers. Recover localized resource cache state upon nodemanager restart --- Key: YARN-1338 URL: https://issues.apache.org/jira/browse/YARN-1338 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1338.patch, YARN-1338v2.patch, YARN-1338v3-and-YARN-1987.patch, YARN-1338v4.patch, YARN-1338v5.patch Today when node manager restarts we clean up all the distributed cache files from disk. This is definitely not ideal from 2 aspects. * For work preserving restart we definitely want them as running containers are using them * For even non work preserving restart this will be useful in the sense that we don't have to download them again if needed by future tasks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1338) Recover localized resource cache state upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1338: - Attachment: YARN-1338v6.patch Thanks for the additional comments, Junping. bq. Do we have any code to destroy DB items for NMState when NM is decommissioned (not expecting short-term restart)? Good point. I added shutdown code that removes the recovery directory if the shutdown is due to a decommission. I also added a unit test for this scenario. {quote} In LocalResourcesTrackerImpl#recoverResource() +incrementFileCountForLocalCacheDirectory(localDir.getParent()); Given localDir is already the parent of localPath, may be we should just increment locaDir rather than its parent? I didn't see we have unit test to check file count for resource directory after recovery. May be we should add some? {quote} The last component of localDir is the unique resource ID and not a directory managed by the local cache directory manager. The directory allocated by the local cache directory manager has an additional directory added by the localization process which is named after the unique ID for the local resource. For example, the localPath might be something like /local/root/0/1/52/resource.jar and localDir is /local/root/0/1/52. The '52' is the unique resource ID (always >= 10 so it can't conflict with single-character cache mgr subdirs) and /local/root/0/1 is the directory managed by the local dir cache manager. If we passed localDir to the local dir cache manager it would get confused, since it would try to parse the last component as a subdirectory it created, which it isn't. I did add a unit test to verify local cache directory counts are incremented properly when resources are recovered. This required exposing a couple of methods as package-private to get the necessary information for the test. 
Recover localized resource cache state upon nodemanager restart --- Key: YARN-1338 URL: https://issues.apache.org/jira/browse/YARN-1338 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1338.patch, YARN-1338v2.patch, YARN-1338v3-and-YARN-1987.patch, YARN-1338v4.patch, YARN-1338v5.patch, YARN-1338v6.patch Today when node manager restarts we clean up all the distributed cache files from disk. This is definitely not ideal from 2 aspects. * For work preserving restart we definitely want them as running containers are using them * For even non work preserving restart this will be useful in the sense that we don't have to download them again if needed by future tasks. -- This message was sent by Atlassian JIRA (v6.2#6252)
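The directory layout described in the comment above can be sketched with plain path operations. This is a minimal illustration using the hypothetical example paths from the comment, not the actual LocalCacheDirectoryManager code:

```java
import java.nio.file.Paths;

public class LocalDirLayout {
    // localPath:   /local/root/0/1/52/resource.jar  (the downloaded file)
    // localDir:    /local/root/0/1/52               (last component = unique resource ID)
    // cacheMgrDir: /local/root/0/1                  (managed by the cache directory manager)
    public static String localDir(String localPath) {
        return Paths.get(localPath).getParent().toString();
    }

    public static String cacheManagedDir(String localPath) {
        // Two levels up: strip the file name, then strip the unique resource ID.
        return Paths.get(localPath).getParent().getParent().toString();
    }

    public static void main(String[] args) {
        String localPath = "/local/root/0/1/52/resource.jar";
        System.out.println(localDir(localPath));
        System.out.println(cacheManagedDir(localPath));
    }
}
```

This is why the recovery code increments the file count for localDir's parent: handing localDir itself to the cache directory manager would make it try to parse "52" as one of its own single-character subdirectories.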
[jira] [Commented] (YARN-1801) NPE in public localizer
[ https://issues.apache.org/jira/browse/YARN-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14011601#comment-14011601 ] Jason Lowe commented on YARN-1801: -- Strictly speaking, the patch does prevent the NPE. However the public localizer is still effectively doomed if this condition occurs because it returns from the run() method. That will shut down the localizer thread and public local resource requests will stop being processed. In that sense we've traded an NPE with a traceback for a one-line log message. I'm not sure this is an improvement, since at least the traceback is easier to notice in the NM log and we get a corresponding fatal log when someone goes hunting for what went wrong with the public localizer. The real issue is we need to understand what happened to cause pending.remove(completed) to return null. This should never happen, and if it does then it means we have a bug. Trying to recover from this condition is patching a symptom rather than a root cause. The problem that led to the null request event _might_ have been fixed by YARN-1575 which wasn't present in 2.2 where the original bug occurred. It would be interesting to know if this has reoccurred since 2.3.0. Assuming this is still a potential issue, we should either find a way to prevent it from ever occurring or recover in a way that keeps the public localizer working as much as possible. It'd be great if we could just pull from the queue and receive a structure that has both the request event and the FuturePath so we don't have to worry about a FuturePath with no associated event. If we're going to try to recover instead, we'd have to log an error and try to clean up. With no associated request event and no path if we got an execution error, it's going to be particularly difficult to recover properly. 
NPE in public localizer --- Key: YARN-1801 URL: https://issues.apache.org/jira/browse/YARN-1801 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Jason Lowe Assignee: Hong Zhiguo Priority: Critical Attachments: YARN-1801.patch While investigating YARN-1800 found this in the NM logs that caused the public localizer to shutdown: {noformat} 2014-01-23 01:26:38,655 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:addResource(651)) - Downloading public rsrc:{ hdfs://colo-2:8020/user/fertrist/oozie-oozi/601-140114233013619-oozie-oozi-W/aggregator--map-reduce/map-reduce-launcher.jar, 1390440382009, FILE, null } 2014-01-23 01:26:38,656 FATAL localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(726)) - Error: Shutting down java.lang.NullPointerException at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:712) 2014-01-23 01:26:38,656 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(728)) - Public cache exiting {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
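The "structure that has both the request event and the FuturePath" idea above can be sketched with a CompletionService whose tasks return a holder pairing the request with its result, so a completed Future always carries its originating request and there is no pending-map lookup that can return null. The Localized holder and String fields below are stand-ins for the real ResourceLocalizationService types, not the actual NM code:

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PairedLocalizer {
    // Hypothetical holder pairing a download request with its resulting path.
    public static final class Localized {
        public final String request; // stands in for the LocalizerResourceRequestEvent
        public final String path;    // stands in for the downloaded Path
        Localized(String request, String path) { this.request = request; this.path = path; }
    }

    public static Localized localizeOne(String request, String path) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CompletionService<Localized> cs = new ExecutorCompletionService<>(pool);
        // The submitted task returns the request together with its result,
        // so take().get() never needs a separate pending map.
        cs.submit(() -> new Localized(request, path));
        try {
            return cs.take().get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Localized done = localizeOne("rsrc-A", "/local/0/rsrc-A.jar");
        System.out.println(done.request + " -> " + done.path);
    }
}
```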
[jira] [Created] (YARN-2114) Inform container of container-specific local directories
Jason Lowe created YARN-2114: Summary: Inform container of container-specific local directories Key: YARN-2114 URL: https://issues.apache.org/jira/browse/YARN-2114 Project: Hadoop YARN Issue Type: Improvement Components: api Affects Versions: 2.5.0 Reporter: Jason Lowe It would be nice if a container could know which local directories it can use for temporary data and those directories will be automatically cleaned up when the container exits. The current working directory is one of those directories, but it's tricky (and potentially not forward-compatible) to determine the other directories to use on a multi-disk node. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2114) Inform container of container-specific local directories
[ https://issues.apache.org/jira/browse/YARN-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14012544#comment-14012544 ] Jason Lowe commented on YARN-2114: -- Currently a container can obtain a list of local directories to use by examining the LOCAL_DIRS environment variable. However these directories have a lifespan that matches the application (i.e.: they will only be deleted when the entire application completes). Therefore if a container writes some temporary data to this directory and the container crashes or it otherwise orphans that data, the data won't be cleaned up when the container completes but rather only when the entire application completes. There are use cases for both: data that survives as long as the application is active and data that only survives as long as the container is active. Given the way YARN works today, a container can take the list of directories from LOCAL_DIRS and tack on the CONTAINER_ID to find these directories. However that might not be forward compatible unless we commit to that always working. It would be cleaner if there was a separate variable, maybe CONTAINER_LOCAL_DIRS, that lists the directories that are container-specific rather than app-specific. Inform container of container-specific local directories Key: YARN-2114 URL: https://issues.apache.org/jira/browse/YARN-2114 Project: Hadoop YARN Issue Type: Improvement Components: api Affects Versions: 2.5.0 Reporter: Jason Lowe It would be nice if a container could know which local directories it can use for temporary data and those directories will be automatically cleaned up when the container exits. The current working directory is one of those directories, but it's tricky (and potentially not forward-compatible) to determine the other directories to use on a multi-disk node. -- This message was sent by Atlassian JIRA (v6.2#6252)
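The workaround described above (taking LOCAL_DIRS and tacking on the CONTAINER_ID) amounts to something like the following sketch. The directory and container ID values are made up for illustration, and as the comment notes, this derivation is not guaranteed to stay forward-compatible; a dedicated CONTAINER_LOCAL_DIRS variable would avoid it:

```java
public class ContainerDirs {
    // Derive per-container scratch dirs by appending the container ID to each
    // comma-separated entry of the LOCAL_DIRS environment variable.
    public static String[] containerDirs(String localDirs, String containerId) {
        String[] dirs = localDirs.split(",");
        String[] out = new String[dirs.length];
        for (int i = 0; i < dirs.length; i++) {
            out[i] = dirs[i] + "/" + containerId;
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical values; in a real container these come from the environment.
        String localDirs = "/disk1/yarn/usercache/alice/appcache/application_1_0001,"
                         + "/disk2/yarn/usercache/alice/appcache/application_1_0001";
        for (String d : containerDirs(localDirs, "container_1_0001_01_000002")) {
            System.out.println(d);
        }
    }
}
```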
[jira] [Commented] (YARN-1424) RMAppAttemptImpl should precompute a zeroed ApplicationResourceUsageReport to return when attempt not active
[ https://issues.apache.org/jira/browse/YARN-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025314#comment-14025314 ] Jason Lowe commented on YARN-1424: -- -1 was chosen to explicitly mark the field as having an invalid value to note that the true value is unknown or inaccessible. For example, having no reserved containers is a valid state, so how would a client distinguish between an application having no reservations and the case where we cannot report the proper value? Similarly unmanaged AMs don't have resource utilization on the cluster for the AM itself, so an unmanaged AM without any other containers could show up with zero resource consumption. I could see it either way -- either clients will blindly sum and assume they won't need to differentiate apps they can't see details from apps they can, or clients will want to differentiate and then we need a way to convey that. If we do change to using zeroes then technically this would be an incompatible change since we would no longer be flagging unreportable/inaccessible values via the -1 value. RMAppAttemptImpl should precompute a zeroed ApplicationResourceUsageReport to return when attempt not active Key: YARN-1424 URL: https://issues.apache.org/jira/browse/YARN-1424 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.4.0 Reporter: Sandy Ryza Assignee: Ray Chiang Priority: Minor Labels: newbie Attachments: YARN1424-01.patch, YARN1424-02.patch RMAppImpl has a DUMMY_APPLICATION_RESOURCE_USAGE_REPORT to return when the caller of createAndGetApplicationReport doesn't have access. RMAppAttemptImpl should have something similar for getApplicationResourceUsageReport. It also might make sense to put the dummy report into ApplicationResourceUsageReport and allow both to use it. 
A test would also be useful to verify that RMAppAttemptImpl#getApplicationResourceUsageReport doesn't return null if the scheduler doesn't have a report to return. -- This message was sent by Atlassian JIRA (v6.2#6252)
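The ambiguity discussed above can be made concrete: with -1 as the sentinel, a client summing usage across apps must skip the unknown values, whereas a zeroed report would be indistinguishable from "no reservations". A minimal sketch, with a hypothetical field layout rather than the real ApplicationResourceUsageReport API:

```java
public class UsageReportCheck {
    // Sum only the values a client is allowed to see; -1 marks a value as
    // unknown/inaccessible (e.g. no access to the app, or unmanaged AM).
    public static long sumKnown(long[] reservedMB) {
        long total = 0;
        for (long v : reservedMB) {
            if (v >= 0) total += v; // skip -1 sentinel values
        }
        return total;
    }

    public static void main(String[] args) {
        // 0 is a legitimate "no reservations" value; -1 means "can't report".
        System.out.println(sumKnown(new long[] {1024, -1, 0, 2048}));
    }
}
```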
[jira] [Updated] (YARN-1339) Recover DeletionService state upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1339: - Attachment: YARN-1339v5.patch Refreshed patch to trunk. Recover DeletionService state upon nodemanager restart -- Key: YARN-1339 URL: https://issues.apache.org/jira/browse/YARN-1339 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1339.patch, YARN-1339v2.patch, YARN-1339v3-and-YARN-1987.patch, YARN-1339v4.patch, YARN-1339v5.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1339) Recover DeletionService state upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1339: - Attachment: YARN-1339v6.patch Thanks for the review, Junping! bq. Shall we add if (stateStore.canRecover()) so that we only do record work when stateStore in levelDB? Fixed. I added an early-out for this in the recordDeletionTaskInStateStore method. bq. Does it necessary for turning set into an array here? I only see iteratoring elements later. If so, it could be simpler to leave it as set. Theoretically yes. We need to return a copy of the successors because the set is protected by synchronization on the FileDeletionTask. If we return it directly then we could corrupt it via concurrency. In practice this probably doesn't occur, but for maintenance's sake I played it on the safe side. I don't think this is going to be a performance problem because deletion tasks typically don't have very many successors. I suppose we could use a ConcurrentMap instead of a HashSet and expose direct access to it, but that seemed like overkill -- there might be a lot of deletion tasks lingering around and concurrent maps are more memory-intensive. Or we could use a SynchronizedSet and make sure the iterating code locks it appropriately. Let me know if something needs to change here. bq. May be we should quickly return from recover method if state is empty for cases that NM get first-time start or no scheduled fileDeletionTask during NM restart. The method is already going to be pretty quick if there aren't any deletion tasks recovered. In that case all it will do is allocate a hash map and hash set, iterate over the empty list of tasks, iterate over the empty hash map and return. Given the recover method is only called on startup once and isn't that expensive in the nothing-to-do case, an early out seems like an unnecessary optimization -- or am I missing something? bq. I think this is the same with basePaths.add(new Path(basedir)). 
Also I think some path.toUri().toString() can be replaced with path.toString() directly. Fixed. I vaguely recall having some issues using the raw paths on an early prototype of the code, but it seems to be working fine without explicit URI usage. bq. May be we should throw exceptions in all three methods to keep consistent? The point of the null state store is to silently eat attempts to store. It only throws for recovery methods to note to developers that there is no way to recover from this state store. If we put exceptions in all the store methods then we'll have to sprinkle canRecover() checks throughout the code which I'd rather avoid if possible. NullRMStateStore behaves the same way as do the other store methods in the null state store, so in that sense it's consistent. Recover DeletionService state upon nodemanager restart -- Key: YARN-1339 URL: https://issues.apache.org/jira/browse/YARN-1339 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1339.patch, YARN-1339v2.patch, YARN-1339v3-and-YARN-1987.patch, YARN-1339v4.patch, YARN-1339v5.patch, YARN-1339v6.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
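The copy-under-lock design choice discussed above (returning an array snapshot of the successor set rather than the live set) can be sketched as follows. The class and method names are illustrative stand-ins, not the actual FileDeletionTask code:

```java
import java.util.HashSet;
import java.util.Set;

public class FileDeletionTaskSketch {
    // The successor set is guarded by this task's monitor.
    private final Set<String> successors = new HashSet<>();

    public synchronized void addSuccessor(String task) {
        successors.add(task);
    }

    // Copy while holding the lock; callers can then iterate the snapshot
    // without synchronizing, and cannot corrupt the live set.
    public synchronized String[] getSuccessorTasks() {
        return successors.toArray(new String[0]);
    }

    public static void main(String[] args) {
        FileDeletionTaskSketch t = new FileDeletionTaskSketch();
        t.addSuccessor("task-2");
        String[] snapshot = t.getSuccessorTasks();
        t.addSuccessor("task-3"); // does not affect the earlier snapshot
        System.out.println(snapshot.length);
    }
}
```

Since deletion tasks typically have few successors, the extra copy is cheap compared to the memory overhead of a concurrent collection per task.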
[jira] [Created] (YARN-2147) client lacks delegation token exception details when application submit fails
Jason Lowe created YARN-2147: Summary: client lacks delegation token exception details when application submit fails Key: YARN-2147 URL: https://issues.apache.org/jira/browse/YARN-2147 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Jason Lowe Priority: Minor When a client submits an application and the delegation token process fails, the client can lack critical details needed to understand the nature of the error. Only the message of the error exception is conveyed to the client, which sometimes isn't enough to debug. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2147) client lacks delegation token exception details when application submit fails
[ https://issues.apache.org/jira/browse/YARN-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14027979#comment-14027979 ] Jason Lowe commented on YARN-2147: -- For example, here's a sample log from a client submitting a job that failed: {noformat} 2014-05-14 10:36:16,111 [JobControl] INFO org.apache.hadoop.mapred.ResourceMgrDelegate - Submitted application application_1394826486018_9924515 to ResourceManager at xx/xx:xx 2014-05-14 10:36:16,116 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Cleaning up the staging area /user/xx/.staging/job_1394826486018_9924515 2014-05-14 10:36:16,117 [JobControl] ERROR org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:xx (auth:SIMPLE) cause:java.io.IOException: Failed to run job : Read timed out 2014-05-14 10:36:16,118 [JobControl] INFO org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - xx got an error while submitting java.io.IOException: Failed to run job : Read timed out at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:410) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1284) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at 
org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128) at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:191) {noformat} All the user sees is a read timeout but no details as to where it was connecting or what service was involved. Was this a timeout connecting to the RM? A timeout on the RM side? Something else entirely? Hard to tell from just "Read timed out". Looking at the exception logged at the RM side, the full stacktrace shows that it was timing out trying to grab a delegation token from a remote server for webhdfs. Those kinds of details need to be conveyed back to the client, either via the full stacktrace from the RM exception or via a more informative exception message when delegation token renewal fails during app submission. client lacks delegation token exception details when application submit fails - Key: YARN-2147 URL: https://issues.apache.org/jira/browse/YARN-2147 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Jason Lowe Priority: Minor When a client submits an application and the delegation token process fails, the client can lack critical details needed to understand the nature of the error. Only the message of the error exception is conveyed to the client, which sometimes isn't enough to debug. -- This message was sent by Atlassian JIRA (v6.2#6252)
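One way to convey those details, as suggested above, is to wrap the original failure so the message carries the remote service context instead of a bare "Read timed out". This is a hedged sketch only; the helper name and service URI are hypothetical, not the actual RM code:

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

public class TokenRenewalError {
    // Wrap the underlying exception when delegation token renewal fails during
    // app submission, preserving both the service identity and the root cause.
    public static IOException describe(String service, Exception cause) {
        return new IOException(
            "Failed to renew delegation token for " + service + ": " + cause, cause);
    }

    public static void main(String[] args) {
        IOException e = describe("webhdfs://nn.example.com:50070",
                new SocketTimeoutException("Read timed out"));
        // The client-facing message now names the service and the cause,
        // and the chained cause retains the full remote stacktrace.
        System.out.println(e.getMessage());
    }
}
```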
[jira] [Updated] (YARN-853) maximum-am-resource-percent doesn't work after refreshQueues command
[ https://issues.apache.org/jira/browse/YARN-853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-853: Fix Version/s: 0.23.11 Thanks, Devaraj! I committed this to branch-0.23 as well. maximum-am-resource-percent doesn't work after refreshQueues command Key: YARN-853 URL: https://issues.apache.org/jira/browse/YARN-853 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.1.0-beta, 2.0.5-alpha Reporter: Devaraj K Assignee: Devaraj K Fix For: 2.1.0-beta, 0.23.11 Attachments: YARN-853-1.patch, YARN-853-2.patch, YARN-853-3.patch, YARN-853-4.patch, YARN-853.patch If we update yarn.scheduler.capacity.maximum-am-resource-percent / yarn.scheduler.capacity.queue-path.maximum-am-resource-percent configuration and then do the refreshNodes, it uses the new config value to calculate Max Active Applications and Max Active Application Per User. If we add new node after issuing 'rmadmin -refreshQueues' command, it uses the old maximum-am-resource-percent config value to calculate Max Active Applications and Max Active Application Per User. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2167) LeveldbIterator should get closed in NMLeveldbStateStoreService#loadLocalizationState() within finally block
[ https://issues.apache.org/jira/browse/YARN-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033127#comment-14033127 ] Jason Lowe commented on YARN-2167: -- +1 pending Jenkins. LeveldbIterator should get closed in NMLeveldbStateStoreService#loadLocalizationState() within finally block Key: YARN-2167 URL: https://issues.apache.org/jira/browse/YARN-2167 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Junping Du Assignee: Junping Du Attachments: YARN-2167.patch In NMLeveldbStateStoreService#loadLocalizationState(), we have a LeveldbIterator to read the NM's localization state, but it does not get closed in a finally block. We should close this connection to the DB as a common practice. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2167) LeveldbIterator should get closed in NMLeveldbStateStoreService#loadLocalizationState() within finally block
[ https://issues.apache.org/jira/browse/YARN-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033328#comment-14033328 ] Jason Lowe commented on YARN-2167: -- +1 lgtm. Committing this. LeveldbIterator should get closed in NMLeveldbStateStoreService#loadLocalizationState() within finally block Key: YARN-2167 URL: https://issues.apache.org/jira/browse/YARN-2167 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Junping Du Assignee: Junping Du Attachments: YARN-2167.patch In NMLeveldbStateStoreService#loadLocalizationState(), we have a LeveldbIterator to read the NM's localization state, but it does not get closed in a finally block. We should close this connection to the DB as a common practice. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()
Jason Lowe created YARN-2171: Summary: AMs block on the CapacityScheduler lock during allocate() Key: YARN-2171 URL: https://issues.apache.org/jira/browse/YARN-2171 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.4.0, 0.23.10 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical When AMs heartbeat into the RM via the allocate() call they are blocking on the CapacityScheduler lock when trying to get the number of nodes in the cluster via getNumClusterNodes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()
[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033864#comment-14033864 ] Jason Lowe commented on YARN-2171: -- When the CapacityScheduler scheduler thread is running full-time due to a constant stream of events (e.g.: large number of running applications with a large number of cluster nodes) then the CapacityScheduler lock is held by that scheduler loop most of the time. As AMs heartbeat into the RM to try to get their resources, the capacity scheduler code goes out of its way to try to avoid having the AMs grab the scheduler lock. Unfortunately this code path was missed, and fetching this one integer value still grabs the lock. Therefore the AMs end up piling up on the scheduler lock, filling all of the IPC handlers of the ApplicationMasterService while the rest back up on the call queue. Once the scheduler releases the lock it will quickly try to grab it again, so only a few AMs end up getting through the gate and the IPC handlers fill again with the next batch of AMs blocking on the scheduler lock. This causes the average RPC response times to skyrocket for AMs. AMs experience large delays getting their allocations which in turn leads to lower cluster utilization and increased application runtimes. AMs block on the CapacityScheduler lock during allocate() - Key: YARN-2171 URL: https://issues.apache.org/jira/browse/YARN-2171 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 0.23.10, 2.4.0 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical When AMs heartbeat into the RM via the allocate() call they are blocking on the CapacityScheduler lock when trying to get the number of nodes in the cluster via getNumClusterNodes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()
[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-2171: - Attachment: YARN-2171.patch Patch to use AtomicInteger for the number of nodes so we can avoid grabbing the lock to access the value. I also added a unit test to verify allocate doesn't try to grab the capacity scheduler lock. AMs block on the CapacityScheduler lock during allocate() - Key: YARN-2171 URL: https://issues.apache.org/jira/browse/YARN-2171 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 0.23.10, 2.4.0 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Attachments: YARN-2171.patch When AMs heartbeat into the RM via the allocate() call they are blocking on the CapacityScheduler lock when trying to get the number of nodes in the cluster via getNumClusterNodes. -- This message was sent by Atlassian JIRA (v6.2#6252)
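The AtomicInteger approach in the patch can be sketched as follows: node add/remove still happens under the scheduler lock, but reading the count does not. Class and method names below are illustrative, not the actual CapacityScheduler API:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SchedulerNodeCount {
    // Node count kept in an AtomicInteger so heartbeat threads can read it
    // without contending for the scheduler lock.
    private final AtomicInteger numNodes = new AtomicInteger(0);

    // Node membership changes still occur inside the scheduler's critical section.
    public synchronized void addNode()    { numNodes.incrementAndGet(); }
    public synchronized void removeNode() { numNodes.decrementAndGet(); }

    // Called from AM allocate() heartbeats; deliberately NOT synchronized.
    public int getNumClusterNodes() { return numNodes.get(); }

    public static void main(String[] args) {
        SchedulerNodeCount s = new SchedulerNodeCount();
        s.addNode();
        s.addNode();
        System.out.println(s.getNumClusterNodes());
    }
}
```

The value read may be momentarily stale relative to an in-flight node event, which is an acceptable trade-off for keeping AM heartbeats off the scheduler lock.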
[jira] [Updated] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-365: Attachment: YARN-365.branch-0.23.patch Patch for branch-0.23. RM unit tests pass, and I manually tested it as well on a single-node cluster forcing the scheduler to run slower than the heartbeat interval. Each NM heartbeat should not generate an event for the Scheduler Key: YARN-365 URL: https://issues.apache.org/jira/browse/YARN-365 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager, scheduler Affects Versions: 0.23.5 Reporter: Siddharth Seth Assignee: Xuan Gong Fix For: 2.1.0-beta Attachments: Prototype2.txt, Prototype3.txt, YARN-365.1.patch, YARN-365.10.patch, YARN-365.2.patch, YARN-365.3.patch, YARN-365.4.patch, YARN-365.5.patch, YARN-365.6.patch, YARN-365.7.patch, YARN-365.8.patch, YARN-365.9.patch, YARN-365.branch-0.23.patch Follow up from YARN-275 https://issues.apache.org/jira/secure/attachment/12567075/Prototype.txt -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()
[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-2171: - Attachment: YARN-2171v2.patch The point of the unit test was to catch regressions at a high level. If anyone changes the code such that calling allocate() will grab the scheduler lock then the test will fail, whether that's a regression in this particular method or some new method that's added that ApplicationMasterService or CapacityScheduler itself calls and grabs the lock. I added a separate unit test to exercise the getNumClusterNodes method. The AHS unit test failure seems unrelated, and it passes for me locally even with this change. AMs block on the CapacityScheduler lock during allocate() - Key: YARN-2171 URL: https://issues.apache.org/jira/browse/YARN-2171 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 0.23.10, 2.4.0 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Attachments: YARN-2171.patch, YARN-2171v2.patch When AMs heartbeat into the RM via the allocate() call they are blocking on the CapacityScheduler lock when trying to get the number of nodes in the cluster via getNumClusterNodes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps
Jason Lowe created YARN-2176: Summary: CapacityScheduler loops over all running applications rather than actively requesting apps Key: YARN-2176 URL: https://issues.apache.org/jira/browse/YARN-2176 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Affects Versions: 2.4.0 Reporter: Jason Lowe The capacity scheduler performance is primarily dominated by LeafQueue.assignContainers, and that currently loops over all applications that are running in the queue. It would be more efficient if we looped over just the applications that are actively asking for resources rather than all applications, as there could be thousands of applications running but only a few hundred that are currently asking for resources. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1341: - Attachment: YARN-1341v5.patch Thanks for taking a look, Junping! I've updated the patch to trunk. Recover NMTokens upon nodemanager restart - Key: YARN-1341 URL: https://issues.apache.org/jira/browse/YARN-1341 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1341.patch, YARN-1341v2.patch, YARN-1341v3.patch, YARN-1341v4-and-YARN-1987.patch, YARN-1341v5.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps
[ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14035799#comment-14035799 ] Jason Lowe commented on YARN-2176: -- AppSchedulingInfo is already determining when an app is actively requesting to be able to update the QueueMetrics.activeApplications metric. (It's confusing that LeafQueue also has an activeApplications collection which is actually the applications running not just the ones requesting.) It would be nice to leverage the work already being done by AppSchedulingInfo, which is currently calling the ActiveUsersManager activateApplication and deactivateApplication methods when necessary. CapacityScheduler could potentially have a derived ActiveUsersManager class that in addition notifies the LeafQueue so the queue can track apps requesting and apps not requesting separately. To preserve allocation semantics we'd have to track the original order of the applications so activating an application inserts it into the list of requesting applications in the same relative order to other requesting applications regardless of how many times it's been activated or deactivated. CapacityScheduler loops over all running applications rather than actively requesting apps -- Key: YARN-2176 URL: https://issues.apache.org/jira/browse/YARN-2176 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Affects Versions: 2.4.0 Reporter: Jason Lowe The capacity scheduler performance is primarily dominated by LeafQueue.assignContainers, and that currently loops over all applications that are running in the queue. It would be more efficient if we looped over just the applications that are actively asking for resources rather than all applications, as there could be thousands of applications running but only a few hundred that are currently asking for resources. -- This message was sent by Atlassian JIRA (v6.2#6252)
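The ordering requirement described above (re-activating an app must restore its original relative position among requesting applications) can be sketched by remembering each app's admission index. This is a hypothetical data-structure sketch, not the LeafQueue implementation:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RequestingApps {
    private final Map<String, Integer> admissionOrder = new HashMap<>();
    // Requesting apps kept sorted by original admission index, so activation
    // and deactivation never perturb the relative allocation order.
    private final TreeMap<Integer, String> requesting = new TreeMap<>();
    private int nextIndex = 0;

    public void addApp(String appId) {
        admissionOrder.put(appId, nextIndex++);
        requesting.put(admissionOrder.get(appId), appId);
    }

    public void deactivate(String appId) { requesting.remove(admissionOrder.get(appId)); }

    public void activate(String appId) { requesting.put(admissionOrder.get(appId), appId); }

    // Only these apps would be scanned by assignContainers.
    public List<String> requestingApps() { return new ArrayList<>(requesting.values()); }

    public static void main(String[] args) {
        RequestingApps q = new RequestingApps();
        q.addApp("app1"); q.addApp("app2"); q.addApp("app3");
        q.deactivate("app2");
        q.activate("app2"); // returns to its original slot between app1 and app3
        System.out.println(q.requestingApps());
    }
}
```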
[jira] [Commented] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps
[ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14035905#comment-14035905 ] Jason Lowe commented on YARN-2176: -- That proposal would work for the deactivate path, but how does it work for the activate case? If the queue is not normally looping over the deactivated apps then it is not going to call hasPendingRequests() on them and we won't ever add it back to the list of active apps to iterate. If we do always loop over the deactivated apps to call this then that sorta defeats a large portion of the optimization. There needs to be more than a predicate function for the leaf queue to call, unless I'm missing something.
[jira] [Commented] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps
[ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14036093#comment-14036093 ] Jason Lowe commented on YARN-2176: -- Sure, that works if we think that's cleaner. It's a little weird that AppSchedulingInfo is already calling back into an object obtained from the queue to notify of app activation state (i.e.: the ActiveUsersManager instance) and then we'd register a second object from the same queue to receive the same events. IMHO it'd be nice to not have two separate paths to tell the queue about the same thing.
[jira] [Commented] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps
[ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14036451#comment-14036451 ] Jason Lowe commented on YARN-2176: -- ActiveUsersManager doesn't have a reference to the leaf queue today, but it's created by the leaf queue, specific to the leaf queue, and therefore trivial for it to have one if necessary. The end result is effectively the same: AppSchedulingInfo -> ActiveUsersManager -> LeafQueue is not that different from AppSchedulingInfo -> LeafQueue -> ActiveUsersManager as far as hops go. In many ways ActiveUsersManager is already a callback object to the queue. It's a queue-specific object, created by the queue, that is used to do three things: # Notify that an application is actively requesting resources via the activateApplication method # Notify that an application is no longer requesting via the deactivateApplication method # Obtain the current number of active users in the queue, which is really close to the interface we need. I'm not sure why ActiveUsersManager's methods aren't just part of the Queue interface rather than a separate object. The fact that it's tracked in a separate object internally should be an implementation detail of the queue. I originally proposed the ActiveUsersManager override because it would be a cleaner implementation in terms of entities that would need to be modified. AppSchedulingInfo, ActiveUsersManager, Queue, and all the stuff outside of CapacityScheduler all would remain unchanged, and the implementation is localized to the scheduler that needs it. (Actually I think it's localized just to LeafQueue within the CapacityScheduler itself as well.) I'm not excited about the callback approach since it's yet-another-interface and queues have to remember to register or it doesn't work correctly. I'd rather it be more straightforward, where AppSchedulingInfo calls the queue directly to notify it.
No init-time callback registration necessary and more straightforward to understand. But we can't change Queue without breaking compatibility (yay interfaces), so that leaves us with either the original proposal (i.e.: leverage ActiveUsersManager as the callback interface), doing a callback registration approach, or some RTTI-like approach (i.e.: deriving a new interface from Queue and having AppSchedulingInfo check if the queue is really an instance of that, sorta like how PreemptableResourceScheduler is handled today for the scheduler interface). If we do go with the callback approach then we can't have the leaf queue register on behalf of the ActiveUsersManager or we risk breaking backwards compatibility. Currently AppSchedulingInfo is expected to update ActiveUsersManager directly, and if we change it to no longer do that but expect a callback to be registered instead, existing queues that fail to register the callback (because they weren't updated along with the change) will fail to get their ActiveUsersManager object updated. Therefore I think we're stuck with AppSchedulingInfo always updating ActiveUsersManager or at best ActiveUsersManager registering the callback itself separately from the queue.
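The RTTI-like option mentioned above could look roughly like the following sketch. The interface and method names here are invented for illustration (the real PreemptableResourceScheduler check is analogous, but applies to schedulers rather than queues):

```java
// Hypothetical sketch: a derived interface that AppSchedulingInfo probes
// with instanceof, so queue implementations that predate it keep working
// unchanged and nothing has to remember to register a callback.
interface Queue {
    String getQueueName();
}

// Invented name; extends Queue so existing implementations are unaffected.
interface ApplicationActivationListener extends Queue {
    void applicationActivated(String appId);
    void applicationDeactivated(String appId);
}

class AppSchedulingInfoSketch {
    private final Queue queue;

    AppSchedulingInfoSketch(Queue queue) {
        this.queue = queue;
    }

    void markActive(String appId) {
        // Legacy queues simply don't receive the notification; there is
        // no registration step to forget and no compatibility break.
        if (queue instanceof ApplicationActivationListener) {
            ((ApplicationActivationListener) queue).applicationActivated(appId);
        }
    }
}
```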
[jira] [Updated] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1341: - Attachment: YARN-1341v6.patch Thanks for reviewing, Junping! bq. The change in BaseContainerTokenSecretManager.java is not necessary and I believe that belongs to YARN-1342. Good catch, removed. bq. Can we consolidate the code in a separated method together with NMContainerTokenSecretManager as we will do similar thing to recover ContainerToken staff which make code have duplicated things? I'm not sure I understand what you're requesting. Recovering the NM tokens is one line of code (3 if we count the if canRecover part), and recovering the container tokens in YARN-1342 will add one more line for that (inside the same if canRecover block). I went ahead and factored this into a separate method, however I'm not sure it matches what you were expecting as I don't see where we're saving duplicated code. If what's in the updated patch isn't what you expected, please provide some sample pseudo-code to demonstrate how we can avoid duplication of code. bq. Does log error here is just enough in case of failure in store? If Master key is updated but not persistent, then it could cause some inconsistency when recover it. I think we should throw some exception here if store get failed and rollback the key just set. The problem with throwing an exception is what to do with the exception -- do we take down the NM? That seems like a drastic answer since the NM will likely chug along just fine without the key stored. It only becomes a problem when the NM restarts and restores an old key. However if we rollback the old key here then we take that only-breaks-if-we-happened-to-restart case and make it an always-breaks scenario. Eventually the old key will no longer be valid to the RM, and none of the AMs will be able to authenticate to the NM. 
Therefore I thought it would be better to log the error, press onward, and hope we don't restart before we store a valid key again (maybe store error was transient) rather than either take down the NM or have things start failing even without a restart. Recover NMTokens upon nodemanager restart - Key: YARN-1341 URL: https://issues.apache.org/jira/browse/YARN-1341 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1341.patch, YARN-1341v2.patch, YARN-1341v3.patch, YARN-1341v4-and-YARN-1987.patch, YARN-1341v5.patch, YARN-1341v6.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
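The "log the error and press onward" handling argued for in the comment above might look like this minimal sketch. The store interface and names are placeholders, not the actual NM state store API:

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: always apply the key update in memory, and treat a
// state-store failure as non-fatal so the NM keeps serving the RM and AMs.
// The stored copy only matters if the NM restarts before the next
// successful store, and the store error may well be transient.
class MasterKeyUpdater {
    private static final Logger LOG = Logger.getLogger("NMTokenSecretManager");

    interface KeyStore {  // stand-in for the NM state store
        void storeCurrentMasterKey(byte[] key) throws IOException;
    }

    private final KeyStore store;
    private byte[] currentKey;

    MasterKeyUpdater(KeyStore store) {
        this.store = store;
    }

    void setMasterKey(byte[] newKey) {
        currentKey = newKey;  // update in memory first, unconditionally
        try {
            store.storeCurrentMasterKey(newKey);
        } catch (IOException e) {
            // Non-fatal: log and continue rather than roll back the key
            // (which would eventually break all AM authentication) or
            // tear down the NM (which guarantees lost work).
            LOG.log(Level.SEVERE, "Unable to persist NM token master key", e);
        }
    }

    byte[] getMasterKey() {
        return currentKey;
    }
}
```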
[jira] [Commented] (YARN-2175) Container localization has no timeouts and tasks can be stuck there for a long time
[ https://issues.apache.org/jira/browse/YARN-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14037337#comment-14037337 ] Jason Lowe commented on YARN-2175: -- I also wonder if there's been a regression, since at least in 0.23 containers that are localizing can be killed by the ApplicationMaster. The MR AM does this when mapreduce.task.timeout triggers a kill of a task due to lack of progress. The MR AM kills the container and that, in turn, causes the localizer to die because the NM tells the localizer to DIE during its next heartbeat. Although if the localizer gets stuck and stops heartbeating and the NM lost track of it due to the container kill then it seems like we could leak a hung localizer process. Container localization has no timeouts and tasks can be stuck there for a long time --- Key: YARN-2175 URL: https://issues.apache.org/jira/browse/YARN-2175 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.4.0 Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot There are no timeouts that can be used to limit the time taken by various container startup operations. Localization for example could take a long time and there is no way to kill an task if its stuck in these states. These may have nothing to do with the task itself and could be an issue within the platform. Ideally there should be configurable limits for various states within the NodeManager to limit various states. The RM does not care about most of these and its only between AM and the NM. We can start by making these global configurable defaults and in future we can make it fancier by letting AM override them in the start container request. This jira will be used to limit localization time and we open others if we feel we need to limit other operations. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039138#comment-14039138 ] Jason Lowe commented on YARN-1341: -- bq. The worst case seems to me is: NM restart with partial state recovered, this inconsistent state is not aware by running containers which could bring some weird bugs. Yes, you're correct. The worst-case is likely where we come up, fail to realize a container is running, and therefore the container leaks. I think we should handle store errors on a case-by-case basis, based on the ramifications of how the system will recover without that information. For containers, a container should fail to launch if state store errors occur to mark the container request and/or mark the container launch. The YARN-1336 prototype patch already does this when containers are requested and in ContainerLaunch. That way the worst-case scenario for a container is that we throw an error for the container request or the container fails before launch due to a state store error. We failed to launch a container, but the whole NM doesn't go down. If we fail to mark the container completed in the store then worst-case scenario is that we try to recover a container that isn't there, which again will mark the container as failed and we'll report that to the RM. If the RM doesn't know about the failed container (because the container/app is long gone) then it will just ignore it. For deletion service, if we fail to update the store then we may fail to delete something when we recover if we happened to restart in-between. If we ignore the error then it's very likely the NM will _not_ restart before the deletion time expires and the file is deleted. However if we tear down the NM on a store error then we will also fail to delete it when the NM restarts later since we failed to record it, meaning we made things purely worse -- we lost work _and_ leaked the thing we were supposed to delete. 
Therefore for deletion tasks I think the current behavior is appropriate. For localized resources failing to update the store means we could end up leaking a resource or thinking a resource is there when it's really not. The latter isn't a huge problem because when we try to reference the resource again it checks if it's there, and if it isn't it re-localizes it again. Not knowing a resource is there is a bigger issue, and there's a couple of ways to tackle that one -- either fail the localization of the resource when the state store error occurs or have the NM scan the local resource directories for unknown resources when it recovers. For the RM master key, I see it very similar to the deletion task case. If we fail to store it then the NM will update it in memory, and can keep going. If we restart without recovering an older key (the current key will be obtained when the NM re-registers with the RM) then we may fail to let AMs connect that only have an older key. Containers that were still on the NM will still continue. If we take down the NM when the store hiccups then we lose work which seems worse than a possibility the AM could fail to connect to the NM (which can and does already happen today due to network cuts, etc.) bq. May be we add some stale tag on NMStateStore and mark this when store failure happens and never load a staled store. If we had an error storing then we're likely to have the same error trying to store a stale tag, or am I misunderstanding the proposal? Also as I mentioned above, there are many cases where a partial recovery isn't a bad thing as the system can recover via other means (e.g.: trying to recover a container that already completed should be benign, trying to delete a container directory that's already deleted is benign, etc.). 
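The case-by-case policy laid out above (container bookkeeping errors are fatal to that container's launch, while deletion-task bookkeeping is best-effort) could be sketched as follows; the store interface and method names are illustrative only, not the real NMStateStoreService:

```java
import java.io.IOException;

// Hypothetical sketch of per-case store-error handling in the NM.
class RecoveryPolicySketch {
    interface StateStore {  // stand-in for the NM state store
        void storeContainer(String containerId) throws IOException;
        void storeDeletionTask(String path) throws IOException;
    }

    private final StateStore store;

    RecoveryPolicySketch(StateStore store) {
        this.store = store;
    }

    // Propagate: better to fail one container launch than to restart
    // later, not know the container exists, and leak it.
    void recordContainerStart(String containerId) throws IOException {
        store.storeContainer(containerId);
    }

    // Swallow: an unrecorded deletion only matters if the NM restarts
    // before the deletion fires, whereas failing the whole NM here would
    // lose work and still leak the thing we were supposed to delete.
    void recordDeletionTask(String path) {
        try {
            store.storeDeletionTask(path);
        } catch (IOException e) {
            System.err.println("Ignoring failure to record deletion of "
                + path + ": " + e);
        }
    }
}
```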
[jira] [Commented] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps
[ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039143#comment-14039143 ] Jason Lowe commented on YARN-2176: -- Ah, yes. AppSchedulingInfo should only be created by the built-in schedulers, so we can just have that expect the new Queue interface that has the activate/deactivate app methods. While we're at it we can remove the knowledge of ActiveUsersManager from AppSchedulingInfo and just have the queues update their own ActiveUsersManager instances when their activate/deactivate methods are called. That will streamline the AppSchedulingInfo code a bit.
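The refactor described above (queues expose activate/deactivate methods and forward to their own ActiveUsersManager, so AppSchedulingInfo no longer touches it directly) could be sketched as below; all names here are stand-ins invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the queue interface gains activate/deactivate
// methods and each queue updates its own active-users bookkeeping.
interface SchedulerQueue {
    void activateApplication(String user, String appId);
    void deactivateApplication(String user, String appId);
}

class ActiveUsersCounter {  // stand-in for ActiveUsersManager
    private final Map<String, Integer> appsPerUser = new HashMap<>();

    void activate(String user) {
        appsPerUser.merge(user, 1, Integer::sum);
    }

    void deactivate(String user) {
        // drop the user once their last requesting app deactivates
        appsPerUser.computeIfPresent(user, (u, n) -> n > 1 ? n - 1 : null);
    }

    int numActiveUsers() {
        return appsPerUser.size();
    }
}

class LeafQueueSketch implements SchedulerQueue {
    private final ActiveUsersCounter activeUsers = new ActiveUsersCounter();

    public void activateApplication(String user, String appId) {
        activeUsers.activate(user);
        // ...would also move appId into this queue's requesting-apps list
    }

    public void deactivateApplication(String user, String appId) {
        activeUsers.deactivate(user);
    }

    int numActiveUsers() {
        return activeUsers.numActiveUsers();
    }
}
```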
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039235#comment-14039235 ] Jason Lowe commented on YARN-1341: -- bq. Application state - If we failed to store the application update, i.e. from init to finish, then we get wrong state on application after recovery. Yes, applications should be like containers. If we fail to store an application start in the state store then we should fail the container launch that triggered the application to be added. This already happens in the current patch for YARN-1354. If we fail to store the completion of an application then worst-case we will report an application to the RM on restart that isn't active, and the RM will correct the NM when it re-registers. bq. NodeManagerMetrics - The metrics of NM will get mess up if partial updated. I wasn't planning on persisting metrics during restart, as there are quite a few (e.g.: RPC metrics, etc.), and I'm not sure it's critical that they be preserved across a restart. Does RM restart do this or are there plans to do so? bq. About stale tag on NMStateStore - I don't mean to put on NMStateStore, but haven't think clearly on where to do - may be we can persistent on local disk directly or send to RM and retrieval it in NM registration? I think in most cases the attempt to update the stale tag, even if it's separate from the NMStateStore, will often fail in a similar way when the state store fails (e.g.: full local disk, read-only filesystem, etc.). Therefore I don't believe the effort to maintain a stale tag is going to be worth it. Also if we refuse to load a state store that's stale then we are going to leak containers because we won't try to recover anything from a stale state store. 
Instead I think we should decide in the various store failure cases whether the error should be fatal to the operation (which may lead to it being fatal to the NM overall) or if we feel the recovery with stale information is a better outcome than taking the NM down. In the latter case we should just log the error and move on.
[jira] [Created] (YARN-2185) Use pipes when localizing archives
Jason Lowe created YARN-2185: Summary: Use pipes when localizing archives Key: YARN-2185 URL: https://issues.apache.org/jira/browse/YARN-2185 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.4.0 Reporter: Jason Lowe Currently the nodemanager downloads an archive to a local file, unpacks it, and then removes it. It would be more efficient to stream the data as it's being unpacked to avoid both the extra disk space requirements and the additional disk activity from storing the archive. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2202) Metrics recovery for nodemanager restart
Jason Lowe created YARN-2202: Summary: Metrics recovery for nodemanager restart Key: YARN-2202 URL: https://issues.apache.org/jira/browse/YARN-2202 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Jason Lowe Per a side discussion in the review of YARN-1341, we should investigate what metrics need to be persisted and recovered as part of NM restart. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14042559#comment-14042559 ] Jason Lowe commented on YARN-1341: -- bq. So far from I know, RM restart didn't track this because these metrics will be recover during events recovery in RM restart. In current NM restart, some metrics could be lost, i.e. allocatedContainers, etc. I think we should either count them back as part of events during recovery or persistent them. Thoughts? Not all of the RM metrics will be recovered, correct? RPC metrics will be zeroed since those aren't persisted (nor should they be, IMHO). Aggregate containers allocated/released in the queue metrics will be wrong since the RM restart work, by design, doesn't store per-container state. If the cluster stays up too long then apps submitted/completed/failed/killed will not be correct, as I believe it will only count the applications that haven't been reaped due to retention policies. Anyway this is outside the scope of this JIRA, and I'll file a separate JIRA underneath the YARN-1336 umbrella to discuss what we should do about NM metrics and restart. bq. If so, how about we don't apply these changes until these changes can be persistent? If so, we still keep consistent between state store and NM's current state. Even we choose to fail the NM, we still can load state and recover the working. Again I think this is a case-by-case thing. For the RM master key, I'd rather keep going with the current master key and hope the next key update is able to persist (e.g.: a full disk where the state is stored that is later cleared up) rather than ditch the new key update and risk bringing down the NM because it can no longer keep talking to the RM or AMs. 
As I mentioned earlier, the worst case for failing to persist the RM master key or the master key used by an AM is that _if_ the NM happens to restart then some AMs _might_ not be able to authenticate with the NM until they get updated to the new master key. If we take down the NM or keep going but fail to update the master key in memory then this seems purely worse. The opportunity for error has widened, but I don't see any advantage gained by doing so. bq. Do we expect some operations can be failed while other operation can be successful? If this means short-term unavailable for persistent effort, we can just handle it by adding retry. If not, we should expect other operations that fetal get failed soon enough, and in this case, log error and move on in non-fatal operations don't have many differences. No? I don't expect immediate retry to help, and if the state store implementation is such that immediate retry is likely to help then the state store implementation should do that directly before throwing the error rather than relying on the upper-layer code to do so. However I do expect there to be common failure modes where the error state is temporary but not in the immediate sense (e.g.: the full disk scenario). And although an NM can't launch containers without a working state store, there's still a lot of useful stuff an NM can do with a broken state store -- report status of active containers, serve up shuffle data, etc. So far I don't think any of the state store updates should result in a teardown of the NM if there is a failure, although please let me know if you have a scenario where we should.
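The point above about retries belonging inside the store implementation, rather than in upper-layer code, could be sketched like so (purely illustrative; not the actual NM state store):

```java
import java.io.IOException;

// Hypothetical sketch: if immediate retry is likely to help, the store
// layer performs it itself before surfacing an error; callers only ever
// see the final, persistent failure and apply their own policy to it.
class RetryingStore {
    interface Op {
        void run() throws IOException;
    }

    private final int maxAttempts;

    RetryingStore(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    void execute(Op op) throws IOException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                op.run();
                return;  // success; retries are invisible to the caller
            } catch (IOException e) {
                last = e;  // possibly transient, try again
            }
        }
        throw last;  // persistent failure: caller's policy takes over
    }
}
```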
[jira] [Resolved] (YARN-2210) resource manager fails to start if core-site.xml contains an xi:include
[ https://issues.apache.org/jira/browse/YARN-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-2210. -- Resolution: Duplicate Resolving as a dup of YARN-1741, as that has more discussion around how this was broken and potential fixes. resource manager fails to start if core-site.xml contains an xi:include --- Key: YARN-2210 URL: https://issues.apache.org/jira/browse/YARN-2210 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Sangjin Lee Priority: Critical The resource manager fails to start if core-site.xml contains an xi:include. This is easily reproduced with a pseudo-distributed mode. Just add something like this in the core-site.xml: {noformat} <configuration xmlns:xi="http://www.w3.org/2001/XInclude"> <xi:include href="mounttable.xml"/> ... {noformat} and place mounttable.xml in the same directory (doesn't matter what the file is really). Then try starting the resource manager, and it will fail while handling this include. The exception encountered: {noformat} [Warning] :20:38: Include operation failed, reverting to fallback. Resource error reading file as XML (href='mounttable.xml'). Reason: /Users/sjlee/hadoop-2.4.0/mounttable.xml (No such file or directory) [Fatal Error] :20:38: An include failed, and no fallback element was found. 14/06/24 23:30:16 FATAL conf.Configuration: error parsing conf java.io.BufferedInputStream@7426dbec org.xml.sax.SAXParseException: An include failed, and no fallback element was found. 
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:246) at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284) at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124) at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2173) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2246) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2195) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2102) at org.apache.hadoop.conf.Configuration.get(Configuration.java:851) at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:870) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1889) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1919) at org.apache.hadoop.security.Groups.init(Groups.java:64) at org.apache.hadoop.security.Groups.getUserToGroupsMappingServiceWithLoadedConfiguration(Groups.java:255) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:197) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1038) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1741) XInclude support broken for YARN ResourceManager
[ https://issues.apache.org/jira/browse/YARN-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1741: - Priority: Critical (was: Minor) Bumping the priority of this based on YARN-2210 and the fact that existing configuration setups that relied on relative xincludes used to work in prior releases. XInclude support broken for YARN ResourceManager Key: YARN-1741 URL: https://issues.apache.org/jira/browse/YARN-1741 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Eric Sirianni Priority: Critical Labels: regression The XInclude support in Hadoop configuration files (introduced via HADOOP-4944) was broken by the recent {{ConfigurationProvider}} changes to YARN ResourceManager. Specifically, YARN-1459 and, more generally, the YARN-1611 family of JIRAs for ResourceManager HA. The issue is that {{ConfigurationProvider}} provides a raw {{InputStream}} as a {{Configuration}} resource for what was previously a {{Path}}-based resource. For {{Path}} resources, the absolute file path is used as the {{systemId}} for the {{DocumentBuilder.parse()}} call: {code} } else if (resource instanceof Path) { // a file resource ... doc = parse(builder, new BufferedInputStream( new FileInputStream(file)), ((Path)resource).toString()); } {code} The {{systemId}} is used to resolve XIncludes (among other things): {code} /** * Parse the content of the given codeInputStream/code as an * XML document and return a new DOM Document object. ... * @param systemId Provide a base for resolving relative URIs. ... */ public Document parse(InputStream is, String systemId) {code} However, for loading raw {{InputStream}} resources, the {{systemId}} is set to {{null}}: {code} } else if (resource instanceof InputStream) { doc = parse(builder, (InputStream) resource, null); {code} causing XInclude resolution to fail. 
In our particular environment, we make extensive use of XIncludes to standardize common configuration parameters across multiple Hadoop clusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
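The systemId behavior described in the report above can be demonstrated with plain JDK JAXP APIs. This standalone illustration is not Hadoop's Configuration code; it just shows that a relative xi:include resolves only when the parser is handed a base via the systemId argument:

```java
import java.io.BufferedInputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class XIncludeDemo {

    static DocumentBuilder newBuilder() throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);   // required for the xi: prefix
        f.setXIncludeAware(true);    // enable XInclude processing
        return f.newDocumentBuilder();
    }

    // Returns the number of <prop> elements after XInclude processing,
    // or -1 if parsing failed.
    public static int parseConf(Path conf, String systemId) throws Exception {
        try (BufferedInputStream in =
                 new BufferedInputStream(Files.newInputStream(conf))) {
            Document doc = newBuilder().parse(in, systemId);
            return doc.getElementsByTagName("prop").getLength();
        } catch (SAXException e) {
            // e.g. "An include failed, and no fallback element was found."
            return -1;
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("xinclude-demo");
        Files.write(dir.resolve("mounttable.xml"),
            "<extra><prop>v</prop></extra>".getBytes("UTF-8"));
        Path conf = dir.resolve("core-site.xml");
        Files.write(conf,
            ("<?xml version=\"1.0\"?>"
             + "<configuration xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
             + "<xi:include href=\"mounttable.xml\"/>"
             + "</configuration>").getBytes("UTF-8"));

        // Path-style resource: absolute path supplied as the systemId,
        // so the relative href resolves and the include succeeds.
        System.out.println("with systemId: " + parseConf(conf, conf.toString()));
        // InputStream-style resource: null systemId, no base URI, so the
        // relative include cannot be resolved and parsing fails.
        System.out.println("null systemId: " + parseConf(conf, null));
    }
}
```

This mirrors the two code paths quoted from Configuration above: the `Path` branch passes the file's path as the systemId, while the raw `InputStream` branch passes null.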
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045959#comment-14045959 ] Jason Lowe commented on YARN-1341: -- Agree it's not ideal to discuss handling state store errors for all NM components in this JIRA. In general I'd prefer to discuss and address each case with the corresponding JIRA, e.g.: application state store errors discussed and addressed in YARN-1354, container state store errors in YARN-1337, etc. If we feel there's significant utility to committing a JIRA before all the issues are addressed then we can file one or more followup JIRAs to track those outstanding issues. That's the normal process we follow with other features/fixes as well. So if we follow that process then we're back to the discussion about RM master keys not being able to be stored in the state store. The choices we've discussed are: 1) Log an error, update the master key in memory, and continue 2) Log an error, _not_ update the master key in memory, and continue 3) Log an error and tear down the NM I'd prefer 1) since that is the option that preserves the most work in all scenarios I can think of, and I don't know of a scenario where 2) would handle it better. However I could be convinced given the right scenario. I'd really rather avoid 3) since that seems like a severe way to handle the error and guarantees work is lost. Oh there is one more handling scenario we briefly discussed where we flag the NM as undesirable. When that occurs we don't shoot the containers that are running, but we avoid adding new containers since the node is having issues (i.e.: a drain-decommission). I feel that would be a separate JIRA since it needs YARN-914, and we'd still need to decide how to handle the error until the decommission is complete (i.e.: choice 1 or 2 above). 
Recover NMTokens upon nodemanager restart - Key: YARN-1341 URL: https://issues.apache.org/jira/browse/YARN-1341 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1341.patch, YARN-1341v2.patch, YARN-1341v3.patch, YARN-1341v4-and-YARN-1987.patch, YARN-1341v5.patch, YARN-1341v6.patch
[jira] [Commented] (YARN-2104) Scheduler queue filter failed to work because index of queue column changed
[ https://issues.apache.org/jira/browse/YARN-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046512#comment-14046512 ] Jason Lowe commented on YARN-2104: -- +1 lgtm. The test failure is unrelated. Committing this. Scheduler queue filter failed to work because index of queue column changed --- Key: YARN-2104 URL: https://issues.apache.org/jira/browse/YARN-2104 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Affects Versions: 2.4.0 Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2104.patch YARN-563 added
{code}
+ th(".type", "Application Type")
{code}
to the application table, which moves the queue's column index from 3 to 4. And in the scheduler page, the queue's column index is hard-coded to 3 when filtering applications by queue name:
{code}
if (q == 'root') q = '';
else q = '^' + q.substr(q.lastIndexOf('.') + 1) + '$';
$('#apps').dataTable().fnFilter(q, 3, true);
{code}
So the queue filter will not work on the application page. Reproduce steps: (Thanks Bo Yang for pointing this out)
{code}
1) In default setup, there's a default queue under root queue
2) Run an arbitrary application, you can find it in the "Applications" page
3) Click "Default" queue in scheduler page
4) Click "Applications", no application will show here
5) Click "Root" queue in scheduler page
6) Click "Applications", application will show again
{code}
[jira] [Commented] (YARN-2263) CSQueueUtils.computeMaxActiveApplicationsPerUser may cause deadlock for nested MapReduce jobs
[ https://issues.apache.org/jira/browse/YARN-2263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14058132#comment-14058132 ] Jason Lowe commented on YARN-2263: -- 1 is an appropriate lower bound since we don't ever want the maximum number of applications for a user to be zero or less. (That would be a worthless queue since we could submit jobs to it but no jobs would activate.) I'm assuming it only causes a deadlock in the case where the active job submits and waits for the completion of other jobs? If it simply submits jobs and exits then even if the queue is so tiny that only 1 active job per user is allowed then the jobs should eventually complete (assuming sufficient resources to launch an AM _and_ at least one task simultaneously if this is MapReduce). If the concern is that the queue can be too small to allow running more than one application simultaneously for a user and some app frameworks might not like that, then yes that could be an issue. However I'm not sure that is YARN's problem to solve. I could have an application framework that for whatever reason requires 10 jobs to be running simultaneously to work. There could definitely be a queue config that will not allow that to run properly because the queue is too small to support 10 simultaneous applications by a single user. Should YARN handle this scenario? If so, how would it detect it, and what should it do to mitigate it? I would argue the same applies to the simpler job-launching-job-and-waiting scenario. Some queues are going to be too small to support that. Users can work around issues like this with smarter queue setups. This is touched upon in MAPREDUCE-4304 and elsewhere for the Oozie case which is a similar scenario. We can setup a separate queue for the launcher jobs separate from a queue where the other jobs run. That way we can't accidentally fill the cluster/queue with just launcher jobs and deadlock. 
CSQueueUtils.computeMaxActiveApplicationsPerUser may cause deadlock for nested MapReduce jobs - Key: YARN-2263 URL: https://issues.apache.org/jira/browse/YARN-2263 Project: Hadoop YARN Issue Type: Bug Affects Versions: 0.23.10, 2.4.1 Reporter: Chen He computeMaxActiveApplicationsPerUser() has a lower bound of 1. For a nested MapReduce job which fires new MapReduce jobs from its mapper/reducer, it will cause the job to get stuck.
{code}
public static int computeMaxActiveApplicationsPerUser(
    int maxActiveApplications, int userLimit, float userLimitFactor) {
  return Math.max(
      (int) Math.ceil(
          maxActiveApplications * (userLimit / 100.0f) * userLimitFactor),
      1);
}
{code}
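The arithmetic above can be reproduced standalone to see the deadlock condition under discussion. Only the formula comes from the quoted code; the class name and the sample queue numbers below are illustrative.

```java
// Standalone reproduction of the formula quoted above (same arithmetic,
// outside Hadoop), showing why a tiny queue still activates one app per user
// and why that one slot can deadlock a job that waits on a child job.
public class MaxActiveApps {
    // Mirrors the quoted CSQueueUtils.computeMaxActiveApplicationsPerUser
    static int computeMaxActiveApplicationsPerUser(
            int maxActiveApplications, int userLimit, float userLimitFactor) {
        return Math.max(
                (int) Math.ceil(
                        maxActiveApplications * (userLimit / 100.0f) * userLimitFactor),
                1);
    }

    public static void main(String[] args) {
        // A tiny queue: at most 2 active apps, 10% user limit, factor 1.
        // 2 * 0.10 * 1 = 0.2, ceil -> 1, clamped lower bound -> 1:
        // a job that submits a child job and waits for it can never proceed.
        System.out.println(computeMaxActiveApplicationsPerUser(2, 10, 1.0f));   // 1
        // A larger queue leaves headroom for the child job to activate.
        System.out.println(computeMaxActiveApplicationsPerUser(100, 25, 1.0f)); // 25
    }
}
```

With the lower bound at 1 the user can always activate something, which is why the clamp itself is correct; the deadlock only appears when the one active job blocks on a second job it submitted, as discussed above.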
[jira] [Commented] (YARN-2259) NM-Local dir cleanup failing when Resourcemanager switches
[ https://issues.apache.org/jira/browse/YARN-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14058200#comment-14058200 ] Jason Lowe commented on YARN-2259: -- This sounds like the NM wasn't notified of the application completing and therefore didn't process the cleanup. Possibly a duplicate of YARN-1421? NM-Local dir cleanup failing when Resourcemanager switches -- Key: YARN-2259 URL: https://issues.apache.org/jira/browse/YARN-2259 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Nishan Shetty Attachments: Capture.PNG Induce RM switchover while a job is in progress. Observe that NM-local dir cleanup fails when the ResourceManager switches.
[jira] [Commented] (YARN-1421) Node managers will not receive application finish event where containers ran before RM restart
[ https://issues.apache.org/jira/browse/YARN-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14058201#comment-14058201 ] Jason Lowe commented on YARN-1421: -- Was this fixed by YARN-1885? Node managers will not receive application finish event where containers ran before RM restart -- Key: YARN-1421 URL: https://issues.apache.org/jira/browse/YARN-1421 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi Priority: Critical Problem :- Today for every application we track the node managers where containers ran. So when application finishes it notifies all those node managers about application finish event (via node manager heartbeat). However if rm restarts then we forget this past information and those node managers will never get application finish event and will keep reporting finished applications. Proposed Solution :- Instead of remembering the node managers where containers ran for this particular application it would be better if we depend on node manager heartbeat to take this decision. i.e. when node manager heartbeats saying it is running application (app1, app2) then we should check those application's status in RM's memory {code}rmContext.getRMApps(){code} and if either they are not found (very old applications) or they are in their final state (FINISHED, KILLED, FAILED) then we should immediately notify the node manager about the application finish event. By doing this we are reducing the state which we need to store at RM after restart.
[jira] [Commented] (YARN-2045) Data persisted in NM should be versioned
[ https://issues.apache.org/jira/browse/YARN-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14058952#comment-14058952 ] Jason Lowe commented on YARN-2045: -- Thanks for the patch, Junping! Is the schema version something that's appropriate to be at the state store interface? To me the schema seems specific to a particular state store implementation, and version 1.1 of a leveldb implementation may mean something totally different to a mysqldb or PosixSharedMemSegmentStore, etc. One of those implementations may have decided to change its layout (e.g.: to fix a bug or make the store more efficient) which is a schema change that shouldn't be exposed outside of the implementation. IMHO it's the responsibility of the state store implementation to marshal the data being conveyed via the state store interface, and something sitting above the implementation layer (i.e.: interacting with NMStateStoreService) ideally shouldn't have to deal with schema layout changes. Is there an example scenario where code agnostic of the state store implementation needs to check the schema version? If there is a need to convey high-level interface changes via a schema version then arguably there needs to be separate versions, one for the high-level interface changes and possibly an implementation-specific schema version. Ideally we'd just need the latter. Other comments on the patch: - I think we should have a fallback in loadVersion to check for the original string-based schema. Given there's only ever been '1.0' maybe we can check for an exact match of that before trying to parse it as a protobuf - Nit: the PBImpl is unnecessary, as this will never be sent via RPC (especially if it's state store specific). The PBImpl is extra boilerplate that isn't buying us anything, and the code would be simpler just using NMDBSchemaVersionProto directly. 
Data persisted in NM should be versioned Key: YARN-2045 URL: https://issues.apache.org/jira/browse/YARN-2045 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.4.1 Reporter: Junping Du Assignee: Junping Du Attachments: YARN-2045-v2.patch, YARN-2045.patch As a split task from YARN-667, we want to add version info to NM related data, including: - NodeManager local LevelDB state - NodeManager directory structure
[jira] [Commented] (YARN-2045) Data persisted in NM should be versioned
[ https://issues.apache.org/jira/browse/YARN-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14060694#comment-14060694 ] Jason Lowe commented on YARN-2045: -- bq. If any incompatible changes happen (not matter layout change, Proto changes) in future, the version check here can detect and stop the further wrong behavior. Ah I see, for tracking protobuf schema changes for any protobufs passed via the NMStateStoreService interface directly. That's not quite the same concept as the layout of the state store itself (i.e.: implementation-specific schema). For example, if leveldb decided to rework the way it uses keys in the database that would be a schema change not reflected in this version, correct? I can see wanting to expose a common schema version if code independent of the state store implementation wants to marshal the data from one schema to another. I just found it odd that the state store _implementation_ code is checking this value yet it's exposed in a state store interface which is not implementation specific. That's what felt wrong to me. For this kind of schema version, the code to marshal data between compatible versions should be in a common place, not in each state store implementation, and either all state stores end up supporting the compatible versions or they don't (because the code to handle any conversions is common). If a state store implementation can't handle the difference between two different versions but others can that seems like a state store specific schema version conflict. If each state store implementation is going to separately check the schema version and separately determine what versions are compatible then this is an implementation-specific artifact that should not be exposed in the interface, IMHO. Again, maybe a specific example where it would be appropriate would help me see things clearly. My apologies if I'm completely missing the point. bq. 
I agree that PBImpl here is not too much use, but just make interface looks more clear, i.e. I can call some handy method like: isCompatibleTo() rather than manipulate proto object directly. I'm not advocating for the removal of the NMDBSchemaVersion class, rather just the PBImpl class. All we really need from the PBImpl is getMajorVersion and getMinorVersion, and the interface between PBImpl and the raw protobuf is the same in that regard. The PBImpl isn't adding any value here, since it's not the one providing the useful isCompatibleTo() method or any other methods that are easier to use than the raw protobuf. We could just have NMDBSchemaVersion class wrap the raw protobuf and have the same easy-to-use interface without all the extra PBImpl code. And as a bonus, NMDBSchemaVersion's hashCode and equals methods can delegate to the raw protobuf methods to remove even more code.
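The read-only wrapper suggested above can be sketched without any protobuf dependency. This is hypothetical illustration, not actual Hadoop code: plain ints stand in for the raw NMDBSchemaVersionProto, and the "same major version means compatible" rule is an assumed convention for the sketch.

```java
// Hypothetical sketch of an immutable NMDBSchemaVersion-style wrapper:
// getters and a compatibility check only, no setters or builder machinery,
// so the version can never be mutated after construction.
public final class SchemaVersion {
    private final int major;
    private final int minor;

    public SchemaVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    public int getMajorVersion() { return major; }
    public int getMinorVersion() { return minor; }

    // Assumed convention: same major version is compatible; a minor bump
    // is a forward-compatible change.
    public boolean isCompatibleTo(SchemaVersion other) {
        return this.major == other.major;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SchemaVersion)) return false;
        SchemaVersion v = (SchemaVersion) o;
        return major == v.major && minor == v.minor;
    }

    @Override
    public int hashCode() { return 31 * major + minor; }

    public static void main(String[] args) {
        SchemaVersion v10 = new SchemaVersion(1, 0);
        System.out.println(v10.isCompatibleTo(new SchemaVersion(1, 1))); // true
        System.out.println(v10.isCompatibleTo(new SchemaVersion(2, 0))); // false
    }
}
```

If the version number ever needs to change, a new object is created rather than mutating this one, which sidesteps the writable-PBImpl problem described above.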
[jira] [Commented] (YARN-2152) Recover missing container information
[ https://issues.apache.org/jira/browse/YARN-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14061354#comment-14061354 ] Jason Lowe commented on YARN-2152: -- I think this may have broken backwards compatibility for ContainerTokenIdentifier. We're running tests with NM restart functionality (see YARN-1336) and during an upgrade the NM failed to parse a stored container token with this error:
{noformat}
java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:392)
	at org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:159)
	at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:142)
	at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:262)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:292)
	[...]
{noformat}
It looks like it's trying to parse the new priority and creationTime fields that were added to the token identifier but old tokens don't have them. Recover missing container information - Key: YARN-2152 URL: https://issues.apache.org/jira/browse/YARN-2152 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Jian He Fix For: 2.5.0 Attachments: YARN-2152.1.patch, YARN-2152.1.patch, YARN-2152.2.patch, YARN-2152.3.patch Container information such as container priority and container start time cannot be recovered because NM container today lacks such container information to send across on NM registration when RM recovery happens
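One common pattern for keeping such serialized records backward compatible is to treat newly appended fields as optional on read. The sketch below is illustrative only, not the actual ContainerTokenIdentifier code: the class, field names, and wire format are made up to show the idea that an old token (which lacks the trailing fields) can still parse without an EOFException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Illustrative token-like record: fields appended in a newer version are read
// only if bytes remain, so a token written by an older version still parses.
public class CompatToken {
    int containerId;          // original field, always present
    int priority = 0;         // field added in a newer version; default for old tokens
    long creationTime = 0L;   // field added in a newer version; default for old tokens

    void readFields(DataInputStream in) throws IOException {
        containerId = in.readInt();
        if (in.available() >= 4) {   // newer field only if bytes remain
            priority = in.readInt();
        }
        if (in.available() >= 8) {   // newer field only if bytes remain
            creationTime = in.readLong();
        }
    }

    // Simulate a token written by the old version: only containerId stored.
    byte[] writeOld() throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new DataOutputStream(bos).writeInt(42);
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        CompatToken t = new CompatToken();
        t.readFields(new DataInputStream(new ByteArrayInputStream(t.writeOld())));
        // Old token parses cleanly; new fields fall back to defaults.
        System.out.println(t.containerId + " " + t.priority + " " + t.creationTime);
    }
}
```

Whether length-based probing like this is appropriate for real token identifiers (versus an explicit version number in the token) is exactly the compatibility question raised in this thread.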
[jira] [Commented] (YARN-2152) Recover missing container information
[ https://issues.apache.org/jira/browse/YARN-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062060#comment-14062060 ] Jason Lowe commented on YARN-2152: -- Yeah that's what I suspected as well, but I wanted to mention it in case I missed something. It's crucial we get the token compatibility sorted out sooner rather than later, otherwise I can see us regularly breaking compatibility between even minor versions as we tweak tokens to add features. Whenever that happens rolling upgrades will not work in practice.
[jira] [Commented] (YARN-2045) Data persisted in NM should be versioned
[ https://issues.apache.org/jira/browse/YARN-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062212#comment-14062212 ] Jason Lowe commented on YARN-2045: -- bq. I agree the concept is not quite the same but I tend to handle them both together as either of change (protobuf schema or layout schema) will bring difficulty/risky for NMStateStoreService to load old version of data. I think lumping them together and handling them in the implementation-specific code is fine, but if the implementation is handling all the details then why is it exposed in the interface? I think the most telling point is that in the proposed patch no common code actually uses the interfaces that were added. Each implementation does its own version setting, its own compatibility check, and I assume its own marshaling in the future if necessary. The interfaces aren't called by common code. Maybe I'm not seeing the future use case of these methods? I guess it could be useful for common code to do logging/reporting of the persisted/current versions or maybe to do a very simplistic incompatibility check (e.g.: assume different major numbers mean incompatible), although arguably the implementation could simply log these numbers as it initializes and is already doing an implementation-specific compatibility check. However I'm particularly doubtful of the storeVersion method as it seems like the only way to safely convert versions in the general sense is with implementation-specific code. Using the conversion pseudo-code above as an example, if we crash halfway through the conversion of a series of objects then we have a mix of old and new data on the next restart but the stored version number is still old (or vice-versa if we store the new version first then convert). In an implementation-specific approach it may be possible to make the conversion atomic, e.g.: using a batch write for the entire conversion in leveldb. 
Therefore it makes more sense to me that an implementation should be responsible for deciding when and how to update the persisted schema version. I would expect implementations to do this sort of conversion during initialization, and potentially the old persisted version would never be seen since it would already be converted. Do you have an example where using the storeVersion method in the interface via implementation-independent code would be more appropriate and therefore the storeVersion method in the interface is necessary? To summarize, I can see exposing the ability to get the persisted and current state store versions in the interface for logging, etc. However I don't see how implementation-independent code can properly update the version via the interface. We're lumping both interface and implementation-specific schema changes in the same version number, and it isn't possible to do an update of multiple store objects atomically via the current interface. bq. Are you suggesting NMDBSchemaVersion to play as PBImpl directly to include raw protobuf or something else? Sort of a subset of what the PBImpl is doing. I was thinking of having NMDBSchemaVersion wrap the protobuf but in a read-only way (i.e.: no set methods, no builder stuff). If one wants to change the version number, create a new protobuf. PBImpls tend to get into trouble because they can be written, and it's simpler to treat the protobufs as immutable as they were intended. Another approach would be to simply have some static helper util methods that take two protobufs to do the compatibility checks, etc. Although I don't think we can really implement a useful isCompatibleTo check in implementation-independent code since the version number encodes implementation-specific schema information. Anyway I didn't mean to drag out this change for too long. 
I'm wondering about these interfaces since I'm a strong believer that interfaces should be minimal and necessary, and I'm having difficulty seeing how these interfaces are really going to be used. However I'm probably in the minority on these methods. If people feel strongly that these interfaces are necessary and useful then go ahead and add them. It seems to me that these interfaces will either never be called or only called for trivial reasons (e.g.: logging). However I don't think having them is going to break anything or be an unreasonable burden on an implementation, rather just extra baggage that state store implementations have to expose. As for the PBImpl, it's mostly a nit. If you really would rather keep it in I guess that's fine. We should be able to remove it later if we realize we don't have a use for it. The main change I think has to be made is that the leveldb schema check should handle the original method for storing the schema. Two ways to handle that are either explicitly check for the 1.0 string before trying to parse the version as a protobuf, or attempt the protobuf parse first and fall back to the legacy string check if it fails.
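The legacy-schema fallback discussed above can be sketched as follows. This is a hedged illustration, not the actual NM leveldb code: since the only value ever stored by the old scheme was the literal string "1.0", the loader checks for that exact byte pattern before attempting the newer parse. The "protobuf" parse here is a stand-in (a made-up two-byte major/minor format) purely so the sketch is self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch of a loadVersion fallback: recognize the legacy string-based schema
// value before attempting the newer encoding.
public class VersionFallback {
    static final byte[] LEGACY = "1.0".getBytes(StandardCharsets.UTF_8);

    // Returns {major, minor}.
    static int[] loadVersion(byte[] stored) {
        if (Arrays.equals(stored, LEGACY)) {
            return new int[] {1, 0};          // original string-based schema
        }
        return parseVersionProto(stored);     // newer encoding
    }

    // Stand-in for parsing NMDBSchemaVersionProto; this two-byte format is
    // invented here purely to keep the sketch runnable without protobuf.
    static int[] parseVersionProto(byte[] stored) {
        return new int[] {stored[0], stored[1]};
    }

    public static void main(String[] args) {
        int[] v = loadVersion(LEGACY);
        System.out.println(v[0] + "." + v[1]);           // legacy value
        int[] w = loadVersion(new byte[] {1, 1});
        System.out.println(w[0] + "." + w[1]);           // newer value
    }
}
```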
[jira] [Commented] (YARN-2293) Scoring for NMs to identify a better candidate to launch AMs
[ https://issues.apache.org/jira/browse/YARN-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062530#comment-14062530 ] Jason Lowe commented on YARN-2293: -- This sounds very similar to YARN-2005, if a bit more general. This approach sounds like it could support a gray area for NMs where it really doesn't like to launch AMs on a node but may choose to do so anyway if that's the only place it can find. It may be more fruitful to continue this discussion over on YARN-2005 and hash through how exit status would map to scoring adjustments, how the score would affect scheduling, and work through various corner cases. Scoring for NMs to identify a better candidate to launch AMs Key: YARN-2293 URL: https://issues.apache.org/jira/browse/YARN-2293 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager, resourcemanager Reporter: Sunil G Assignee: Sunil G Container exit status from NM is giving indications of reasons for its failure. Some times, it may be because of container launching problems in NM. In a heterogeneous cluster, some machines with weak hardware may cause more failures. It will be better not to launch AMs there more often. Also I would like to clear that container failures because of buggy job should not result in decreasing score. As mentioned earlier, based on exit status if a scoring mechanism is added for NMs in RM, then NMs with better scores can be given for launching AMs. Thoughts?
[jira] [Updated] (YARN-1336) Work-preserving nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1336: - Attachment: NMRestartDesignOverview.pdf Attaching a PDF that briefly describes the approach and how the methods of the state store interface are used to persist and recover state. Work-preserving nodemanager restart --- Key: YARN-1336 URL: https://issues.apache.org/jira/browse/YARN-1336 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: NMRestartDesignOverview.pdf, YARN-1336-rollup.patch This serves as an umbrella ticket for tasks related to work-preserving nodemanager restart.
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062861#comment-14062861 ] Jason Lowe commented on YARN-1341: -- Thanks for commenting, Devaraj! My apologies for the late reply, as I was on vacation and am still catching up. bq. In addition to option 1), I'd think of making the NM down if NM fails to store RM keys for certain number of times(configurable) consecutively. As for retries, I mentioned earlier that if retries are likely to help then the state store implementation should do so rather than have the common code do so. For the leveldb implementation it is very unlikely that a retry is going to do anything other than just make the operation take longer to ultimately fail. The firmware of the drive is already going to implement a large number of retries to attempt to recover from hardware errors, and non-hardware local filesystem errors are highly unlikely to be fixed by simply retrying immediately. If that were the case then I'd expect retries to be implemented in many other places where the local filesystem is used by Hadoop code. bq. And also we can make it(i.e. tear down NM or not) as configurable I'd like to avoid adding yet more config options unless we think we really need them, but if people agree this needs to be configurable then we can do so. Also I assume in that scenario you would want the NM to shut down while also tearing down containers, cleaning up, etc. as if it didn't support recovery. Tearing down the NM on a state store error just to have it start up again and try to recover with stale state seems pointless -- might as well have just kept running which is a better outcome. Or am I missing a use case for that? And thanks, Junping, for the recent comments! bq. If you are also agree on this, we can separate this document effort to other JIRA (Umbrella or a dedicated one, whatever you like) and continue the discussion on this particular case. 
Sure, we can discuss general error handling or an overall document for it either on YARN-1336 or a new JIRA. bq. a. if currentMasterKey is stale, it can be updated and override soon with registering to RM later. Nothing is affected. Correct, the NM should receive the current master key upon re-registration with the RM after it restarts. bq. b. if previousMasterKey is stale, then the real previous master key is lost, so the affection is: AMs with real master key cannot connect to NM to launch containers. AMs that have the current master key will still be able to connect because the NM just got the current master key as described in a). AMs that have the previous master key will not be able to connect to the NM unless that particular master key also happened to be successfully associated with the attempt in the state store (related to case c). bq. c. if applicationMasterKeys are stale, then previous old keys get tracked in applicationMasterKeys get lost after restart. The affection is: AMs with old keys cannot connect to NM to launch containers. AMs that use an old key (i.e.: not the current or previous master key) would be unable to connect to the NM. bq. Anything I am missing here? I don't believe so. The bottom line is that an AM may not be able to successfully connect to an NM after a restart with stale NM token state.
[jira] [Moved] (YARN-1243) ResourceManager: Error in handling event type NODE_UPDATE to the scheduler - NPE at SchedulerApp.java:411
[ https://issues.apache.org/jira/browse/YARN-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe moved MAPREDUCE-5539 to YARN-1243: - Component/s: (was: capacity-sched) capacityscheduler Affects Version/s: (was: 0.23.8) 0.23.8 Key: YARN-1243 (was: MAPREDUCE-5539) Project: Hadoop YARN (was: Hadoop Map/Reduce) ResourceManager: Error in handling event type NODE_UPDATE to the scheduler - NPE at SchedulerApp.java:411 - Key: YARN-1243 URL: https://issues.apache.org/jira/browse/YARN-1243 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 0.23.8 Environment: RHEL - 6.4, Hadoop 0.23.8 Reporter: Sanjay Upadhyay
2013-09-26 03:25:02,262 [ResourceManager Event Processor] FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApp.unreserve(SchedulerApp.java:411)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.unreserve(LeafQueue.java:1333)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1261)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1137)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1092)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignReservedContainer(LeafQueue.java:887)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:788)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:594)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:656)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:80)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340)
	at java.lang.Thread.run(Thread.java:722)
Yarn Resource manager exits at this NPE -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1243) ResourceManager: Error in handling event type NODE_UPDATE to the scheduler - NPE at SchedulerApp.java:411
[ https://issues.apache.org/jira/browse/YARN-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated YARN-1243:
-----------------------------
    Attachment: YARN-1243.branch-0.23.patch

Patch that backports the YARN-845 fix to branch-0.23.

ResourceManager: Error in handling event type NODE_UPDATE to the scheduler - NPE at SchedulerApp.java:411
---------------------------------------------------------------------------------------------------------
                Key: YARN-1243
                URL: https://issues.apache.org/jira/browse/YARN-1243
            Project: Hadoop YARN
         Issue Type: Bug
         Components: capacityscheduler
   Affects Versions: 0.23.8
        Environment: RHEL 6.4, Hadoop 0.23.8
           Reporter: Sanjay Upadhyay
        Attachments: YARN-1243.branch-0.23.patch

2013-09-26 03:25:02,262 [ResourceManager Event Processor] FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApp.unreserve(SchedulerApp.java:411)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.unreserve(LeafQueue.java:1333)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1261)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1137)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1092)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignReservedContainer(LeafQueue.java:887)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:788)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:594)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:656)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:80)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340)
        at java.lang.Thread.run(Thread.java:722)

The YARN ResourceManager exits on this NPE.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
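The stack trace shows the NPE being thrown inside SchedulerApp.unreserve() while the CapacityScheduler processes a NODE_UPDATE event, i.e. while releasing a container reservation that apparently no longer exists in the app's bookkeeping. The sketch below is a minimal, hypothetical illustration of that failure mode and the defensive null-guard pattern a fix of this kind typically applies; the class, method, and field names are invented for illustration and are NOT the actual YARN-845 patch or YARN internals.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not real YARN code: an app that tracks reserved
// container counts per priority. An unguarded unreserve() that unboxes the
// map lookup directly would throw NullPointerException when the reservation
// was already released (e.g. by a racing scheduler event).
public class UnreserveSketch {
    private final Map<Integer, Integer> reservedContainers = new HashMap<>();

    public void reserve(int priority) {
        reservedContainers.merge(priority, 1, Integer::sum);
    }

    // Defensive variant: a missing reservation becomes a harmless no-op
    // instead of a fatal NPE from unboxing a null Integer.
    public boolean unreserve(int priority) {
        Integer count = reservedContainers.get(priority);
        if (count == null) {
            return false; // nothing reserved at this priority; skip quietly
        }
        if (count <= 1) {
            reservedContainers.remove(priority);
        } else {
            reservedContainers.put(priority, count - 1);
        }
        return true;
    }

    public static void main(String[] args) {
        UnreserveSketch app = new UnreserveSketch();
        app.reserve(1);
        System.out.println(app.unreserve(1)); // true: reservation released
        System.out.println(app.unreserve(1)); // false: already gone, no NPE
    }
}
```

The key point, regardless of the real patch's details, is that a stale unreserve must be treated as a recoverable condition: the event-processing thread is shared, so one unchecked null brings down the entire ResourceManager, as seen in this report.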
[jira] [Commented] (YARN-1243) ResourceManager: Error in handling event type NODE_UPDATE to the scheduler - NPE at SchedulerApp.java:411
[ https://issues.apache.org/jira/browse/YARN-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13778921#comment-13778921 ]

Jason Lowe commented on YARN-1243:
----------------------------------
Jenkins only handles trunk patches. I manually ran the resourcemanager unit tests after this patch was applied, and they all passed.

ResourceManager: Error in handling event type NODE_UPDATE to the scheduler - NPE at SchedulerApp.java:411
---------------------------------------------------------------------------------------------------------
                Key: YARN-1243
                URL: https://issues.apache.org/jira/browse/YARN-1243
            Project: Hadoop YARN
         Issue Type: Bug
         Components: capacityscheduler
   Affects Versions: 0.23.8
        Environment: RHEL 6.4, Hadoop 0.23.8
           Reporter: Sanjay Upadhyay
           Assignee: Jason Lowe
        Attachments: YARN-1243.branch-0.23.patch

2013-09-26 03:25:02,262 [ResourceManager Event Processor] FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApp.unreserve(SchedulerApp.java:411)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.unreserve(LeafQueue.java:1333)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1261)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1137)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1092)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignReservedContainer(LeafQueue.java:887)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:788)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:594)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:656)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:80)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340)
        at java.lang.Thread.run(Thread.java:722)

The YARN ResourceManager exits on this NPE.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira