[jira] [Commented] (YARN-3934) Application with large ApplicationSubmissionContext can cause RM to exit when ZK store is used
[ https://issues.apache.org/jira/browse/YARN-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614013#comment-15614013 ]

Dustin Cote commented on YARN-3934:
-----------------------------------

[~templedf] feel free to take it over if you'd like. I'm not going to have a chance to address this for the foreseeable future.

> Application with large ApplicationSubmissionContext can cause RM to exit
> when ZK store is used
> ------------------------------------------------------------------------
>
>          Key: YARN-3934
>          URL: https://issues.apache.org/jira/browse/YARN-3934
>      Project: Hadoop YARN
>   Issue Type: Bug
>   Components: resourcemanager
>     Reporter: Ming Ma
>     Assignee: Dustin Cote
>       Labels: oct16-easy
>  Attachments: YARN-3934-1.patch
>
> Use the following steps to test.
> 1. Set up ZK as the RM HA store.
> 2. Submit a job that refers to lots of distributed cache files with long HDFS
> paths, which will cause the app state size to exceed ZK's max object size
> limit.
> 3. RM can't write to ZK and exits with the following exception.
> {noformat}
> 2015-07-10 22:21:13,002 FATAL
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a
> org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type
> STATE_STORE_OP_FAILED. Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode
> = Session expired
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
>         at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:935)
>         at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:944)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:941)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1083)
> {noformat}
> In this case, RM could have rejected the app during the submitApplication RPC
> if the size of ApplicationSubmissionContext is too large.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3934) Application with large ApplicationSubmissionContext can cause RM to exit when ZK store is used
[ https://issues.apache.org/jira/browse/YARN-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152283#comment-15152283 ]

Dustin Cote commented on YARN-3934:
-----------------------------------

[~mingma] can you have a look at the proposed patch and let me know if you feel it addresses your issue appropriately?
[jira] [Updated] (YARN-4551) Address the duplication between StatusUpdateWhenHealthy and StatusUpdateWhenUnhealthy transitions
[ https://issues.apache.org/jira/browse/YARN-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote updated YARN-4551:
------------------------------
    Assignee: Sunil G

> Address the duplication between StatusUpdateWhenHealthy and
> StatusUpdateWhenUnhealthy transitions
> -----------------------------------------------------------
>
>              Key: YARN-4551
>              URL: https://issues.apache.org/jira/browse/YARN-4551
>          Project: Hadoop YARN
>       Issue Type: Bug
>       Components: nodemanager
> Affects Versions: 2.8.0
>         Reporter: Karthik Kambatla
>         Assignee: Sunil G
>         Priority: Minor
>           Labels: newbie
>      Attachments: 0001-YARN-4551.patch
[jira] [Updated] (YARN-3934) Application with large ApplicationSubmissionContext can cause RM to exit when ZK store is used
[ https://issues.apache.org/jira/browse/YARN-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote updated YARN-3934:
------------------------------
    Attachment: YARN-3934-1.patch

Here's a first attempt at the fix. We cannot know with certainty what ZK has set for jute.maxbuffer on the server side, so we have to assume it matches what is on the client side (in this case, the RM). I've set up the code to read the property as a system property, which is how we normally specify it. There may be a desire to standardize it into the YARN config later on, but I think that's outside the scope of fixing this. Without the patch, the ZK connection is broken and retried by default *1000* times, so the RM doesn't go down for a while and all applications are blocked from submission. I think it's probably worth revisiting that default value as well, but I'd like feedback from reviewers on whether we should open a separate JIRA for that.
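A minimal, self-contained sketch of the guard this comment describes: reject an oversized submission up front instead of letting the ZK write fail later and take the RM down. The class and method names here are hypothetical (the real patch lives in the RM's submission path), and the default mirrors ZooKeeper's stock jute.maxbuffer of 0xfffff bytes:

```java
public class SubmissionSizeCheck {
    // ZooKeeper's default jute.maxbuffer is 0xfffff (1048575) bytes.
    static final int DEFAULT_MAX_BYTES = 0xfffff;

    // Read jute.maxbuffer as a system property on the client (RM) side, as
    // the comment above describes; the server-side value cannot be queried,
    // so this assumes the two sides match.
    static int configuredMaxBytes() {
        return Integer.getInteger("jute.maxbuffer", DEFAULT_MAX_BYTES);
    }

    // Hypothetical guard for submitApplication: a serialized
    // ApplicationSubmissionContext larger than the ZK node-size limit gets
    // rejected at submission time rather than crashing the RM on the write.
    static boolean fitsInZk(byte[] serializedContext) {
        return serializedContext.length <= configuredMaxBytes();
    }
}
```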
[jira] [Assigned] (YARN-3934) Application with large ApplicationSubmissionContext can cause RM to exit when ZK store is used
[ https://issues.apache.org/jira/browse/YARN-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote reassigned YARN-3934:
---------------------------------
    Assignee: Dustin Cote
[jira] [Resolved] (YARN-4286) yarn-default.xml has a typo: yarn-nodemanager.local-dirs should be yarn.nodemanager.local-dirs
[ https://issues.apache.org/jira/browse/YARN-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote resolved YARN-4286.
-------------------------------
    Resolution: Duplicate

Turns out this is already reported and a patch is available, so I'm closing this one as a duplicate.

> yarn-default.xml has a typo: yarn-nodemanager.local-dirs should be
> yarn.nodemanager.local-dirs
> ------------------------------------------------------------------
>
>          Key: YARN-4286
>          URL: https://issues.apache.org/jira/browse/YARN-4286
>      Project: Hadoop YARN
>   Issue Type: Bug
>     Reporter: Dustin Cote
>     Priority: Trivial
>       Labels: newbie
>
> In yarn-default.xml, the property yarn.nodemanager.local-dirs is
> referenced as yarn-nodemanager.local-dirs in multiple places. This should be
> a straightforward fix.
[jira] [Updated] (YARN-4286) yarn-default.xml has a typo: yarn-nodemanager.local-dirs should be yarn.nodemanager.local-dirs
[ https://issues.apache.org/jira/browse/YARN-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote updated YARN-4286:
------------------------------
    Labels: newbie  (was: )
[jira] [Created] (YARN-4286) yarn-default.xml has a typo: yarn-nodemanager.local-dirs should be yarn.nodemanager.local-dirs
Dustin Cote created YARN-4286:
---------------------------------

     Summary: yarn-default.xml has a typo: yarn-nodemanager.local-dirs should be yarn.nodemanager.local-dirs
         Key: YARN-4286
         URL: https://issues.apache.org/jira/browse/YARN-4286
     Project: Hadoop YARN
  Issue Type: Bug
    Reporter: Dustin Cote
    Priority: Trivial

In yarn-default.xml, the property yarn.nodemanager.local-dirs is referenced as yarn-nodemanager.local-dirs in multiple places. This should be a straightforward fix.
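For reference, the correct form of the property looks like this (the value shown is the stock yarn-default.xml default; the typo is the hyphen after "yarn" in place of a dot):

```xml
<!-- Correct: dots separate all components of the property name. -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>${hadoop.tmp.dir}/nm-local-dir</value>
</property>
<!-- Wrong, as it appears in the affected places: yarn-nodemanager.local-dirs -->
```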
[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote updated YARN-2369:
------------------------------
    Attachment: YARN-2369-7.patch

[~jlowe] sorry this took so long; I got sidetracked with other projects! I think I've now included your suggestions in v7 of this patch, and there are a couple of checkstyle warnings as you might have expected. If there's anything missing from this, please let me know :)

> Environment variable handling assumes values should be appended
> ---------------------------------------------------------------
>
>              Key: YARN-2369
>              URL: https://issues.apache.org/jira/browse/YARN-2369
>          Project: Hadoop YARN
>       Issue Type: Bug
> Affects Versions: 2.2.0
>         Reporter: Jason Lowe
>         Assignee: Dustin Cote
>      Attachments: YARN-2369-1.patch, YARN-2369-2.patch, YARN-2369-3.patch,
>                   YARN-2369-4.patch, YARN-2369-5.patch, YARN-2369-6.patch,
>                   YARN-2369-7.patch
>
> When processing environment variables for a container context, the code
> assumes that the value should be appended to any pre-existing value in the
> environment. This may be desired behavior for handling path-like environment
> variables such as PATH, LD_LIBRARY_PATH, CLASSPATH, etc., but it is a
> non-intuitive and harmful way to handle any variable that does not have
> path-like semantics.
[jira] [Commented] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682379#comment-14682379 ]

Dustin Cote commented on YARN-2369:
-----------------------------------

[~jlowe] thanks for all the input. I'll clean this latest patch up based on these comments this week. Happy to move this to the MAPREDUCE project instead as well, since basically all the changes are in the MR client. I don't think sub-JIRAs would be necessary since it's a pretty small change on the YARN side, but I leave that to the project management experts. I don't see any organizational problem keeping it all in one JIRA here.
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681713#comment-14681713 ]

Dustin Cote commented on YARN-3924:
-----------------------------------

Yes, [~ajithshetty] that's the point I'm trying to get across. The scenario that is problematic is:

{quote}
Configuring wrong/invalid ha.rm-ids at client is user mistake, this can be rechecked by user.
{quote}

Returning "Connection Refused" gives the user no information that this is what happened. Generally, I see users looking for closed ports or firewall issues when they see this message, when really they've just forgotten to change their Oozie workflow to point to a logical RM name after enabling HA. This kind of error is doubly hard to debug when it works intermittently (because when a failover occurs, suddenly their workflow starts working again!). Yes, this is the current RM HA design, so it's not as easy as changing the message or exception type. That said, I still think it's a good supportability/usability improvement.

> Submitting an application to standby ResourceManager should respond better
> than Connection Refused
> ---------------------------------------------------------------------------
>
>          Key: YARN-3924
>          URL: https://issues.apache.org/jira/browse/YARN-3924
>      Project: Hadoop YARN
>   Issue Type: Improvement
>   Components: resourcemanager
>     Reporter: Dustin Cote
>     Assignee: Ajith S
>     Priority: Minor
>
> When submitting an application directly to a standby resource manager, the
> resource manager responds with 'Connection Refused' rather than indicating
> that it is a standby resource manager. Because the resource manager is aware
> of its own state, I feel like we can have the 8032 port open for standby
> resource managers and reject the request with something like 'Cannot process
> application submission from this standby resource manager'.
> This would be especially helpful for debugging Oozie problems when users put
> in the wrong address for the 'jobtracker' (i.e. they don't put the logical RM
> address but rather point to a specific resource manager).
[jira] [Commented] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661895#comment-14661895 ]

Dustin Cote commented on YARN-2369:
-----------------------------------

[~vinodkv] or [~jlowe] do you have any further feedback on this patch, or am I missing something that's keeping it from getting submitted? I appreciate your help so far :)
[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626874#comment-14626874 ]

Dustin Cote commented on YARN-3924:
-----------------------------------

It doesn't if you put in the standby resource manager for the 'jobTracker' stanza in Oozie, or if you misconfigure yarn.resourcemanager.ha.rm-ids to include only the standby resource manager. The Oozie scenario is the more realistic user scenario, but I reproduced this using the yarn.resourcemanager.ha.rm-ids method. I assume closing the 8032 port on the standby RM is by design, but can we indicate that the RM is in standby instead of just saying connection refused?
[jira] [Created] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused
Dustin Cote created YARN-3924:
---------------------------------

     Summary: Submitting an application to standby ResourceManager should respond better than Connection Refused
         Key: YARN-3924
         URL: https://issues.apache.org/jira/browse/YARN-3924
     Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
    Reporter: Dustin Cote
    Priority: Minor

When submitting an application directly to a standby resource manager, the resource manager responds with 'Connection Refused' rather than indicating that it is a standby resource manager. Because the resource manager is aware of its own state, I feel like we can have the 8032 port open for standby resource managers and reject the request with something like 'Cannot process application submission from this standby resource manager'.

This would be especially helpful for debugging Oozie problems when users put in the wrong address for the 'jobtracker' (i.e. they don't put the logical RM address but rather point to a specific resource manager).
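A toy sketch of the proposed behavior. Everything here is hypothetical (the real RM would need to keep its client port open while in standby, which is a design change discussed in the comments); it only illustrates answering with a descriptive message instead of a refused connection:

```java
public class StandbySubmitGuard {
    enum HAState { ACTIVE, STANDBY }

    // Hypothetical submission handler: a standby RM that accepts the
    // connection can tell the client why the submission cannot proceed,
    // rather than leaving the client with a bare "Connection Refused".
    static String handleSubmit(HAState state) {
        if (state == HAState.STANDBY) {
            return "Cannot process application submission from this standby "
                + "resource manager; use the logical HA address so the client "
                + "fails over to the active RM";
        }
        return "accepted";
    }
}
```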
[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote updated YARN-2369:
------------------------------
    Attachment: YARN-2369-6.patch

Fixed all the style bugs possible in v6 of the patch, so I don't see anything left to do here. [~vinodkv] anything outstanding here in your mind?
[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote updated YARN-2369:
------------------------------
    Attachment: YARN-2369-5.patch

Missed a couple of style items. Submitting again.
[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote updated YARN-2369:
------------------------------
    Attachment: YARN-2369-4.patch

Checkstyle fixes, first try.
[jira] [Commented] (YARN-1814) Better error message when browsing logs in the RM/NM webuis
[ https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551114#comment-14551114 ]

Dustin Cote commented on YARN-1814:
-----------------------------------

[~jianhe] indeed, it looks like this one already got fixed in a later version. I'm not sure where, but I see that when I test this on 2.6, I get an authorization error instead. This can probably be closed as invalid.

> Better error message when browsing logs in the RM/NM webuis
> -----------------------------------------------------------
>
>              Key: YARN-1814
>              URL: https://issues.apache.org/jira/browse/YARN-1814
>          Project: Hadoop YARN
>       Issue Type: Improvement
> Affects Versions: 2.3.0
>         Reporter: Andrew Wang
>         Assignee: Dustin Cote
>         Priority: Minor
>      Attachments: YARN-1814-1.patch, YARN-1814-2.patch
>
> Browsing the webUI as a different user than the one who ran an MR job, I
> click into host:8088/cluster/app/, then the "logs" link. This
> redirects to the NM, but since I don't have permissions it prints out:
> bq. Failed redirect for container_1394482121761_0010_01_01
> bq. Failed while trying to construct the redirect url to the log server. Log
> Server url may not be configured
> bq. Container does not exist.
> It'd be nicer to print something about permissions instead.
[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote updated YARN-2369:
------------------------------
    Attachment: YARN-2369-3.patch

[~vinodkv] here's v3 of the patch. I've got a new unit test in this one, and I'm using MRJobConfig now for the property (now with a new and improved name). I think I've trimmed down the lines, but if something looks misplaced, please let me know. Thanks!
[jira] [Commented] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529194#comment-14529194 ]

Dustin Cote commented on YARN-2369:
-----------------------------------

[~vinodkv] thanks for the feedback! I would expect (b) for general apps, with a specific config for MR jobs. Should I put the config in MRConfig or MRJobConfig instead, then? I'll fix the specific items you commented on once I have the test case in too; I'll put them all in the next patch. Thanks again!
[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote updated YARN-2369:
------------------------------
    Attachment: YARN-2369-2.patch

The new patch has a configurable whitelist of variables with append enabled, exposed as yarn.application.variables.with.append. The existing unit tests from TestMRApps are passing. Next up: a test that tries to append to a variable not on the default whitelist and verifies it gets replaced instead of appended to. [~jlowe], any thoughts on things that are missing from this latest patch, or problems with the design?
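A minimal sketch of how the whitelist approach behaves. The class, method, and default list here are hypothetical (the real set would come from the yarn.application.variables.with.append configuration the comment above describes, and real code would use the platform path separator):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class EnvMerge {
    // Hypothetical default whitelist of path-like, appendable variables.
    static final Set<String> APPEND_WHITELIST =
        new HashSet<>(Arrays.asList("PATH", "CLASSPATH", "LD_LIBRARY_PATH"));

    // Append only for whitelisted variables; for everything else the new
    // value replaces any pre-existing one, which is the fix this JIRA wants.
    static String merge(String name, String oldValue, String newValue) {
        if (oldValue == null || !APPEND_WHITELIST.contains(name)) {
            return newValue;
        }
        return oldValue + ":" + newValue;  // ':' stands in for File.pathSeparator
    }
}
```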
[jira] [Commented] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503230#comment-14503230 ]

Dustin Cote commented on YARN-2369:
-----------------------------------

I don't see a unit test suite dealing with env variables already, so I'm not sure it makes sense to write a unit test for this. I don't have a better solution than the above, so I'm sticking with the submitted patch unless I hear otherwise in a review.
[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote updated YARN-2369: Attachment: YARN-2369-1.patch

This patch gets the scenario outlined in the description working with two JAVA_HOME specifications. I made it so any variable whose name doesn't match *PATH* is replaced instead of appended; any variable matching *PATH* continues to be appended to as before. I tried it out, but I'm wondering if there's a more complete way to whitelist appendable variables. I could define them myself somewhere, but that doesn't seem any better than checking for the presence of the substring PATH. Any thoughts here? Also, test-wise, I'm looking for a unit test suite for env variables that I can add onto. Does one exist? I'll look myself too, but just in case someone knows offhand.
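The substring heuristic described in this comment can be sketched in a few lines. This is an illustrative toy, not the patch's code; the class and method names are made up for the example.

```java
// Illustrative version of the *PATH* heuristic from the comment above:
// any variable whose name contains "PATH" keeps the old append behavior,
// everything else is replaced on update.
public class PathHeuristicSketch {
  static boolean shouldAppend(String varName) {
    return varName.contains("PATH");
  }
}
```

The weakness the comment alludes to is visible here: the check is purely lexical, so any user-defined variable that happens to contain "PATH" would also be appended to, which is why an explicit whitelist was pursued in the later patch.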
[jira] [Commented] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491476#comment-14491476 ] Dustin Cote commented on YARN-2369:

It turns out my initial idea causes other problems with the application master, so I'm going up the call chain a bit to see what's going on. It seems to break the expectations of org.apache.hadoop.mapreduce.v2.util.MRApps#addToEnvironment at the moment, so maybe there's something else that needs to be done.
[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote updated YARN-2369: Attachment: (was: YARN-2369-1.patch)
[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote updated YARN-2369: Attachment: YARN-2369-1.patch

I like the second idea, where the user should explicitly append to the variable. I think we can do this just by removing the code that appends and instead replacing the entire variable every time we get an update. I'm going to try this out, but figured I'd attach the code change in case I'm missing something obvious.
[jira] [Commented] (YARN-3419) ConcurrentModificationException in FSLeafQueue
[ https://issues.apache.org/jira/browse/YARN-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386850#comment-14386850 ] Dustin Cote commented on YARN-3419:

[~ozawa] yep, that will probably fix it, just on the caller side; I didn't see that one. I'm OK if you want to mark this as a duplicate. Since my suggested fix trades extra memory use for safety, there's probably no reason to safeguard against other callers that might use FSLeafQueue improperly. Thanks for checking!

> ConcurrentModificationException in FSLeafQueue
> Key: YARN-3419
> URL: https://issues.apache.org/jira/browse/YARN-3419
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 2.5.0
> Reporter: Dustin Cote
> Assignee: Dustin Cote
> Attachments: YARN-3419-1.patch
>
> Heavy ResourceManager use causes a ConcurrentModificationException in FSLeafQueue. FSLeafQueue doesn't appear to do anything except add, remove, traverse, and sort, so we could use a CopyOnWriteArrayList, which will use a bit more memory but remove these exceptions. There will likely be relatively few app adds compared to the number of traversals. Stack trace below:
> 2015-03-27 00:47:34,773 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1427401429921_3388
> java.util.ConcurrentModificationException
>         at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
>         at java.util.ArrayList$Itr.next(ArrayList.java:831)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:929)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:922)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:110)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:765)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:746)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>         at java.lang.Thread.run(Thread.java:744)
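The failure mode and the proposed fix's tradeoff can be demonstrated in isolation. This is a standalone illustration, not FSLeafQueue's code: a fail-fast ArrayList iterator throws ConcurrentModificationException when the list is structurally modified mid-iteration, while CopyOnWriteArrayList iterates over a snapshot at the cost of copying the backing array on every write.

```java
import java.util.List;
import java.util.ConcurrentModificationException;

// Minimal illustration of the tradeoff discussed above. Pass in either a
// plain ArrayList (fail-fast iterator, throws CME on concurrent structural
// modification) or a CopyOnWriteArrayList (snapshot iterator, never throws,
// but copies the array on each add).
public class CowListDemo {
  static boolean mutateDuringIteration(List<Integer> list) {
    list.add(1);
    list.add(2);
    try {
      for (Integer i : list) {
        list.add(i + 10);          // structural modification mid-iteration
        if (list.size() > 10) {
          break;                   // safety bound; never reached here
        }
      }
      return false;                // no exception thrown
    } catch (ConcurrentModificationException e) {
      return true;                 // fail-fast iterator detected the change
    }
  }
}
```

The same race in the RM is between event-handler threads traversing the queue's app list and threads adding/removing apps; the snapshot semantics are why the comment notes the fix costs "a bit more memory".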
[jira] [Updated] (YARN-3419) ConcurrentModificationException in FSLeafQueue
[ https://issues.apache.org/jira/browse/YARN-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote updated YARN-3419: Attachment: YARN-3419-1.patch

Attaching a patch moving to a thread-safe ArrayList implementation.
[jira] [Created] (YARN-3419) ConcurrentModificationException in FSLeafQueue
Dustin Cote created YARN-3419: Summary: ConcurrentModificationException in FSLeafQueue Key: YARN-3419 URL: https://issues.apache.org/jira/browse/YARN-3419 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.5.0 Reporter: Dustin Cote Assignee: Dustin Cote

Heavy ResourceManager use causes a ConcurrentModificationException in FSLeafQueue. FSLeafQueue doesn't appear to do anything except add, remove, traverse, and sort, so we could use a CopyOnWriteArrayList, which will use a bit more memory but remove these exceptions. There will likely be relatively few app adds compared to the number of traversals.
[jira] [Commented] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14369953#comment-14369953 ] Dustin Cote commented on YARN-2369:

[~jlowe] or [~aw], is this one still needed? If it is, I'd like to take a crack at it. I've had problems with LD_LIBRARY_PATH in my own experience, so if it's not fixed by something else in a later version, I think it should be.
[jira] [Assigned] (YARN-2369) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote reassigned YARN-2369: Assignee: Dustin Cote
[jira] [Commented] (YARN-1814) Better error message when browsing logs in the RM/NM webuis
[ https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248983#comment-14248983 ] Dustin Cote commented on YARN-1814:

No tests associated here, as this is only a string update.

> Better error message when browsing logs in the RM/NM webuis
> Key: YARN-1814
> URL: https://issues.apache.org/jira/browse/YARN-1814
> Project: Hadoop YARN
> Issue Type: Improvement
> Affects Versions: 2.3.0
> Reporter: Andrew Wang
> Assignee: Dustin Cote
> Priority: Minor
> Attachments: YARN-1814-1.patch
>
> Browsing the webUI as a different user than the one who ran an MR job, I click into host:8088/cluster/app/, then the "logs" link. This redirects to the NM, but since I don't have permissions it prints out:
> bq. Failed redirect for container_1394482121761_0010_01_01
> bq. Failed while trying to construct the redirect url to the log server. Log Server url may not be configured
> bq. Container does not exist.
> It'd be nicer to print something about permissions instead.
[jira] [Updated] (YARN-1814) Better error message when browsing logs in the RM/NM webuis
[ https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote updated YARN-1814: Attachment: YARN-1814-1.patch
[jira] [Commented] (YARN-1814) Better error message when browsing logs in the RM/NM webuis
[ https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248980#comment-14248980 ] Dustin Cote commented on YARN-1814:

Changing the error message to suggest that the logged-in user may not have access to view the container logs for this job. Attaching a patch now.
[jira] [Assigned] (YARN-1814) Better error message when browsing logs in the RM/NM webuis
[ https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote reassigned YARN-1814: Assignee: Dustin Cote
[jira] [Updated] (YARN-2950) Change message to mandate, not suggest JS requirement on UI
[ https://issues.apache.org/jira/browse/YARN-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote updated YARN-2950: Attachment: YARN-2950-1.patch

Updating the error message as Harsh suggested.

> Change message to mandate, not suggest JS requirement on UI
> Key: YARN-2950
> URL: https://issues.apache.org/jira/browse/YARN-2950
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: webapp
> Affects Versions: 2.5.0
> Reporter: Harsh J
> Assignee: Dustin Cote
> Priority: Minor
> Labels: newbie
> Attachments: YARN-2950-1.patch
>
> Most of YARN's UIs do not work with JavaScript disabled in the browser, because they appear to send back data as JS arrays instead of within the actual HTML content.
> The JQueryUI prints only a mild warning about this, suggesting that {{This page works best with javascript enabled.}}, when in fact it ought to be something more direct, such as {{This page will not function without javascript enabled. Please enable javascript on your browser.}}.
[jira] [Assigned] (YARN-2950) Change message to mandate, not suggest JS requirement on UI
[ https://issues.apache.org/jira/browse/YARN-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote reassigned YARN-2950: Assignee: Dustin Cote
[jira] [Updated] (YARN-2891) Failed Container Executor does not provide a clear error message
[ https://issues.apache.org/jira/browse/YARN-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote updated YARN-2891: Attachment: YARN-2891-1.patch

Attaching a patch file because I'm not really sure how else to submit it for review.

> Failed Container Executor does not provide a clear error message
> Key: YARN-2891
> URL: https://issues.apache.org/jira/browse/YARN-2891
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager
> Affects Versions: 2.5.1
> Environment: any
> Reporter: Dustin Cote
> Assignee: Dustin Cote
> Priority: Minor
> Attachments: YARN-2891-1.patch
>
> When checking access to directories, the container executor does not provide clear information on which directory actually could not be accessed.
[jira] [Updated] (YARN-2891) Failed Container Executor does not provide a clear error message
[ https://issues.apache.org/jira/browse/YARN-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote updated YARN-2891: Issue Type: Improvement (was: Bug)
[jira] [Created] (YARN-2891) Failed Container Executor does not provide a clear error message
Dustin Cote created YARN-2891: Summary: Failed Container Executor does not provide a clear error message Key: YARN-2891 URL: https://issues.apache.org/jira/browse/YARN-2891 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.5.1 Environment: any Reporter: Dustin Cote Priority: Minor

When checking access to directories, the container executor does not provide clear information on which directory actually could not be accessed.