[jira] [Commented] (YARN-3934) Application with large ApplicationSubmissionContext can cause RM to exit when ZK store is used

2016-10-27 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614013#comment-15614013
 ] 

Dustin Cote commented on YARN-3934:
---

[~templedf] feel free to take it over if you'd like.  I'm not going to have a 
chance to address this for the foreseeable future.

> Application with large ApplicationSubmissionContext can cause RM to exit when 
> ZK store is used
> --
>
> Key: YARN-3934
> URL: https://issues.apache.org/jira/browse/YARN-3934
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Reporter: Ming Ma
> Assignee: Dustin Cote
> Labels: oct16-easy
> Attachments: YARN-3934-1.patch
>
>
> Use the following steps to test.
> 1. Set up ZK as the RM HA store.
> 2. Submit a job that refers to many distributed cache files with long HDFS 
> paths, which will cause the app state size to exceed ZK's max object size 
> limit.
> 3. RM can't write to ZK and exits with the following exception.
> {noformat}
> 2015-07-10 22:21:13,002 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type STATE_STORE_OP_FAILED. Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:935)
> at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915)
> at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:944)
> at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:941)
> at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1083)
> {noformat}
> RM could instead reject the app during the submitApplication RPC when the 
> ApplicationSubmissionContext is too large.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3934) Application with large ApplicationSubmissionContext can cause RM to exit when ZK store is used

2016-02-18 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152283#comment-15152283
 ] 

Dustin Cote commented on YARN-3934:
---

[~mingma] can you have a look at the proposed patch and let me know if you feel 
it addresses your issue appropriately?



[jira] [Updated] (YARN-4551) Address the duplication between StatusUpdateWhenHealthy and StatusUpdateWhenUnhealthy transitions

2016-01-11 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-4551:
--
Assignee: Sunil G

> Address the duplication between StatusUpdateWhenHealthy and 
> StatusUpdateWhenUnhealthy transitions
> -
>
> Key: YARN-4551
> URL: https://issues.apache.org/jira/browse/YARN-4551
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.8.0
> Reporter: Karthik Kambatla
> Assignee: Sunil G
> Priority: Minor
> Labels: newbie
> Attachments: 0001-YARN-4551.patch
>
>






[jira] [Updated] (YARN-3934) Application with large ApplicationSubmissionContext can cause RM to exit when ZK store is used

2016-01-06 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-3934:
--
Attachment: YARN-3934-1.patch

Here's a first attempt at the fix.  We cannot know with certainty what ZK has 
set for jute.maxbuffer on the server side, so we have to assume that it matches 
what is set on the client side (in this case the RM).  I've set up the code to 
read the property as a system property, which is how we normally specify it.  
There may be a desire to standardize it into the YARN config later on, but I 
think that's outside the scope of this fix.  Without the patch, the ZK 
connection is broken and retried by default *1000* times, so the RM doesn't go 
down for a while and all applications are blocked from submission.  I think 
it's probably worth revisiting that default value as well, but I'd like some 
feedback from reviewers on whether we should open a separate JIRA for that.
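
A minimal sketch of the kind of check described above, not the code in 
YARN-3934-1.patch.  It assumes the limit is read from the jute.maxbuffer system 
property in the RM's JVM and falls back to 0xfffff bytes (the default that the 
ZooKeeper documentation gives for this property) when the property is unset.  
The class and method names are hypothetical.

```java
// Hypothetical helper; names are illustrative and not taken from the patch.
final class SubmissionSizeCheck {
  // Assumed fallback: the documented ZooKeeper default for jute.maxbuffer (just under 1 MB).
  static final int DEFAULT_JUTE_MAX_BUFFER = 0xfffff;

  // Reads the client-side (RM JVM) value; the server-side value is not visible from here.
  static int maxZnodeBytes() {
    return Integer.getInteger("jute.maxbuffer", DEFAULT_JUTE_MAX_BUFFER);
  }

  // True when the serialized submission context is strictly below the assumed limit.
  static boolean fitsInStateStore(byte[] serializedContext) {
    return serializedContext.length < maxZnodeBytes();
  }

  // Fails the submission up front instead of letting the ZK write fail later.
  static void checkOrReject(byte[] serializedContext) {
    if (!fitsInStateStore(serializedContext)) {
      throw new IllegalArgumentException(
          "ApplicationSubmissionContext is " + serializedContext.length
              + " bytes, which is not below the assumed ZK limit of "
              + maxZnodeBytes() + " bytes");
    }
  }
}
```

The limit in ZK applies to the whole request rather than to the payload alone, 
so a real check would probably need to leave some headroom below this value.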



[jira] [Assigned] (YARN-3934) Application with large ApplicationSubmissionContext can cause RM to exit when ZK store is used

2016-01-05 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote reassigned YARN-3934:
-

Assignee: Dustin Cote



[jira] [Resolved] (YARN-4286) yarn-default.xml has a typo: yarn-nodemanager.local-dirs should be yarn.nodemanager.local-dirs

2015-10-23 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote resolved YARN-4286.
---
Resolution: Duplicate

Turns out this is already reported and a patch is available, so I'm closing 
this one as duplicate.

> yarn-default.xml has a typo: yarn-nodemanager.local-dirs should be 
> yarn.nodemanager.local-dirs
> --
>
> Key: YARN-4286
> URL: https://issues.apache.org/jira/browse/YARN-4286
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Dustin Cote
> Priority: Trivial
> Labels: newbie
>
> In the yarn-default.xml, the property yarn.nodemanager.local-dirs is 
> referenced as yarn-nodemanager.local-dirs in multiple places.  This should be 
> a straightforward fix.





[jira] [Updated] (YARN-4286) yarn-default.xml has a typo: yarn-nodemanager.local-dirs should be yarn.nodemanager.local-dirs

2015-10-21 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-4286:
--
Labels: newbie  (was: )



[jira] [Created] (YARN-4286) yarn-default.xml has a typo: yarn-nodemanager.local-dirs should be yarn.nodemanager.local-dirs

2015-10-21 Thread Dustin Cote (JIRA)
Dustin Cote created YARN-4286:
-

Summary: yarn-default.xml has a typo: yarn-nodemanager.local-dirs should be yarn.nodemanager.local-dirs
Key: YARN-4286
URL: https://issues.apache.org/jira/browse/YARN-4286
Project: Hadoop YARN
Issue Type: Bug
Reporter: Dustin Cote
Priority: Trivial


In the yarn-default.xml, the property yarn.nodemanager.local-dirs is referenced 
as yarn-nodemanager.local-dirs in multiple places.  This should be a 
straightforward fix.





[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended

2015-09-20 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2369:
--
Attachment: YARN-2369-7.patch

[~jlowe] sorry this took so long; I got sidetracked with other projects!  I 
think v7 of this patch now includes your suggestions.  There are a couple of 
checkstyle warnings, as you might have expected.  If anything is missing, 
please let me know :)

> Environment variable handling assumes values should be appended
> ---
>
> Key: YARN-2369
> URL: https://issues.apache.org/jira/browse/YARN-2369
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Jason Lowe
> Assignee: Dustin Cote
> Attachments: YARN-2369-1.patch, YARN-2369-2.patch, YARN-2369-3.patch, 
> YARN-2369-4.patch, YARN-2369-5.patch, YARN-2369-6.patch, YARN-2369-7.patch
>
>
> When processing environment variables for a container context the code 
> assumes that the value should be appended to any pre-existing value in the 
> environment.  This may be desired behavior for handling path-like environment 
> variables such as PATH, LD_LIBRARY_PATH, CLASSPATH, etc. but it is a 
> non-intuitive and harmful way to handle any variable that does not have 
> path-like semantics.





[jira] [Commented] (YARN-2369) Environment variable handling assumes values should be appended

2015-08-11 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682379#comment-14682379
 ] 

Dustin Cote commented on YARN-2369:
---

[~jlowe] thanks for all the input.  I'll clean this latest patch up based on 
these comments this week.

I'm also happy to move this to the MAPREDUCE project instead, since basically 
all the changes are in the MR client.  I don't think sub-JIRAs would be 
necessary since it's a pretty small change on the YARN side, but I leave that 
to the project management experts.  I don't see any organizational problem 
with keeping it all in one JIRA here.



[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused

2015-08-11 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681713#comment-14681713
 ] 

Dustin Cote commented on YARN-3924:
---

Yes, [~ajithshetty] that's the point I'm trying to get across.  The scenario 
that is problematic is:
{quote}
Configuring wrong/invalid ha.rm-ids at client is user mistake, this can be 
rechecked by user.
{quote} 

Returning "Connection Refused" gives the user no information that this is what 
happened.  Generally, I see users looking for closed ports or firewall issues 
when they see this message back, when really they've just forgotten to change 
their Oozie workflow to point to a logical RM name after enabling HA.  This 
kind of error is doubly hard to debug when it works intermittently (because 
when a failover occurs, suddenly their workflow starts working again!).  Yes, 
this is the current RM HA design, so it's not as easy as changing the message 
or exception type.  That said, I still think it's a good 
supportability/usability improvement. 

> Submitting an application to standby ResourceManager should respond better 
> than Connection Refused
> --
>
> Key: YARN-3924
> URL: https://issues.apache.org/jira/browse/YARN-3924
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Reporter: Dustin Cote
> Assignee: Ajith S
> Priority: Minor
>
> When submitting an application directly to a standby resource manager, the 
> resource manager responds with 'Connection Refused' rather than indicating 
> that it is a standby resource manager.  Because the resource manager is aware 
> of its own state, I feel like we can have the 8032 port open for standby 
> resource managers and reject the request with something like 'Cannot process 
> application submission from this standby resource manager'.  
> This would be especially helpful for debugging oozie problems when users put 
> in the wrong address for the 'jobtracker' (i.e. they don't put the logical RM 
> address but rather point to a specific resource manager).  





[jira] [Commented] (YARN-2369) Environment variable handling assumes values should be appended

2015-08-07 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661895#comment-14661895
 ] 

Dustin Cote commented on YARN-2369:
---

[~vinodkv] or [~jlowe] do you have any further feedback on this patch or am I 
missing something keeping this from getting submitted?  I appreciate your help 
so far :)



[jira] [Commented] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused

2015-07-14 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626874#comment-14626874
 ] 

Dustin Cote commented on YARN-3924:
---

It doesn't if you put in the standby resource manager for the 'jobTracker' 
stanza in oozie or if you misconfigure yarn.resourcemanager.ha.rm-ids to 
include only the standby resource manager.  The oozie scenario is more the real 
user scenario, but I reproduced this by using the 
yarn.resourcemanager.ha.rm-ids method.  I assume closing the 8032 port for the 
standby RM is by design, but can we indicate that the RM is in standby instead 
of just saying connection refused?  



[jira] [Created] (YARN-3924) Submitting an application to standby ResourceManager should respond better than Connection Refused

2015-07-14 Thread Dustin Cote (JIRA)
Dustin Cote created YARN-3924:
-

Summary: Submitting an application to standby ResourceManager should respond better than Connection Refused
Key: YARN-3924
URL: https://issues.apache.org/jira/browse/YARN-3924
Project: Hadoop YARN
Issue Type: Improvement
Components: resourcemanager
Reporter: Dustin Cote
Priority: Minor


When submitting an application directly to a standby resource manager, the 
resource manager responds with 'Connection Refused' rather than indicating that 
it is a standby resource manager.  Because the resource manager is aware of its 
own state, I feel like we can have the 8032 port open for standby resource 
managers and reject the request with something like 'Cannot process application 
submission from this standby resource manager'.  

This would be especially helpful for debugging oozie problems when users put in 
the wrong address for the 'jobtracker' (i.e. they don't put the logical RM 
address but rather point to a specific resource manager).  
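
To make the proposal concrete, here is a minimal sketch of the behavior, 
assuming the standby RM keeps the client port open and checks its own HA state 
before handling a submission.  The HaState enum, the SubmissionGate class, and 
the submit method are hypothetical stand-ins rather than existing YARN classes, 
and the exception type is only a placeholder.

```java
// Hypothetical stand-in for the RM's view of its own HA state.
enum HaState { ACTIVE, STANDBY }

// Hypothetical front door for submissions; not an existing YARN class.
final class SubmissionGate {
  private final HaState state;

  SubmissionGate(HaState state) {
    this.state = state;
  }

  // Accepts the submission when active; otherwise fails with an explicit reason
  // instead of leaving the client with a bare "Connection refused".
  String submit(String applicationName) {
    if (state == HaState.STANDBY) {
      throw new IllegalStateException(
          "Cannot process application submission from this standby resource manager");
    }
    return "accepted " + applicationName;
  }
}
```

A client pointed at a specific RM address would then see the standby message, 
which points at the HA configuration rather than at ports or firewalls.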






[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended

2015-06-25 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2369:
--
Attachment: YARN-2369-6.patch

Fixed all the style bugs possible in v6 of the patch, so I don't see anything 
left to do here.  [~vinodkv] anything outstanding here in your mind?



[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended

2015-05-25 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2369:
--
Attachment: YARN-2369-5.patch

Missed a couple of style items.  Submitting again.



[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended

2015-05-25 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2369:
--
Attachment: YARN-2369-4.patch

Checkstyle fixes, first try



[jira] [Commented] (YARN-1814) Better error message when browsing logs in the RM/NM webuis

2015-05-19 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551114#comment-14551114
 ] 

Dustin Cote commented on YARN-1814:
---

[~jianhe] indeed it looks like this one already got fixed in a later version.  
I'm not sure where, but I see that when I test this on 2.6, I get an 
authorization error instead.  This can probably be closed as invalid.

> Better error message when browsing logs in the RM/NM webuis
> ---
>
> Key: YARN-1814
> URL: https://issues.apache.org/jira/browse/YARN-1814
> Project: Hadoop YARN
> Issue Type: Improvement
> Affects Versions: 2.3.0
> Reporter: Andrew Wang
> Assignee: Dustin Cote
> Priority: Minor
> Attachments: YARN-1814-1.patch, YARN-1814-2.patch
>
>
> Browsing the webUI as a different user than the one who ran an MR job, I 
> click into host:8088/cluster/app/, then the "logs" link. This 
> redirects to the NM, but since I don't have permissions it prints out:
> bq. Failed redirect for container_1394482121761_0010_01_01
> bq. Failed while trying to construct the redirect url to the log server. Log 
> Server url may not be configured
> bq. Container does not exist.
> It'd be nicer to print something about permissions instead.





[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended

2015-05-12 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2369:
--
Attachment: YARN-2369-3.patch

[~vinodkv] here's v3 of the patch.  I've got a new unit test in this one and 
I'm using MRJobConfig now for the property (now with a new and improved name).  
I think I've trimmed down the lines, but if something looks misplaced please 
let me know.  Thanks!



[jira] [Commented] (YARN-2369) Environment variable handling assumes values should be appended

2015-05-05 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529194#comment-14529194
 ] 

Dustin Cote commented on YARN-2369:
---

[~vinodkv] thanks for the feedback!  I would expect (b) for general apps as a 
specific config for MR jobs.  Should I put the config in MRConfig or 
MRJobConfig instead then?  I'll address your specific comments once I have the 
test case in as well, and I'll put everything in the next patch.  Thanks again!



[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended

2015-04-30 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2369:
--
Attachment: YARN-2369-2.patch

The new patch adds a configurable whitelist of variables for which append is 
enabled, exposed as yarn.application.variables.with.append.  The existing unit 
tests in TestMRApps are passing.  Next up is a test that tries to append to a 
variable not on the default whitelist and verifies that it gets replaced 
instead of appended to.  [~jlowe], any thoughts on things that are missing 
from this latest patch or problems with the design?
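
As a rough illustration of the behavior described above, and not the code in 
YARN-2369-2.patch, the sketch below appends only when the variable name is on a 
whitelist and replaces the value otherwise.  The class and method names are 
hypothetical, the whitelist contents are whatever the caller passes in, and the 
separator is an argument rather than something derived from the platform.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical helper; not the code in the attached patch.
final class EnvMerge {
  // Adds name=value to env: appends for whitelisted names that already have a value,
  // and sets or replaces the value in every other case.
  static void addToEnv(Map<String, String> env, Set<String> appendWhitelist,
                       String name, String value, String separator) {
    String existing = env.get(name);
    if (existing != null && appendWhitelist.contains(name)) {
      env.put(name, existing + separator + value);
    } else {
      env.put(name, value);
    }
  }
}
```

With a whitelist containing only PATH, setting PATH twice accumulates both 
entries, while setting JAVA_HOME twice keeps only the second value.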

> Environment variable handling assumes values should be appended
> ---
>
> Key: YARN-2369
> URL: https://issues.apache.org/jira/browse/YARN-2369
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jason Lowe
>Assignee: Dustin Cote
> Attachments: YARN-2369-1.patch, YARN-2369-2.patch
>
>
> When processing environment variables for a container context the code 
> assumes that the value should be appended to any pre-existing value in the 
> environment.  This may be desired behavior for handling path-like environment 
> variables such as PATH, LD_LIBRARY_PATH, CLASSPATH, etc. but it is a 
> non-intuitive and harmful way to handle any variable that does not have 
> path-like semantics.





[jira] [Commented] (YARN-2369) Environment variable handling assumes values should be appended

2015-04-20 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503230#comment-14503230
 ] 

Dustin Cote commented on YARN-2369:
---

I don't see an existing unit test suite dealing with env variables, so I'm not 
sure it makes sense to write a unit test for this.  I don't have a better 
solution than the one above, so I'm sticking with the submitted patch unless I 
hear otherwise in a review.

> Environment variable handling assumes values should be appended
> ---
>
> Key: YARN-2369
> URL: https://issues.apache.org/jira/browse/YARN-2369
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Jason Lowe
> Assignee: Dustin Cote
> Attachments: YARN-2369-1.patch
>
>
> When processing environment variables for a container context the code 
> assumes that the value should be appended to any pre-existing value in the 
> environment.  This may be desired behavior for handling path-like environment 
> variables such as PATH, LD_LIBRARY_PATH, CLASSPATH, etc. but it is a 
> non-intuitive and harmful way to handle any variable that does not have 
> path-like semantics.





[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended

2015-04-12 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2369:
--
Attachment: YARN-2369-1.patch

This patch gets things working for the scenario outlined in the description, 
with 2 JAVA_HOME specifications.  I made it so that setting any variable that 
doesn't match *PATH* replaces the value instead of appending to it.  Any 
variable that matches *PATH* continues to be appended to as before.  I tried it 
out, but I'm wondering if there's a more complete way to whitelist appendable 
variables.  I could define them myself somewhere, but that doesn't seem any 
better than checking for the presence of the substring PATH.  Any thoughts 
here?

Also, test-wise, I'm looking for a unit test suite for env variables that I can 
add onto here.  Does one exist?  I'll look myself too, but I'm asking just in 
case someone knows offhand.
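
The *PATH* check described in this comment can be sketched roughly as follows.  
This is an illustration under assumptions, not the code from 
YARN-2369-1.patch: the class and method names are hypothetical, and 
XPATH_EXPR is a made-up variable name used only to show where a plain 
substring match can misfire.

```java
import java.util.Map;

/** Illustrative only; not the code from YARN-2369-1.patch. */
final class PathHeuristicSketch {

  /** Treats any variable whose name contains "PATH" as appendable. */
  static boolean looksPathLike(String name) {
    return name.contains("PATH");
  }

  /** Appends for path-like names, replaces for everything else. */
  static void setVariable(Map<String, String> env, String name,
      String value, String separator) {
    String existing = env.get(name);
    if (existing != null && looksPathLike(name)) {
      env.put(name, existing + separator + value);
    } else {
      env.put(name, value);
    }
  }

  public static void main(String[] args) {
    System.out.println(looksPathLike("LD_LIBRARY_PATH"));
    System.out.println(looksPathLike("JAVA_HOME"));
    // XPATH_EXPR is a made-up name: not a search path, yet it still matches.
    System.out.println(looksPathLike("XPATH_EXPR"));
  }
}
```

The last case is the weakness raised above: the substring test needs no 
configuration, but it cannot tell a search path from any other variable that 
happens to have PATH in its name.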

> Environment variable handling assumes values should be appended
> ---
>
> Key: YARN-2369
> URL: https://issues.apache.org/jira/browse/YARN-2369
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Jason Lowe
> Assignee: Dustin Cote
> Attachments: YARN-2369-1.patch
>
>
> When processing environment variables for a container context the code 
> assumes that the value should be appended to any pre-existing value in the 
> environment.  This may be desired behavior for handling path-like environment 
> variables such as PATH, LD_LIBRARY_PATH, CLASSPATH, etc. but it is a 
> non-intuitive and harmful way to handle any variable that does not have 
> path-like semantics.





[jira] [Commented] (YARN-2369) Environment variable handling assumes values should be appended

2015-04-12 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491476#comment-14491476
 ] 

Dustin Cote commented on YARN-2369:
---

It turns out my initial idea causes other problems with the application master, 
so I'm going up the call chain a bit to see what's going on.  Seems like it 
breaks org.apache.hadoop.mapreduce.v2.util.MRApps.java#addToEnvironment 
expectations at the moment, so maybe there's something else that needs to be 
done.

> Environment variable handling assumes values should be appended
> ---
>
> Key: YARN-2369
> URL: https://issues.apache.org/jira/browse/YARN-2369
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Jason Lowe
> Assignee: Dustin Cote
>
> When processing environment variables for a container context the code 
> assumes that the value should be appended to any pre-existing value in the 
> environment.  This may be desired behavior for handling path-like environment 
> variables such as PATH, LD_LIBRARY_PATH, CLASSPATH, etc. but it is a 
> non-intuitive and harmful way to handle any variable that does not have 
> path-like semantics.





[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended

2015-04-11 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2369:
--
Attachment: (was: YARN-2369-1.patch)

> Environment variable handling assumes values should be appended
> ---
>
> Key: YARN-2369
> URL: https://issues.apache.org/jira/browse/YARN-2369
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Jason Lowe
> Assignee: Dustin Cote
>
> When processing environment variables for a container context the code 
> assumes that the value should be appended to any pre-existing value in the 
> environment.  This may be desired behavior for handling path-like environment 
> variables such as PATH, LD_LIBRARY_PATH, CLASSPATH, etc. but it is a 
> non-intuitive and harmful way to handle any variable that does not have 
> path-like semantics.





[jira] [Updated] (YARN-2369) Environment variable handling assumes values should be appended

2015-04-01 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2369:
--
Attachment: YARN-2369-1.patch

I like the second idea, where the user should explicitly append to the 
variable.  I think we can do this by removing the append code and simply 
replacing the entire variable every time we get an update.  I'm going to try 
this out, but figured I'd attach the code change in case I'm missing something 
obvious.

> Environment variable handling assumes values should be appended
> ---
>
> Key: YARN-2369
> URL: https://issues.apache.org/jira/browse/YARN-2369
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Jason Lowe
> Assignee: Dustin Cote
> Attachments: YARN-2369-1.patch
>
>
> When processing environment variables for a container context the code 
> assumes that the value should be appended to any pre-existing value in the 
> environment.  This may be desired behavior for handling path-like environment 
> variables such as PATH, LD_LIBRARY_PATH, CLASSPATH, etc. but it is a 
> non-intuitive and harmful way to handle any variable that does not have 
> path-like semantics.





[jira] [Commented] (YARN-3419) ConcurrentModificationException in FSLeafQueue

2015-03-30 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386850#comment-14386850
 ] 

Dustin Cote commented on YARN-3419:
---

[~ozawa] yep, that will probably fix it, just on the caller side.  I didn't see 
that one.  I'm OK if you want to mark this as a duplicate.  There's probably no 
reason to safeguard against other callers that might use FSLeafQueue 
improperly, given the potential for more memory use with my suggested fix.  
Thanks for checking!

> ConcurrentModificationException in FSLeafQueue
> --
>
> Key: YARN-3419
> URL: https://issues.apache.org/jira/browse/YARN-3419
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 2.5.0
> Reporter: Dustin Cote
> Assignee: Dustin Cote
> Attachments: YARN-3419-1.patch
>
>
> Heavy Resource Manager use causes a manifestation of a 
> ConcurrentModificationException in FSLeafQueue.  Doesn't look like 
> FSLeafQueue does anything except add, remove, traverse, and get sorted, so I 
> think we could use a CopyOnWriteArrayList that will use a bit more memory but 
> remove these exceptions.  Seems to me that there will be relatively few app 
> adds compared to the number of traversals.  Stack trace below:
> 2015-03-27 00:47:34,773  ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1427401429921_3388
> java.util.ConcurrentModificationException
>   at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
>   at java.util.ArrayList$Itr.next(ArrayList.java:831)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:929)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:922)
>   at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
>   at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:110)
>   at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:765)
>   at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:746)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>   at java.lang.Thread.run(Thread.java:744)





[jira] [Updated] (YARN-3419) ConcurrentModificationException in FSLeafQueue

2015-03-30 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-3419:
--
Attachment: YARN-3419-1.patch

Attaching a patch that moves to a thread-safe ArrayList implementation.
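
To illustrate the difference this change relies on, here is a small 
self-contained sketch.  It is not the attached patch and does not use 
FSLeafQueue; the class and method names are hypothetical.  It modifies a list 
while iterating over it, on a single thread so the outcome is deterministic, 
which is the same fail-fast check that fires in the stack trace quoted below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Illustrative only; stands in for a list that is traversed while apps are added. */
final class LeafQueueListSketch {

  /**
   * Sums the list while adding one element part-way through the traversal.
   * Returns the sum, or -1 if the iterator detected the modification.
   */
  static int sumWhileAdding(List<Integer> usages) {
    int total = 0;
    boolean added = false;
    try {
      for (int usage : usages) {
        total += usage;
        if (!added) {
          usages.add(100); // structural change during iteration
          added = true;
        }
      }
    } catch (ConcurrentModificationException e) {
      return -1;
    }
    return total;
  }

  public static void main(String[] args) {
    List<Integer> plain = new ArrayList<>(Arrays.asList(1, 2, 3));
    List<Integer> copyOnWrite = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
    System.out.println("ArrayList: " + sumWhileAdding(plain));
    System.out.println("CopyOnWriteArrayList: " + sumWhileAdding(copyOnWrite));
  }
}
```

The CopyOnWriteArrayList iterator walks the snapshot taken when the loop 
started, so the traversal completes and the added element only shows up in 
later traversals.  The cost is that every add or remove copies the backing 
array, which is the extra memory use mentioned in the description.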

> ConcurrentModificationException in FSLeafQueue
> --
>
> Key: YARN-3419
> URL: https://issues.apache.org/jira/browse/YARN-3419
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 2.5.0
> Reporter: Dustin Cote
> Assignee: Dustin Cote
> Attachments: YARN-3419-1.patch
>
>
> Heavy Resource Manager use causes a manifestation of a 
> ConcurrentModificationException in FSLeafQueue.  Doesn't look like 
> FSLeafQueue does anything except add, remove, traverse, and get sorted, so I 
> think we could use a CopyOnWriteArrayList that will use a bit more memory but 
> remove these exceptions.  Seems to me that there will be relatively few app 
> adds compared to the number of traversals.  Stack trace below:
> 2015-03-27 00:47:34,773  ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1427401429921_3388
> java.util.ConcurrentModificationException
>   at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
>   at java.util.ArrayList$Itr.next(ArrayList.java:831)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:929)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:922)
>   at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
>   at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:110)
>   at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:765)
>   at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:746)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>   at java.lang.Thread.run(Thread.java:744)





[jira] [Created] (YARN-3419) ConcurrentModificationException in FSLeafQueue

2015-03-30 Thread Dustin Cote (JIRA)
Dustin Cote created YARN-3419:
-

Summary: ConcurrentModificationException in FSLeafQueue
Key: YARN-3419
URL: https://issues.apache.org/jira/browse/YARN-3419
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Affects Versions: 2.5.0
Reporter: Dustin Cote
Assignee: Dustin Cote


Heavy ResourceManager use triggers a ConcurrentModificationException in 
FSLeafQueue.  It doesn't look like FSLeafQueue does anything with the list 
except add, remove, traverse, and sort, so I think we could use a 
CopyOnWriteArrayList, which uses a bit more memory but removes these 
exceptions.  It seems to me that there will be relatively few app adds compared 
to the number of traversals.  Stack trace below:
2015-03-27 00:47:34,773  ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1427401429921_3388
java.util.ConcurrentModificationException
  at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
  at java.util.ArrayList$Itr.next(ArrayList.java:831)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:929)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:922)
  at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
  at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
  at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
  at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:110)
  at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:765)
  at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:746)
  at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
  at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
  at java.lang.Thread.run(Thread.java:744)





[jira] [Commented] (YARN-2369) Environment variable handling assumes values should be appended

2015-03-19 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14369953#comment-14369953
 ] 

Dustin Cote commented on YARN-2369:
---

[~jlowe] or [~aw], is this one still needed?  If it is, I'd like to take a 
crack at it.  I've run into problems with LD_LIBRARY_PATH myself, so if this 
isn't already fixed by something else in a later version, I think it should be.

> Environment variable handling assumes values should be appended
> ---
>
> Key: YARN-2369
> URL: https://issues.apache.org/jira/browse/YARN-2369
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Jason Lowe
> Assignee: Dustin Cote
>
> When processing environment variables for a container context the code 
> assumes that the value should be appended to any pre-existing value in the 
> environment.  This may be desired behavior for handling path-like environment 
> variables such as PATH, LD_LIBRARY_PATH, CLASSPATH, etc. but it is a 
> non-intuitive and harmful way to handle any variable that does not have 
> path-like semantics.





[jira] [Assigned] (YARN-2369) Environment variable handling assumes values should be appended

2015-03-19 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote reassigned YARN-2369:
-

Assignee: Dustin Cote

> Environment variable handling assumes values should be appended
> ---
>
> Key: YARN-2369
> URL: https://issues.apache.org/jira/browse/YARN-2369
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Jason Lowe
> Assignee: Dustin Cote
>
> When processing environment variables for a container context the code 
> assumes that the value should be appended to any pre-existing value in the 
> environment.  This may be desired behavior for handling path-like environment 
> variables such as PATH, LD_LIBRARY_PATH, CLASSPATH, etc. but it is a 
> non-intuitive and harmful way to handle any variable that does not have 
> path-like semantics.





[jira] [Commented] (YARN-1814) Better error message when browsing logs in the RM/NM webuis

2014-12-16 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248983#comment-14248983
 ] 

Dustin Cote commented on YARN-1814:
---

No tests associated here as this is only a string update.

> Better error message when browsing logs in the RM/NM webuis
> ---
>
> Key: YARN-1814
> URL: https://issues.apache.org/jira/browse/YARN-1814
> Project: Hadoop YARN
> Issue Type: Improvement
> Affects Versions: 2.3.0
> Reporter: Andrew Wang
> Assignee: Dustin Cote
> Priority: Minor
> Attachments: YARN-1814-1.patch
>
>
> Browsing the webUI as a different user than the one who ran an MR job, I 
> click into host:8088/cluster/app/, then the "logs" link. This 
> redirects to the NM, but since I don't have permissions it prints out:
> bq. Failed redirect for container_1394482121761_0010_01_01
> bq. Failed while trying to construct the redirect url to the log server. Log 
> Server url may not be configured
> bq. Container does not exist.
> It'd be nicer to print something about permissions instead.





[jira] [Updated] (YARN-1814) Better error message when browsing logs in the RM/NM webuis

2014-12-16 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-1814:
--
Attachment: YARN-1814-1.patch

> Better error message when browsing logs in the RM/NM webuis
> ---
>
> Key: YARN-1814
> URL: https://issues.apache.org/jira/browse/YARN-1814
> Project: Hadoop YARN
> Issue Type: Improvement
> Affects Versions: 2.3.0
> Reporter: Andrew Wang
> Assignee: Dustin Cote
> Priority: Minor
> Attachments: YARN-1814-1.patch
>
>
> Browsing the webUI as a different user than the one who ran an MR job, I 
> click into host:8088/cluster/app/, then the "logs" link. This 
> redirects to the NM, but since I don't have permissions it prints out:
> bq. Failed redirect for container_1394482121761_0010_01_01
> bq. Failed while trying to construct the redirect url to the log server. Log 
> Server url may not be configured
> bq. Container does not exist.
> It'd be nicer to print something about permissions instead.





[jira] [Commented] (YARN-1814) Better error message when browsing logs in the RM/NM webuis

2014-12-16 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248980#comment-14248980
 ] 

Dustin Cote commented on YARN-1814:
---

Changing the error message to suggest that the logged-in user may not have 
access to view the container logs for this job.  Attaching patch now.

> Better error message when browsing logs in the RM/NM webuis
> ---
>
> Key: YARN-1814
> URL: https://issues.apache.org/jira/browse/YARN-1814
> Project: Hadoop YARN
> Issue Type: Improvement
> Affects Versions: 2.3.0
> Reporter: Andrew Wang
> Assignee: Dustin Cote
> Priority: Minor
>
> Browsing the webUI as a different user than the one who ran an MR job, I 
> click into host:8088/cluster/app/, then the "logs" link. This 
> redirects to the NM, but since I don't have permissions it prints out:
> bq. Failed redirect for container_1394482121761_0010_01_01
> bq. Failed while trying to construct the redirect url to the log server. Log 
> Server url may not be configured
> bq. Container does not exist.
> It'd be nicer to print something about permissions instead.





[jira] [Assigned] (YARN-1814) Better error message when browsing logs in the RM/NM webuis

2014-12-16 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote reassigned YARN-1814:
-

Assignee: Dustin Cote

> Better error message when browsing logs in the RM/NM webuis
> ---
>
> Key: YARN-1814
> URL: https://issues.apache.org/jira/browse/YARN-1814
> Project: Hadoop YARN
> Issue Type: Improvement
> Affects Versions: 2.3.0
> Reporter: Andrew Wang
> Assignee: Dustin Cote
> Priority: Minor
>
> Browsing the webUI as a different user than the one who ran an MR job, I 
> click into host:8088/cluster/app/, then the "logs" link. This 
> redirects to the NM, but since I don't have permissions it prints out:
> bq. Failed redirect for container_1394482121761_0010_01_01
> bq. Failed while trying to construct the redirect url to the log server. Log 
> Server url may not be configured
> bq. Container does not exist.
> It'd be nicer to print something about permissions instead.





[jira] [Updated] (YARN-2950) Change message to mandate, not suggest JS requirement on UI

2014-12-12 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2950:
--
Attachment: YARN-2950-1.patch

Updating the error message as Harsh suggested.

> Change message to mandate, not suggest JS requirement on UI
> ---
>
> Key: YARN-2950
> URL: https://issues.apache.org/jira/browse/YARN-2950
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: webapp
> Affects Versions: 2.5.0
> Reporter: Harsh J
> Assignee: Dustin Cote
> Priority: Minor
> Labels: newbie
> Attachments: YARN-2950-1.patch
>
>
> Most of YARN's UIs do not work with JavaScript disabled on the browser, cause 
> they appear to send back data as JS arrays instead of within the actual HTML 
> content.
> The JQueryUI prints only a mild warning about this suggesting that {{This 
> page works best with javascript enabled.}}, when in fact it ought to be 
> {{This page will not function without javascript enabled. Please enable 
> javascript on your browser.}} or something as such (more direct).





[jira] [Assigned] (YARN-2950) Change message to mandate, not suggest JS requirement on UI

2014-12-11 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote reassigned YARN-2950:
-

Assignee: Dustin Cote

> Change message to mandate, not suggest JS requirement on UI
> ---
>
> Key: YARN-2950
> URL: https://issues.apache.org/jira/browse/YARN-2950
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: webapp
> Affects Versions: 2.5.0
> Reporter: Harsh J
> Assignee: Dustin Cote
> Priority: Minor
> Labels: newbie
>
> Most of YARN's UIs do not work with JavaScript disabled on the browser, cause 
> they appear to send back data as JS arrays instead of within the actual HTML 
> content.
> The JQueryUI prints only a mild warning about this suggesting that {{This 
> page works best with javascript enabled.}}, when in fact it ought to be 
> {{This page will not function without javascript enabled. Please enable 
> javascript on your browser.}} or something as such (more direct).





[jira] [Updated] (YARN-2891) Failed Container Executor does not provide a clear error message

2014-12-01 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2891:
--
Attachment: YARN-2891-1.patch

Attaching a patch file because I'm not really sure how to submit it otherwise 
for review.

> Failed Container Executor does not provide a clear error message
> 
>
> Key: YARN-2891
> URL: https://issues.apache.org/jira/browse/YARN-2891
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager
> Affects Versions: 2.5.1
> Environment: any
> Reporter: Dustin Cote
> Assignee: Dustin Cote
> Priority: Minor
> Attachments: YARN-2891-1.patch
>
>
> When checking access to directories, the container executor does not provide 
> clear information on which directory actually could not be accessed.





[jira] [Updated] (YARN-2891) Failed Container Executor does not provide a clear error message

2014-11-21 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2891:
--
Issue Type: Improvement  (was: Bug)

> Failed Container Executor does not provide a clear error message
> 
>
> Key: YARN-2891
> URL: https://issues.apache.org/jira/browse/YARN-2891
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager
> Affects Versions: 2.5.1
> Environment: any
> Reporter: Dustin Cote
> Priority: Minor
>
> When checking access to directories, the container executor does not provide 
> clear information on which directory actually could not be accessed.





[jira] [Created] (YARN-2891) Failed Container Executor does not provide a clear error message

2014-11-21 Thread Dustin Cote (JIRA)
Dustin Cote created YARN-2891:
-

Summary: Failed Container Executor does not provide a clear error message
Key: YARN-2891
URL: https://issues.apache.org/jira/browse/YARN-2891
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.5.1
Environment: any
Reporter: Dustin Cote
Priority: Minor


When checking access to directories, the container executor does not provide 
clear information on which directory actually could not be accessed.
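
As a rough illustration of the kind of message being asked for, the sketch 
below checks a list of directories and names each one that fails.  It is 
written in Java purely for illustration and is not the container executor's 
actual code; the class name, method name, and message wording are all 
assumptions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only; names and message wording are hypothetical. */
final class DirAccessCheckSketch {

  /** Returns one message per directory that cannot be used, naming the path. */
  static List<String> describeInaccessible(List<Path> dirs) {
    List<String> problems = new ArrayList<>();
    for (Path dir : dirs) {
      if (!Files.isDirectory(dir)) {
        problems.add("Not a directory or does not exist: " + dir);
      } else if (!Files.isReadable(dir) || !Files.isWritable(dir)) {
        problems.add("Cannot read or write directory: " + dir);
      }
    }
    return problems;
  }

  public static void main(String[] args) {
    // Takes directory paths from the command line and reports each failure.
    List<Path> dirs = new ArrayList<>();
    for (String arg : args) {
      dirs.add(Paths.get(arg));
    }
    for (String problem : describeInaccessible(dirs)) {
      System.err.println(problem);
    }
  }
}
```

Collecting every failing path, rather than stopping at the first, lets an 
operator fix all misconfigured directories in one pass.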


