[jira] [Updated] (YARN-3964) Support NodeLabelsProvider at Resource Manager side

2015-10-02 Thread Dian Fu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated YARN-3964:
--
Attachment: YARN-3964.015.patch

Attaching a new patch to fix the findbugs warning.

> Support NodeLabelsProvider at Resource Manager side
> ---
>
> Key: YARN-3964
> URL: https://issues.apache.org/jira/browse/YARN-3964
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Dian Fu
>Assignee: Dian Fu
> Attachments: YARN-3964 design doc.pdf, YARN-3964.002.patch, 
> YARN-3964.003.patch, YARN-3964.004.patch, YARN-3964.005.patch, 
> YARN-3964.006.patch, YARN-3964.007.patch, YARN-3964.007.patch, 
> YARN-3964.008.patch, YARN-3964.009.patch, YARN-3964.010.patch, 
> YARN-3964.011.patch, YARN-3964.012.patch, YARN-3964.013.patch, 
> YARN-3964.014.patch, YARN-3964.015.patch, YARN-3964.1.patch
>
>
> Currently, a CLI/REST API is provided in the Resource Manager to allow users to 
> specify labels for nodes. For labels which may change over time, users have to 
> run a cron job to update the labels. This has the following 
> limitations:
> - The cron job needs to be run as the YARN admin user.
> - It is a little complicated to maintain, as users have to make 
> sure this service/daemon is alive.
> Adding a Node Labels Provider in the Resource Manager will give users more 
> flexibility.
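
A minimal, purely illustrative sketch of what such an RM-side provider could look like (class, method, and interface names are hypothetical, not the actual YARN-3964 API): the RM polls the provider periodically and pushes the node-to-labels mapping into its labels manager, removing the need for an external cron job.

{code}
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical names; not the actual YARN-3964 API.
public abstract class RMNodeLabelsProvider {

  /** Compute the current node -> labels mapping (e.g. from a script or a REST source). */
  public abstract Map<String, Set<String>> fetchNodeLabels();

  /** Callback that would hand the mapping to the RM's node labels manager. */
  public interface LabelsSink {
    void replaceLabelsOnNodes(Map<String, Set<String>> nodeToLabels);
  }

  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  /** Poll the provider on a fixed interval so labels that change over time are
      picked up without an external cron job run as the YARN admin user. */
  public void start(LabelsSink sink, long intervalMs) {
    scheduler.scheduleAtFixedRate(
        () -> sink.replaceLabelsOnNodes(fetchNodeLabels()),
        0, intervalMs, TimeUnit.MILLISECONDS);
  }

  public void stop() {
    scheduler.shutdownNow();
  }
}
{code}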



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4217) Failed AM attempt retries on same failed host

2015-10-02 Thread Eric Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne resolved YARN-4217.
--
Resolution: Duplicate

bq. Eric Payne - is this a duplicate of YARN-2005?
[~vvasudev], yes it is. I did do a search, but I missed that one. Thanks a lot!

> Failed AM attempt retries on same failed host
> -
>
> Key: YARN-4217
> URL: https://issues.apache.org/jira/browse/YARN-4217
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 2.7.1
>Reporter: Eric Payne
>
> This happens when the cluster is maxed out. One node is going bad, so 
> everything that runs on it fails, and as a result the bad node is never busy. Since the 
> cluster is maxed out, when the RM looks for a node with available resources, 
> it will always find the bad one, because nothing can run on it and so it 
> always has resources available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1550) NPE in FairSchedulerAppsBlock#render

2015-10-02 Thread Michael England (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941191#comment-14941191
 ] 

Michael England commented on YARN-1550:
---

I have hit this issue in Hadoop 2.5.1 - I am guessing this fix was never 
committed into the master branch?

> NPE in FairSchedulerAppsBlock#render
> 
>
> Key: YARN-1550
> URL: https://issues.apache.org/jira/browse/YARN-1550
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.2.0
>Reporter: caolong
>Assignee: Anubhav Dhoot
>Priority: Critical
> Fix For: 2.5.0
>
> Attachments: YARN-1550.001.patch, YARN-1550.002.patch, 
> YARN-1550.003.patch, YARN-1550.patch
>
>
> Three steps:
> 1. Set a breakpoint in RMAppManager#submitApplication after this code:
> if (rmContext.getRMApps().putIfAbsent(applicationId, application) !=
> null) {
>   String message = "Application with id " + applicationId
>   + " is already present! Cannot add a duplicate!";
>   LOG.warn(message);
>   throw RPCUtil.getRemoteException(message);
> }
> 2. Submit one application: hadoop jar 
> ~/hadoop-current/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-ydh2.2.0-tests.jar
>  sleep -Dhadoop.job.ugi=test2,#11 -Dmapreduce.job.queuename=p1 -m 1 -mt 1 
> -r 1
> 3. Go to the page http://ip:50030/cluster/scheduler and observe a 500 error.
> the log:
> {noformat}
> 2013-12-30 11:51:43,795 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
> handling URI: /cluster/scheduler
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.FairSchedulerAppsBlock.render(FairSchedulerAppsBlock.java:96)
>   at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:66)
>   at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:76)
> {noformat}
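
The repro suggests the apps block dereferences per-application state that is not yet populated for a freshly submitted application. A small, self-contained illustration of the kind of null guard that avoids this (hypothetical classes, not the actual FairSchedulerAppsBlock code):

{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: skip applications whose attempt data is not yet
// populated so the web page does not NPE while an app is still being added
// in RMAppManager#submitApplication.
public class AppsBlockGuardSketch {

  static class App {
    final String id;
    final String attempt;   // null until the first attempt is created
    App(String id, String attempt) { this.id = id; this.attempt = attempt; }
  }

  static List<String> renderRows(List<App> apps) {
    List<String> rows = new ArrayList<>();
    for (App app : apps) {
      if (app.attempt == null) {
        continue;            // skip apps without an attempt instead of throwing an NPE
      }
      rows.add(app.id + " -> " + app.attempt);
    }
    return rows;
  }

  public static void main(String[] args) {
    List<App> apps = new ArrayList<>();
    apps.add(new App("application_1_0001", "attempt_1"));
    apps.add(new App("application_1_0002", null)); // just submitted, no attempt yet
    System.out.println(renderRows(apps));          // only the first app is rendered
  }
}
{code}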



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4209) RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition

2015-10-02 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941207#comment-14941207
 ] 

Rohith Sharma K S commented on YARN-4209:
-

+1 for the latest patch. I will wait for a day before committing so that others can 
have a look at the final patch.

> RMStateStore FENCED state doesn’t work due to updateFencedState called by 
> stateMachine.doTransition
> ---
>
> Key: YARN-4209
> URL: https://issues.apache.org/jira/browse/YARN-4209
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.2
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Attachments: YARN-4209.000.patch, YARN-4209.001.patch, 
> YARN-4209.002.patch
>
>
> RMStateStore FENCED state doesn’t work due to {{updateFencedState}} being called by 
> {{stateMachine.doTransition}}. The reason is that the 
> {{stateMachine.doTransition}} call made from {{updateFencedState}} is embedded 
> in the {{stateMachine.doTransition}} call made from a public 
> API (removeRMDelegationToken...) or from {{ForwardingEventHandler#handle}}. So 
> right after the internal state transition from {{updateFencedState}} changes 
> the state to FENCED, the external state transition changes the state 
> back to ACTIVE. The end result is that RMStateStore is still in the ACTIVE 
> state even after {{notifyStoreOperationFailed}} is called. The only working 
> case for the FENCED state is {{notifyStoreOperationFailed}} called from 
> {{ZKRMStateStore#VerifyActiveStatusThread}}.
> For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter 
> external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => 
> {{notifyStoreOperationFailed}} 
> => {{updateFencedState}} => {{handleStoreEvent}} => enter internal 
> {{stateMachine.doTransition}} => exit internal {{stateMachine.doTransition}}, 
> changing the state to FENCED => exit external {{stateMachine.doTransition}}, changing 
> the state back to ACTIVE.
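
To make the nesting concrete, here is a toy, self-contained illustration of the symptom described above (plain Java, not the actual RMStateStore or StateMachineFactory code): the internal transition sets FENCED, but the enclosing external transition's result overwrites it with ACTIVE.

{code}
// Toy illustration only; not actual RMStateStore/StateMachineFactory code.
public class NestedTransitionSketch {

  enum State { ACTIVE, FENCED }

  static State current = State.ACTIVE;

  // External transition (e.g. the one triggered by removeRMDelegationToken):
  // a store failure triggers the fencing path, but the external transition
  // still installs ACTIVE as its own result afterwards.
  static void externalTransition() {
    internalFencingTransition();   // sets current = FENCED
    current = State.ACTIVE;        // external doTransition result overwrites it
  }

  // Internal transition (e.g. updateFencedState -> handleStoreEvent).
  static void internalFencingTransition() {
    current = State.FENCED;
  }

  public static void main(String[] args) {
    externalTransition();
    System.out.println(current);   // prints ACTIVE, not FENCED
  }
}
{code}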



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3964) Support NodeLabelsProvider at Resource Manager side

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940906#comment-14940906
 ] 

Hadoop QA commented on YARN-3964:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  19m 51s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 4 new or modified test files. |
| {color:green}+1{color} | javac |   7m 58s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  6s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m  0s | The applied patch generated  1 
new checkstyle issues (total was 211, now 211). |
| {color:green}+1{color} | whitespace |   0m  4s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 43s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 24s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   2m  3s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |  60m 25s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | | 110m 29s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764740/YARN-3964.015.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fd026f5 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-YARN-Build/9328/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9328/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9328/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9328/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9328/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9328/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9328/console |


This message was automatically generated.

> Support NodeLabelsProvider at Resource Manager side
> ---
>
> Key: YARN-3964
> URL: https://issues.apache.org/jira/browse/YARN-3964
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Dian Fu
>Assignee: Dian Fu
> Attachments: YARN-3964 design doc.pdf, YARN-3964.002.patch, 
> YARN-3964.003.patch, YARN-3964.004.patch, YARN-3964.005.patch, 
> YARN-3964.006.patch, YARN-3964.007.patch, YARN-3964.007.patch, 
> YARN-3964.008.patch, YARN-3964.009.patch, YARN-3964.010.patch, 
> YARN-3964.011.patch, YARN-3964.012.patch, YARN-3964.013.patch, 
> YARN-3964.014.patch, YARN-3964.015.patch, YARN-3964.1.patch
>
>
> Currently, CLI/REST API is provided in Resource Manager to allow users to 
> specify labels for nodes. For labels which may change over time, users will 
> have to start a cron job to update the labels. This has the following 
> limitations:
> - The cron job needs to be run in the YARN admin user.
> - This makes it a little complicate to maintain as users will have to make 
> sure this service/daemon is alive.
> Adding a Node Labels Provider in Resource Manager will provide user more 
> flexibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4170) AM need to be notified with priority in AllocateResponse

2015-10-02 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941270#comment-14941270
 ] 

Rohith Sharma K S commented on YARN-4170:
-

Currently, for every AM heartbeat, the priority is sent back to the AM in the 
AllocateResponse. Instead of sending the same priority in every heartbeat, should we 
send the priority only when it has been updated?
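
A minimal sketch of that suggestion, with purely hypothetical names (this is not the actual RM or AllocateResponse code): remember the priority last reported to the AM and only include it in the response when it changes.

{code}
// Hypothetical sketch; not actual RM code.
public class PriorityNotificationSketch {

  private Integer lastReportedPriority;   // null until the first report

  /** Returns the priority to put in the response, or null to omit the field. */
  public Integer priorityToReport(int currentPriority) {
    if (lastReportedPriority != null && lastReportedPriority == currentPriority) {
      return null;                        // unchanged: skip it this heartbeat
    }
    lastReportedPriority = currentPriority;
    return currentPriority;
  }

  public static void main(String[] args) {
    PriorityNotificationSketch s = new PriorityNotificationSketch();
    System.out.println(s.priorityToReport(10)); // 10 (first heartbeat)
    System.out.println(s.priorityToReport(10)); // null (unchanged, omitted)
    System.out.println(s.priorityToReport(5));  // 5 (updated)
  }
}
{code}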

> AM need to be notified with priority in AllocateResponse 
> -
>
> Key: YARN-4170
> URL: https://issues.apache.org/jira/browse/YARN-4170
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-4170.patch, 0002-YARN-4170.patch
>
>
> As discussed in MAPREDUCE-5870, the Application Master needs to be notified of the 
> priority in the Allocate heartbeat. This will help the AM know the priority and 
> update the JobStatus when the client asks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3798) ZKRMStateStore shouldn't create new session without occurrance of SESSIONEXPIED

2015-10-02 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941229#comment-14941229
 ] 

Tsuyoshi Ozawa commented on YARN-3798:
--

[~jianhe] could you take a look at patches?

> ZKRMStateStore shouldn't create new session without occurrance of 
> SESSIONEXPIED
> ---
>
> Key: YARN-3798
> URL: https://issues.apache.org/jira/browse/YARN-3798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Suse 11 Sp3
>Reporter: Bibin A Chundatt
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: RM.log, YARN-3798-2.7.002.patch, 
> YARN-3798-branch-2.6.01.patch, YARN-3798-branch-2.7.002.patch, 
> YARN-3798-branch-2.7.003.patch, YARN-3798-branch-2.7.004.patch, 
> YARN-3798-branch-2.7.005.patch, YARN-3798-branch-2.7.006.patch, 
> YARN-3798-branch-2.7.patch
>
>
> The RM goes down with a NoNode exception during creation of the znode for an app attempt.
> *Please find the exception logs below*
> {code}
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session connected
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session restored
> 2015-06-09 10:09:44,886 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Exception while executing a ZK operation.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:926)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1101)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:671)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:275)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:260)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-06-09 10:09:44,887 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Maxed 
> out ZK retries. Giving up!
> 2015-06-09 10:09:44,887 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error 
> updating appAttempt: appattempt_1433764310492_7152_01
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at 

[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-10-02 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941233#comment-14941233
 ] 

Jason Lowe commented on YARN-4051:
--

Thanks for the patch!  Sorry for the delay, as I missed this when it was 
originally filed.

I'm lukewarm on an event buffering approach since we have to track it and 
remember to propagate it at all the appropriate times which is a maintenance 
burden.  Would it be simpler if we simply prevented the kill request from 
coming in too soon?  Seems like another way to fix this would be to prevent 
kill requests from arriving before we're done recovering containers.  We could 
do a similar "try again" response as we do for container start requests while 
still recovering, and we can postpone finish application processing until after 
containers are recovered.

However we decide to fix this, there should be a unit test to cover the 
scenario.
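
A minimal, self-contained sketch of the "try again while still recovering" idea, assuming hypothetical names rather than the actual NodeManager classes: requests that arrive before container recovery has completed get a retriable error, the same way start-container requests are handled during recovery.

{code}
// Hypothetical sketch; not the actual NodeManager/ContainerManager code.
public class RecoveryGuardSketch {

  static class NMNotYetReadyException extends RuntimeException {
    NMNotYetReadyException(String msg) { super(msg); }
  }

  private volatile boolean containersRecovered = false;

  void finishRecovery() {
    containersRecovered = true;
  }

  void stopContainer(String containerId) {
    if (!containersRecovered) {
      // Caller is expected to retry, as with start-container requests during recovery.
      throw new NMNotYetReadyException(
          "Still recovering containers, retry stop of " + containerId);
    }
    System.out.println("stopping " + containerId);
  }

  public static void main(String[] args) {
    RecoveryGuardSketch nm = new RecoveryGuardSketch();
    try {
      nm.stopContainer("container_1_0001_01_000002");
    } catch (NMNotYetReadyException e) {
      System.out.println("retry later: " + e.getMessage());
    }
    nm.finishRecovery();
    nm.stopContainer("container_1_0001_01_000002");
  }
}
{code}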

> ContainerKillEvent is lost when container is  In New State and is recovering
> 
>
> Key: YARN-4051
> URL: https://issues.apache.org/jira/browse/YARN-4051
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: sandflee
>Assignee: sandflee
>Priority: Critical
> Attachments: YARN-4051.01.patch, YARN-4051.02.patch, 
> YARN-4051.03.patch
>
>
> As in YARN-4050, the NM event dispatcher is blocked and the container is in the New 
> state; when we finish the application, the container stays alive even after the NM 
> event dispatcher is unblocked.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941289#comment-14941289
 ] 

zhihai xu commented on YARN-3619:
-

For comparison, I just attached the previously confusing patch (YARN-3619.alt.patch) which I 
had removed. It shares both the timer and the timer task, which is more 
complicated than YARN-3619.001.patch, so I think YARN-3619.001.patch (which shares 
the timer only) is the better approach: it is simpler and more accurately controls 
the time at which the container metrics are unregistered.
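
For illustration only, a rough sketch of the shared-timer idea (hypothetical names, not the patch itself): instead of unregistering a metrics source from inside getMetrics() while MetricsSystemImpl is iterating its sources, the source schedules the unregistration on a single shared Timer so it happens outside that iteration.

{code}
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical sketch; not the actual ContainerMetrics code.
public class DeferredUnregisterSketch {

  // One timer shared by all container metrics instances.
  private static final Timer UNREGISTER_TIMER = new Timer("unregister", true);

  interface MetricsRegistry {
    void unregisterSource(String name);
  }

  static void scheduleUnregister(final MetricsRegistry registry,
                                 final String sourceName,
                                 long delayMs) {
    UNREGISTER_TIMER.schedule(new TimerTask() {
      @Override
      public void run() {
        // Runs later, outside the getMetrics() call that sampled this source.
        registry.unregisterSource(sourceName);
      }
    }, delayMs);
  }

  public static void main(String[] args) throws InterruptedException {
    scheduleUnregister(name -> System.out.println("unregistered " + name),
        "ContainerResource_container_1_0001_01_000002", 100);
    Thread.sleep(300);
  }
}
{code}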

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3798) ZKRMStateStore shouldn't create new session without occurrance of SESSIONEXPIED

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941236#comment-14941236
 ] 

Hadoop QA commented on YARN-3798:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12746555/YARN-3798-branch-2.6.01.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 439f43a |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9329/console |


This message was automatically generated.

> ZKRMStateStore shouldn't create new session without occurrance of 
> SESSIONEXPIED
> ---
>
> Key: YARN-3798
> URL: https://issues.apache.org/jira/browse/YARN-3798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Suse 11 Sp3
>Reporter: Bibin A Chundatt
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: RM.log, YARN-3798-2.7.002.patch, 
> YARN-3798-branch-2.6.01.patch, YARN-3798-branch-2.7.002.patch, 
> YARN-3798-branch-2.7.003.patch, YARN-3798-branch-2.7.004.patch, 
> YARN-3798-branch-2.7.005.patch, YARN-3798-branch-2.7.006.patch, 
> YARN-3798-branch-2.7.patch
>
>
> The RM goes down with a NoNode exception during creation of the znode for an app attempt.
> *Please find the exception logs below*
> {code}
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session connected
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session restored
> 2015-06-09 10:09:44,886 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Exception while executing a ZK operation.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:926)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1101)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:671)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:275)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:260)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-06-09 10:09:44,887 INFO 
> 

[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941249#comment-14941249
 ] 

zhihai xu commented on YARN-3619:
-

The checkstyle issues and release audit warnings for the latest patch 
YARN-3619.001.patch were pre-existing.

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3619:

Attachment: YARN-3619.alt.patch

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941374#comment-14941374
 ] 

Hadoop QA commented on YARN-3619:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  21m 16s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 50s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  5s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 59s | The applied patch generated  1 
new checkstyle issues (total was 211, now 211). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   6m 17s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |   6m 32s | Tests failed in 
hadoop-common. |
| {color:green}+1{color} | yarn tests |   0m 23s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   2m  0s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   8m 41s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  69m  6s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-nodemanager |
| Failed unit tests | hadoop.fs.sftp.TestSFTPFileSystem |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764776/YARN-3619.alt.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 439f43a |
| Release Audit | 
https://builds.apache.org/job/PreCommit-YARN-Build/9330/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9330/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/9330/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9330/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9330/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9330/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9330/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9330/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9330/console |


This message was automatically generated.

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3996) YARN-789 (Support for zero capabilities in fairscheduler) is broken after YARN-3305

2015-10-02 Thread Neelesh Srinivas Salian (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941447#comment-14941447
 ] 

Neelesh Srinivas Salian commented on YARN-3996:
---

[~adhoot] thanks for the review. Will update with Version 1.

> YARN-789 (Support for zero capabilities in fairscheduler) is broken after 
> YARN-3305
> ---
>
> Key: YARN-3996
> URL: https://issues.apache.org/jira/browse/YARN-3996
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Neelesh Srinivas Salian
>Priority: Critical
> Attachments: YARN-3996.prelim.patch
>
>
> RMAppManager#validateAndCreateResourceRequest calls into normalizeRequest 
> with minimumResource as the incrementResource. This causes normalize to 
> return zero if the minimum is set to zero, as allowed by YARN-789.
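
A toy illustration of the normalization arithmetic being described (not the actual Resources or DefaultResourceCalculator code): if the rounding step passed to normalize is the scheduler minimum and that minimum is zero, an un-guarded rounding formula either divides by zero or collapses every ask, which is why a separate increment resource matters.

{code}
// Toy sketch only; not the actual Hadoop resource-calculator code.
public class NormalizeSketch {

  static long roundUp(long value, long step) {
    if (step == 0) {
      return value;               // guard a zero step instead of dividing by zero
    }
    return ((value + step - 1) / step) * step;
  }

  static long normalize(long ask, long minimum, long increment, long maximum) {
    long normalized = Math.max(roundUp(ask, increment), minimum);
    return Math.min(normalized, maximum);
  }

  public static void main(String[] args) {
    // Proper increment: a 1536 MB ask rounds up to 2048 MB.
    System.out.println(normalize(1536, 0, 1024, 8192)); // 2048
    // Zero minimum passed in as the increment: with the guard the ask is kept;
    // without a guard, the rounding formula would divide by zero.
    System.out.println(normalize(1536, 0, 0, 8192));    // 1536
  }
}
{code}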



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4218) Metric for resource*time that was preempted

2015-10-02 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4218:
---
Attachment: screenshot-1.png

> Metric for resource*time that was preempted
> ---
>
> Key: YARN-4218
> URL: https://issues.apache.org/jira/browse/YARN-4218
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4218.patch, YARN-4218.wip.patch, screenshot-1.png
>
>
> After YARN-415 we have the ability to track the resource*time footprint of a 
> job, and the preemption metrics show how many containers were preempted on a job. 
> However we don't have a metric showing the resource*time footprint cost of 
> preemption. In other words, we know how many containers were preempted but we 
> don't have a good measure of how much work was lost as a result of preemption.
> We should add this metric so we can analyze how much work preemption is 
> costing on a grid and better track which jobs were heavily impacted by it. A 
> job that has 100 containers preempted that only lasted a minute each and were 
> very small is going to be less impacted than a job that only lost a single 
> container but that container was huge and had been running for 3 days.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4218) Metric for resource*time that was preempted

2015-10-02 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4218:
---
Attachment: YARN-4218.patch

The latest patch adds the new metric for preempted work (resource*time) and 
pushes the new metric to the app report, UI, REST, audit log, and the timeline server.
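
For illustration, a self-contained sketch of the kind of accounting such a metric implies (hypothetical names, not the actual patch): when a container is preempted, accumulate its resource multiplied by its runtime, so the lost work is visible rather than just the container count.

{code}
// Hypothetical sketch; not the actual YARN-4218 patch.
public class PreemptedWorkSketch {

  private long preemptedMemorySeconds = 0;
  private long preemptedVcoreSeconds = 0;

  void containerPreempted(long memoryMb, int vcores,
                          long startTimeMs, long preemptTimeMs) {
    long seconds = Math.max(0, (preemptTimeMs - startTimeMs) / 1000);
    preemptedMemorySeconds += memoryMb * seconds;
    preemptedVcoreSeconds += (long) vcores * seconds;
  }

  public static void main(String[] args) {
    PreemptedWorkSketch m = new PreemptedWorkSketch();
    // 100 tiny one-minute containers versus one huge container that ran 3 days:
    for (int i = 0; i < 100; i++) {
      m.containerPreempted(1024, 1, 0, 60_000);
    }
    m.containerPreempted(32 * 1024, 8, 0, 3L * 24 * 3600 * 1000);
    System.out.println(m.preemptedMemorySeconds + " MB-seconds, "
        + m.preemptedVcoreSeconds + " vcore-seconds preempted");
  }
}
{code}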

> Metric for resource*time that was preempted
> ---
>
> Key: YARN-4218
> URL: https://issues.apache.org/jira/browse/YARN-4218
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4218.patch, YARN-4218.wip.patch
>
>
> After YARN-415 we have the ability to track the resource*time footprint of a 
> job, and the preemption metrics show how many containers were preempted on a job. 
> However we don't have a metric showing the resource*time footprint cost of 
> preemption. In other words, we know how many containers were preempted but we 
> don't have a good measure of how much work was lost as a result of preemption.
> We should add this metric so we can analyze how much work preemption is 
> costing on a grid and better track which jobs were heavily impacted by it. A 
> job that has 100 containers preempted that only lasted a minute each and were 
> very small is going to be less impacted than a job that only lost a single 
> container but that container was huge and had been running for 3 days.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941527#comment-14941527
 ] 

Jason Lowe commented on YARN-3619:
--

+1 for the .001 patch.  The patch doesn't apply to branch-2.7 cleanly, and I'd 
like to get it fixed in 2.7.2.  Can you provide a patch against branch-2.7?

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4140) RM container allocation delayed incase of app submitted to Nodelabel partition

2015-10-02 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-4140:
---
Attachment: 0014-YARN-4140.patch

> RM container allocation delayed incase of app submitted to Nodelabel partition
> --
>
> Key: YARN-4140
> URL: https://issues.apache.org/jira/browse/YARN-4140
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4140.patch, 0002-YARN-4140.patch, 
> 0003-YARN-4140.patch, 0004-YARN-4140.patch, 0005-YARN-4140.patch, 
> 0006-YARN-4140.patch, 0007-YARN-4140.patch, 0008-YARN-4140.patch, 
> 0009-YARN-4140.patch, 0010-YARN-4140.patch, 0011-YARN-4140.patch, 
> 0012-YARN-4140.patch, 0013-YARN-4140.patch, 0014-YARN-4140.patch
>
>
> While trying to run an application on a Nodelabel partition I found that the 
> application execution time is delayed by 5 – 10 min for 500 containers. 
> There were 3 machines in total; 2 machines were in the same partition and the app was submitted to it.
> After enabling debug logging I was able to find the following:
> # From the AM the container ask is for OFF-SWITCH.
> # The RM is allocating all containers as NODE_LOCAL, as shown in the logs below.
> # Since I had about 500 containers, it took about 6 minutes 
> to allocate the 1st map after the AM allocation.
> # Tested with about 1K maps using the PI job; it took 17 minutes to allocate the next 
> container after the AM allocation.
> Once the 500 container allocations on NODE_LOCAL are done, the next container 
> allocation is done as OFF_SWITCH.
> {code}
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: 
> /default-rack, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: *, Relax 
> Locality: true, Node Label Expression: 3}
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: 
> host-10-19-92-143, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: 
> host-10-19-92-117, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> {code}
>  
> {code}
> 2015-09-09 14:35:45,467 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> 2015-09-09 14:35:45,831 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> 2015-09-09 14:35:46,469 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> 2015-09-09 14:35:46,832 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> {code}
> {code}
> dsperf@host-127:/opt/bibin/dsperf/HAINSTALL/install/hadoop/resourcemanager/logs1>
>  cat 

[jira] [Commented] (YARN-3996) YARN-789 (Support for zero capabilities in fairscheduler) is broken after YARN-3305

2015-10-02 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941443#comment-14941443
 ] 

Anubhav Dhoot commented on YARN-3996:
-

Approach looks ok

> YARN-789 (Support for zero capabilities in fairscheduler) is broken after 
> YARN-3305
> ---
>
> Key: YARN-3996
> URL: https://issues.apache.org/jira/browse/YARN-3996
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Neelesh Srinivas Salian
>Priority: Critical
> Attachments: YARN-3996.prelim.patch
>
>
> RMAppManager#validateAndCreateResourceRequest calls into normalizeRequest 
> with minimumResource as the incrementResource. This causes normalize to 
> return zero if the minimum is set to zero, as allowed by YARN-789.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4140) RM container allocation delayed incase of app submitted to Nodelabel partition

2015-10-02 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941450#comment-14941450
 ] 

Bibin A Chundatt commented on YARN-4140:


Uploaded a new patch for the same.
# {{String lastAnyRequestLabel = null;}} is not necessary. Added a check to not update the 
label for the ANY request, since it is already handled by the existing code and the 
lastRequest label will be correct for the queue and appUsage.
# Only formatting changes are done now.
# Corrected the FIFO and Fair scheduler test cases.
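
As an editorial illustration of the symptom in the issue description (the ANY/off-switch request carries the partition's node label expression "3" while the host and rack requests carry an empty one), here is a hypothetical sketch, not the actual patch, of letting locality requests fall back to the ANY request's label:

{code}
// Hypothetical illustration only; not the YARN-4140 patch.
public class EffectiveLabelSketch {

  static String effectiveLabel(String requestLabel, String anyRequestLabel) {
    if (requestLabel == null || requestLabel.isEmpty()) {
      return anyRequestLabel;   // inherit the label carried by the ANY request
    }
    return requestLabel;
  }

  public static void main(String[] args) {
    // Mirrors the logs above: the ANY request has label "3", host/rack requests have none.
    System.out.println(effectiveLabel("", "3"));   // 3
    System.out.println(effectiveLabel("3", "3"));  // 3
  }
}
{code}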

> RM container allocation delayed incase of app submitted to Nodelabel partition
> --
>
> Key: YARN-4140
> URL: https://issues.apache.org/jira/browse/YARN-4140
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4140.patch, 0002-YARN-4140.patch, 
> 0003-YARN-4140.patch, 0004-YARN-4140.patch, 0005-YARN-4140.patch, 
> 0006-YARN-4140.patch, 0007-YARN-4140.patch, 0008-YARN-4140.patch, 
> 0009-YARN-4140.patch, 0010-YARN-4140.patch, 0011-YARN-4140.patch, 
> 0012-YARN-4140.patch, 0013-YARN-4140.patch, 0014-YARN-4140.patch
>
>
> While trying to run an application on a Nodelabel partition I found that the 
> application execution time is delayed by 5 – 10 min for 500 containers. 
> There were 3 machines in total; 2 machines were in the same partition and the app was submitted to it.
> After enabling debug logging I was able to find the following:
> # From the AM the container ask is for OFF-SWITCH.
> # The RM is allocating all containers as NODE_LOCAL, as shown in the logs below.
> # Since I had about 500 containers, it took about 6 minutes 
> to allocate the 1st map after the AM allocation.
> # Tested with about 1K maps using the PI job; it took 17 minutes to allocate the next 
> container after the AM allocation.
> Once the 500 container allocations on NODE_LOCAL are done, the next container 
> allocation is done as OFF_SWITCH.
> {code}
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: 
> /default-rack, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: *, Relax 
> Locality: true, Node Label Expression: 3}
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: 
> host-10-19-92-143, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: 
> host-10-19-92-117, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> {code}
>  
> {code}
> 2015-09-09 14:35:45,467 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> 2015-09-09 14:35:45,831 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> 2015-09-09 14:35:46,469 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> 2015-09-09 14:35:46,832 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to 

[jira] [Updated] (YARN-4218) Metric for resource*time that was preempted

2015-10-02 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4218:
---
Attachment: screenshot-2.png

> Metric for resource*time that was preempted
> ---
>
> Key: YARN-4218
> URL: https://issues.apache.org/jira/browse/YARN-4218
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4218.patch, YARN-4218.wip.patch, screenshot-1.png, 
> screenshot-2.png, screenshot-3.png
>
>
> After YARN-415 we have the ability to track the resource*time footprint of a 
> job, and the preemption metrics show how many containers were preempted on a job. 
> However we don't have a metric showing the resource*time footprint cost of 
> preemption. In other words, we know how many containers were preempted but we 
> don't have a good measure of how much work was lost as a result of preemption.
> We should add this metric so we can analyze how much work preemption is 
> costing on a grid and better track which jobs were heavily impacted by it. A 
> job that has 100 containers preempted that only lasted a minute each and were 
> very small is going to be less impacted than a job that only lost a single 
> container but that container was huge and had been running for 3 days.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4218) Metric for resource*time that was preempted

2015-10-02 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4218:
---
Attachment: screenshot-3.png

> Metric for resource*time that was preempted
> ---
>
> Key: YARN-4218
> URL: https://issues.apache.org/jira/browse/YARN-4218
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4218.patch, YARN-4218.wip.patch, screenshot-1.png, 
> screenshot-2.png, screenshot-3.png
>
>
> After YARN-415 we have the ability to track the resource*time footprint of a 
> job, and the preemption metrics show how many containers were preempted on a job. 
> However we don't have a metric showing the resource*time footprint cost of 
> preemption. In other words, we know how many containers were preempted but we 
> don't have a good measure of how much work was lost as a result of preemption.
> We should add this metric so we can analyze how much work preemption is 
> costing on a grid and better track which jobs were heavily impacted by it. A 
> job that has 100 containers preempted that only lasted a minute each and were 
> very small is going to be less impacted than a job that only lost a single 
> container but that container was huge and had been running for 3 days.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4140) RM container allocation delayed incase of app submitted to Nodelabel partition

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941583#comment-14941583
 ] 

Hadoop QA commented on YARN-4140:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 33s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 51s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  2s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 12s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 32s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 26s | Tests passed in 
hadoop-yarn-server-common. |
| {color:green}+1{color} | yarn tests |  60m 14s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | | 102m 14s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764796/0014-YARN-4140.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 439f43a |
| Release Audit | 
https://builds.apache.org/job/PreCommit-YARN-Build/9331/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9331/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9331/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9331/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9331/console |


This message was automatically generated.

> RM container allocation delayed incase of app submitted to Nodelabel partition
> --
>
> Key: YARN-4140
> URL: https://issues.apache.org/jira/browse/YARN-4140
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4140.patch, 0002-YARN-4140.patch, 
> 0003-YARN-4140.patch, 0004-YARN-4140.patch, 0005-YARN-4140.patch, 
> 0006-YARN-4140.patch, 0007-YARN-4140.patch, 0008-YARN-4140.patch, 
> 0009-YARN-4140.patch, 0010-YARN-4140.patch, 0011-YARN-4140.patch, 
> 0012-YARN-4140.patch, 0013-YARN-4140.patch, 0014-YARN-4140.patch
>
>
> While trying to run an application on a Nodelabel partition I found that the 
> application execution time is delayed by 5 – 10 min for 500 containers. 
> There were 3 machines in total; 2 machines were in the same partition and the app was submitted to it.
> After enabling debug logging I was able to find the following:
> # From the AM the container ask is for OFF-SWITCH.
> # The RM is allocating all containers as NODE_LOCAL, as shown in the logs below.
> # Since I had about 500 containers, it took about 6 minutes 
> to allocate the 1st map after the AM allocation.
> # Tested with about 1K maps using the PI job; it took 17 minutes to allocate the next 
> container after the AM allocation.
> Once the 500 container allocations on NODE_LOCAL are done, the next container 
> allocation is done as OFF_SWITCH.
> {code}
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: 
> /default-rack, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: *, Relax 
> Locality: true, Node Label Expression: 3}
> 2015-09-09 15:21:58,954 DEBUG 
> 

[jira] [Updated] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3619:

Attachment: YARN-3619.branch-2.7.patch

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, YARN-3619.branch-2.7.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}
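
For readers unfamiliar with this failure mode, here is a minimal, self-contained sketch 
(hypothetical class and method names, not the actual {{MetricsSystemImpl}} or 
{{ContainerMetrics}} code) of the underlying Java behaviour: a source that removes itself 
from the registry while the registry is being iterated makes the next iterator step throw 
{{ConcurrentModificationException}}.

{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a registry of metrics "sources" is iterated while one of the
// sources unregisters itself from inside its getMetrics-style callback.
public class UnregisterDuringIterationDemo {

  interface Source {
    void getMetrics(Map<String, Source> registry, String name);
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps insertion order, so the self-removing source runs first.
    Map<String, Source> sources = new LinkedHashMap<String, Source>();

    // Analogous to a ContainerMetrics that unregisters once its container finishes.
    sources.put("container_1", new Source() {
      public void getMetrics(Map<String, Source> registry, String name) {
        registry.remove(name);   // structural modification during iteration
      }
    });
    // A well-behaved source after it, so the iterator has to advance once more.
    sources.put("container_2", new Source() {
      public void getMetrics(Map<String, Source> registry, String name) { }
    });

    // Analogous to MetricsSystemImpl.sampleMetrics walking all sources:
    // advancing past the removed entry throws java.util.ConcurrentModificationException.
    for (Map.Entry<String, Source> e : sources.entrySet()) {
      e.getValue().getMetrics(sources, e.getKey());
    }
  }
}
{code}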



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941624#comment-14941624
 ] 

Sangjin Lee commented on YARN-3864:
---

The design doc says

{quote}
Flow runs for a given flow must have unique and totally ordered run identifiers.
{quote}

It should be safe to assume that the flow run id is sequential *for a given 
flow*.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941663#comment-14941663
 ] 

Hadoop QA commented on YARN-3864:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m 34s | Findbugs (version ) appears to 
be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   8m 28s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 23s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 17s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  6s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 39s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 42s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 55s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   3m 21s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  42m 53s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764824/YARN-3864-YARN-2928.02.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / a95b8f5 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9333/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9333/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9333/console |


This message was automatically generated.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4218) Metric for resource*time that was preempted

2015-10-02 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4218:
---
Attachment: YARN-4218.2.patch

The .2 patch updates the appSummary log.

> Metric for resource*time that was preempted
> ---
>
> Key: YARN-4218
> URL: https://issues.apache.org/jira/browse/YARN-4218
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4218.2.patch, YARN-4218.patch, YARN-4218.wip.patch, 
> screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>
> After YARN-415 we have the ability to track the resource*time footprint of a 
> job and preemption metrics shows how many containers were preempted on a job. 
> However we don't have a metric showing the resource*time footprint cost of 
> preemption. In other words, we know how many containers were preempted but we 
> don't have a good measure of how much work was lost as a result of preemption.
> We should add this metric so we can analyze how much work preemption is 
> costing on a grid and better track which jobs were heavily impacted by it. A 
> job that has 100 containers preempted that only lasted a minute each and were 
> very small is going to be less impacted than a job that only lost a single 
> container but that container was huge and had been running for 3 days.
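
As a purely illustrative back-of-the-envelope calculation (the numbers and the helper 
method below are made up for illustration, this is not YARN code), the resource*time 
framing makes the asymmetry in the description concrete:

{code}
// Back-of-the-envelope illustration of why a container count alone hides the real
// cost of preemption: the resource*time footprint tells a very different story.
public class PreemptionCostExample {

  // memory-seconds = container memory (MB) * seconds the container had run * count
  static long memorySecondsLost(long memoryMb, long runtimeSeconds, int containers) {
    return memoryMb * runtimeSeconds * containers;
  }

  public static void main(String[] args) {
    // Job A: 100 small containers, 1 GB each, preempted after about 1 minute each.
    long jobA = memorySecondsLost(1024, 60, 100);               // 6,144,000 MB-seconds

    // Job B: a single 32 GB container preempted after running for 3 days.
    long jobB = memorySecondsLost(32 * 1024, 3 * 24 * 3600, 1); // 8,493,465,600 MB-seconds

    System.out.println("Job A lost " + jobA + " MB-seconds");
    System.out.println("Job B lost " + jobB + " MB-seconds");
    // Job B lost over 1000x more work despite losing 1 container instead of 100.
  }
}
{code}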



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941606#comment-14941606
 ] 

Varun Saxena commented on YARN-3864:


Sorry, I meant that flow run ids cannot be guaranteed to be sorted by run start 
time.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941619#comment-14941619
 ] 

Sangjin Lee commented on YARN-3864:
---

Thanks for updating the patch [~varun_saxena]. I'm now in transit, and it will 
take a little time before I can go over the patch in more detail.

I'd like to summarize a couple of issues, and I would like us to tackle them after 
this. I don't think they need to hold this JIRA up.

(1) the REST API
At some point, let's take a look at the REST API (as discussed earlier), and 
see if we need to make them more consistent with the overall REST best 
practices and more importantly hadoop's REST style (YARN, etc.). It's not 
terribly urgent, and we just need to review it some time soon.

(2) storing user in the app-to-flow table
[~vrushalic] brought up the point of the user. And I'm realizing that we may 
need to store the user in the app-to-flow table. Was there a reason that we 
didn't store the user info in the app-to-flow table? The issue is that the user 
is a critical piece of the context (cluster/user/flow_id/flow_run_id/app_id), 
and the app-to-flow lookup should have it.

Right now in the REST call, the user is optional, and if it is not provided it 
is deduced from the caller UGI. But IMO the caller UGI is a poor/incorrect 
choice. This would work only if the user/client that's executing the REST call 
is the owner of that YARN app. Rather, the user should always be the user of 
the app, regardless of who executes the REST call (authorization is a separate 
topic).

I propose that we store the user id in the app-to-flow table, and when we 
recover the context on the read path, we use that user if the user is not 
provided by the caller. We don't have to do it as part of this JIRA, but we 
should fix this shortly after. What are your thoughts?
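
A minimal sketch of the proposed fallback on the read path (hypothetical names, and an 
in-memory map standing in for the app-to-flow table, not the actual reader code):

{code}
import java.util.HashMap;
import java.util.Map;

// Sketch only: resolve the user for a single-app query by preferring the explicit
// query parameter, then the owner stored with the app-to-flow mapping, and only as
// a last resort the caller's UGI.
public class AppToFlowUserLookup {

  // Pretend app-to-flow table: appId -> owner recorded when the app was written.
  private final Map<String, String> appOwner = new HashMap<String, String>();

  public AppToFlowUserLookup() {
    appOwner.put("application_1441791998224_0001", "app_owner");
  }

  public String resolveUser(String appId, String userFromQuery, String callerUgi) {
    if (userFromQuery != null) {
      return userFromQuery;          // explicit query parameter still wins
    }
    String storedUser = appOwner.get(appId);
    if (storedUser != null) {
      return storedUser;             // owner recorded in the app-to-flow table
    }
    return callerUgi;                // last resort only
  }
}
{code}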


> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941631#comment-14941631
 ] 

Varun Saxena commented on YARN-3864:


[~sjlee0],
bq. At some point, let's take a look at the REST API (as discussed earlier), 
and see if we need to make them more consistent with the overall REST best 
practices and more importantly hadoop's REST style (YARN, etc.). It's not 
terribly urgent, and we just need to review it some time soon.
We can do so. For user-facing APIs, the form and the naming are very important. 
Everyone can chime in on this one.

bq. storing user in the app-to-flow table
I was wondering as well why it's not there. In the FS implementation, it was kept in 
the mapping.
I think this should be fixed. Should I raise a new JIRA for this and fix it?

bq. I propose that we store the user id in the app-to-flow table, and when we 
recover the context on the read path, we use that user if the user is not 
provided by the caller.
But for certain queries, such as querying flows and flow runs or all apps for a 
flow/flow run, we will not have context information, because the context is per 
cluster and app, and for those queries a single app has no meaning. For these 
queries we could take the UGI and for the others take the user from the context, 
but that can be confusing for the user. So one option is to disregard the UGI 
altogether and force the user to be supplied for queries for which the context 
cannot be retrieved.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941673#comment-14941673
 ] 

Varun Saxena commented on YARN-3864:


bq. Yes, let's open a new JIRA. I can get to this next week, or you can take it 
on if you prefer.
OK, I will open a new JIRA and take it up. I will create a patch on top of this 
one.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941718#comment-14941718
 ] 

Hudson commented on YARN-3619:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8559 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8559/])
YARN-3619. ContainerMetrics unregisters during getMetrics and leads to (jlowe: 
rev fdf02d1f26cea372bf69e071f57b8bfc09c092c4)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml


> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Fix For: 2.7.2
>
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, YARN-3619.branch-2.7.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3864:
---
Attachment: YARN-3864-YARN-2928.02.patch

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3864:
---
Attachment: YARN-3864-YARN-2928.02.patch

Attaching a patch addressing Vrushali's and Sangjin's comments.
I have added a separate endpoint for apps within a flow run, in addition to a 
separate endpoint for apps within a flow. I did not go with the option of making 
the flow run an optional query param, as apps for a flow run will likely be a 
frequently accessed query, especially from the web UI.

Moreover, I am assuming that when we query apps for a flow run id, keys will 
be sorted after YARN-4178. But for apps within a flow, I am assuming they won't be 
sorted, because flow run ids cannot be guaranteed to be sorted by the 
apps' creation time.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941627#comment-14941627
 ] 

Hadoop QA commented on YARN-3619:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764832/YARN-3619.branch-2.7.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | branch-2 / 7964b13 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9335/console |


This message was automatically generated.

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, YARN-3619.branch-2.7.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941662#comment-14941662
 ] 

Sangjin Lee commented on YARN-3864:
---

{quote}
I was wondering as well why it's not there. In the FS implementation, it was kept in 
the mapping.
I think this should be fixed. Should I raise a new JIRA for this and fix it?
{quote}

Yes, let's open a new JIRA. I can get to this next week, or you can take it on 
if you prefer. Let me know. I'm fine either way. For the record, I don't think 
this needs to be done by the POC time. The same goes for the REST API review.

{quote}
But for certain queries, such as querying flows and flow runs or all apps for a 
flow/flow run, we will not have context information, because the context is per 
cluster and app, and for those queries a single app has no meaning. For these 
queries we could take the UGI and for the others take the user from the context, 
but that can be confusing for the user. So one option is to disregard the UGI 
altogether and force the user to be supplied for queries for which the context 
cannot be retrieved.
{quote}

Yes, I'm specifically talking about the existing app-to-flow lookup case which 
applies only to generic entities and apps. Sorry if it was not clear. I agree 
the flow run case is a little different. I do think the UGI is not very 
meaningful in that case either (if the caller user does not match the right 
user, it will return nothing anyway). I'm OK with making the user required.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4221) Store user in app to flow table

2015-10-02 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4221:
---
Description: 
We should store the user as well in the app-to-flow table.
For queries where the user is not supplied and the flow context can be retrieved from 
the app-to-flow table, we should take the user from the app-to-flow table instead of 
defaulting to the caller UGI.

This is as per the discussion on YARN-3864.

  was:
We should store the user as well in the app-to-flow table.
For queries where the user is not supplied and the flow context can be retrieved from 
the app-to-flow table, we should take the user from the app-to-flow table instead of 
defaulting to the caller UGI.


> Store user in app to flow table
> ---
>
> Key: YARN-4221
> URL: https://issues.apache.org/jira/browse/YARN-4221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>
> We should store the user as well in the app-to-flow table.
> For queries where the user is not supplied and the flow context can be retrieved from 
> the app-to-flow table, we should take the user from the app-to-flow table instead of 
> defaulting to the caller UGI.
> This is as per the discussion on YARN-3864.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941683#comment-14941683
 ] 

Jason Lowe commented on YARN-3619:
--

+1 for the branch-2.7 patch.  Committing this.

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, YARN-3619.branch-2.7.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941622#comment-14941622
 ] 

zhihai xu commented on YARN-3619:
-

Thanks [~jlowe]! Yes, I uploaded a patch, YARN-3619.branch-2.7.patch, based on 
branch-2.7.

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, YARN-3619.branch-2.7.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941634#comment-14941634
 ] 

Varun Saxena commented on YARN-3864:


Oh, I missed that part. I was assuming it would logically be in sequence, but 
we were not enforcing it. If this is documented, this should be fine.

Will change it in the next patch.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941646#comment-14941646
 ] 

Varun Saxena commented on YARN-3864:


I think we can do the REST API refactoring after the web UI POC.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4221) Store user in app to flow table

2015-10-02 Thread Varun Saxena (JIRA)
Varun Saxena created YARN-4221:
--

 Summary: Store user in app to flow table
 Key: YARN-4221
 URL: https://issues.apache.org/jira/browse/YARN-4221
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4221) Store user in app to flow table

2015-10-02 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4221:
---
Description: 
We should store the user as well in the app-to-flow table.
For queries where the user is not supplied and the flow context can be retrieved from 
the app-to-flow table, we should take the user from the app-to-flow table instead of 
defaulting to the caller UGI.

> Store user in app to flow table
> ---
>
> Key: YARN-4221
> URL: https://issues.apache.org/jira/browse/YARN-4221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>
> We should store the user as well in the app-to-flow table.
> For queries where the user is not supplied and the flow context can be retrieved from 
> the app-to-flow table, we should take the user from the app-to-flow table instead of 
> defaulting to the caller UGI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941684#comment-14941684
 ] 

Varun Saxena commented on YARN-3864:


[~sjlee0], filed YARN-4221.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3864:
---
Attachment: YARN-3864-YARN-2928.03.patch

As per [~sjlee0]'s comment, run ids will be in sequence, so I am updating the 
patch to treat them as sorted.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch, YARN-3864-YARN-2928.03.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3864:
---
Attachment: (was: YARN-3864-YARN-2928.02.patch)

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941586#comment-14941586
 ] 

Varun Saxena commented on YARN-3864:


[~gtCarrera9], the endpoint for the web UI POC will now be /flowrunapps instead of 
/apps.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941727#comment-14941727
 ] 

Li Lu commented on YARN-3864:
-

OK, I can update my local version accordingly. 

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch, YARN-3864-YARN-2928.03.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941869#comment-14941869
 ] 

zhihai xu commented on YARN-3619:
-

Thanks [~jlowe] for reviewing and committing the very old patch! I really 
appreciate it.

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Fix For: 2.7.2
>
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, YARN-3619.branch-2.7.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1509) Make AMRMClient support send increase container request and get increased/decreased containers

2015-10-02 Thread MENG DING (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

MENG DING updated YARN-1509:

Attachment: YARN-1509.3.patch

Attaching the latest patch:
* Added AbstractCallbackHandler in AMRMClientAsync
* Added more debug logs

[~leftnoteasy], do you have any questions or concerns regarding my previous response?

> Make AMRMClient support send increase container request and get 
> increased/decreased containers
> --
>
> Key: YARN-1509
> URL: https://issues.apache.org/jira/browse/YARN-1509
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan (No longer used)
>Assignee: MENG DING
> Attachments: YARN-1509.1.patch, YARN-1509.2.patch, YARN-1509.3.patch
>
>
> As described in YARN-1197, we need add API in AMRMClient to support
> 1) Add increase request
> 2) Can get successfully increased/decreased containers from RM



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941810#comment-14941810
 ] 

Hudson commented on YARN-3619:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #481 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/481/])
YARN-3619. ContainerMetrics unregisters during getMetrics and leads to (jlowe: 
rev fdf02d1f26cea372bf69e071f57b8bfc09c092c4)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java


> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Fix For: 2.7.2
>
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, YARN-3619.branch-2.7.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1510) Make NMClient support change container resources

2015-10-02 Thread MENG DING (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

MENG DING updated YARN-1510:

Attachment: YARN-1510.4.patch

Attaching the latest patch, which adds an {{NMClientAsync.AbstractCallbackHandler}} 
class that implements the original {{NMClientAsync.CallbackHandler}} 
interface. The new methods are all defined in the abstract class, which makes sure 
that the build will not break when old applications are compiled against the new client 
API.

The same will be done for {{AMRMClientAsync}}.
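
A generic illustration of that compatibility pattern follows (hypothetical method names, 
not the actual NMClientAsync API): the new callbacks live on an abstract class that 
implements the old interface, and the client only invokes them when the handler actually 
extends the abstract class.

{code}
// Sketch only: how adding callbacks via an abstract class keeps old applications,
// which implement the original interface, compiling and running unchanged.
public class CallbackCompatSketch {

  /** The original, public callback interface that existing applications implement. */
  interface CallbackHandler {
    void onContainerStarted(String containerId);
  }

  /** New callbacks are added here instead of on the interface. */
  public abstract static class AbstractCallbackHandler implements CallbackHandler {
    public abstract void onContainerResourceChanged(String containerId);
  }

  /** The async client only invokes the new callback when the handler supports it. */
  static void notifyResourceChanged(CallbackHandler handler, String containerId) {
    if (handler instanceof AbstractCallbackHandler) {
      ((AbstractCallbackHandler) handler).onContainerResourceChanged(containerId);
    }
    // Handlers that only implement the old interface are simply skipped, so
    // pre-existing applications break neither at compile time nor at runtime.
  }
}
{code}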

> Make NMClient support change container resources
> 
>
> Key: YARN-1510
> URL: https://issues.apache.org/jira/browse/YARN-1510
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Wangda Tan (No longer used)
>Assignee: MENG DING
> Attachments: YARN-1510-YARN-1197.1.patch, 
> YARN-1510-YARN-1197.2.patch, YARN-1510.3.patch, YARN-1510.4.patch
>
>
> As described in YARN-1197, YARN-1449, we need add API in NMClient to support
> 1) sending request of increase/decrease container resource limits
> 2) get succeeded/failed changed containers response from NM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3864) Implement support for querying single app and all apps for a flow run

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941839#comment-14941839
 ] 

Hadoop QA commented on YARN-3864:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 32s | Findbugs (version ) appears to 
be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |  10m 18s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  12m 19s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 27s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 19s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  6s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 49s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 49s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 10s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  19m 34s | Tests failed in 
hadoop-yarn-server-timelineservice. |
| | |  67m 35s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorage |
|   | 
hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity |
|   | 
hadoop.yarn.server.timelineservice.storage.TestPhoenixOfflineAggregationWriterImpl
 |
|   | 
hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
 |
|   | hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764841/YARN-3864-YARN-2928.03.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / a95b8f5 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9336/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9336/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9336/console |


This message was automatically generated.

> Implement support for querying single app and all apps for a flow run
> -
>
> Key: YARN-3864
> URL: https://issues.apache.org/jira/browse/YARN-3864
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: YARN-3864-YARN-2928.01.patch, 
> YARN-3864-YARN-2928.02.patch, YARN-3864-YARN-2928.03.patch
>
>
> This JIRA will handle support for querying all apps for a flow run in HBase 
> reader implementation.
> And also REST API implementation for single app and multiple apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941899#comment-14941899
 ] 

Hudson commented on YARN-3619:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #473 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/473/])
YARN-3619. ContainerMetrics unregisters during getMetrics and leads to (jlowe: 
rev fdf02d1f26cea372bf69e071f57b8bfc09c092c4)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml


> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Fix For: 2.7.2
>
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, YARN-3619.branch-2.7.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941924#comment-14941924
 ] 

Hudson commented on YARN-3619:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1212 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1212/])
YARN-3619. ContainerMetrics unregisters during getMetrics and leads to (jlowe: 
rev fdf02d1f26cea372bf69e071f57b8bfc09c092c4)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java


> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Fix For: 2.7.2
>
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, YARN-3619.branch-2.7.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941962#comment-14941962
 ] 

Hudson commented on YARN-3619:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2417 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2417/])
YARN-3619. ContainerMetrics unregisters during getMetrics and leads to (jlowe: 
rev fdf02d1f26cea372bf69e071f57b8bfc09c092c4)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java


> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Fix For: 2.7.2
>
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, YARN-3619.branch-2.7.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4175) Example of use YARN-1197

2015-10-02 Thread MENG DING (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

MENG DING updated YARN-4175:

Attachment: YARN-4175.1.patch

Submitting the initial patch for review. I will not kick Jenkins as it depends on 
YARN-1509 and YARN-1510.

As mentioned in the previous post, this ticket enhances the DistributedShell 
application with an option to start an IPC service in the application 
master. Once it is started, the user can use the Client program to talk to the IPC 
service to issue increase/decrease container resource requests.

More specifically:
* To enable the IPC service in the application master, the user needs to specify the 
*enable_ipc* option. For example:
{{hadoop org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.0.0.jar
 -shell_command "sleep 10" -num_containers 10 *-enable_ipc*}}

* Once the application has started, the user can start a new client and specify the 
*appmaster* option to put the client in appmaster mode. In this mode, the 
user can specify the *application_id*, *container_id*, *action*, 
*container_memory*, and *container_vcores* options to request container resizing. 
For example, to increase a container's resources, the user can run:
{{hadoop org.apache.hadoop.yarn.applications.distributedshell.Client -appmaster 
-application_id= -container_id= 
-action=INCREASE_CONTAINER -container_memory=2048 -container_vcores=1}} 

If you want to try this patch, you need to apply YARN-1510 and YARN-1509 first, 
or wait until these two patches are committed.

> Example of use YARN-1197
> 
>
> Key: YARN-4175
> URL: https://issues.apache.org/jira/browse/YARN-4175
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, nodemanager, resourcemanager
>Reporter: Wangda Tan
>Assignee: MENG DING
> Attachments: YARN-4175.1.patch
>
>
> Like YARN-2609, we need a example program to demonstrate how to use YARN-1197 
> from end-to-end.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-10-02 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941935#comment-14941935
 ] 

Hitesh Shah commented on YARN-4009:
---

[~jeagles] Did you get a chance to look at the latest patch? 

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch, YARN-4009.005.patch
>
>
> Currently the REST API's do not have CORS support. This means any UI (running 
> in browser) cannot consume the REST API's. For ex Tez UI would like to use 
> the REST API for getting application, application attempt information exposed 
> by the API's. 
> It would be very useful if CORS is enabled for the REST API's.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1509) Make AMRMClient support send increase container request and get increased/decreased containers

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941992#comment-14941992
 ] 

Hadoop QA commented on YARN-1509:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 52s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   8m  5s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 29s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 17s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 36s | The applied patch generated  5 
new checkstyle issues (total was 79, now 78). |
| {color:red}-1{color} | whitespace |   0m  5s | The patch has 4  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 56s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   7m 26s | Tests passed in 
hadoop-yarn-client. |
| | |  46m 52s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764874/YARN-1509.3.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fdf02d1 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-YARN-Build/9338/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9338/artifact/patchprocess/diffcheckstylehadoop-yarn-client.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/9338/artifact/patchprocess/whitespace.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9338/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9338/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9338/console |


This message was automatically generated.

> Make AMRMClient support send increase container request and get 
> increased/decreased containers
> --
>
> Key: YARN-1509
> URL: https://issues.apache.org/jira/browse/YARN-1509
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan (No longer used)
>Assignee: MENG DING
> Attachments: YARN-1509.1.patch, YARN-1509.2.patch, YARN-1509.3.patch
>
>
> As described in YARN-1197, we need to add APIs in AMRMClient to support
> 1) adding an increase request
> 2) getting successfully increased/decreased containers from the RM



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1510) Make NMClient support change container resources

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941965#comment-14941965
 ] 

Hadoop QA commented on YARN-1510:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m  5s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   8m 46s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m 21s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 17s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 30s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  2s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 43s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 39s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m  0s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |   7m 11s | Tests failed in 
hadoop-yarn-client. |
| | |  49m 39s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.client.api.impl.TestYarnClient |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764858/YARN-1510.4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fdf02d1 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-YARN-Build/9337/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9337/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9337/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9337/console |


This message was automatically generated.

> Make NMClient support change container resources
> 
>
> Key: YARN-1510
> URL: https://issues.apache.org/jira/browse/YARN-1510
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Wangda Tan (No longer used)
>Assignee: MENG DING
> Attachments: YARN-1510-YARN-1197.1.patch, 
> YARN-1510-YARN-1197.2.patch, YARN-1510.3.patch, YARN-1510.4.patch
>
>
> As described in YARN-1197 and YARN-1449, we need to add APIs in NMClient to support
> 1) sending requests to increase/decrease container resource limits
> 2) getting the succeeded/failed changed-container responses from the NM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4222) Retries is typoed to spell Retires in parts of hadoop-yarn and hadoop-common

2015-10-02 Thread Neelesh Srinivas Salian (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neelesh Srinivas Salian updated YARN-4222:
--
Summary: Retries is typoed to spell Retires in parts of hadoop-yarn and 
hadoop-common  (was: Retries is typoed to spell Retires in parts of the 
hadoop-yarn and hadoop-common)

> Retries is typoed to spell Retires in parts of hadoop-yarn and hadoop-common
> 
>
> Key: YARN-4222
> URL: https://issues.apache.org/jira/browse/YARN-4222
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.1
>Reporter: Neelesh Srinivas Salian
>Assignee: Neelesh Srinivas Salian
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-4222.001.patch
>
>
> Spotted this typo in the code while working on a separate YARN issue.
> E.g. DEFAULT_RM_NODEMANAGER_CONNECT_RETIRES
> Checked in the whole project. Found a few occurrences of the typo in 
> code/comment. 
> The JIRA is meant to help fix those typos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1897) CLI and core support for signal container functionality

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942106#comment-14942106
 ] 

Hudson commented on YARN-1897:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #474 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/474/])
YARN-1897. CLI and core support for signal container functionality. (xgong: rev 
8f08532bde153811368e1b8336446fba4743f9d2)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/YarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/SignalContainerRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncherEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/SignalContainerResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/CMgrSignalContainersEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/SignalContainerResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatResponsePBImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerManagerEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationClientProtocol.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ApplicationCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestSignalContainer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/SignalContainerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/SignalContainersLauncherEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeSignalContainerEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/applicationclient_protocol.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 

[jira] [Commented] (YARN-1897) CLI and core support for signal container functionality

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942116#comment-14942116
 ] 

Hudson commented on YARN-1897:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #448 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/448/])
YARN-1897. CLI and core support for signal container functionality. (xgong: rev 
8f08532bde153811368e1b8336446fba4743f9d2)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ApplicationCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/SignalContainerResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/applicationclient_protocol.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestSignalContainer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/service/ApplicationClientProtocolPBServiceImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/SignalContainerCommand.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ApplicationClientProtocolPBClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/SignalContainerRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/MockResourceManagerFacade.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAuditLogger.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncherEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/SignalContainerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerManagerEventType.java
* 

[jira] [Commented] (YARN-1897) CLI and core support for signal container functionality

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942126#comment-14942126
 ] 

Hudson commented on YARN-1897:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2388 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2388/])
YARN-1897. CLI and core support for signal container functionality. (xgong: rev 
8f08532bde153811368e1b8336446fba4743f9d2)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ApplicationClientProtocolPBClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/MockResourceManagerFacade.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/SignalContainerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeSignalContainerEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/SignalContainersLauncherEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/SignalContainerResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerManagerWithLCE.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/service/ApplicationClientProtocolPBServiceImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/SignalContainerResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestSignalContainer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationClientProtocol.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/applicationclient_protocol.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/YarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/SignalContainerRequest.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncher.java
* 

[jira] [Commented] (YARN-4222) Retries is typoed to spell Retires in parts of hadoop-yarn and hadoop-common

2015-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942129#comment-14942129
 ] 

Hadoop QA commented on YARN-4222:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  23m 59s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   9m  3s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m 43s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 17s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   4m 22s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 48s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   9m  7s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  10m 16s | Tests passed in 
hadoop-common. |
| {color:green}+1{color} | yarn tests |   0m 38s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   2m 33s | Tests passed in 
hadoop-yarn-common. |
| {color:red}-1{color} | yarn tests |  70m 33s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 145m  3s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification |
|   | hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764894/YARN-4222.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 8f08532 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-YARN-Build/9339/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9339/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9339/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9339/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9339/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9339/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9339/console |


This message was automatically generated.

> Retries is typoed to spell Retires in parts of hadoop-yarn and hadoop-common
> 
>
> Key: YARN-4222
> URL: https://issues.apache.org/jira/browse/YARN-4222
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.1
>Reporter: Neelesh Srinivas Salian
>Assignee: Neelesh Srinivas Salian
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-4222.001.patch
>
>
> Spotted this typo in the code while working on a separate YARN issue.
> E.g. DEFAULT_RM_NODEMANAGER_CONNECT_RETIRES
> Checked in the whole project. Found a few occurrences of the typo in 
> code/comment. 
> The JIRA is meant to help fix those typos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1897) CLI and core support for signal container functionality

2015-10-02 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942039#comment-14942039
 ] 

Xuan Gong commented on YARN-1897:
-

Committed into trunk/branch-2. Thanks, [~mingma]

> CLI and core support for signal container functionality
> ---
>
> Key: YARN-1897
> URL: https://issues.apache.org/jira/browse/YARN-1897
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 2.8.0
>
> Attachments: YARN-1897-2.patch, YARN-1897-3.patch, YARN-1897-4.patch, 
> YARN-1897-5.patch, YARN-1897-6.patch, YARN-1897-7.patch, YARN-1897-8.patch, 
> YARN-1897.1.patch
>
>
> We need to define SignalContainerRequest and SignalContainerResponse first, as 
> they are needed by other subtasks. SignalContainerRequest should use 
> OS-independent commands and provide a way for the application to specify a 
> "reason" for diagnosis. SignalContainerResponse might be empty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1807) CLI update to allow people to signal a specific container

2015-10-02 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1807:

Assignee: Ming Ma

> CLI update to allow people to signal a specific container
> 
>
> Key: YARN-1807
> URL: https://issues.apache.org/jira/browse/YARN-1807
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Reporter: Ming Ma
>Assignee: Ming Ma
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-1807) CLI update to allow people to signal a specific container

2015-10-02 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong resolved YARN-1807.
-
Resolution: Fixed

> CLI update to allow people to signal a specific container
> 
>
> Key: YARN-1807
> URL: https://issues.apache.org/jira/browse/YARN-1807
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Reporter: Ming Ma
>Assignee: Ming Ma
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1807) CLI update to allow people to signal a specific container

2015-10-02 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942047#comment-14942047
 ] 

Xuan Gong commented on YARN-1807:
-

Closing as a duplicate since YARN-1897 has been committed.

> CLI update to allow people to signal a specific container
> 
>
> Key: YARN-1807
> URL: https://issues.apache.org/jira/browse/YARN-1807
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Reporter: Ming Ma
>Assignee: Ming Ma
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1897) CLI and core support for signal container functionality

2015-10-02 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942053#comment-14942053
 ] 

Ming Ma commented on YARN-1897:
---

Thanks [~xgong] for helping with the design, code review and the commit. Also 
thanks [~djp] [~ste...@apache.org] [~vinodkv] [~jira.shegalov] [~lohit] and 
others for the valuable input. The remaining items, such as the web UI and support 
for custom commands, will be done separately.
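
For anyone trying the committed functionality from a Java client, a minimal sketch 
is below. It assumes the YarnClient API added by this patch, i.e. a 
signalToContainer(ContainerId, SignalContainerCommand) method and an 
OUTPUT_THREAD_DUMP command; if the names differ in your build, adjust accordingly. 
The container id is a made-up example.

{code:java}
// Minimal sketch, assuming YarnClient#signalToContainer and
// SignalContainerCommand.OUTPUT_THREAD_DUMP as added by this patch.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.SignalContainerCommand;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.util.ConverterUtils;

public class SignalContainerSketch {
  public static void main(String[] args) throws Exception {
    YarnClient client = YarnClient.createYarnClient();
    client.init(new Configuration());
    client.start();
    try {
      // Hypothetical container id for illustration
      ContainerId id =
          ConverterUtils.toContainerId("container_1443750000000_0001_01_000002");
      // Ask the RM to route a thread-dump signal to the NM that owns the container
      client.signalToContainer(id, SignalContainerCommand.OUTPUT_THREAD_DUMP);
    } finally {
      client.stop();
    }
  }
}
{code}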

> CLI and core support for signal container functionality
> ---
>
> Key: YARN-1897
> URL: https://issues.apache.org/jira/browse/YARN-1897
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 2.8.0
>
> Attachments: YARN-1897-2.patch, YARN-1897-3.patch, YARN-1897-4.patch, 
> YARN-1897-5.patch, YARN-1897-6.patch, YARN-1897-7.patch, YARN-1897-8.patch, 
> YARN-1897.1.patch
>
>
> We need to define SignalContainerRequest and SignalContainerResponse first, as 
> they are needed by other subtasks. SignalContainerRequest should use 
> OS-independent commands and provide a way for the application to specify a 
> "reason" for diagnosis. SignalContainerResponse might be empty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1897) CLI and core support for signal container functionality

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942057#comment-14942057
 ] 

Hudson commented on YARN-1897:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8560 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8560/])
YARN-1897. CLI and core support for signal container functionality. (xgong: rev 
8f08532bde153811368e1b8336446fba4743f9d2)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestSignalContainer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/YarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAuditLogger.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/SignalContainerCommand.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/service/ApplicationClientProtocolPBServiceImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/SignalContainerRequestPBImpl.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestClientRedirect.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeSignalContainerEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/MockResourceManagerFacade.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/applicationclient_protocol.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/SignalContainersLauncherEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationClientProtocol.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/SignalContainerResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ApplicationCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerManagerEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 

[jira] [Created] (YARN-4222) Retries is typoed to spell Retires in parts of the hadoop-yarn and hadoop-common

2015-10-02 Thread Neelesh Srinivas Salian (JIRA)
Neelesh Srinivas Salian created YARN-4222:
-

 Summary: Retries is typoed to spell Retires in parts of the 
hadoop-yarn and hadoop-common
 Key: YARN-4222
 URL: https://issues.apache.org/jira/browse/YARN-4222
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.1
Reporter: Neelesh Srinivas Salian
Assignee: Neelesh Srinivas Salian
Priority: Minor


Spotted this typo in the code while working on a separate YARN issue.
E.g. DEFAULT_RM_NODEMANAGER_CONNECT_RETIRES

Checked in the whole project. Found a few occurrences of the typo in 
code/comment. 

The JIRA is meant to help fix those typos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1897) CLI and core support for signal container functionality

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942078#comment-14942078
 ] 

Hudson commented on YARN-1897:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #482 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/482/])
YARN-1897. CLI and core support for signal container functionality. (xgong: rev 
8f08532bde153811368e1b8336446fba4743f9d2)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/MockResourceManagerFacade.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/SignalContainerResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/SignalContainerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/SignalContainerResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestSignalContainer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncherEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncher.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestClientRedirect.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeSignalContainerEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAuditLogger.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/YarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ApplicationClientProtocolPBClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/applicationclient_protocol.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationClientProtocol.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 

[jira] [Commented] (YARN-1897) CLI and core support for signal container functionality

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942087#comment-14942087
 ] 

Hudson commented on YARN-1897:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #1213 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1213/])
YARN-1897. CLI and core support for signal container functionality. (xgong: rev 
8f08532bde153811368e1b8336446fba4743f9d2)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/MockResourceManagerFacade.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/applicationclient_protocol.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/SignalContainerResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/SignalContainerRequest.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncherEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/service/ApplicationClientProtocolPBServiceImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/SignalContainersLauncherEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestSignalContainer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeSignalContainerEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAuditLogger.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerManagerWithLCE.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/SignalContainerResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/CMgrSignalContainersEvent.java
* 

[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942027#comment-14942027
 ] 

Hudson commented on YARN-3619:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #447 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/447/])
YARN-3619. ContainerMetrics unregisters during getMetrics and leads to (jlowe: 
rev fdf02d1f26cea372bf69e071f57b8bfc09c092c4)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml


> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Fix For: 2.7.2
>
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, YARN-3619.branch-2.7.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}
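
The failure mode described above is the classic modify-while-iterating pattern: the 
metrics timer iterates the registered sources while getMetrics() unregisters one of 
them, mutating the same collection mid-iteration. A standalone sketch of the same 
mechanism using plain JDK collections (not the actual metrics code) is below.

{code:java}
// Standalone illustration of the race: the loop plays the role of
// MetricsSystemImpl.sampleMetrics iterating its sources, while the remove()
// plays the role of a ContainerMetrics source unregistering itself mid-snapshot.
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class CmeSketch {
  public static void main(String[] args) {
    Map<String, Object> sources = new HashMap<>();
    sources.put("ContainerResource_c1", new Object());
    sources.put("ContainerResource_c2", new Object());
    sources.put("ContainerResource_c3", new Object());
    try {
      for (String name : sources.keySet()) {
        // Simulates a source unregistering itself while the snapshot loop runs
        sources.remove(name);
      }
    } catch (ConcurrentModificationException e) {
      // Same exception type the NodeManager metrics timer logs in the report above
      System.out.println("Caught: " + e);
    }
  }
}
{code}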



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1803) Signal container support in nodemanager

2015-10-02 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942040#comment-14942040
 ] 

Xuan Gong commented on YARN-1803:
-

Closing this as a duplicate since YARN-1897 has been committed.

> Signal container support in nodemanager
> ---
>
> Key: YARN-1803
> URL: https://issues.apache.org/jira/browse/YARN-1803
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Ming Ma
>Assignee: Ming Ma
>  Labels: BB2015-05-TBR
> Attachments: YARN-1803.patch
>
>
> It could include the following:
> 1. ContainerManager is able to process a new event type 
> ContainerManagerEventType.SIGNAL_CONTAINERS coming from NodeStatusUpdater and 
> deliver the request to ContainerExecutor.
> 2. Translate the platform-independent signal command to Linux-specific 
> signals. Windows support will be tracked by another task.
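
As a self-contained illustration of the translation item 2 describes, the sketch 
below maps OS-independent commands onto Linux signal numbers. The enum and the 
specific mappings here are assumptions made for the example, not the constants 
defined by the YARN-1897 records.

{code:java}
// Illustrative only: a local enum and mapping showing the kind of
// platform-independent-command -> Linux-signal translation described above.
public class SignalTranslationSketch {
  enum Command { OUTPUT_THREAD_DUMP, GRACEFUL_SHUTDOWN, FORCEFUL_SHUTDOWN }

  static int toLinuxSignal(Command cmd) {
    switch (cmd) {
      case OUTPUT_THREAD_DUMP: return 3;   // SIGQUIT: a JVM dumps its threads
      case GRACEFUL_SHUTDOWN:  return 15;  // SIGTERM
      case FORCEFUL_SHUTDOWN:  return 9;   // SIGKILL
      default: throw new IllegalArgumentException("Unknown command: " + cmd);
    }
  }

  public static void main(String[] args) {
    for (Command c : Command.values()) {
      System.out.println(c + " -> kill -" + toLinuxSignal(c));
    }
  }
}
{code}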



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-1804) Signal container request delivery from client to resourcemanager

2015-10-02 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong resolved YARN-1804.
-
Resolution: Duplicate

> Signal container request delivery from client to resourcemanager
> 
>
> Key: YARN-1804
> URL: https://issues.apache.org/jira/browse/YARN-1804
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client, resourcemanager
>Reporter: Ming Ma
>Assignee: Xuan Gong
>
> It could include the following work items
> 1. Define the OS independent SignalContainerCMD enum commands. We will start 
> with known requirements such as KILL. We can expand the list later.
> 2. Add a new method signalContainer to ApplicationClientProtocol. 
> signalContainerRequest will include containerId as well as SignalContainerCMD.
> 3. Add signalContainer method to YarnClient and YarnClientImpl.
> 4. RM will deliver the request to the RMNode object that owns the container.
> 5. RM needs to have the proper authorization for the signal request. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1804) Signal container request delivery from client to resourcemanager

2015-10-02 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942043#comment-14942043
 ] 

Xuan Gong commented on YARN-1804:
-

Closing as a duplicate since YARN-1897 has been committed.

> Signal container request delivery from client to resourcemanager
> 
>
> Key: YARN-1804
> URL: https://issues.apache.org/jira/browse/YARN-1804
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client, resourcemanager
>Reporter: Ming Ma
>Assignee: Xuan Gong
>
> It could include the following work items
> 1. Define the OS independent SignalContainerCMD enum commands. We will start 
> with known requirements such as KILL. We can expand the list later.
> 2. Add a new method signalContainer to ApplicationClientProtocol. 
> signalContainerRequest will include containerId as well as SignalContainerCMD.
> 3. Add signalContainer method to YarnClient and YarnClientImpl.
> 4. RM will deliver the request to the RMNode object that owns the container.
> 5. RM needs to have the proper authorization for the signal request. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1805) Signal container request delivery from resourcemanager to nodemanager

2015-10-02 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942045#comment-14942045
 ] 

Xuan Gong commented on YARN-1805:
-

Closing as a duplicate since YARN-1897 has been committed.

> Signal container request delivery from resourcemanager to nodemanager
> -
>
> Key: YARN-1805
> URL: https://issues.apache.org/jira/browse/YARN-1805
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Ming Ma
>Assignee: Ming Ma
>  Labels: BB2015-05-TBR
> Attachments: YARN-1805.patch
>
>
> 1. Update ResourceTracker's HeartbeatResponse to include the list of 
> SignalContainerRequest.
> 2. Upon receiving the request, NM's NodeStatusUpdater will deliver the 
> request to ContainerManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4222) Retries is typoed to spell Retires in parts of the hadoop-yarn and hadoop-common

2015-10-02 Thread Neelesh Srinivas Salian (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neelesh Srinivas Salian updated YARN-4222:
--
Attachment: YARN-4222.001.patch

> Retries is typoed to spell Retires in parts of the hadoop-yarn and 
> hadoop-common
> 
>
> Key: YARN-4222
> URL: https://issues.apache.org/jira/browse/YARN-4222
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.1
>Reporter: Neelesh Srinivas Salian
>Assignee: Neelesh Srinivas Salian
>Priority: Minor
> Attachments: YARN-4222.001.patch
>
>
> Spotted this typo in the code while working on a separate YARN issue.
> E.g. DEFAULT_RM_NODEMANAGER_CONNECT_RETIRES
> Checked in the whole project. Found a few occurrences of the typo in 
> code/comment. 
> The JIRA is meant to help fix those typos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942081#comment-14942081
 ] 

Hudson commented on YARN-3619:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2387 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2387/])
YARN-3619. ContainerMetrics unregisters during getMetrics and leads to (jlowe: 
rev fdf02d1f26cea372bf69e071f57b8bfc09c092c4)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java
* hadoop-yarn-project/CHANGES.txt


> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Fix For: 2.7.2
>
> Attachments: YARN-3619.000.patch, YARN-3619.001.patch, 
> YARN-3619.alt.patch, YARN-3619.branch-2.7.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1897) CLI and core support for signal container functionality

2015-10-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942083#comment-14942083
 ] 

Hudson commented on YARN-1897:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2418 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2418/])
YARN-1897. CLI and core support for signal container functionality. (xgong: rev 
8f08532bde153811368e1b8336446fba4743f9d2)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerManagerWithLCE.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/MockResourceManagerFacade.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/SignalContainersLauncherEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/SignalContainerResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/service/ApplicationClientProtocolPBServiceImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerManagerEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAuditLogger.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/SignalContainerCommand.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestSignalContainer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeSignalContainerEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncherEventType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ApplicationClientProtocolPBClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/SignalContainerRequestPBImpl.java
*