[jira] [Commented] (YARN-5496) Make Node Heatmap Chart categories clickable

2017-03-08 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902631#comment-15902631
 ] 

Sunil G commented on YARN-5496:
---

+1. Will commit later today if there are no objections.

> Make Node Heatmap Chart categories clickable
> 
>
> Key: YARN-5496
> URL: https://issues.apache.org/jira/browse/YARN-5496
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Yesha Vora
>Assignee: Gergely Novák
> Attachments: YARN-5496.001.patch, YARN-5496.002.patch, 
> YARN-5496.003.patch, YARN-5496.004.patch
>
>
> Make Node Heatmap Chart categories clickable.
> The heatmap chart has a few categories, such as 10% used, 30% used, etc.
> These tags should be clickable. If a user clicks the 10% used tag, it should
> show the hosts with 10% usage. This can be a useful feature for clusters with
> thousands of nodes.






[jira] [Commented] (YARN-6196) [YARN-3368] Invalid information in Node pages and improve Resource Donut chart with better label

2017-03-08 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902630#comment-15902630
 ] 

Sunil G commented on YARN-6196:
---

Change looks fine to me. Committing later today if there are no other concerns.

> [YARN-3368] Invalid information in Node pages and improve Resource Donut 
> chart with better label
> 
>
> Key: YARN-6196
> URL: https://issues.apache.org/jira/browse/YARN-6196
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Akhil PB
>Assignee: Akhil PB
> Attachments: YARN-6196.001.patch
>
>
> In nodes page:
> # Change 'Nodes Table' label to 'Information'
> # Show Health Report as N/A if not available
> # When there are 0 nodes in cluster, nodes page breaks.
> In node page:
> # Node Health Report missing
> # NodeManager Start Time shows Invalid Date
> # Reverse colors in the 'Resource - Memory' and 'Resource - VCores' donut 
> charts
> # Convert Resource Memory into GB/TB
> # Diagnostics is empty in Container Information






[jira] [Commented] (YARN-6301) Fair scheduler docs should explain the meaning of setting a queue's weight to zero

2017-03-08 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902618#comment-15902618
 ] 

Tao Jie commented on YARN-6301:
---

[~templedf], today a queue's weight is allowed to be zero or even negative. It 
seems to me that in that case the queue cannot get any share beyond its 
minResources; am I correct? Should we add a non-negative check here, since a 
negative queue weight is even more confusing?
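For illustration, a minimal fair-scheduler.xml allocation sketch of the case being 
discussed (queue names are invented): if the reading above is right, a zero-weight 
queue would be limited to roughly its minResources guarantee, since its weighted 
fair share is zero.
{code}
<allocations>
  <queue name="zero_weight_queue">
    <!-- weight 0: fair share is 0, so the queue effectively only gets minResources -->
    <weight>0.0</weight>
    <minResources>4096 mb, 4 vcores</minResources>
  </queue>
  <queue name="normal_queue">
    <weight>2.0</weight>
  </queue>
</allocations>
{code}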

> Fair scheduler docs should explain the meaning of setting a queue's weight to 
> zero
> --
>
> Key: YARN-6301
> URL: https://issues.apache.org/jira/browse/YARN-6301
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
>  Labels: docs
>







Spark Streaming Logs Rotation

2017-03-08 Thread Nehal Syed
Hey Everyone,
I am stuck on a log-rotation issue with a Spark Streaming job that runs on
YARN. YARN continuously writes container stderr and stdout logs to the
containers/ folder, which fills up the disk and crashes the cluster. I want to
continuously move the logs to HDFS or S3 and then truncate the source files.

How can I split and truncate those open container log files?
Can I use a RollingFileAppender to keep the file size small?
Is there any workaround for handling these growing files?

I am using AWS EMR 5.3.0 that is packaged with:
Spark: Spark 2.1.0 on Hadoop 2.7.3, YARN with Ganglia 3.7.2 and Zeppelin
0.6.2

I have already tried running the 'truncate' command and 'logrotate' with the
truncate option as well; neither shrank the growing files to zero. AWS support
confirmed the behavior and said they were not aware of this issue before I
opened a ticket with them.
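One commonly suggested direction for the RollingFileAppender question is a custom
log4j configuration shipped to the YARN containers. A sketch only (not verified on
EMR 5.3.0; the file name and size limits below are placeholders, and note that raw
writes to stdout/stderr are not rotated by log4j, only what goes through the logger):

# shipped with --files log4j.properties and enabled via
# spark.driver.extraJavaOptions / spark.executor.extraJavaOptions:
#   -Dlog4j.configuration=log4j.properties
log4j.rootLogger=INFO, rolling
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
# spark.yarn.app.container.log.dir resolves to the container's YARN log directory
log4j.appender.rolling.File=${spark.yarn.app.container.log.dir}/spark.log
log4j.appender.rolling.MaxFileSize=100MB
log4j.appender.rolling.MaxBackupIndex=5
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n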

Please help me if you have any knowledge.

Thanks
Nehal


[jira] [Assigned] (YARN-2985) YARN should support to delete the aggregated logs for Non-MapReduce applications

2017-03-08 Thread Steven Rand (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rand reassigned YARN-2985:
-

Assignee: Steven Rand

> YARN should support to delete the aggregated logs for Non-MapReduce 
> applications
> 
>
> Key: YARN-2985
> URL: https://issues.apache.org/jira/browse/YARN-2985
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: log-aggregation, nodemanager
>Reporter: Xu Yang
>Assignee: Steven Rand
>
> Before Hadoop 2.6, the LogAggregationService is started in the NodeManager, 
> but the AggregatedLogDeletionService is started in MapReduce's 
> JobHistoryServer. Therefore, non-MapReduce applications can aggregate their 
> logs to HDFS but cannot delete those logs. The NodeManager needs to take over 
> the aggregated-log deletion function.






[jira] [Resolved] (YARN-6120) add retention of aggregated logs to Timeline Server

2017-03-08 Thread Steven Rand (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rand resolved YARN-6120.
---
Resolution: Duplicate

I now have the ability to submit a patch for YARN-2985, so this duplicate JIRA 
is unnecessary. 

> add retention of aggregated logs to Timeline Server
> ---
>
> Key: YARN-6120
> URL: https://issues.apache.org/jira/browse/YARN-6120
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation, timelineserver
>Affects Versions: 2.7.3
>Reporter: Steven Rand
> Attachments: YARN-6120.001.patch
>
>
> The MR History Server performs retention of aggregated logs for MapReduce 
> applications. However, there is no way of enforcing retention on aggregated 
> logs for other types of applications. This JIRA proposes to add log retention 
> to the Timeline Server.
> Also, this is arguably a duplicate of 
> https://issues.apache.org/jira/browse/YARN-2985, but I could not find a way 
> to attach a patch for that issue. If someone closes this as a duplicate, 
> could you please assign that issue to me?






[jira] [Commented] (YARN-6307) Refactor FairShareComparator#compare

2017-03-08 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902482#comment-15902482
 ] 

Tao Jie commented on YARN-6307:
---

Thank you [~yufeigu]. FairShareComparator#compare is called very frequently 
during each container allocation, so simplifying this method would improve 
scheduler performance.
Furthermore, I don't think it is necessary to sort the queue hierarchy from the 
root to the leaf queues on every node update. Can we do the sort in the update 
thread and then share the result with node updates? That would eliminate a lot 
of redundant sorting. Maybe we can improve this in another JIRA.
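To make the second suggestion concrete, here is a rough sketch (class and field 
names are invented, not the actual FairScheduler code) of sorting child queues once 
in the update thread and sharing an immutable snapshot with the node-update path:
{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch -- not the actual FairScheduler code.
class QueueSketch {
  double usageToFairShareRatio;
}

class ParentQueueSketch {
  private final List<QueueSketch> children = new CopyOnWriteArrayList<>();
  // Immutable snapshot refreshed by the update thread.
  private volatile List<QueueSketch> sortedSnapshot = Collections.emptyList();

  // Update thread: sort once per update interval.
  void refreshSortedChildren(Comparator<QueueSketch> policyComparator) {
    List<QueueSketch> copy = new ArrayList<>(children);
    copy.sort(policyComparator);
    sortedSnapshot = Collections.unmodifiableList(copy);
  }

  // Node-update path: reuse the snapshot instead of re-sorting on every heartbeat.
  List<QueueSketch> getSortedChildren() {
    return sortedSnapshot;
  }
}
{code}
The trade-off is that node updates may see a slightly stale ordering between 
update-thread runs.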

> Refactor FairShareComparator#compare
> 
>
> Key: YARN-6307
> URL: https://issues.apache.org/jira/browse/YARN-6307
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>
> The method does three things: check the min-share ratio, check the weight 
> ratio, and break ties by submit time and name. They are mixed together, which 
> makes the code hard to read and maintain. Additionally, there are potential 
> performance issues; for example, there is no need to calculate the weight 
> ratio every time.






[jira] [Commented] (YARN-6164) Expose maximum-am-resource-percent in YarnClient

2017-03-08 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902469#comment-15902469
 ] 

Sunil G commented on YARN-6164:
---

[~benson.qiu], thanks for taking an interest in contributing; much appreciated.
This is a good addition to the API layer to provide missing details, so I see 
no reason not to add it.
Please prepare a patch, and I will be able to help. :)

> Expose maximum-am-resource-percent in YarnClient
> 
>
> Key: YARN-6164
> URL: https://issues.apache.org/jira/browse/YARN-6164
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.2
>Reporter: Benson Qiu
>Assignee: Benson Qiu
> Attachments: YARN-6164.001.patch, YARN-6164.002.patch, 
> YARN-6164.003.patch, YARN-6164.004.patch, YARN-6164.005.patch
>
>
> `yarn.scheduler.capacity.maximum-am-resource-percent` is exposed through the 
> [Cluster Scheduler 
> API|http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Scheduler_API],
>  but not through 
> [YarnClient|https://hadoop.apache.org/docs/current/api/org/apache/hadoop/yarn/client/api/YarnClient.html].
> Since YarnClient and RM REST APIs depend on different ports (8032 vs 8088 by 
> default), it would be nice to expose `maximum-am-resource-percent` in 
> YarnClient as well. 






[jira] [Commented] (YARN-6300) NULL_UPDATE_REQUESTS is redundant in TestFairScheduler

2017-03-08 Thread Yuanbo Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902391#comment-15902391
 ] 

Yuanbo Liu commented on YARN-6300:
--

[~haibochen] Thanks for your review.

> NULL_UPDATE_REQUESTS is redundant in TestFairScheduler
> --
>
> Key: YARN-6300
> URL: https://issues.apache.org/jira/browse/YARN-6300
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
>Assignee: Yuanbo Liu
>Priority: Minor
>  Labels: newbie
> Attachments: YARN-6300.001.patch
>
>
> The {{TestFairScheduler.NULL_UPDATE_REQUESTS}} field hides 
> {{FairSchedulerTestBase.NULL_UPDATE_REQUESTS}}, which has the same value.  
> The {{NULL_UPDATE_REQUESTS}} field should be removed from 
> {{TestFairScheduler}}.
> While you're at it, maybe also remove the unused import.
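A generic illustration of the field hiding being described (class names are 
stand-ins for the actual test classes):
{code}
class BaseSketch {
  static final Object NULL_UPDATE_REQUESTS = null;
}

class SubSketch extends BaseSketch {
  // Redeclaring the field here hides BaseSketch.NULL_UPDATE_REQUESTS even though
  // the value is the same; removing this declaration lets the subclass use the
  // inherited constant directly.
  static final Object NULL_UPDATE_REQUESTS = null;
}
{code}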






[jira] [Updated] (YARN-6312) Application trying to connect standby resourcemanager's 8030 port and failed

2017-03-08 Thread wuchang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wuchang updated YARN-6312:
--
Description: 
{code}
17/03/08 10:26:39 INFO retry.RetryInvocationHandler: Exception while invoking 
allocate of class ApplicationMasterProtocolPBClientImpl over resourcemanager2. 
Trying to fail over immediately.
java.io.EOFException: End of File Exception between local host is: 
"12.23.45.1"; destination host is: "12.23.45.1":8030; : java.io.EOFException; 
For more details see:  http://wiki.apache.org/hadoop/EOFException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy15.allocate(Unknown Source)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy16.allocate(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:277)
at 
org.apache.spark.deploy.yarn.YarnAllocator.allocateResources(YarnAllocator.scala:265)
at 
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:428)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at 
org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1084)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:979)
17/03/08 10:26:39 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
to resourcemanager1
17/03/08 10:26:39 WARN ipc.Client: Exception encountered while connecting to 
the server : 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 appattempt_1483970589433_182559_02 not found in AMRMTokenSecretManager.
17/03/08 10:26:39 INFO retry.RetryInvocationHandler: Exception while invoking 
allocate of class ApplicationMasterProtocolPBClientImpl over resourcemanager1 
after 1 fail over attempts. Trying to fail over immediately.
org.apache.hadoop.security.token.SecretManager$InvalidToken: 
appattempt_1483970589433_182559_02 not found in AMRMTokenSecretManager.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy16.allocate(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:277)
at 
org.apache.spark.deploy.yarn.YarnAllocator.allocateResources(YarnAllocator.scala:265)
at 
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:428)


{code}

After several retries, my app finally quit abnormally with this error:
{code}
17/03/08 11:10:44 ERROR util.Utils: Uncaught exception in thread 

[jira] [Updated] (YARN-6312) Application trying to connect standby resourcemanager's 8030 port and failed

2017-03-08 Thread wuchang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wuchang updated YARN-6312:
--
Description: 
{quote}
java.net.ConnectException: Call From 12.23.45.2:8030 failed on connection 
exception: java.net.ConnectException: Connection refused; For more details see: 
 http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor39.newInstance(Unknown 
Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy15.finishApplicationMaster(Unknown Source)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:92)
at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy16.finishApplicationMaster(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.unregisterApplicationMaster(AMRMClientImpl.java:378)
at 
org.apache.spark.deploy.yarn.YarnRMClient.unregister(YarnRMClient.scala:89)
at 
org.apache.spark.deploy.yarn.ApplicationMaster.unregister(ApplicationMaster.scala:285)
at 
org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply$mcV$sp(ApplicationMaster.scala:224)
at 
org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:215)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:187)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1857)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:187)
at scala.util.Try$.apply(Try.scala:192)
at 
org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:177)
at 
org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 26 more

{quote}


The error stack is shown above.

I have two resource managers, 12.23.45.1 and 12.23.45.2; 12.23.45.1 is the 
active resourcemanager and 12.23.45.2 is the standby one. I checked 
yarn-site.xml:

{code}


<property>
  <name>yarn.resourcemanager.scheduler.address.resourcemanager1</name>
  <value>12.23.45.1:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.resourcemanager2</name>
  <value>12.23.45.2:8030</value>
</property>


{code}

When I run netstat -ntlp | grep 8030, I find that port 8030 is listening on 
my active resourcemanager but is missing on my standby resourcemanager. I don't 
know whether this is normal.

  was:

[jira] [Updated] (YARN-6312) Application trying to connect standby resourcemanager's 8030 port and failed

2017-03-08 Thread wuchang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wuchang updated YARN-6312:
--
Description: 
{code}
java.net.ConnectException: Call From 12.23.45.2:8030 failed on connection 
exception: java.net.ConnectException: Connection refused; For more details see: 
 http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor39.newInstance(Unknown 
Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy15.finishApplicationMaster(Unknown Source)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:92)
at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy16.finishApplicationMaster(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.unregisterApplicationMaster(AMRMClientImpl.java:378)
at 
org.apache.spark.deploy.yarn.YarnRMClient.unregister(YarnRMClient.scala:89)
at 
org.apache.spark.deploy.yarn.ApplicationMaster.unregister(ApplicationMaster.scala:285)
at 
org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply$mcV$sp(ApplicationMaster.scala:224)
at 
org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:215)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:187)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1857)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:187)
at scala.util.Try$.apply(Try.scala:192)
at 
org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:177)
at 
org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 26 more

{code}


The error stack is shown above.

I have two resource managers, 12.23.45.1 and 12.23.45.2; 12.23.45.1 is the 
active resourcemanager and 12.23.45.2 is the standby one. I checked 
yarn-site.xml:

{code}


<property>
  <name>yarn.resourcemanager.scheduler.address.resourcemanager1</name>
  <value>12.23.45.1:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.resourcemanager2</name>
  <value>12.23.45.2:8030</value>
</property>


{code}

When I run netstat -ntlp | grep 8030, I find that port 8030 is listening on 
my active resourcemanager but is missing on my standby resourcemanager. I don't 
know whether this is normal.

  was:

[jira] [Created] (YARN-6312) Application trying to connect standby resourcemanager's 8030 port and failed

2017-03-08 Thread wuchang (JIRA)
wuchang created YARN-6312:
-

 Summary: Application trying to connect standby resourcemanager's 
8030 port and failed
 Key: YARN-6312
 URL: https://issues.apache.org/jira/browse/YARN-6312
 Project: Hadoop YARN
  Issue Type: Task
  Components: distributed-scheduling
Affects Versions: 2.7.2
Reporter: wuchang


{quote}
java.net.ConnectException: Call From 12.23.45.2:8030 failed on connection 
exception: java.net.ConnectException: Connection refused; For more details see: 
 http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor39.newInstance(Unknown 
Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy15.finishApplicationMaster(Unknown Source)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:92)
at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy16.finishApplicationMaster(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.unregisterApplicationMaster(AMRMClientImpl.java:378)
at 
org.apache.spark.deploy.yarn.YarnRMClient.unregister(YarnRMClient.scala:89)
at 
org.apache.spark.deploy.yarn.ApplicationMaster.unregister(ApplicationMaster.scala:285)
at 
org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply$mcV$sp(ApplicationMaster.scala:224)
at 
org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:215)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:187)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1857)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:187)
at scala.util.Try$.apply(Try.scala:192)
at 
org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:187)
at 
org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:177)
at 
org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 26 more

{quote}


The error stack is shown above.

I have two resource managers, 12.23.45.1 and 12.23.45.2; 12.23.45.1 is the 
active resourcemanager and 12.23.45.2 is the standby one. I checked 
yarn-site.xml:

{quote}


<property>
  <name>yarn.resourcemanager.scheduler.address.resourcemanager1</name>
  <value>12.23.45.1:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.resourcemanager2</name>

[jira] [Updated] (YARN-6264) Resource comparison should depend on policy

2017-03-08 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-6264:
---
Attachment: YARN-6264.003.patch

> Resource comparison should depend on policy
> 
>
> Key: YARN-6264
> URL: https://issues.apache.org/jira/browse/YARN-6264
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-6264.001.patch, YARN-6264.002.patch, 
> YARN-6264.003.patch
>
>
> In method {{canRunAppAM()}}, we should use a policy-related resource 
> comparison instead of {{Resources.fitsIn()}} to determine whether the queue 
> has enough resources for the AM.






[jira] [Commented] (YARN-6308) Fix TestAMRMClient compilation errors

2017-03-08 Thread Steven Rand (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902269#comment-15902269
 ] 

Steven Rand commented on YARN-6308:
---

Attached a new patch to HADOOP-14062, though I think this issue should already 
have been fixed by the revert of the previous patch.

> Fix TestAMRMClient compilation errors
> -
>
> Key: YARN-6308
> URL: https://issues.apache.org/jira/browse/YARN-6308
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha3
>Reporter: Manoj Govindassamy
>
> Looks like fixes committed for HADOOP-14062 and YARN-6218 had conflicts and 
> left TestAMRMClient in a dangling state with compilation errors. 
> TestAMRMClient needs a fix.
> {code}
> [ERROR] COMPILATION ERROR : 
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[145,5]
>  non-static variable yarnCluster cannot be referenced from a static context
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[145,71]
>  non-static variable nodeCount cannot be referenced from a static context
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[146,5]
>  non-static variable yarnCluster cannot be referenced from a static context
> ..
> ..
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[204,9]
>  non-static variable attemptId cannot be referenced from a static context
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[207,20]
>  non-static variable attemptId cannot be referenced from a static context
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[206,13]
>  non-static variable yarnCluster cannot be referenced from a static context
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[874,5]
>  cannot find symbol
> [ERROR] symbol:   method tearDown()
> [ERROR] location: class org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[876,5]
>  cannot find symbol
> [ERROR] symbol:   method startApp()
> [ERROR] location: class org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[881,5]
>  cannot find symbol
> [ERROR] symbol:   method tearDown()
> [ERROR] location: class org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[885,5]
>  cannot find symbol
> [ERROR] symbol:   method startApp()
> [ERROR] location: class org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
> [ERROR] -> [Help 1]
> [ERROR] 
> {code}
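For readers unfamiliar with the error class above, a generic illustration (not the 
actual test code) of why it appears:
{code}
// Generic illustration only -- not TestAMRMClient itself.
class StaticContextExample {
  private int nodeCount = 3;              // instance field
  private static int staticNodeCount = 3; // static field

  static void setUpBeforeClass() {
    // nodeCount++;        // would not compile: non-static variable nodeCount
    //                     // cannot be referenced from a static context
    staticNodeCount++;     // fine: static method may touch static state
  }
}
{code}
The usual fix is either to make the referenced fields and helper methods static 
(matching a static setup method) or to make the referring method non-static; which 
of the two is intended here depends on the conflicting patches.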






[jira] [Commented] (YARN-3448) Add Rolling Time To Lives Level DB Plugin Capabilities

2017-03-08 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902243#comment-15902243
 ] 

Ravi Prakash commented on YARN-3448:


Thanks for the awesome design and fix Jon! I've opened YARN-6311 for adding 
documentation for this store.

> Add Rolling Time To Lives Level DB Plugin Capabilities
> --
>
> Key: YARN-3448
> URL: https://issues.apache.org/jira/browse/YARN-3448
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: YARN-3448.10.patch, YARN-3448.12.patch, 
> YARN-3448.13.patch, YARN-3448.14.patch, YARN-3448.15.patch, 
> YARN-3448.16.patch, YARN-3448.17.patch, YARN-3448.1.patch, YARN-3448.2.patch, 
> YARN-3448.3.patch, YARN-3448.4.patch, YARN-3448.5.patch, YARN-3448.7.patch, 
> YARN-3448.8.patch, YARN-3448.9.patch
>
>
> For large applications, the majority of the time in LeveldbTimelineStore is 
> spent deleting old entities one record at a time. An exclusive write lock is 
> held during the entire deletion phase, which in practice can be hours. If we 
> are willing to relax some of the consistency constraints, other 
> performance-enhancing techniques can be employed to maximize throughput and 
> minimize locking time.
> Split the 5 sections of the leveldb database (domain, owner, start time, 
> entity, index) into 5 separate databases. This allows each database to 
> maximize read-cache effectiveness based on its unique usage patterns. With 5 
> separate databases, each lookup is much faster. It can also help with I/O to 
> have the entity and index databases on separate disks.
> Rolling DBs for the entity and index DBs. 99.9% of the data is in these two 
> sections, with at least a 4:1 ratio (index to entity) for Tez. We replace 
> per-record DB removal with file-system removal if we create a rolling set of 
> databases that age out and can be efficiently removed. To do this, we must 
> place a constraint that an entity's events always go into its correct rolling 
> DB instance based on start time. This allows us to stitch the data back 
> together while reading, with artificial paging.
> Relax the synchronous-write constraints. If we are willing to accept losing 
> some records that were not flushed by the operating system during a crash, we 
> can use async writes, which can be much faster.
> Prefer sequential writes. Sequential writes can be several times faster than 
> random writes. Spend some small effort arranging the writes in a way that 
> trends toward sequential-write performance over random-write performance.
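As a rough illustration of the rolling-DB constraint described above (not the 
actual RollingLevelDBTimelineStore code; the rolling period is an assumption):
{code}
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of mapping an entity start time to its rolling-DB bucket.
final class RollingBucketSketch {
  private static final long ROLL_PERIOD_MS = TimeUnit.HOURS.toMillis(1); // assumed period

  // Entities whose start times fall in the same period share one DB, so an
  // expired period can be dropped by deleting that DB's files instead of
  // deleting records one at a time.
  static long bucketStartFor(long entityStartTimeMs) {
    return (entityStartTimeMs / ROLL_PERIOD_MS) * ROLL_PERIOD_MS;
  }
}
{code}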






[jira] [Created] (YARN-6311) We should write documentation for RollingLevelDBTimelineStore

2017-03-08 Thread Ravi Prakash (JIRA)
Ravi Prakash created YARN-6311:
--

 Summary: We should write documentation for 
RollingLevelDBTimelineStore
 Key: YARN-6311
 URL: https://issues.apache.org/jira/browse/YARN-6311
 Project: Hadoop YARN
  Issue Type: Wish
Affects Versions: 3.0.0-alpha2
Reporter: Ravi Prakash
Priority: Minor


YARN-3448 added the RollingLevelDBTimelineStore to deal with problems in 
LeveldbTimelineStore. We should add documentation for it in TimelineServer.md.






[jira] [Commented] (YARN-5948) Implement MutableConfigurationManager for handling storage into configuration store

2017-03-08 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902193#comment-15902193
 ] 

Wangda Tan commented on YARN-5948:
--

Thanks [~jhung] for working on the patch. A few minor comments:

1) In addition to loading the class named by 
{{yarn.scheduler.configuration.store.class}}, I think we can have a combination 
of the two: there are some predefined short names like "memory"/"derby", etc., 
and if the user specifies an unknown type, we load the class by name from the 
classpath.

2) In yarn-default.xml, could you specify the default {{}} as well?
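To illustrate comment 1), a minimal sketch; the store types and method names here 
are illustrative stand-ins, not the classes in the patch:
{code}
import org.apache.hadoop.conf.Configuration;

// Hypothetical sketch -- store types are stand-ins, not the patch classes.
interface ConfStoreSketch { }
class MemoryStoreSketch implements ConfStoreSketch { }
class DerbyStoreSketch implements ConfStoreSketch { }

final class StoreFactorySketch {
  static ConfStoreSketch create(String configured, Configuration conf) throws Exception {
    if ("memory".equalsIgnoreCase(configured)) {
      return new MemoryStoreSketch();          // predefined short name
    }
    if ("derby".equalsIgnoreCase(configured)) {
      return new DerbyStoreSketch();           // predefined short name
    }
    // Unknown value: treat it as a fully qualified class name on the classpath.
    return (ConfStoreSketch) conf.getClassByName(configured).newInstance();
  }
}
{code}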

> Implement MutableConfigurationManager for handling storage into configuration 
> store
> ---
>
> Key: YARN-5948
> URL: https://issues.apache.org/jira/browse/YARN-5948
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-5948.001.patch, YARN-5948-YARN-5734.002.patch, 
> YARN-5948-YARN-5734.003.patch, YARN-5948-YARN-5734.004.patch, 
> YARN-5948-YARN-5734.005.patch, YARN-5948-YARN-5734.006.patch, 
> YARN-5948-YARN-5734.007.patch
>
>
> The MutableConfigurationManager will take REST calls with desired client 
> configuration changes and call YarnConfigurationStore methods to store these 
> changes in the backing store.






[jira] [Commented] (YARN-6288) Refactor AppLogAggregatorImpl#uploadLogsForContainers

2017-03-08 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902184#comment-15902184
 ] 

Haibo Chen commented on YARN-6288:
--

[~ajisakaa] Have you considered making LogWriter Closeable so that a 
try-with-resources clause can auto-close it?
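For illustration only (SketchWriter is a stand-in, not the real 
AggregatedLogFormat.LogWriter), this is the pattern being suggested:
{code}
import java.io.Closeable;
import java.io.IOException;

// Once the writer implements Closeable, try-with-resources guarantees close()
// runs even when an upload throws.
class SketchWriter implements Closeable {
  void append(String record) throws IOException { /* write to the remote file */ }
  @Override
  public void close() throws IOException { /* close the underlying output streams */ }
}

class SketchAggregator {
  void uploadAll(Iterable<String> containerLogs) throws IOException {
    try (SketchWriter writer = new SketchWriter()) {
      for (String log : containerLogs) {
        writer.append(log);   // if this throws, writer.close() still runs
      }
    }
  }
}
{code}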

> Refactor AppLogAggregatorImpl#uploadLogsForContainers
> -
>
> Key: YARN-6288
> URL: https://issues.apache.org/jira/browse/YARN-6288
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Minor
>  Labels: supportability
> Attachments: YARN-6288.01.patch
>
>
> In AppLogAggregatorImpl.java, if an exception occurs while writing container 
> logs to the remote filesystem, the exception is not caught and handled.
> https://github.com/apache/hadoop/blob/f59e36b4ce71d3019ab91b136b6d7646316954e7/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java#L398






[jira] [Commented] (YARN-6165) Intra-queue preemption occurs even when preemption is turned off for a specific queue.

2017-03-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902149#comment-15902149
 ] 

Hudson commented on YARN-6165:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11375 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11375/])
YARN-6165. Intra-queue preemption occurs even when preemption is turned (jlowe: 
rev d7762a55113a529abd6f4ecb8e6d9b0a84b56e08)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicyIntraQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/IntraQueueCandidatesSelector.java


> Intra-queue preemption occurs even when preemption is turned off for a 
> specific queue.
> --
>
> Key: YARN-6165
> URL: https://issues.apache.org/jira/browse/YARN-6165
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler preemption
>Affects Versions: 3.0.0-alpha2
>Reporter: Eric Payne
>Assignee: Eric Payne
> Fix For: 2.9.0, 2.8.1, 3.0.0-alpha3
>
> Attachments: YARN-6165.001.patch
>
>
> Intra-queue preemption occurs even when preemption is turned on for the whole 
> cluster ({{yarn.resourcemanager.scheduler.monitor.enable == true}}) but 
> turned off for a specific queue 
> ({{yarn.scheduler.capacity.root.queue1.disable_preemption == true}}).
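For reference, a minimal sketch of the two settings mentioned above; the 
cluster-wide monitor flag belongs in yarn-site.xml and the per-queue opt-out in 
capacity-scheduler.xml (the queue name root.queue1 is taken from the description):
{code}
<!-- yarn-site.xml: enable the preemption monitor cluster-wide -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>

<!-- capacity-scheduler.xml: turn preemption off for one queue -->
<property>
  <name>yarn.scheduler.capacity.root.queue1.disable_preemption</name>
  <value>true</value>
</property>
{code}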






[jira] [Created] (YARN-6310) OutputStreams in AggregatedLogFormat.LogWriter can be left open upon exceptions

2017-03-08 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-6310:


 Summary: OutputStreams in AggregatedLogFormat.LogWriter can be 
left open upon exceptions
 Key: YARN-6310
 URL: https://issues.apache.org/jira/browse/YARN-6310
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 3.0.0-alpha2
Reporter: Haibo Chen
Assignee: Haibo Chen









[jira] [Commented] (YARN-6165) Intra-queue preemption occurs even when preemption is turned off for a specific queue.

2017-03-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902110#comment-15902110
 ] 

Jason Lowe commented on YARN-6165:
--

+1 lgtm.  The TestRMRestart failure is unrelated and will be fixed by 
YARN-5548.  I'll fixup the whitespace nit during the commit.

> Intra-queue preemption occurs even when preemption is turned off for a 
> specific queue.
> --
>
> Key: YARN-6165
> URL: https://issues.apache.org/jira/browse/YARN-6165
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler preemption
>Affects Versions: 3.0.0-alpha2
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: YARN-6165.001.patch
>
>
> Intra-queue preemption occurs even when preemption is turned on for the whole 
> cluster ({{yarn.resourcemanager.scheduler.monitor.enable == true}}) but 
> turned off for a specific queue 
> ({{yarn.scheduler.capacity.root.queue1.disable_preemption == true}}).






[jira] [Comment Edited] (YARN-5311) Document graceful decommission CLI and usage

2017-03-08 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902101#comment-15902101
 ] 

Junping Du edited comment on YARN-5311 at 3/8/17 10:41 PM:
---

Sorry for coming to this late; reviewing documentation is never easy work. 
Thanks [~elek] for the patch. Some comments so far:
1. In the overview, we should explain some high-level use cases, like 
elasticity for YARN nodes in public cloud infrastructure, etc. Also, we should 
mention timeout tracking on the client and server side and their differences 
from the perspective of IT operations.

2. As far as I remember, we initially don't support specifying a timeout value 
in the exclude file for client-side timeout tracking. It seems YARN-4676 only 
supports that for server-side tracking. We should mention that explicitly.

3. Also, for the exclude file, we should mention that we currently only support 
plain text (no timeout value) and XML. However, we plan to support JSON format 
in the future; please refer to YARN-5536 for more details.

4. We should mention the behavior when the RM gets restarted or fails over: the 
decommissioning node will be decommissioned after the RM comes back, since no 
timeout value is preserved so far. We should enhance this later, once YARN-5464 
is fixed. For now we can just mention the current behavior in a NOTE and update 
it later once we have a better solution.

Some NITs:

bq. (Note: It isn't needed to restart resourcemanager in case of changing the 
exclude-path as it's reread at every `refresNodes` command)
We should make this more readable, something like: "It is unnecessary to 
restart the RM when changing the exclude-path, as this config will be read 
again on every 'refreshNodes' command."

bq. +* WAIT_CONTAINER --- wait for running containers to complete.
Capitalize the "w" in "wait" as in the other items.

bq. +* WAIT_APP --- wait for running application to complete (after all 
containers complete)
Same comment as above.


was (Author: djp):
Sorry for coming to this late; reviewing documentation is never easy work. 
Thanks [~elek] for the patch. Some comments so far:
1. In the overview, we should explain some high-level use cases, like 
elasticity for YARN nodes in public cloud infrastructure, etc. Also, we should 
mention timeout tracking on the client and server side and their differences 
from the perspective of IT operations.

2. As far as I remember, we initially don't support specifying a timeout value 
in the exclude file for client-side timeout tracking. It seems YARN-4676 only 
supports that for server-side tracking. We should mention that explicitly.

3. Also, for the exclude file, we should mention that we currently only support 
plain text (no timeout value) and XML. However, we plan to support JSON format 
in the future; please refer to YARN-5536 for more details.

4. We should mention the behavior when the RM gets restarted or fails over: the 
decommissioning node will be decommissioned after the RM comes back, since no 
timeout value is preserved so far. We should enhance this later, once YARN-5464 
is fixed. For now we can just mention the current behavior in a NOTE and update 
it later once we have a better solution.

Some NITs:

bq. (Note: It isn't needed to restart resourcemanager in case of changing the 
exclude-path as it's reread at every `refresNodes` command)
It is unnecessary to restart the RM when changing the exclude-path, as this 
config will be read again on every 'refreshNodes' command.

bq. +* WAIT_CONTAINER --- wait for running containers to complete.
Capitalize the "w" in "wait" as in the other items.

bq. +* WAIT_APP --- wait for running application to complete (after all 
containers complete)
Same comment as above.

> Document graceful decommission CLI and usage
> 
>
> Key: YARN-5311
> URL: https://issues.apache.org/jira/browse/YARN-5311
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.9.0
>Reporter: Junping Du
>Assignee: Elek, Marton
> Attachments: YARN-5311.001.patch, YARN-5311.002.patch, 
> YARN-5311.003.patch
>
>







[jira] [Commented] (YARN-5311) Document graceful decommission CLI and usage

2017-03-08 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902101#comment-15902101
 ] 

Junping Du commented on YARN-5311:
--

Sorry for coming to this late; reviewing documentation is never easy work. 
Thanks [~elek] for the patch. Some comments so far:
1. In the overview, we should explain some high-level use cases, like 
elasticity for YARN nodes in public cloud infrastructure, etc. Also, we should 
mention timeout tracking on the client and server side and their differences 
from the perspective of IT operations.

2. As far as I remember, we initially don't support specifying a timeout value 
in the exclude file for client-side timeout tracking. It seems YARN-4676 only 
supports that for server-side tracking. We should mention that explicitly.

3. Also, for the exclude file, we should mention that we currently only support 
plain text (no timeout value) and XML. However, we plan to support JSON format 
in the future; please refer to YARN-5536 for more details.

4. We should mention the behavior when the RM gets restarted or fails over: the 
decommissioning node will be decommissioned after the RM comes back, since no 
timeout value is preserved so far. We should enhance this later, once YARN-5464 
is fixed. For now we can just mention the current behavior in a NOTE and update 
it later once we have a better solution.

Some NITs:

bq. (Note: It isn't needed to restart resourcemanager in case of changing the 
exclude-path as it's reread at every `refresNodes` command)
It is unnecessary to restart the RM when changing the exclude-path, as this 
config will be read again on every 'refreshNodes' command.

bq. +* WAIT_CONTAINER --- wait for running containers to complete.
Capitalize the "w" in "wait" as in the other items.

bq. +* WAIT_APP --- wait for running application to complete (after all 
containers complete)
Same comment as above.
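For context, a hypothetical sketch of the usage the document would cover (the exact 
file format and CLI flags should follow the patch and YARN-4676): an exclude file 
with an optional per-host timeout, and a graceful refreshNodes invocation.
{code}
<?xml version="1.0"?>
<!-- hypothetical exclude file sketch: host names, with an optional per-host timeout -->
<hosts>
  <host><name>node1.example.com</name></host>
  <host><name>node2.example.com</name><timeout>1800</timeout></host>
</hosts>

# client-side tracking with an 1800-second timeout (sketch; see the patch for exact syntax)
yarn rmadmin -refreshNodes -g 1800 -client
{code}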

> Document graceful decommission CLI and usage
> 
>
> Key: YARN-5311
> URL: https://issues.apache.org/jira/browse/YARN-5311
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.9.0
>Reporter: Junping Du
>Assignee: Elek, Marton
> Attachments: YARN-5311.001.patch, YARN-5311.002.patch, 
> YARN-5311.003.patch
>
>







[jira] [Updated] (YARN-5311) Document graceful decommission CLI and usage

2017-03-08 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5311:
-
Labels:   (was: oct16-easy)

> Document graceful decommission CLI and usage
> 
>
> Key: YARN-5311
> URL: https://issues.apache.org/jira/browse/YARN-5311
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.9.0
>Reporter: Junping Du
>Assignee: Elek, Marton
> Attachments: YARN-5311.001.patch, YARN-5311.002.patch, 
> YARN-5311.003.patch
>
>







[jira] [Commented] (YARN-6290) Fair scheduler page broken

2017-03-08 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902032#comment-15902032
 ] 

Haibo Chen commented on YARN-6290:
--

The title says Fair Scheduler, but the stack trace indicates the Capacity 
Scheduler is in use. Can you confirm which one is used, [~leopold.boudard]?


> Fair scheduler page broken
> --
>
> Key: YARN-6290
> URL: https://issues.apache.org/jira/browse/YARN-6290
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.3
>Reporter: Léopold Boudard
>
> Hello,
> I have an issue very similar to this one:
> https://issues.apache.org/jira/browse/YARN-3478
> I am not able to access the scheduler page in the YARN web app.
> Except that the traceback is a bit different (a NullPointerException) and 
> could be caused by a stale queue configuration.
> I suspect QueueCapacitiesInfo.java is initializing the info object with a 
> null value for some reason.
> Traceback below:
> ```
> 2017-03-06 10:20:00,945 ERROR webapp.Dispatcher 
> (Dispatcher.java:service(162)) - error handling URI: /cluster/scheduler
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>   at 
> com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
>   at 
> com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
>   at 
> com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:178)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
>   at 
> com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
>   at 
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
>   at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at 
> org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at 
> org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:614)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:573)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at 
> org.apache.hadoop.security.http.CrossOriginFilter.doFilter(CrossOriginFilter.java:95)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1294)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>   at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>   at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>   at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
>   at 

[jira] [Created] (YARN-6309) Fair scheduler docs should have the queue and queuePlacementPolicy elements listed in bold so that they're easier to see

2017-03-08 Thread Daniel Templeton (JIRA)
Daniel Templeton created YARN-6309:
--

 Summary: Fair scheduler docs should have the queue and 
queuePlacementPolicy elements listed in bold so that they're easier to see
 Key: YARN-6309
 URL: https://issues.apache.org/jira/browse/YARN-6309
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 3.0.0-alpha2
Reporter: Daniel Templeton
Priority: Minor


Under {{Allocation file format : Queue elements}}, all of the element names 
should be bold, e.g. {{minResources}}, {{maxResources}}, etc.  Same for 
{{Allocation file format : A queuePlacementPolicy element}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6300) NULL_UPDATE_REQUESTS is redundant in TestFairScheduler

2017-03-08 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902021#comment-15902021
 ] 

Haibo Chen commented on YARN-6300:
--

Thanks [~yuanbo] for the patch. +1 nonbinding.

> NULL_UPDATE_REQUESTS is redundant in TestFairScheduler
> --
>
> Key: YARN-6300
> URL: https://issues.apache.org/jira/browse/YARN-6300
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
>Assignee: Yuanbo Liu
>Priority: Minor
>  Labels: newbie
> Attachments: YARN-6300.001.patch
>
>
> The {{TestFairScheduler.NULL_UPDATE_REQUESTS}} field hides 
> {{FairSchedulerTestBase.NULL_UPDATE_REQUESTS}}, which has the same value.  
> The {{NULL_UPDATE_REQUESTS}} field should be removed from 
> {{TestFairScheduler}}.
> While you're at it, maybe also remove the unused import.
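For illustration only, a minimal sketch of the field-hiding pattern described above; the class names and the {{Object}} type are simplified stand-ins, not the real test classes:

{code:java}
// Sketch only: a subclass constant with the same name hides the base-class one.
class FairSchedulerTestBaseSketch {
  protected static final Object NULL_UPDATE_REQUESTS = null;
}

class TestFairSchedulerSketch extends FairSchedulerTestBaseSketch {
  // Redundant: this hides FairSchedulerTestBaseSketch.NULL_UPDATE_REQUESTS even
  // though the value is the same. Removing the duplicate is the requested cleanup.
  protected static final Object NULL_UPDATE_REQUESTS = null;
}
{code}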



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6298) Metric preemptCall is not used in new preemption.

2017-03-08 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu reassigned YARN-6298:
--

Assignee: (was: Yufei Gu)

> Metric preemptCall is not used in new preemption.
> -
>
> Key: YARN-6298
> URL: https://issues.apache.org/jira/browse/YARN-6298
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 2.8.0, 3.0.0-alpha2
>Reporter: Yufei Gu
>
> Either get rid of it in Hadoop 3 or use it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6308) Fix TestAMRMClient compilation errors

2017-03-08 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901986#comment-15901986
 ] 

Manoj Govindassamy commented on YARN-6308:
--

bq. Yes, it looks like the changes in YARN-6218 conflict with this change. For 
example, in that commit, the method tearDown was renamed to teardown, so in 
this patch we call a method that doesn't exist anymore. Want me to make a 
separate patch to fix this?
[~Steven Rand] yes, please.

> Fix TestAMRMClient compilation errors
> -
>
> Key: YARN-6308
> URL: https://issues.apache.org/jira/browse/YARN-6308
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha3
>Reporter: Manoj Govindassamy
>
> Looks like fixes committed for HADOOP-14062 and YARN-6218 had conflicts and 
> left TestAMRMClient in a dangling state with compilation errors. 
> TestAMRMClient needs a fix.
> {code}
> [ERROR] COMPILATION ERROR : 
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[145,5]
>  non-static variable yarnCluster cannot be referenced from a static context
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[145,71]
>  non-static variable nodeCount cannot be referenced from a static context
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[146,5]
>  non-static variable yarnCluster cannot be referenced from a static context
> ..
> ..
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[204,9]
>  non-static variable attemptId cannot be referenced from a static context
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[207,20]
>  non-static variable attemptId cannot be referenced from a static context
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[206,13]
>  non-static variable yarnCluster cannot be referenced from a static context
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[874,5]
>  cannot find symbol
> [ERROR] symbol:   method tearDown()
> [ERROR] location: class org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[876,5]
>  cannot find symbol
> [ERROR] symbol:   method startApp()
> [ERROR] location: class org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[881,5]
>  cannot find symbol
> [ERROR] symbol:   method tearDown()
> [ERROR] location: class org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
> [ERROR] 
> /Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[885,5]
>  cannot find symbol
> [ERROR] symbol:   method startApp()
> [ERROR] location: class org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
> [ERROR] -> [Help 1]
> [ERROR] 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6165) Intra-queue preemption occurs even when preemption is turned off for a specific queue.

2017-03-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901993#comment-15901993
 ] 

Hadoop QA commented on YARN-6165:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  4m 
45s{color} | {color:red} root in trunk failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 43m  9s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 58m  2s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6165 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855987/YARN-6165.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2d24ec6a2e2a 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 
09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 241c1cc |
| Default Java | 1.8.0_121 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/15210/artifact/patchprocess/branch-mvninstall-root.txt
 |
| findbugs | v3.0.0 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/15210/artifact/patchprocess/whitespace-eol.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/15210/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/15210/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/15210/console |
| Powered by | Apache 

[jira] [Created] (YARN-6308) Fix TestAMRMClient compilation errors

2017-03-08 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created YARN-6308:


 Summary: Fix TestAMRMClient compilation errors
 Key: YARN-6308
 URL: https://issues.apache.org/jira/browse/YARN-6308
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0-alpha3
Reporter: Manoj Govindassamy



Looks like fixes committed for HADOOP-14062 and YARN-6218 had conflicts and 
left TestAMRMClient in a dangling state with compilation errors. TestAMRMClient 
needs a fix.

{code}
[ERROR] COMPILATION ERROR : 
[ERROR] 
/Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[145,5]
 non-static variable yarnCluster cannot be referenced from a static context
[ERROR] 
/Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[145,71]
 non-static variable nodeCount cannot be referenced from a static context
[ERROR] 
/Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[146,5]
 non-static variable yarnCluster cannot be referenced from a static context
..
..
[ERROR] 
/Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[204,9]
 non-static variable attemptId cannot be referenced from a static context
[ERROR] 
/Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[207,20]
 non-static variable attemptId cannot be referenced from a static context
[ERROR] 
/Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[206,13]
 non-static variable yarnCluster cannot be referenced from a static context
[ERROR] 
/Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[874,5]
 cannot find symbol
[ERROR] symbol:   method tearDown()
[ERROR] location: class org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
[ERROR] 
/Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[876,5]
 cannot find symbol
[ERROR] symbol:   method startApp()
[ERROR] location: class org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
[ERROR] 
/Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[881,5]
 cannot find symbol
[ERROR] symbol:   method tearDown()
[ERROR] location: class org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
[ERROR] 
/Users/manoj/work/ups-hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java:[885,5]
 cannot find symbol
[ERROR] symbol:   method startApp()
[ERROR] location: class org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
[ERROR] -> [Help 1]
[ERROR] 

{code}
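As a hedged illustration of the first group of errors above (not the actual TestAMRMClient code; all names are made up), this is the pattern that produces "non-static variable ... cannot be referenced from a static context": a setup method made static while the fields it touches stay non-static.

{code:java}
// Sketch only: compiles as written because the offending lines are commented out.
public class StaticContextSketch {
  private Object yarnCluster;   // instance field
  private int nodeCount = 3;    // instance field

  // Turning setup into a static method (e.g. for use as a JUnit @BeforeClass)
  // breaks compilation, because static code cannot touch the instance fields above.
  public static void setup() {
    // yarnCluster = new Object();     // error: non-static variable yarnCluster
    // int n = nodeCount;              // error: non-static variable nodeCount
  }
}
{code}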



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6165) Intra-queue preemption occurs even when preemption is turned off for a specific queue.

2017-03-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901981#comment-15901981
 ] 

Hadoop QA commented on YARN-6165:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  4m 
30s{color} | {color:red} root in trunk failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 40m 13s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m 27s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6165 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855987/YARN-6165.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux f9fdf9ee8081 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 241c1cc |
| Default Java | 1.8.0_121 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/15209/artifact/patchprocess/branch-mvninstall-root.txt
 |
| findbugs | v3.0.0 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/15209/artifact/patchprocess/whitespace-eol.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/15209/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/15209/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/15209/console |
| Powered by | Apache Yetus 

[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly

2017-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901945#comment-15901945
 ] 

ASF GitHub Bot commented on YARN-6302:
--

GitHub user szegedim opened a pull request:

https://github.com/apache/hadoop/pull/200

YARN-6302 Fail the node, if Linux Container Executor is not configured 
properly

YARN-6302 Fail the node, if Linux Container Executor is not configured 
properly

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/szegedim/hadoop YARN-6302

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hadoop/pull/200.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #200


commit cb97a1911c0df3528c49aa0ba96e7bc6233d630a
Author: Miklos Szegedi 
Date:   2017-03-07T22:35:16Z

YARN-6302 Throw on error 24

Change-Id: Ia676061fd49cc7f54dbd9ae22bb999d4ea8a965b

commit 6f7872e99f5be813c74493dd204e14355049659d
Author: Miklos Szegedi 
Date:   2017-03-08T03:37:10Z

YARN-6302 Shutdown on error 24

Change-Id: Ib17d4a357b6fdf1a6d940f0641770054f1f73e81

commit 03f4cd8a1391360ea3d7790b1044421eb05d6d2d
Author: Miklos Szegedi 
Date:   2017-03-08T19:47:03Z

YARN-6302 Mark node unhealthy on error 24

Change-Id: Ib1e7215f9dac6825bda2eb54707782c59f19eb0c




> Fail the node, if Linux Container Executor is not configured properly
> -
>
> Key: YARN-6302
> URL: https://issues.apache.org/jira/browse/YARN-6302
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
>
> We have a cluster that has one node with a misconfigured Linux Container 
> Executor. Every AM or regular container launched on that node fails. The node 
> still has resources available, so it keeps failing apps until the administrator 
> notices the issue and decommissions the node. AM blacklisting only helps if the 
> application is already running.
> As a possible improvement, when the LCE is used on the cluster and an NM gets 
> certain errors back from the LCE, like error 24 (configuration not found), we 
> should stop trying to allocate anything on that node, or shut the node down 
> entirely. That kind of problem normally does not fix itself, and it means that 
> nothing can really run on that node.
> {code}
> Application application_1488920587909_0010 failed 2 times due to AM Container 
> for appattempt_1488920587909_0010_02 exited with exitCode: -1000
> Failing this attempt.Diagnostics: Application application_1488920587909_0010 
> initialization failed (exitCode=24) with output:
> For more detailed output, check the application tracking page: 
> http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then 
> click on links to logs of each attempt.
> . Failing the application.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6165) Intra-queue preemption occurs even when preemption is turned off for a specific queue.

2017-03-08 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901921#comment-15901921
 ] 

Eric Payne commented on YARN-6165:
--

bq. this needs a Jenkins run anyway
Thanks [~jlowe]. I kicked the build.


> Intra-queue preemption occurs even when preemption is turned off for a 
> specific queue.
> --
>
> Key: YARN-6165
> URL: https://issues.apache.org/jira/browse/YARN-6165
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler preemption
>Affects Versions: 3.0.0-alpha2
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: YARN-6165.001.patch
>
>
> Intra-queue preemption occurs even when preemption is turned on for the whole 
> cluster ({{yarn.resourcemanager.scheduler.monitor.enable == true}}) but 
> turned off for a specific queue 
> ({{yarn.scheduler.capacity.root.queue1.disable_preemption == true}}).
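For reference, a minimal sketch of the two properties named above, set programmatically purely for illustration; in a real cluster they would live in yarn-site.xml and capacity-scheduler.xml rather than in code:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class PreemptionConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Cluster-wide preemption monitor is on...
    conf.setBoolean("yarn.resourcemanager.scheduler.monitor.enable", true);
    // ...while preemption is disabled for one specific queue. The bug reported
    // here is that intra-queue preemption still happens under this combination.
    conf.setBoolean("yarn.scheduler.capacity.root.queue1.disable_preemption", true);
    System.out.println(
        conf.get("yarn.scheduler.capacity.root.queue1.disable_preemption"));
  }
}
{code}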



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6165) Intra-queue preemption occurs even when preemption is turned off for a specific queue.

2017-03-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901915#comment-15901915
 ] 

Jason Lowe commented on YARN-6165:
--

I'd like to take a quick look, and this needs a Jenkins run anyway.

> Intra-queue preemption occurs even when preemption is turned off for a 
> specific queue.
> --
>
> Key: YARN-6165
> URL: https://issues.apache.org/jira/browse/YARN-6165
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler preemption
>Affects Versions: 3.0.0-alpha2
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: YARN-6165.001.patch
>
>
> Intra-queue preemption occurs even when preemption is turned on for the whole 
> cluster ({{yarn.resourcemanager.scheduler.monitor.enable == true}}) but 
> turned off for a specific queue 
> ({{yarn.scheduler.capacity.root.queue1.disable_preemption == true}}).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5948) Implement MutableConfigurationManager for handling storage into configuration store

2017-03-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901892#comment-15901892
 ] 

Hadoop QA commented on YARN-5948:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
42s{color} | {color:green} YARN-5734 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
23s{color} | {color:green} YARN-5734 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 7s{color} | {color:green} YARN-5734 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
12s{color} | {color:green} YARN-5734 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
18s{color} | {color:green} YARN-5734 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} YARN-5734 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} | {color:green} YARN-5734 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
45s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
1s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 41m 50s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
46s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}111m 37s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
|   | hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5948 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12856845/YARN-5948-YARN-5734.007.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 583fa6e08cda 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-5734 / 01ea2f3 |
| Default Java | 1.8.0_121 |
| 

[jira] [Created] (YARN-6307) Refactor FairShareComparator#compare

2017-03-08 Thread Yufei Gu (JIRA)
Yufei Gu created YARN-6307:
--

 Summary: Refactor FairShareComparator#compare
 Key: YARN-6307
 URL: https://issues.apache.org/jira/browse/YARN-6307
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Yufei Gu
Assignee: Yufei Gu


The method does three things: it checks the min share ratio, checks the weight 
ratio, and breaks ties by submit time and name. These concerns are mixed together, 
which makes the code hard to read and maintain. Additionally, there are potential 
performance issues; for example, the weight ratio does not need to be calculated 
on every comparison.
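As an illustration only (this is not the actual FairShareComparator logic; the type and field names are hypothetical), one way the three concerns could be separated so each tie-break reads on its own:

{code:java}
import java.util.Comparator;

// Hypothetical stand-in for a Schedulable; field names are illustrative only.
class SchedulableSketch {
  double minShareRatio;  // usage relative to min share
  double weightRatio;    // usage relative to weight
  long startTime;
  String name;
}

// Sketch of the refactoring idea: one concern per step, so later checks (and any
// work needed to compute them) only run when the earlier checks tie.
class FairShareComparatorSketch implements Comparator<SchedulableSketch> {
  @Override
  public int compare(SchedulableSketch a, SchedulableSketch b) {
    int byMinShare = Double.compare(a.minShareRatio, b.minShareRatio);
    if (byMinShare != 0) {
      return byMinShare;
    }
    int byWeight = Double.compare(a.weightRatio, b.weightRatio);
    if (byWeight != 0) {
      return byWeight;
    }
    int byTime = Long.compare(a.startTime, b.startTime);
    return byTime != 0 ? byTime : a.name.compareTo(b.name);
  }
}
{code}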




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6165) Intra-queue preemption occurs even when preemption is turned off for a specific queue.

2017-03-08 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901886#comment-15901886
 ] 

Eric Payne commented on YARN-6165:
--

bq. I think patch looks fine for me.

Thanks [~sunilg]. Will you be committing this?

> Intra-queue preemption occurs even when preemption is turned off for a 
> specific queue.
> --
>
> Key: YARN-6165
> URL: https://issues.apache.org/jira/browse/YARN-6165
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler preemption
>Affects Versions: 3.0.0-alpha2
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: YARN-6165.001.patch
>
>
> Intra-queue preemption occurs even when preemption is turned on for the whole 
> cluster ({{yarn.resourcemanager.scheduler.monitor.enable == true}}) but 
> turned off for a specific queue 
> ({{yarn.scheduler.capacity.root.queue1.disable_preemption == true}}).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6237) Move UID constant to TimelineReaderUtils

2017-03-08 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901885#comment-15901885
 ] 

Varun Saxena commented on YARN-6237:


Committed to YARN-5355, YARN-5355-branch-2.
Thanks [~rohithsharma] for your contribution.

> Move UID constant to TimelineReaderUtils
> 
>
> Key: YARN-6237
> URL: https://issues.apache.org/jira/browse/YARN-6237
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: newbie
> Fix For: YARN-5355, YARN-5355-branch-2
>
> Attachments: YARN-6237-YARN-5355.0001.patch
>
>
> The UID constant is kept in TimelineReaderManager. It can be moved to 
> TimelineReaderUtils, which can keep track of all reader constants. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5948) Implement MutableConfigurationManager for handling storage into configuration store

2017-03-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901859#comment-15901859
 ] 

Hadoop QA commented on YARN-5948:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
53s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
51s{color} | {color:green} YARN-5734 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
40s{color} | {color:green} YARN-5734 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} YARN-5734 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
1s{color} | {color:green} YARN-5734 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
13s{color} | {color:green} YARN-5734 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
30s{color} | {color:green} YARN-5734 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
41s{color} | {color:green} YARN-5734 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
37s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
30s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 50s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 98m 14s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5948 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12856845/YARN-5948-YARN-5734.007.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 79192817cdb1 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-5734 / 01ea2f3 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | 

[jira] [Updated] (YARN-6237) Move UID constant to TimelineReaderUtils

2017-03-08 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-6237:
---
Summary: Move UID constant to TimelineReaderUtils  (was: Move UID constant 
into TimelineReaderUtils)

> Move UID constant to TimelineReaderUtils
> 
>
> Key: YARN-6237
> URL: https://issues.apache.org/jira/browse/YARN-6237
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: newbie
> Attachments: YARN-6237-YARN-5355.0001.patch
>
>
> The UID constant is kept in TimelineReaderManager. It can be moved to 
> TimelineReaderUtils, which can keep track of all reader constants. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6297) TestAppLogAggregatorImp.verifyFilesUploaded() should check # of files uploaded with that of files expected

2017-03-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901771#comment-15901771
 ] 

Hudson commented on YARN-6297:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11373 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11373/])
YARN-6297. TestAppLogAggregatorImp.verifyFilesUploaded() should check # 
(rkanter: rev 287ba4ffa66212c02e1b1edc8fca53f6368a9efc)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestAppLogAggregatorImpl.java


> TestAppLogAggregatorImp.verifyFilesUploaded() should check # of files 
> uploaded with that of files expected
> --
>
> Key: YARN-6297
> URL: https://issues.apache.org/jira/browse/YARN-6297
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: test
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: YARN-6297.01.patch
>
>
> Per YARN-6252
> {code:java}
>   private static void verifyFilesUploaded(Set<String> filesUploaded,
>   Set<String> filesExpected) {
> final String errMsgPrefix = "The set of files uploaded are not the same " 
> +
> "as expected";
> if(filesUploaded.size() != filesUploaded.size()) {
>   fail(errMsgPrefix + ": actual size: " + filesUploaded.size() + " vs " +
>   "expected size: " + filesExpected.size());
> }
> for(String file: filesExpected) {
>   if(!filesUploaded.contains(file)) {
> fail(errMsgPrefix + ": expecting " + file);
>   }
> }
>   }
> {code}
> should check the number of files uploaded against the number of files 
> expected.
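A minimal sketch of the corrected check (illustrative, not necessarily the committed patch): the size comparison is made against the expected set instead of against the uploaded set itself.

{code:java}
import java.util.Set;
import static org.junit.Assert.fail;

final class VerifyFilesSketch {
  private static void verifyFilesUploaded(Set<String> filesUploaded,
      Set<String> filesExpected) {
    final String errMsgPrefix = "The set of files uploaded is not the same as expected";
    // Compare uploaded vs. expected, not uploaded vs. itself.
    if (filesUploaded.size() != filesExpected.size()) {
      fail(errMsgPrefix + ": actual size: " + filesUploaded.size()
          + " vs expected size: " + filesExpected.size());
    }
    for (String file : filesExpected) {
      if (!filesUploaded.contains(file)) {
        fail(errMsgPrefix + ": expecting " + file);
      }
    }
  }
}
{code}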



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6297) TestAppLogAggregatorImp.verifyFilesUploaded() should check # of files uploaded with that of files expected

2017-03-08 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901740#comment-15901740
 ] 

Robert Kanter commented on YARN-6297:
-

+1

> TestAppLogAggregatorImp.verifyFilesUploaded() should check # of files 
> uploaded with that of files expected
> --
>
> Key: YARN-6297
> URL: https://issues.apache.org/jira/browse/YARN-6297
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: test
> Attachments: YARN-6297.01.patch
>
>
> Per YARN-6252
> {code:java}
>   private static void verifyFilesUploaded(Set<String> filesUploaded,
>   Set<String> filesExpected) {
> final String errMsgPrefix = "The set of files uploaded are not the same " 
> +
> "as expected";
> if(filesUploaded.size() != filesUploaded.size()) {
>   fail(errMsgPrefix + ": actual size: " + filesUploaded.size() + " vs " +
>   "expected size: " + filesExpected.size());
> }
> for(String file: filesExpected) {
>   if(!filesUploaded.contains(file)) {
> fail(errMsgPrefix + ": expecting " + file);
>   }
> }
>   }
> {code}
> should check the number of files uploaded against the number of files 
> expected.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6306) NMClient API change for container upgrade

2017-03-08 Thread Jian He (JIRA)
Jian He created YARN-6306:
-

 Summary: NMClient API change for container upgrade
 Key: YARN-6306
 URL: https://issues.apache.org/jira/browse/YARN-6306
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Jian He
Assignee: Arun Suresh






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5169) most YARN events have timestamp of -1

2017-03-08 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen reassigned YARN-5169:


Assignee: Haibo Chen

> most YARN events have timestamp of -1
> -
>
> Key: YARN-5169
> URL: https://issues.apache.org/jira/browse/YARN-5169
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.2
>Reporter: Sangjin Lee
>Assignee: Haibo Chen
>
> Most of the YARN events (subclasses of {{AbstractEvent}}) have a timestamp of 
> -1. {{AbstractEvent}} has two constructors: one that initializes the 
> timestamp to -1 and the other to the caller-provided value. But most events 
> use the former (thus timestamp of -1).
> Some of the more common events, including {{ApplicationEvent}}, 
> {{ContainerEvent}}, {{JobEvent}}, etc. do not set the timestamp.
> The rationale for this behavior seems to be mentioned in {{AbstractEvent}}:
> {code}
>   // use this if you DON'T care about the timestamp
>   public AbstractEvent(TYPE type) {
> this.type = type;
> // We're not generating a real timestamp here.  It's too expensive.
> timestamp = -1L;
>   }
> {code}
> This absence of the timestamp isn't really visible in many cases and 
> therefore may have gone unnoticed, but the timeline service exposes this 
> problem very visibly.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5169) most YARN events have timestamp of -1

2017-03-08 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901717#comment-15901717
 ] 

Haibo Chen commented on YARN-5169:
--

[~sjlee0] [~gtCarrera9], IIUC from the above discussion, the RM is already using 
System.currentTimeMillis() to generate ATS event timestamps; does that mean we are 
not worried about its performance overhead? If so, I think we should at least be 
consistent for events like container localization that we persist in ATS v2. 
Otherwise, the YARN_NM_CONTAINER_LOCALIZATION_STARTED event will just show a 
timestamp of -1, which I don't think is of much value to users. Thoughts?

> most YARN events have timestamp of -1
> -
>
> Key: YARN-5169
> URL: https://issues.apache.org/jira/browse/YARN-5169
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.2
>Reporter: Sangjin Lee
>
> Most of the YARN events (subclasses of {{AbstractEvent}}) have a timestamp of 
> -1. {{AbstractEvent}} has two constructors: one that initializes the 
> timestamp to -1 and the other to the caller-provided value. But most events 
> use the former (thus timestamp of -1).
> Some of the more common events, including {{ApplicationEvent}}, 
> {{ContainerEvent}}, {{JobEvent}}, etc. do not set the timestamp.
> The rationale for this behavior seems to be mentioned in {{AbstractEvent}}:
> {code}
>   // use this if you DON'T care about the timestamp
>   public AbstractEvent(TYPE type) {
> this.type = type;
> // We're not generating a real timestamp here.  It's too expensive.
> timestamp = -1L;
>   }
> {code}
> This absence of the timestamp isn't really visible in many cases and 
> therefore may have gone unnoticed, but the timeline service exposes this 
> problem very visibly.
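A hedged sketch of the alternative the comment points at: passing a real timestamp through the caller-provided-value constructor. The classes below are made-up stand-ins; only the two-constructor shape is taken from the description above.

{code:java}
// Sketch only: an event that pays the System.currentTimeMillis() cost up front so
// consumers such as ATS see a real time instead of -1.
enum SketchEventType { LOCALIZATION_STARTED }

abstract class AbstractEventSketch<T extends Enum<T>> {
  private final T type;
  private final long timestamp;

  AbstractEventSketch(T type) {             // "don't care" constructor: stores -1
    this(type, -1L);
  }

  AbstractEventSketch(T type, long timestamp) {
    this.type = type;
    this.timestamp = timestamp;
  }

  long getTimestamp() { return timestamp; }
}

class LocalizationEventSketch extends AbstractEventSketch<SketchEventType> {
  LocalizationEventSketch() {
    super(SketchEventType.LOCALIZATION_STARTED, System.currentTimeMillis());
  }
}
{code}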



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6164) Expose maximum-am-resource-percent in YarnClient

2017-03-08 Thread Benson Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901696#comment-15901696
 ] 

Benson Qiu commented on YARN-6164:
--

[~sunilg], [~leftnoteasy]: Sure, I can add {{QueueCapacities}} to {{QueueInfo}} 
as originally discussed.

If I work on this, is it very likely that my patch will get accepted? I would 
be working on this patch on my own time (not for the company I work for).

> Expose maximum-am-resource-percent in YarnClient
> 
>
> Key: YARN-6164
> URL: https://issues.apache.org/jira/browse/YARN-6164
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.2
>Reporter: Benson Qiu
>Assignee: Benson Qiu
> Attachments: YARN-6164.001.patch, YARN-6164.002.patch, 
> YARN-6164.003.patch, YARN-6164.004.patch, YARN-6164.005.patch
>
>
> `yarn.scheduler.capacity.maximum-am-resource-percent` is exposed through the 
> [Cluster Scheduler 
> API|http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Scheduler_API],
>  but not through 
> [YarnClient|https://hadoop.apache.org/docs/current/api/org/apache/hadoop/yarn/client/api/YarnClient.html].
> Since YarnClient and RM REST APIs depend on different ports (8032 vs 8088 by 
> default), it would be nice to expose `maximum-am-resource-percent` in 
> YarnClient as well. 
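For context, a small sketch of where such a value would surface for a client; the calls below exist in YarnClient today, but the accessor for maximum-am-resource-percent does not, and adding it is what this JIRA proposes:

{code:java}
import org.apache.hadoop.yarn.api.records.QueueInfo;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class QueueInfoSketch {
  public static void main(String[] args) throws Exception {
    YarnClient client = YarnClient.createYarnClient();
    client.init(new YarnConfiguration());
    client.start();
    try {
      QueueInfo info = client.getQueueInfo("default");
      // QueueInfo currently exposes capacity/maximumCapacity but not
      // maximum-am-resource-percent; a getter for that (e.g. via QueueCapacities,
      // as discussed above) is the proposed addition and does not exist yet.
      System.out.println("capacity: " + info.getCapacity());
    } finally {
      client.stop();
    }
  }
}
{code}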



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes

2017-03-08 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901678#comment-15901678
 ] 

Robert Kanter commented on YARN-6050:
-

Test failure looks to be unrelated (YARN-5548).

> AMs can't be scheduled on racks or nodes
> 
>
> Key: YARN-6050
> URL: https://issues.apache.org/jira/browse/YARN-6050
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-6050.001.patch, YARN-6050.002.patch, 
> YARN-6050.003.patch, YARN-6050.004.patch, YARN-6050.005.patch, 
> YARN-6050.006.patch, YARN-6050.007.patch, YARN-6050.008.patch, 
> YARN-6050.009.patch, YARN-6050.010.patch
>
>
> Yarn itself supports rack/node aware scheduling for AMs; however, there 
> currently are two problems:
> # To specify hard or soft rack/node requests, you have to specify more than 
> one {{ResourceRequest}}.  For example, if you want to schedule an AM only on 
> "rackA", you have to create two {{ResourceRequest}}, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, 
> true);
> {code}
> The problem is that the Yarn API doesn't actually allow you to specify more 
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The 
> current behavior is to either build one from {{getResource}} or directly from 
> {{getAMContainerResourceRequest}}, depending on if 
> {{getAMContainerResourceRequest}} is null or not.  We'll need to add a third 
> method, say {{getAMContainerResourceRequests}}, which takes a list of 
> {{ResourceRequest}} so that clients can specify the multiple resource 
> requests.
> # There are some places where things are hardcoded to overwrite what the 
> client specifies.  These are pretty straightforward to fix.
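As a hedged sketch of the first point, here is the hard rack request from the description built with the existing ResourceRequest API; the list-valued submission-context setter mentioned in the comments is hypothetical (it mirrors the proposed getAMContainerResourceRequests and is not a current API):

{code:java}
import java.util.Arrays;
import java.util.List;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

public class AmPlacementSketch {
  public static void main(String[] args) {
    Priority priority = Priority.newInstance(0);
    Resource capability = Resource.newInstance(1024, 1);
    // Hard rack request, as in the description: relaxLocality=false at the ANY
    // level pins the AM to "rackA".
    List<ResourceRequest> amRequests = Arrays.asList(
        ResourceRequest.newInstance(priority, ResourceRequest.ANY, capability, 1, false),
        ResourceRequest.newInstance(priority, "rackA", capability, 1, true));
    // Hypothetical setter mirroring the proposed getAMContainerResourceRequests:
    //   appSubmissionContext.setAMContainerResourceRequests(amRequests);
    System.out.println(amRequests);
  }
}
{code}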



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5948) Implement MutableConfigurationManager for handling storage into configuration store

2017-03-08 Thread Jonathan Hung (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-5948:

Attachment: YARN-5948-YARN-5734.007.patch

> Implement MutableConfigurationManager for handling storage into configuration 
> store
> ---
>
> Key: YARN-5948
> URL: https://issues.apache.org/jira/browse/YARN-5948
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-5948.001.patch, YARN-5948-YARN-5734.002.patch, 
> YARN-5948-YARN-5734.003.patch, YARN-5948-YARN-5734.004.patch, 
> YARN-5948-YARN-5734.005.patch, YARN-5948-YARN-5734.006.patch, 
> YARN-5948-YARN-5734.007.patch
>
>
> The MutableConfigurationManager will take REST calls with desired client 
> configuration changes and call YarnConfigurationStore methods to store these 
> changes in the backing store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6305) Improve signaling of short lived containers

2017-03-08 Thread Shane Kumpf (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf reassigned YARN-6305:
-

Assignee: Shane Kumpf

> Improve signaling of short lived containers
> ---
>
> Key: YARN-6305
> URL: https://issues.apache.org/jira/browse/YARN-6305
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>
> Currently it is possible for containers to leak and remain in an exited state 
> if a docker container is not fully started before being killed. Depending on 
> the selected Docker storage driver, the lower bound on starting a container 
> can be as much as three seconds (using {{docker run}}). If an implicit image 
> pull occurs, this could be much longer.
> When a container is not fully started, the PID is not available yet. As a 
> result, {{ContainerLaunch#cleanUpContainer}} will not signal the container as 
> it relies on the PID. The PID is not required for docker client operations, 
> so allowing the signaling to occur anyway appears to be appropriate.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5818) Support the Docker Live Restore feature

2017-03-08 Thread Shane Kumpf (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf reassigned YARN-5818:
-

Assignee: Shane Kumpf

> Support the Docker Live Restore feature
> ---
>
> Key: YARN-5818
> URL: https://issues.apache.org/jira/browse/YARN-5818
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>
> Docker 1.12.x introduced the docker [Live 
> Restore|https://docs.docker.com/engine/admin/live-restore/] feature which 
> allows docker containers to survive docker daemon restarts/upgrades. Support 
> for this feature should be added to YARN to allow docker changes and upgrades 
> to be less impactful to existing containers.
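For reference, the daemon-side switch described in the linked documentation is 
enabled through the Docker daemon configuration (typically 
{{/etc/docker/daemon.json}}), e.g.:

{code}
{
  "live-restore": true
}
{code}

With that set, the daemon can be restarted or upgraded while running containers 
keep running, which is the behavior YARN would need to detect and tolerate.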



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6305) Improve signaling of short lived containers

2017-03-08 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901655#comment-15901655
 ] 

Shane Kumpf commented on YARN-6305:
---

I've been looking into this, so I'll take ownership and will put together a 
patch for discussion.

> Improve signaling of short lived containers
> ---
>
> Key: YARN-6305
> URL: https://issues.apache.org/jira/browse/YARN-6305
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Shane Kumpf
>
> Currently it is possible for containers to leak and remain in an exited state 
> if a docker container is not fully started before being killed. Depending on 
> the selected Docker storage driver, the lower bound on starting a container 
> can be as much as three seconds (using {{docker run}}). If an implicit image 
> pull occurs, this could be much longer.
> When a container is not fully started, the PID is not available yet. As a 
> result, {{ContainerLaunch#cleanUpContainer}} will not signal the container as 
> it relies on the PID. The PID is not required for docker client operations, 
> so allowing the signaling to occur anyway appears to be appropriate.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6305) Improve signaling of short lived containers

2017-03-08 Thread Shane Kumpf (JIRA)
Shane Kumpf created YARN-6305:
-

 Summary: Improve signaling of short lived containers
 Key: YARN-6305
 URL: https://issues.apache.org/jira/browse/YARN-6305
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: yarn
Reporter: Shane Kumpf


Currently it is possible for containers to leak and remain in an exited state 
if a docker container is not fully started before being killed. Depending on 
the selected Docker storage driver, the lower bound on starting a container can 
be as much as three seconds (using {{docker run}}). If an implicit image pull 
occurs, this could be much longer.

When a container is not fully started, the PID is not available yet. As a 
result, {{ContainerLaunch#cleanUpContainer}} will not signal the container as 
it relies on the PID. The PID is not required for docker client operations, so 
allowing the signaling to occur anyway appears to be appropriate.
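A minimal sketch of the idea, using a hypothetical helper (not the eventual 
patch): because the docker client addresses containers by name rather than by 
PID, a signal can be delivered even before a PID is known.

{code}
import java.io.IOException;

// Hypothetical helper, for illustration only.
public final class DockerSignalSketch {
  // Signal a docker container by name; this works even when no PID has been
  // recorded, since "docker kill" resolves the container via the docker daemon.
  public static int signalByName(String containerName, String signal)
      throws IOException, InterruptedException {
    return new ProcessBuilder("docker", "kill", "--signal=" + signal, containerName)
        .inheritIO()
        .start()
        .waitFor();
  }
}
{code}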



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6299) FairSharePolicy is incorrect when demand is less than min share

2017-03-08 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901645#comment-15901645
 ] 

Yufei Gu commented on YARN-6299:


It makes sense to me that an app is always needy when its demand is less than 
its min share. [~templedf], can you elaborate on your concern?
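To make the disputed case concrete, here is a toy example with made-up numbers 
that mirrors the comparison in the quoted snippet below (illustrative only):

{code}
public class MinShareNeedyExample {
  public static void main(String[] args) {
    long usage = 1024;     // MB the app is currently using
    long demand = 2048;    // MB the app is asking for in total
    long minShare = 4096;  // MB configured as the min share

    // FairSharePolicy caps min share at demand before the needy check.
    long effectiveMinShare = Math.min(minShare, demand);  // 2048
    boolean needy = usage < effectiveMinShare;            // true

    // The app is flagged needy only because resources it was already
    // assigned (counted in demand) have not started being used yet.
    System.out.println("needy = " + needy);
  }
}
{code}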

> FairSharePolicy is incorrect when demand is less than min share
> ---
>
> Key: YARN-6299
> URL: https://issues.apache.org/jira/browse/YARN-6299
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
>
> {code}
>   Resource resourceUsage1 = s1.getResourceUsage();
>   Resource resourceUsage2 = s2.getResourceUsage();
>   Resource minShare1 = Resources.min(RESOURCE_CALCULATOR, null,
>       s1.getMinShare(), s1.getDemand());
>   Resource minShare2 = Resources.min(RESOURCE_CALCULATOR, null,
>       s2.getMinShare(), s2.getDemand());
>   boolean s1Needy = Resources.lessThan(RESOURCE_CALCULATOR, null,
>       resourceUsage1, minShare1);
>   boolean s2Needy = Resources.lessThan(RESOURCE_CALCULATOR, null,
>       resourceUsage2, minShare2);
>   minShareRatio1 = (double) resourceUsage1.getMemorySize()
>       / Resources.max(RESOURCE_CALCULATOR, null, minShare1, ONE).getMemorySize();
>   minShareRatio2 = (double) resourceUsage2.getMemorySize()
>       / Resources.max(RESOURCE_CALCULATOR, null, minShare2, ONE).getMemorySize();
> {code}
> If demand is less than min share, then an app will be flagged as needy if it 
> has demand that is higher than its usage, which happens any time the app has 
> been assigned resources that it hasn't started using yet.  That sounds wrong 
> to me.  [~kasha], [~yufeigu]?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4266) Allow whitelisted users to disable user re-mapping/squashing when launching docker containers

2017-03-08 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901607#comment-15901607
 ] 

Eric Badger commented on YARN-4266:
---

Looks like I should have reloaded the page before commenting, as I didn't see 
[~shaneku...@gmail.com]'s comment. It's unfortunate that docker's namespace 
remapping doesn't work for multiple users. I guess a single-user namespace 
remapping would help with the problem, but would not solve all use cases. 

> Allow whitelisted users to disable user re-mapping/squashing when launching 
> docker containers
> -
>
> Key: YARN-4266
> URL: https://issues.apache.org/jira/browse/YARN-4266
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Zhankun Tang
> Attachments: YARN-4266.001.patch, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping.pdf, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v2.pdf, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v3.pdf, 
> YARN-4266-branch-2.8.001.patch
>
>
> Docker provides a mechanism (the --user switch) that enables us to specify 
> the user the container processes should run as. We use this mechanism today 
> when launching docker containers . In non-secure mode, we run the docker 
> container based on 
> `yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user` and in 
> secure mode, as the submitting user. However, this mechanism breaks down with 
> a large number of 'pre-created' images which don't necessarily have the users 
> available within the image. Examples of such images include shared images 
> that need to be used by multiple users. We need a way in which we can allow a 
> pre-defined set of users to run containers based on existing images, without 
> using the --user switch. There are some implications of disabling this user 
> squashing that we'll need to work through : log aggregation, artifact 
> deletion etc.,



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4236) Metric for aggregated resources allocation per queue

2017-03-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901600#comment-15901600
 ] 

Jason Lowe commented on YARN-4236:
--

Oops, spoke too soon.  Just before committing I noticed there's a place that was 
missed.  There are two QueueMetrics#allocateResources methods, and the patch 
only increments the new aggregate metrics in one of them.  The case that 
was missed handles the scenario where an existing container is changing its 
allocation size.  Arguably, an increase in container size should be 
treated like an allocation of the delta for aggregate calculation purposes.  
A decrease in size is more like a release of the delta, and 
aggregate calculations ignore releases.
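To sketch the intent (the class, field, and method names below are hypothetical, 
not the real QueueMetrics members): count only the positive delta when a 
container grows, and ignore shrinks, just as releases are ignored.

{code}
// Hypothetical sketch, not the actual QueueMetrics code.
public class AggregateResizeSketch {
  private long aggregateMemoryMBAllocated;
  private long aggregateVcoresAllocated;

  // Treat a container size increase as an allocation of the delta;
  // ignore decreases, just as plain releases are ignored.
  public void onContainerResize(long memBeforeMB, long memAfterMB,
                                int vcoresBefore, int vcoresAfter) {
    long memDelta = memAfterMB - memBeforeMB;
    int vcoreDelta = vcoresAfter - vcoresBefore;
    if (memDelta > 0) {
      aggregateMemoryMBAllocated += memDelta;
    }
    if (vcoreDelta > 0) {
      aggregateVcoresAllocated += vcoreDelta;
    }
  }
}
{code}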

> Metric for aggregated resources allocation per queue
> 
>
> Key: YARN-4236
> URL: https://issues.apache.org/jira/browse/YARN-4236
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, scheduler
>Reporter: Chang Li
>Assignee: Chang Li
>  Labels: oct16-medium
> Attachments: YARN-4236.2.patch, YARN-4236-3.patch, YARN-4236.patch
>
>
> We currently track allocated memory and allocated vcores per queue but we 
> don't have a good rate metric on how fast we're allocating these things. In 
> other words, a straight line in allocatedmb could equally be one extreme of 
> no new containers are being allocated or allocating a bunch of containers 
> where we free exactly what we allocate each time. Adding a resources 
> allocated per second per queue would give us a better insight into the rate 
> of resource churn on a queue. Based on this aggregated resource allocation 
> per queue we can easily have some tools to measure the rate of resource 
> allocation per queue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4236) Metric for aggregated resources allocation per queue

2017-03-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901575#comment-15901575
 ] 

Jason Lowe commented on YARN-4236:
--

+1 lgtm.  Committing this.

> Metric for aggregated resources allocation per queue
> 
>
> Key: YARN-4236
> URL: https://issues.apache.org/jira/browse/YARN-4236
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, scheduler
>Reporter: Chang Li
>Assignee: Chang Li
>  Labels: oct16-medium
> Attachments: YARN-4236.2.patch, YARN-4236-3.patch, YARN-4236.patch
>
>
> We currently track allocated memory and allocated vcores per queue but we 
> don't have a good rate metric on how fast we're allocating these things. In 
> other words, a straight line in allocatedmb could equally be one extreme of 
> no new containers are being allocated or allocating a bunch of containers 
> where we free exactly what we allocate each time. Adding a resources 
> allocated per second per queue would give us a better insight into the rate 
> of resource churn on a queue. Based on this aggregated resource allocation 
> per queue we can easily have some tools to measure the rate of resource 
> allocation per queue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4266) Allow whitelisted users to disable user re-mapping/squashing when launching docker containers

2017-03-08 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901562#comment-15901562
 ] 

Eric Badger commented on YARN-4266:
---

[~sidharta-s], I agree with your assessment. I don't see this "--user" 
workaround as the long-term solution, especially if the goal is to allow 
users to supply their own arbitrary, untrusted images. As others have 
identified previously in this jira, I believe that the real solution is to use 
[user namespace 
remapping|https://success.docker.com/KBase/Introduction_to_User_Namespaces_in_Docker_Engine],
 which was introduced in Docker 1.10. However, that requires a more recent 
kernel (3.10) than I think most of us are on, especially in production. 

So, until then, I think that allowing an arbitrary UID:GID (or even user:group) 
to enter the container will be sufficient (disabled by default, as you 
suggested). That said, running containers this way rests on the big 
assumption that the image is trusted and well-crafted, which is necessary 
until we figure out the user remapping issue, resolve security concerns, etc. 
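For context, Docker's user namespace remapping is a daemon-wide setting (a 
single remapped range shared by every container on the host), which is part of 
why it doesn't cover the multi-user case; it is typically enabled in 
{{/etc/docker/daemon.json}} along the lines of:

{code}
{
  "userns-remap": "default"
}
{code}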

> Allow whitelisted users to disable user re-mapping/squashing when launching 
> docker containers
> -
>
> Key: YARN-4266
> URL: https://issues.apache.org/jira/browse/YARN-4266
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Zhankun Tang
> Attachments: YARN-4266.001.patch, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping.pdf, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v2.pdf, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v3.pdf, 
> YARN-4266-branch-2.8.001.patch
>
>
> Docker provides a mechanism (the --user switch) that enables us to specify 
> the user the container processes should run as. We use this mechanism today 
> when launching docker containers . In non-secure mode, we run the docker 
> container based on 
> `yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user` and in 
> secure mode, as the submitting user. However, this mechanism breaks down with 
> a large number of 'pre-created' images which don't necessarily have the users 
> available within the image. Examples of such images include shared images 
> that need to be used by multiple users. We need a way in which we can allow a 
> pre-defined set of users to run containers based on existing images, without 
> using the --user switch. There are some implications of disabling this user 
> squashing that we'll need to work through : log aggregation, artifact 
> deletion etc.,



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2017-03-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901553#comment-15901553
 ] 

Jason Lowe commented on YARN-4051:
--

Thanks for updating the patch!  In the future, please don't delete patches and 
re-upload them with the same name.  It can lead to very confusing cases where 
Jenkins comments on a patch that happens to have the same name as one of the 
current attachments but isn't actually the patch that was tested.

The following code won't actually cause it to ignore the FINISH_APPS event.  
The {{continue}} in the for loop is degenerate, so all this does is log 
warnings but otherwise is semantically the same logic:
{code}
for (Container container : app.getContainers().values()) {
  if (container.isRecovering()) {
LOG.warn("drop FINISH_APPS event to " + appID + "because container "
+ container.getContainerId() + "is recovering");
continue;
  }
}
{code}

Also, this shouldn't be a warning since it's not actually wrong when this 
happens, correct?  Similarly, the warn log when ignoring the FINISH_CONTAINERS 
event seems like it should just be an info log at best.

I'm also wondering about the scenario where the kill event comes in from an 
AM rather than the RM.  If a container is still in the recovering state when we 
open up the client service for new requests, it seems a client (e.g. the AM) could 
come in and ask for a still-recovering container to be killed.  I think the 
container process will be orphaned if that occurs, since the NM will mistakenly 
believe the container has not been launched yet.
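For clarity, here is a sketch of what that check presumably intends to do, 
mirroring the snippet above rather than proposing final code:

{code}
// Sketch only: skip handling FINISH_APPS while any container is recovering.
boolean anyRecovering = false;
for (Container container : app.getContainers().values()) {
  if (container.isRecovering()) {
    LOG.info("Ignoring FINISH_APPS event for " + appID + " because container "
        + container.getContainerId() + " is still recovering");
    anyRecovering = true;
    break;
  }
}
if (anyRecovering) {
  return;  // skip processing this event for now
}
{code}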

> ContainerKillEvent is lost when container is  In New State and is recovering
> 
>
> Key: YARN-4051
> URL: https://issues.apache.org/jira/browse/YARN-4051
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: sandflee
>Assignee: sandflee
>Priority: Critical
> Attachments: YARN-4051.01.patch, YARN-4051.02.patch, 
> YARN-4051.03.patch, YARN-4051.04.patch, YARN-4051.05.patch, YARN-4051.06.patch
>
>
> As in YARN-4050, NM event dispatcher is blocked, and container is in New 
> state, when we finish application, the container still alive even after NM 
> event dispatcher is unblocked.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4266) Allow whitelisted users to disable user re-mapping/squashing when launching docker containers

2017-03-08 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901534#comment-15901534
 ] 

Shane Kumpf commented on YARN-4266:
---

I took another look at the progress being made on user namespaces in Docker, and 
as far as I can tell the story remains the same. I echo [~sidharta-s]: it just 
doesn't appear there is a solution here that will work for all container types. 
As [~templedf] pointed out, "Modifying the container is not a valid 
alternative to modifying the container", but we are limited in our options here. :)

As it appears the proposed solution will solve the problem for a class of 
container types, I'm +1 on adding the UID/usermod approach as an optional 
solution. Note that this solution won't help for official docker hub images 
such as postgres and apache without some sort of setuid wrapper, so we'll need 
to continue to discuss how we handle those.

I do believe that {{docker logs}} is worth exploring as a means of reducing or 
eliminating the writable bind-mounted directories. We could explore read-only 
mounts for the various caches. It seems the biggest hurdle there will be the 
secure tokens, but read-only may work here as well. Has anyone already thought 
about this far enough to have a story for the tokens? Should we open a new 
ticket to discuss this approach?

> Allow whitelisted users to disable user re-mapping/squashing when launching 
> docker containers
> -
>
> Key: YARN-4266
> URL: https://issues.apache.org/jira/browse/YARN-4266
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Zhankun Tang
> Attachments: YARN-4266.001.patch, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping.pdf, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v2.pdf, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v3.pdf, 
> YARN-4266-branch-2.8.001.patch
>
>
> Docker provides a mechanism (the --user switch) that enables us to specify 
> the user the container processes should run as. We use this mechanism today 
> when launching docker containers . In non-secure mode, we run the docker 
> container based on 
> `yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user` and in 
> secure mode, as the submitting user. However, this mechanism breaks down with 
> a large number of 'pre-created' images which don't necessarily have the users 
> available within the image. Examples of such images include shared images 
> that need to be used by multiple users. We need a way in which we can allow a 
> pre-defined set of users to run containers based on existing images, without 
> using the --user switch. There are some implications of disabling this user 
> squashing that we'll need to work through : log aggregation, artifact 
> deletion etc.,



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6280) Add a query parameter in ResourceManager Cluster Applications REST API to control whether or not returns ResourceRequest

2017-03-08 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901427#comment-15901427
 ] 

Sunil G commented on YARN-6280:
---

[~cltlfcjin]
Thanks for the patch. 

I think we need to keep the default behavior as it exists from 2.7 onwards. As 
[~rohithsharma] suggested, let's have {{hideResourceRequest}} as a query param 
instead of {{showResourceRequest}}. By default, we can keep hideResourceRequest 
as false, so we will always serve apps with ResourceRequests. If the query param 
hideResourceRequest is given as true, we will skip them in the reply. We can 
document this and advise using the filter when issues like this occur. 
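For illustration, assuming the parameter ends up being named 
{{hideResourceRequest}} as discussed (the final name may differ), an affected 
caller could switch to something like:

{code}
http://<rm http address:port>/ws/v1/cluster/apps?states=running,accepted&limit=2&hideResourceRequest=true
{code}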


> Add a query parameter in ResourceManager Cluster Applications REST API to 
> control whether or not returns ResourceRequest
> 
>
> Key: YARN-6280
> URL: https://issues.apache.org/jira/browse/YARN-6280
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager, restapi
>Affects Versions: 2.7.3
>Reporter: Lantao Jin
> Attachments: YARN-6280.001.patch, YARN-6280.002.patch
>
>
> Beginning with v2.7, the ResourceManager Cluster Applications REST API returns a 
> ResourceRequest list. It's a very large structure in AppInfo.
> As a test, we used the URI below to query only 2 results:
> http://<rm http address:port>/ws/v1/cluster/apps?states=running,accepted&limit=2
> The results are very different:
> ||Hadoop version||Total Characters||Total Words||Total Lines||Size||
> |2.4.1|1192|42|42|1.2 KB|
> |2.7.1|1222179|48740|48735|1.21 MB|
> Most RESTful API requesters don't know about this after upgrading, and their 
> old queries may cause the ResourceManager to spend more time in GC and respond 
> more slowly. Even if they do know, they have no way to reduce the impact on the 
> ResourceManager other than slowing down their query frequency.
> The patch adds a query parameter "showResourceRequests" to help requesters 
> who don't need this information reduce the overhead. For interface 
> compatibility, the default value is true if the parameter is not set, so the 
> behaviour is the same as now.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4236) Metric for aggregated resources allocation per queue

2017-03-08 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901377#comment-15901377
 ] 

Eric Badger commented on YARN-4236:
---

Thanks, [~lichangleo]! 

[~jlowe], could you review? 

> Metric for aggregated resources allocation per queue
> 
>
> Key: YARN-4236
> URL: https://issues.apache.org/jira/browse/YARN-4236
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, scheduler
>Reporter: Chang Li
>Assignee: Chang Li
>  Labels: oct16-medium
> Attachments: YARN-4236.2.patch, YARN-4236-3.patch, YARN-4236.patch
>
>
> We currently track allocated memory and allocated vcores per queue but we 
> don't have a good rate metric on how fast we're allocating these things. In 
> other words, a straight line in allocatedmb could equally be one extreme of 
> no new containers are being allocated or allocating a bunch of containers 
> where we free exactly what we allocate each time. Adding a resources 
> allocated per second per queue would give us a better insight into the rate 
> of resource churn on a queue. Based on this aggregated resource allocation 
> per queue we can easily have some tools to measure the rate of resource 
> allocation per queue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5956) Refactor ClientRMService

2017-03-08 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901299#comment-15901299
 ] 

Kai Sasaki commented on YARN-5956:
--

[~sunilg] Thanks. I updated the patch. The failed test seems to fail 
intermittently; I confirmed it passes locally.
Could you check it again?

> Refactor ClientRMService
> 
>
> Key: YARN-5956
> URL: https://issues.apache.org/jira/browse/YARN-5956
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0-alpha2
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>Priority: Minor
> Attachments: YARN-5956.01.patch, YARN-5956.02.patch, 
> YARN-5956.03.patch, YARN-5956.04.patch, YARN-5956.05.patch, 
> YARN-5956.06.patch, YARN-5956.07.patch, YARN-5956.08.patch, 
> YARN-5956.09.patch, YARN-5956.10.patch, YARN-5956.11.patch, 
> YARN-5956.12.patch, YARN-5956.13.patch
>
>
> Some refactoring can be done in {{ClientRMService}}.
> - Remove redundant variable declaration
> - Fill in missing javadocs
> - Proper variable access modifier
> - Fix some typos in method name and exception messages



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-349) Send out last-minute load averages in TaskTrackerStatus

2017-03-08 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J reassigned YARN-349:


Assignee: (was: Harsh J)

> Send out last-minute load averages in TaskTrackerStatus
> ---
>
> Key: YARN-349
> URL: https://issues.apache.org/jira/browse/YARN-349
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
> Attachments: mapreduce.loadaverage.r3.diff, 
> mapreduce.loadaverage.r4.diff, mapreduce.loadaverage.r5.diff, 
> mapreduce.loadaverage.r6.diff
>
>   Original Estimate: 20m
>  Remaining Estimate: 20m
>
> Load averages could be useful in scheduling. This patch looks to extend the 
> existing Linux resource plugin (via /proc/loadavg file) to allow transmitting 
> load averages of the last one minute via the TaskTrackerStatus.
> Patch is up for review, with test cases added, at: 
> https://reviews.apache.org/r/20/
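For reference, a minimal sketch of the OS-level read this relies on 
(illustrative, not the patch itself):

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Read the 1-minute load average from /proc/loadavg on Linux.
public class LoadAvgSketch {
  public static double oneMinuteLoadAverage() throws IOException {
    // /proc/loadavg looks like: "0.42 0.37 0.30 1/512 12345"
    String line = Files.readAllLines(Paths.get("/proc/loadavg")).get(0);
    return Double.parseDouble(line.split("\\s+")[0]);
  }

  public static void main(String[] args) throws IOException {
    System.out.println("1-min load average: " + oneMinuteLoadAverage());
  }
}
{code}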



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6300) NULL_UPDATE_REQUESTS is redundant in TestFairScheduler

2017-03-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901063#comment-15901063
 ] 

Hadoop QA commented on YARN-6300:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 0 new + 193 unchanged - 1 fixed = 193 total (was 194) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 42m 
57s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6300 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12856782/YARN-6300.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 84f2001d2eaa 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 1eb8186 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/15206/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/15206/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> NULL_UPDATE_REQUESTS is redundant in TestFairScheduler
> --
>
> Key: YARN-6300
> URL: https://issues.apache.org/jira/browse/YARN-6300
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
> 

[jira] [Updated] (YARN-6300) NULL_UPDATE_REQUESTS is redundant in TestFairScheduler

2017-03-08 Thread Yuanbo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuanbo Liu updated YARN-6300:
-
Attachment: YARN-6300.001.patch

Uploaded the v1 patch for this JIRA.

> NULL_UPDATE_REQUESTS is redundant in TestFairScheduler
> --
>
> Key: YARN-6300
> URL: https://issues.apache.org/jira/browse/YARN-6300
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
>Assignee: Yuanbo Liu
>Priority: Minor
>  Labels: newbie
> Attachments: YARN-6300.001.patch
>
>
> The {{TestFairScheduler.NULL_UPDATE_REQUESTS}} field hides 
> {{FairSchedulerTestBase.NULL_UPDATE_REQUESTS}}, which has the same value.  
> The {{NULL_UPDATE_REQUESTS}} field should be removed from 
> {{TestFairScheduler}}.
> While you're at it, maybe also remove the unused import.
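Since this is labeled newbie, here is a toy illustration of the field hiding in 
question (names abbreviated; these are not the real test classes):

{code}
// Toy example of field hiding, for illustration only.
class BaseTest {
  static final Object NULL_UPDATE_REQUESTS = null;
}

class ChildTest extends BaseTest {
  // Hides BaseTest.NULL_UPDATE_REQUESTS even though the value is identical;
  // deleting this field leaves behavior unchanged.
  static final Object NULL_UPDATE_REQUESTS = null;
}
{code}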



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6300) NULL_UPDATE_REQUESTS is redundant in TestFairScheduler

2017-03-08 Thread Yuanbo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuanbo Liu reassigned YARN-6300:


Assignee: Yuanbo Liu

> NULL_UPDATE_REQUESTS is redundant in TestFairScheduler
> --
>
> Key: YARN-6300
> URL: https://issues.apache.org/jira/browse/YARN-6300
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
>Assignee: Yuanbo Liu
>Priority: Minor
>  Labels: newbie
>
> The {{TestFairScheduler.NULL_UPDATE_REQUESTS}} field hides 
> {{FairSchedulerTestBase.NULL_UPDATE_REQUESTS}}, which has the same value.  
> The {{NULL_UPDATE_REQUESTS}} field should be removed from 
> {{TestFairScheduler}}.
> While you're at it, maybe also remove the unused import.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6207) Move application across queues should handle delayed event processing

2017-03-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900878#comment-15900878
 ] 

Hudson commented on YARN-6207:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11369 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11369/])
YARN-6207. Move application across queues should handle delayed event (sunilg: 
rev 1eb81867032b016a59662043cbae50daa52dafa9)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java


> Move application across queues should handle delayed event processing
> -
>
> Key: YARN-6207
> URL: https://issues.apache.org/jira/browse/YARN-6207
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: YARN-6207.001.patch, YARN-6207.002.patch, 
> YARN-6207.003.patch, YARN-6207.004.patch, YARN-6207.005.patch, 
> YARN-6207.006.patch, YARN-6207.007.patch, YARN-6207.008.patch
>
>
> *Steps to reproduce*
> 1. Submit an application and delay adding the attempt to the scheduler
> (simulate using a debug breakpoint at EventDispatcher for SchedulerEventDispatcher).
> 2. Call move application to the destination queue.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.preValidateMoveApplication(CapacityScheduler.java:2086)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.moveApplicationAcrossQueue(RMAppManager.java:669)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.moveApplicationAcrossQueues(ClientRMService.java:1231)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.moveApplicationAcrossQueues(ApplicationClientProtocolPBServiceImpl.java:388)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:537)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:867)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:813)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2659)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1483)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1429)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1339)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:115)
>   at com.sun.proxy.$Proxy7.moveApplicationAcrossQueues(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.moveApplicationAcrossQueues(ApplicationClientProtocolPBClientImpl.java:398)
>   ... 16 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org