[jira] [Commented] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler

2018-10-10 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16646003#comment-16646003
 ] 

Wilfred Spiegelenburg commented on YARN-8468:
-

[~cheersyang] I see the same failure on trunk in YARN-8865. I don't think it is 
specific to branch-3.1; it seems to be a more generic issue.

> Enable the use of queue based maximum container allocation limit and 
> implement it in FairScheduler
> --
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: YARN-8468-branch-3.1.018.patch, 
> YARN-8468-branch-3.1.019.patch, YARN-8468-branch-3.1.020.patch, 
> YARN-8468-branch-3.1.021.patch, YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, 
> YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch, 
> YARN-8468.017.patch, YARN-8468.018.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
> The goal of this ticket is to allow this value to be set on a per-queue basis.
> The use case: the user has two pools, one for ad hoc jobs and one for 
> enterprise apps. The user wants to limit ad hoc jobs to small containers but 
> allow enterprise apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb would keep providing the default maximum 
> container size for all queues, while the per-queue maximum would be set with 
> the “maxContainerResources” queue config value.
> Suggested solution:
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf); this will cover dynamically created queues.
>  * if we set it on the root queue we override the scheduler setting, so that 
> should not be allowed.
>  * make sure that the queue resource cap cannot be larger than the scheduler 
> max resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability(String queueName) in both 
> FSParentQueue and FSLeafQueue accordingly
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc. for the queue.
>  * enforce the queue based maximum allocation limit if it is set; otherwise 
> fall back to the general scheduler level setting (a sketch follows this 
> description)
>  ** use it during validation and normalization of requests in 
> scheduler.allocate, application submission and resource requests
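
To illustrate the last two bullets, below is a minimal sketch of the per-queue 
maximum with a fall back to the scheduler-wide limit. It is not the attached 
patch: the class, the map of per-queue limits and the example queue names are 
assumptions made purely for illustration.

{code:java}
// Hedged sketch only: the per-queue lookup is modelled with a plain map,
// not with the real FairScheduler/FSQueue wiring.
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.Resource;

public class QueueMaxAllocationSketch {
  // Per-queue maximum container allocation; a missing entry means "not configured".
  private final Map<String, Resource> queueMaxAllocation = new HashMap<>();
  // Stand-in for the yarn.scheduler.maximum-allocation-mb / -vcores limit.
  private final Resource schedulerMaxAllocation = Resource.newInstance(8192, 4);

  public Resource getMaximumResourceCapability(String queueName) {
    // The queue-level limit wins when it is set, otherwise fall back to the
    // scheduler-level limit.
    Resource queueMax = queueMaxAllocation.get(queueName);
    return queueMax != null ? queueMax : schedulerMaxAllocation;
  }

  public static void main(String[] args) {
    QueueMaxAllocationSketch s = new QueueMaxAllocationSketch();
    s.queueMaxAllocation.put("root.adhoc", Resource.newInstance(2048, 1));
    System.out.println(s.getMaximumResourceCapability("root.adhoc"));      // small ad hoc limit
    System.out.println(s.getMaximumResourceCapability("root.enterprise")); // scheduler default
  }
}
{code}

The same lookup-then-fallback rule is what request validation and normalization 
would apply in scheduler.allocate and at application submission.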






[jira] [Commented] (YARN-8865) RMStateStore contains large number of expired RMDelegationToken

2018-10-10 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645997#comment-16645997
 ] 

Wilfred Spiegelenburg commented on YARN-8865:
-

I added a second patch that follows the second approach by deleting the token 
in {{addPersistedDelegationToken()}}.
This makes it more of a HADOOP issue than a YARN issue. Please let me know if 
we should move the jira to the HADOOP project.

> RMStateStore contains large number of expired RMDelegationToken
> ---
>
> Key: YARN-8865
> URL: https://issues.apache.org/jira/browse/YARN-8865
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-8865.001.patch, YARN-8865.002.patch
>
>
> When the RM state store is restored, expired delegation tokens are restored 
> and added to the system. These expired tokens do not get cleaned up or 
> removed. The exact reason why the tokens are still in the store is not clear. 
> We have seen as many as 250,000 tokens in the store, some of which were 2 
> years old.
> This has two side effects:
> * for the ZooKeeper store this leads to a jute buffer exhaustion issue and 
> prevents the RM from becoming active.
> * restore takes longer than needed and heap usage is higher than it should be.
> We should not restore already expired tokens since they cannot be renewed or 
> used.






[jira] [Updated] (YARN-8865) RMStateStore contains large number of expired RMDelegationToken

2018-10-10 Thread Wilfred Spiegelenburg (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-8865:

Attachment: YARN-8865.002.patch

> RMStateStore contains large number of expired RMDelegationToken
> ---
>
> Key: YARN-8865
> URL: https://issues.apache.org/jira/browse/YARN-8865
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-8865.001.patch, YARN-8865.002.patch
>
>
> When the RM state store is restored, expired delegation tokens are restored 
> and added to the system. These expired tokens do not get cleaned up or 
> removed. The exact reason why the tokens are still in the store is not clear. 
> We have seen as many as 250,000 tokens in the store, some of which were 2 
> years old.
> This has two side effects:
> * for the ZooKeeper store this leads to a jute buffer exhaustion issue and 
> prevents the RM from becoming active.
> * restore takes longer than needed and heap usage is higher than it should be.
> We should not restore already expired tokens since they cannot be renewed or 
> used.






[jira] [Commented] (YARN-8627) EntityGroupFSTimelineStore hdfs done directory keeps on accumulating

2018-10-10 Thread Tarun Parimi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645983#comment-16645983
 ] 

Tarun Parimi commented on YARN-8627:


[~rohithsharma] Attached the domain log files for three such applications in 
[^app-domain-logs.zip]. All of these are Tez applications. I can see that the 
domainIds of the two domain files are slightly different: the one with the 
wrong directory structure has "Tez_ATS_application_1534014715392_830173_1", 
while the one in the proper directory has 
"Tez_ATS_application_1534014715392_830173".

The domainId "Tez_ATS_application_1534014715392_830173_1" seems to be created 
by 
{{org.apache.tez.dag.history.ats.acls.ATSV15HistoryACLPolicyManager#createDAGDomain}}
 and the other domain by 
{{org.apache.tez.dag.history.ats.acls.ATSV15HistoryACLPolicyManager#createSessionDomain}}.

I have no idea yet how the DAGDomain log gets written with the wrong directory 
structure. I will try to see if I can reproduce it locally based on this.

 

 

> EntityGroupFSTimelineStore hdfs done directory keeps on accumulating
> 
>
> Key: YARN-8627
> URL: https://issues.apache.org/jira/browse/YARN-8627
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.8.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-8627.001.patch, YARN-8627.002.patch, 
> app-domain-logs.zip
>
>
> The EntityLogCleaner thread exits with the following ERROR every time it 
> runs.
> {code:java}
> 2018-07-18 19:59:39,837 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18268
> 2018-07-18 19:59:39,844 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
> 2018-07-18 19:59:39,848 ERROR timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:run(899)) - Error cleaning files  
> java.io.FileNotFoundException: File 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
>  does not exist.  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1062)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1069)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1040)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1019)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1015)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusIterator(DistributedFileSystem.java:1015)
>   at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.shouldCleanAppLogDir(EntityGroupFSTimelineStore.java:480)
>  
> {code}
>  
> Each time the thread gets scheduled, a different folder hits the error. As a 
> result, the thread is not able to clean all of the old done directories, 
> since it stops after the first error.
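
As a side note, a minimal sketch of the kind of defensive handling this implies 
is shown below: keep iterating the application directories when one of them has 
already disappeared. It is only an illustration under that assumption, not the 
attached patch, and the emptiness check is a placeholder.

{code:java}
// Hedged illustration: survive a FileNotFoundException for a single
// application directory instead of aborting the whole cleaning run.
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class DoneDirCleanerSketch {
  public static void cleanAppDirs(FileSystem fs, Path doneDir) throws IOException {
    RemoteIterator<FileStatus> apps = fs.listStatusIterator(doneDir);
    while (apps.hasNext()) {
      Path appDir = apps.next().getPath();
      try {
        // Placeholder for the real "is this old enough to clean" check.
        if (!fs.listStatusIterator(appDir).hasNext()) {
          fs.delete(appDir, true);
        }
      } catch (FileNotFoundException e) {
        // The directory was removed concurrently; skip it and keep cleaning.
      }
    }
  }
}
{code}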






[jira] [Assigned] (YARN-8870) Add submarine installation scripts

2018-10-10 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan reassigned YARN-8870:


Assignee: Wangda Tan

> Add submarine installation scripts
> --
>
> Key: YARN-8870
> URL: https://issues.apache.org/jira/browse/YARN-8870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xun Liu
>Assignee: Wangda Tan
>Priority: Major
>
> In order to reduce the difficulty of deploying the Hadoop {Submarine} runtime 
> environment (DNS, Docker, GPU, network, graphics card, operating system 
> kernel modifications, and other components), I developed this installation 
> script. It provides one-click installation and can also be used to install, 
> uninstall, start, and stop individual components step by step.






[jira] [Assigned] (YARN-8870) Add submarine installation scripts

2018-10-10 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan reassigned YARN-8870:


Assignee: Xun Liu  (was: Wangda Tan)

> Add submarine installation scripts
> --
>
> Key: YARN-8870
> URL: https://issues.apache.org/jira/browse/YARN-8870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xun Liu
>Assignee: Xun Liu
>Priority: Major
>
> In order to reduce the difficulty of deploying the Hadoop {Submarine} runtime 
> environment (DNS, Docker, GPU, network, graphics card, operating system 
> kernel modifications, and other components), I developed this installation 
> script. It provides one-click installation and can also be used to install, 
> uninstall, start, and stop individual components step by step.






[jira] [Created] (YARN-8870) Add submarine installation scripts

2018-10-10 Thread Xun Liu (JIRA)
Xun Liu created YARN-8870:
-

 Summary: Add submarine installation scripts
 Key: YARN-8870
 URL: https://issues.apache.org/jira/browse/YARN-8870
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xun Liu


In order to reduce the difficulty of deploying the Hadoop {Submarine} runtime 
environment (DNS, Docker, GPU, network, graphics card, operating system kernel 
modifications, and other components), I developed this installation script. It 
provides one-click installation and can also be used to install, uninstall, 
start, and stop individual components step by step.






[jira] [Updated] (YARN-8627) EntityGroupFSTimelineStore hdfs done directory keeps on accumulating

2018-10-10 Thread Tarun Parimi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tarun Parimi updated YARN-8627:
---
Attachment: app-domain-logs.zip

> EntityGroupFSTimelineStore hdfs done directory keeps on accumulating
> 
>
> Key: YARN-8627
> URL: https://issues.apache.org/jira/browse/YARN-8627
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.8.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-8627.001.patch, YARN-8627.002.patch, 
> app-domain-logs.zip
>
>
> The EntityLogCleaner thread exits with the following ERROR every time it 
> runs.
> {code:java}
> 2018-07-18 19:59:39,837 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18268
> 2018-07-18 19:59:39,844 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
> 2018-07-18 19:59:39,848 ERROR timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:run(899)) - Error cleaning files  
> java.io.FileNotFoundException: File 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
>  does not exist.  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1062)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1069)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1040)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1019)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1015)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusIterator(DistributedFileSystem.java:1015)
>   at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.shouldCleanAppLogDir(EntityGroupFSTimelineStore.java:480)
>  
> {code}
>  
> Each time the thread gets scheduled, a different folder hits the error. As a 
> result, the thread is not able to clean all of the old done directories, 
> since it stops after the first error.






[jira] [Commented] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler

2018-10-10 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645973#comment-16645973
 ] 

Weiwei Yang commented on YARN-8468:
---

Still the same issue... it looks like it was not caused by the patch. 
[~leftnoteasy], [~sunilg], any idea how we can resolve this? It's likely a 
Jenkins issue on branch-3.1.

> Enable the use of queue based maximum container allocation limit and 
> implement it in FairScheduler
> --
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: YARN-8468-branch-3.1.018.patch, 
> YARN-8468-branch-3.1.019.patch, YARN-8468-branch-3.1.020.patch, 
> YARN-8468-branch-3.1.021.patch, YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, 
> YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch, 
> YARN-8468.017.patch, YARN-8468.018.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
> The goal of this ticket is to allow this value to be set on a per-queue basis.
> The use case: the user has two pools, one for ad hoc jobs and one for 
> enterprise apps. The user wants to limit ad hoc jobs to small containers but 
> allow enterprise apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb would keep providing the default maximum 
> container size for all queues, while the per-queue maximum would be set with 
> the “maxContainerResources” queue config value.
> Suggested solution:
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf); this will cover dynamically created queues.
>  * if we set it on the root queue we override the scheduler setting, so that 
> should not be allowed.
>  * make sure that the queue resource cap cannot be larger than the scheduler 
> max resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability(String queueName) in both 
> FSParentQueue and FSLeafQueue accordingly
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc. for the queue.
>  * enforce the queue based maximum allocation limit if it is set; otherwise 
> fall back to the general scheduler level setting
>  ** use it during validation and normalization of requests in 
> scheduler.allocate, application submission and resource requests






[jira] [Updated] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler

2018-10-10 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8468:
--
Fix Version/s: (was: 3.1.2)

> Enable the use of queue based maximum container allocation limit and 
> implement it in FairScheduler
> --
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: YARN-8468-branch-3.1.018.patch, 
> YARN-8468-branch-3.1.019.patch, YARN-8468-branch-3.1.020.patch, 
> YARN-8468-branch-3.1.021.patch, YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, 
> YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch, 
> YARN-8468.017.patch, YARN-8468.018.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
> The goal of this ticket is to allow this value to be set on a per-queue basis.
> The use case: the user has two pools, one for ad hoc jobs and one for 
> enterprise apps. The user wants to limit ad hoc jobs to small containers but 
> allow enterprise apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb would keep providing the default maximum 
> container size for all queues, while the per-queue maximum would be set with 
> the “maxContainerResources” queue config value.
> Suggested solution:
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf); this will cover dynamically created queues.
>  * if we set it on the root queue we override the scheduler setting, so that 
> should not be allowed.
>  * make sure that the queue resource cap cannot be larger than the scheduler 
> max resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability(String queueName) in both 
> FSParentQueue and FSLeafQueue accordingly
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc. for the queue.
>  * enforce the queue based maximum allocation limit if it is set; otherwise 
> fall back to the general scheduler level setting
>  ** use it during validation and normalization of requests in 
> scheduler.allocate, application submission and resource requests






[jira] [Commented] (YARN-8842) Update QueueMetrics with custom resource values

2018-10-10 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645960#comment-16645960
 ] 

Szilard Nemeth commented on YARN-8842:
--

Hi [~wilfreds]!
Thanks for coming back to this!
I think I addressed all your comments with this answer: 
https://issues.apache.org/jira/browse/YARN-8842?focusedCommentId=16639171&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16639171
If you don't think so, please list what is missing, because I believe all of 
them are addressed, except for the {{getMaxAllocationUtilization}} method, 
which I haven't touched; I would prefer to do that in YARN-8059, which will be 
done immediately after this gets merged.
Thanks!

> Update QueueMetrics with custom resource values 
> 
>
> Key: YARN-8842
> URL: https://issues.apache.org/jira/browse/YARN-8842
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-8842.001.patch, YARN-8842.002.patch, 
> YARN-8842.003.patch, YARN-8842.004.patch, YARN-8842.005.patch, 
> YARN-8842.006.patch, YARN-8842.007.patch, YARN-8842.008.patch
>
>
> This is the 2nd dependent jira of YARN-8059.
> As updating the metrics is a step independent of handling preemption, this 
> jira only deals with updating the queue metrics for custom resources.
> The following metrics should be updated (a sketch follows below): 
> * allocated resources
> * available resources
> * pending resources
> * reserved resources
> * aggregate seconds preempted
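
As a rough illustration of what "custom resource values" covers, the sketch 
below walks the ResourceInformation entries of a Resource and accumulates every 
non-mandatory resource into a per-name counter. The map-based counter is an 
assumption for illustration only, not the QueueMetrics implementation in the 
patches.

{code:java}
// Hedged sketch: track allocated values for custom resource types, i.e.
// everything beyond the mandatory memory-mb and vcores resources.
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceInformation;

public class CustomResourceMetricsSketch {
  private final Map<String, Long> allocatedCustomResources = new HashMap<>();

  public void incrAllocated(Resource allocated) {
    for (ResourceInformation ri : allocated.getResources()) {
      String name = ri.getName();
      // memory-mb and vcores are already tracked by the existing metrics.
      if (ResourceInformation.MEMORY_URI.equals(name)
          || ResourceInformation.VCORES_URI.equals(name)) {
        continue;
      }
      allocatedCustomResources.merge(name, ri.getValue(), Long::sum);
    }
  }

  public Map<String, Long> snapshot() {
    return new HashMap<>(allocatedCustomResources);
  }
}
{code}

The available, pending, reserved and preempted counters listed above would 
follow the same per-resource pattern.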






[jira] [Commented] (YARN-8666) [UI2] Remove application tab from Yarn Queue Page

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645959#comment-16645959
 ] 

Hadoop QA commented on YARN-8666:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
27m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  3s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 39m 26s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8666 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943349/YARN-8666.002.patch |
| Optional Tests |  dupname  asflicense  shadedclient  |
| uname | Linux 313a788cc377 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f261c31 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 413 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22147/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> [UI2] Remove application tab from Yarn Queue Page
> -
>
> Key: YARN-8666
> URL: https://issues.apache.org/jira/browse/YARN-8666
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Affects Versions: 3.1.1
>Reporter: Yesha Vora
>Assignee: Yesha Vora
>Priority: Major
> Attachments: Screen Shot 2018-08-14 at 3.43.18 PM.png, Screen Shot 
> 2018-09-06 at 12.50.14 PM.png, YARN-8666.001.patch, YARN-8666.002.patch
>
>
> The Yarn UI2 Queue page shows an Applications button. This button does not 
> redirect to any other page. In addition, the running-applications table is 
> already available on the same page.
> Thus, there is no need to have an Applications button on the Queue page.






[jira] [Updated] (YARN-8448) AM HTTPS Support

2018-10-10 Thread Robert Kanter (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-8448:

Attachment: YARN-8448.007.patch

> AM HTTPS Support
> 
>
> Key: YARN-8448
> URL: https://issues.apache.org/jira/browse/YARN-8448
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Major
> Attachments: YARN-8448.001.patch, YARN-8448.002.patch, 
> YARN-8448.003.patch, YARN-8448.004.patch, YARN-8448.005.patch, 
> YARN-8448.006.patch, YARN-8448.007.patch
>
>







[jira] [Commented] (YARN-8448) AM HTTPS Support

2018-10-10 Thread Robert Kanter (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645928#comment-16645928
 ] 

Robert Kanter commented on YARN-8448:
-

The 007 patch:
- Rebased on latest trunk
- Removed pom changes because they were taken care of by HADOOP-15832
- Renamed OFF, OPTIONAL, REQUIRED to NONE, LENIENT, and STRICT

> AM HTTPS Support
> 
>
> Key: YARN-8448
> URL: https://issues.apache.org/jira/browse/YARN-8448
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Major
> Attachments: YARN-8448.001.patch, YARN-8448.002.patch, 
> YARN-8448.003.patch, YARN-8448.004.patch, YARN-8448.005.patch, 
> YARN-8448.006.patch, YARN-8448.007.patch
>
>







[jira] [Updated] (YARN-8666) [UI2] Remove application tab from Yarn Queue Page

2018-10-10 Thread Yesha Vora (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yesha Vora updated YARN-8666:
-
Attachment: YARN-8666.002.patch

> [UI2] Remove application tab from Yarn Queue Page
> -
>
> Key: YARN-8666
> URL: https://issues.apache.org/jira/browse/YARN-8666
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Affects Versions: 3.1.1
>Reporter: Yesha Vora
>Assignee: Yesha Vora
>Priority: Major
> Attachments: Screen Shot 2018-08-14 at 3.43.18 PM.png, Screen Shot 
> 2018-09-06 at 12.50.14 PM.png, YARN-8666.001.patch, YARN-8666.002.patch
>
>
> The Yarn UI2 Queue page shows an Applications button. This button does not 
> redirect to any other page. In addition, the running-applications table is 
> already available on the same page.
> Thus, there is no need to have an Applications button on the Queue page.






[jira] [Commented] (YARN-8590) Fair scheduler promotion does not update container execution type and token

2018-10-10 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645896#comment-16645896
 ] 

Miklos Szegedi commented on YARN-8590:
--

+1. Thank you for the patch [~haibochen], and for the review [~zsiegl].

> Fair scheduler promotion does not update container execution type and token
> ---
>
> Key: YARN-8590
> URL: https://issues.apache.org/jira/browse/YARN-8590
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: YARN-1011
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-8590-YARN-1011.00.patch, 
> YARN-8590-YARN-1011.01.patch, YARN-8590-YARN-1011.02.patch
>
>
> Fair Scheduler promotion of opportunistic containers does not update 
> container execution type and token. This leads to incorrect resource 
> accounting when the promoted containers are released.






[jira] [Commented] (YARN-6989) Ensure timeline service v2 codebase gets UGI from HttpServletRequest in a consistent way

2018-10-10 Thread Abhishek Modi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645878#comment-16645878
 ] 

Abhishek Modi commented on YARN-6989:
-

Thanks [~vrushalic] for committing this.

> Ensure timeline service v2 codebase gets UGI from HttpServletRequest in a 
> consistent way
> 
>
> Key: YARN-6989
> URL: https://issues.apache.org/jira/browse/YARN-6989
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Vrushali C
>Assignee: Abhishek Modi
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.0.4, 3.1.2
>
> Attachments: YARN-6989.001.patch, YARN-6989.002.patch
>
>
> As noticed during discussions in YARN-6820, the web services in timeline 
> service v2 create the UGI from the user obtained by invoking getRemoteUser on 
> the HttpServletRequest.
> It would be better to use getUserPrincipal instead of invoking getRemoteUser 
> on the HttpServletRequest.
> Filing this jira to update the code.
> Per Java EE documentations for 6 and 7, the behavior around getRemoteUser and 
> getUserPrincipal is listed at:
> http://docs.oracle.com/javaee/6/tutorial/doc/gjiie.html#bncba
> https://docs.oracle.com/javaee/7/tutorial/security-webtier003.htm
> {code}
> getRemoteUser, which determines the user name with which the client 
> authenticated. The getRemoteUser method returns the name of the remote user 
> (the caller) associated by the container with the request. If no user has 
> been authenticated, this method returns null.
> getUserPrincipal, which determines the principal name of the current user and 
> returns a java.security.Principal object. If no user has been authenticated, 
> this method returns null. Calling the getName method on the Principal 
> returned by getUserPrincipal returns the name of the remote user.
> {code}
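
A minimal sketch of the change the description asks for is below, assuming a 
plain servlet request handler; the helper class and method names are made up 
and this is not the committed patch.

{code:java}
// Hedged sketch: derive the caller UGI from getUserPrincipal(), falling back
// to getRemoteUser() only when no principal is present.
import java.security.Principal;
import javax.servlet.http.HttpServletRequest;
import org.apache.hadoop.security.UserGroupInformation;

public final class CallerUgiSketch {
  private CallerUgiSketch() {}

  public static UserGroupInformation getCallerUgi(HttpServletRequest req) {
    Principal principal = req.getUserPrincipal();
    String user = (principal != null) ? principal.getName() : req.getRemoteUser();
    // Both methods return null for an unauthenticated request.
    return (user == null) ? null : UserGroupInformation.createRemoteUser(user);
  }
}
{code}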






[jira] [Commented] (YARN-8842) Update QueueMetrics with custom resource values

2018-10-10 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645868#comment-16645868
 ] 

Wilfred Spiegelenburg commented on YARN-8842:
-

The test failure is not related to this change, as I see it happening in 
different builds. Based on the comments on YARN-8468, Weiwei thinks it is 
related to that change.

Can you have a look at the comments I added? Even after version 8 they are 
still relevant.

> Update QueueMetrics with custom resource values 
> 
>
> Key: YARN-8842
> URL: https://issues.apache.org/jira/browse/YARN-8842
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-8842.001.patch, YARN-8842.002.patch, 
> YARN-8842.003.patch, YARN-8842.004.patch, YARN-8842.005.patch, 
> YARN-8842.006.patch, YARN-8842.007.patch, YARN-8842.008.patch
>
>
> This is the 2nd dependent jira of YARN-8059.
> As updating the metrics is a step independent of handling preemption, this 
> jira only deals with updating the queue metrics for custom resources.
> The following metrics should be updated: 
> * allocated resources
> * available resources
> * pending resources
> * reserved resources
> * aggregate seconds preempted






[jira] [Updated] (YARN-8869) YARN client might not work correctly with RM Rest API for Kerberos authentication

2018-10-10 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8869:
-
Priority: Blocker  (was: Major)

> YARN client might not work correctly with RM Rest API for Kerberos 
> authentication
> -
>
> Key: YARN-8869
> URL: https://issues.apache.org/jira/browse/YARN-8869
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 3.2.0, 3.1.1
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Blocker
> Attachments: YARN-8869.001.patch
>
>
> ApiServiceClient uses WebResource instead of Builder to pass the Kerberos 
> authorization header. This may not always work, because WebResource.header() 
> can bind the header to a new Builder instance in some conditions, so the 
> header is effectively dropped. This article explains the details: 
> https://juristr.com/blog/2015/05/jersey-webresource-ignores-headers/
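
A minimal sketch of the approach implied by the description is shown below for 
the Jersey 1.x client: accumulate all headers on a single WebResource.Builder 
rather than relying on WebResource.header(). The URL and token variables are 
placeholders, and this is not the attached patch.

{code:java}
// Hedged sketch: keep the Authorization header on one Builder so it is not
// lost when further request settings are applied.
import com.sun.jersey.api.client.Client;
import com.sun.jersey.api.client.ClientResponse;
import com.sun.jersey.api.client.WebResource;

public class KerberosHeaderSketch {
  public static ClientResponse get(String url, String negotiateToken) {
    Client client = Client.create();
    WebResource resource = client.resource(url);  // placeholder RM REST endpoint
    WebResource.Builder builder = resource.getRequestBuilder()
        .header("Authorization", "Negotiate " + negotiateToken)
        .accept("application/json");
    return builder.get(ClientResponse.class);
  }
}
{code}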






[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645847#comment-16645847
 ] 

Hadoop QA commented on YARN-8777:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
29m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 23s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
53s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8777 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/1294/YARN-8777.008.patch |
| Optional Tests |  dupname  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux d399cefb9576 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2bd000c |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22145/testReport/ |
| Max. process+thread count | 446 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22145/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch, 
> YARN-8777.003.patch, YARN-8777.004.patch, YARN-8777.005.patch, 
> YARN-8777.006.patch, YARN-8777.007.patch, YARN-8777.008.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to execute 
> a docker exec command against the running container.

[jira] [Commented] (YARN-8869) YARN client might not work correctly with RM Rest API for Kerberos authentication

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645836#comment-16645836
 ] 

Hadoop QA commented on YARN-8869:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 50s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 13s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api:
 The patch generated 1 new + 4 unchanged - 0 fixed = 5 total (was 4) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 29s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
44s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 51m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8869 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943330/YARN-8869.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 92f47b0d815c 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2bd000c |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/22146/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-api.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22146/testReport/ |
| Max. 

[jira] [Commented] (YARN-8865) RMStateStore contains large number of expired RMDelegationToken

2018-10-10 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645820#comment-16645820
 ] 

Wilfred Spiegelenburg commented on YARN-8865:
-

Thank you for the feedback [~jlowe] and [~daryn]. We're not sure why the tokens 
are not purged in the first place. We suspect that it has something to do with 
the token expiring while the RM is down.

This is a dump of one of the tokens (just the relevant data):
{code}
 sequence   6327582
 kind   RM_DELEGATION_TOKEN
 issuedate  2016-10-09 00:03:17,714+1100
 maxdate2016-10-16 00:03:17,714+1100
 masterkey  106
{code}

I checked the master keys that are in the store and we only have key IDs 877 to 
887. Based on Jason's comment I now understand why we are not removing them 
after the recovery. In the recovery steps we first recover the master keys; 
once that is done we recover the tokens. If a token has a master key that is 
currently not known, we just skip the token and "forget" about it. This happens 
in {{addPersistedDelegationToken()}}. We thus never clean tokens up if the 
master key is already gone.
That could also explain the issue:
* the expired-token removal thread runs and cleans up tokens (the problem 
tokens are not expired yet)
* the removal thread runs again; the master key is expired and gets removed, 
and the tokens also expire at that point, but their cleanup has not happened yet
* the RM goes down or fails over before the token removal thread cleans up the 
expired tokens

Recovery from that point on will ignore the tokens because the key has been 
removed. This should thus affect all state store types and not just the ZK one. 
Daryn, am I correct in saying that?
Logically the change should then be in the ADTSM method 
{{addPersistedDelegationToken()}}: remove the token when no key is found.
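
To make that proposal concrete, here is a hedged sketch of the recovery-time 
decision. The state-store interface and helper names are made up for 
illustration; it is not the real AbstractDelegationTokenSecretManager code nor 
the attached patch.

{code:java}
// Hedged sketch: during recovery, drop a persisted token whose master key is
// no longer known or that has already expired, instead of keeping it around.
import java.io.IOException;
import java.util.Set;
import org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier;

public class TokenRecoverySketch {
  /** Made-up abstraction over the RM state store operations. */
  public interface StateStoreOps {
    void removeToken(RMDelegationTokenIdentifier id) throws IOException;
    void addToken(RMDelegationTokenIdentifier id, long renewDate) throws IOException;
  }

  public static void recoverToken(RMDelegationTokenIdentifier id, long renewDate,
      Set<Integer> knownMasterKeyIds, StateStoreOps store) throws IOException {
    boolean expired = renewDate < System.currentTimeMillis();
    boolean keyMissing = !knownMasterKeyIds.contains(id.getMasterKeyId());
    if (expired || keyMissing) {
      // The token can no longer be renewed or verified: clean it out of the store.
      store.removeToken(id);
      return;
    }
    store.addToken(id, renewDate);  // normal recovery path
  }
}
{code}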

> RMStateStore contains large number of expired RMDelegationToken
> ---
>
> Key: YARN-8865
> URL: https://issues.apache.org/jira/browse/YARN-8865
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-8865.001.patch
>
>
> When the RM state store is restored, expired delegation tokens are restored 
> and added to the system. These expired tokens do not get cleaned up or 
> removed. The exact reason why the tokens are still in the store is not clear. 
> We have seen as many as 250,000 tokens in the store, some of which were 2 
> years old.
> This has two side effects:
> * for the ZooKeeper store this leads to a jute buffer exhaustion issue and 
> prevents the RM from becoming active.
> * restore takes longer than needed and heap usage is higher than it should be.
> We should not restore already expired tokens since they cannot be renewed or 
> used.






[jira] [Updated] (YARN-8869) YARN client might not work correctly with RM Rest API for Kerberos authentication

2018-10-10 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8869:

Description: ApiServiceClient uses WebResource instead of Builder to pass 
Kerberos authorization header.  This may not work sometimes, and it is because 
WebResource.header() could be bind mashed new Builder instance in some 
condition.  This article explained the details: 
https://juristr.com/blog/2015/05/jersey-webresource-ignores-headers/  (was: 
ApiServiceClient uses WebResource instead of Builder to pass Kerberos 
authorization header.  This may not work sometimes, and it is because 
WebResource.header() could be bind mashed build new Builder instance in some 
condition.  This article explained the details: 
https://juristr.com/blog/2015/05/jersey-webresource-ignores-headers/)

> YARN client might not work correctly with RM Rest API for Kerberos 
> authentication
> -
>
> Key: YARN-8869
> URL: https://issues.apache.org/jira/browse/YARN-8869
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 3.2.0, 3.1.1
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8869.001.patch
>
>
> ApiServiceClient uses WebResource instead of Builder to pass the Kerberos 
> authorization header. This may not always work, because WebResource.header() 
> can bind the header to a new Builder instance in some conditions, so the 
> header is effectively dropped. This article explains the details: 
> https://juristr.com/blog/2015/05/jersey-webresource-ignores-headers/






[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-10-10 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645775#comment-16645775
 ] 

Eric Yang commented on YARN-8777:
-

Patch 8 addressed [~billie.rinaldi]'s comments.

> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch, 
> YARN-8777.003.patch, YARN-8777.004.patch, YARN-8777.005.patch, 
> YARN-8777.006.patch, YARN-8777.007.patch, YARN-8777.008.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to execute 
> a docker exec command against the running container.
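
For readers who have not used it, a hedged illustration of what "execute a 
docker exec command against the running container" boils down to is shown 
below, driven from Java purely for illustration. It is not the 
container-executor C code; the container id and shell are placeholders.

{code:java}
// Hedged illustration: the interactive command ultimately run is of the form
// "docker exec -it <docker-container-id> /bin/bash".
import java.io.IOException;

public class DockerExecSketch {
  public static Process openShell(String dockerContainerId) throws IOException {
    ProcessBuilder pb = new ProcessBuilder(
        "docker", "exec", "-it", dockerContainerId, "/bin/bash");
    pb.inheritIO();  // wire the interactive session to the current terminal
    return pb.start();
  }

  public static void main(String[] args) throws Exception {
    // Placeholder id; in YARN this would come from the container being exec'ed into.
    openShell("container_e01_1534014715392_0001_01_000002").waitFor();
  }
}
{code}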






[jira] [Updated] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-10-10 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8777:

Attachment: YARN-8777.008.patch

> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch, 
> YARN-8777.003.patch, YARN-8777.004.patch, YARN-8777.005.patch, 
> YARN-8777.006.patch, YARN-8777.007.patch, YARN-8777.008.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to execute 
> a docker exec command against the running container.






[jira] [Commented] (YARN-8868) Set HTTPOnly attribute to Cookie

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645756#comment-16645756
 ] 

Hadoop QA commented on YARN-8868:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 31s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 44s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
34s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
47s{color} | {color:green} hadoop-yarn-server-web-proxy in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 83m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8868 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943319/YARN-8868.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 521b01eec62a 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 045069e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 

[jira] [Assigned] (YARN-8869) YARN client might not work correctly with RM Rest API for Kerberos authentication

2018-10-10 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reassigned YARN-8869:
---

Assignee: Eric Yang

> YARN client might not work correctly with RM Rest API for Kerberos 
> authentication
> -
>
> Key: YARN-8869
> URL: https://issues.apache.org/jira/browse/YARN-8869
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.1.1, 3.1.0, 3.2.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8869.001.patch
>
>
> ApiServiceClient uses WebResource instead of Builder to pass Kerberos 
> authorization header.  This may not work in some cases because 
> WebResource.header() builds a new Builder instance under some conditions, and 
> the header set on the original WebResource is then ignored.  This article 
> explains the details: 
> https://juristr.com/blog/2015/05/jersey-webresource-ignores-headers/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8869) YARN client might not work correctly with RM Rest API for Kerberos authentication

2018-10-10 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8869:

Attachment: YARN-8869.001.patch

> YARN client might not work correctly with RM Rest API for Kerberos 
> authentication
> -
>
> Key: YARN-8869
> URL: https://issues.apache.org/jira/browse/YARN-8869
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 3.2.0, 3.1.1
>Reporter: Eric Yang
>Priority: Major
> Attachments: YARN-8869.001.patch
>
>
> ApiServiceClient uses WebResource instead of Builder to pass Kerberos 
> authorization header.  This may not work in some cases because 
> WebResource.header() builds a new Builder instance under some conditions, and 
> the header set on the original WebResource is then ignored.  This article 
> explains the details: 
> https://juristr.com/blog/2015/05/jersey-webresource-ignores-headers/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8869) YARN client might not work correctly with RM Rest API for Kerberos authentication

2018-10-10 Thread Eric Yang (JIRA)
Eric Yang created YARN-8869:
---

 Summary: YARN client might not work correctly with RM Rest API for 
Kerberos authentication
 Key: YARN-8869
 URL: https://issues.apache.org/jira/browse/YARN-8869
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 3.1.1, 3.1.0, 3.2.0
Reporter: Eric Yang


ApiServiceClient uses WebResource instead of Builder to pass Kerberos 
authorization header.  This may not work in some cases because 
WebResource.header() builds a new Builder instance under some conditions, and 
the header set on the original WebResource is then ignored.  This article 
explains the details: 
https://juristr.com/blog/2015/05/jersey-webresource-ignores-headers/
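
A minimal sketch of the difference, assuming the Jersey 1.x client API; the URL 
and token value are placeholders and this is not the actual ApiServiceClient code.

{code:java}
import com.sun.jersey.api.client.Client;
import com.sun.jersey.api.client.WebResource;

public class HeaderExample {
  public static void main(String[] args) {
    Client client = Client.create();
    WebResource resource = client.resource("http://rm-host:8088/app/v1/services");

    // Problematic pattern: header() returns a new Builder, so issuing the request
    // on the original WebResource afterwards sends it WITHOUT the header.
    resource.header("Authorization", "Negotiate <token>");
    // String lost = resource.get(String.class);   // header silently dropped

    // Safer pattern: keep the Builder returned by header() and invoke on it.
    String response = resource
        .header("Authorization", "Negotiate <token>")
        .get(String.class);
    System.out.println(response);
  }
}
{code}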



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3854) Add localization support for docker images

2018-10-10 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645737#comment-16645737
 ] 

Chandni Singh commented on YARN-3854:
-

[~ebadger] I thought more about the issue below
{quote}
I'm worried that the image deletion is going to be quite rough and error-prone. 
Docker containers sometimes stick around in the Exited or Dead states or keep 
running beyond when they should. That will cause the delete to perpetually fail 
since the image is still in use, even though it isn't really in use, at least 
not in a useful sense. Do we just keep going around and around and trying to 
delete the same images even if we keep failing?
{quote}
If the NM maintains an LRU cache of the images it has localized and only 
deletes the ones it localized, then the following could be a possible solution:
- The image metadata contains a count of running containers. This count is 
updated when a container is launched.
- The NM already has a service which periodically checks the statuses of the 
containers. If a container is stopped or failed, the image metadata is updated.
- When the image is evicted and the count of running containers is 0, it can be 
force deleted.
With this solution, we rely only on the NM's view of containers and images.
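
A rough, self-contained Java sketch of such a reference-counted LRU cache; this 
is not the actual NM code, and the class and method names are made up for 
illustration.

{code:java}
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Access-ordered image cache that only evicts images with no running containers. */
public class ImageLruCache {
  private final int maxImages;
  private final LinkedHashMap<String, AtomicInteger> images =
      new LinkedHashMap<>(16, 0.75f, true); // access order gives LRU iteration order

  public ImageLruCache(int maxImages) {
    this.maxImages = maxImages;
  }

  /** Called when a container using the image is launched. */
  public synchronized void containerStarted(String image) {
    images.computeIfAbsent(image, k -> new AtomicInteger()).incrementAndGet();
  }

  /** Called by the NM's container status service when a container stops or fails. */
  public synchronized void containerStopped(String image) {
    AtomicInteger count = images.get(image);
    if (count != null) {
      count.decrementAndGet();
    }
  }

  /** Evict least recently used images whose running-container count is zero. */
  public synchronized void evictIfNeeded() {
    Iterator<Map.Entry<String, AtomicInteger>> it = images.entrySet().iterator();
    while (images.size() > maxImages && it.hasNext()) {
      Map.Entry<String, AtomicInteger> e = it.next();
      if (e.getValue().get() == 0) {
        System.out.println("force deleting unused image " + e.getKey());
        it.remove(); // a real implementation would issue "docker rmi -f <image>" here
      }
    }
  }
}
{code}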

> Add localization support for docker images
> --
>
> Key: YARN-3854
> URL: https://issues.apache.org/jira/browse/YARN-3854
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Attachments: Localization Support For Docker Images.pdf, 
> YARN-3854-branch-2.8.001.patch, 
> YARN-3854_Localization_support_for_Docker_image_v1.pdf, 
> YARN-3854_Localization_support_for_Docker_image_v2.pdf, 
> YARN-3854_Localization_support_for_Docker_image_v3.pdf
>
>
> We need the ability to localize docker images when those images aren't 
> already available locally. There are various approaches that could be used 
> here with different trade-offs/issues : image archives on HDFS + docker load 
> ,  docker pull during the localization phase or (automatic) docker pull 
> during the run/launch phase. 
> We also need the ability to clean-up old/stale, unused images. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8855) Application fails if one of the subclusters is down.

2018-10-10 Thread Botong Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645703#comment-16645703
 ] 

Botong Huang commented on YARN-8855:


Hi [~bibinchundatt], yes, YARN-7652 can be the reason. There are other 
possibilities as well, for example YARN-8581, depending on the config setup. 

If a sub-cluster is gone for longer than a certain time, the SubclusterCleaner in 
GPG (YARN-6648) will mark the sub-cluster as LOST in the StateStore. AMRMProxy 
will eventually pick that up. 

> Application fails if one of the subclusters is down.
> 
>
> Key: YARN-8855
> URL: https://issues.apache.org/jira/browse/YARN-8855
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: federation
>Reporter: Rahul Anand
>Priority: Major
>
> If one of the sub-clusters is down, the application keeps retrying multiple times 
> and then fails. About 30 failover attempts were found in the logs. Below is the 
> detailed exception. 
> {code:java}
> 2018-10-08 14:21:21,245 | INFO | NM ContainerManager dispatcher | Container 
> container_e03_1538297667953_0005_01_01 transitioned from 
> CONTAINER_CLEANEDUP_AFTER_KILL to DONE | ContainerImpl.java:2093
> 2018-10-08 14:21:21,245 | INFO | NM ContainerManager dispatcher | Removing 
> container_e03_1538297667953_0005_01_01 from application 
> application_1538297667953_0005 | ApplicationImpl.java:512
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Stopping 
> resource-monitoring for container_e03_1538297667953_0005_01_01 | 
> ContainersMonitorImpl.java:932
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Considering 
> container container_e03_1538297667953_0005_01_01 for log-aggregation | 
> AppLogAggregatorImpl.java:538
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Got event 
> CONTAINER_STOP for appId application_1538297667953_0005 | AuxServices.java:350
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Stopping 
> container container_e03_1538297667953_0005_01_01 | 
> YarnShuffleService.java:295
> 2018-10-08 14:21:21,247 | WARN | NM Event dispatcher | couldn't find 
> container container_e03_1538297667953_0005_01_01 while processing 
> FINISH_CONTAINERS event | ContainerManagerImpl.java:1660
> 2018-10-08 14:21:22,248 | INFO | Node Status Updater | Removed completed 
> containers from NM context: [container_e03_1538297667953_0005_01_01] | 
> NodeStatusUpdaterImpl.java:696
> 2018-10-08 14:21:26,734 | INFO | pool-16-thread-1 | Failing over to the 
> ResourceManager for SubClusterId: cluster2 | 
> FederationRMFailoverProxyProvider.java:124
> 2018-10-08 14:21:26,735 | INFO | pool-16-thread-1 | Flushing subClusters from 
> cache and rehydrating from store, most likely on account of RM failover. | 
> FederationStateStoreFacade.java:258
> 2018-10-08 14:21:26,738 | INFO | pool-16-thread-1 | Connecting to 
> /192.168.0.25:8032 subClusterId cluster2 with protocol 
> ApplicationClientProtocol as user root (auth:SIMPLE) | 
> FederationRMFailoverProxyProvider.java:145
> 2018-10-08 14:21:26,741 | INFO | pool-16-thread-1 | 
> java.net.ConnectException: Call From node-core-jIKcN/192.168.0.64 to 
> node-master1-IYTxR:8032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see: 
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
> ApplicationClientProtocolPBClientImpl.submitApplication over cluster2 after 
> 28 failover attempts. Trying to failover after sleeping for 15261ms. | 
> RetryInvocationHandler.java:411
> 2018-10-08 14:21:42,002 | INFO | pool-16-thread-1 | Failing over to the 
> ResourceManager for SubClusterId: cluster2 | 
> FederationRMFailoverProxyProvider.java:124
> 2018-10-08 14:21:42,003 | INFO | pool-16-thread-1 | Flushing subClusters from 
> cache and rehydrating from store, most likely on account of RM failover. | 
> FederationStateStoreFacade.java:258
> 2018-10-08 14:21:42,005 | INFO | pool-16-thread-1 | Connecting to 
> /192.168.0.25:8032 subClusterId cluster2 with protocol 
> ApplicationClientProtocol as user root (auth:SIMPLE) | 
> FederationRMFailoverProxyProvider.java:145
> 2018-10-08 14:21:42,007 | INFO | pool-16-thread-1 | 
> java.net.ConnectException: Call From node-core-jIKcN/192.168.0.64 to 
> node-master1-IYTxR:8032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see: 
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
> ApplicationClientProtocolPBClientImpl.submitApplication over cluster2 after 
> 29 failover attempts. Trying to failover after sleeping for 21175ms. | 
> RetryInvocationHandler.java:411
> 2018-10-08 14:22:03,183 | INFO | pool-16-thread-1 | Failing over to the 
> ResourceManager for SubClusterId: cluster2 | 
> 

[jira] [Commented] (YARN-8868) Set HTTPOnly attribute to Cookie

2018-10-10 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645685#comment-16645685
 ] 

Chandni Singh commented on YARN-8868:
-

[~sunilg] could you please help review the patch?

> Set HTTPOnly attribute to Cookie
> 
>
> Key: YARN-8868
> URL: https://issues.apache.org/jira/browse/YARN-8868
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8868.001.patch
>
>
> 1. The program creates a cookie in Dispatcher.java at line 182, 185 and 199, 
> but fails to set the HttpOnly flag to true.
> 2. The program creates a cookie in WebAppProxyServlet.java at line 141 and 
> 388, but fails to set the HttpOnly flag to true.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8868) Set HTTPOnly attribute to Cookie

2018-10-10 Thread Chandni Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8868:

Attachment: YARN-8868.001.patch

> Set HTTPOnly attribute to Cookie
> 
>
> Key: YARN-8868
> URL: https://issues.apache.org/jira/browse/YARN-8868
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8868.001.patch
>
>
> 1. The program creates a cookie in Dispatcher.java at line 182, 185 and 199, 
> but fails to set the HttpOnly flag to true.
> 2. The program creates a cookie in WebAppProxyServlet.java at line 141 and 
> 388, but fails to set the HttpOnly flag to true.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6989) Ensure timeline service v2 codebase gets UGI from HttpServletRequest in a consistent way

2018-10-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645645#comment-16645645
 ] 

Hudson commented on YARN-6989:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15173 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15173/])
YARN-6989 Ensure timeline service v2 codebase gets UGI from (vrushali: rev 
045069efeca07674be1571252bc4c685aa57b440)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/reader/security/TimelineReaderWhitelistAuthorizationFilter.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/reader/TimelineReaderWebServicesUtils.java


> Ensure timeline service v2 codebase gets UGI from HttpServletRequest in a 
> consistent way
> 
>
> Key: YARN-6989
> URL: https://issues.apache.org/jira/browse/YARN-6989
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Vrushali C
>Assignee: Abhishek Modi
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.0.4, 3.1.2
>
> Attachments: YARN-6989.001.patch, YARN-6989.002.patch
>
>
> As noticed during discussions in YARN-6820, the webservices in timeline 
> service v2 get the UGI created from the user obtained by invoking 
> getRemoteUser on the HttpServletRequest . 
> It will be good to use getUserPrincipal instead of invoking getRemoteUser on 
> the HttpServletRequest. 
> Filing jira to update the code. 
> Per Java EE documentations for 6 and 7, the behavior around getRemoteUser and 
> getUserPrincipal is listed at:
> http://docs.oracle.com/javaee/6/tutorial/doc/gjiie.html#bncba
> https://docs.oracle.com/javaee/7/tutorial/security-webtier003.htm
> {code}
> getRemoteUser, which determines the user name with which the client 
> authenticated. The getRemoteUser method returns the name of the remote user 
> (the caller) associated by the container with the request. If no user has 
> been authenticated, this method returns null.
> getUserPrincipal, which determines the principal name of the current user and 
> returns a java.security.Principal object. If no user has been authenticated, 
> this method returns null. Calling the getName method on the Principal 
> returned by getUserPrincipal returns the name of the remote user.
> {code}
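
For reference, a minimal sketch of resolving the caller via getUserPrincipal as 
described above; it uses only the standard Servlet API and is not the actual 
TimelineReaderWebServicesUtils change. The class name is illustrative.

{code:java}
import java.security.Principal;
import javax.servlet.http.HttpServletRequest;

public final class RequestUserUtil {
  private RequestUserUtil() {
  }

  /**
   * Resolve the caller's name from the request, preferring getUserPrincipal().
   * Returns null if nobody has been authenticated.
   */
  public static String getCallerName(HttpServletRequest request) {
    Principal principal = request.getUserPrincipal();
    if (principal != null) {
      return principal.getName(); // same name getRemoteUser() would report
    }
    return null;
  }
}
{code}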



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8868) Set HTTPOnly attribute to Cookie

2018-10-10 Thread Chandni Singh (JIRA)
Chandni Singh created YARN-8868:
---

 Summary: Set HTTPOnly attribute to Cookie
 Key: YARN-8868
 URL: https://issues.apache.org/jira/browse/YARN-8868
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Chandni Singh
Assignee: Chandni Singh


1. The program creates a cookie in Dispatcher.java at line 182, 185 and 199, 
but fails to set the HttpOnly flag to true.

2. The program creates a cookie in WebAppProxyServlet.java at line 141 and 388, 
but fails to set the HttpOnly flag to true.
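
A minimal sketch of setting the flag, assuming Servlet 3.0 or later; the helper 
class below is illustrative only and is not the actual Dispatcher or 
WebAppProxyServlet patch.

{code:java}
import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletResponse;

public final class CookieUtil {
  private CookieUtil() {
  }

  /** Create a cookie with the HttpOnly flag set so client-side scripts cannot read it. */
  public static void addHttpOnlyCookie(HttpServletResponse response,
                                       String name, String value) {
    Cookie cookie = new Cookie(name, value);
    cookie.setHttpOnly(true); // available since Servlet 3.0
    response.addCookie(cookie);
  }
}
{code}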



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6989) Ensure timeline service v2 codebase gets UGI from HttpServletRequest in a consistent way

2018-10-10 Thread Vrushali C (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-6989:
-
Fix Version/s: 3.1.2
   3.0.4
   3.2.0
   2.10.0

> Ensure timeline service v2 codebase gets UGI from HttpServletRequest in a 
> consistent way
> 
>
> Key: YARN-6989
> URL: https://issues.apache.org/jira/browse/YARN-6989
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Vrushali C
>Assignee: Abhishek Modi
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.0.4, 3.1.2
>
> Attachments: YARN-6989.001.patch, YARN-6989.002.patch
>
>
> As noticed during discussions in YARN-6820, the webservices in timeline 
> service v2 get the UGI created from the user obtained by invoking 
> getRemoteUser on the HttpServletRequest . 
> It will be good to use getUserPrincipal instead of invoking getRemoteUser on 
> the HttpServletRequest. 
> Filing jira to update the code. 
> Per Java EE documentations for 6 and 7, the behavior around getRemoteUser and 
> getUserPrincipal is listed at:
> http://docs.oracle.com/javaee/6/tutorial/doc/gjiie.html#bncba
> https://docs.oracle.com/javaee/7/tutorial/security-webtier003.htm
> {code}
> getRemoteUser, which determines the user name with which the client 
> authenticated. The getRemoteUser method returns the name of the remote user 
> (the caller) associated by the container with the request. If no user has 
> been authenticated, this method returns null.
> getUserPrincipal, which determines the principal name of the current user and 
> returns a java.security.Principal object. If no user has been authenticated, 
> this method returns null. Calling the getName method on the Principal 
> returned by getUserPrincipal returns the name of the remote user.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6989) Ensure timeline service v2 codebase gets UGI from HttpServletRequest in a consistent way

2018-10-10 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645629#comment-16645629
 ] 

Vrushali C commented on YARN-6989:
--

Committed to trunk as part of 
https://github.com/apache/hadoop/commit/045069efeca07674be1571252bc4c685aa57b440

Committed to branch-2 as part of

https://github.com/apache/hadoop/commit/7a5d27dde4abab8db35895b77d0ce93bfe99c8a1


> Ensure timeline service v2 codebase gets UGI from HttpServletRequest in a 
> consistent way
> 
>
> Key: YARN-6989
> URL: https://issues.apache.org/jira/browse/YARN-6989
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Vrushali C
>Assignee: Abhishek Modi
>Priority: Major
> Attachments: YARN-6989.001.patch, YARN-6989.002.patch
>
>
> As noticed during discussions in YARN-6820, the webservices in timeline 
> service v2 get the UGI created from the user obtained by invoking 
> getRemoteUser on the HttpServletRequest . 
> It will be good to use getUserPrincipal instead of invoking getRemoteUser on 
> the HttpServletRequest. 
> Filing jira to update the code. 
> Per Java EE documentations for 6 and 7, the behavior around getRemoteUser and 
> getUserPrincipal is listed at:
> http://docs.oracle.com/javaee/6/tutorial/doc/gjiie.html#bncba
> https://docs.oracle.com/javaee/7/tutorial/security-webtier003.htm
> {code}
> getRemoteUser, which determines the user name with which the client 
> authenticated. The getRemoteUser method returns the name of the remote user 
> (the caller) associated by the container with the request. If no user has 
> been authenticated, this method returns null.
> getUserPrincipal, which determines the principal name of the current user and 
> returns a java.security.Principal object. If no user has been authenticated, 
> this method returns null. Calling the getName method on the Principal 
> returned by getUserPrincipal returns the name of the remote user.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8861) executorLock is misleading in ContainerLaunch

2018-10-10 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645592#comment-16645592
 ] 

Chandni Singh commented on YARN-8861:
-

The patch just renames a variable, so I have not added a unit test. 

[~jlowe] could you please take a look.

> executorLock is misleading in ContainerLaunch
> -
>
> Key: YARN-8861
> URL: https://issues.apache.org/jira/browse/YARN-8861
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Trivial
>  Labels: docker
> Attachments: YARN-8861.001.patch
>
>
> The name of the variable {{executorLock}} is confusing. Rename it to 
> {{launchLock}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645581#comment-16645581
 ] 

Hadoop QA commented on YARN-7225:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m  
2s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
39s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} branch-2.8 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 14s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 2 new + 99 unchanged - 1 fixed = 101 total (was 100) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 33s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
22s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}115m 20s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestLeaderElectorService |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ae3769f |
| JIRA Issue | YARN-7225 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943287/YARN-7225.branch-2.8.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f853a086da0d 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2.8 / 23150c2 |
| maven | version: Apache Maven 3.0.5 |
| Default Java | 1.7.0_181 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/22143/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/22143/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22143/testReport/ |
| asflicense | 
https://builds.apache.org/job/PreCommit-YARN-Build/22143/artifact/out/patch-asflicense-problems.txt
 |
| Max. process+thread count | 697 (vs. ulimit of 1) |
| modules | C: 

[jira] [Commented] (YARN-3854) Add localization support for docker images

2018-10-10 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1664#comment-1664
 ] 

Chandni Singh commented on YARN-3854:
-

{quote}It would be nice to have an "untouchable" list though. Basically images 
that won't get deleted even if they would qualify to be deleted using LRU.
{quote}
+1
{quote}It would also be nice to be able to specify images to delete cluster 
wide with some sort of API. Just put the image into the LRU list regardless of 
where it would have been.
{quote}
I agree this is useful; however, to me this seems out of scope for this 
effort. A way to do this is to add an API to the RM which will then ask all the 
NMs to put the images in the LRU list. This will require changes on the RM side 
and in the API layer between the RM and NMs.
 IMO it would be a good enhancement to add later.
{quote}I'm worried that the image deletion is going to be quite rough and 
error-prone. Docker containers sometimes stick around in the Exited or Dead 
states or keep running beyond when they should. That will cause the delete to 
perpetually fail since the image is still in use, even though it isn't really 
in use, at least not in a useful sense. Do we just keep going around and around 
and trying to delete the same images even if we keep failing?
{quote}
I can think of a few things to make image deletion a little more robust:
 # Find all the existing containers for the image.
 # If there is a way to identify the Exited or Dead state of a container, and all 
the containers are in that state, then forcefully delete the image.
 # If that is not possible, or there are containers that keep running, retry the 
delete a configured number of times. Once the configured number of attempts has 
failed, raise an alert (a rough sketch of this retry follows at the end of this 
comment).

{quote}This isn't true on the whole. {{docker run}} only does a {{docker pull}} 
if the image doesn't exist locally.
{quote}
That's right. I will make it explicit in the doc that a {{docker pull}} is only 
done when the image doesn't exist locally, and that this is what takes longer. 
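
A rough sketch of the bounded retry mentioned above; all names are illustrative 
and the actual delete command is left abstract behind a supplier.

{code:java}
import java.util.function.Supplier;

/** Bounded retry for force-deleting an image, alerting after the last failure. */
public class BoundedImageDeleter {
  private final int maxAttempts;

  public BoundedImageDeleter(int maxAttempts) {
    this.maxAttempts = maxAttempts;
  }

  /**
   * @param deleteAttempt returns true when the underlying delete (e.g. "docker rmi") succeeded
   * @return true if the image was deleted within the configured number of attempts
   */
  public boolean deleteWithRetries(String image, Supplier<Boolean> deleteAttempt) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (deleteAttempt.get()) {
        return true;
      }
      System.out.println("delete of " + image + " failed, attempt " + attempt
          + " of " + maxAttempts);
    }
    // Configured attempts exhausted: raise an alert instead of retrying forever.
    System.err.println("ALERT: could not delete image " + image
        + " after " + maxAttempts + " attempts");
    return false;
  }
}
{code}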

> Add localization support for docker images
> --
>
> Key: YARN-3854
> URL: https://issues.apache.org/jira/browse/YARN-3854
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Attachments: Localization Support For Docker Images.pdf, 
> YARN-3854-branch-2.8.001.patch, 
> YARN-3854_Localization_support_for_Docker_image_v1.pdf, 
> YARN-3854_Localization_support_for_Docker_image_v2.pdf, 
> YARN-3854_Localization_support_for_Docker_image_v3.pdf
>
>
> We need the ability to localize docker images when those images aren't 
> already available locally. There are various approaches that could be used 
> here with different trade-offs/issues : image archives on HDFS + docker load 
> ,  docker pull during the localization phase or (automatic) docker pull 
> during the run/launch phase. 
> We also need the ability to clean-up old/stale, unused images. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8865) RMStateStore contains large number of expired RMDelegationToken

2018-10-10 Thread Daryn Sharp (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645520#comment-16645520
 ] 

Daryn Sharp commented on YARN-8865:
---

The RMDelegationTokenSecretManager is an AbstractDelegationTokenSecretManager.  
The ADTSM uses a thread to periodically roll secret keys and purge expired 
tokens.  We checked some clusters that use the leveldb state store and they are 
not leaking tokens, which implies the problem is likely specific to the 
ZKRMStateStore.

Given that it's the ADTSM's job to expunge expired tokens, no state store impl 
should be burdened with duplicated code to explicitly purge tokens just 
because one state store impl is buggy.

> RMStateStore contains large number of expired RMDelegationToken
> ---
>
> Key: YARN-8865
> URL: https://issues.apache.org/jira/browse/YARN-8865
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-8865.001.patch
>
>
> When the RM state store is restored, expired delegation tokens are restored 
> and added to the system. These expired tokens do not get cleaned up or 
> removed. The exact reason why the tokens are still in the store is not clear. 
> We have seen as many as 250,000 tokens in the store, some of which were 2 
> years old.
> This has two side effects:
> * for the zookeeper store this leads to a jute buffer exhaustion issue and 
> prevents the RM from becoming active.
> * restore takes longer than needed and heap usage is higher than it should be
> We should not restore already expired tokens since they cannot be renewed or 
> used.
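
A minimal sketch of skipping expired tokens during restore; the token record 
below is a placeholder, not the real YARN delegation token classes, and the 
field names are made up for illustration.

{code:java}
import java.util.ArrayList;
import java.util.List;

/** Drop already-expired delegation tokens while restoring persisted state. */
public class TokenRestoreSketch {

  /** Minimal placeholder for the persisted token data; only the expiry matters here. */
  static class StoredToken {
    final String id;
    final long renewDateMillis; // absolute time after which the token is expired

    StoredToken(String id, long renewDateMillis) {
      this.id = id;
      this.renewDateMillis = renewDateMillis;
    }
  }

  static List<StoredToken> filterExpired(List<StoredToken> persisted, long nowMillis) {
    List<StoredToken> live = new ArrayList<>();
    for (StoredToken token : persisted) {
      if (token.renewDateMillis > nowMillis) {
        live.add(token); // still valid, restore it
      } else {
        // Expired: skip instead of loading it back into the secret manager.
        System.out.println("skipping expired token " + token.id);
      }
    }
    return live;
  }
}
{code}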



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8861) executorLock is misleading in ContainerLaunch

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645513#comment-16645513
 ] 

Hadoop QA commented on YARN-8861:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  1s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m  
7s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m  3s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8861 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943283/YARN-8861.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 957e97bb6877 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bf3d591 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22142/testReport/ |
| Max. process+thread count | 412 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22142/console |
| Powered by | Apache Yetus 0.8.0   

[jira] [Commented] (YARN-3854) Add localization support for docker images

2018-10-10 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645496#comment-16645496
 ] 

Eric Badger commented on YARN-3854:
---

I definitely prefer the LRU strategy as opposed to {{docker prune}}. It would 
be nice to have an "untouchable" list though. Basically images that won't get 
deleted even if they would qualify to be deleted using LRU. 

It would also be nice to be able to specify images to delete cluster wide with 
some sort of API. Just put the image into the LRU list regardless of where it 
would have been. 

I'm worried that the image deletion is going to be quite rough and error-prone. 
Docker containers sometimes stick around in the {{Exited}} or {{Dead}} states 
or keep running beyond when they should. That will cause the delete to 
perpetually fail since the image is still in use, even though it isn't _really_ 
in use, at least not in a useful sense. Do we just keep going around and around 
and trying to delete the same images even if we keep failing? 

{quote}
Since `docker run` implicitly does a `docker pull`, this takes a long time and 
the container will be down for a longer time.
{quote}
This isn't true on the whole. {{docker run}} only does a {{docker pull}} if the 
image doesn't exist locally.

> Add localization support for docker images
> --
>
> Key: YARN-3854
> URL: https://issues.apache.org/jira/browse/YARN-3854
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Attachments: Localization Support For Docker Images.pdf, 
> YARN-3854-branch-2.8.001.patch, 
> YARN-3854_Localization_support_for_Docker_image_v1.pdf, 
> YARN-3854_Localization_support_for_Docker_image_v2.pdf, 
> YARN-3854_Localization_support_for_Docker_image_v3.pdf
>
>
> We need the ability to localize docker images when those images aren't 
> already available locally. There are various approaches that could be used 
> here with different trade-offs/issues : image archives on HDFS + docker load 
> ,  docker pull during the localization phase or (automatic) docker pull 
> during the run/launch phase. 
> We also need the ability to clean-up old/stale, unused images. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8867) Retrieve the progress of resource localization

2018-10-10 Thread Chandni Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8867:

Description: 
Refer YARN-3854.

Currently NM does not have an API to retrieve the progress of localization. 
Unless the client can know when the localization of a resource is complete 
irrespective of the type of the resource, it cannot take any appropriate 
action. 

We need an API in {{ContainerManagementProtocol}} to retrieve the progress on 
the localization. 


  was:Refer YARN-3854.


> Retrieve the progress of resource localization
> --
>
> Key: YARN-8867
> URL: https://issues.apache.org/jira/browse/YARN-8867
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>
> Refer YARN-3854.
> Currently NM does not have an API to retrieve the progress of localization. 
> Unless the client can know when the localization of a resource is complete 
> irrespective of the type of the resource, it cannot take any appropriate 
> action. 
> We need an API in {{ContainerManagementProtocol}} to retrieve the progress on 
> the localization. 
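
A purely hypothetical sketch of what such a progress query could look like; the 
interface, enum, and method names below are made up for illustration and are not 
the actual ContainerManagementProtocol API, which would use YARN protobuf 
records instead of plain Java types.

{code:java}
import java.util.Map;

/** Hypothetical shape of a localization-progress query exposed by the NM. */
public interface LocalizationProgressQuery {

  enum ResourceState { PENDING, DOWNLOADING, LOCALIZED, FAILED }

  /** Returns the current state of every resource requested for the given container. */
  Map<String, ResourceState> getLocalizationStatuses(String containerId);
}
{code}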



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8867) Retrieve the progress of resource localization

2018-10-10 Thread Chandni Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8867:

Issue Type: Sub-task  (was: Bug)
Parent: YARN-8472

> Retrieve the progress of resource localization
> --
>
> Key: YARN-8867
> URL: https://issues.apache.org/jira/browse/YARN-8867
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>
> Refer YARN-3854.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8867) Retrieve the progress of resource localization

2018-10-10 Thread Chandni Singh (JIRA)
Chandni Singh created YARN-8867:
---

 Summary: Retrieve the progress of resource localization
 Key: YARN-8867
 URL: https://issues.apache.org/jira/browse/YARN-8867
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Reporter: Chandni Singh
Assignee: Chandni Singh


Refer YARN-3854.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log

2018-10-10 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645402#comment-16645402
 ] 

Eric Payne commented on YARN-7225:
--

I re-uploaded the branch-2.8 patch to kick off the 2.8 pre-commit build again. 
The build issues from the first run look like environment failures.

> Add queue and partition info to RM audit log
> 
>
> Key: YARN-7225
> URL: https://issues.apache.org/jira/browse/YARN-7225
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1
>Reporter: Jonathan Hung
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7225.001.patch, YARN-7225.002.patch, 
> YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.branch-2.8.001.patch
>
>
> Right now RM audit log has fields such as user, ip, resource, etc. Having 
> queue and partition  is useful for resource tracking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7225) Add queue and partition info to RM audit log

2018-10-10 Thread Eric Payne (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-7225:
-
Attachment: YARN-7225.branch-2.8.001.patch

> Add queue and partition info to RM audit log
> 
>
> Key: YARN-7225
> URL: https://issues.apache.org/jira/browse/YARN-7225
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1
>Reporter: Jonathan Hung
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7225.001.patch, YARN-7225.002.patch, 
> YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.branch-2.8.001.patch
>
>
> Right now RM audit log has fields such as user, ip, resource, etc. Having 
> queue and partition  is useful for resource tracking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7225) Add queue and partition info to RM audit log

2018-10-10 Thread Eric Payne (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-7225:
-
Attachment: (was: YARN-7225.branch-2.8.001.patch)

> Add queue and partition info to RM audit log
> 
>
> Key: YARN-7225
> URL: https://issues.apache.org/jira/browse/YARN-7225
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1
>Reporter: Jonathan Hung
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7225.001.patch, YARN-7225.002.patch, 
> YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.branch-2.8.001.patch
>
>
> Right now RM audit log has fields such as user, ip, resource, etc. Having 
> queue and partition  is useful for resource tracking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8861) executorLock is misleading in ContainerLaunch

2018-10-10 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645377#comment-16645377
 ] 

Chandni Singh edited comment on YARN-8861 at 10/10/18 6:22 PM:
---

renamed {{containerExecLock}} to {{launchLock}} in patch 1


was (Author: csingh):
renamed {{containerExecLock}} to {{launchLock}}.

> executorLock is misleading in ContainerLaunch
> -
>
> Key: YARN-8861
> URL: https://issues.apache.org/jira/browse/YARN-8861
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Trivial
>  Labels: docker
> Attachments: YARN-8861.001.patch
>
>
> The name of the variable {{executorLock}} is confusing. Rename it to 
> {{launchLock}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8861) executorLock is misleading in ContainerLaunch

2018-10-10 Thread Chandni Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8861:

Attachment: YARN-8861.001.patch

> executorLock is misleading in ContainerLaunch
> -
>
> Key: YARN-8861
> URL: https://issues.apache.org/jira/browse/YARN-8861
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Trivial
>  Labels: docker
> Attachments: YARN-8861.001.patch
>
>
> The name of the variable {{executorLock}} is confusing. Rename it to 
> {{launchLock}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8862) [GPG] add Yarn Registry cleanup in ApplicationCleaner

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645371#comment-16645371
 ] 

Hadoop QA commented on YARN-8862:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-7402 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m  
2s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
58s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
35s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
30s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  9s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
16s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} YARN-7402 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
29s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
56s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
37s{color} | {color:green} hadoop-yarn-server-globalpolicygenerator in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 88m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8862 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943271/YARN-8862-YARN-7402.v3.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 83ba93d9bce9 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | YARN-7402 / 3671dc3 |
| maven | version: 

[jira] [Commented] (YARN-7644) NM gets backed up deleting docker containers

2018-10-10 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645338#comment-16645338
 ] 

Chandni Singh commented on YARN-7644:
-

Thanks [~jlowe], [~ebadger] & [~eyang]

> NM gets backed up deleting docker containers
> 
>
> Key: YARN-7644
> URL: https://issues.apache.org/jira/browse/YARN-7644
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Eric Badger
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Fix For: 3.2.0
>
> Attachments: YARN-7644.001.patch, YARN-7644.002.patch, 
> YARN-7644.003.patch, YARN-7644.004.patch, YARN-7644.005.patch, 
> YARN-7644.006.patch
>
>
> We are sending a {{docker stop}} to the docker container with a timeout of 10 
> seconds when we shut down a container. If the container does not stop after 
> 10 seconds then we force kill it. However, the {{docker stop}} command is a 
> blocking call. So in cases where lots of containers don't go down with the 
> initial SIGTERM, we have to wait 10+ seconds for the {{docker stop}} to 
> return. This ties up the ContainerLaunch handler and so these kill events 
> back up. It also appears to be backing up new container launches as well. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-10 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645324#comment-16645324
 ] 

Eric Yang commented on YARN-8569:
-

{quote}If you will do #1, IIUC, the #2 has to be done, otherwise as a normal 
user we cannot read from nmPrivate dir. {quote}

Lines 2632-2638 of container-executor do exactly that: they open the file as the 
node manager user, and privileges are already dropped while writing the file.  I 
do not disagree with what you said, but it is already part of the patch.

{quote}We already limit to read file under .../nmPrivate/app../sys/fs/ 
correct? How it is possible to read token file from that directory?{quote}

App.json is written to nmPrivate/[appid]/app.json; there is no sub-directory 
nmPrivate/[appid]/sysfs.  The sysfs directory only exists in the distributed 
cache location.  Good security practice is to give generated paths a hard-coded 
depth and a fixed filename to prevent path exploits.  If we allow the API to 
specify an arbitrary filename, a user can easily specify 
"sysfs/../container_tokens" from the REST API to defeat the security measures 
that were put in place.  Readlink can flatten the path, but there is no real 
indicator that the source can be copied.  For now, I can add a source path check 
to make sure it is jailed in the nmPrivate/[appid]/sysfs path.  However, the 
first version of the API will not accept arbitrary filenames, to minimize 
security concerns.  It would be easy to extend the REST API method to support 
arbitrary filenames when more thought is given.  I leave that responsibility to 
whoever wants to bear that problem.
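
To make the jailing idea concrete, here is a minimal, hypothetical sketch (the 
directory layout, class name and method name are assumptions, not the actual 
patch) of how a caller-supplied file name could be normalized and rejected when 
it escapes the per-application sysfs directory:

{code:java}
// Hypothetical sketch only: normalize the requested name and verify it stays
// under nmPrivate/<appid>/sysfs before reading or copying anything.
import java.nio.file.Path;
import java.nio.file.Paths;

public final class SysfsPathCheck {
  static Path resolveInsideSysfs(String nmPrivateAppDir, String requestedName) {
    Path sysfsRoot = Paths.get(nmPrivateAppDir, "sysfs").toAbsolutePath().normalize();
    Path candidate = sysfsRoot.resolve(requestedName).normalize();
    if (!candidate.startsWith(sysfsRoot)) {
      // Traversal such as "../container_tokens" lands outside the jail and is rejected.
      throw new IllegalArgumentException("Path escapes sysfs jail: " + requestedName);
    }
    return candidate;
  }

  public static void main(String[] args) {
    // Prints /tmp/nmPrivate/app_1/sysfs/app.json; "../container_tokens" would throw.
    System.out.println(resolveInsideSysfs("/tmp/nmPrivate/app_1", "app.json"));
  }
}
{code}

The real check in container-executor would of course live in C; the sketch only 
illustrates the normalize-then-prefix-check idea.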

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch
>
>
> Some programs require container hostnames to be known for the application to 
> run.  For example, distributed TensorFlow requires a launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell or the YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with the YARN flex command.  This is the classic pain point for application 
> developers attempting to automate system environment settings as parameters to 
> the end user application.
> It would be great if the YARN Docker integration could provide a simple option 
> to expose the hostnames of the yarn service via a mounted file.  The file 
> content gets updated when a flex command is performed.  This allows application 
> developers to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in the distributed cache and allowing the file to be mounted 
> via container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645323#comment-16645323
 ] 

Hadoop QA commented on YARN-7225:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 33s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 3 new + 80 unchanged - 0 fixed = 83 total (was 80) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m 32s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}139m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-7225 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943256/YARN-7225.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8a829acc1f7e 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 90552b1 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/22140/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/22140/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 

[jira] [Commented] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645297#comment-16645297
 ] 

Hadoop QA commented on YARN-8468:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 
54s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 15 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
34s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
47s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
36s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
19s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} branch-3.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
23s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 33s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 10 new + 856 unchanged - 19 fixed = 866 total (was 875) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 58s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 50s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}171m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:a607c02 |
| JIRA Issue | YARN-8468 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943242/YARN-8468-branch-3.1.021.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  

[jira] [Updated] (YARN-8855) Application fails if one of the sublcluster is down.

2018-10-10 Thread Bibin A Chundatt (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-8855:
---
Component/s: federation

> Application fails if one of the sublcluster is down.
> 
>
> Key: YARN-8855
> URL: https://issues.apache.org/jira/browse/YARN-8855
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: federation
>Reporter: Rahul Anand
>Priority: Major
>
> If one of the sub-clusters is down, the application keeps retrying multiple 
> times and then it fails. About 30 failover attempts were found in the logs. 
> Below is the detailed exception. 
> {code:java}
> 2018-10-08 14:21:21,245 | INFO | NM ContainerManager dispatcher | Container 
> container_e03_1538297667953_0005_01_01 transitioned from 
> CONTAINER_CLEANEDUP_AFTER_KILL to DONE | ContainerImpl.java:2093
> 2018-10-08 14:21:21,245 | INFO | NM ContainerManager dispatcher | Removing 
> container_e03_1538297667953_0005_01_01 from application 
> application_1538297667953_0005 | ApplicationImpl.java:512
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Stopping 
> resource-monitoring for container_e03_1538297667953_0005_01_01 | 
> ContainersMonitorImpl.java:932
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Considering 
> container container_e03_1538297667953_0005_01_01 for log-aggregation | 
> AppLogAggregatorImpl.java:538
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Got event 
> CONTAINER_STOP for appId application_1538297667953_0005 | AuxServices.java:350
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Stopping 
> container container_e03_1538297667953_0005_01_01 | 
> YarnShuffleService.java:295
> 2018-10-08 14:21:21,247 | WARN | NM Event dispatcher | couldn't find 
> container container_e03_1538297667953_0005_01_01 while processing 
> FINISH_CONTAINERS event | ContainerManagerImpl.java:1660
> 2018-10-08 14:21:22,248 | INFO | Node Status Updater | Removed completed 
> containers from NM context: [container_e03_1538297667953_0005_01_01] | 
> NodeStatusUpdaterImpl.java:696
> 2018-10-08 14:21:26,734 | INFO | pool-16-thread-1 | Failing over to the 
> ResourceManager for SubClusterId: cluster2 | 
> FederationRMFailoverProxyProvider.java:124
> 2018-10-08 14:21:26,735 | INFO | pool-16-thread-1 | Flushing subClusters from 
> cache and rehydrating from store, most likely on account of RM failover. | 
> FederationStateStoreFacade.java:258
> 2018-10-08 14:21:26,738 | INFO | pool-16-thread-1 | Connecting to 
> /192.168.0.25:8032 subClusterId cluster2 with protocol 
> ApplicationClientProtocol as user root (auth:SIMPLE) | 
> FederationRMFailoverProxyProvider.java:145
> 2018-10-08 14:21:26,741 | INFO | pool-16-thread-1 | 
> java.net.ConnectException: Call From node-core-jIKcN/192.168.0.64 to 
> node-master1-IYTxR:8032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see: 
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
> ApplicationClientProtocolPBClientImpl.submitApplication over cluster2 after 
> 28 failover attempts. Trying to failover after sleeping for 15261ms. | 
> RetryInvocationHandler.java:411
> 2018-10-08 14:21:42,002 | INFO | pool-16-thread-1 | Failing over to the 
> ResourceManager for SubClusterId: cluster2 | 
> FederationRMFailoverProxyProvider.java:124
> 2018-10-08 14:21:42,003 | INFO | pool-16-thread-1 | Flushing subClusters from 
> cache and rehydrating from store, most likely on account of RM failover. | 
> FederationStateStoreFacade.java:258
> 2018-10-08 14:21:42,005 | INFO | pool-16-thread-1 | Connecting to 
> /192.168.0.25:8032 subClusterId cluster2 with protocol 
> ApplicationClientProtocol as user root (auth:SIMPLE) | 
> FederationRMFailoverProxyProvider.java:145
> 2018-10-08 14:21:42,007 | INFO | pool-16-thread-1 | 
> java.net.ConnectException: Call From node-core-jIKcN/192.168.0.64 to 
> node-master1-IYTxR:8032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see: 
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
> ApplicationClientProtocolPBClientImpl.submitApplication over cluster2 after 
> 29 failover attempts. Trying to failover after sleeping for 21175ms. | 
> RetryInvocationHandler.java:411
> 2018-10-08 14:22:03,183 | INFO | pool-16-thread-1 | Failing over to the 
> ResourceManager for SubClusterId: cluster2 | 
> FederationRMFailoverProxyProvider.java:124
> 2018-10-08 14:22:03,183 | INFO | pool-16-thread-1 | Flushing subClusters from 
> cache and rehydrating from store, most likely on account of RM failover. | 
> FederationStateStoreFacade.java:258
> 2018-10-08 14:22:03,186 | INFO | pool-16-thread-1 | Connecting to 
> /192.168.0.25:8032 subClusterId cluster2 

[jira] [Commented] (YARN-8855) Application fails if one of the sublcluster is down.

2018-10-10 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645291#comment-16645291
 ] 

Bibin A Chundatt commented on YARN-8855:


[~botong] adding one observation.

When {{FederationStateStoreFacade}} -> {{FederationStateStore}} filters out 
active subclusters, the state store depends on the {{info.getState().isActive()}} 
check, which only compares whether the state is {{SC_RUNNING}}. In the case of a 
proper shutdown of the RM this check seems fine.

But in the case of a machine failure or a process kill, the state could still be 
RUNNING while the RMs are unavailable.

Could this result in unwanted retry/connection attempts to failed clusters?
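
As a purely illustrative sketch of the observation above (the types, enum values 
and the heartbeat-age field are assumptions, not the real FederationStateStore 
API), a liveness check could combine the stored state with the age of the last 
heartbeat instead of relying on the RUNNING state alone:

{code:java}
// Hypothetical sketch: treat a sub-cluster as usable only if its stored state is
// RUNNING *and* its last heartbeat is recent, so a crashed RM whose state was
// never updated is still filtered out.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public final class SubClusterLivenessFilter {
  enum State { SC_RUNNING, SC_LOST }

  static final class SubCluster {
    final String id;
    final State state;
    final long lastHeartbeatMillis;
    SubCluster(String id, State state, long lastHeartbeatMillis) {
      this.id = id; this.state = state; this.lastHeartbeatMillis = lastHeartbeatMillis;
    }
    @Override public String toString() { return id; }
  }

  static List<SubCluster> usable(List<SubCluster> all, long now, long maxHeartbeatAge) {
    List<SubCluster> result = new ArrayList<>();
    for (SubCluster sc : all) {
      // RUNNING alone is not enough: a killed RM may never have updated its state.
      if (sc.state == State.SC_RUNNING && now - sc.lastHeartbeatMillis <= maxHeartbeatAge) {
        result.add(sc);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    List<SubCluster> all = Arrays.asList(
        new SubCluster("cluster1", State.SC_RUNNING, now - 10_000L),   // recent heartbeat
        new SubCluster("cluster2", State.SC_RUNNING, now - 600_000L)); // stale: RM likely died
    System.out.println(usable(all, now, 120_000L)); // prints [cluster1]
  }
}
{code}

Whether the heartbeat age should come from the membership heartbeat or somewhere 
else is exactly the open question above.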

> Application fails if one of the sublcluster is down.
> 
>
> Key: YARN-8855
> URL: https://issues.apache.org/jira/browse/YARN-8855
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rahul Anand
>Priority: Major
>
> If one of the sub-clusters is down, the application keeps retrying multiple 
> times and then it fails. About 30 failover attempts were found in the logs. 
> Below is the detailed exception. 
> {code:java}
> 2018-10-08 14:21:21,245 | INFO | NM ContainerManager dispatcher | Container 
> container_e03_1538297667953_0005_01_01 transitioned from 
> CONTAINER_CLEANEDUP_AFTER_KILL to DONE | ContainerImpl.java:2093
> 2018-10-08 14:21:21,245 | INFO | NM ContainerManager dispatcher | Removing 
> container_e03_1538297667953_0005_01_01 from application 
> application_1538297667953_0005 | ApplicationImpl.java:512
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Stopping 
> resource-monitoring for container_e03_1538297667953_0005_01_01 | 
> ContainersMonitorImpl.java:932
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Considering 
> container container_e03_1538297667953_0005_01_01 for log-aggregation | 
> AppLogAggregatorImpl.java:538
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Got event 
> CONTAINER_STOP for appId application_1538297667953_0005 | AuxServices.java:350
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Stopping 
> container container_e03_1538297667953_0005_01_01 | 
> YarnShuffleService.java:295
> 2018-10-08 14:21:21,247 | WARN | NM Event dispatcher | couldn't find 
> container container_e03_1538297667953_0005_01_01 while processing 
> FINISH_CONTAINERS event | ContainerManagerImpl.java:1660
> 2018-10-08 14:21:22,248 | INFO | Node Status Updater | Removed completed 
> containers from NM context: [container_e03_1538297667953_0005_01_01] | 
> NodeStatusUpdaterImpl.java:696
> 2018-10-08 14:21:26,734 | INFO | pool-16-thread-1 | Failing over to the 
> ResourceManager for SubClusterId: cluster2 | 
> FederationRMFailoverProxyProvider.java:124
> 2018-10-08 14:21:26,735 | INFO | pool-16-thread-1 | Flushing subClusters from 
> cache and rehydrating from store, most likely on account of RM failover. | 
> FederationStateStoreFacade.java:258
> 2018-10-08 14:21:26,738 | INFO | pool-16-thread-1 | Connecting to 
> /192.168.0.25:8032 subClusterId cluster2 with protocol 
> ApplicationClientProtocol as user root (auth:SIMPLE) | 
> FederationRMFailoverProxyProvider.java:145
> 2018-10-08 14:21:26,741 | INFO | pool-16-thread-1 | 
> java.net.ConnectException: Call From node-core-jIKcN/192.168.0.64 to 
> node-master1-IYTxR:8032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see: 
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
> ApplicationClientProtocolPBClientImpl.submitApplication over cluster2 after 
> 28 failover attempts. Trying to failover after sleeping for 15261ms. | 
> RetryInvocationHandler.java:411
> 2018-10-08 14:21:42,002 | INFO | pool-16-thread-1 | Failing over to the 
> ResourceManager for SubClusterId: cluster2 | 
> FederationRMFailoverProxyProvider.java:124
> 2018-10-08 14:21:42,003 | INFO | pool-16-thread-1 | Flushing subClusters from 
> cache and rehydrating from store, most likely on account of RM failover. | 
> FederationStateStoreFacade.java:258
> 2018-10-08 14:21:42,005 | INFO | pool-16-thread-1 | Connecting to 
> /192.168.0.25:8032 subClusterId cluster2 with protocol 
> ApplicationClientProtocol as user root (auth:SIMPLE) | 
> FederationRMFailoverProxyProvider.java:145
> 2018-10-08 14:21:42,007 | INFO | pool-16-thread-1 | 
> java.net.ConnectException: Call From node-core-jIKcN/192.168.0.64 to 
> node-master1-IYTxR:8032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see: 
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
> ApplicationClientProtocolPBClientImpl.submitApplication over cluster2 after 
> 29 failover attempts. Trying to failover after sleeping for 21175ms. | 
> RetryInvocationHandler.java:411
> 

[jira] [Commented] (YARN-8855) Application fails if one of the sublcluster is down.

2018-10-10 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645275#comment-16645275
 ] 

Bibin A Chundatt commented on YARN-8855:


[~rahulanand90] YARN-7652 should solve the issue mentioned. [~botong], 
correct me if I am wrong.

> Application fails if one of the sublcluster is down.
> 
>
> Key: YARN-8855
> URL: https://issues.apache.org/jira/browse/YARN-8855
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rahul Anand
>Priority: Major
>
> If one of the sub-clusters is down, the application keeps retrying multiple 
> times and then it fails. About 30 failover attempts were found in the logs. 
> Below is the detailed exception. 
> {code:java}
> 2018-10-08 14:21:21,245 | INFO | NM ContainerManager dispatcher | Container 
> container_e03_1538297667953_0005_01_01 transitioned from 
> CONTAINER_CLEANEDUP_AFTER_KILL to DONE | ContainerImpl.java:2093
> 2018-10-08 14:21:21,245 | INFO | NM ContainerManager dispatcher | Removing 
> container_e03_1538297667953_0005_01_01 from application 
> application_1538297667953_0005 | ApplicationImpl.java:512
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Stopping 
> resource-monitoring for container_e03_1538297667953_0005_01_01 | 
> ContainersMonitorImpl.java:932
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Considering 
> container container_e03_1538297667953_0005_01_01 for log-aggregation | 
> AppLogAggregatorImpl.java:538
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Got event 
> CONTAINER_STOP for appId application_1538297667953_0005 | AuxServices.java:350
> 2018-10-08 14:21:21,246 | INFO | NM ContainerManager dispatcher | Stopping 
> container container_e03_1538297667953_0005_01_01 | 
> YarnShuffleService.java:295
> 2018-10-08 14:21:21,247 | WARN | NM Event dispatcher | couldn't find 
> container container_e03_1538297667953_0005_01_01 while processing 
> FINISH_CONTAINERS event | ContainerManagerImpl.java:1660
> 2018-10-08 14:21:22,248 | INFO | Node Status Updater | Removed completed 
> containers from NM context: [container_e03_1538297667953_0005_01_01] | 
> NodeStatusUpdaterImpl.java:696
> 2018-10-08 14:21:26,734 | INFO | pool-16-thread-1 | Failing over to the 
> ResourceManager for SubClusterId: cluster2 | 
> FederationRMFailoverProxyProvider.java:124
> 2018-10-08 14:21:26,735 | INFO | pool-16-thread-1 | Flushing subClusters from 
> cache and rehydrating from store, most likely on account of RM failover. | 
> FederationStateStoreFacade.java:258
> 2018-10-08 14:21:26,738 | INFO | pool-16-thread-1 | Connecting to 
> /192.168.0.25:8032 subClusterId cluster2 with protocol 
> ApplicationClientProtocol as user root (auth:SIMPLE) | 
> FederationRMFailoverProxyProvider.java:145
> 2018-10-08 14:21:26,741 | INFO | pool-16-thread-1 | 
> java.net.ConnectException: Call From node-core-jIKcN/192.168.0.64 to 
> node-master1-IYTxR:8032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see: 
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
> ApplicationClientProtocolPBClientImpl.submitApplication over cluster2 after 
> 28 failover attempts. Trying to failover after sleeping for 15261ms. | 
> RetryInvocationHandler.java:411
> 2018-10-08 14:21:42,002 | INFO | pool-16-thread-1 | Failing over to the 
> ResourceManager for SubClusterId: cluster2 | 
> FederationRMFailoverProxyProvider.java:124
> 2018-10-08 14:21:42,003 | INFO | pool-16-thread-1 | Flushing subClusters from 
> cache and rehydrating from store, most likely on account of RM failover. | 
> FederationStateStoreFacade.java:258
> 2018-10-08 14:21:42,005 | INFO | pool-16-thread-1 | Connecting to 
> /192.168.0.25:8032 subClusterId cluster2 with protocol 
> ApplicationClientProtocol as user root (auth:SIMPLE) | 
> FederationRMFailoverProxyProvider.java:145
> 2018-10-08 14:21:42,007 | INFO | pool-16-thread-1 | 
> java.net.ConnectException: Call From node-core-jIKcN/192.168.0.64 to 
> node-master1-IYTxR:8032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see: 
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
> ApplicationClientProtocolPBClientImpl.submitApplication over cluster2 after 
> 29 failover attempts. Trying to failover after sleeping for 21175ms. | 
> RetryInvocationHandler.java:411
> 2018-10-08 14:22:03,183 | INFO | pool-16-thread-1 | Failing over to the 
> ResourceManager for SubClusterId: cluster2 | 
> FederationRMFailoverProxyProvider.java:124
> 2018-10-08 14:22:03,183 | INFO | pool-16-thread-1 | Flushing subClusters from 
> cache and rehydrating from store, most likely on account of RM failover. | 
> FederationStateStoreFacade.java:258
> 2018-10-08 14:22:03,186 | INFO 

[jira] [Updated] (YARN-8862) [GPG] add Yarn Registry cleanup in ApplicationCleaner

2018-10-10 Thread Botong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-8862:
---
Attachment: YARN-8862-YARN-7402.v3.patch

> [GPG] add Yarn Registry cleanup in ApplicationCleaner
> -
>
> Key: YARN-8862
> URL: https://issues.apache.org/jira/browse/YARN-8862
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-8862-YARN-7402.v1.patch, 
> YARN-8862-YARN-7402.v2.patch, YARN-8862-YARN-7402.v3.patch
>
>
> In Yarn Federation, we use the Yarn Registry to keep the AMToken for UAMs in 
> secondary sub-clusters. Because there may be more app attempts later, 
> AMRMProxy cannot kill the UAM and delete the tokens when one local attempt 
> finishes. So, similar to the StateStore application table, we need an 
> ApplicationCleaner in the GPG to clean up the finished app entries in the Yarn 
> Registry. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8865) RMStateStore contains large number of expired RMDelegationToken

2018-10-10 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645180#comment-16645180
 ] 

Jason Lowe commented on YARN-8865:
--

Thanks for the report and patch!  Do we have any idea how these are getting 
leaked in the first place?  If I recall correctly, there's a thread pool that 
periodically tries to renew tokens, and when those tokens fail to renew because 
they're expired the token is removed from the state store.  Therefore even upon 
recovery it should try to renew these ancient tokens, fail to do so because 
they're expired, then remove them from the state store.  Is the state store 
removal itself failing?  Each secret manager is responsible for removing 
expired tokens it is managing, so I am wondering how that is not happening here.

Rather than having each state store implement this feature separately, I am 
wondering whether the RMDelegationTokenSecretManager should simply not load the 
tokens in the recovered RMDTSecretManagerState that are already expired and 
instead immediately remove them from the state store.  Otherwise every state 
store needs to implement this separately, which is a maintenance burden.
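
A rough sketch of that alternative (hypothetical names, not the actual 
RMDelegationTokenSecretManager code) could drop already-expired tokens during 
recovery and queue them for removal from the store:

{code:java}
// Hypothetical sketch: skip expired delegation tokens while loading recovered
// state, and collect them for deletion from the state store instead.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class ExpiredTokenRecoverySketch {

  // Stand-in for the recovered token -> renew-date map in the RM state.
  static Map<String, Long> recoverTokens(Map<String, Long> recovered, long now,
                                         List<String> toRemoveFromStore) {
    Map<String, Long> live = new HashMap<>();
    for (Map.Entry<String, Long> e : recovered.entrySet()) {
      if (e.getValue() < now) {
        // Expired: do not load it into the secret manager; remove it from the store.
        toRemoveFromStore.add(e.getKey());
      } else {
        live.put(e.getKey(), e.getValue());
      }
    }
    return live;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    Map<String, Long> recovered = new HashMap<>();
    recovered.put("token-old", now - 86_400_000L);  // expired a day ago
    recovered.put("token-new", now + 86_400_000L);  // still valid
    List<String> remove = new ArrayList<>();
    System.out.println("loaded: " + recoverTokens(recovered, now, remove));
    System.out.println("remove from store: " + remove);
  }
}
{code}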


> RMStateStore contains large number of expired RMDelegationToken
> ---
>
> Key: YARN-8865
> URL: https://issues.apache.org/jira/browse/YARN-8865
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-8865.001.patch
>
>
> When the RM state store is restored, expired delegation tokens are restored 
> and added to the system. These expired tokens do not get cleaned up or 
> removed. The exact reason why the tokens are still in the store is not clear. 
> We have seen as many as 250,000 tokens in the store, some of which were 2 
> years old.
> This has two side effects:
> * for the zookeeper store this leads to a jute buffer exhaustion issue and 
> prevents the RM from becoming active.
> * restore takes longer than needed and heap usage is higher than it should be
> We should not restore already expired tokens since they cannot be renewed or 
> used.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-10-10 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645165#comment-16645165
 ] 

Billie Rinaldi commented on YARN-8777:
--

I tested out patch 7. A couple of minor comments: I was confused by the 
"Father" comment; maybe use "Parent" instead. Maybe we should introduce a new 
error code for a failed docker exec instead of reusing DOCKER_RUN_FAILED? Lastly, 
the new test is missing a free_configuration(_executor_cfg) call.

> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch, 
> YARN-8777.003.patch, YARN-8777.004.patch, YARN-8777.005.patch, 
> YARN-8777.006.patch, YARN-8777.007.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to execute 
> the docker exec command against the running container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7086) Release all containers aynchronously

2018-10-10 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645159#comment-16645159
 ] 

Jason Lowe commented on YARN-7086:
--

Thanks for developing a perf test case!  The huge variations in runtime need to 
be investigated.  The second test case's variations are up to 63%, including 
multiple samples that are slower than the existing code's average.  With this 
data, I would argue the results are close to the noise range given the wild 
swings in measurements.  How could it sometimes be well over 50% faster?  Is 
the JVM hitting a large GC?  System I/O?  I see the test is spamming logs on 
stdout in a tight loop while measuring timing -- that's not good.  I could see 
I/O effects dominating the runtimes.  Try running this so that the test produces 
as little output as possible while running: no stdout printing in the tight 
loop, a log4j.properties that suppresses the RM logging, etc.  We need to get 
the runs to be a lot more consistent; otherwise we're probably not measuring 
what we think we're measuring.
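
For example, a minimal log4j.properties along these lines (illustrative only; 
the logger category name is an assumption and may need adjusting for the actual 
test) keeps the console quiet during the timed loop:

{noformat}
# Illustrative log4j.properties for the perf run: send only WARN+ to the console
# so the timed loop is not dominated by logging I/O.
log4j.rootLogger=WARN,console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# Silence the chatty RM packages entirely while measuring (package name may vary).
log4j.logger.org.apache.hadoop.yarn.server.resourcemanager=OFF
{noformat}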


> Release all containers aynchronously
> 
>
> Key: YARN-7086
> URL: https://issues.apache.org/jira/browse/YARN-7086
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Arun Suresh
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-7086.001.patch, YARN-7086.002.patch, 
> YARN-7086.Perf-test-case.patch
>
>
> We have noticed in production two situations that can cause deadlocks and 
> cause scheduling of new containers to come to a halt, especially with regard 
> to applications that have a lot of live containers:
> # When these applications release these containers in bulk.
> # When these applications terminate abruptly due to some failure, the 
> scheduler releases all its live containers in a loop.
> To handle the issues mentioned above, we have a patch in production to make 
> sure ALL container releases happen asynchronously - and it has served us well.
> Opening this JIRA to gather feedback on if this is a good idea generally (cc 
> [~leftnoteasy], [~jlowe], [~curino], [~kasha], [~subru], [~roniburd])
> BTW, In YARN-6251, we already have an asyncReleaseContainer() in the 
> AbstractYarnScheduler and a corresponding scheduler event, which is currently 
> used specifically for the container-update code paths (where the scheduler 
> releases temp containers which it creates for the update)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7644) NM gets backed up deleting docker containers

2018-10-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645150#comment-16645150
 ] 

Hudson commented on YARN-7644:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15167 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15167/])
YARN-7644. NM gets backed up deleting docker containers. Contributed by (jlowe: 
rev 5ce70e1211e624d58e8bb1181aec00729ebdc085)
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerCleanup.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerCleanup.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncher.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainersLauncher.java


> NM gets backed up deleting docker containers
> 
>
> Key: YARN-7644
> URL: https://issues.apache.org/jira/browse/YARN-7644
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Eric Badger
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Fix For: 3.2.0
>
> Attachments: YARN-7644.001.patch, YARN-7644.002.patch, 
> YARN-7644.003.patch, YARN-7644.004.patch, YARN-7644.005.patch, 
> YARN-7644.006.patch
>
>
> We are sending a {{docker stop}} to the docker container with a timeout of 10 
> seconds when we shut down a container. If the container does not stop after 
> 10 seconds then we force kill it. However, the {{docker stop}} command is a 
> blocking call. So in cases where lots of containers don't go down with the 
> initial SIGTERM, we have to wait 10+ seconds for the {{docker stop}} to 
> return. This ties up the ContainerLaunch handler and so these kill events 
> back up. It also appears to be backing up new container launches as well. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log

2018-10-10 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645136#comment-16645136
 ] 

Eric Payne commented on YARN-7225:
--

Attached 004 version of the trunk patch to address the whitespace issues.

The {{TestCapacityOverTimePolicy}} test that failed in the above trunk 
pre-commit build succeeds for me in my build environment.

I am still investigating the issues in the branch-2.8 patch.

> Add queue and partition info to RM audit log
> 
>
> Key: YARN-7225
> URL: https://issues.apache.org/jira/browse/YARN-7225
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1
>Reporter: Jonathan Hung
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7225.001.patch, YARN-7225.002.patch, 
> YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.branch-2.8.001.patch
>
>
> Right now RM audit log has fields such as user, ip, resource, etc. Having 
> queue and partition  is useful for resource tracking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7225) Add queue and partition info to RM audit log

2018-10-10 Thread Eric Payne (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-7225:
-
Attachment: YARN-7225.004.patch

> Add queue and partition info to RM audit log
> 
>
> Key: YARN-7225
> URL: https://issues.apache.org/jira/browse/YARN-7225
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1
>Reporter: Jonathan Hung
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7225.001.patch, YARN-7225.002.patch, 
> YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.branch-2.8.001.patch
>
>
> Right now RM audit log has fields such as user, ip, resource, etc. Having 
> queue and partition  is useful for resource tracking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7644) NM gets backed up deleting docker containers

2018-10-10 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645096#comment-16645096
 ] 

Jason Lowe commented on YARN-7644:
--

Thanks for updating the patch!  +1 for patch v6 as well.  Committing this.

> NM gets backed up deleting docker containers
> 
>
> Key: YARN-7644
> URL: https://issues.apache.org/jira/browse/YARN-7644
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Eric Badger
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7644.001.patch, YARN-7644.002.patch, 
> YARN-7644.003.patch, YARN-7644.004.patch, YARN-7644.005.patch, 
> YARN-7644.006.patch
>
>
> We are sending a {{docker stop}} to the docker container with a timeout of 10 
> seconds when we shut down a container. If the container does not stop after 
> 10 seconds then we force kill it. However, the {{docker stop}} command is a 
> blocking call. So in cases where lots of containers don't go down with the 
> initial SIGTERM, we have to wait 10+ seconds for the {{docker stop}} to 
> return. This ties up the ContainerLaunch handler and so these kill events 
> back up. It also appears to be backing up new container launches as well. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler

2018-10-10 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644989#comment-16644989
 ] 

Weiwei Yang commented on YARN-8468:
---

Reverted the change from branch-3.1 and triggered the Jenkins job again...

> Enable the use of queue based maximum container allocation limit and 
> implement it in FairScheduler
> --
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-8468-branch-3.1.018.patch, 
> YARN-8468-branch-3.1.019.patch, YARN-8468-branch-3.1.020.patch, 
> YARN-8468-branch-3.1.021.patch, YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, 
> YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch, 
> YARN-8468.017.patch, YARN-8468.018.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers and cannot be limited by queue or and is not scheduler dependent.
> The goal of this ticket is to allow this value to be set on a per queue basis.
> The use case: User has two pools, one for ad hoc jobs and one for enterprise 
> apps. User wants to limit ad hoc jobs to small containers but allow 
> enterprise apps to request as many resources as needed. Setting 
> yarn.scheduler.maximum-allocation-mb sets a default value for maximum 
> container size for all queues and setting maximum resources per queue with 
> “maxContainerResources” queue config value.
> Suggested solution:
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that queue resource cap can not be larger than scheduler max 
> resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability(String queueName) in both 
> FSParentQueue and FSLeafQueue as follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * Enforce the use of queue based maximum allocation limit if it is 
> available, if not use the general scheduler level setting
>  ** Use it during validation and normalization of requests in 
> scheduler.allocate, app submit and resource request



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler

2018-10-10 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8468:
--
Attachment: YARN-8468-branch-3.1.021.patch

> Enable the use of queue based maximum container allocation limit and 
> implement it in FairScheduler
> --
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-8468-branch-3.1.018.patch, 
> YARN-8468-branch-3.1.019.patch, YARN-8468-branch-3.1.020.patch, 
> YARN-8468-branch-3.1.021.patch, YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, 
> YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch, 
> YARN-8468.017.patch, YARN-8468.018.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers and cannot be limited by queue or and is not scheduler dependent.
> The goal of this ticket is to allow this value to be set on a per queue basis.
> The use case: User has two pools, one for ad hoc jobs and one for enterprise 
> apps. User wants to limit ad hoc jobs to small containers but allow 
> enterprise apps to request as many resources as needed. Setting 
> yarn.scheduler.maximum-allocation-mb sets a default value for maximum 
> container size for all queues and setting maximum resources per queue with 
> “maxContainerResources” queue config value.
> Suggested solution:
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that queue resource cap can not be larger than scheduler max 
> resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability(String queueName) in both 
> FSParentQueue and FSLeafQueue as follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * Enforce the use of queue based maximum allocation limit if it is 
> available, if not use the general scheduler level setting
>  ** Use it during validation and normalization of requests in 
> scheduler.allocate, app submit and resource request



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7018) Interface for adding extra behavior to node heartbeats

2018-10-10 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644967#comment-16644967
 ] 

Jason Lowe commented on YARN-7018:
--

Thanks for updating the patch!  Interface looks good for now.

This JIRA should define the initial interface to the plugin and provide a 
default implementation of the plugin that does nothing (i.e.: the null plugin). 
 NodeHeartbeatPluginImpl and BaseNodeHeartBeatPluginImpl should not be part of 
this, IMHO.

Loading the plugin should not be in a base class that implements the plugin -- 
that belongs in the AbstractYarnScheduler code that is responsible for loading 
and initializing the plugin.  And once that's removed from 
BaseNodeHeartBeatPluginImpl I don't see a need for that class.

bq. Where shall we keep the plugin? RMActiveService? or Scheduler like Wangda 
Tan suggested earlier? 

We can put it in the AbstractYarnScheduler, which is essentially what the latest 
patch is doing.  The reinitialize method loads the plugin and sets it up in 
the RMContext.
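
As a shape for that (hypothetical interface and names; the real signature will 
depend on the patch), the default could be a plugin whose heartbeat hook simply 
does nothing:

{code:java}
// Hypothetical sketch of a "null" node-heartbeat plugin: the interface and the
// no-op default live together, and the scheduler falls back to the no-op when
// nothing else is configured. Names are illustrative, not the actual patch.
public interface NodeHeartbeatPlugin {

  /** Hook invoked for every node heartbeat; the parameters are placeholders. */
  void onNodeHeartbeat(String nodeId, long heartbeatTimeMillis);

  /** Default implementation that adds no extra behavior. */
  NodeHeartbeatPlugin NULL = (nodeId, heartbeatTimeMillis) -> {
    // intentionally empty: the null plugin does nothing
  };
}
{code}

A scheduler could then fall back to {{NodeHeartbeatPlugin.NULL}} whenever no 
plugin is configured, so the heartbeat path never needs a null check.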

> Interface for adding extra behavior to node heartbeats
> --
>
> Key: YARN-7018
> URL: https://issues.apache.org/jira/browse/YARN-7018
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-7018.POC.001.patch, YARN-7018.POC.002.patch, 
> YARN-7018.POC.003.patch
>
>
> This JIRA tracks an interface for plugging in new behavior to node heartbeat 
> processing.  Adding a formal interface for additional node heartbeat 
> processing would allow admins to configure new functionality that is 
> scheduler-independent without needing to replace the entire scheduler.  For 
> example, both YARN-5202 and YARN-5215 had approaches where node heartbeat 
> processing was extended to implement new functionality that was essentially 
> scheduler-independent and could be implemented as a plugin with this 
> interface.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler

2018-10-10 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644849#comment-16644849
 ] 

Weiwei Yang commented on YARN-8468:
---

Ran the Jenkins job again; it still fails like the following:
{noformat}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
project hadoop-yarn-server-resourcemanager: There was a timeout or other error 
in the fork
{noformat}
According to mvn [shutdown 
doc|https://maven.apache.org/surefire/maven-failsafe-plugin/examples/shutdown.html]

{color:#205081}*After the test-set has completed*, the process executes 
{{java.lang.System.exit(0)}} which starts shutdown hooks. At this point the 
process may run next 30 seconds until all non daemon Threads die. After the 
period of time has elapsed, the process kills itself by 
{{java.lang.Runtime.halt(0)}}. The timeout of 30 seconds can be customized by 
configuration parameter {{forkedProcessExitTimeoutInSeconds}}.{color}

and it looks like the shutdown somehow did not finish within 900 seconds:

{color:#205081}Using parameter *forkedProcessTimeoutInSeconds* forked JVMs are 
killed separately after every individual process elapsed certain amount of time 
and the whole plugin fails with the error message:{color}

*{color:#205081}There was a timeout or other error in the fork{color}*

So it seems:
 # All UTs (RM module) have run, and there are no failures.
 # Shutdown failed while waiting for the forked JVM to terminate within 900 
seconds (defined in hadoop-project/pom.xml).

It's unclear why #2 happens. I think we need to revert this from branch-3.1 and 
start over; I'll do the revert in a few hours.
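
For reference, the two timeouts mentioned above are surefire configuration 
parameters. A sketch of how they might be set in hadoop-project/pom.xml follows; 
the values are illustrative only (900 matches the fork timeout referenced above, 
30 is the documented default exit timeout), not the project's actual settings.
{code:xml}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- kill a forked JVM that runs a test set longer than this many seconds -->
    <forkedProcessTimeoutInSeconds>900</forkedProcessTimeoutInSeconds>
    <!-- extra time allowed for the forked JVM to shut down after the test set -->
    <forkedProcessExitTimeoutInSeconds>30</forkedProcessExitTimeoutInSeconds>
  </configuration>
</plugin>
{code}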

Sorry for the inconvenience.

> Enable the use of queue based maximum container allocation limit and 
> implement it in FairScheduler
> --
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-8468-branch-3.1.018.patch, 
> YARN-8468-branch-3.1.019.patch, YARN-8468-branch-3.1.020.patch, 
> YARN-8468.000.patch, YARN-8468.001.patch, YARN-8468.002.patch, 
> YARN-8468.003.patch, YARN-8468.004.patch, YARN-8468.005.patch, 
> YARN-8468.006.patch, YARN-8468.007.patch, YARN-8468.008.patch, 
> YARN-8468.009.patch, YARN-8468.010.patch, YARN-8468.011.patch, 
> YARN-8468.012.patch, YARN-8468.013.patch, YARN-8468.014.patch, 
> YARN-8468.015.patch, YARN-8468.016.patch, YARN-8468.017.patch, 
> YARN-8468.018.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers and cannot be limited by queue or and is not scheduler dependent.
> The goal of this ticket is to allow this value to be set on a per queue basis.
> The use case: User has two pools, one for ad hoc jobs and one for enterprise 
> apps. User wants to limit ad hoc jobs to small containers but allow 
> enterprise apps to request as many resources as needed. Setting 
> yarn.scheduler.maximum-allocation-mb sets a default value for maximum 
> container size for all queues and setting maximum resources per queue with 
> “maxContainerResources” queue config value.
> Suggested solution:
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that queue resource cap can not be larger than scheduler max 
> resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability(String queueName) in both 
> FSParentQueue and FSLeafQueue as follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * Enforce the use of queue based maximum allocation limit if it is 
> available, if not use the general scheduler level setting
>  ** Use it during validation and normalization of requests in 
> scheduler.allocate, app submit and resource request



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8834) Provide Java client for fetching Yarn specific entities from TimelineReader

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644812#comment-16644812
 ] 

Hadoop QA commented on YARN-8834:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 25s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
44s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 65m 10s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8834 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943204/YARN-8834.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 17dad9f18a12 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b39b802 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22138/testReport/ |
| Max. process+thread count | 355 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22138/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Provide Java client for fetching Yarn specific entities from TimelineReader
> 

[jira] [Commented] (YARN-5742) Serve aggregated logs of historical apps from timeline service

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644808#comment-16644808
 ] 

Hadoop QA commented on YARN-5742:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 54s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 4 new + 
35 unchanged - 1 fixed = 39 total (was 36) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 29s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
1s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
14s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
12s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
2s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 58s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common
 |
|  |  Dead store to address in 
org.apache.hadoop.yarn.server.webapp.LogWebService.init()  At 
LogWebService.java:org.apache.hadoop.yarn.server.webapp.LogWebService.init()  
At LogWebService.java:[line 100] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | 

[jira] [Commented] (YARN-8834) Provide Java client for fetching Yarn specific entities from TimelineReader

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644758#comment-16644758
 ] 

Hadoop QA commented on YARN-8834:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 20s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 34s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
10s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m 10s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8834 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943191/YARN-8834.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 586b12543339 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b39b802 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22136/testReport/ |
| Max. process+thread count | 339 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22136/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Provide Java client for fetching Yarn specific entities from TimelineReader
> 

[jira] [Commented] (YARN-8866) Fix a parsing error for crossdomain.xml

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644725#comment-16644725
 ] 

Hadoop QA commented on YARN-8866:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
32m 35s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 48s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8866 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943203/YARN-8866.1.patch |
| Optional Tests |  dupname  asflicense  shadedclient  xml  |
| uname | Linux 61e9d4de69f8 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b39b802 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 341 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22135/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Fix a parsing error for crossdomain.xml
> ---
>
> Key: YARN-8866
> URL: https://issues.apache.org/jira/browse/YARN-8866
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, yarn-ui-v2
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Attachments: YARN-8866.1.patch
>
>
> [QBT|https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/] reports 
> a parsing error for crossdomain.xml in hadoop-yarn-ui.
> {noformat}
> Parsing Error(s): 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
>  
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8865) RMStateStore contains large number of expired RMDelegationToken

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644687#comment-16644687
 ] 

Hadoop QA commented on YARN-8865:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 21s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 31s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 82 unchanged - 1 fixed = 83 total (was 83) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  1s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 87m 34s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}137m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8865 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943183/YARN-8865.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 427223d4a5df 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / edce866 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/22133/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/22133/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 

[jira] [Updated] (YARN-5742) Serve aggregated logs of historical apps from timeline service

2018-10-10 Thread Rohith Sharma K S (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-5742:

Attachment: YARN-5742.03.patch

> Serve aggregated logs of historical apps from timeline service
> --
>
> Key: YARN-5742
> URL: https://issues.apache.org/jira/browse/YARN-5742
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Varun Saxena
>Assignee: Rohith Sharma K S
>Priority: Critical
> Attachments: YARN-5742-POC-v0.patch, YARN-5742.01.patch, 
> YARN-5742.02.patch, YARN-5742.03.patch, YARN-5742.v0.patch
>
>
> ATSv1.5 daemon has servlet to serve aggregated logs. But enabling only ATSv2, 
> does not serve logs from CLI and UI for completed application. Log serving 
> story has completely broken in ATSv2.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8834) Provide Java client for fetching Yarn specific entities from TimelineReader

2018-10-10 Thread Abhishek Modi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Modi updated YARN-8834:

Attachment: YARN-8834.006.patch

> Provide Java client for fetching Yarn specific entities from TimelineReader
> ---
>
> Key: YARN-8834
> URL: https://issues.apache.org/jira/browse/YARN-8834
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Abhishek Modi
>Priority: Critical
> Attachments: YARN-8834.001.patch, YARN-8834.002.patch, 
> YARN-8834.003.patch, YARN-8834.004.patch, YARN-8834.005.patch, 
> YARN-8834.006.patch
>
>
> While reviewing YARN-8303, we felt that it is necessary to provide 
> TimelineReaderClient which wraps all the REST calls in it so that user can 
> just provide application or container ids along with filters.Currently 
> fetching entities from TimelineReader is only via REST call or somebody need 
> to write java client get entities.
> It is good to provide TimelineReaderClient which fetch entities from 
> TimelineReaderServer. This will be more useful.
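
As a rough illustration of what the description above asks for, a hypothetical 
usage sketch follows; the factory and method names (createTimelineReaderClient, 
getApplicationEntity) are assumptions for illustration, not taken from the 
attached patches.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.timelineservice.TimelineEntity;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class TimelineReaderClientExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new YarnConfiguration();
    // TimelineReaderClient is the client this JIRA introduces; the API below is hypothetical.
    TimelineReaderClient reader = TimelineReaderClient.createTimelineReaderClient();
    reader.init(conf);
    reader.start();

    // Fetch the YARN application entity for a given application id.
    ApplicationId appId = ApplicationId.fromString(args[0]);
    TimelineEntity app = reader.getApplicationEntity(appId, "ALL", null);
    System.out.println(app);

    reader.stop();
  }
}
{code}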



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8834) Provide Java client for fetching Yarn specific entities from TimelineReader

2018-10-10 Thread Abhishek Modi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644669#comment-16644669
 ] 

Abhishek Modi commented on YARN-8834:
-

Thanks [~rohithsharma] for catching this. I have submitted an updated patch.

> Provide Java client for fetching Yarn specific entities from TimelineReader
> ---
>
> Key: YARN-8834
> URL: https://issues.apache.org/jira/browse/YARN-8834
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Abhishek Modi
>Priority: Critical
> Attachments: YARN-8834.001.patch, YARN-8834.002.patch, 
> YARN-8834.003.patch, YARN-8834.004.patch, YARN-8834.005.patch, 
> YARN-8834.006.patch
>
>
> While reviewing YARN-8303, we felt that it is necessary to provide 
> TimelineReaderClient which wraps all the REST calls in it so that user can 
> just provide application or container ids along with filters.Currently 
> fetching entities from TimelineReader is only via REST call or somebody need 
> to write java client get entities.
> It is good to provide TimelineReaderClient which fetch entities from 
> TimelineReaderServer. This will be more useful.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8834) Provide Java client for fetching Yarn specific entities from TimelineReader

2018-10-10 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644647#comment-16644647
 ] 

Rohith Sharma K S commented on YARN-8834:
-

Hi [~abmodi], I just realized that the patch uses the code below, which only 
covers the http address.
Instead of
{code}
String timelineReaderWebAppAddress = conf.get(
    YarnConfiguration.TIMELINE_SERVICE_READER_WEBAPP_ADDRESS,
    YarnConfiguration.DEFAULT_TIMELINE_SERVICE_READER_WEBAPP_ADDRESS);
{code}
we should use WebAppUtils.getTimelineReaderWebAppURLWithoutScheme(conf); this 
gives the right web address.
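
A minimal sketch of the suggested change, assuming the surrounding client code 
already holds a Configuration named conf; only the address lookup changes:
{code}
import org.apache.hadoop.yarn.webapp.util.WebAppUtils;

// Per the comment above, this resolves the right reader web app address,
// whereas the conf.get(...) variant only covers the plain-http property.
String timelineReaderWebAppAddress =
    WebAppUtils.getTimelineReaderWebAppURLWithoutScheme(conf);
{code}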

> Provide Java client for fetching Yarn specific entities from TimelineReader
> ---
>
> Key: YARN-8834
> URL: https://issues.apache.org/jira/browse/YARN-8834
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Abhishek Modi
>Priority: Critical
> Attachments: YARN-8834.001.patch, YARN-8834.002.patch, 
> YARN-8834.003.patch, YARN-8834.004.patch, YARN-8834.005.patch
>
>
> While reviewing YARN-8303, we felt that it is necessary to provide 
> TimelineReaderClient which wraps all the REST calls in it so that user can 
> just provide application or container ids along with filters.Currently 
> fetching entities from TimelineReader is only via REST call or somebody need 
> to write java client get entities.
> It is good to provide TimelineReaderClient which fetch entities from 
> TimelineReaderServer. This will be more useful.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8830) SLS tool not working in trunk

2018-10-10 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644646#comment-16644646
 ] 

Bibin A Chundatt commented on YARN-8830:


[~sunilg]

I have done the dry run; it's working fine.

> SLS tool not working in trunk
> -
>
> Key: YARN-8830
> URL: https://issues.apache.org/jira/browse/YARN-8830
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: YARN-8830.001.patch, YARN-8830.002.patch, 
> YARN-8830.003.patch
>
>
> Seems NodeDetails hashCode() and equals() causing too many node registration 
> for large data set



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8866) Fix a parsing error for crossdomain.xml

2018-10-10 Thread Takanobu Asanuma (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated YARN-8866:
---
Attachment: YARN-8866.1.patch

> Fix a parsing error for crossdomain.xml
> ---
>
> Key: YARN-8866
> URL: https://issues.apache.org/jira/browse/YARN-8866
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, yarn-ui-v2
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Attachments: YARN-8866.1.patch
>
>
> [QBT|https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/] reports 
> a parsing error for crossdomain.xml in hadoop-yarn-ui.
> {noformat}
> Parsing Error(s): 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
>  
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8866) Fix a parsing error for crossdomain.xml

2018-10-10 Thread Takanobu Asanuma (JIRA)
Takanobu Asanuma created YARN-8866:
--

 Summary: Fix a parsing error for crossdomain.xml
 Key: YARN-8866
 URL: https://issues.apache.org/jira/browse/YARN-8866
 Project: Hadoop YARN
  Issue Type: Bug
  Components: build, yarn-ui-v2
Reporter: Takanobu Asanuma
Assignee: Takanobu Asanuma


[QBT|https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/] reports a 
parsing error for crossdomain.xml in hadoop-yarn-ui.
{noformat}
Parsing Error(s): 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
 
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5742) Serve aggregated logs of historical apps from timeline service

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644614#comment-16644614
 ] 

Hadoop QA commented on YARN-5742:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
2s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 19s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
18s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  6s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 12 new 
+ 35 unchanged - 1 fixed = 47 total (was 36) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
28s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
16s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
4s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 82m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-5742 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943186/YARN-5742.02.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d2c96396d79f 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-8834) Provide Java client for fetching Yarn specific entities from TimelineReader

2018-10-10 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644613#comment-16644613
 ] 

Rohith Sharma K S commented on YARN-8834:
-

+1, LGTM for the latest patch, pending Jenkins.

[~vrushalic], would you like to take a look at the 05 patch? Otherwise I will 
commit it later today if there are no more objections.

> Provide Java client for fetching Yarn specific entities from TimelineReader
> ---
>
> Key: YARN-8834
> URL: https://issues.apache.org/jira/browse/YARN-8834
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Abhishek Modi
>Priority: Critical
> Attachments: YARN-8834.001.patch, YARN-8834.002.patch, 
> YARN-8834.003.patch, YARN-8834.004.patch, YARN-8834.005.patch
>
>
> While reviewing YARN-8303, we felt that it is necessary to provide 
> TimelineReaderClient which wraps all the REST calls in it so that user can 
> just provide application or container ids along with filters.Currently 
> fetching entities from TimelineReader is only via REST call or somebody need 
> to write java client get entities.
> It is good to provide TimelineReaderClient which fetch entities from 
> TimelineReaderServer. This will be more useful.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644584#comment-16644584
 ] 

Hadoop QA commented on YARN-8468:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 23m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
56s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
49s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 42s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} branch-3.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 58s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m  4s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}175m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:a607c02 |
| JIRA Issue | YARN-8468 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12943172/YARN-8468-branch-3.1.020.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ced728ef0323 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 
17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.1 / 323b76b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/22132/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22132/testReport/ |
| Max. process+thread count | 909 (vs. ulimit of 1) |
| modules | C: 

[jira] [Commented] (YARN-8834) Provide Java client for fetching Yarn specific entities from TimelineReader

2018-10-10 Thread Abhishek Modi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644555#comment-16644555
 ] 

Abhishek Modi commented on YARN-8834:
-

Thanks for the review, [~rohithsharma]. I have attached a new patch addressing 
the comments. [~rohithsharma], [~vrushalic], please review it whenever you get 
some time. Thanks.

> Provide Java client for fetching Yarn specific entities from TimelineReader
> ---
>
> Key: YARN-8834
> URL: https://issues.apache.org/jira/browse/YARN-8834
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Abhishek Modi
>Priority: Critical
> Attachments: YARN-8834.001.patch, YARN-8834.002.patch, 
> YARN-8834.003.patch, YARN-8834.004.patch, YARN-8834.005.patch
>
>
> While reviewing YARN-8303, we felt that it is necessary to provide 
> TimelineReaderClient which wraps all the REST calls in it so that user can 
> just provide application or container ids along with filters.Currently 
> fetching entities from TimelineReader is only via REST call or somebody need 
> to write java client get entities.
> It is good to provide TimelineReaderClient which fetch entities from 
> TimelineReaderServer. This will be more useful.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8834) Provide Java client for fetching Yarn specific entities from TimelineReader

2018-10-10 Thread Abhishek Modi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Modi updated YARN-8834:

Attachment: YARN-8834.005.patch

> Provide Java client for fetching Yarn specific entities from TimelineReader
> ---
>
> Key: YARN-8834
> URL: https://issues.apache.org/jira/browse/YARN-8834
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Abhishek Modi
>Priority: Critical
> Attachments: YARN-8834.001.patch, YARN-8834.002.patch, 
> YARN-8834.003.patch, YARN-8834.004.patch, YARN-8834.005.patch
>
>
> While reviewing YARN-8303, we felt that it is necessary to provide 
> TimelineReaderClient which wraps all the REST calls in it so that user can 
> just provide application or container ids along with filters.Currently 
> fetching entities from TimelineReader is only via REST call or somebody need 
> to write java client get entities.
> It is good to provide TimelineReaderClient which fetch entities from 
> TimelineReaderServer. This will be more useful.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8230) [UI2] Attempt Info page url shows NA for several fields for container info

2018-10-10 Thread Akhil PB (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akhil PB reassigned YARN-8230:
--

Assignee: Akhil PB

> [UI2] Attempt Info page url shows NA for several fields for container info
> --
>
> Key: YARN-8230
> URL: https://issues.apache.org/jira/browse/YARN-8230
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn, yarn-ui-v2
>Reporter: Sumana Sathish
>Assignee: Akhil PB
>Priority: Critical
>
> 1. Click on any application
> 2. Click on the appAttempt present 
> 3. Click on grid View
> 4. It shows container Info. But logs / nodemanager / and several fields show 
> NA, with finished time as Invalid



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5742) Serve aggregated logs of historical apps from timeline service

2018-10-10 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644538#comment-16644538
 ] 

Rohith Sharma K S commented on YARN-5742:
-

Updated the patch, refactoring the common code out of AHSWebservice.

> Serve aggregated logs of historical apps from timeline service
> --
>
> Key: YARN-5742
> URL: https://issues.apache.org/jira/browse/YARN-5742
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Varun Saxena
>Assignee: Rohith Sharma K S
>Priority: Critical
> Attachments: YARN-5742-POC-v0.patch, YARN-5742.01.patch, 
> YARN-5742.02.patch, YARN-5742.v0.patch
>
>
> ATSv1.5 daemon has servlet to serve aggregated logs. But enabling only ATSv2, 
> does not serve logs from CLI and UI for completed application. Log serving 
> story has completely broken in ATSv2.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5742) Serve aggregated logs of historical apps from timeline service

2018-10-10 Thread Rohith Sharma K S (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-5742:

Attachment: YARN-5742.02.patch

> Serve aggregated logs of historical apps from timeline service
> --
>
> Key: YARN-5742
> URL: https://issues.apache.org/jira/browse/YARN-5742
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Varun Saxena
>Assignee: Rohith Sharma K S
>Priority: Critical
> Attachments: YARN-5742-POC-v0.patch, YARN-5742.01.patch, 
> YARN-5742.02.patch, YARN-5742.v0.patch
>
>
> ATSv1.5 daemon has servlet to serve aggregated logs. But enabling only ATSv2, 
> does not serve logs from CLI and UI for completed application. Log serving 
> story has completely broken in ATSv2.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8842) Update QueueMetrics with custom resource values

2018-10-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644532#comment-16644532
 ] 

Hadoop QA commented on YARN-8842:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 58s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 28s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  1m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 37 new + 185 unchanged - 2 fixed = 222 total (was 187) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 27s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 16s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 34s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m  4s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}181m 19s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8842 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12943165/YARN-8842.008.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 36634744457f 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / edce866 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle |

[jira] [Updated] (YARN-8865) RMStateStore contains large number of expired RMDelegationToken

2018-10-10 Thread Wilfred Spiegelenburg (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-8865:

Attachment: YARN-8865.001.patch

> RMStateStore contains large number of expired RMDelegationToken
> ---
>
> Key: YARN-8865
> URL: https://issues.apache.org/jira/browse/YARN-8865
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-8865.001.patch
>
>
> When the RM state store is restored, expired delegation tokens are restored 
> and added to the system. These expired tokens do not get cleaned up or 
> removed. The exact reason why the tokens are still in the store is not clear. 
> We have seen as many as 250,000 tokens in the store, some of which were two 
> years old.
> This has two side effects:
> * for the ZooKeeper store this leads to a jute buffer exhaustion issue and 
> prevents the RM from becoming active;
> * restore takes longer than needed and heap usage is higher than it should be.
> We should not restore already expired tokens since they cannot be renewed or 
> used.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8865) RMStateStore contains large number of expired RMDelegationToken

2018-10-10 Thread Wilfred Spiegelenburg (JIRA)
Wilfred Spiegelenburg created YARN-8865:
---

 Summary: RMStateStore contains large number of expired 
RMDelegationToken
 Key: YARN-8865
 URL: https://issues.apache.org/jira/browse/YARN-8865
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.1.0
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg


When the RM state store is restored, expired delegation tokens are restored and 
added to the system. These expired tokens do not get cleaned up or removed. The 
exact reason why the tokens are still in the store is not clear. We have seen as 
many as 250,000 tokens in the store, some of which were two years old.

This has two side effects:
* for the ZooKeeper store this leads to a jute buffer exhaustion issue and 
prevents the RM from becoming active;
* restore takes longer than needed and heap usage is higher than it should be.

We should not restore already expired tokens since they cannot be renewed or 
used.
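
To make the intended behaviour concrete, here is a minimal, hypothetical Java sketch of the idea: while rebuilding delegation-token state during recovery, tokens whose stored expiry/renew date has already passed are skipped instead of being re-added. The class, method, and map names below are illustrative stand-ins, not the actual YARN-8865 patch or the real RM secret-manager API.

{code:java}
import java.util.HashMap;
import java.util.Map;

// Minimal sketch, not the YARN-8865 patch: filter out already expired
// delegation tokens while rebuilding secret-manager state from the store.
// The String key and Long expiry value are a simplified stand-in for the
// real token-identifier -> renew-date mapping kept by the RM.
public class ExpiredTokenRecoverySketch {

  /** Returns only the tokens whose expiry still lies in the future. */
  static Map<String, Long> dropExpiredTokens(Map<String, Long> storedTokens,
      long nowMillis) {
    Map<String, Long> stillValid = new HashMap<>();
    for (Map.Entry<String, Long> entry : storedTokens.entrySet()) {
      if (entry.getValue() > nowMillis) {
        // Token has not expired yet: restore it as before.
        stillValid.put(entry.getKey(), entry.getValue());
      }
      // Expired token: skip it here; a complete fix could also delete it from
      // the state store so it does not reappear on the next restart.
    }
    return stillValid;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    Map<String, Long> stored = new HashMap<>();
    stored.put("token-1", now + 86_400_000L);     // expires in one day
    stored.put("token-2", now - 63_072_000_000L); // expired roughly two years ago
    System.out.println(dropExpiredTokens(stored, now).keySet()); // [token-1]
  }
}
{code}

Skipping expired tokens at recovery time addresses both side effects at once: fewer znodes/records are read back (smaller jute buffers, less heap) and no unusable tokens are re-registered with the secret manager.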



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org