[jira] [Updated] (YARN-9264) [Umbrella] Follow-up on IntelOpenCL FPGA plugin

2020-01-07 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-9264:
---
Issue Type: Improvement  (was: Bug)

> [Umbrella] Follow-up on IntelOpenCL FPGA plugin
> ---
>
> Key: YARN-9264
> URL: https://issues.apache.org/jira/browse/YARN-9264
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.0
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0
>
>
> The Intel FPGA resource type support was released in Hadoop 3.1.0.
> Right now the plugin implementation has some deficiencies that need to be 
> fixed. This JIRA lists all problems that need to be resolved.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10068) TimelineV2Client may leak file descriptors creating ClientResponse objects.

2020-01-07 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010409#comment-17010409
 ] 

Hudson commented on YARN-10068:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17826 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17826/])
YARN-10068. Fix TimelineV2Client leaking File Descriptors. (pjoseph: rev 
571795cd180d3077e8ba189b3b70e81f0d1a7044)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineV2ClientImpl.java


> TimelineV2Client may leak file descriptors creating ClientResponse objects.
> ---
>
> Key: YARN-10068
> URL: https://issues.apache.org/jira/browse/YARN-10068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0
> Environment: HDP VERSION3.1.4
> AMBARI VERSION2.7.4.0
>Reporter: Anand Srinivasan
>Assignee: Anand Srinivasan
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: YARN-10068.001.patch, YARN-10068.002.patch, 
> YARN-10068.003.patch, image-2020-01-02-14-58-12-773.png
>
>
> Hi team,
> A code walkthrough of v1 and v2 of the TimelineClient API revealed that the v2 API 
> TimelineV2ClientImpl#putObjects doesn't close ClientResponse objects when a 
> success status is returned from the Timeline Server. The ClientResponse is closed 
> only on an erroneous response from the server, via ClientResponse#getEntity.
> We also noticed that TimelineClient (v1) closes the ClientResponse object in 
> TimelineWriter#putEntities by calling ClientResponse#getEntity on both 
> success and error responses from the server, thereby avoiding this file 
> descriptor leak.
> The customer's original issue and symptom was that the NodeManager went down 
> with a 'too many open files' condition, with many 
> CLOSED_WAIT sockets observed between the timeline client (on the NM) and the 
> timeline server hosts. 
> Could you please help resolve this issue? Thanks.
>  
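For illustration, here is a minimal sketch of the "close the response in all paths" pattern the description points at, using the Jersey 1.x ClientResponse API. The class, method, and variable names below are assumptions made for the example, not the committed YARN-10068 patch.

{code:java}
import com.sun.jersey.api.client.ClientResponse;
import com.sun.jersey.api.client.WebResource;

import javax.ws.rs.core.MediaType;
import java.io.IOException;

/**
 * Illustrative only: posts an entity and always releases the underlying
 * connection, so the socket/file descriptor is returned even on success.
 */
public final class TimelinePostSketch {

  public static void postEntity(WebResource resource, Object entity)
      throws IOException {
    ClientResponse response = resource
        .type(MediaType.APPLICATION_JSON)
        .post(ClientResponse.class, entity);
    try {
      if (response.getStatus() != ClientResponse.Status.OK.getStatusCode()) {
        // Error path: reading the entity consumes and releases the connection.
        String msg = response.getEntity(String.class);
        throw new IOException("Timeline server returned "
            + response.getStatus() + ": " + msg);
      }
      // Success path: nothing to read, but the response must still be released.
    } finally {
      response.close(); // prevents the CLOSED_WAIT / fd leak described above
    }
  }
}
{code}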






[jira] [Commented] (YARN-10068) TimelineV2Client may leak file descriptors creating ClientResponse objects.

2020-01-07 Thread Anand Srinivasan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010404#comment-17010404
 ] 

Anand Srinivasan commented on YARN-10068:
-

Hi Prabhu Joseph,

Thanks for the review feedback and commit to the trunk.

Kind regards.

> TimelineV2Client may leak file descriptors creating ClientResponse objects.
> ---
>
> Key: YARN-10068
> URL: https://issues.apache.org/jira/browse/YARN-10068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0
> Environment: HDP VERSION3.1.4
> AMBARI VERSION2.7.4.0
>Reporter: Anand Srinivasan
>Assignee: Anand Srinivasan
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: YARN-10068.001.patch, YARN-10068.002.patch, 
> YARN-10068.003.patch, image-2020-01-02-14-58-12-773.png
>
>
> Hi team,
> A code walkthrough of v1 and v2 of the TimelineClient API revealed that the v2 API 
> TimelineV2ClientImpl#putObjects doesn't close ClientResponse objects when a 
> success status is returned from the Timeline Server. The ClientResponse is closed 
> only on an erroneous response from the server, via ClientResponse#getEntity.
> We also noticed that TimelineClient (v1) closes the ClientResponse object in 
> TimelineWriter#putEntities by calling ClientResponse#getEntity on both 
> success and error responses from the server, thereby avoiding this file 
> descriptor leak.
> The customer's original issue and symptom was that the NodeManager went down 
> with a 'too many open files' condition, with many 
> CLOSED_WAIT sockets observed between the timeline client (on the NM) and the 
> timeline server hosts. 
> Could you please help resolve this issue? Thanks.
>  






[jira] [Commented] (YARN-10068) TimelineV2Client may leak file descriptors creating ClientResponse objects.

2020-01-07 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010399#comment-17010399
 ] 

Prabhu Joseph commented on YARN-10068:
--

Thanks [~anand.srinivasan] for the patch and [~adam.antal] for the review.

I have committed the [^YARN-10068.003.patch] to trunk.

> TimelineV2Client may leak file descriptors creating ClientResponse objects.
> ---
>
> Key: YARN-10068
> URL: https://issues.apache.org/jira/browse/YARN-10068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0
> Environment: HDP VERSION3.1.4
> AMBARI VERSION2.7.4.0
>Reporter: Anand Srinivasan
>Assignee: Anand Srinivasan
>Priority: Critical
> Attachments: YARN-10068.001.patch, YARN-10068.002.patch, 
> YARN-10068.003.patch, image-2020-01-02-14-58-12-773.png
>
>
> Hi team,
> A code walkthrough of v1 and v2 of the TimelineClient API revealed that the v2 API 
> TimelineV2ClientImpl#putObjects doesn't close ClientResponse objects when a 
> success status is returned from the Timeline Server. The ClientResponse is closed 
> only on an erroneous response from the server, via ClientResponse#getEntity.
> We also noticed that TimelineClient (v1) closes the ClientResponse object in 
> TimelineWriter#putEntities by calling ClientResponse#getEntity on both 
> success and error responses from the server, thereby avoiding this file 
> descriptor leak.
> The customer's original issue and symptom was that the NodeManager went down 
> with a 'too many open files' condition, with many 
> CLOSED_WAIT sockets observed between the timeline client (on the NM) and the 
> timeline server hosts. 
> Could you please help resolve this issue? Thanks.
>  






[jira] [Updated] (YARN-10068) TimelineV2Client may leak file descriptors creating ClientResponse objects.

2020-01-07 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10068:
-
Fix Version/s: 3.3.0

> TimelineV2Client may leak file descriptors creating ClientResponse objects.
> ---
>
> Key: YARN-10068
> URL: https://issues.apache.org/jira/browse/YARN-10068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0
> Environment: HDP VERSION3.1.4
> AMBARI VERSION2.7.4.0
>Reporter: Anand Srinivasan
>Assignee: Anand Srinivasan
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: YARN-10068.001.patch, YARN-10068.002.patch, 
> YARN-10068.003.patch, image-2020-01-02-14-58-12-773.png
>
>
> Hi team,
> A code walkthrough of v1 and v2 of the TimelineClient API revealed that the v2 API 
> TimelineV2ClientImpl#putObjects doesn't close ClientResponse objects when a 
> success status is returned from the Timeline Server. The ClientResponse is closed 
> only on an erroneous response from the server, via ClientResponse#getEntity.
> We also noticed that TimelineClient (v1) closes the ClientResponse object in 
> TimelineWriter#putEntities by calling ClientResponse#getEntity on both 
> success and error responses from the server, thereby avoiding this file 
> descriptor leak.
> The customer's original issue and symptom was that the NodeManager went down 
> with a 'too many open files' condition, with many 
> CLOSED_WAIT sockets observed between the timeline client (on the NM) and the 
> timeline server hosts. 
> Could you please help resolve this issue? Thanks.
>  






[jira] [Comment Edited] (YARN-9698) [Umbrella] Tools to help migration from Fair Scheduler to Capacity Scheduler

2020-01-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010315#comment-17010315
 ] 

Brahma Reddy Battula edited comment on YARN-9698 at 1/8/20 4:53 AM:


[~cheersyang]

 I am planning the 3.3.0 release (I will share the plan on the mailing list). This 
feature is marked for the 3.3.0 release; can you update the plan?


was (Author: brahmareddy):
[~cheersyang]

 I am planning the 3.3.0 release. This feature is marked for the 3.3.0 release; can 
you update the plan?

> [Umbrella] Tools to help migration from Fair Scheduler to Capacity Scheduler
> 
>
> Key: YARN-9698
> URL: https://issues.apache.org/jira/browse/YARN-9698
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Weiwei Yang
>Priority: Major
>  Labels: fs2cs
> Attachments: FS-CS Migration.pdf
>
>
> We see that some users want to migrate from Fair Scheduler to Capacity Scheduler. 
> This Jira is created as an umbrella to track all related efforts for the 
> migration; the scope contains:
>  * Bug fixes
>  * Adding missing features
>  * Migration tools that help to generate CS configs based on the FS configs, 
> validate configs, etc.
>  * Documentation
> This is part of the CS component; the purpose is to make the migration process 
> smooth.






[jira] [Comment Edited] (YARN-5542) Scheduling of opportunistic containers

2020-01-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010313#comment-17010313
 ] 

Brahma Reddy Battula edited comment on YARN-5542 at 1/8/20 4:53 AM:


[~kkaranasos], planning to release 3.3.0 (I will share the plan on the mailing 
list). Most of the subtasks are finished and merged to 3.3.0; can we close 
this umbrella?


was (Author: brahmareddy):
[~kkaranasos], planning to release 3.3.0. Most of the subtasks are finished 
and merged to 3.3.0; can we close this umbrella?

> Scheduling of opportunistic containers
> --
>
> Key: YARN-5542
> URL: https://issues.apache.org/jira/browse/YARN-5542
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Konstantinos Karanasos
>Priority: Major
>
> This JIRA groups all efforts related to the scheduling of opportunistic 
> containers. 
> It includes the scheduling of opportunistic containers through the central RM 
> (YARN-5220) and through distributed scheduling (YARN-2877), as well as the 
> scheduling of containers based on actual node utilization (YARN-1011) and 
> container promotion/demotion (YARN-5085).






[jira] [Comment Edited] (YARN-9414) Application Catalog for YARN applications

2020-01-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010338#comment-17010338
 ] 

Brahma Reddy Battula edited comment on YARN-9414 at 1/8/20 4:52 AM:


[~eyang], I am planning the 3.3.0 release (I will share the plan on the mailing 
list). Most of these JIRAs are merged to 3.3.0. Can this feature GA 
without the remaining JIRAs? Could you please update?


was (Author: brahmareddy):
[~eyang], I am planning the 3.3.0 release. Most of these JIRAs are merged 
to 3.3.0. Can this feature GA without the remaining JIRAs? Could you please 
update?

> Application Catalog for YARN applications
> -
>
> Key: YARN-9414
> URL: https://issues.apache.org/jira/browse/YARN-9414
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN Appstore.pdf, YARN-Application-Catalog.pdf
>
>
> YARN native services provides a web services API to improve the usability of 
> application deployment on Hadoop using a collection of Docker images.  It would 
> be nice to have an application catalog system which provides an editorial and 
> search interface for YARN applications.  This improves the usability of YARN for 
> managing the life cycle of applications.  






[jira] [Commented] (YARN-8851) [Umbrella] A pluggable device plugin framework to ease vendor plugin development

2020-01-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010344#comment-17010344
 ] 

Brahma Reddy Battula commented on YARN-8851:


[~tangzhankun], can we close this JIRA and move the pending JIRAs out? I 
am planning the 3.3.0 release and am going to mention this feature.

> [Umbrella] A pluggable device plugin framework to ease vendor plugin 
> development
> 
>
> Key: YARN-8851
> URL: https://issues.apache.org/jira/browse/YARN-8851
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: yarn
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8851-WIP2-trunk.001.patch, 
> YARN-8851-WIP3-trunk.001.patch, YARN-8851-WIP4-trunk.001.patch, 
> YARN-8851-WIP5-trunk.001.patch, YARN-8851-WIP6-trunk.001.patch, 
> YARN-8851-WIP7-trunk.001.patch, YARN-8851-WIP8-trunk.001.patch, 
> YARN-8851-WIP9-trunk.001.patch, YARN-8851-trunk.001.patch, 
> YARN-8851-trunk.002.patch, [YARN-8851] 
> YARN_New_Device_Plugin_Framework_Design_Proposal-3.pdf, [YARN-8851] 
> YARN_New_Device_Plugin_Framework_Design_Proposal-4.pdf, [YARN-8851] 
> YARN_New_Device_Plugin_Framework_Design_Proposal.pdf
>
>
> At present, we support GPU/FPGA devices in YARN in a native, tightly coupled 
> way. But it's difficult for a vendor to implement such a device plugin 
> because the developer needs deep knowledge of YARN internals, and it burdens 
> the community with maintaining both YARN core and vendor-specific code.
> Here we propose a new device plugin framework to ease vendor device plugin 
> development and provide a more flexible way to integrate with the YARN NM.
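To make the proposal concrete, a hypothetical vendor plugin could look roughly like the sketch below. The interface and method names are illustrative assumptions only, not the framework's actual API.

{code:java}
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch of a vendor device plugin; not the real YARN API. */
interface VendorDevicePlugin {
  /** Resource type the plugin registers, e.g. "vendor.com/widget" (hypothetical). */
  String getResourceName();

  /** Discover the device minor numbers present on this NodeManager host. */
  Set<Integer> getDeviceMinorNumbers();
}

/** A toy plugin that reports two fixed devices. */
class WidgetPlugin implements VendorDevicePlugin {
  @Override
  public String getResourceName() {
    return "vendor.com/widget";
  }

  @Override
  public Set<Integer> getDeviceMinorNumbers() {
    // A real plugin would probe /dev, a vendor SDK, or a management daemon.
    Set<Integer> minors = new TreeSet<>();
    minors.add(0);
    minors.add(1);
    return minors;
  }
}
{code}

The intent of the framework is that a plugin of roughly this shape can be picked up by the NM without the vendor touching YARN core code.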






[jira] [Commented] (YARN-9050) [Umbrella] Usability improvements for scheduler activities

2020-01-07 Thread Tao Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010339#comment-17010339
 ] 

Tao Yang commented on YARN-9050:


Glad to hear that the 3.3.0 release is on the way, and thanks for reminding me.
The remaining issues are almost ready and only need some reviews; they can be 
done before this release. Thanks.

> [Umbrella] Usability improvements for scheduler activities
> --
>
> Key: YARN-9050
> URL: https://issues.apache.org/jira/browse/YARN-9050
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: image-2018-11-23-16-46-38-138.png
>
>
> We have made some usability improvements for scheduler activities, based on 
> YARN 3.1, in our cluster, as follows:
>  1. Not usable with multi-threaded asynchronous scheduling. App and node 
> activities may get mixed up when multiple scheduling threads record activities 
> of different allocation processes in the same variables, like appsAllocation 
> and recordingNodesAllocation in ActivitiesManager. I think these variables 
> should be thread-local to keep activities separate among multiple threads.
>  2. Incomplete activities for the multi-node lookup mechanism, since 
> ActivitiesLogger skips recording through \{{if (node == null || 
> activitiesManager == null) }} when node is null, which indicates the 
> allocation is for multiple nodes. We need to support recording activities for 
> the multi-node lookup mechanism.
>  3. Current app activities cannot meet the requirements of diagnostics. For 
> example, we can see that a node doesn't match a request but it is hard to know 
> why, especially when using placement constraints; it's difficult to make a 
> detailed diagnosis manually. So I propose to improve the diagnoses in 
> activities: add a diagnosis for the placement-constraints check, update the 
> insufficient-resource diagnosis with detailed info (like 'insufficient 
> resource names:[memory-mb]'), and so on.
>  4. Add more useful fields to app activities. In some scenarios we need to 
> distinguish different requests but cannot locate them based on the app 
> activities info; some other fields, such as allocation tags, can help filter 
> what we want. We have added containerPriority, allocationRequestId 
> and allocationTags fields to AppAllocation.
>  5. Filter app activities by key fields. Sometimes the app activities output 
> is massive and it is hard to find what we want. We have added support for 
> filtering by allocation-tags to meet requirements from some apps; moreover, we can 
> take container-priority and allocation-request-id as candidates if necessary.
>  6. Aggregate app activities by diagnoses. For a single allocation process, 
> activities can still be massive in a large cluster, and we frequently want to 
> know why a request can't be allocated; it is hard to check every node 
> manually in a large cluster, so aggregating app activities by 
> diagnoses is necessary. We have added a groupingType 
> parameter to the app-activities REST API for this, which supports grouping by 
> diagnostics.
> I think we can have a discussion about these points; useful improvements that 
> are accepted will be added to the patch. Thanks.
> The running design doc is attached 
> [here|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.2jnaobmmfne5].
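As a rough illustration of point 1 above, here is a minimal sketch of keeping per-thread recording state with ThreadLocal so that concurrent scheduling threads do not mix their activities. The class and field names are assumptions made for the example, not the actual ActivitiesManager code.

{code:java}
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: per-thread activity buffers via ThreadLocal. */
public class ActivityRecorderSketch {

  // Each scheduling thread gets its own buffer of activity records.
  private final ThreadLocal<List<String>> perThreadActivities =
      ThreadLocal.withInitial(ArrayList::new);

  public void record(String activity) {
    perThreadActivities.get().add(activity);
  }

  /** Drain and clear the calling thread's buffer, e.g. when an allocation ends. */
  public List<String> finishAllocation() {
    List<String> snapshot = new ArrayList<>(perThreadActivities.get());
    perThreadActivities.get().clear();
    return snapshot;
  }
}
{code}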






[jira] [Commented] (YARN-9414) Application Catalog for YARN applications

2020-01-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010338#comment-17010338
 ] 

Brahma Reddy Battula commented on YARN-9414:


[~eyang], I am planning the 3.3.0 release. Most of these JIRAs are merged 
to 3.3.0. Can this feature GA without the remaining JIRAs? Could you please 
update?

> Application Catalog for YARN applications
> -
>
> Key: YARN-9414
> URL: https://issues.apache.org/jira/browse/YARN-9414
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN Appstore.pdf, YARN-Application-Catalog.pdf
>
>
> YARN native services provides a web services API to improve the usability of 
> application deployment on Hadoop using a collection of Docker images.  It would 
> be nice to have an application catalog system which provides an editorial and 
> search interface for YARN applications.  This improves the usability of YARN for 
> managing the life cycle of applications.  






[jira] [Commented] (YARN-8283) [Umbrella] MaWo - A Master Worker framework on top of YARN Services

2020-01-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010333#comment-17010333
 ] 

Brahma Reddy Battula commented on YARN-8283:


[~yeshavora], I am planning the 3.3.0 release. Two of these JIRAs are merged 
to 3.3.0. Are you planning the remaining JIRAs as well? Will this be GA without 
those JIRAs?

> [Umbrella] MaWo - A Master Worker framework on top of YARN Services
> ---
>
> Key: YARN-8283
> URL: https://issues.apache.org/jira/browse/YARN-8283
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Yesha Vora
>Assignee: Yesha Vora
>Priority: Major
> Attachments: [Design Doc] [YARN-8283] MaWo - A Master Worker 
> framework on top of YARN Services.pdf
>
>
> There is a need for an application / framework to handle Master-Worker 
> scenarios. There are existing frameworks on YARN which can be used to run a 
> job in distributed manner such as Mapreduce, Tez, Spark etc. But 
> master-worker use-cases usually are force-fed into one of these existing 
> frameworks which have been designed primarily around data-parallelism instead 
> of generic Master Worker type of computations.
> In this JIRA, we’d like to contribute MaWo - a YARN Service based framework 
> that achieves this goal. The overall goal is to create an app that can take 
> an input job specification with tasks, their durations and have a Master dish 
> the tasks off to a predetermined set of workers. The components will be 
> responsible for making sure that the tasks and the overall job finish in 
> specific time durations.
> We have been using a version of the MaWo framework for running unit tests of 
> Hadoop in a parallel manner on an existing Hadoop YARN cluster. What 
> typically takes 10 hours to run all of Hadoop project’s unit-tests can finish 
> under 20 minutes on a MaWo app of about 50 containers!
> YARN-3307 was an original attempt at this but through a first-class YARN app. 
> In this JIRA, we instead use YARN Service for orchestration so that our code 
> can focus on the core Master Worker paradigm.






[jira] [Commented] (YARN-8472) YARN Container Phase 2

2020-01-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010332#comment-17010332
 ] 

Brahma Reddy Battula commented on YARN-8472:


[~eyang], most of the JIRAs are fixed. Can we close this umbrella? As some 
of these JIRAs are only in 3.3.0, I am planning to include this in the 3.3.0 release plan.

> YARN Container Phase 2
> --
>
> Key: YARN-8472
> URL: https://issues.apache.org/jira/browse/YARN-8472
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>
> In YARN-3611, we have implemented basic Docker container support for YARN.  
> This story is the next phase to improve container usability.
> Several areas for improvement are:
>  # Software defined network support
>  # Interactive shell to container
>  # User management sss/nscd integration
>  # Runc/containerd support
>  # Metrics/Logs integration with Timeline service v2 
>  # Docker container profiles
>  # Docker cgroup management






[jira] [Commented] (YARN-9050) [Umbrella] Usability improvements for scheduler activities

2020-01-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010326#comment-17010326
 ] 

Brahma Reddy Battula commented on YARN-9050:


[~Tao Yang] It looks like only two JIRAs are pending for this feature. I am 
planning the 3.3.0 release and will put this feature in the list. Can you update 
the plan for this?

> [Umbrella] Usability improvements for scheduler activities
> --
>
> Key: YARN-9050
> URL: https://issues.apache.org/jira/browse/YARN-9050
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: image-2018-11-23-16-46-38-138.png
>
>
> We have made some usability improvements for scheduler activities, based on 
> YARN 3.1, in our cluster, as follows:
>  1. Not usable with multi-threaded asynchronous scheduling. App and node 
> activities may get mixed up when multiple scheduling threads record activities 
> of different allocation processes in the same variables, like appsAllocation 
> and recordingNodesAllocation in ActivitiesManager. I think these variables 
> should be thread-local to keep activities separate among multiple threads.
>  2. Incomplete activities for the multi-node lookup mechanism, since 
> ActivitiesLogger skips recording through \{{if (node == null || 
> activitiesManager == null) }} when node is null, which indicates the 
> allocation is for multiple nodes. We need to support recording activities for 
> the multi-node lookup mechanism.
>  3. Current app activities cannot meet the requirements of diagnostics. For 
> example, we can see that a node doesn't match a request but it is hard to know 
> why, especially when using placement constraints; it's difficult to make a 
> detailed diagnosis manually. So I propose to improve the diagnoses in 
> activities: add a diagnosis for the placement-constraints check, update the 
> insufficient-resource diagnosis with detailed info (like 'insufficient 
> resource names:[memory-mb]'), and so on.
>  4. Add more useful fields to app activities. In some scenarios we need to 
> distinguish different requests but cannot locate them based on the app 
> activities info; some other fields, such as allocation tags, can help filter 
> what we want. We have added containerPriority, allocationRequestId 
> and allocationTags fields to AppAllocation.
>  5. Filter app activities by key fields. Sometimes the app activities output 
> is massive and it is hard to find what we want. We have added support for 
> filtering by allocation-tags to meet requirements from some apps; moreover, we can 
> take container-priority and allocation-request-id as candidates if necessary.
>  6. Aggregate app activities by diagnoses. For a single allocation process, 
> activities can still be massive in a large cluster, and we frequently want to 
> know why a request can't be allocated; it is hard to check every node 
> manually in a large cluster, so aggregating app activities by 
> diagnoses is necessary. We have added a groupingType 
> parameter to the app-activities REST API for this, which supports grouping by 
> diagnostics.
> I think we can have a discussion about these points; useful improvements that 
> are accepted will be added to the patch. Thanks.
> The running design doc is attached 
> [here|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.2jnaobmmfne5].






[jira] [Commented] (YARN-9698) [Umbrella] Tools to help migration from Fair Scheduler to Capacity Scheduler

2020-01-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010315#comment-17010315
 ] 

Brahma Reddy Battula commented on YARN-9698:


[~cheersyang]

 I am planning the 3.3.0 release. This feature is marked for the 3.3.0 release; can 
you update the plan?

> [Umbrella] Tools to help migration from Fair Scheduler to Capacity Scheduler
> 
>
> Key: YARN-9698
> URL: https://issues.apache.org/jira/browse/YARN-9698
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Weiwei Yang
>Priority: Major
>  Labels: fs2cs
> Attachments: FS-CS Migration.pdf
>
>
> We see that some users want to migrate from Fair Scheduler to Capacity Scheduler. 
> This Jira is created as an umbrella to track all related efforts for the 
> migration; the scope contains:
>  * Bug fixes
>  * Adding missing features
>  * Migration tools that help to generate CS configs based on the FS configs, 
> validate configs, etc.
>  * Documentation
> This is part of the CS component; the purpose is to make the migration process 
> smooth.






[jira] [Commented] (YARN-5542) Scheduling of opportunistic containers

2020-01-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010313#comment-17010313
 ] 

Brahma Reddy Battula commented on YARN-5542:


[~kkaranasos], planning to release 3.3.0. Most of the subtasks are finished 
and merged to 3.3.0; can we close this umbrella?

> Scheduling of opportunistic containers
> --
>
> Key: YARN-5542
> URL: https://issues.apache.org/jira/browse/YARN-5542
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Konstantinos Karanasos
>Priority: Major
>
> This JIRA groups all efforts related to the scheduling of opportunistic 
> containers. 
> It includes the scheduling of opportunistic containers through the central RM 
> (YARN-5220) and through distributed scheduling (YARN-2877), as well as the 
> scheduling of containers based on actual node utilization (YARN-1011) and 
> container promotion/demotion (YARN-5085).






[jira] [Commented] (YARN-9014) runC container runtime

2020-01-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010311#comment-17010311
 ] 

Brahma Reddy Battula commented on YARN-9014:


[~ebadger], planning to release 3.3.0. It looks like some JIRAs are still pending 
on this feature. As some of the JIRAs are already in 3.3.0, do you still plan to 
work on the rest of them?

> runC container runtime
> --
>
> Key: YARN-9014
> URL: https://issues.apache.org/jira/browse/YARN-9014
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Jason Darrell Lowe
>Assignee: Eric Badger
>Priority: Major
>  Labels: Docker
> Attachments: OciSquashfsRuntime.v001.pdf, 
> RuncContainerRuntime.v002.pdf
>
>
> This JIRA tracks a YARN container runtime that supports running containers in 
> images built by Docker but the runtime does not use Docker directly, and 
> Docker does not have to be installed on the nodes.  The runtime leverages the 
> [OCI runtime standard|https://github.com/opencontainers/runtime-spec] to 
> launch containers, so an OCI-compliant runtime like {{runc}} is required.  
> {{runc}} has the benefit of not requiring a daemon like {{dockerd}} to be 
> running in order to launch/control containers.
> The layers comprising the Docker image are uploaded to HDFS as 
> [squashfs|http://tldp.org/HOWTO/SquashFS-HOWTO/whatis.html] images, enabling 
> the runtime to efficiently download and execute directly on the compressed 
> layers.  This saves image unpack time and space on the local disk.  The image 
> layers, like other entries in the YARN distributed cache, can be spread 
> across the YARN local disks, increasing the available space for storing 
> container images on each node.
> A design document will be posted shortly.






[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers

2020-01-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010301#comment-17010301
 ] 

Brahma Reddy Battula commented on YARN-1011:


[~haibochen], it looks like most of the JIRAs are closed. Is there any plan to 
merge to trunk? I am planning the 3.3.0 release, so please let me know.

> [Umbrella] Schedule containers based on utilization of currently allocated 
> containers
> -
>
> Key: YARN-1011
> URL: https://issues.apache.org/jira/browse/YARN-1011
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Arun Murthy
>Assignee: Karthik Kambatla
>Priority: Major
> Attachments: patch-for-yarn-1011.patch, yarn-1011-design-v0.pdf, 
> yarn-1011-design-v1.pdf, yarn-1011-design-v2.pdf, yarn-1011-design-v3.pdf
>
>
> Currently RM allocates containers and assumes resources allocated are 
> utilized.
> RM can, and should, get to a point where it measures utilization of allocated 
> containers and, if appropriate, allocates more (speculative?) containers.






[jira] [Commented] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-01-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010264#comment-17010264
 ] 

Hadoop QA commented on YARN-10063:
--

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 0s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 21m 1s | trunk passed |
| +1 | compile | 1m 2s | trunk passed |
| +1 | mvnsite | 0m 36s | trunk passed |
| +1 | shadedclient | 35m 23s | branch has no errors when building and testing our client artifacts. |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 33s | the patch passed |
| +1 | compile | 0m 53s | the patch passed |
| +1 | cc | 0m 53s | the patch passed |
| +1 | javac | 0m 53s | the patch passed |
| +1 | mvnsite | 0m 30s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 37s | patch has no errors when building and testing our client artifacts. |
|| || || || Other Tests ||
| +1 | unit | 21m 22s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 74m 22s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-10063 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12990148/YARN-10063.002.patch |
| Optional Tests | dupname asflicense compile cc mvnsite javac unit |
| uname | Linux 81905a5ae989 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a7fccc1 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25346/testReport/ |
| Max. process+thread count | 309 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25346/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Usage output of container-executor binary needs to include --http/--https 
> argument
> --
>
> Key: YARN-10063
> URL: https://issues.apache.org/jira/browse/YARN-10063
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
> Attachments: YARN-10063.001.patch, YARN-10063.002.patch
>
>
> YARN-8448/YARN-6586 seem to have introduced new options - "\--http" 
> (default) and "\--https" - that can be passed to the 
> 

[jira] [Comment Edited] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-01-07 Thread Siddharth Ahuja (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010225#comment-17010225
 ] 

Siddharth Ahuja edited comment on YARN-10063 at 1/8/20 1:04 AM:


Thanks [~pbacsko].

I have made the changes and uploaded the patch again that incorporates the 
changes discussed just above.

Usage output of both commands - LAUNCH_CONTAINER (1) and 
LAUNCH_DOCKER_CONTAINER (4) has been updated and it looks as per below:

{code}
[root@ bin]# pwd
/var/lib/yarn-ce/bin
[root@ bin]# ll
total 800
---Sr-s--- 1 root yarn 728960 Jan  7 16:37 container-executor
---Sr-s--- 1 root yarn  87168 Nov  8 08:04 container-executor.orig
[root@ bin]# ./container-executor

[root@sid-63-1 bin]# ./container-executor
Usage: container-executor --checksetup
   container-executor --mount-cgroups  ...
[DISABLED] container-executor --tc-modify-state 
[DISABLED] container-executor --tc-read-state 
[DISABLED] container-executor --tc-read-stats 
[DISABLED] container-executor --exec-container 
[DISABLED] container-executor --run-docker 
[DISABLED] container-executor --remove-docker-container [hierarchy] 

[DISABLED] container-executor --inspect-docker-container 
[DISABLED] container-executor --run-runc-container 
[DISABLED] container-executor --reap-runc-layer-mounts 
   container-executor
   where command and command-args: 
initialize container:   0 appid tokens nm-local-dirs nm-log-dirs 
cmd app...
launch container:   1 appid containerid workdir 
container-script tokens --http | --https keystorepath truststorepath pidfile 
nm-local-dirs nm-log-dirs resources 
 [DISABLED] launch docker container:   4 appid containerid workdir 
container-script tokens --http | --https keystorepath truststorepath pidfile 
nm-local-dirs nm-log-dirs docker-command-file resources 
signal container:   2 container-pid signal
delete as user: 3 relative-path
list as user:   5 relative-path
[DISABLED]  sync yarn sysfs:6 app-id nm-local-dirs
{code}

Thanks in advance again for your check [~pbacsko].


was (Author: sahuja):
Thanks [~pbacsko].

I have made the changes and uploaded the patch again that incorporates the 
changes discussed just above.

Usage output of both commands - LAUNCH_CONTAINER (1) and 
LAUNCH_DOCKER_CONTAINER (4) has been updated and it looks as per below:

{code}
[root@ bin]# pwd
/var/lib/yarn-ce/bin
[root@ bin]# ll
total 800
---Sr-s--- 1 root yarn 728960 Jan  7 16:37 container-executor
---Sr-s--- 1 root yarn  87168 Nov  8 08:04 container-executor.orig
[root@ bin]# ./container-executor

[root@sid-63-1 bin]# ./container-executor
Usage: container-executor --checksetup
   container-executor --mount-cgroups  ...
[DISABLED] container-executor --tc-modify-state 
[DISABLED] container-executor --tc-read-state 
[DISABLED] container-executor --tc-read-stats 
[DISABLED] container-executor --exec-container 
[DISABLED] container-executor --run-docker 
[DISABLED] container-executor --remove-docker-container [hierarchy] 

[DISABLED] container-executor --inspect-docker-container 
[DISABLED] container-executor --run-runc-container 
[DISABLED] container-executor --reap-runc-layer-mounts 
   container-executor
   where command and command-args: 
initialize container:   0 appid tokens nm-local-dirs nm-log-dirs 
cmd app...
launch container:   1 appid containerid workdir 
container-script tokens --http | --https keystorepath truststorepath pidfile 
nm-local-dirs nm-log-dirs resources 
 [DISABLED] launch docker container:   4 appid containerid workdir 
container-script tokens --http | --https keystorepath truststorepath pidfile 
nm-local-dirs nm-log-dirs docker-command-file resources 
signal container:   2 container-pid signal
delete as user: 3 relative-path
list as user:   5 relative-path
[DISABLED]  sync yarn sysfs:6 app-id nm-local-dirs
{code}

> Usage output of container-executor binary needs to include --http/--https 
> argument
> --
>
> Key: YARN-10063
> URL: https://issues.apache.org/jira/browse/YARN-10063
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
> Attachments: YARN-10063.001.patch, YARN-10063.002.patch
>
>
> YARN-8448/YARN-6586 seem to have introduced new options - "\--http" 
> (default) and "\--https" - that can be passed to the 
> container-executor binary, see:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L564
> and 
> 

[jira] [Comment Edited] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-01-07 Thread Siddharth Ahuja (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010225#comment-17010225
 ] 

Siddharth Ahuja edited comment on YARN-10063 at 1/8/20 1:02 AM:


Thanks [~pbacsko].

I have made the changes and uploaded the patch again that incorporates the 
changes discussed just above.

Usage output of both commands - LAUNCH_CONTAINER (1) and 
LAUNCH_DOCKER_CONTAINER (4) has been updated and it looks as per below:

{code}
[root@ bin]# pwd
/var/lib/yarn-ce/bin
[root@ bin]# ll
total 800
---Sr-s--- 1 root yarn 728960 Jan  7 16:37 container-executor
---Sr-s--- 1 root yarn  87168 Nov  8 08:04 container-executor.orig
[root@ bin]# ./container-executor

[root@sid-63-1 bin]# ./container-executor
Usage: container-executor --checksetup
   container-executor --mount-cgroups  ...
[DISABLED] container-executor --tc-modify-state 
[DISABLED] container-executor --tc-read-state 
[DISABLED] container-executor --tc-read-stats 
[DISABLED] container-executor --exec-container 
[DISABLED] container-executor --run-docker 
[DISABLED] container-executor --remove-docker-container [hierarchy] 

[DISABLED] container-executor --inspect-docker-container 
[DISABLED] container-executor --run-runc-container 
[DISABLED] container-executor --reap-runc-layer-mounts 
   container-executor
   where command and command-args: 
initialize container:   0 appid tokens nm-local-dirs nm-log-dirs 
cmd app...
launch container:   1 appid containerid workdir 
container-script tokens --http | --https keystorepath truststorepath pidfile 
nm-local-dirs nm-log-dirs resources 
 [DISABLED] launch docker container:   4 appid containerid workdir 
container-script tokens --http | --https keystorepath truststorepath pidfile 
nm-local-dirs nm-log-dirs docker-command-file resources 
signal container:   2 container-pid signal
delete as user: 3 relative-path
list as user:   5 relative-path
[DISABLED]  sync yarn sysfs:6 app-id nm-local-dirs
{code}


was (Author: sahuja):
Thanks [~pbacsko].

I have made the changes and uploaded the patch again that incorporates the 
changes discussed just above.

Usage output of both commands - LAUNCH_CONTAINER (1) and LAUNCH_DOCKER_CONTAINER 
(4) - has been updated to include the --http or --https flag with details.

The usage output from the container-executor binary now looks as per below:

{code}
[root@ bin]# pwd
/var/lib/yarn-ce/bin
[root@ bin]# ll
total 800
---Sr-s--- 1 root yarn 728960 Jan  7 16:37 container-executor
---Sr-s--- 1 root yarn  87168 Nov  8 08:04 container-executor.orig
[root@ bin]# ./container-executor

[root@sid-63-1 bin]# ./container-executor
Usage: container-executor --checksetup
   container-executor --mount-cgroups  ...
[DISABLED] container-executor --tc-modify-state 
[DISABLED] container-executor --tc-read-state 
[DISABLED] container-executor --tc-read-stats 
[DISABLED] container-executor --exec-container 
[DISABLED] container-executor --run-docker 
[DISABLED] container-executor --remove-docker-container [hierarchy] 

[DISABLED] container-executor --inspect-docker-container 
[DISABLED] container-executor --run-runc-container 
[DISABLED] container-executor --reap-runc-layer-mounts 
   container-executor
   where command and command-args: 
initialize container:   0 appid tokens nm-local-dirs nm-log-dirs 
cmd app...
launch container:   1 appid containerid workdir 
container-script tokens --http | --https keystorepath truststorepath pidfile 
nm-local-dirs nm-log-dirs resources 
 [DISABLED] launch docker container:   4 appid containerid workdir 
container-script tokens --http | --https keystorepath truststorepath pidfile 
nm-local-dirs nm-log-dirs docker-command-file resources 
signal container:   2 container-pid signal
delete as user: 3 relative-path
list as user:   5 relative-path
[DISABLED]  sync yarn sysfs:6 app-id nm-local-dirs
{code}

> Usage output of container-executor binary needs to include --http/--https 
> argument
> --
>
> Key: YARN-10063
> URL: https://issues.apache.org/jira/browse/YARN-10063
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
> Attachments: YARN-10063.001.patch, YARN-10063.002.patch
>
>
> YARN-8448/YARN-6586 seem to have introduced new options - "\--http" 
> (default) and "\--https" - that can be passed to the 
> container-executor binary, see:
> 

[jira] [Commented] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-01-07 Thread Siddharth Ahuja (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010225#comment-17010225
 ] 

Siddharth Ahuja commented on YARN-10063:


Thanks [~pbacsko].

I have made the changes and uploaded the patch again that incorporates the 
changes discussed just above.

Usage output of both commands - LAUNCH_CONTAINER (1) and LAUNCH_DOCKER_CONTAINER 
(4) - has been updated to include the --http or --https flag with details.

The usage output from the container-executor binary now looks as per below:

{code}
[root@ bin]# pwd
/var/lib/yarn-ce/bin
[root@ bin]# ll
total 800
---Sr-s--- 1 root yarn 728960 Jan  7 16:37 container-executor
---Sr-s--- 1 root yarn  87168 Nov  8 08:04 container-executor.orig
[root@ bin]# ./container-executor

[root@sid-63-1 bin]# ./container-executor
Usage: container-executor --checksetup
   container-executor --mount-cgroups  ...
[DISABLED] container-executor --tc-modify-state 
[DISABLED] container-executor --tc-read-state 
[DISABLED] container-executor --tc-read-stats 
[DISABLED] container-executor --exec-container 
[DISABLED] container-executor --run-docker 
[DISABLED] container-executor --remove-docker-container [hierarchy] 

[DISABLED] container-executor --inspect-docker-container 
[DISABLED] container-executor --run-runc-container 
[DISABLED] container-executor --reap-runc-layer-mounts 
   container-executor
   where command and command-args: 
initialize container:   0 appid tokens nm-local-dirs nm-log-dirs 
cmd app...
launch container:   1 appid containerid workdir 
container-script tokens --http | --https keystorepath truststorepath pidfile 
nm-local-dirs nm-log-dirs resources 
 [DISABLED] launch docker container:   4 appid containerid workdir 
container-script tokens --http | --https keystorepath truststorepath pidfile 
nm-local-dirs nm-log-dirs docker-command-file resources 
signal container:   2 container-pid signal
delete as user: 3 relative-path
list as user:   5 relative-path
[DISABLED]  sync yarn sysfs:6 app-id nm-local-dirs
{code}

> Usage output of container-executor binary needs to include --http/--https 
> argument
> --
>
> Key: YARN-10063
> URL: https://issues.apache.org/jira/browse/YARN-10063
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
> Attachments: YARN-10063.001.patch, YARN-10063.002.patch
>
>
> YARN-8448/YARN-6586 seem to have introduced new options - "\--http" 
> (default) and "\--https" - that can be passed to the 
> container-executor binary, see:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L564
> and 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L521
> however, the usage output seems to have missed this:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L74
> Raising this jira to improve this.






[jira] [Updated] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-01-07 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja updated YARN-10063:
---
Attachment: YARN-10063.002.patch

> Usage output of container-executor binary needs to include --http/--https 
> argument
> --
>
> Key: YARN-10063
> URL: https://issues.apache.org/jira/browse/YARN-10063
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
> Attachments: YARN-10063.001.patch, YARN-10063.002.patch
>
>
> YARN-8448/YARN-6586 seem to have introduced new options - "\--http" 
> (default) and "\--https" - that can be passed to the 
> container-executor binary, see:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L564
> and 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L521
> however, the usage output seems to have missed this:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L74
> Raising this jira to improve this.






[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010191#comment-17010191
 ] 

Hadoop QA commented on YARN-8672:
-

| (/) +1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 42s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
|| || || || branch-3.2 Compile Tests ||
| +1 | mvninstall | 20m 1s | branch-3.2 passed |
| +1 | compile | 0m 59s | branch-3.2 passed |
| +1 | checkstyle | 0m 29s | branch-3.2 passed |
| +1 | mvnsite | 0m 33s | branch-3.2 passed |
| +1 | shadedclient | 13m 32s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 49s | branch-3.2 passed |
| +1 | javadoc | 0m 24s | branch-3.2 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 34s | the patch passed |
| +1 | compile | 0m 51s | the patch passed |
| +1 | javac | 0m 51s | the patch passed |
| -0 | checkstyle | 0m 24s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 428 unchanged - 1 fixed = 429 total (was 429) |
| +1 | mvnsite | 0m 29s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 14s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 57s | the patch passed |
| +1 | javadoc | 0m 21s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 19m 22s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 26s | The patch does not generate ASF License warnings. |
| | | 74m 21s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:0f25cbbb251 |
| JIRA Issue | YARN-8672 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12990139/YARN-8672-branch-3.2.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 23c84af7ea68 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.2 / 250cd9f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/25345/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25345/testReport/ |
| Max. process+thread count | 308 (vs. ulimit of 5500) |
| modules | C: 

[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CS

2020-01-07 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010190#comment-17010190
 ] 

Wilfred Spiegelenburg commented on YARN-9879:
-

Thank you [~leftnoteasy] for the comments.
{quote}And once an application is submitted to CS, internally we should make 
sure we use the queue path instead of the queue name in all other places. 
Otherwise we will complicate other logic.
{quote}
I agree, that is what I had in mind too: keep it as simple as possible inside 
the scheduler, which means using just the full path internally.

For the configuration change: I do not think it is a problem and we can just 
accept the change. To be fair to the administrator we should show a message 
when the configuration is loaded or changed and the leaf queues are no longer 
unique. However, that is probably as far as we need to go.
{quote}Instead of using scheduler.getQueue, we may need to consider adding a 
method like getAppSubmissionQueue() to get a queue based on its path or name, 
and after that put the normalized queue_path back into the application's 
submission context to make sure that, in the future, inside the scheduler we 
always refer to the queue path.
{quote}
The FS already does something like this because it uses a placement rule in 
all cases. We should leverage a similar mechanism in the CS: we pass the queue 
from the submission into queue placement, which handles both the full path and 
the short name. In both cases it simply passes back the queue object, which 
will be using the full path. If the queue is not found, or the queue name is 
not unique, it fails as per normal. The returned queue info is then updated in 
the app and the submission context.
 This is far simpler than putting the burden on the core scheduler; it is all 
hidden in the placement of the app into the queue via the placement engine.
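
As a rough illustration only (none of the class or method names below exist in 
the CS code; they are made up for this sketch), such a placement-style lookup 
could accept a full path as-is, resolve a unique short leaf name to its path, 
and fail on unknown or ambiguous names:
{code:java}
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these names are not actual CapacityScheduler APIs.
public final class QueuePlacementSketch {

  /**
   * Resolve the queue string from an app submission to a fully qualified path.
   * Accepts either a full path ("root.a.b") or a short leaf name ("b") and
   * fails when the name is unknown or no longer unique.
   */
  static String resolveToFullPath(String submitted,
      Map<String, List<String>> leafNameToPaths) {
    if (submitted.contains(".")) {
      return submitted;                       // already a full path
    }
    List<String> paths = leafNameToPaths.get(submitted);
    if (paths == null || paths.isEmpty()) {
      throw new IllegalArgumentException("unknown queue: " + submitted);
    }
    if (paths.size() > 1) {
      throw new IllegalArgumentException(
          "queue name '" + submitted + "' is ambiguous, use one of " + paths);
    }
    return paths.get(0);                      // unique leaf name -> full path
  }

  public static void main(String[] args) {
    Map<String, List<String>> index =
        Map.of("b", List.of("root.a.b"),
               "c", List.of("root.a.c", "root.x.c"));
    System.out.println(resolveToFullPath("b", index));        // root.a.b
    System.out.println(resolveToFullPath("root.x.c", index)); // root.x.c
    // resolveToFullPath("c", index) would throw: ambiguous leaf name
  }
}
{code}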

I did not mention queue mapping in my design because I thought queue mapping 
itself did not need to change. We already calculate the parent queue in the 
rules, if I am correct, so the only change would be the return value. Since we 
do all internal handling of queues with the full queue path, it is a logical 
change. Using the placement rule for the qualified or non-qualified mapping 
does require some changes in that area.

I might have forgotten to mention other bits and pieces, like the CLI or 
flow-on effects on the UI, but those need to be assessed once we have a design 
we agree on. There will be more jiras needed to fix separate parts when the 
change is made to the core.

> Allow multiple leaf queues with the same name in CS
> ---
>
> Key: YARN-9879
> URL: https://issues.apache.org/jira/browse/YARN-9879
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: DesignDoc_v1.pdf
>
>
> Currently the leaf queue's name must be unique regardless of its position in 
> the queue hierarchy. 
> Design doc and first proposal is being made, I'll attach it as soon as it's 
> done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010147#comment-17010147
 ] 

Eric Payne commented on YARN-10072:
---

+1. I'll commit tomorrow and port it back through all branches to branch-2.10.

> TestCSAllocateCustomResource failures
> -
>
> Key: YARN-10072
> URL: https://issues.apache.org/jira/browse/YARN-10072
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>  Labels: YARN
> Attachments: YARN-10072.001.patch, YARN-10072.002.patch
>
>
> This test is failing for us consistently in our internal 2.10 based branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010143#comment-17010143
 ] 

Jim Brennan commented on YARN-8672:
---

[~ebadger] I have uploaded a patch for branch-3.2.

> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally 
> times out
> -
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Jason Darrell Lowe
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8672-branch-2.10.001.patch, 
> YARN-8672-branch-2.10.002.patch, YARN-8672-branch-2.10.003.patch, 
> YARN-8672-branch-3.2.001.patch, YARN-8672.001.patch, YARN-8672.002.patch, 
> YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch, 
> YARN-8672.006.patch, YARN-8672.007.patch, YARN-8672.008.patch
>
>
> Precommit builds have been failing in 
> TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been 
> able to reproduce the problem without any patch applied if I run the test 
> enough times.  It looks like something is removing container tokens from the 
> nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated YARN-8672:
--
Attachment: YARN-8672-branch-3.2.001.patch

> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally 
> times out
> -
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Jason Darrell Lowe
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8672-branch-2.10.001.patch, 
> YARN-8672-branch-2.10.002.patch, YARN-8672-branch-2.10.003.patch, 
> YARN-8672-branch-3.2.001.patch, YARN-8672.001.patch, YARN-8672.002.patch, 
> YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch, 
> YARN-8672.006.patch, YARN-8672.007.patch, YARN-8672.008.patch
>
>
> Precommit builds have been failing in 
> TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been 
> able to reproduce the problem without any patch applied if I run the test 
> enough times.  It looks like something is removing container tokens from the 
> nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10073) Intraqueue preemption doesn't work across partitions

2020-01-07 Thread Paul Jones (Jira)
Paul Jones created YARN-10073:
-

 Summary: Intraqueue preemption doesn't work across partitions
 Key: YARN-10073
 URL: https://issues.apache.org/jira/browse/YARN-10073
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacity scheduler, capacityscheduler, scheduler 
preemption
Affects Versions: 2.8.5
Reporter: Paul Jones


Cluster:

1 Node with label "A"

yarn.scheduler.capacity.root.accessible-node-labels=*

yarn.resourcemanager.monitor.capacity.preemption.intra-queue-preemption.enabled=true

yarn.scheduler.capacity.root.default.minimum-user-limit-percent=50

 

User 1: Submit job Y, requiring 10x the cluster's resources, to queue "default" 
using label ""

User 2: (after job Y starts) submit job Z to queue "default" using label ""

 

What we see: Job Z doesn't start until job Y releases resources. This happens 
because the pending requests for jobs Y and Z are in partition "". However, 
queue "default" is using resources in partition "A", and pending requests in 
partition "" don't trigger intra-queue preemption in partition "A".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10004) Javadoc of YarnConfigurationStore#initialize is not straightforward

2020-01-07 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010136#comment-17010136
 ] 

Eric Payne commented on YARN-10004:
---

[~snemeth], can you please be more specific about what is wrong with the 
JavaDoc of YarnConfigurationStore#initialize?

> Javadoc of YarnConfigurationStore#initialize is not straightforward
> ---
>
> Key: YARN-10004
> URL: https://issues.apache.org/jira/browse/YARN-10004
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Siddharth Ahuja
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010120#comment-17010120
 ] 

Hadoop QA commented on YARN-8672:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.10 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
38s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} branch-2.10 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} branch-2.10 passed with JDK v1.8.0_232 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} branch-2.10 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} branch-2.10 passed with JDK v1.8.0_232 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed with JDK v1.8.0_232 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 24s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 1 new + 533 unchanged - 2 fixed = 534 total (was 535) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed with JDK v1.8.0_232 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 
35s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
46s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:a969cad0a12 |
| JIRA Issue | YARN-8672 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990132/YARN-8672-branch-2.10.003.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c67b8478c954 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2.10 / 82bc477 |
| maven | version: Apache Maven 3.3.9 |
| 

[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010117#comment-17010117
 ] 

Jim Brennan commented on YARN-8672:
---

Thanks [~ebadger] I will put up a patch for branch-3.2.

 

> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally 
> times out
> -
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Jason Darrell Lowe
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8672-branch-2.10.001.patch, 
> YARN-8672-branch-2.10.002.patch, YARN-8672-branch-2.10.003.patch, 
> YARN-8672.001.patch, YARN-8672.002.patch, YARN-8672.003.patch, 
> YARN-8672.004.patch, YARN-8672.005.patch, YARN-8672.006.patch, 
> YARN-8672.007.patch, YARN-8672.008.patch
>
>
> Precommit builds have been failing in 
> TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been 
> able to reproduce the problem without any patch applied if I run the test 
> enough times.  It looks like something is removing container tokens from the 
> nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010115#comment-17010115
 ] 

Eric Badger commented on YARN-8672:
---

Thanks for the patch [~Jim_Brennan]! +1 on the branch-2.10 patch. 

Before I commit it, could you put up a patch for branch-3.2? The trunk patch 
doesn't apply cleanly and there are enough differences that I'm not comfortable 
fixing all of them without a patch running against hadoopQA.

> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally 
> times out
> -
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Jason Darrell Lowe
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8672-branch-2.10.001.patch, 
> YARN-8672-branch-2.10.002.patch, YARN-8672-branch-2.10.003.patch, 
> YARN-8672.001.patch, YARN-8672.002.patch, YARN-8672.003.patch, 
> YARN-8672.004.patch, YARN-8672.005.patch, YARN-8672.006.patch, 
> YARN-8672.007.patch, YARN-8672.008.patch
>
>
> Precommit builds have been failing in 
> TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been 
> able to reproduce the problem without any patch applied if I run the test 
> enough times.  It looks like something is removing container tokens from the 
> nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010112#comment-17010112
 ] 

Jim Brennan commented on YARN-10072:


The unit test failure in TestCapacityScheduler is unrelated to this change:
{noformat}

[ERROR]   TestCapacityScheduler.testResourceOverCommit:1467 Too long: 2412ms
 {noformat}

> TestCSAllocateCustomResource failures
> -
>
> Key: YARN-10072
> URL: https://issues.apache.org/jira/browse/YARN-10072
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>  Labels: YARN
> Attachments: YARN-10072.001.patch, YARN-10072.002.patch
>
>
> This test is failing for us consistently in our internal 2.10 based branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010111#comment-17010111
 ] 

Hadoop QA commented on YARN-10072:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 15s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m  6s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-10072 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990123/YARN-10072.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux aba3cfcb2ada 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d1f5976 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/25342/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25342/testReport/ |
| Max. process+thread count | 834 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Commented] (YARN-7387) org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer fails intermittently

2020-01-07 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010086#comment-17010086
 ] 

Jim Brennan commented on YARN-7387:
---

[~ebadger] or [~epayne] can you please review this one?

 

> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
>  fails intermittently
> ---
>
> Key: YARN-7387
> URL: https://issues.apache.org/jira/browse/YARN-7387
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-7387.001.patch
>
>
> {code}
> Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 52.481 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
> testDecreaseAfterIncreaseWithAllocationExpiration(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer)
>   Time elapsed: 13.292 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<3072> but was:<4096>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer.testDecreaseAfterIncreaseWithAllocationExpiration(TestIncreaseAllocationExpirer.java:459)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010084#comment-17010084
 ] 

Jim Brennan commented on YARN-8672:
---

Patch 003 fixes two of the three checkstyle issues.  The last one is:
{noformat}
./hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ContainerLocalizer.java:409:
  public static void buildMainArgs(List command,:22: More than 7 
parameters (found 8). {noformat}
This matches the trunk version.

> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally 
> times out
> -
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Jason Darrell Lowe
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8672-branch-2.10.001.patch, 
> YARN-8672-branch-2.10.002.patch, YARN-8672-branch-2.10.003.patch, 
> YARN-8672.001.patch, YARN-8672.002.patch, YARN-8672.003.patch, 
> YARN-8672.004.patch, YARN-8672.005.patch, YARN-8672.006.patch, 
> YARN-8672.007.patch, YARN-8672.008.patch
>
>
> Precommit builds have been failing in 
> TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been 
> able to reproduce the problem without any patch applied if I run the test 
> enough times.  It looks like something is removing container tokens from the 
> nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated YARN-8672:
--
Attachment: YARN-8672-branch-2.10.003.patch

> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally 
> times out
> -
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Jason Darrell Lowe
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8672-branch-2.10.001.patch, 
> YARN-8672-branch-2.10.002.patch, YARN-8672-branch-2.10.003.patch, 
> YARN-8672.001.patch, YARN-8672.002.patch, YARN-8672.003.patch, 
> YARN-8672.004.patch, YARN-8672.005.patch, YARN-8672.006.patch, 
> YARN-8672.007.patch, YARN-8672.008.patch
>
>
> Precommit builds have been failing in 
> TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been 
> able to reproduce the problem without any patch applied if I run the test 
> enough times.  It looks like something is removing container tokens from the 
> nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010064#comment-17010064
 ] 

Hadoop QA commented on YARN-8672:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
45s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.10 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
 1s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} branch-2.10 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} branch-2.10 passed with JDK v1.8.0_232 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
56s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} branch-2.10 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} branch-2.10 passed with JDK v1.8.0_232 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed with JDK v1.8.0_232 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 23s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 3 new + 534 unchanged - 2 fixed = 537 total (was 536) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed with JDK v1.8.0_232 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 
55s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 40m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:a969cad0a12 |
| JIRA Issue | YARN-8672 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990125/YARN-8672-branch-2.10.002.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 6f74a7f96f20 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2.10 / 82bc477 |
| maven | version: Apache Maven 3.3.9 |
| 

[jira] [Updated] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated YARN-8672:
--
Attachment: YARN-8672-branch-2.10.002.patch

> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally 
> times out
> -
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Jason Darrell Lowe
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8672-branch-2.10.001.patch, 
> YARN-8672-branch-2.10.002.patch, YARN-8672.001.patch, YARN-8672.002.patch, 
> YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch, 
> YARN-8672.006.patch, YARN-8672.007.patch, YARN-8672.008.patch
>
>
> Precommit builds have been failing in 
> TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been 
> able to reproduce the problem without any patch applied if I run the test 
> enough times.  It looks like something is removing container tokens from the 
> nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010008#comment-17010008
 ] 

Jim Brennan commented on YARN-8672:
---

Looks like I missed a change to DockerContainerExecutor.  I will fix.

 

> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally 
> times out
> -
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Jason Darrell Lowe
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8672-branch-2.10.001.patch, YARN-8672.001.patch, 
> YARN-8672.002.patch, YARN-8672.003.patch, YARN-8672.004.patch, 
> YARN-8672.005.patch, YARN-8672.006.patch, YARN-8672.007.patch, 
> YARN-8672.008.patch
>
>
> Precommit builds have been failing in 
> TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been 
> able to reproduce the problem without any patch applied if I run the test 
> enough times.  It looks like something is removing container tokens from the 
> nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated YARN-10072:
---
Attachment: YARN-10072.002.patch

> TestCSAllocateCustomResource failures
> -
>
> Key: YARN-10072
> URL: https://issues.apache.org/jira/browse/YARN-10072
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>  Labels: YARN
> Attachments: YARN-10072.001.patch, YARN-10072.002.patch
>
>
> This test is failing for us consistently in our internal 2.10 based branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009990#comment-17009990
 ] 

Jim Brennan commented on YARN-10072:


Thanks [~epayne]!  I will put up a new patch to fix those.

 

> TestCSAllocateCustomResource failures
> -
>
> Key: YARN-10072
> URL: https://issues.apache.org/jira/browse/YARN-10072
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>  Labels: YARN
> Attachments: YARN-10072.001.patch
>
>
> This test is failing for us consistently in our internal 2.10 based branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009985#comment-17009985
 ] 

Hadoop QA commented on YARN-8672:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m  
8s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.10 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
20s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} branch-2.10 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} branch-2.10 passed with JDK v1.8.0_232 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} branch-2.10 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} branch-2.10 passed with JDK v1.8.0_232 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed 
with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 23s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed 
with JDK v1.8.0_232. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 19s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_232. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 22s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 1 new + 480 unchanged - 0 fixed = 481 total (was 480) 
{color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
16s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed with JDK v1.8.0_232 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 20s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 38m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:a969cad0a12 |
| JIRA Issue | YARN-8672 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990120/YARN-8672-branch-2.10.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 9a779395d800 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 

[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CS

2020-01-07 Thread Wangda Tan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009975#comment-17009975
 ] 

Wangda Tan commented on YARN-9879:
--

[~pbacsko], thanks for working on the design. 

In general, I agree with what [~wilfreds] mentioned: we should try to avoid 
changing RPC protocols and instead just change the internal logic to make sure 
multiple queues can be handled.

To me there are two major parts:

1) Whatever logic inside CS allows multiple queues with the same name. Either 
solution mentioned in the comment 
https://issues.apache.org/jira/browse/YARN-9879?focusedCommentId=17009845=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17009845
 should be fine. And I expect the lookup by queue name (not queue path) should 
only happen when an application is submitted.

And once an application is submitted to CS, internally we should make sure we 
use the queue path instead of the queue name in all other places. Otherwise we 
will complicate other logic.

2) When an app is submitted, the scheduler is going to accept/reject it based 
on the uniqueness of the queue name or path specified. The core part that needs 
to be changed is inside RMAppManager:
{code:java}
 if (!isRecovery && YarnConfiguration.isAclEnabled(conf)) {
  if (scheduler instanceof CapacityScheduler) {
String queueName = submissionContext.getQueue();
String appName = submissionContext.getApplicationName();
CSQueue csqueue = ((CapacityScheduler) scheduler).getQueue(queueName);{code}
Instead of using scheduler.getQueue, we may need to consider adding a method 
like getAppSubmissionQueue() to get a queue based on its path or name, and 
after that put the normalized queue_path back into the application's 
submission context to make sure that, in the future, inside the scheduler we 
always refer to the queue path.
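
To illustrate the shape this could take (a sketch only: Queue, Scheduler and 
SubmissionContext below are simplified stand-ins for the real YARN classes, 
and getAppSubmissionQueue() is the proposed method, not an existing API):
{code:java}
// Hypothetical sketch of the submission-time normalization described above.
public final class SubmissionNormalizationSketch {

  interface Queue { String getQueuePath(); }

  interface Scheduler {
    // Resolves a short leaf name or a full path; null if unknown or ambiguous.
    Queue getAppSubmissionQueue(String nameOrPath);
  }

  static final class SubmissionContext {
    private String queue;
    SubmissionContext(String queue) { this.queue = queue; }
    String getQueue() { return queue; }
    void setQueue(String queue) { this.queue = queue; }
  }

  /** Reject unknown/ambiguous queues, otherwise normalize to the full path. */
  static void normalize(Scheduler scheduler, SubmissionContext ctx) {
    Queue q = scheduler.getAppSubmissionQueue(ctx.getQueue());
    if (q == null) {
      throw new IllegalArgumentException(
          "Queue '" + ctx.getQueue() + "' does not exist or is not unique");
    }
    // From here on, everything inside the scheduler only sees the full path.
    ctx.setQueue(q.getQueuePath());
  }

  public static void main(String[] args) {
    Scheduler s = name -> {
      if ("b".equals(name) || "root.a.b".equals(name)) {
        return () -> "root.a.b";
      }
      return null;
    };
    SubmissionContext ctx = new SubmissionContext("b");
    normalize(s, ctx);
    System.out.println(ctx.getQueue());  // prints root.a.b
  }
}
{code}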

For the comment from [~wilfreds]: 
{quote}The important part is applying a new configuration. If the configuration 
adds a leaf queue that is not unique the configuration update currently is 
rejected. With this change we would allow that config to become active. This 
*could* break existing applications when they try to submit to the leaf queue 
that is no longer unique.
{quote}
I personally think it is not a big deal if the application rejection reasons 
from the RM clearly guide users to use the fully qualified queue path when 
duplicate queue names exist. It is like a team that has only one Peter: we can 
use the first name only, otherwise we add the last name to avoid confusion. It 
isn't counter-intuitive to me.

Also, we need to handle queue mapping based on the queue path instead of the 
queue name; I didn't see this in the design doc, or perhaps I missed it.

> Allow multiple leaf queues with the same name in CS
> ---
>
> Key: YARN-9879
> URL: https://issues.apache.org/jira/browse/YARN-9879
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: DesignDoc_v1.pdf
>
>
> Currently the leaf queue's name must be unique regardless of its position in 
> the queue hierarchy. 
> Design doc and first proposal is being made, I'll attach it as soon as it's 
> done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009944#comment-17009944
 ] 

Eric Yang commented on YARN-8672:
-

[~Jim_Brennan] No objection to the backport.  [~ebadger] Could you shepherd the 
process if the precommit build passes?  Thanks

> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally 
> times out
> -
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Jason Darrell Lowe
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8672-branch-2.10.001.patch, YARN-8672.001.patch, 
> YARN-8672.002.patch, YARN-8672.003.patch, YARN-8672.004.patch, 
> YARN-8672.005.patch, YARN-8672.006.patch, YARN-8672.007.patch, 
> YARN-8672.008.patch
>
>
> Precommit builds have been failing in 
> TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been 
> able to reproduce the problem without any patch applied if I run the test 
> enough times.  It looks like something is removing container tokens from the 
> nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009938#comment-17009938
 ] 

Eric Payne commented on YARN-10072:
---

[~Jim_Brennan], the new CheckStyle warnings are due to the now-unused imports 
in TestCSAllocateCustomResource. Is there any reason not to remove those?

> TestCSAllocateCustomResource failures
> -
>
> Key: YARN-10072
> URL: https://issues.apache.org/jira/browse/YARN-10072
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>  Labels: YARN
> Attachments: YARN-10072.001.patch
>
>
> This test is failing for us consistently in our internal 2.10 based branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009936#comment-17009936
 ] 

Hadoop QA commented on YARN-10072:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 33s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 31s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 6 new + 0 unchanged - 0 fixed = 6 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 47s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 85m 
29s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-10072 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990108/YARN-10072.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 0fd08927ccca 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bc366d4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/25340/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25340/testReport/ |
| Max. process+thread count | 818 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 

[jira] [Reopened] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2020-01-07 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan reopened YARN-8672:
---

[~csingh], [~eyang] we are seeing these failures in branch-2.10.  Any objection 
to pulling these changes back to branch-2.10?   I will provide a patch.

> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally 
> times out
> -
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: Jason Darrell Lowe
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8672.001.patch, YARN-8672.002.patch, 
> YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch, 
> YARN-8672.006.patch, YARN-8672.007.patch, YARN-8672.008.patch
>
>
> Precommit builds have been failing in 
> TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been 
> able to reproduce the problem without any patch applied if I run the test 
> enough times.  It looks like something is removing container tokens from the 
> nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10068) TimelineV2Client may leak file descriptors creating ClientResponse objects.

2020-01-07 Thread Anand Srinivasan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009895#comment-17009895
 ] 

Anand Srinivasan commented on YARN-10068:
-

Hi Adam Antal,

Thanks for the review and feedback. Very much appreciated.

Kind regards.

> TimelineV2Client may leak file descriptors creating ClientResponse objects.
> ---
>
> Key: YARN-10068
> URL: https://issues.apache.org/jira/browse/YARN-10068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0
> Environment: HDP VERSION3.1.4
> AMBARI VERSION2.7.4.0
>Reporter: Anand Srinivasan
>Assignee: Anand Srinivasan
>Priority: Critical
> Attachments: YARN-10068.001.patch, YARN-10068.002.patch, 
> YARN-10068.003.patch, image-2020-01-02-14-58-12-773.png
>
>
> Hi team,
> Code-walkthrough between v1 and v2 of TimelineClient API revealed that v2 API 
> TimelineV2ClientImpl#putObjects doesn't close ClientResponse objects under 
> success status returned from Timeline Server. ClientResponse is closed only 
> under erroneous response from the server using ClientResponse#getEntity.
> We also noticed that TimelineClient (v1) closes the ClientResponse object in 
> TimelineWriter#putEntities by calling ClientResponse#getEntity in both 
> success and error conditions from the server thereby avoiding this file 
> descriptor leak.
> Customer's original issue and the symptom was that the NodeManager went down 
> because of 'too many files open' condition where there were lots of 
> CLOSED_WAIT sockets observed between the timeline client (from NM) and the 
> timeline server hosts. 
> Could you please help resolve this issue ? Thanks.
>  
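
For illustration, the kind of change discussed here could look roughly like the
following (a sketch only, assuming the Jersey 1.x ClientResponse API; the
putEntities helper, timelineUri and entitiesContainer are placeholder names, not
the committed patch):

{code:java}
// Sketch only: always release the response, whatever the HTTP status,
// so the underlying connection/file descriptor is not leaked.
private void putEntities(Client client, URI timelineUri, Object entitiesContainer)
    throws YarnException {
  ClientResponse response = null;
  try {
    response = client.resource(timelineUri)
        .accept(MediaType.APPLICATION_JSON)
        .type(MediaType.APPLICATION_JSON)
        .put(ClientResponse.class, entitiesContainer);
    if (response.getStatus() != ClientResponse.Status.OK.getStatusCode()) {
      throw new YarnException("Error posting entities: " + response.getStatus());
    }
  } finally {
    if (response != null) {
      response.close();
    }
  }
}
{code}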



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10067) Add dry-run feature to FS-CS converter tool

2020-01-07 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009892#comment-17009892
 ] 

Szilard Nemeth commented on YARN-10067:
---

Hi [~pbacsko]!
Thanks for your patch. I have some comments:

1. In FSConfigToCSConfigArgumentHandler#parseAndConvert: Please extract the 
code from 
  
{code:java}
dryRun = cliParser.hasOption(CliOption.DRY_RUN.shortSwitch);
{code}

until 
 
{code:java}
 converter.convert(params);
{code}

to a separate method, for better readability.

2. In FSConfigToCSConfigArgumentHandler#parseAndConvert: The exception handling 
logic is quite verbose as of now. 
You could extract the code block that occurs 3 times in the exception 
handling: 

{code:java}
if (dryRun) {
dryRunResultHolder.addDryRunError(msg);
} else {
logAndStdErr(e, msg);
return -1;
}
{code}


3. I think FSConfigToCSConfigArgumentHandler#printDryRunResults is a good 
method, in terms of contents. I would rather move the whole printing logic into 
DryRunResultHolder instead, so printing its own results can be the 
responsibility of that class.

4. Nit: FSConfigToCSConfigConverter#dryRun: You can omit "= false" from the 
declaration, since as per Java standards, booleans are initialized to false by 
default.

5. I can see in many places that the boolean dryRun and the DryRunResultHolder 
are passed in tandem. For example, in FSConfigToCSConfigConverter, in 
FSConfigToCSConfigRuleHandler and in FSConfigToCSConfigArgumentHandler.
Can you create a class to hold these two together? 
For example, I can imagine something named like "RuntimeParameters" where you 
could hide details like dry run, as well as any other future runtime 
options.
Methods like FSConfigToCSConfigRuleHandler#handle and 
FSQueueConverter#convertQueueHierarchy could simply pass (delegate) the 
exceptionMessage to an instance of this RuntimeParameters class, and the 
instance could decide what to do with the error message: 
either throw an UnsupportedPropertyException or record it as a dry-run 
error. This way, the dry-run feature is better abstracted, in my opinion 
(see the sketch after these comments).

6. Why don't you use FSConfigToCSConfigConverterParams#isDryRun anywhere? Is 
this intentional?

7. In TestFSQueueConverter, you have very similar call sequences to create the 
FSQueueConverter objects. I would suggest extracting a method that creates a 
builder object with those common calls, e.g.

{code:java}
  FSQueueConverterBuilder.create()
.withRuleHandler(ruleHandler)
.withCapacitySchedulerConfig(csConfig)
.withPreemptionEnabled(false)
.withSizeBasedWeight(false)
.withAutoCreateChildQueues(true)
.withClusterResource(CLUSTER_RESOURCE)
.withQueueMaxAMShareDefault(0.16f)
.withQueueMaxAppsDefault(15)
.withDryRun(false)
{code}

and then tweak the builder to meet the testcase needs. This way, you 
can have a default builder object with a few additional calls to it to prepare 
the converter object.
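
Regarding point 5, a rough sketch of what such a holder class could look like
(names are illustrative; it assumes UnsupportedPropertyException is unchecked
and that DryRunResultHolder exposes addDryRunError as in the snippet above):

{code:java}
// Sketch only: bundle the dry-run flag with its result holder so callers can
// delegate error handling instead of branching on "dryRun" everywhere.
public class RuntimeParameters {
  private final boolean dryRun;
  private final DryRunResultHolder dryRunResultHolder;

  public RuntimeParameters(boolean dryRun, DryRunResultHolder dryRunResultHolder) {
    this.dryRun = dryRun;
    this.dryRunResultHolder = dryRunResultHolder;
  }

  public boolean isDryRun() {
    return dryRun;
  }

  // In dry-run mode record the problem, otherwise fail fast.
  public void handleError(String message) {
    if (dryRun) {
      dryRunResultHolder.addDryRunError(message);
    } else {
      throw new UnsupportedPropertyException(message);
    }
  }
}
{code}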

> Add dry-run feature to FS-CS converter tool
> ---
>
> Key: YARN-10067
> URL: https://issues.apache.org/jira/browse/YARN-10067
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10067-001.patch, YARN-10067-002.patch, 
> YARN-10067-003.patch
>
>
> Add a "d" / "-dry-run" switch to the tool. The purpose of this would be to 
> inform the user whether a conversion is possible and if it is, are there any 
> warnings.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10068) TimelineV2Client may leak file descriptors creating ClientResponse objects.

2020-01-07 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009883#comment-17009883
 ] 

Adam Antal commented on YARN-10068:
---

Thanks for the patch [~anand.srinivasan].

On a second look, I agree with resolution regarding my comments - thanks! 
+1 (non-binding) from me.

> TimelineV2Client may leak file descriptors creating ClientResponse objects.
> ---
>
> Key: YARN-10068
> URL: https://issues.apache.org/jira/browse/YARN-10068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0
> Environment: HDP VERSION3.1.4
> AMBARI VERSION2.7.4.0
>Reporter: Anand Srinivasan
>Assignee: Anand Srinivasan
>Priority: Critical
> Attachments: YARN-10068.001.patch, YARN-10068.002.patch, 
> YARN-10068.003.patch, image-2020-01-02-14-58-12-773.png
>
>
> Hi team,
> Code-walkthrough between v1 and v2 of TimelineClient API revealed that v2 API 
> TimelineV2ClientImpl#putObjects doesn't close ClientResponse objects under 
> success status returned from Timeline Server. ClientResponse is closed only 
> under erroneous response from the server using ClientResponse#getEntity.
> We also noticed that TimelineClient (v1) closes the ClientResponse object in 
> TimelineWriter#putEntities by calling ClientResponse#getEntity in both 
> success and error conditions from the server thereby avoiding this file 
> descriptor leak.
> Customer's original issue and the symptom was that the NodeManager went down 
> because of 'too many files open' condition where there were lots of 
> CLOSED_WAIT sockets observed between the timeline client (from NM) and the 
> timeline server hosts. 
> Could you please help resolve this issue ? Thanks.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009882#comment-17009882
 ] 

Eric Payne commented on YARN-10072:
---

The changes look good to me. I'll wait for the pre-commit build results and 
evaluate more.

> TestCSAllocateCustomResource failures
> -
>
> Key: YARN-10072
> URL: https://issues.apache.org/jira/browse/YARN-10072
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>  Labels: YARN
> Attachments: YARN-10072.001.patch
>
>
> This test is failing for us consistently in our internal 2.10 based branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009868#comment-17009868
 ] 

Jim Brennan commented on YARN-10072:


[~epayne], I've put up patch 001 for this.  Can you please review?

> TestCSAllocateCustomResource failures
> -
>
> Key: YARN-10072
> URL: https://issues.apache.org/jira/browse/YARN-10072
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>  Labels: YARN
> Attachments: YARN-10072.001.patch
>
>
> This test is failing for us consistently in our internal 2.10 based branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10027) Add ability for ATS (log servlet) to read logs of running apps

2020-01-07 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated YARN-10027:
--
Description: 
Currently neither version of the AHS is able to read the logs of running apps 
(the local logs of the NodeManager). The YARN log CLI is integrated with the 
NodeManager to extract local logs as well (see YARN-5224 for reference); the 
same should be done for ATS.

Some context:
 The local log files are read by the server in 
{{NMWebServices#getContainerLogFile}}. This is accessed by the YARN logs CLI 
through REST using the /containers/\{containerid}/logs/\{filename} endpoint in 
{{LogsCLI#getResponeFromNMWebService}}.

If, as part of YARN-10026, we pull the common code pieces out of those services, 
we can implement this in the common log servlet.

  was:
Currently neither version of the AHS is able to read logs of running apps 
(local logs of NodeManager). YARN log CLI is integrated with NodeManager to 
extract local logs as well (see YARN-5224 for reference), the same should be 
done for ATS.

Some context:
The local log files are read by the server in 
{{NMWebServices#getContainerLogFile}}. This is accessed by the YARN logs CLI 
through REST using the /containers/{containerid}/logs/{filename} endpoint in 
{{LogsCLI#getResponeFromNMWebService}}.

If YARN-10026 we can pull the common code pieces out of those services, we can 
implement this in the common log servlet.


> Add ability for ATS (log servlet) to read logs of running apps
> --
>
> Key: YARN-10027
> URL: https://issues.apache.org/jira/browse/YARN-10027
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Minor
>
> Currently neither version of the AHS is able to read the logs of running apps 
> (the local logs of the NodeManager). The YARN log CLI is integrated with the 
> NodeManager to extract local logs as well (see YARN-5224 for reference); the 
> same should be done for ATS.
> Some context:
>  The local log files are read by the server in 
> {{NMWebServices#getContainerLogFile}}. This is accessed by the YARN logs CLI 
> through REST using the /containers/\{containerid}/logs/\{filename} endpoint 
> in {{LogsCLI#getResponeFromNMWebService}}.
> If, as part of YARN-10026, we pull the common code pieces out of those services, 
> we can implement this in the common log servlet.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10026) Pull out common code pieces from ATS v1.5 and v2

2020-01-07 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009858#comment-17009858
 ] 

Adam Antal commented on YARN-10026:
---

Jenkins passed on branch-3.2. Could you please commit this [~snemeth] to that 
branch as well?

> Pull out common code pieces from ATS v1.5 and v2
> 
>
> Key: YARN-10026
> URL: https://issues.apache.org/jira/browse/YARN-10026
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-10026.001.patch, YARN-10026.002.patch, 
> YARN-10026.003.patch, YARN-10026.branch-3.2.001.patch
>
>
> ATSv1.5 and ATSv2 have lots of common code that can be pulled into an abstract 
> service / package. The logic is the same, and the code is _almost_ the same.
> As far as I see, the only ATS-specific thing is that AppInfo is constructed 
> from an ApplicationReport, whose information is extracted from the 
> TimelineReader client. 
> Later the appInfo object's user and appState fields are used, but I see no 
> other dependency on the timeline part. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CS

2020-01-07 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009845#comment-17009845
 ] 

Peter Bacsko commented on YARN-9879:


Alternatively, we can have {{Map<String, CSQueue> fullToCSQueue}} and 
{{Map<String, CSQueue> leafToCSQueue}}, so we can avoid the double lookup (not 
that it's really that expensive).

Also it's probably better to have a {{Map<String, Integer>}} to check whether a 
leaf is unique. When we add/remove a queue, we increase/decrease a counter, so 
upon removal, we know whether it has become unique or not.
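
For illustration, that counter bookkeeping could look roughly like this (a
sketch only, assuming java.util maps; the method names are made up):

{code:java}
// Sketch only: count how many leaf queues share a given short name.
private final Map<String, Integer> leafNameCount = new HashMap<>();

void registerLeafName(String leafName) {
  leafNameCount.merge(leafName, 1, Integer::sum);
}

void unregisterLeafName(String leafName) {
  // drop the entry when the last queue with this short name is removed
  leafNameCount.computeIfPresent(leafName,
      (name, count) -> count > 1 ? count - 1 : null);
}

boolean isLeafNameUnique(String leafName) {
  return leafNameCount.getOrDefault(leafName, 0) == 1;
}
{code}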

> Allow multiple leaf queues with the same name in CS
> ---
>
> Key: YARN-9879
> URL: https://issues.apache.org/jira/browse/YARN-9879
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: DesignDoc_v1.pdf
>
>
> Currently the leaf queue's name must be unique regardless of its position in 
> the queue hierarchy. 
> A design doc and first proposal are being made; I'll attach them as soon as 
> they're done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements

2020-01-07 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009840#comment-17009840
 ] 

Manikandan R commented on YARN-10043:
-

[~wilfreds] Any suggestions?

> FairOrderingPolicy Improvements
> ---
>
> Key: YARN-10043
> URL: https://issues.apache.org/jira/browse/YARN-10043
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
>
> FairOrderingPolicy can be improved by using some of the relevant approaches 
> implemented in the FairSharePolicy of FS. This improvement is significant in 
> the FS-to-CS migration context.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9866) u:user2:%primary_group is not working as expected

2020-01-07 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009838#comment-17009838
 ] 

Manikandan R commented on YARN-9866:


[~snemeth] Patch is ready for commit. Can you take a quick look?

> u:user2:%primary_group is not working as expected
> -
>
> Key: YARN-9866
> URL: https://issues.apache.org/jira/browse/YARN-9866
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9866.001.patch, YARN-9866.002.patch, 
> YARN-9866.003.patch, YARN-9866.004.patch, YARN-9866.005.patch
>
>
> Please refer to #1 in 
> https://issues.apache.org/jira/browse/YARN-9841?focusedCommentId=16937024=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16937024
>  for more details



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry

2020-01-07 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009835#comment-17009835
 ] 

Manikandan R commented on YARN-9768:


[~bibinchundatt] This has been hanging for quite some time. Can we please get 
closure on this?

> RM Renew Delegation token thread should timeout and retry
> -
>
> Key: YARN-9768
> URL: https://issues.apache.org/jira/browse/YARN-9768
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: CR Hota
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9768.001.patch, YARN-9768.002.patch, 
> YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, 
> YARN-9768.006.patch, YARN-9768.007.patch, YARN-9768.008.patch
>
>
> The delegation token renewer thread in the RM (DelegationTokenRenewer.java) 
> renews the HDFS tokens it receives to check their validity and expiration time.
> This call is made to an underlying HDFS NN or Router node (which has the exact 
> same APIs as the HDFS NN). If one of the nodes is bad and the renew call is 
> stuck, the thread remains stuck indefinitely. The thread should ideally time 
> out the renewToken call and retry from the client's perspective.
>  
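
For illustration, a client-side timeout around the blocking renew call could
look roughly like this (a sketch only; token, conf and the 60-second timeout are
placeholders, not the proposed patch):

{code:java}
// Sketch only: run the blocking renew() on a separate executor so the
// renewer thread can time out and retry instead of hanging indefinitely.
private long renewWithTimeout(Token<?> token, Configuration conf,
    ExecutorService renewerPool) throws IOException {
  Future<Long> renewal = renewerPool.submit(() -> token.renew(conf));
  try {
    return renewal.get(60, TimeUnit.SECONDS);
  } catch (TimeoutException e) {
    renewal.cancel(true); // interrupt the stuck call; caller may retry later
    throw new IOException("Token renewal timed out", e);
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    throw new IOException("Interrupted while renewing token", e);
  } catch (ExecutionException e) {
    throw new IOException("Token renewal failed", e.getCause());
  }
}
{code}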



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9868) Validate %primary_group queue in CS queue manager

2020-01-07 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009829#comment-17009829
 ] 

Manikandan R commented on YARN-9868:


[~snemeth] Can you take it forward?

> Validate %primary_group queue in CS queue manager
> -
>
> Key: YARN-9868
> URL: https://issues.apache.org/jira/browse/YARN-9868
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9868-003.patch, YARN-9868-003.patch, 
> YARN-9868-004.patch, YARN-9868.001.patch, YARN-9868.002.patch, 
> YARN-9868.005.patch
>
>
> As part of %secondary_group mapping, we ensure the output of %secondary_group 
> is available (using CSQueueManager) while processing the queue mapping. 
> Similarly, we will need to do the same for %primary_group.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan reassigned YARN-10072:
--

Attachment: YARN-10072.001.patch
  Assignee: Jim Brennan
Labels: YARN  (was: )

> TestCSAllocateCustomResource failures
> -
>
> Key: YARN-10072
> URL: https://issues.apache.org/jira/browse/YARN-10072
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>  Labels: YARN
> Attachments: YARN-10072.001.patch
>
>
> This test is failing for us consistently in our internal 2.10 based branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009806#comment-17009806
 ] 

Jim Brennan commented on YARN-10072:


I resolved this internally by changing 
TestCSAllocateCustomResource#testCapacitySchedulerInitWithCustomResourceType to 
use MockRM like the other test in TestCSAllocateCustomResource.   It was using 
a lot of mocking/spying to try to isolate CapacityScheduler, but in so doing I 
think it introduced some inconsistency in the initialization process - I was 
getting inconsistent results depending on where I set breakpoints while 
debugging. By using MockRM, the CapacityScheduler initialization should more 
closely match what happens in production.  And it removed the inconsistency 
with breakpoints.

I will put up a patch for trunk shortly.

> TestCSAllocateCustomResource failures
> -
>
> Key: YARN-10072
> URL: https://issues.apache.org/jira/browse/YARN-10072
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Priority: Major
>
> This test is failing for us consistently in our internal 2.10 based branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009800#comment-17009800
 ] 

Jim Brennan commented on YARN-10072:


Here is a sample failure:
{noformat}
---
Test set: 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCSAllocateCustomResource
---
Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.291 s <<< 
FAILURE! - in 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCSAllocateCustomResource
testCapacitySchedulerInitWithCustomResourceType(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCSAllocateCustomResource)
  Time elapsed: 0.569 s  <<< FAILURE!
java.lang.AssertionError: Values should be different. Actual: 0
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failEquals(Assert.java:185)
at org.junit.Assert.assertNotEquals(Assert.java:161)
at org.junit.Assert.assertNotEquals(Assert.java:198)
at org.junit.Assert.assertNotEquals(Assert.java:209)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCSAllocateCustomResource.testCapacitySchedulerInitWithCustomResourceType(TestCSAllocateCustomResource.java:184)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413) 
{noformat}

> TestCSAllocateCustomResource failures
> -
>
> Key: YARN-10072
> URL: https://issues.apache.org/jira/browse/YARN-10072
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Affects Versions: 2.10.0
>Reporter: Jim Brennan
>Priority: Major
>
> This test is failing for us consistently in our internal 2.10 based branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10072) TestCSAllocateCustomResource failures

2020-01-07 Thread Jim Brennan (Jira)
Jim Brennan created YARN-10072:
--

 Summary: TestCSAllocateCustomResource failures
 Key: YARN-10072
 URL: https://issues.apache.org/jira/browse/YARN-10072
 Project: Hadoop YARN
  Issue Type: Test
  Components: yarn
Affects Versions: 2.10.0
Reporter: Jim Brennan


This test is failing for us consistently in our internal 2.10 based branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9879) Allow multiple leaf queues with the same name in CS

2020-01-07 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009797#comment-17009797
 ] 

Peter Bacsko edited comment on YARN-9879 at 1/7/20 2:55 PM:


[~wilfreds] based on your suggestion, here's what I came up with:

We can still maintain the HashMap with queueName->CSQueue, however we'd use two 
levels:
 1. Leaf queue -> full path
 2. Full path -> CSQueue object

We additionally need an extra map which tells whether a leaf queue is unique.

So after some thinking, this is the semi-pseudocode that could possibly do the 
job:
{noformat}
Map<String, CSQueue> fullPathQueues;
Map<String, String> leafToFullPath;
Map<String, Boolean> leafUnique;

public CSQueue getQueue(String queueName) {
  if (fullPathName(queueName)) {
return fullPathQueues.get(queueName);
  } else {
if (leafUnique.get(queueName)) {
  String fullName = leafToFullPath.get(queueName);
  return fullPathQueues.get(fullName);
} else {
  throw new YarnException(queueName + " is not unique");
}
  } 
}
{noformat}
Obviously methods like {{addQueue()}}, {{removeQueue()}} should be updated too.
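
A small sketch of how {{addQueue()}} could keep the three maps consistent
(illustrative only; the real {{addQueue()}}/{{removeQueue()}} do much more):

{code:java}
// Sketch only: keep fullPathQueues, leafToFullPath and leafUnique in sync
// when a queue is registered under its full path.
void addQueue(String fullPath, CSQueue queue) {
  fullPathQueues.put(fullPath, queue);
  String leafName = fullPath.substring(fullPath.lastIndexOf('.') + 1);
  if (leafToFullPath.putIfAbsent(leafName, fullPath) == null) {
    leafUnique.put(leafName, true);   // first queue with this short name
  } else {
    leafUnique.put(leafName, false);  // short name is now ambiguous
  }
}
{code}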


was (Author: pbacsko):
[~wilfreds] based on your suggestion, here's what I came up with:

We can still maintain the HashMap with queueName->CSQueue, however we'd use two 
levels:
 1. Leaf queue -> full path
 2. Full path -> CSQueue object

We additionally need an extra map which tells whether a leaf queue is unique.

So after some thinking, this is the semi-pseudocode that could possibly do the 
job:

{noformat}
Map<String, CSQueue> fullPathQueues;
Map<String, String> leafToFullPath;
Map<String, Boolean> leafUnique;

public CSQueue getQueue(String queueName) {
  if (fullPathName(queueName)) {
return queues.get(queueName);
  } else {
if (leafUnique.get(queueName)) {
  String fullName = leafToFullPath.get(queueName);
  return queues.get(fullName);
} else {
  throw new YarnException(queueName + " is not unique");
}
  } 
}
{noformat}

Obviously methods like {{addQueue()}}, {{removeQueue()}} should be updated too.

> Allow multiple leaf queues with the same name in CS
> ---
>
> Key: YARN-9879
> URL: https://issues.apache.org/jira/browse/YARN-9879
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: DesignDoc_v1.pdf
>
>
> Currently the leaf queue's name must be unique regardless of its position in 
> the queue hierarchy. 
> A design doc and first proposal are being made; I'll attach them as soon as 
> they're done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CS

2020-01-07 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009797#comment-17009797
 ] 

Peter Bacsko commented on YARN-9879:


[~wilfreds] based on your suggestion, here's what I came up with:

We can still maintain the HashMap with queueName->CSQueue, however we'd use two 
levels:
 1. Leaf queue -> full path
 2. Full path -> CSQueue object

We additionally need an extra map which tells whether a leaf queue is unique.

So after some thinking, this is the semi-pseudocode that could possibly do the 
job:

{noformat}
Map<String, CSQueue> fullPathQueues;
Map<String, String> leafToFullPath;
Map<String, Boolean> leafUnique;

public CSQueue getQueue(String queueName) {
  if (fullPathName(queueName)) {
return queues.get(queueName);
  } else {
if (leafUnique.get(queueName)) {
  String fullName = leafToFullPath.get(queueName);
  return queues.get(fullName);
} else {
  throw new YarnException(queueName + " is not unique");
}
  } 
}
{noformat}

Obviously methods like {{addQueue()}}, {{removeQueue()}} should be updated too.

> Allow multiple leaf queues with the same name in CS
> ---
>
> Key: YARN-9879
> URL: https://issues.apache.org/jira/browse/YARN-9879
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: DesignDoc_v1.pdf
>
>
> Currently the leaf queue's name must be unique regardless of its position in 
> the queue hierarchy. 
> A design doc and first proposal are being made; I'll attach them as soon as 
> they're done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9879) Allow multiple leaf queues with the same name in CS

2020-01-07 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009797#comment-17009797
 ] 

Peter Bacsko edited comment on YARN-9879 at 1/7/20 2:54 PM:


[~wilfreds] based on your suggestion, here's what I came up with:

We can still maintain the HashMap with queueName->CSQueue, however we'd use two 
levels:
 1. Leaf queue -> full path
 2. Full path -> CSQueue object

We additionally need an extra map which tells whether a leaf queue is unique.

So after some thinking, this is the semi-pseudocode that could possibly do the 
job:

{noformat}
Map<String, CSQueue> fullPathQueues;
Map<String, String> leafToFullPath;
Map<String, Boolean> leafUnique;

public CSQueue getQueue(String queueName) {
  if (fullPathName(queueName)) {
return queues.get(queueName);
  } else {
if (leafUnique.get(queueName)) {
  String fullName = leafToFullPath.get(queueName);
  return queues.get(fullName);
} else {
  throw new YarnException(queueName + " is not unique");
}
  } 
}
{noformat}

Obviously methods like {{addQueue()}}, {{removeQueue()}} should be updated too.


was (Author: pbacsko):
[~wilfreds] based on your suggestion, here's what I came up with:

We can still maintain the HashMap with queueName->CSQueue, however we'd use two 
levels:
 1. Leaf queue -> full path
 2. Full path -> CSQueue object

We additionally need an extra map which tells whether a leaf queue is unique.

So after some thinking, this is the semi-pseudocode that could possibly do the 
job:

{noformat}
Map<String, CSQueue> fullPathQueues;
Map<String, String> leafToFullPath;
Map<String, Boolean> leafUnique;

public CSQueue getQueue(String queueName) {
  if (fullPathName(queueName)) {
return queues.get(queueName);
  } else {
if (leafUnique.get(queueName)) {
  String fullName = leafToFullPath.get(queueName);
  return queues.get(fullName);
} else {
  throw new YarnException(queueName + " is not unique");
}
  } 
}
{noformat}

Obviously methods like {{addQueue()}}, {{removeQueue()}} should be updated too.

> Allow multiple leaf queues with the same name in CS
> ---
>
> Key: YARN-9879
> URL: https://issues.apache.org/jira/browse/YARN-9879
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: DesignDoc_v1.pdf
>
>
> Currently the leaf queue's name must be unique regardless of its position in 
> the queue hierarchy. 
> A design doc and first proposal are being made; I'll attach them as soon as 
> they're done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7387) org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer fails intermittently

2020-01-07 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009784#comment-17009784
 ] 

Jim Brennan commented on YARN-7387:
---

Thanks [~snemeth]!  Do you want to review the patch?

cc: [~epayne]

> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
>  fails intermittently
> ---
>
> Key: YARN-7387
> URL: https://issues.apache.org/jira/browse/YARN-7387
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-7387.001.patch
>
>
> {code}
> Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 52.481 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
> testDecreaseAfterIncreaseWithAllocationExpiration(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer)
>   Time elapsed: 13.292 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<3072> but was:<4096>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer.testDecreaseAfterIncreaseWithAllocationExpiration(TestIncreaseAllocationExpirer.java:459)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7387) org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer fails intermittently

2020-01-07 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan reassigned YARN-7387:
-

Assignee: Jim Brennan  (was: Szilard Nemeth)

> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
>  fails intermittently
> ---
>
> Key: YARN-7387
> URL: https://issues.apache.org/jira/browse/YARN-7387
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-7387.001.patch
>
>
> {code}
> Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 52.481 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
> testDecreaseAfterIncreaseWithAllocationExpiration(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer)
>   Time elapsed: 13.292 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<3072> but was:<4096>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer.testDecreaseAfterIncreaseWithAllocationExpiration(TestIncreaseAllocationExpirer.java:459)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9525) IFile format is not working against s3a remote folder

2020-01-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009673#comment-17009673
 ] 

Hadoop QA commented on YARN-9525:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 10s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 31s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
38s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-9525 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990088/YARN-9525.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 73efc528e676 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2bbf73f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25339/testReport/ |
| Max. process+thread count | 363 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25339/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> IFile format is 

[jira] [Commented] (YARN-9525) IFile format is not working against s3a remote folder

2020-01-07 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009637#comment-17009637
 ] 

Adam Antal commented on YARN-9525:
--

Reuploaded patch v6 as its latest Jenkins result was a while ago.

> IFile format is not working against s3a remote folder
> -
>
> Key: YARN-9525
> URL: https://issues.apache.org/jira/browse/YARN-9525
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 3.1.2
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: IFile-S3A-POC01.patch, YARN-9525-001.patch, 
> YARN-9525.002.patch, YARN-9525.003.patch, YARN-9525.004.patch, 
> YARN-9525.005.patch, YARN-9525.006.patch, YARN-9525.006.patch
>
>
> Using the IndexedFileFormat {{yarn.nodemanager.remote-app-log-dir}} 
> configured to an s3a URI throws the following exception during log 
> aggregation:
> {noformat}
> Cannot create writer for app application_1556199768861_0001. Skip log upload 
> this time. 
> java.io.IOException: java.io.FileNotFoundException: No such file or 
> directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:247)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:306)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:464)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:420)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:276)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: No such file or directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2488)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2382)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2321)
>   at 
> org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:128)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1244)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1240)
>   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>   at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1246)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController$1.run(LogAggregationIndexedFileController.java:228)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:195)
>   ... 7 more
> {noformat}
> This stack trace points to 
> {{LogAggregationIndexedFileController$initializeWriter}} where we do the 
> following steps (in a non-rolling log aggregation setup):
> - create an FSDataOutputStream
> - write out a UUID
> - flush
> - immediately after that, call getFileStatus to get the length of the log 
> file (the bytes we just wrote out), and that's where the failure happens: 
> the file is not there yet due to eventual consistency.
> Maybe we can get rid of that, so we can use the IFile format against an s3a target.
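
For illustration, one way to avoid that immediate getFileStatus round trip is to
track the offset from the output stream itself (a sketch only; fileContext,
remoteLogFile and uuidBytes are placeholders, and the actual fix may look
different):

{code:java}
// Sketch only: derive the current length from the stream we just wrote to,
// instead of asking the eventually consistent object store for file status.
private long writeUuidAndGetOffset(FileContext fileContext, Path remoteLogFile,
    byte[] uuidBytes) throws IOException {
  FSDataOutputStream out = fileContext.create(remoteLogFile,
      EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE));
  out.write(uuidBytes);
  out.flush();
  // the stream stays open for the subsequent log aggregation writes
  return out.getPos(); // bytes written so far; no getFileStatus needed
}
{code}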



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9525) IFile format is not working against s3a remote folder

2020-01-07 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated YARN-9525:
-
Attachment: YARN-9525.006.patch

> IFile format is not working against s3a remote folder
> -
>
> Key: YARN-9525
> URL: https://issues.apache.org/jira/browse/YARN-9525
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 3.1.2
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: IFile-S3A-POC01.patch, YARN-9525-001.patch, 
> YARN-9525.002.patch, YARN-9525.003.patch, YARN-9525.004.patch, 
> YARN-9525.005.patch, YARN-9525.006.patch, YARN-9525.006.patch
>
>
> Using the IndexedFileFormat {{yarn.nodemanager.remote-app-log-dir}} 
> configured to an s3a URI throws the following exception during log 
> aggregation:
> {noformat}
> Cannot create writer for app application_1556199768861_0001. Skip log upload 
> this time. 
> java.io.IOException: java.io.FileNotFoundException: No such file or 
> directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:247)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:306)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:464)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:420)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:276)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: No such file or directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2488)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2382)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2321)
>   at 
> org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:128)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1244)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1240)
>   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>   at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1246)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController$1.run(LogAggregationIndexedFileController.java:228)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:195)
>   ... 7 more
> {noformat}
> This stack trace points to 
> {{LogAggregationIndexedFileController$initializeWriter}} where we do the 
> following steps (in a non-rolling log aggregation setup):
> - create an FSDataOutputStream
> - write out a UUID
> - flush
> - immediately after that, call getFileStatus to get the length of the log 
> file (the bytes we just wrote out), and that's where the failure happens: 
> the file is not there yet due to eventual consistency.
> Maybe we can get rid of that, so we can use the IFile format against an s3a target.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10071) Sync Mockito version with other modules

2020-01-07 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009632#comment-17009632
 ] 

Akira Ajisaka commented on YARN-10071:
--

Now Mockito 1.x API is not used in the MaWo module. Therefore removing the 
dependency from pom.xml files seems fine.

> Sync Mockito version with other modules
> ---
>
> Key: YARN-10071
> URL: https://issues.apache.org/jira/browse/YARN-10071
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: build, test
>Reporter: Akira Ajisaka
>Priority: Major
>
> YARN-8551 introduced a Mockito 1.x dependency; update it to match the other modules.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8374) Upgrade objenesis to 2.6

2020-01-07 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-8374:

Component/s: build
Summary: Upgrade objenesis to 2.6  (was: Upgrade objenesis dependency)

> Upgrade objenesis to 2.6
> 
>
> Key: YARN-8374
> URL: https://issues.apache.org/jira/browse/YARN-8374
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: build, timelineservice
>Reporter: Jason Darrell Lowe
>Assignee: Akira Ajisaka
>Priority: Major
>
> After HADOOP-14918 is committed, we should be able to remove the explicit 
> objenesis dependency and the objenesis exclusion from the fst dependency, so 
> that we pick up the version fst wants naturally.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8374) Upgrade objenesis dependency

2020-01-07 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned YARN-8374:
---

Assignee: Akira Ajisaka

> Upgrade objenesis dependency
> 
>
> Key: YARN-8374
> URL: https://issues.apache.org/jira/browse/YARN-8374
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: timelineservice
>Reporter: Jason Darrell Lowe
>Assignee: Akira Ajisaka
>Priority: Major
>
> After HADOOP-14918 is committed, we should be able to remove the explicit 
> objenesis dependency and the objenesis exclusion from the fst dependency, so 
> that we pick up the version fst wants naturally.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10071) Sync Mockito version with other modules

2020-01-07 Thread Akira Ajisaka (Jira)
Akira Ajisaka created YARN-10071:


 Summary: Sync Mockito version with other modules
 Key: YARN-10071
 URL: https://issues.apache.org/jira/browse/YARN-10071
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: build, test
Reporter: Akira Ajisaka


YARN-8551 introduced a Mockito 1.x dependency; update it to match the Mockito version used by the other modules.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10070) NPE if no rule is defined and application-tag-based-placement is enabled

2020-01-07 Thread Kinga Marton (Jira)
Kinga Marton created YARN-10070:
---

 Summary: NPE if no rule is defined and 
application-tag-based-placement is enabled
 Key: YARN-10070
 URL: https://issues.apache.org/jira/browse/YARN-10070
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Kinga Marton
Assignee: Kinga Marton


If there is no rule defined for a user, an NPE is thrown by the following line:
{code:java}
String queue = placementManager
 .placeApplication(context, usernameUsedForPlacement).getQueue();{code}
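A null-safe variant might look like the sketch below; the fallback to the queue 
from the submission context is an assumption for illustration, not the agreed fix.
{code:java}
// Hedged sketch: placeApplication may return null when no rule matches the
// user, so guard the result before calling getQueue(). The fallback queue is
// an illustrative assumption, not the committed behaviour.
ApplicationPlacementContext placementContext =
    placementManager.placeApplication(context, usernameUsedForPlacement);
String queue = (placementContext != null)
    ? placementContext.getQueue()
    : context.getQueue();  // fall back to the queue in the submission context
{code}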
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-07 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009551#comment-17009551
 ] 

Wilfred Spiegelenburg commented on YARN-7913:
-

[~snemeth] and [~sunilg], could you have a look at the change, please?

> Improve error handling when application recovery fails with exception
> -
>
> Key: YARN-7913
> URL: https://issues.apache.org/jira/browse/YARN-7913
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: Gergo Repas
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-7913.000.poc.patch, YARN-7913.001.patch, 
> YARN-7913.002.patch, YARN-7913.003.patch
>
>
> There are edge cases when the application recovery fails with an exception.
> Example failure scenario:
>  * setup: a queue is a leaf queue in the primary RM's config and the same 
> queue is a parent queue in the secondary RM's config.
>  * When failover happens with this setup, the recovery will fail for 
> applications on this queue, and an APP_REJECTED event will be dispatched to 
> the async dispatcher. On the same thread (that handles the recovery), a 
> NullPointerException is thrown when we try to recover the applicationAttempt 
> (https://github.com/apache/hadoop/blob/55066cc53dc22b68f9ca55a0029741d6c846be0a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java#L494).
>  I don't see a good way to avoid the NPE in this scenario, because when the 
> NPE occurs the APP_REJECTED event has not been processed yet, and we don't know 
> that the application recovery failed.
> Currently the first exception will abort the recovery, and if there are X 
> applications, there will be ~X passive -> active RM transition attempts - the 
> passive -> active RM transition will only succeed when the last APP_REJECTED 
> event is processed on the async dispatcher thread.
> _The point of this ticket is to improve the error handling and reduce the 
> number of passive -> active RM transition attempts (solving the above 
> described failure scenario isn't in scope)._
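> A standalone sketch of the direction this could take (hypothetical types, not 
> the actual RM recovery code): recover each application independently and record 
> the failures instead of letting the first exception abort the whole transition.
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
>
> // Hypothetical sketch only: AppState and Recoverer stand in for the real
> // RM recovery types; the point is the per-application try/catch.
> public class RecoverySketch {
>   interface AppState { String appId(); }
>   interface Recoverer { void recover(AppState app) throws Exception; }
>
>   static List<String> recoverAll(List<AppState> apps, Recoverer recoverer) {
>     List<String> failedApps = new ArrayList<>();
>     for (AppState app : apps) {
>       try {
>         recoverer.recover(app);
>       } catch (Exception e) {
>         // Record the failure and continue; the app can still be rejected via
>         // the normal APP_REJECTED path, but recovery of the remaining
>         // applications completes within a single passive -> active transition.
>         failedApps.add(app.appId());
>       }
>     }
>     return failedApps;
>   }
> }
> {code}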



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6212) NodeManager metrics returning wrong negative values

2020-01-07 Thread Max Xie (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009544#comment-17009544
 ] 

Max  Xie commented on YARN-6212:


In my cluster, the NodeManager metrics return negative values too.

> NodeManager metrics returning wrong negative values
> ---
>
> Key: YARN-6212
> URL: https://issues.apache.org/jira/browse/YARN-6212
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 2.7.3
>Reporter: Abhishek Shivanna
>Priority: Major
>
> It looks like the metrics returned by the NodeManager have negative values 
> for metrics that should never be negative. Here is an output from the NM endpoint 
> {noformat}
> /jmx?qry=Hadoop:service=NodeManager,name=NodeManagerMetrics
> {noformat}
> {noformat}
> {
>   "beans" : [ {
> "name" : "Hadoop:service=NodeManager,name=NodeManagerMetrics",
> "modelerType" : "NodeManagerMetrics",
> "tag.Context" : "yarn",
> "tag.Hostname" : "",
> "ContainersLaunched" : 707,
> "ContainersCompleted" : 9,
> "ContainersFailed" : 124,
> "ContainersKilled" : 579,
> "ContainersIniting" : 0,
> "ContainersRunning" : 19,
> "AllocatedGB" : -26,
> "AllocatedContainers" : -5,
> "AvailableGB" : 252,
> "AllocatedVCores" : -5,
> "AvailableVCores" : 101,
> "ContainerLaunchDurationNumOps" : 718,
> "ContainerLaunchDurationAvgTime" : 18.0
>   } ]
> }
> {noformat}
> Is there any circumstance under which the values for AllocatedGB, 
> AllocatedContainers and AllocatedVCores go below 0? 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9624) Use switch case for ProtoUtils#convertFromProtoFormat containerState

2020-01-07 Thread Bibin Chundatt (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009536#comment-17009536
 ] 

Bibin Chundatt commented on YARN-9624:
--

[~BilwaST] Could you update the patch?

> Use switch case for ProtoUtils#convertFromProtoFormat containerState
> 
>
> Key: YARN-9624
> URL: https://issues.apache.org/jira/browse/YARN-9624
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bibin Chundatt
>Assignee: Bilwa S T
>Priority: Major
>  Labels: performance
> Attachments: YARN-9624.001.patch, YARN-9624.002.patch
>
>
> On a large cluster with 100K+ containers, evaluating 
> {{ContainerState.valueOf(e.name().replace(CONTAINER_STATE_PREFIX, ""))}} on every 
> heartbeat is too costly. Update it to a switch statement.
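> A sketch of the switch-based conversion; the enum constants follow the usual 
> yarn_protos.proto naming (C_NEW, C_RUNNING, C_COMPLETE) and only the classic 
> three states are mapped here, so this is not verified against the patch.
> {code:java}
> import org.apache.hadoop.yarn.api.records.ContainerState;
> import org.apache.hadoop.yarn.proto.YarnProtos.ContainerStateProto;
>
> public final class ContainerStateConverter {
>   private ContainerStateConverter() { }
>
>   // A switch avoids allocating a new String via name().replace(...) and the
>   // ContainerState.valueOf lookup on every conversion.
>   public static ContainerState convertFromProtoFormat(ContainerStateProto e) {
>     switch (e) {
>       case C_NEW:
>         return ContainerState.NEW;
>       case C_RUNNING:
>         return ContainerState.RUNNING;
>       case C_COMPLETE:
>         return ContainerState.COMPLETE;
>       default:
>         throw new IllegalArgumentException("Unknown container state: " + e);
>     }
>   }
> }
> {code}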



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7387) org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer fails intermittently

2020-01-07 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009478#comment-17009478
 ] 

Szilard Nemeth commented on YARN-7387:
--

Hi [~Jim_Brennan]!
Thanks.
I have never worked on this jira; it is only assigned to me because I had planned 
to work on it.
Feel free to reassign the jira to yourself.

> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
>  fails intermittently
> ---
>
> Key: YARN-7387
> URL: https://issues.apache.org/jira/browse/YARN-7387
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-7387.001.patch
>
>
> {code}
> Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 52.481 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
> testDecreaseAfterIncreaseWithAllocationExpiration(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer)
>   Time elapsed: 13.292 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<3072> but was:<4096>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer.testDecreaseAfterIncreaseWithAllocationExpiration(TestIncreaseAllocationExpirer.java:459)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org