[jira] [Commented] (YARN-9076) ContainerShellWebSocket Render Process Output

2019-03-01 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781743#comment-16781743
 ] 

BELUGA BEHR commented on YARN-9076:
---

[~giovanni.fumarola] Hello.  Thank you for your review.  I see what you mean.  
I'm not sure what the purpose of this 4K buffer is... maybe to protect from a 
bad client that sends an unlimited stream of data?  I don't really know.  
However, I have provided a new patch that again caps the read at max 4K.

Also, using {{available()}} is almost never the correct thing to do.  Its 
return value can be very funky: it is only an estimate of how many bytes can be 
read without blocking, and every {{InputStream}} implementation computes it 
differently.
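
For illustration, a minimal sketch of a capped read that relies on the return 
value of {{read()}} instead of {{available()}}.  This is not the code from any 
of the attached patches; the class name, method name, and the 4096-byte cap are 
assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ContainerShellWebSocket.
final class CappedRead {
  // Assumed cap, mirroring the 4K buffer discussed above.
  static final int MAX_CHUNK = 4096;

  /**
   * Reads at most MAX_CHUNK bytes in a single call and returns them as text
   * with "\n" rewritten to "\r\n".  Returns null at end of stream.
   */
  static String readChunk(InputStream in) throws IOException {
    byte[] buffer = new byte[MAX_CHUNK];
    // read() blocks until some data arrives and reports how many bytes it
    // actually stored, so available() is not needed.
    int n = in.read(buffer, 0, buffer.length);
    if (n < 0) {
      return null; // end of stream
    }
    // Decode only the n bytes that were read, not the whole buffer.
    return new String(buffer, 0, n, StandardCharsets.UTF_8)
        .replace("\n", "\r\n");
  }
}
```

One caveat with this sketch: decoding each chunk separately can split a 
multi-byte UTF-8 character that straddles a chunk boundary.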

> ContainerShellWebSocket Render Process Output
> -
>
> Key: YARN-9076
> URL: https://issues.apache.org/jira/browse/YARN-9076
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Affects Versions: 3.3.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-9076.1.patch, YARN-9076.2.patch
>
>
> {code:java|title=ContainerShellWebSocket.java}
> // Render process output
> int no = pair.in.available();
> pair.in.read(buffer, 0, Math.min(no, buffer.length));
> String formatted = new String(buffer, Charset.forName("UTF-8"))
> .replaceAll("\n", "\r\n");
> session.getRemote().sendString(formatted);
>   }
> {code}
> This code strikes me as a bit odd.  First of all, it is using {{available()}}, 
> which is known to be unreliable and inaccurate (e.g., for sockets).  
> Second, it will only read a max of 4000 characters and that's it.  Anything 
> else is truncated.
> Change this code to read the entire data stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9076) ContainerShellWebSocket Render Process Output

2019-03-01 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-9076:
--
Attachment: YARN-9076.3.patch

> ContainerShellWebSocket Render Process Output
> -
>
> Key: YARN-9076
> URL: https://issues.apache.org/jira/browse/YARN-9076
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Affects Versions: 3.3.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-9076.1.patch, YARN-9076.2.patch, YARN-9076.3.patch
>
>
> {code:java|title=ContainerShellWebSocket.java}
> // Render process output
> int no = pair.in.available();
> pair.in.read(buffer, 0, Math.min(no, buffer.length));
> String formatted = new String(buffer, Charset.forName("UTF-8"))
> .replaceAll("\n", "\r\n");
> session.getRemote().sendString(formatted);
>   }
> {code}
> This code strikes me as a bit odd.  First of all, it is using {{available()}}, 
> which is known to be unreliable and inaccurate (e.g., for sockets).  
> Second, it will only read a max of 4000 characters and that's it.  Anything 
> else is truncated.
> Change this code to read the entire data stream.






[jira] [Updated] (YARN-9076) ContainerShellWebSocket Render Process Output

2019-03-01 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-9076:
--
Attachment: YARN-9076.2.patch

> ContainerShellWebSocket Render Process Output
> -
>
> Key: YARN-9076
> URL: https://issues.apache.org/jira/browse/YARN-9076
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Affects Versions: 3.3.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-9076.1.patch, YARN-9076.2.patch
>
>
> {code:java|title=ContainerShellWebSocket.java}
> // Render process output
> int no = pair.in.available();
> pair.in.read(buffer, 0, Math.min(no, buffer.length));
> String formatted = new String(buffer, Charset.forName("UTF-8"))
> .replaceAll("\n", "\r\n");
> session.getRemote().sendString(formatted);
>   }
> {code}
> This code strikes me as a bit odd.  First of all, it is using {{available()}}, 
> which is known to be unreliable and inaccurate (e.g., for sockets).  
> Second, it will only read a max of 4000 characters and that's it.  Anything 
> else is truncated.
> Change this code to read the entire data stream.






[jira] [Updated] (YARN-8146) Remove LinkedList From resourcemanager.reservation.planning Package

2019-02-06 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8146:
--
Attachment: YARN-8146.2.patch

> Remove LinkedList From resourcemanager.reservation.planning Package
> ---
>
> Key: YARN-8146
> URL: https://issues.apache.org/jira/browse/YARN-8146
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: reservation system
>Affects Versions: 3.0.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8146.1.patch, YARN-8146.2.patch
>
>
> Remove {{LinkedList}} instances in favor of {{ArrayList}}.  {{ArrayList}} is 
> generally more memory efficient, causes less memory fragmentation, and, thanks 
> to memory locality, is faster to iterate.
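
As a rough illustration of the kind of change described above, a sketch of 
code written against the {{List}} interface so that the backing implementation 
can be swapped.  The class and method names are hypothetical and do not come 
from the resourcemanager.reservation.planning package.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not taken from the YARN code base.
final class PlanStages {
  /** Collects stage names in insertion order. */
  static List<String> collect(String... names) {
    // Previously this could have been: new LinkedList<>().
    // ArrayList keeps its elements in one contiguous backing array, and
    // pre-sizing it avoids intermediate array copies while adding.
    List<String> stages = new ArrayList<>(names.length);
    for (String name : names) {
      stages.add(name);
    }
    return stages;
  }
}
```

Callers that only iterate or append see no behavioral difference.  Code that 
frequently removes from the head of the list is a different case and may be 
better served by {{ArrayDeque}}.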






[jira] [Updated] (YARN-9260) Re-Launch ApplicationMasters That Fail With OOM Using Larger Container

2019-01-31 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-9260:
--
Environment: (was: If an ApplicationMaster fails with an OOM, or is 
killed by YARN for using more memory than is allowed by the container it's 
launched in, when re-trying the AM, increase its container size.)

> Re-Launch ApplicationMasters That Fail With OOM Using Larger Container
> --
>
> Key: YARN-9260
> URL: https://issues.apache.org/jira/browse/YARN-9260
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: BELUGA BEHR
>Priority: Major
>







[jira] [Commented] (YARN-9260) Re-Launch ApplicationMasters That Fail With OOM Using Larger Container

2019-01-31 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757787#comment-16757787
 ] 

BELUGA BEHR commented on YARN-9260:
---

Thanks for the input [~eepayne].  I could see a new configuration called 
{{growth-factor}} which specifies how large to grow the container each time it 
is re-tried.  This would be a percentage; therefore, a {{growth-factor}} of 
1.0f (100%) would preserve the current behavior.
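
To make the idea concrete, a small sketch of how such a {{growth-factor}} could 
be applied per attempt.  Nothing here exists in YARN today; the class, the 
method, and the {{maxMb}} ceiling are assumptions made for the example.

```java
// Hypothetical helper illustrating the proposed growth-factor setting.
final class ContainerGrowth {
  /**
   * Returns the memory (in MB) to request for the given attempt.
   * attempt is 1 for the first launch, 2 for the first re-try, and so on.
   * maxMb is an assumed upper bound, e.g. the largest allowed allocation.
   */
  static long memoryForAttempt(long baseMb, float growthFactor, int attempt,
      long maxMb) {
    if (attempt < 1 || growthFactor < 1.0f) {
      throw new IllegalArgumentException("attempt >= 1 and factor >= 1.0");
    }
    // Each re-try multiplies the previous size by the growth factor.
    double size = baseMb * Math.pow(growthFactor, attempt - 1);
    return Math.min(maxMb, (long) Math.ceil(size));
  }
}
```

With a factor of 1.0f every attempt gets {{baseMb}}, which matches the current 
behavior; with 1.5f and a 1024 MB base, the first re-try would ask for 1536 MB.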

> Re-Launch ApplicationMasters That Fail With OOM Using Larger Container
> --
>
> Key: YARN-9260
> URL: https://issues.apache.org/jira/browse/YARN-9260
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: BELUGA BEHR
>Priority: Major
>
> If an ApplicationMaster fails with an OOM, or is killed by YARN for using 
> more memory than is allowed by the container it's launched in, when re-trying 
> the AM, increase its container size.






[jira] [Updated] (YARN-9260) Re-Launch ApplicationMasters That Fail With OOM Using Larger Container

2019-01-31 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-9260:
--
Description: If an ApplicationMaster fails with an OOM, or is killed by 
YARN for using more memory than is allowed by the container it's launched in, 
when re-trying the AM, increase its container size.

> Re-Launch ApplicationMasters That Fail With OOM Using Larger Container
> --
>
> Key: YARN-9260
> URL: https://issues.apache.org/jira/browse/YARN-9260
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: BELUGA BEHR
>Priority: Major
>
> If an ApplicationMaster fails with an OOM, or is killed by YARN for using 
> more memory than is allowed by the container it's launched in, when re-trying 
> the AM, increase its container size.






[jira] [Created] (YARN-9260) Re-Launch ApplicationMasters That Fail With OOM Using Larger Container

2019-01-31 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-9260:
-

 Summary: Re-Launch ApplicationMasters That Fail With OOM Using 
Larger Container
 Key: YARN-9260
 URL: https://issues.apache.org/jira/browse/YARN-9260
 Project: Hadoop YARN
  Issue Type: Improvement
 Environment: If an ApplicationMaster fails with an OOM, or is killed 
by YARN for using more memory than is allowed by the container it's launched 
in, when re-trying the AM, increase its container size.
Reporter: BELUGA BEHR









[jira] [Comment Edited] (YARN-9259) Assign ApplicationMaster (AM) Heap Memory Based on Container Size

2019-01-31 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757518#comment-16757518
 ] 

BELUGA BEHR edited comment on YARN-9259 at 1/31/19 5:14 PM:


This goes hand-in-hand with another proposed idea: [MAPREDUCE-7180].  If the AM 
fails with a memory error, it should be relaunched with a larger container.  
However, the larger container may not be good enough; it may need a larger JVM 
heap size too (hence this ticket).  This is a cruder (and simpler) way of 
determining the correct container/heap size than [MAPREDUCE-5892].  It just 
tries progressively larger container sizes instead of having to come up 
with some heuristic to reason about the amount of memory required based on 
number of splits and other complicating factors.

Though, MR ApplicationMasters are not all that dynamic with regard to memory 
usage.  [MAPREDUCE-207] first needs to be addressed.


was (Author: belugabehr):
This goes hand-in-hand with another proposed Idea: [MAPREDUCE-7180].  If the AM 
fails with a memory error, it should be relaunched with a larger container (and 
therefore a larger JVM heap size).

> Assign ApplicationMaster (AM) Heap Memory Based on Container Size
> -
>
> Key: YARN-9259
> URL: https://issues.apache.org/jira/browse/YARN-9259
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: BELUGA BEHR
>Priority: Major
>
> [YARN-7936] introduced a sane default value for the ApplicationMaster (AM) 
> Java heap size.  However, [MAPREDUCE-5785] added a feature that sets the Java 
> Heap size of the Mapper/Reducer to be 80% (configurable, of course) of the 
> YARN container size.
> Please add similar logic for MR ApplicationMaster.  If the size of the AM 
> container is increased, the JVM heap size should be increased automatically too.






[jira] [Commented] (YARN-9260) Re-Launch ApplicationMasters That Fail With OOM Using Larger Container

2019-01-31 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757528#comment-16757528
 ] 

BELUGA BEHR commented on YARN-9260:
---

When re-launching the container, the AM JVM Heap memory needs to be increased 
too: [YARN-9259]

> Re-Launch ApplicationMasters That Fail With OOM Using Larger Container
> --
>
> Key: YARN-9260
> URL: https://issues.apache.org/jira/browse/YARN-9260
> Project: Hadoop YARN
>  Issue Type: Improvement
> Environment: If an ApplicationMaster fails with an OOM, or is killed 
> by YARN for using more memory than is allowed by the container it's launched 
> in, when re-trying the AM, increase its container size.
>Reporter: BELUGA BEHR
>Priority: Major
>







[jira] [Commented] (YARN-9259) Assign ApplicationMaster (AM) Heap Memory Based on Container Size

2019-01-31 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757518#comment-16757518
 ] 

BELUGA BEHR commented on YARN-9259:
---

This goes hand-in-hand with another proposed Idea: [MAPREDUCE-7180].  If the AM 
fails with a memory error, it should be relaunched with a larger container (and 
therefore a larger JVM heap size).

> Assign ApplicationMaster (AM) Heap Memory Based on Container Size
> -
>
> Key: YARN-9259
> URL: https://issues.apache.org/jira/browse/YARN-9259
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: BELUGA BEHR
>Priority: Major
>
> [YARN-7936] introduced a sane default value for the ApplicationMaster (AM) 
> Java heap size.  However, [MAPREDUCE-5785] added a feature that sets the Java 
> Heap size of the Mapper/Reducer to be 80% (configurable, of course) of the 
> YARN container size.
> Please add similar logic for MR ApplicationMaster.  If the size of the AM 
> container is increased, the JVM heap size should be increased automatically too.






[jira] [Updated] (YARN-9259) Assign ApplicationMaster (AM) Heap Memory Based on Container Size

2019-01-31 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-9259:
--
Summary: Assign ApplicationMaster (AM) Heap Memory Based on Container Size  
(was: Assign ApplicationMaster (AM) Memory Based on Container Size)

> Assign ApplicationMaster (AM) Heap Memory Based on Container Size
> -
>
> Key: YARN-9259
> URL: https://issues.apache.org/jira/browse/YARN-9259
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: BELUGA BEHR
>Priority: Major
>
> [YARN-7936] introduced a sane default value for the ApplicationMaster (AM) 
> Java heap size.  However, [MAPREDUCE-5785] added a feature that sets the Java 
> Heap size of the Mapper/Reducer to be 80% (configurable, of course) of the 
> YARN container size.
> Please add similar logic for MR ApplicationMaster.  If the size of the AM 
> container is increased, the JVM heap size should be increased automatically too.






[jira] [Created] (YARN-9259) Assign ApplicationMaster (AM) Memory Based on Container Size

2019-01-31 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-9259:
-

 Summary: Assign ApplicationMaster (AM) Memory Based on Container 
Size
 Key: YARN-9259
 URL: https://issues.apache.org/jira/browse/YARN-9259
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: BELUGA BEHR


[YARN-7936] introduced a sane default value for the ApplicationMaster (AM) Java 
heap size.  However, [MAPREDUCE-5785] added a feature that sets the Java Heap 
size of the Mapper/Reducer to be 80% (configurable, of course) of the YARN 
container size.

Please add similar logic for MR ApplicationMaster.  If the size of the AM 
container is increased, the JVM heap size should be increased automatically too.
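
A minimal sketch of the requested logic, assuming the same 80% ratio that is 
described above for Mappers/Reducers.  The class and method names are 
hypothetical, and no existing configuration key is implied.

```java
// Hypothetical helper: derives a JVM heap setting from the container size.
final class AmHeapSize {
  // Assumed default, taken from the 80% figure quoted above.
  static final float DEFAULT_HEAP_RATIO = 0.8f;

  /** Returns the heap size in MB for a container of the given size. */
  static int heapMb(int containerMb, float ratio) {
    if (containerMb <= 0 || ratio <= 0f || ratio > 1f) {
      throw new IllegalArgumentException("invalid container size or ratio");
    }
    // Truncate toward zero so the heap never exceeds ratio * container.
    return (int) (containerMb * (double) ratio);
  }

  /** Builds the matching -Xmx JVM option, e.g. for the AM launch command. */
  static String xmxOption(int containerMb, float ratio) {
    return "-Xmx" + heapMb(containerMb, ratio) + "m";
  }
}
```

The remaining share of the container is left for non-heap memory such as 
thread stacks and direct buffers, which is why the ratio stays below 1.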






[jira] [Commented] (YARN-9189) Clarify FairScheduler submission logging

2019-01-10 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739522#comment-16739522
 ] 

BELUGA BEHR commented on YARN-9189:
---

I was having the same problem as Patrick, I think.  I'm not sure how my local 
git repo got so completely out of sync.  I blew it away and started from 
scratch, so this latest patch should be good.  Sorry for the testing spam.

> Clarify FairScheduler submission logging
> 
>
> Key: YARN-9189
> URL: https://issues.apache.org/jira/browse/YARN-9189
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.2.0
>Reporter: Patrick Bayne
>Priority: Minor
> Attachments: YARN-9189.1.patch, YARN-9189.2.patch, YARN-9189.3.patch, 
> YARN-9189.4.patch
>
>
> Logging was ambiguous for the fairscheduler. It was unclear if the "total 
> number applications" was referring to the global total or the queue's total. 
> Fixed wording/spelling of output logging. 






[jira] [Updated] (YARN-9189) Clarify FairScheduler submission logging

2019-01-10 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-9189:
--
Attachment: YARN-9189.4.patch

> Clarify FairScheduler submission logging
> 
>
> Key: YARN-9189
> URL: https://issues.apache.org/jira/browse/YARN-9189
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.2.0
>Reporter: Patrick Bayne
>Priority: Minor
> Attachments: YARN-9189.1.patch, YARN-9189.2.patch, YARN-9189.3.patch, 
> YARN-9189.4.patch
>
>
> Logging was ambiguous for the fairscheduler. It was unclear if the "total 
> number applications" was referring to the global total or the queue's total. 
> Fixed wording/spelling of output logging. 






[jira] [Comment Edited] (YARN-9189) Clarify FairScheduler submission logging

2019-01-10 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738967#comment-16738967
 ] 

BELUGA BEHR edited comment on YARN-9189 at 1/10/19 3:35 PM:


I think the original work was done on the wrong branch.


was (Author: belugabehr):
I think the original work was done on the wrong branch.

I also updated logging to use slf4j.

> Clarify FairScheduler submission logging
> 
>
> Key: YARN-9189
> URL: https://issues.apache.org/jira/browse/YARN-9189
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.2.0
>Reporter: Patrick Bayne
>Priority: Minor
> Attachments: YARN-9189.1.patch, YARN-9189.2.patch, YARN-9189.3.patch, 
> YARN-9189.4.patch
>
>
> Logging was ambiguous for the fairscheduler. It was unclear if the "total 
> number applications" was referring to the global total or the queue's total. 
> Fixed wording/spelling of output logging. 






[jira] [Updated] (YARN-9189) Clarify FairScheduler submission logging

2019-01-10 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-9189:
--
Attachment: YARN-9189.3.patch

> Clarify FairScheduler submission logging
> 
>
> Key: YARN-9189
> URL: https://issues.apache.org/jira/browse/YARN-9189
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.2.0
>Reporter: Patrick Bayne
>Priority: Minor
> Attachments: YARN-9189.1.patch, YARN-9189.2.patch, YARN-9189.3.patch
>
>
> Logging was ambiguous for the fairscheduler. It was unclear if the "total 
> number applications" was referring to the global total or the queue's total. 
> Fixed wording/spelling of output logging. 






[jira] [Updated] (YARN-9189) Clarify FairScheduler submission logging

2019-01-09 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-9189:
--
Attachment: YARN-9189.2.patch

> Clarify FairScheduler submission logging
> 
>
> Key: YARN-9189
> URL: https://issues.apache.org/jira/browse/YARN-9189
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.2.0
>Reporter: Patrick Bayne
>Priority: Minor
> Attachments: YARN-9189.1.patch, YARN-9189.2.patch
>
>
> Logging was ambiguous for the fairscheduler. It was unclear if the "total 
> number applications" was referring to the global total or the queue's total. 
> Fixed wording/spelling of output logging. 






[jira] [Commented] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-12-03 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707273#comment-16707273
 ] 

BELUGA BEHR commented on YARN-8789:
---

[~pbacsko] The {{offer()}} method is wrapped in a 'while' clause so it will 
continue to attempt to put the event in the queue for as long as it takes.  
Events are not lost.

[~wilfreds] Customer is using a version of CDH from before [MAPREDUCE-5124] was 
introduced.  This queue change is also intended to throttle.  If the queue is 
full, the producers will wait (their threads will block).  If they wait a long 
time, I imagine that the events coming from remote clients like a Mapper or 
Reducer will simply time out and fail.  The tasks will have to be re-tried, but 
it is better, in my mind, to have to restart a subset of tasks than to kill the 
AM with an OOM and never complete.
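
The behavior described above can be sketched roughly as follows.  This is an 
illustration only, not the code from the attached patches; the class name, the 
100 ms timeout, and the use of {{LinkedBlockingQueue}} are assumptions.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper showing the offer()-in-a-loop pattern.
final class BoundedEventQueue<T> {
  private final BlockingQueue<T> queue;

  BoundedEventQueue(int capacity) {
    this.queue = new LinkedBlockingQueue<>(capacity);
  }

  /** Blocks the producer until the event is accepted; never drops it. */
  void enqueue(T event) throws InterruptedException {
    // offer() with a timeout returns false while the queue is still full,
    // so the loop simply tries again until there is room.
    while (!queue.offer(event, 100, TimeUnit.MILLISECONDS)) {
      // A real dispatcher might log or check a stop flag here.
    }
  }

  /** Removes the next event, waiting if the queue is empty. */
  T take() throws InterruptedException {
    return queue.take();
  }

  int size() {
    return queue.size();
  }
}
```

Because the producer thread stays inside {{enqueue()}} while the queue is full, 
memory use is bounded by the queue capacity rather than by the event rate.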



> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.10.patch, 
> YARN-8789.12.patch, YARN-8789.14.patch, YARN-8789.2.patch, YARN-8789.3.patch, 
> YARN-8789.4.patch, YARN-8789.5.patch, YARN-8789.6.patch, YARN-8789.7.patch, 
> YARN-8789.7.patch, YARN-8789.8.patch, YARN-8789.9.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx






[jira] [Comment Edited] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-12-03 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707273#comment-16707273
 ] 

BELUGA BEHR edited comment on YARN-8789 at 12/3/18 2:31 PM:


[~pbacsko] The {{offer()}} method is wrapped in a 'while' clause so it will 
continue to attempt to put the event in the queue for as long as it takes.  
Events are not lost.

[~wilfreds] Customer is using a version of CDH from before [MAPREDUCE-5124] was 
introduced.  This queue change is also intended to throttle clients and protect 
the AM.  If the queue is full, the producers will wait (their threads will 
block).  If they wait a long time, I imagine that the events coming from 
remote clients like a Mapper or Reducer will simply time out and fail.  The 
tasks will have to be re-tried, but it is better, in my mind, to have to 
restart a subset of tasks than to kill the AM with an OOM and never complete.




was (Author: belugabehr):
[~pbacsko] The {{offer()}} method is wrapped in a 'while' clause so it will 
continue to attempt to put the event in the queue for as long as it takes.  
Events are not lost.

[~wilfreds] Customer is using a version of CDH from before [MAPREDUCE-5124] was 
introduced.  This queue change is also intended to throttle.  If the queue is 
full, the producers will wait (their threads will block).  If they wait a long 
time, I imagine that the events coming from remote clients like a Mapper or 
Reducer will simply time out and fail.  The tasks will have to be re-tried, but 
it is better, in my mind, to have to restart a subset of tasks than to kill the 
AM with an OOM and never complete.



> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.10.patch, 
> YARN-8789.12.patch, YARN-8789.14.patch, YARN-8789.2.patch, YARN-8789.3.patch, 
> YARN-8789.4.patch, YARN-8789.5.patch, YARN-8789.6.patch, YARN-8789.7.patch, 
> YARN-8789.7.patch, YARN-8789.8.patch, YARN-8789.9.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx






[jira] [Created] (YARN-9076) ContainerShellWebSocket Render Process Output

2018-11-30 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-9076:
-

 Summary: ContainerShellWebSocket Render Process Output
 Key: YARN-9076
 URL: https://issues.apache.org/jira/browse/YARN-9076
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: webapp
Affects Versions: 3.3.0
Reporter: BELUGA BEHR
 Attachments: YARN-9076.1.patch

{code:java|title=ContainerShellWebSocket.java}
// Render process output
int no = pair.in.available();
pair.in.read(buffer, 0, Math.min(no, buffer.length));
String formatted = new String(buffer, Charset.forName("UTF-8"))
.replaceAll("\n", "\r\n");
session.getRemote().sendString(formatted);
  }
{code}

This code strikes me as a bit odd.  First of all, it is using {{available()}}, 
which is known to be unreliable and inaccurate (e.g., for sockets).  
Second, it will only read a max of 4000 characters and that's it.  Anything 
else is truncated.

Change this code to read the entire data stream.
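
One possible shape for "read the entire data stream", shown as a sketch rather 
than as the attached patch.  The class and method names are hypothetical.  It 
assumes the stream eventually reaches end-of-stream, which may not hold for an 
interactive shell whose output stays open.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ContainerShellWebSocket.
final class DrainStream {
  /** Reads until end-of-stream and returns the data as UTF-8 text. */
  static String readAll(InputStream in) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[4096];
    int n;
    // read() returns the number of bytes stored, or -1 at end of stream,
    // so nothing is truncated and available() is never consulted.
    while ((n = in.read(buffer)) != -1) {
      out.write(buffer, 0, n);
    }
    // Decode once at the end so multi-byte characters are never split.
    return new String(out.toByteArray(), StandardCharsets.UTF_8)
        .replace("\n", "\r\n");
  }
}
```

The trade-off is that nothing is sent until the stream closes, and memory use 
grows with the amount of output.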






[jira] [Assigned] (YARN-9076) ContainerShellWebSocket Render Process Output

2018-11-30 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned YARN-9076:
-

Assignee: BELUGA BEHR

> ContainerShellWebSocket Render Process Output
> -
>
> Key: YARN-9076
> URL: https://issues.apache.org/jira/browse/YARN-9076
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Affects Versions: 3.3.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-9076.1.patch
>
>
> {code:java|title=ContainerShellWebSocket.java}
> // Render process output
> int no = pair.in.available();
> pair.in.read(buffer, 0, Math.min(no, buffer.length));
> String formatted = new String(buffer, Charset.forName("UTF-8"))
> .replaceAll("\n", "\r\n");
> session.getRemote().sendString(formatted);
>   }
> {code}
> This code strikes me as a bit odd.  First of all, it is using {{available()}}, 
> which is known to be unreliable and inaccurate (e.g., for sockets).  
> Second, it will only read a max of 4000 characters and that's it.  Anything 
> else is truncated.
> Change this code to read the entire data stream.






[jira] [Updated] (YARN-9076) ContainerShellWebSocket Render Process Output

2018-11-30 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-9076:
--
Attachment: YARN-9076.1.patch

> ContainerShellWebSocket Render Process Output
> -
>
> Key: YARN-9076
> URL: https://issues.apache.org/jira/browse/YARN-9076
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Affects Versions: 3.3.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-9076.1.patch
>
>
> {code:java|title=ContainerShellWebSocket.java}
> // Render process output
> int no = pair.in.available();
> pair.in.read(buffer, 0, Math.min(no, buffer.length));
> String formatted = new String(buffer, Charset.forName("UTF-8"))
> .replaceAll("\n", "\r\n");
> session.getRemote().sendString(formatted);
>   }
> {code}
> This code strikes me as a bit odd.  First of all, it is using {{available()}}, 
> which is known to be unreliable and inaccurate (e.g., for sockets).  
> Second, it will only read a max of 4000 characters and that's it.  Anything 
> else is truncated.
> Change this code to read the entire data stream.



[jira] [Updated] (YARN-8928) TestRMAdminService is failing

2018-10-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8928:
--
Attachment: YARN-8928.3.patch

> TestRMAdminService is failing
> -
>
> Key: YARN-8928
> URL: https://issues.apache.org/jira/browse/YARN-8928
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Jason Lowe
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8928.1.patch, YARN-8928.2.patch, YARN-8928.3.patch
>
>
> After HADOOP-15836 TestRMAdminService has started failing consistently.  
> Sample stacktraces to follow.



[jira] [Commented] (YARN-8928) TestRMAdminService is failing

2018-10-22 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659637#comment-16659637
 ] 

BELUGA BEHR commented on YARN-8928:
---

Hey [~elgoiri], thanks for the review.  I'm sorry for the headache for you, 
especially since you've been very supportive of my efforts to scrub the code 
base.

 

I'm working on HADOOP-12640; I actually did some work against this class a long 
time ago, which I can now complete.  I can use that ticket as an opportunity to 
update the JavaDoc to include information about ordering.  I agree that it 
should be in there.  I don't want to do too much in this ticket, though; I just 
want the project to pass all unit tests ASAP.

Implementing {{buildAclString}} with a {{TreeSet}} seems like a bit of 
overkill.  The test does not have to implement things the same way as the 
actual code (which may change at some point in the future).  It just needs to 
put things in order.  However, I can move the comments to the new test method.
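
To illustrate the point about ordering, here is a minimal sketch of a test helper that sorts the names itself rather than mirroring the production data structure. It is not the helper from the patch: the class name {{AclOrderSketch}}, the signature, and the comma separator are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch, not the helper from the YARN-8928 patches.
final class AclOrderSketch {

  /**
   * Returns the given user names in sorted order, joined by commas.
   * The separator is an assumption chosen for this example.
   */
  static String buildAclString(String... users) {
    String[] sorted = users.clone(); // leave the caller's array untouched
    Arrays.sort(sorted);             // natural String order
    return String.join(",", sorted);
  }

  public static void main(String[] args) {
    // The name of the user running the test may sort before or after
    // the fixed names, so the expected string is built the same way.
    String currentUser = System.getProperty("user.name");
    System.out.println(buildAclString("yarn", currentUser, "hadoop"));
  }
}
```

Sorting a copy of the array keeps the helper independent of whatever collection the production code happens to use.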

> TestRMAdminService is failing
> -
>
> Key: YARN-8928
> URL: https://issues.apache.org/jira/browse/YARN-8928
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Jason Lowe
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8928.1.patch, YARN-8928.2.patch
>
>
> After HADOOP-15836 TestRMAdminService has started failing consistently.  
> Sample stacktraces to follow.



[jira] [Updated] (YARN-8928) TestRMAdminService is failing

2018-10-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8928:
--
Attachment: YARN-8928.2.patch

> TestRMAdminService is failing
> -
>
> Key: YARN-8928
> URL: https://issues.apache.org/jira/browse/YARN-8928
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Jason Lowe
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8928.1.patch, YARN-8928.2.patch
>
>
> After HADOOP-15836 TestRMAdminService has started failing consistently.  
> Sample stacktraces to follow.



[jira] [Commented] (YARN-8928) TestRMAdminService is failing

2018-10-22 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659138#comment-16659138
 ] 

BELUGA BEHR commented on YARN-8928:
---

One of the challenges here is that the tests use the name of the user that is 
running the test.  Since the ordering was made alphabetical, the name of the 
user can impact the test.  I've included a patch to address this.

> TestRMAdminService is failing
> -
>
> Key: YARN-8928
> URL: https://issues.apache.org/jira/browse/YARN-8928
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Jason Lowe
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8928.1.patch
>
>
> After HADOOP-15836 TestRMAdminService has started failing consistently.  
> Sample stacktraces to follow.



[jira] [Updated] (YARN-8928) TestRMAdminService is failing

2018-10-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8928:
--
Attachment: YARN-8928.1.patch

> TestRMAdminService is failing
> -
>
> Key: YARN-8928
> URL: https://issues.apache.org/jira/browse/YARN-8928
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Jason Lowe
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8928.1.patch
>
>
> After HADOOP-15836 TestRMAdminService has started failing consistently.  
> Sample stacktraces to follow.



[jira] [Assigned] (YARN-8928) TestRMAdminService is failing

2018-10-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned YARN-8928:
-

Assignee: BELUGA BEHR

> TestRMAdminService is failing
> -
>
> Key: YARN-8928
> URL: https://issues.apache.org/jira/browse/YARN-8928
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Jason Lowe
>Assignee: BELUGA BEHR
>Priority: Major
>
> After HADOOP-15836 TestRMAdminService has started failing consistently.  
> Sample stacktraces to follow.



[jira] [Assigned] (YARN-8894) Improve InMemoryPlan.java toString

2018-10-16 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned YARN-8894:
-

Assignee: BELUGA BEHR

> Improve InMemoryPlan.java toString
> --
>
> Key: YARN-8894
> URL: https://issues.apache.org/jira/browse/YARN-8894
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: reservation system
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-8894.1.patch
>
>
> * Replace {{StringBuffer}} with {{StringBuilder}}
> * Add spaces between fields for readability



[jira] [Updated] (YARN-8894) Improve InMemoryPlan.java toString

2018-10-16 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8894:
--
Attachment: YARN-8894.1.patch

> Improve InMemoryPlan.java toString
> --
>
> Key: YARN-8894
> URL: https://issues.apache.org/jira/browse/YARN-8894
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: reservation system
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-8894.1.patch
>
>
> * Replace {{StringBuffer}} with {{StringBuilder}}
> * Add spaces between fields for readability



[jira] [Created] (YARN-8894) Improve InMemoryPlan.java toString

2018-10-16 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-8894:
-

 Summary: Improve InMemoryPlan.java toString
 Key: YARN-8894
 URL: https://issues.apache.org/jira/browse/YARN-8894
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: reservation system
Affects Versions: 3.2.0
Reporter: BELUGA BEHR


* Replace {{StringBuffer}} with {{StringBuilder}}
* Add spaces between fields for readability
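
As an illustration of the two bullet points, here is a minimal sketch of a {{toString}} built with {{StringBuilder}} and with separators between fields. The class {{PlanToStringSketch}} and its fields are hypothetical stand-ins, not the real {{InMemoryPlan}} members.

```java
// Hypothetical sketch; the fields are not the actual InMemoryPlan fields.
final class PlanToStringSketch {
  private final String queueName;
  private final long step;
  private final int reservationCount;

  PlanToStringSketch(String queueName, long step, int reservationCount) {
    this.queueName = queueName;
    this.step = step;
    this.reservationCount = reservationCount;
  }

  @Override
  public String toString() {
    // A local StringBuilder is never shared between threads, so the
    // locking that StringBuffer performs would buy nothing here.
    StringBuilder sb = new StringBuilder();
    sb.append("queue: ").append(queueName)
        .append(", step: ").append(step)
        .append(", reservations: ").append(reservationCount);
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(new PlanToStringSketch("default", 1000L, 2));
  }
}
```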



[jira] [Updated] (YARN-8878) Remove StringBuffer from ManagedParentQueue.java

2018-10-15 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8878:
--
Attachment: YARN-8878.1.patch

> Remove StringBuffer from ManagedParentQueue.java
> 
>
> Key: YARN-8878
> URL: https://issues.apache.org/jira/browse/YARN-8878
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8878.1.patch
>
>
> Remove all {{StringBuffer}} references from the class {{ManagedParentQueue}}. 
>  {{StringBuffer}} is synchronized and should instead be replaced with the 
> non-synchronized {{StringBuilder}}. However, in this case, just use {{SLF4J}} 
> parameterized logging.



[jira] [Updated] (YARN-8878) Remove StringBuffer from ManagedParentQueue.java

2018-10-15 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8878:
--
Attachment: (was: YARN-8878.1.patch)

> Remove StringBuffer from ManagedParentQueue.java
> 
>
> Key: YARN-8878
> URL: https://issues.apache.org/jira/browse/YARN-8878
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
>
> Remove all {{StringBuffer}} references from the class {{ManagedParentQueue}}. 
>  {{StringBuffer}} is synchronized and should instead be replaced with the 
> non-synchronized {{StringBuilder}}. However, in this case, just use {{SLF4J}} 
> parameterized logging.



[jira] [Updated] (YARN-8878) Remove StringBuffer from ManagedParentQueue.java

2018-10-15 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8878:
--
Attachment: YARN-8878.1.patch

> Remove StringBuffer from ManagedParentQueue.java
> 
>
> Key: YARN-8878
> URL: https://issues.apache.org/jira/browse/YARN-8878
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8878.1.patch
>
>
> Remove all {{StringBuffer}} references from the class {{ManagedParentQueue}}. 
>  {{StringBuffer}} is synchronized and should instead be replaced with the 
> non-synchronized {{StringBuilder}}. However, in this case, just use {{SLF4J}} 
> parameterized logging.



[jira] [Assigned] (YARN-8878) Remove StringBuffer from ManagedParentQueue.java

2018-10-15 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned YARN-8878:
-

Assignee: BELUGA BEHR

> Remove StringBuffer from ManagedParentQueue.java
> 
>
> Key: YARN-8878
> URL: https://issues.apache.org/jira/browse/YARN-8878
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8878.1.patch
>
>
> Remove all {{StringBuffer}} references from the class {{ManagedParentQueue}}. 
>  {{StringBuffer}} is synchronized and should instead be replaced with the 
> non-synchronized {{StringBuilder}}. However, in this case, just use {{SLF4J}} 
> parameterized logging.



[jira] [Created] (YARN-8878) Remove StringBuffer from ManagedParentQueue.java

2018-10-15 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-8878:
-

 Summary: Remove StringBuffer from ManagedParentQueue.java
 Key: YARN-8878
 URL: https://issues.apache.org/jira/browse/YARN-8878
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 3.2.0
Reporter: BELUGA BEHR
 Attachments: YARN-8878.1.patch

Remove all {{StringBuffer}} references from the class {{ManagedParentQueue}}.  
{{StringBuffer}} is synchronized and should instead be replaced with the 
non-synchronized {{StringBuilder}}. However, in this case, just use {{SLF4J}} 
parameterized logging.



[jira] [Assigned] (YARN-8146) Remove LinkedList From resourcemanager.reservation.planning Package

2018-10-01 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned YARN-8146:
-

Assignee: BELUGA BEHR

> Remove LinkedList From resourcemanager.reservation.planning Package
> ---
>
> Key: YARN-8146
> URL: https://issues.apache.org/jira/browse/YARN-8146
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: reservation system
>Affects Versions: 3.0.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8146.1.patch
>
>
> Remove {{LinkedList}} instances in favor of {{ArrayList}}.  {{ArrayList}} is 
> generally more memory efficient, causes less memory fragmentation, and, thanks 
> to better memory locality, is faster to iterate.
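
A minimal sketch of why such a swap tends to be low risk: code written against the {{List}} interface behaves the same with either implementation, so only the constructor call changes. The class {{ListChoiceSketch}} and its methods are hypothetical and unrelated to the reservation planning code.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch, unrelated to the actual planning package classes.
final class ListChoiceSketch {

  /** Fills the given list with 0..n-1 and returns it. */
  static List<Integer> fill(List<Integer> target, int n) {
    for (int i = 0; i < n; i++) {
      target.add(i);
    }
    return target;
  }

  /** Sums the values by iterating; works with any List implementation. */
  static long sum(List<Integer> values) {
    long total = 0;
    for (int v : values) {
      total += v;
    }
    return total;
  }

  public static void main(String[] args) {
    List<Integer> linked = fill(new LinkedList<>(), 5);
    // ArrayList stores elements in one backing array; passing the expected
    // size up front avoids intermediate resizing.
    List<Integer> array = fill(new ArrayList<>(5), 5);
    System.out.println(linked.equals(array)); // true
    System.out.println(sum(array));           // 10
  }
}
```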



[jira] [Commented] (YARN-8837) TestNMProxy.testNMProxyRPCRetry Improvement

2018-09-30 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633511#comment-16633511
 ] 

BELUGA BEHR commented on YARN-8837:
---

{{testNMProxyRetry()}} failed right on cue. New assertion error message:

{code:java}
Expected: an instance of java.net.SocketException
 but: http://wiki.apache.org/hadoop/UnknownHost> is a java.net.UnknownHostException
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:865)
at org.junit.Assert.assertThat(Assert.java:832)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy.testNMProxyRPCRetry(TestNMProxy.java:173)
{code}

Please accept this patch for inclusion in the project.

> TestNMProxy.testNMProxyRPCRetry Improvement
> ---
>
> Key: YARN-8837
> URL: https://issues.apache.org/jira/browse/YARN-8837
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8789.1.patch
>
>
> The unit test 
> {{org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy.testNMProxyRetry()}}
>  has had some issues in the past. You can search JIRA for it, but one example 
> is [YARN-5104].  I recently had some issues with it myself and found the 
> following change helpful in troubleshooting.
> {code:java|title=Current Implementation}
> } catch (IOException e) {
>   // socket exception should be thrown immediately, without RPC retries.
>   Assert.assertTrue(e instanceof java.net.SocketException);
> }
> {code}
> The issue here is that the assertion is a bare true/false check.  The testing 
> framework does not give me any feedback regarding the type of exception that 
> was thrown; it just says "assertion failed."



[jira] [Updated] (YARN-8837) TestNMProxy.testNMProxyRPCRetry Improvement

2018-09-30 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8837:
--
Attachment: YARN-8789.1.patch

> TestNMProxy.testNMProxyRPCRetry Improvement
> ---
>
> Key: YARN-8837
> URL: https://issues.apache.org/jira/browse/YARN-8837
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8789.1.patch
>
>
> The unit test 
> {{org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy.testNMProxyRetry()}}
>  has had some issues in the past. You can search JIRA for it, but one example 
> is [YARN-5104].  I recently had some issues with it myself and found the 
> following change helpful in troubleshooting.
> {code:java|title=Current Implementation}
> } catch (IOException e) {
>   // socket exception should be thrown immediately, without RPC retries.
>   Assert.assertTrue(e instanceof java.net.SocketException);
> }
> {code}
> The issue here is that the assertion is a bare true/false check.  The testing 
> framework does not give me any feedback regarding the type of exception that 
> was thrown; it just says "assertion failed."



[jira] [Assigned] (YARN-8837) TestNMProxy.testNMProxyRPCRetry Improvement

2018-09-30 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned YARN-8837:
-

Assignee: BELUGA BEHR

> TestNMProxy.testNMProxyRPCRetry Improvement
> ---
>
> Key: YARN-8837
> URL: https://issues.apache.org/jira/browse/YARN-8837
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
>
> The unit test 
> {{org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy.testNMProxyRetry()}}
>  has had some issues in the past. You can search JIRA for it, but one example 
> is [YARN-5104].  I recently had some issues with it myself and found the 
> following change helpful in troubleshooting.
> {code:java|title=Current Implementation}
> } catch (IOException e) {
>   // socket exception should be thrown immediately, without RPC retries.
>   Assert.assertTrue(e instanceof java.net.SocketException);
> }
> {code}
> The issue here is that the assertion is a bare true/false check.  The testing 
> framework does not give me any feedback regarding the type of exception that 
> was thrown; it just says "assertion failed."



[jira] [Created] (YARN-8837) TestNMProxy.testNMProxyRPCRetry Improvement

2018-09-30 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-8837:
-

 Summary: TestNMProxy.testNMProxyRPCRetry Improvement
 Key: YARN-8837
 URL: https://issues.apache.org/jira/browse/YARN-8837
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Affects Versions: 3.2.0
Reporter: BELUGA BEHR


The unit test 
{{org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy.testNMProxyRetry()}}
 has had some issues in the past. You can search JIRA for it, but one example 
is [YARN-5104].  I recently had some issues with it myself and found the 
following change helpful in troubleshooting.

{code:java|title=Current Implementation}
} catch (IOException e) {
  // socket exception should be thrown immediately, without RPC retries.
  Assert.assertTrue(e instanceof java.net.SocketException);
}
{code}


The issue here is that the assertion is a bare true/false check.  The testing 
framework does not give me any feedback regarding the type of exception that 
was thrown; it just says "assertion failed."
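
To make the difference concrete, here is a dependency-free sketch of an assertion that names the actual exception type when it fails. It is a stand-in for a matcher-style assertion and not the code from the patch; {{DescriptiveAssertSketch}}, {{assertInstanceOf}}, and {{failureMessage}} are hypothetical names, and the message format is only modeled on typical matcher output.

```java
import java.net.SocketException;
import java.net.UnknownHostException;

// Hypothetical stand-in; it does not use JUnit or Hamcrest.
final class DescriptiveAssertSketch {

  /** Fails with a message naming both the expected and the actual type. */
  static void assertInstanceOf(Class<?> expected, Object actual) {
    if (!expected.isInstance(actual)) {
      String actualType = (actual == null) ? "null" : actual.getClass().getName();
      throw new AssertionError("Expected: an instance of " + expected.getName()
          + " but: <" + actual + "> is a " + actualType);
    }
  }

  /** Returns the failure message, or null when the check passes. */
  static String failureMessage(Class<?> expected, Object actual) {
    try {
      assertInstanceOf(expected, actual);
      return null;
    } catch (AssertionError e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    // UnknownHostException is an IOException but not a SocketException,
    // so the check fails and the message shows the real type.
    Exception thrown = new UnknownHostException("no-such-host");
    System.out.println(failureMessage(SocketException.class, thrown));
  }
}
```

With a plain {{assertTrue}}, the same failure would carry no hint that an {{UnknownHostException}} was the cause.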



[jira] [Commented] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-30 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633375#comment-16633375
 ] 

BELUGA BEHR commented on YARN-8789:
---

The one failed unit test passes locally.  This test seems to have problems 
regularly; see YARN-5104.

 

Please consider the latest patch [^YARN-8789.14.patch] for inclusion into the 
project.

 

Thanks!

 
{code:java}
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.683 s - in org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
{code}

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.10.patch, 
> YARN-8789.12.patch, YARN-8789.14.patch, YARN-8789.2.patch, YARN-8789.3.patch, 
> YARN-8789.4.patch, YARN-8789.5.patch, YARN-8789.6.patch, YARN-8789.7.patch, 
> YARN-8789.7.patch, YARN-8789.8.patch, YARN-8789.9.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  The logs showed that the event queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some cleanup, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx
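
The throttling idea can be sketched with a fixed-capacity {{ArrayBlockingQueue}}. This is an illustration under assumptions rather than the attached patch: {{BoundedEventQueueSketch}} and its methods are hypothetical, and events are plain strings to keep the example short.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch, not the AsyncDispatcher change from the patches.
final class BoundedEventQueueSketch {
  private final BlockingQueue<String> events;

  BoundedEventQueueSketch(int capacity) {
    this.events = new ArrayBlockingQueue<>(capacity);
  }

  /** Blocks the calling producer while the queue is full (back-pressure). */
  void post(String event) throws InterruptedException {
    events.put(event);
  }

  /** Non-blocking variant: returns false instead of waiting when full. */
  boolean tryPost(String event) {
    return events.offer(event);
  }

  /** Removes and returns the oldest event, or null when the queue is empty. */
  String next() {
    return events.poll();
  }

  int size() {
    return events.size();
  }

  public static void main(String[] args) {
    BoundedEventQueueSketch queue = new BoundedEventQueueSketch(2);
    System.out.println(queue.tryPost("a")); // true
    System.out.println(queue.tryPost("b")); // true
    System.out.println(queue.tryPost("c")); // false: the queue is full
    queue.next();                           // the consumer frees one slot
    System.out.println(queue.tryPost("c")); // true
  }
}
```

With {{put}}, a producer that outpaces the consumer simply waits, so memory use stays bounded by the chosen capacity instead of growing with the backlog.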



[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-29 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.14.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.10.patch, 
> YARN-8789.12.patch, YARN-8789.14.patch, YARN-8789.2.patch, YARN-8789.3.patch, 
> YARN-8789.4.patch, YARN-8789.5.patch, YARN-8789.6.patch, YARN-8789.7.patch, 
> YARN-8789.7.patch, YARN-8789.8.patch, YARN-8789.9.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  The logs showed that the event queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some cleanup, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx



[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-28 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.12.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.10.patch, 
> YARN-8789.12.patch, YARN-8789.2.patch, YARN-8789.3.patch, YARN-8789.4.patch, 
> YARN-8789.5.patch, YARN-8789.6.patch, YARN-8789.7.patch, YARN-8789.7.patch, 
> YARN-8789.8.patch, YARN-8789.9.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  The logs showed that the event queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some cleanup, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx



[jira] [Assigned] (YARN-8832) Review of RMCommunicator Class

2018-09-27 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned YARN-8832:
-

Assignee: BELUGA BEHR

> Review of RMCommunicator Class
> --
>
> Key: YARN-8832
> URL: https://issues.apache.org/jira/browse/YARN-8832
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-88321.patch
>
>
> Various improvements to the {{RMCommunicator}} class.
>  
>  * Use SLF4J parameterized logging
>  * Use a {{switch}} statement instead of {{if}}-{{else}} statements
>  * Remove anti-pattern of "log and throw" (just throw)
>  * Use a flag to stop the thread instead of an interrupt (the interrupt may 
> hit the heartbeat code and not the thread loop)
>  * The main thread loop processes the heartbeat callback queue until 
> the queue is empty.  It's technically possible that other threads could 
> constantly put new callbacks into the queue and therefore the main thread 
> never progresses past the callbacks.  Put a cap on the number of callbacks 
> that will be processed in any iteration.



[jira] [Updated] (YARN-8832) Review of RMCommunicator Class

2018-09-27 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8832:
--
Attachment: YARN-88321.patch

> Review of RMCommunicator Class
> --
>
> Key: YARN-8832
> URL: https://issues.apache.org/jira/browse/YARN-8832
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-88321.patch
>
>
> Various improvements to the {{RMCommunicator}} class.
>  
>  * Use SLF4J parameterized logging
>  * Use a {{switch}} statement instead of {{if}}-{{else}} statements
>  * Remove anti-pattern of "log and throw" (just throw)
>  * Use a flag to stop the thread instead of an interrupt (the interrupt may 
> hit the heartbeat code and not the thread loop)
>  * The main thread loop processes the heartbeat callback queue until 
> the queue is empty.  It's technically possible that other threads could 
> constantly put new callbacks into the queue and therefore the main thread 
> never progresses past the callbacks.  Put a cap on the number of callbacks 
> that will be processed in any iteration.



[jira] [Created] (YARN-8832) Review of RMCommunicator Class

2018-09-27 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-8832:
-

 Summary: Review of RMCommunicator Class
 Key: YARN-8832
 URL: https://issues.apache.org/jira/browse/YARN-8832
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: applications
Affects Versions: 3.2.0
Reporter: BELUGA BEHR


Various improvements to the {{RMCommunicator}} class.

 
 * Use SLF4J parameterized logging
 * Use a {{switch}} statement instead of {{if}}-{{else}} statements
 * Remove anti-pattern of "log and throw" (just throw)
 * Use a flag to stop the thread instead of an interrupt (the interrupt may 
hit the heartbeat code and not the thread loop)
 * The main thread loop processes the heartbeat callback queue until 
the queue is empty.  It's technically possible that other threads could 
constantly put new callbacks into the queue and therefore the main thread never 
progresses past the callbacks.  Put a cap on the number of callbacks that will 
be processed in any iteration.
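
The last two items in the list above can be sketched as follows. This is an illustration under assumptions, not the attached patch: {{CappedCallbackLoopSketch}} and all of its members are hypothetical, and a real loop would also wait between heartbeats.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch, not the RMCommunicator code.
final class CappedCallbackLoopSketch {
  private final Queue<Runnable> callbacks = new ConcurrentLinkedQueue<>();
  private volatile boolean stopped; // written by stop(), read by the loop

  void enqueue(Runnable callback) {
    callbacks.add(callback);
  }

  void stop() {
    stopped = true;
  }

  int pending() {
    return callbacks.size();
  }

  /** Runs at most maxPerIteration callbacks and returns how many ran. */
  int drainCallbacks(int maxPerIteration) {
    int processed = 0;
    Runnable callback;
    while (processed < maxPerIteration && (callback = callbacks.poll()) != null) {
      callback.run();
      processed++;
    }
    return processed;
  }

  /** Checks the flag on each pass instead of relying on an interrupt. */
  void runLoop(Runnable heartbeat, int maxPerIteration) {
    while (!stopped) {
      heartbeat.run();
      drainCallbacks(maxPerIteration);
    }
  }

  public static void main(String[] args) {
    CappedCallbackLoopSketch loop = new CappedCallbackLoopSketch();
    for (int i = 0; i < 5; i++) {
      loop.enqueue(() -> { });
    }
    System.out.println(loop.drainCallbacks(3)); // 3
    System.out.println(loop.pending());         // 2
    loop.runLoop(loop::stop, 10); // the heartbeat requests a stop on the first pass
    System.out.println(loop.pending());         // 0
  }
}
```

Because each pass handles a bounded number of callbacks, producers that keep adding work can delay the next heartbeat only by a bounded amount.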



[jira] [Created] (YARN-8831) Review of LocalContainerAllocator

2018-09-27 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-8831:
-

 Summary: Review of LocalContainerAllocator
 Key: YARN-8831
 URL: https://issues.apache.org/jira/browse/YARN-8831
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: applications
Affects Versions: 3.2.0
Reporter: BELUGA BEHR
 Attachments: YARN-8831.1.patch

Some trivial cleanup of class {{LocalContainerAllocator}}






[jira] [Updated] (YARN-8831) Review of LocalContainerAllocator

2018-09-27 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8831:
--
Priority: Trivial  (was: Minor)

> Review of LocalContainerAllocator
> -
>
> Key: YARN-8831
> URL: https://issues.apache.org/jira/browse/YARN-8831
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8831.1.patch
>
>
> Some trivial cleanup of class {{LocalContainerAllocator}}






[jira] [Assigned] (YARN-8831) Review of LocalContainerAllocator

2018-09-27 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned YARN-8831:
-

Assignee: BELUGA BEHR

> Review of LocalContainerAllocator
> -
>
> Key: YARN-8831
> URL: https://issues.apache.org/jira/browse/YARN-8831
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8831.1.patch
>
>
> Some trivial cleanup of class {{LocalContainerAllocator}}






[jira] [Updated] (YARN-8831) Review of LocalContainerAllocator

2018-09-27 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8831:
--
Attachment: YARN-8831.1.patch

> Review of LocalContainerAllocator
> -
>
> Key: YARN-8831
> URL: https://issues.apache.org/jira/browse/YARN-8831
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8831.1.patch
>
>
> Some trivial cleanup of class {{LocalContainerAllocator}}






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-27 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.10.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.10.patch, 
> YARN-8789.2.patch, YARN-8789.3.patch, YARN-8789.4.patch, YARN-8789.5.patch, 
> YARN-8789.6.patch, YARN-8789.7.patch, YARN-8789.7.patch, YARN-8789.8.patch, 
> YARN-8789.9.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx
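
The throttling idea in the description above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in any of the attached patches and not the real {{AsyncDispatcher}} API: the class {{BoundedEventQueueSketch}} and its method names are hypothetical. It relies only on {{java.util.concurrent.LinkedBlockingQueue}}, whose capacity-bounded constructor makes {{put}} block while the queue is full.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a bounded event queue; not the AsyncDispatcher API. */
class BoundedEventQueueSketch<E> {
  private final BlockingQueue<E> eventQueue;

  BoundedEventQueueSketch(int capacity) {
    // A capacity-bounded queue: producers wait instead of growing the heap.
    this.eventQueue = new LinkedBlockingQueue<>(capacity);
  }

  /** Blocks while the queue is full; returns false only if interrupted. */
  boolean put(E event) {
    try {
      eventQueue.put(event);
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    }
  }

  /** Waits up to timeoutMillis for space; returns false if still full. */
  boolean offer(E event, long timeoutMillis) {
    try {
      return eventQueue.offer(event, timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  /** Removes and returns the oldest event, or null if the queue is empty. */
  E poll() {
    return eventQueue.poll();
  }

  int size() {
    return eventQueue.size();
  }
}
```

With a capacity of 2, a third {{offer}} fails after its timeout until the consumer removes an event, which is the back-pressure that keeps the queue size from growing without bound.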






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-26 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.9.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch, YARN-8789.3.patch, 
> YARN-8789.4.patch, YARN-8789.5.patch, YARN-8789.6.patch, YARN-8789.7.patch, 
> YARN-8789.7.patch, YARN-8789.8.patch, YARN-8789.9.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-25 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.8.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch, YARN-8789.3.patch, 
> YARN-8789.4.patch, YARN-8789.5.patch, YARN-8789.6.patch, YARN-8789.7.patch, 
> YARN-8789.7.patch, YARN-8789.8.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-24 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.7.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch, YARN-8789.3.patch, 
> YARN-8789.4.patch, YARN-8789.5.patch, YARN-8789.6.patch, YARN-8789.7.patch, 
> YARN-8789.7.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-24 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.7.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch, YARN-8789.3.patch, 
> YARN-8789.4.patch, YARN-8789.5.patch, YARN-8789.6.patch, YARN-8789.7.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx






[jira] [Updated] (YARN-8816) YARN Unit Tests Fail with Ubuntu VM

2018-09-24 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8816:
--
Description: 
{code}
Linux apache-dev 4.15.0-34-generic #37~16.04.1-Ubuntu SMP Tue Aug 28 10:44:06 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
{code}

{code}
[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 3.926 s <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands
[ERROR] testRemoveApplicationFromStateStoreCmdForZK(org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands)  Time elapsed: 2.668 s  <<< ERROR!
java.lang.ExceptionInInitializerError
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:316)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:304)
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1828)
    at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:710)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:660)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:571)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:286)
    at org.apache.hadoop.yarn.server.resourcemanager.MockRM.serviceInit(MockRM.java:1381)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.resourcemanager.MockRM.<init>(MockRM.java:164)
    at org.apache.hadoop.yarn.server.resourcemanager.MockRM.<init>(MockRM.java:143)
    at org.apache.hadoop.yarn.server.resourcemanager.MockRM.<init>(MockRM.java:139)
    at org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands.testRemoveApplicationFromStateStoreCmdForZK(TestRMStoreCommands.java:79)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
    at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.security.SecurityUtil$QualifiedHostResolver.<init>(SecurityUtil.java:593)
    at org.apache.hadoop.security.SecurityUtil.setTokenServiceUseIp(SecurityUtil.java:129)
    at org.apache.hadoop.security.SecurityUtil.setConfigurationInternal(SecurityUtil.java:102)
    at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:88)
    ... 38 more
{code}

  was:
{code}
[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 3.926 s <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands
[ERROR] testRemoveApplicationFromStateStoreCmdForZK(org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands)  Time elapsed: 2.668 s  <<< ERROR!

[jira] [Created] (YARN-8816) YARN Unit Tests Fail with Ubuntu VM

2018-09-24 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-8816:
-

 Summary: YARN Unit Tests Fail with Ubuntu VM
 Key: YARN-8816
 URL: https://issues.apache.org/jira/browse/YARN-8816
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Affects Versions: 3.2.0
Reporter: BELUGA BEHR


{code}
[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 3.926 s <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands
[ERROR] testRemoveApplicationFromStateStoreCmdForZK(org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands)  Time elapsed: 2.668 s  <<< ERROR!
java.lang.ExceptionInInitializerError
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:316)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:304)
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1828)
    at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:710)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:660)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:571)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:286)
    at org.apache.hadoop.yarn.server.resourcemanager.MockRM.serviceInit(MockRM.java:1381)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.resourcemanager.MockRM.<init>(MockRM.java:164)
    at org.apache.hadoop.yarn.server.resourcemanager.MockRM.<init>(MockRM.java:143)
    at org.apache.hadoop.yarn.server.resourcemanager.MockRM.<init>(MockRM.java:139)
    at org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands.testRemoveApplicationFromStateStoreCmdForZK(TestRMStoreCommands.java:79)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
    at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.security.SecurityUtil$QualifiedHostResolver.<init>(SecurityUtil.java:593)
    at org.apache.hadoop.security.SecurityUtil.setTokenServiceUseIp(SecurityUtil.java:129)
    at org.apache.hadoop.security.SecurityUtil.setConfigurationInternal(SecurityUtil.java:102)
    at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:88)
    ... 38 more
{code}






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.6.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch, YARN-8789.3.patch, 
> YARN-8789.4.patch, YARN-8789.5.patch, YARN-8789.6.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-21 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.5.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch, YARN-8789.3.patch, 
> YARN-8789.4.patch, YARN-8789.5.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-21 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.4.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch, YARN-8789.3.patch, 
> YARN-8789.4.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-20 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Description: 
I recently came across a scenario where an MR ApplicationMaster was failing 
with an OOM exception.  It had many thousands of Mappers and thousands of 
Reducers.  It was noted in the logging that the event-queue of 
{{AsyncDispatcher}} had a very large number of items in it and was seemingly 
never decreasing.

I started looking at the code and thought it could use some clean up, 
simplification, and the ability to specify a bounded queue so that any incoming 
events are throttled until they can be processed.  This will protect the 
ApplicationMaster from a flood of events.

Logging Message:
Size of event-queue is xxx

  was:
I recently came across a scenario where an MR ApplicationMaster was failing 
with an OOM exception.  It had many thousands of Mappers and thousands of 
Reducers.  It was noted in the logging that the event-queue of 
{{AsyncDispatcher}} had a very large number of items in it and was seemingly 
never decreasing.

I started looking at the code and thought it could use some clean up, 
simplification, and the ability to specify a bounded queue so that any incoming 
events are throttled until they can be processed.  This will protect the 
ApplicationMaster from a flood of events.


> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch, YARN-8789.3.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-20 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.3.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch, YARN-8789.3.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-18 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.2.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-18 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: (was: YARN-8789.2.patch)

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-18 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.2.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-18 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: (was: YARN-8789.2.patch)

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-18 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.2.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-18 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: (was: YARN-8789.2.patch)

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-18 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.2.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.2.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.






[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-18 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8789:
--
Attachment: YARN-8789.1.patch

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.






[jira] [Assigned] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-18 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned YARN-8789:
-

Assignee: BELUGA BEHR

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some clean up, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.






[jira] [Created] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-09-18 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-8789:
-

 Summary: Add BoundedQueue to AsyncDispatcher
 Key: YARN-8789
 URL: https://issues.apache.org/jira/browse/YARN-8789
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: applications
Affects Versions: 3.2.0
Reporter: BELUGA BEHR


I recently came across a scenario where an MR ApplicationMaster was failing 
with an OOM exception.  It had many thousands of Mappers and thousands of 
Reducers.  It was noted in the logging that the event queue of 
{{AsyncDispatcher}} had a very large number of items in it and was seemingly 
never decreasing.

I started looking at the code and thought it could use some clean up, 
simplification, and the ability to specify a bounded queue so that any incoming 
events are throttled until they can be processed.  This will protect the 
ApplicationMaster from a flood of events.
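
The throttling idea described above can be sketched roughly as follows. This is only an illustration of a bounded queue applying back-pressure, not the contents of any attached patch; the class name {{BoundedEventQueueSketch}} and its methods are hypothetical, and events are plain strings for brevity.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a dispatcher's event queue (not AsyncDispatcher itself).
final class BoundedEventQueueSketch {
  private final BlockingQueue<String> events;

  BoundedEventQueueSketch(int capacity) {
    // A fixed capacity is what turns a flood of events into back-pressure.
    this.events = new ArrayBlockingQueue<>(capacity);
  }

  // Producer side: waits up to timeoutMillis for free space.
  // Returns false if the queue stayed full (or the thread was interrupted).
  boolean publish(String event, long timeoutMillis) {
    try {
      return events.offer(event, timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  // Consumer side: returns the next event, or null if none is queued.
  String poll() {
    return events.poll();
  }

  int size() {
    return events.size();
  }
}
```

With a capacity of 2, a third {{publish}} call is held back until the consumer drains an event, so memory use is capped by the capacity rather than by the producers' speed.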






[jira] [Commented] (YARN-8169) Review RackResolver.java

2018-04-18 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442506#comment-16442506
 ] 

BELUGA BEHR commented on YARN-8169:
---

[~ajisakaa] Checkstyle corrected :)

> Review RackResolver.java
> 
>
> Key: YARN-8169
> URL: https://issues.apache.org/jira/browse/YARN-8169
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8169.1.patch, YARN.8169.2.patch
>
>
> # Use SLF4J
> # Fix some checkstyle warnings
> # Minor clean up






[jira] [Updated] (YARN-8169) Review RackResolver.java

2018-04-18 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8169:
--
Attachment: YARN.8169.2.patch

> Review RackResolver.java
> 
>
> Key: YARN-8169
> URL: https://issues.apache.org/jira/browse/YARN-8169
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8169.1.patch, YARN.8169.2.patch
>
>
> # Use SLF4J
> # Fix some checkstyle warnings
> # Minor clean up






[jira] [Commented] (YARN-8169) Review RackResolver.java

2018-04-18 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442403#comment-16442403
 ] 

BELUGA BEHR commented on YARN-8169:
---

[~leftnoteasy]

Parameterized messages are best for SLF4J:

# Avoids double-checking the 'debug enabled' flag
# Faster than run-time String concatenation
# Less code clutter
# Produces a smaller product binary (saves memory and execution cache)

https://www.slf4j.org/faq.html#logging_performance
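
To illustrate the first two points, here is a minimal sketch that uses a tiny stand-in logger rather than SLF4J itself, so that it runs without any dependency. {{TinyLogger}} and {{Costly}} are hypothetical names; the only assumption is that the stand-in checks the level inside {{debug}} the same way a parameterized logging call does.

```java
// Stand-in that mimics the parameterized style; this is NOT the SLF4J API.
final class ParamLoggingSketch {
  // Counts how often the "expensive" argument is converted to a String.
  static int toStringCalls = 0;

  static final class Costly {
    @Override
    public String toString() {
      toStringCalls++;
      return "costly-value";
    }
  }

  static final class TinyLogger {
    private final boolean debugEnabled;
    final StringBuilder out = new StringBuilder();

    TinyLogger(boolean debugEnabled) {
      this.debugEnabled = debugEnabled;
    }

    // Parameterized form: the level is checked once, inside the logger, and
    // the argument is only converted to a String when the level is enabled.
    void debug(String format, Object arg) {
      if (!debugEnabled) {
        return;
      }
      out.append(format.replace("{}", String.valueOf(arg))).append('\n');
    }

    // Pre-built form: the caller has already paid for the concatenation.
    void debug(String message) {
      if (!debugEnabled) {
        return;
      }
      out.append(message).append('\n');
    }
  }
}
```

With debug disabled, the parameterized call never touches {{Costly.toString()}}, while the concatenating call converts the argument before the logger can decide to discard the message.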

> Review RackResolver.java
> 
>
> Key: YARN-8169
> URL: https://issues.apache.org/jira/browse/YARN-8169
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8169.1.patch
>
>
> # Use SLF4J
> # Fix some checkstyle warnings
> # Minor clean up






[jira] [Created] (YARN-8170) Caching Node Rack Location

2018-04-17 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-8170:
-

 Summary: Caching Node Rack Location
 Key: YARN-8170
 URL: https://issues.apache.org/jira/browse/YARN-8170
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: applications, nodemanager
Affects Versions: 3.0.1
Reporter: BELUGA BEHR


When the MapReduce ApplicationMaster is trying to assign Mappers to Nodes, it 
loops over all of the queued Mappers and looks up the ideal rack location of 
each Mapper.

Under the covers, the rack awareness script is called once per Mapper. 
The results do get cached, but only for as long as the ApplicationMaster 
exists. That means that the script gets called N times (once per Mapper) each 
time a new ApplicationMaster is launched. If the rack awareness script is 
complex or requires an external lookup, this can be a slow process and can 
even DDoS the external lookup source.

There are at least a couple of ways to tackle this...
 # Add a DNSToSwitchMapping implementation that caches in an external cache 
(e.g., memcached) instead of memory so that all ApplicationMasters can share 
the same cache and would rarely call the rack awareness script.
 # Like the shuffle service, add a new NodeManager auxiliary service that 
exposes a rack lookup API so that the NodeManagers are responsible for caching 
the rack locations. This would also require a DNSToSwitchMapping 
implementation that interacts with this new service.
 # Other?

{code:java}
  String host = allocated.getNodeId().getHost();
  String rack = RackResolver.resolve(host).getNetworkLocation();
{code}
[https://github.com/apache/hadoop/blob/453d48bdfbb67ed3e66c33c4aef239c3d7bdd3bc/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java#L1435-L1464]
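
A rough sketch of the first option, under loud assumptions: the shared cache is modeled as an in-process {{Map}} standing in for an external store such as memcached, and the rack awareness script is modeled as a {{Function}}. {{SharedRackCacheSketch}} is a hypothetical name and does not implement the real DNSToSwitchMapping interface.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical resolver; one instance per ApplicationMaster, all sharing one cache.
final class SharedRackCacheSketch {
  // Stand-in for an external cache shared by every ApplicationMaster.
  private final Map<String, String> sharedCache;
  // Stand-in for the (potentially slow) rack awareness script.
  private final Function<String, String> rackScript;

  SharedRackCacheSketch(Map<String, String> sharedCache,
      Function<String, String> rackScript) {
    this.sharedCache = sharedCache;
    this.rackScript = rackScript;
  }

  // Consults the shared cache first and runs the script only on a miss.
  String resolve(String host) {
    return sharedCache.computeIfAbsent(host, rackScript);
  }
}
```

If two resolvers are built over the same {{ConcurrentHashMap}}, the second one finds the rack for a host the first already resolved and never invokes the script for it.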






[jira] [Updated] (YARN-8169) Review RackResolver.java

2018-04-17 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8169:
--
Attachment: YARN-8169.1.patch

> Review RackResolver.java
> 
>
> Key: YARN-8169
> URL: https://issues.apache.org/jira/browse/YARN-8169
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.1
>Reporter: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8169.1.patch
>
>
> # Use SLF4J
> # Fix some checkstyle warnings
> # Minor clean up






[jira] [Created] (YARN-8169) Review RackResolver.java

2018-04-17 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-8169:
-

 Summary: Review RackResolver.java
 Key: YARN-8169
 URL: https://issues.apache.org/jira/browse/YARN-8169
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Affects Versions: 3.0.1
Reporter: BELUGA BEHR
 Attachments: YARN-8169.1.patch

# Use SLF4J
# Fix some checkstyle warnings
# Minor clean up






[jira] [Commented] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-04-17 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441037#comment-16441037
 ] 

BELUGA BEHR commented on YARN-7962:
---

[~billie.rinaldi] [~wilfreds] Any additional concerns?  Please commit.

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>
>   private void processDelegationTokenRenewerEvent(
>       DelegationTokenRenewerEvent evt) {
>     serviceStateLock.readLock().lock();
>     try {
>       if (isServiceStarted) {
>         renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>       } else {
>         pendingEventQueue.add(evt);
>       }
>     } finally {
>       serviceStateLock.readLock().unlock();
>     }
>   }
>
>   @Override
>   protected void serviceStop() {
>     if (renewalTimer != null) {
>       renewalTimer.cancel();
>     }
>     appTokens.clear();
>     allTokens.clear();
>     this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2 rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>   at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>   at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>   at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> What I think is going on here is that the {{serviceStop}} method is not 
> setting the {{isServiceStarted}} flag to 'false'.
> Please update so that the {{serviceStop}} method grabs the 
> {{serviceStateLock}} and sets {{isServiceStarted}} to _false_, before 
> shutting down the {{renewerService}} thread pool, to avoid this condition.
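
The requested change can be sketched as below. This is a simplified illustration and not the code of any attached patch: {{StoppableRenewerSketch}} is a hypothetical class, events are plain {{Runnable}} instances, and the thread pool is a single-thread executor. The field names mirror the excerpt quoted above.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified model of the renewer's start/stop state handling.
final class StoppableRenewerSketch {
  private final ReentrantReadWriteLock serviceStateLock =
      new ReentrantReadWriteLock();
  private final ExecutorService renewerService =
      Executors.newSingleThreadExecutor();
  private final Queue<Runnable> pendingEventQueue =
      new ConcurrentLinkedQueue<>();
  private boolean isServiceStarted = true;

  void processEvent(Runnable evt) {
    serviceStateLock.readLock().lock();
    try {
      if (isServiceStarted) {
        renewerService.execute(evt);
      } else {
        // After stop(), events are parked instead of hitting a dead pool.
        pendingEventQueue.add(evt);
      }
    } finally {
      serviceStateLock.readLock().unlock();
    }
  }

  void stop() {
    // Flip the flag under the write lock first, so no reader can still
    // observe 'started' once the pool begins shutting down.
    serviceStateLock.writeLock().lock();
    try {
      isServiceStarted = false;
    } finally {
      serviceStateLock.writeLock().unlock();
    }
    renewerService.shutdown();
  }

  int pendingCount() {
    return pendingEventQueue.size();
  }
}
```

Because the write lock cannot be acquired while any reader is inside {{processEvent}}, every {{execute}} call that saw the flag as true has returned before {{shutdown}} runs, which is the ordering that avoids the {{RejectedExecutionException}} in this sketch.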






[jira] [Updated] (YARN-8146) Remove LinkedList From resourcemanager.reservation.planning Package

2018-04-12 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8146:
--
Priority: Trivial  (was: Minor)

> Remove LinkedList From resourcemanager.reservation.planning Package
> ---
>
> Key: YARN-8146
> URL: https://issues.apache.org/jira/browse/YARN-8146
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: reservation system
>Affects Versions: 3.0.1
>Reporter: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-8146.1.patch
>
>
> Remove {{LinkedList}} instances in favor of {{ArrayList}}.  {{ArrayList}} is 
> generally more memory efficient, causes less memory fragmentation, and, thanks 
> to memory locality, is faster to iterate.






[jira] [Updated] (YARN-8146) Remove LinkedList From resourcemanager.reservation.planning Package

2018-04-11 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-8146:
--
Attachment: YARN-8146.1.patch

> Remove LinkedList From resourcemanager.reservation.planning Package
> ---
>
> Key: YARN-8146
> URL: https://issues.apache.org/jira/browse/YARN-8146
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: reservation system
>Affects Versions: 3.0.1
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-8146.1.patch
>
>
> Remove {{LinkedList}} instances in favor of {{ArrayList}}.  {{ArrayList}} is 
> generally more memory efficient, causes less memory fragmentation, and, thanks 
> to memory locality, is faster to iterate.






[jira] [Created] (YARN-8146) Remove LinkedList From resourcemanager.reservation.planning Package

2018-04-11 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-8146:
-

 Summary: Remove LinkedList From 
resourcemanager.reservation.planning Package
 Key: YARN-8146
 URL: https://issues.apache.org/jira/browse/YARN-8146
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: reservation system
Affects Versions: 3.0.1
Reporter: BELUGA BEHR


Remove {{LinkedList}} instances in favor of {{ArrayList}}.  {{ArrayList}} is 
generally more memory efficient, causes less memory fragmentation, and, thanks 
to memory locality, is faster to iterate.
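
As a hedged illustration of the kind of change meant here (not taken from the attached patch), the swap is usually a one-line edit wherever the variable is declared against the {{List}} interface. {{ArrayListSwapSketch}} and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example of replacing a LinkedList with an ArrayList.
final class ArrayListSwapSketch {
  static List<Long> buildStages(int count) {
    // Before: List<Long> stages = new LinkedList<>();
    // After: one contiguous backing array, pre-sized to avoid regrowth.
    List<Long> stages = new ArrayList<>(count);
    for (long i = 0; i < count; i++) {
      stages.add(i);
    }
    return stages;
  }

  // Iteration is unchanged because callers only rely on the List interface.
  static long sum(List<Long> values) {
    long total = 0;
    for (long v : values) {
      total += v;
    }
    return total;
  }
}
```

One caveat: {{ArrayList}} does not implement {{Deque}}, so call sites that use {{addFirst}} or {{removeFirst}} need a different replacement.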






[jira] [Commented] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-04-04 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425525#comment-16425525
 ] 

BELUGA BEHR commented on YARN-7962:
---

[~wilfreds] {{try...finally}} is a best practice.

bq. It is recommended practice to always immediately follow a call to lock with 
a try block, most typically in a before/after construction

https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/ReentrantLock.html

Please consider patch version 3 for inclusion into the project.
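
For reference, a minimal sketch of the lock-then-try idiom from the quoted documentation. {{LockTryFinallySketch}} is a hypothetical class that exists only to show that the lock is released even when the guarded code throws.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical counter guarded by a ReentrantLock.
final class LockTryFinallySketch {
  private final ReentrantLock lock = new ReentrantLock();
  private int value;

  int incrementOrFail(boolean fail) {
    lock.lock();
    try {
      if (fail) {
        throw new IllegalStateException("simulated failure");
      }
      return ++value;
    } finally {
      // Runs on both the normal and the exceptional path.
      lock.unlock();
    }
  }

  boolean isLocked() {
    return lock.isLocked();
  }
}
```

Without the {{finally}} block, the simulated failure would leave the lock held and block every later caller.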

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>
>   private void processDelegationTokenRenewerEvent(
>       DelegationTokenRenewerEvent evt) {
>     serviceStateLock.readLock().lock();
>     try {
>       if (isServiceStarted) {
>         renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>       } else {
>         pendingEventQueue.add(evt);
>       }
>     } finally {
>       serviceStateLock.readLock().unlock();
>     }
>   }
>
>   @Override
>   protected void serviceStop() {
>     if (renewalTimer != null) {
>       renewalTimer.cancel();
>     }
>     appTokens.clear();
>     allTokens.clear();
>     this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2 rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>   at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>   at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>   at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> What I think is going on here is that the {{serviceStop}} method is not 
> setting the {{isServiceStarted}} flag to 'false'.
> Please update so that the {{serviceStop}} method grabs the 
> {{serviceStateLock}} and sets {{isServiceStarted}} to _false_, before 
> shutting down the {{renewerService}} thread pool, to avoid this condition.






[jira] [Updated] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-04-03 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-7962:
--
Attachment: YARN-7962.3.patch

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>
>   private void processDelegationTokenRenewerEvent(
>       DelegationTokenRenewerEvent evt) {
>     serviceStateLock.readLock().lock();
>     try {
>       if (isServiceStarted) {
>         renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>       } else {
>         pendingEventQueue.add(evt);
>       }
>     } finally {
>       serviceStateLock.readLock().unlock();
>     }
>   }
>
>   @Override
>   protected void serviceStop() {
>     if (renewalTimer != null) {
>       renewalTimer.cancel();
>     }
>     appTokens.clear();
>     allTokens.clear();
>     this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2 rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>   at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>   at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>   at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> What I think is going on here is that the {{serviceStop}} method is not 
> setting the {{isServiceStarted}} flag to 'false'.
> Please update so that the {{serviceStop}} method grabs the 
> {{serviceStateLock}} and sets {{isServiceStarted}} to _false_, before 
> shutting down the {{renewerService}} thread pool, to avoid this condition.






[jira] [Updated] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-04-03 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-7962:
--
Attachment: (was: YARN-7962.3.patch)

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>
>   private void processDelegationTokenRenewerEvent(
>       DelegationTokenRenewerEvent evt) {
>     serviceStateLock.readLock().lock();
>     try {
>       if (isServiceStarted) {
>         renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>       } else {
>         pendingEventQueue.add(evt);
>       }
>     } finally {
>       serviceStateLock.readLock().unlock();
>     }
>   }
>
>   @Override
>   protected void serviceStop() {
>     if (renewalTimer != null) {
>       renewalTimer.cancel();
>     }
>     appTokens.clear();
>     allTokens.clear();
>     this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2 rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>   at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>   at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>   at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> What I think is going on here is that the {{serviceStop}} method is not 
> setting the {{isServiceStarted}} flag to 'false'.
> Please update so that the {{serviceStop}} method grabs the 
> {{serviceStateLock}} and sets {{isServiceStarted}} to _false_, before 
> shutting down the {{renewerService}} thread pool, to avoid this condition.
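
A minimal sketch of the change requested in the description, not the actual patch: the stop method takes the write side of the lock, clears the flag, and only then shuts the pool down, so a later event is queued instead of being handed to a terminated executor. The class name {{RenewerStopSketch}}, the plain {{Runnable}} events, and the {{serviceStart}}, {{processEvent}}, and {{pendingCount}} methods are assumptions made for illustration; only the lock, flag, queue, and pool names mirror the quoted snippet.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for DelegationTokenRenewer; only the locking pattern matters.
class RenewerStopSketch {
  private final ReentrantReadWriteLock serviceStateLock = new ReentrantReadWriteLock();
  private final Queue<Runnable> pendingEventQueue = new ConcurrentLinkedQueue<>();
  private final ExecutorService renewerService = Executors.newFixedThreadPool(2);
  private boolean isServiceStarted = false; // guarded by serviceStateLock

  void serviceStart() {
    serviceStateLock.writeLock().lock();
    try {
      isServiceStarted = true;
    } finally {
      serviceStateLock.writeLock().unlock();
    }
  }

  void processEvent(Runnable evt) {
    serviceStateLock.readLock().lock();
    try {
      if (isServiceStarted) {
        renewerService.execute(evt);
      } else {
        pendingEventQueue.add(evt); // never touches the pool after stop
      }
    } finally {
      serviceStateLock.readLock().unlock();
    }
  }

  void serviceStop() {
    // The write lock waits for in-flight processEvent calls holding the read lock.
    serviceStateLock.writeLock().lock();
    try {
      isServiceStarted = false;  // flip the flag first ...
      renewerService.shutdown(); // ... then shut the pool down
    } finally {
      serviceStateLock.writeLock().unlock();
    }
  }

  int pendingCount() {
    return pendingEventQueue.size();
  }

  public static void main(String[] args) {
    RenewerStopSketch r = new RenewerStopSketch();
    r.serviceStart();
    r.processEvent(() -> { });
    r.serviceStop();
    r.processEvent(() -> { }); // queued instead of rejected
    System.out.println("pending after stop: " + r.pendingCount());
  }
}
```

Shutting the pool down while still holding the write lock is one option; releasing the lock first should work as well, because no reader can observe the flag as true once it has been cleared under the write lock.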






[jira] [Updated] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-04-03 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-7962:
--
Attachment: YARN-7962.4.patch

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch
>
>






[jira] [Commented] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-03-29 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419830#comment-16419830
 ] 

BELUGA BEHR commented on YARN-7962:
---

[~billie.rinaldi] Thank you for the assist!  I have been on vacation for a 
little while and have been unable to work on this.

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch
>
>






[jira] [Updated] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-03-17 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-7962:
--
Attachment: YARN-7962.2.patch

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch
>
>






[jira] [Commented] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-03-16 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402242#comment-16402242
 ] 

BELUGA BEHR commented on YARN-7962:
---

[~wilfreds] Can you please share your thoughts on how to unit test a race 
condition of this sort?  How could pauses be introduced into the locked code?

Also, there technically isn't a need to lock on the initialization.  It's just 
a safety and good-practice measure.  There will be almost no overhead since we 
will only initialize once (or maybe a couple of times), so it doesn't hurt 
to be safe.
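
One possible way to exercise a race like this, offered only as a sketch and not as the test that went into any patch: release several submitter threads with a {{CountDownLatch}} at the moment stop is called, and count how many submissions are rejected. This is a stress test, so it can only make a regression likely to show up, not guarantee it. {{StopRaceStressSketch}}, its nested {{Service}} class, and {{runOnce}} are hypothetical names; the nested class is a miniature of the read/write-lock pattern, not the real {{DelegationTokenRenewer}}.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class StopRaceStressSketch {

  // Hypothetical miniature of a service that guards its pool with a flag and a lock.
  static class Service {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final ExecutorService pool = Executors.newFixedThreadPool(2);
    final AtomicInteger queued = new AtomicInteger();
    boolean started = true; // guarded by lock

    void process(Runnable evt) {
      lock.readLock().lock();
      try {
        if (started) {
          pool.execute(evt);
        } else {
          queued.incrementAndGet(); // stands in for pendingEventQueue.add(evt)
        }
      } finally {
        lock.readLock().unlock();
      }
    }

    void stop() {
      lock.writeLock().lock();
      try {
        started = false;
        pool.shutdown();
      } finally {
        lock.writeLock().unlock();
      }
    }
  }

  // Races submitter threads against stop() and returns the number of rejected submissions.
  static int runOnce(int submitters, int eventsEach) throws InterruptedException {
    Service svc = new Service();
    CountDownLatch go = new CountDownLatch(1);
    CountDownLatch done = new CountDownLatch(submitters);
    AtomicInteger rejected = new AtomicInteger();
    for (int i = 0; i < submitters; i++) {
      new Thread(() -> {
        try {
          go.await(); // all submitters start together
          for (int j = 0; j < eventsEach; j++) {
            try {
              svc.process(() -> { });
            } catch (RejectedExecutionException e) {
              rejected.incrementAndGet();
            }
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          done.countDown();
        }
      }).start();
    }
    go.countDown(); // release the submitters ...
    svc.stop();     // ... and stop concurrently
    done.await();
    return rejected.get();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("rejected: " + runOnce(4, 200));
  }
}
```

With the flag cleared under the write lock, no submission should ever be rejected. Removing the {{started = false}} line from {{stop()}} is a quick way to check that the test can actually fail.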

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch
>
>






[jira] [Commented] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-02-26 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377552#comment-16377552
 ] 

BELUGA BEHR commented on YARN-7962:
---

Unit test failures appear to be unrelated.

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch
>
>






[jira] [Commented] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-02-23 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374999#comment-16374999
 ] 

BELUGA BEHR commented on YARN-7962:
---

I also tightened things up a little when it comes to blocking, making start 
and stop symmetrical.

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch
>
>






[jira] [Updated] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-02-23 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-7962:
--
Attachment: YARN-7962.1.patch

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch
>
>






[jira] [Commented] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-02-22 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373587#comment-16373587
 ] 

BELUGA BEHR commented on YARN-7962:
---

{quote}pool size = 0, active threads = 0, queued tasks = 0{quote}

My guess is that the pool size is 0 because the pool was shut down.
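
This guess can be checked in isolation with a small, self-contained sketch. It assumes nothing about the YARN code: {{ShutdownPoolDemo}} and {{rejectedAfterShutdown}} are made-up names, and the executor is built with the default rejection policy ({{AbortPolicy}}).

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ShutdownPoolDemo {

  // Returns true if a task submitted after shutdown is rejected.
  static boolean rejectedAfterShutdown() throws InterruptedException {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    pool.execute(() -> { });                    // run one task so a worker exists
    pool.shutdown();                            // no new tasks accepted from here on
    pool.awaitTermination(5, TimeUnit.SECONDS); // wait for the worker to exit
    try {
      pool.execute(() -> { });
      return false;
    } catch (RejectedExecutionException e) {
      // The message embeds the pool's toString(), including its state and pool size.
      System.out.println(e.getMessage());
      return true;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("rejected: " + rejectedAfterShutdown());
  }
}
```

Once the pool has terminated, {{getPoolSize()}} reports 0, which is consistent with the quoted line.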

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>   private void processDelegationTokenRenewerEvent(
>   DelegationTokenRenewerEvent evt) {
> serviceStateLock.readLock().lock();
> try {
>   if (isServiceStarted) {
> renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>   } else {
> pendingEventQueue.add(evt);
>   }
> } finally {
>   serviceStateLock.readLock().unlock();
> }
>   }
>   @Override
>   protected void serviceStop() {
> if (renewalTimer != null) {
>   renewalTimer.cancel();
> }
> appTokens.clear();
> allTokens.clear();
> this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2 rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>   at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>   at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>   at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>   at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> I think the problem is that the {{serviceStop}} method never sets the
> {{isServiceStarted}} flag to _false_. An event that arrives after the pool
> has been shut down therefore still takes the {{renewerService.execute(...)}}
> branch and is rejected by the terminated executor.
> Please update {{serviceStop}} so that it acquires the {{serviceStateLock}}
> and sets {{isServiceStarted}} to _false_ before shutting down the
> {{renewerService}} thread pool.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-02-22 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created YARN-7962:
-

 Summary: Race Condition When Stopping DelegationTokenRenewer
 Key: YARN-7962
 URL: https://issues.apache.org/jira/browse/YARN-7962
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.0.0
Reporter: BELUGA BEHR


[https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
{code:java}
  private ThreadPoolExecutor renewerService;

  private void processDelegationTokenRenewerEvent(
      DelegationTokenRenewerEvent evt) {
    serviceStateLock.readLock().lock();
    try {
      if (isServiceStarted) {
        renewerService.execute(new DelegationTokenRenewerRunnable(evt));
      } else {
        pendingEventQueue.add(evt);
      }
    } finally {
      serviceStateLock.readLock().unlock();
    }
  }

  @Override
  protected void serviceStop() {
    if (renewalTimer != null) {
      renewalTimer.cancel();
    }
    appTokens.clear();
    allTokens.clear();
    this.renewerService.shutdown();
{code}
{code:java}
2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2 rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
  at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
  at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
  at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
  at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
  at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
  at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
  at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
  at java.lang.Thread.run(Thread.java:745)
{code}
I think the problem is that the {{serviceStop}} method never sets the 
{{isServiceStarted}} flag to _false_. An event that arrives after the pool has 
been shut down therefore still takes the {{renewerService.execute(...)}} branch 
and is rejected by the terminated executor.

Please update {{serviceStop}} so that it acquires the {{serviceStateLock}} and 
sets {{isServiceStarted}} to _false_ before shutting down the 
{{renewerService}} thread pool.
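
The requested change could look roughly like the sketch below. This is a simplified, hypothetical stand-in ({{RenewerSketch}}, {{processEvent}} and {{pendingCount}} are invented names), not the actual {{DelegationTokenRenewer}} and not necessarily what a patch would do. It assumes {{serviceStateLock}} is a {{ReadWriteLock}}, which the {{readLock()}} calls above suggest.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified stand-in for DelegationTokenRenewer.
class RenewerSketch {
  private final ReadWriteLock serviceStateLock = new ReentrantReadWriteLock();
  private final ExecutorService renewerService = Executors.newFixedThreadPool(2);
  private final Queue<Runnable> pendingEventQueue = new ConcurrentLinkedQueue<>();
  private boolean isServiceStarted = true; // guarded by serviceStateLock

  // Mirrors processDelegationTokenRenewerEvent: submit only while started.
  void processEvent(Runnable evt) {
    serviceStateLock.readLock().lock();
    try {
      if (isServiceStarted) {
        renewerService.execute(evt);
      } else {
        pendingEventQueue.add(evt);
      }
    } finally {
      serviceStateLock.readLock().unlock();
    }
  }

  // Flip the flag under the write lock BEFORE shutting the pool down, so no
  // reader can still observe isServiceStarted == true once shutdown() runs.
  void serviceStop() {
    serviceStateLock.writeLock().lock();
    try {
      isServiceStarted = false;
    } finally {
      serviceStateLock.writeLock().unlock();
    }
    renewerService.shutdown();
  }

  int pendingCount() {
    return pendingEventQueue.size();
  }
}
```

With this ordering, an event that arrives after {{serviceStop}} is parked in the pending queue instead of being handed to a terminated executor. Whether late events should be queued or simply dropped is a separate design decision.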






[jira] [Commented] (YARN-7688) Miscellaneous Improvements To ProcfsBasedProcessTree

2018-01-02 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308841#comment-16308841
 ] 

BELUGA BEHR commented on YARN-7688:
---

[~miklos.szeg...@cloudera.com] Kindly consider this patch for inclusion in the project. :)

> Miscellaneous Improvements To ProcfsBasedProcessTree
> 
>
> Key: YARN-7688
> URL: https://issues.apache.org/jira/browse/YARN-7688
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7688.1.patch, YARN-7688.2.patch, YARN-7688.3.patch, 
> YARN-7688.4.patch
>
>
> * Use ArrayDeque for performance instead of LinkedList
> * Use more Apache Commons routines to replace existing implementations
> * Remove superfluous code guards around DEBUG statements
> * Remove superfluous annotations in the tests
> * Other small improvements
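
As an illustration of the first bullet, the sketch below walks a process tree breadth-first using an {{ArrayDeque}} as the work queue. It is not the code from the patch: {{ProcessTreeWalkSketch}}, {{descendants}} and the parent-to-children map are hypothetical. The point is only that {{ArrayDeque}} is a drop-in replacement wherever a {{LinkedList}} is used purely as a queue, and it avoids allocating a node per element.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of ProcfsBasedProcessTree.
class ProcessTreeWalkSketch {

  // Returns all descendants of rootPid in breadth-first order.
  // 'children' maps a parent pid to its direct child pids (assumed input).
  static List<Integer> descendants(int rootPid,
      Map<Integer, List<Integer>> children) {
    List<Integer> result = new ArrayList<>();
    Deque<Integer> queue = new ArrayDeque<>(); // array-backed FIFO queue
    queue.add(rootPid);
    while (!queue.isEmpty()) {
      int pid = queue.remove();
      for (int child : children.getOrDefault(pid, Collections.emptyList())) {
        result.add(child);
        queue.add(child);
      }
    }
    return result;
  }
}
```

Note that {{ArrayDeque}} does not accept {{null}} elements, so a swap like this is only safe when the queue never holds {{null}}.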



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7687) ContainerLogAppender Improvements

2017-12-30 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-7687:
--
Attachment: YARN-7687.3.patch

> ContainerLogAppender Improvements
> -
>
> Key: YARN-7687
> URL: https://issues.apache.org/jira/browse/YARN-7687
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-7687.1.patch, YARN-7687.2.patch, YARN-7687.3.patch
>
>
> * Use Array-backed collection instead of LinkedList
> * Ignore calls to {{close()}} after the initial call
> * Clear the queue after {{close}} is called to let garbage collection do its 
> magic on the items inside of it
> * Fix int-to-long conversion issue (overflow)
> * Remove superfluous white space
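
A rough sketch of what the bullets above describe, under stated assumptions: {{TailBufferSketch}} and its members are hypothetical and do not reproduce {{ContainerLogAppender}}. It shows an array-backed queue, a {{close()}} that ignores repeated calls and clears the queue, and a size conversion that widens to {{long}} before multiplying. Which conversion the patch actually fixes is not stated here, so {{toBytes}} is only an example of the overflow pattern.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, single-threaded sketch; not the real ContainerLogAppender.
class TailBufferSketch implements AutoCloseable {
  private final Deque<String> tail = new ArrayDeque<>(); // array-backed queue
  private boolean closed;
  int flushed; // number of events written out on close

  void append(String event) {
    if (!closed) {
      tail.add(event);
    }
  }

  int pendingCount() {
    return tail.size();
  }

  @Override
  public void close() {
    if (closed) {
      return; // ignore calls after the initial one
    }
    closed = true;
    flushed += tail.size(); // stand-in for writing the buffered events out
    tail.clear();           // drop references so the events can be collected
  }

  // Widen to long BEFORE multiplying; 'kilobytes * 1024' alone would be
  // evaluated in int arithmetic and can overflow before the widening.
  static long toBytes(int kilobytes) {
    return kilobytes * 1024L;
  }
}
```

For example, {{4194304 * 1024}} evaluated as {{int}} wraps around to 0, whereas {{4194304 * 1024L}} yields 4294967296.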






[jira] [Updated] (YARN-7687) ContainerLogAppender Improvements

2017-12-30 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-7687:
--
Attachment: YARN-7687.2.patch

> ContainerLogAppender Improvements
> -
>
> Key: YARN-7687
> URL: https://issues.apache.org/jira/browse/YARN-7687
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-7687.1.patch, YARN-7687.2.patch
>
>
> * Use Array-backed collection instead of LinkedList
> * Ignore calls to {{close()}} after the initial call
> * Clear the queue after {{close}} is called to let garbage collection do its 
> magic on the items inside of it
> * Fix int-to-long conversion issue (overflow)
> * Remove superfluous white space






[jira] [Updated] (YARN-7687) ContainerLogAppender Improvements

2017-12-30 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated YARN-7687:
--
Attachment: (was: YARN-7687.2.patch)

> ContainerLogAppender Improvements
> -
>
> Key: YARN-7687
> URL: https://issues.apache.org/jira/browse/YARN-7687
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Trivial
> Attachments: YARN-7687.1.patch, YARN-7687.2.patch
>
>
> * Use Array-backed collection instead of LinkedList
> * Ignore calls to {{close()}} after the initial call
> * Clear the queue after {{close}} is called to let garbage collection do its 
> magic on the items inside of it
> * Fix int-to-long conversion issue (overflow)
> * Remove superfluous white space





