[jira] [Commented] (YARN-8776) Container Executor change to create stdin/stdout pipeline
    [ https://issues.apache.org/jira/browse/YARN-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650894#comment-16650894 ]

Zian Chen commented on YARN-8776:
---------------------------------

Still working on refining the patch. Will update the initial one later today or tomorrow.

> Container Executor change to create stdin/stdout pipeline
> ---------------------------------------------------------
>
>                 Key: YARN-8776
>                 URL: https://issues.apache.org/jira/browse/YARN-8776
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zian Chen
>            Assignee: Zian Chen
>            Priority: Major
>              Labels: Docker
>
> The pipeline is built to connect the stdin/stdout channel from the
> WebSocket servlet through container-executor to the docker executor. When
> the WebSocket servlet is started, it needs to invoke the container-executor
> “dockerExec” method (which will be implemented) to create a new docker
> executor and use the “docker exec -it $ContainerId” command, which executes
> an interactive bash shell in the container.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
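The stdin/stdout relay described above can be sketched in a few lines. This is a minimal, self-contained illustration of the pipeline idea only: `cat` stands in for the eventual `container-executor`/`docker exec -it $ContainerId` child process, since bytes written to the child's stdin coming back on its stdout is exactly the relay the WebSocket servlet would perform.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

public class ShellPipelineSketch {

  // Write one line to the child's stdin and read one line back from its
  // stdout -- the round trip the servlet would perform per WebSocket frame.
  // "cat" is a stand-in for the interactive docker shell process.
  public static String roundTrip(String line) throws IOException {
    ProcessBuilder pb = new ProcessBuilder("cat");
    pb.redirectErrorStream(true);
    Process child = pb.start();
    try (BufferedWriter toChild = new BufferedWriter(
             new OutputStreamWriter(child.getOutputStream(), StandardCharsets.UTF_8));
         BufferedReader fromChild = new BufferedReader(
             new InputStreamReader(child.getInputStream(), StandardCharsets.UTF_8))) {
      toChild.write(line);
      toChild.newLine();
      toChild.flush();
      return fromChild.readLine();
    } finally {
      child.destroy();
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(roundTrip("echo through the pipeline"));
  }
}
```

In the real feature the child process would be long-lived and the servlet would pump both directions concurrently; this sketch only shows a single synchronous round trip.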
[jira] [Assigned] (YARN-8778) Add Command Line interface to invoke interactive docker shell
    [ https://issues.apache.org/jira/browse/YARN-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zian Chen reassigned YARN-8778:
-------------------------------

    Assignee: Eric Yang  (was: Zian Chen)

> Add Command Line interface to invoke interactive docker shell
> -------------------------------------------------------------
>
>                 Key: YARN-8778
>                 URL: https://issues.apache.org/jira/browse/YARN-8778
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zian Chen
>            Assignee: Eric Yang
>            Priority: Major
>              Labels: Docker
>
> The CLI will be the mandatory interface we provide for users of the
> interactive docker shell feature. We will need to create a new class,
> “InteractiveDockerShellCLI”, to read the command line into the servlet and
> pass it all the way down to the docker executor.
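A rough shape for such a CLI entry point might look like the following. The class name comes from the issue description, but the argument handling and the relayed docker command are assumptions for illustration, not the actual patch.

```java
import java.util.Arrays;
import java.util.List;

public class InteractiveDockerShellCLI {

  // Translate a container id from the command line into the docker command
  // that the lower layers would ultimately run for the interactive shell.
  // The exact argv ("bash", "-it") is an assumption here.
  public static List<String> toDockerCommand(String containerId) {
    return Arrays.asList("docker", "exec", "-it", containerId, "bash");
  }

  public static void main(String[] args) {
    // Sample container id, purely for illustration.
    String containerId = args.length > 0
        ? args[0] : "container_1536476159258_0004_02_000001";
    System.out.println(String.join(" ", toDockerCommand(containerId)));
  }
}
```

In the real feature this command would not be executed directly by the CLI; it would be passed through the WebSocket servlet and container-executor as described in the sub-tasks above.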
[jira] [Commented] (YARN-8778) Add Command Line interface to invoke interactive docker shell
    [ https://issues.apache.org/jira/browse/YARN-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650892#comment-16650892 ]

Zian Chen commented on YARN-8778:
---------------------------------

Hi [~eyang], sure, please take this one. I'll assign it to you.

> Add Command Line interface to invoke interactive docker shell
> -------------------------------------------------------------
>
>                 Key: YARN-8778
>                 URL: https://issues.apache.org/jira/browse/YARN-8778
[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command
    [ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642220#comment-16642220 ]

Zian Chen commented on YARN-8777:
---------------------------------

+1 for patch 7.

> Container Executor C binary change to execute interactive docker command
> ------------------------------------------------------------------------
>
>                 Key: YARN-8777
>                 URL: https://issues.apache.org/jira/browse/YARN-8777
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zian Chen
>            Assignee: Eric Yang
>            Priority: Major
>              Labels: Docker
>         Attachments: YARN-8777.001.patch, YARN-8777.002.patch,
> YARN-8777.003.patch, YARN-8777.004.patch, YARN-8777.005.patch,
> YARN-8777.006.patch, YARN-8777.007.patch
>
> Since Container Executor provides container execution through the native
> container-executor binary, we also need to change it to accept a new
> “dockerExec” method that invokes the corresponding native function to run
> the docker exec command against the running container.
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
    [ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642219#comment-16642219 ]

Zian Chen commented on YARN-8763:
---------------------------------

[~eyang], thanks for the +1. Could you help commit this patch? Thanks.

> Add WebSocket logic to the Node Manager web server to establish servlet
> -----------------------------------------------------------------------
>
>                 Key: YARN-8763
>                 URL: https://issues.apache.org/jira/browse/YARN-8763
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zian Chen
>            Assignee: Zian Chen
>            Priority: Major
>              Labels: Docker
>         Attachments: YARN-8763-001.patch, YARN-8763.002.patch,
> YARN-8763.003.patch, YARN-8763.004.patch, YARN-8763.005.patch
>
> The reason we want to use a WebSocket servlet to serve the backend, instead
> of establishing the connection over plain HTTP, is that WebSocket solves a
> few issues with HTTP that matter for our scenario:
> # In HTTP, the request is always initiated by the client and the response
> is processed by the server, making HTTP a unidirectional protocol.
> WebSocket is a bi-directional protocol: either the client or the server can
> send a message to the other party.
> # Full-duplex communication: the client and server can talk to each other
> independently at the same time.
> # Single TCP connection: after the initial HTTP connection upgrade, the
> client and server communicate over that same TCP connection throughout the
> lifecycle of the WebSocket connection.
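To make the single-upgraded-connection point concrete, a client would address the NodeManager servlet with a `ws://` URI and upgrade once. The path `/container/{id}/ws/shell` and port 8042 below are illustrative guesses, not the endpoint the patch actually registers.

```java
import java.net.URI;

public class NodeManagerWsEndpointSketch {

  // Build the WebSocket URI a client would upgrade to. Host, port, and path
  // layout are assumptions for illustration.
  public static URI shellEndpoint(String nmHost, int nmPort, String containerId) {
    return URI.create(
        "ws://" + nmHost + ":" + nmPort + "/container/" + containerId + "/ws/shell");
  }

  public static void main(String[] args) {
    URI uri = shellEndpoint("nm-host.example.com", 8042,
        "container_1536476159258_0004_02_000001");
    // On Java 11+ a client could then upgrade this one TCP connection and use
    // it full-duplex for the lifetime of the session:
    //   HttpClient.newHttpClient().newWebSocketBuilder()
    //       .buildAsync(uri, listener);
    System.out.println(uri);
  }
}
```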
[jira] [Updated] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
    [ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zian Chen updated YARN-8763:
----------------------------
    Attachment: YARN-8763.005.patch

> Add WebSocket logic to the Node Manager web server to establish servlet
> -----------------------------------------------------------------------
>
>                 Key: YARN-8763
>                 URL: https://issues.apache.org/jira/browse/YARN-8763
[jira] [Comment Edited] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
    [ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640482#comment-16640482 ]

Zian Chen edited comment on YARN-8763 at 10/6/18 12:01 AM:
-----------------------------------------------------------

[~eyang], sorry for getting back to you late. I'm curious why TestContainerManager was triggered too. Anyway, I tried patch 004 locally and the TestContainerManager UTs all passed, and I updated patch 005 with your suggestions. Let's see how patch 005 goes.

was (Author: zian chen):
[~eyang], sorry for getting back to you late. I'm curious why TestContainerManager was triggered too. Anyway, I tried patch 003 locally and the TestContainerManager UTs all passed, and I updated patch 004 with your suggestions. Let's see how patch 004 goes.

> Add WebSocket logic to the Node Manager web server to establish servlet
> -----------------------------------------------------------------------
>
>                 Key: YARN-8763
>                 URL: https://issues.apache.org/jira/browse/YARN-8763
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
    [ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640482#comment-16640482 ]

Zian Chen commented on YARN-8763:
---------------------------------

[~eyang], sorry for getting back to you late. I'm curious why TestContainerManager was triggered too. Anyway, I tried patch 003 locally and the TestContainerManager UTs all passed, and I updated patch 004 with your suggestions. Let's see how patch 004 goes.

> Add WebSocket logic to the Node Manager web server to establish servlet
> -----------------------------------------------------------------------
>
>                 Key: YARN-8763
>                 URL: https://issues.apache.org/jira/browse/YARN-8763
[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command
    [ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638734#comment-16638734 ]

Zian Chen commented on YARN-8777:
---------------------------------

Hi [~eyang], thanks for patch 006. It seems we still have whitespace errors in the latest Jenkins build; could you help fix them? Overall the patch looks good to me.

> Container Executor C binary change to execute interactive docker command
> ------------------------------------------------------------------------
>
>                 Key: YARN-8777
>                 URL: https://issues.apache.org/jira/browse/YARN-8777
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
    [ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638730#comment-16638730 ]

Zian Chen commented on YARN-8763:
---------------------------------

[~eyang], I just uploaded patch 004; please help review it. Thanks.

> Add WebSocket logic to the Node Manager web server to establish servlet
> -----------------------------------------------------------------------
>
>                 Key: YARN-8763
>                 URL: https://issues.apache.org/jira/browse/YARN-8763
[jira] [Updated] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
    [ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zian Chen updated YARN-8763:
----------------------------
    Attachment: YARN-8763.004.patch

> Add WebSocket logic to the Node Manager web server to establish servlet
> -----------------------------------------------------------------------
>
>                 Key: YARN-8763
>                 URL: https://issues.apache.org/jira/browse/YARN-8763
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
    [ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638658#comment-16638658 ]

Zian Chen commented on YARN-8763:
---------------------------------

Thanks for the comments, Eric. I'll update the patch later today.

> Add WebSocket logic to the Node Manager web server to establish servlet
> -----------------------------------------------------------------------
>
>                 Key: YARN-8763
>                 URL: https://issues.apache.org/jira/browse/YARN-8763
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
    [ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636543#comment-16636543 ]

Zian Chen commented on YARN-8763:
---------------------------------

Hi [~eyang], could you help review patch 003? It addresses the comments we discussed above. Thanks.

> Add WebSocket logic to the Node Manager web server to establish servlet
> -----------------------------------------------------------------------
>
>                 Key: YARN-8763
>                 URL: https://issues.apache.org/jira/browse/YARN-8763
[jira] [Updated] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
    [ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zian Chen updated YARN-8763:
----------------------------
    Attachment: YARN-8763.003.patch

> Add WebSocket logic to the Node Manager web server to establish servlet
> -----------------------------------------------------------------------
>
>                 Key: YARN-8763
>                 URL: https://issues.apache.org/jira/browse/YARN-8763
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
    [ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16634486#comment-16634486 ]

Zian Chen commented on YARN-8763:
---------------------------------

Hi [~eyang], that makes sense. I'll work on patch 003 to address the comments and the Jenkins failures.

> Add WebSocket logic to the Node Manager web server to establish servlet
> -----------------------------------------------------------------------
>
>                 Key: YARN-8763
>                 URL: https://issues.apache.org/jira/browse/YARN-8763
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
    [ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16632636#comment-16632636 ]

Zian Chen commented on YARN-8763:
---------------------------------

Updated patch 002 for review. I really appreciate the help from [~eyang] on this patch.

> Add WebSocket logic to the Node Manager web server to establish servlet
> -----------------------------------------------------------------------
>
>                 Key: YARN-8763
>                 URL: https://issues.apache.org/jira/browse/YARN-8763
[jira] [Updated] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
    [ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zian Chen updated YARN-8763:
----------------------------
    Attachment: YARN-8763.002.patch

> Add WebSocket logic to the Node Manager web server to establish servlet
> -----------------------------------------------------------------------
>
>                 Key: YARN-8763
>                 URL: https://issues.apache.org/jira/browse/YARN-8763
[jira] [Commented] (YARN-8758) PreemptionMessage when using AMRMClientAsync
    [ https://issues.apache.org/jira/browse/YARN-8758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626601#comment-16626601 ]

Zian Chen commented on YARN-8758:
---------------------------------

Hi [~sunilg] [~weiweiyagn666], could you help review the patch? Thanks.

> PreemptionMessage when using AMRMClientAsync
> --------------------------------------------
>
>                 Key: YARN-8758
>                 URL: https://issues.apache.org/jira/browse/YARN-8758
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: yarn
>    Affects Versions: 3.1.1
>            Reporter: Krishna Kishore
>            Assignee: Zian Chen
>            Priority: Major
>         Attachments: YARN-8758.001.patch
>
> Hi,
> The preemption notification messages sent within the time period defined by
> the following parameter currently work only with AMRMClient, not with
> AMRMClientAsync:
> *yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill*
> We want this to work with AMRMClientAsync as well, because our
> implementations are based on it.
>
> Thanks,
> Kishore
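The request above boils down to delivering the PreemptionMessage through the async callback path. A simplified, self-contained analogue of that style is sketched below: responses arrive on a heartbeat thread and are dispatched to a handler. The `onPreemptionMessage` hook is hypothetical; it stands in for however the patch actually surfaces the message to AMRMClientAsync users.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class PreemptionCallbackSketch {

  // Hypothetical callback, analogous in spirit to the handler interface that
  // AMRMClientAsync users implement for other async events.
  public interface CallbackHandler {
    void onPreemptionMessage(String message);
  }

  private final CallbackHandler handler;

  public PreemptionCallbackSketch(CallbackHandler handler) {
    this.handler = handler;
  }

  // Simulates one asynchronous heartbeat whose allocate response carried a
  // preemption message: the handler runs on the heartbeat thread, not the
  // caller's thread.
  public void heartbeatOnce(String preemptionMessage) {
    new Thread(() -> {
      if (preemptionMessage != null) {
        handler.onPreemptionMessage(preemptionMessage);
      }
    }).start();
  }

  // Deliver a message and wait (up to 5s) for the callback to observe it.
  public static boolean deliverAndWait(String message) throws InterruptedException {
    CountDownLatch received = new CountDownLatch(1);
    PreemptionCallbackSketch client =
        new PreemptionCallbackSketch(msg -> received.countDown());
    client.heartbeatOnce(message);
    return received.await(5, TimeUnit.SECONDS);
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("callback fired: " + deliverAndWait("sample preemption notice"));
  }
}
```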
[jira] [Updated] (YARN-8758) PreemptionMessage when using AMRMClientAsync
    [ https://issues.apache.org/jira/browse/YARN-8758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zian Chen updated YARN-8758:
----------------------------
    Attachment: YARN-8758.001.patch

> PreemptionMessage when using AMRMClientAsync
> --------------------------------------------
>
>                 Key: YARN-8758
>                 URL: https://issues.apache.org/jira/browse/YARN-8758
[jira] [Commented] (YARN-8758) PreemptionMessage when using AMRMClientAsync
    [ https://issues.apache.org/jira/browse/YARN-8758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626513#comment-16626513 ]

Zian Chen commented on YARN-8758:
---------------------------------

I'll work on this Jira and provide an initial patch.

> PreemptionMessage when using AMRMClientAsync
> --------------------------------------------
>
>                 Key: YARN-8758
>                 URL: https://issues.apache.org/jira/browse/YARN-8758
[jira] [Assigned] (YARN-8758) PreemptionMessage when using AMRMClientAsync
    [ https://issues.apache.org/jira/browse/YARN-8758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zian Chen reassigned YARN-8758:
-------------------------------

    Assignee: Zian Chen

> PreemptionMessage when using AMRMClientAsync
> --------------------------------------------
>
>                 Key: YARN-8758
>                 URL: https://issues.apache.org/jira/browse/YARN-8758
[jira] [Commented] (YARN-8785) Error Message "Invalid docker rw mount" not helpful
    [ https://issues.apache.org/jira/browse/YARN-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624007#comment-16624007 ]

Zian Chen commented on YARN-8785:
---------------------------------

Hi [~simonprewo], thanks for the patch. The patch itself looks good to me. One addition to [~eyang]'s comments: after renaming the patch to "YARN-8785.001.patch", please click Submit Patch at the top and attach the patch file; that will trigger a Jenkins build to verify whether anything is affected by this patch. Thanks for the effort.

> Error Message "Invalid docker rw mount" not helpful
> ---------------------------------------------------
>
>                 Key: YARN-8785
>                 URL: https://issues.apache.org/jira/browse/YARN-8785
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 2.9.1, 3.1.1
>            Reporter: Simon Prewo
>            Assignee: Simon Prewo
>            Priority: Major
>              Labels: Docker
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> A user receives the error message _Invalid docker rw mount_ when a container
> tries to mount a directory which is not configured in property
> *docker.allowed.rw-mounts*.
> {code:java}
> Invalid docker rw mount
> '/usr/local/hadoop/logs/userlogs/application_1536476159258_0004/container_1536476159258_0004_02_01:/usr/local/hadoop/logs/userlogs/application_1536476159258_0004/container_1536476159258_0004_02_01',
> realpath=/usr/local/hadoop/logs/userlogs/application_1536476159258_0004/container_1536476159258_0004_02_01{code}
> The error message makes the user think "it is not possible due to a docker
> issue". My suggestion would be to use a message like *Configuration of the
> container executor does not allow mounting directory.*
>
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c
>
> CURRENT:
> {code:java}
> permitted_rw = check_mount_permitted((const char **) permitted_rw_mounts, mount_src);
> permitted_ro = check_mount_permitted((const char **) permitted_ro_mounts, mount_src);
> if (permitted_ro == -1 || permitted_rw == -1) {
>   fprintf(ERRORFILE, "Invalid docker mount '%s', realpath=%s\n", values[i], mount_src);
>   ...
> {code}
> NEW:
> {code:java}
> permitted_rw = check_mount_permitted((const char **) permitted_rw_mounts, mount_src);
> permitted_ro = check_mount_permitted((const char **) permitted_ro_mounts, mount_src);
> if (permitted_ro == -1 || permitted_rw == -1) {
>   fprintf(ERRORFILE, "Configuration of the container executor does not allow mounting directory '%s', realpath=%s\n", values[i], mount_src);
>   ...
> {code}
[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command
[ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623999#comment-16623999 ] Zian Chen commented on YARN-8777: - Thanks [~eyang] for the work. I'm ok with patch 003. One quick question: you mentioned {code:java} It is entirely possible to use ProcessBuilder and launch container-executor to run docker exec, and send unix command to be executed. {code} Is ProcessBuilder meant as a possible way to reuse code for passing arbitrary commands? If so, this approach might run into the same issue as the enum approach, which can only handle a small set of command options, not arbitrary commands. > Container Executor C binary change to execute interactive docker command > > > Key: YARN-8777 > URL: https://issues.apache.org/jira/browse/YARN-8777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Eric Yang >Priority: Major > Labels: Docker > Attachments: YARN-8777.001.patch, YARN-8777.002.patch, > YARN-8777.003.patch > > > Since Container Executor provides Container execution using the native > container-executor binary, we also need to make changes to accept new > “dockerExec” method to invoke the corresponding native function to execute > docker exec command to the running container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support
[ https://issues.apache.org/jira/browse/YARN-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622555#comment-16622555 ] Zian Chen commented on YARN-8805: - Thanks [~shaneku...@gmail.com], I'll work on the patch > Automatically convert the launch command to the exec form when using > entrypoint support > --- > > Key: YARN-8805 > URL: https://issues.apache.org/jira/browse/YARN-8805 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Shane Kumpf >Assignee: Zian Chen >Priority: Major > Labels: Docker > > When {{YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE}} is true, and a > launch command is provided, it is expected that the launch command is > provided by the user in exec form. > For example: > {code:java} > "/usr/bin/sleep 6000"{code} > must be changed to: > {code}"/usr/bin/sleep,6000"{code} > If this is not done, the container will never start and will be in a Created > state. We should automatically do this conversion vs making the user > understand this nuance of using the entrypoint support. Docs should be > updated to reflect this change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support
[ https://issues.apache.org/jira/browse/YARN-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen reassigned YARN-8805: --- Assignee: Zian Chen > Automatically convert the launch command to the exec form when using > entrypoint support > --- > > Key: YARN-8805 > URL: https://issues.apache.org/jira/browse/YARN-8805 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Shane Kumpf >Assignee: Zian Chen >Priority: Major > Labels: Docker > > When {{YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE}} is true, and a > launch command is provided, it is expected that the launch command is > provided by the user in exec form. > For example: > {code:java} > "/usr/bin/sleep 6000"{code} > must be changed to: > {code}"/usr/bin/sleep,6000"{code} > If this is not done, the container will never start and will be in a Created > state. We should automatically do this conversion vs making the user > understand this nuance of using the entrypoint support. Docs should be > updated to reflect this change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support
[ https://issues.apache.org/jira/browse/YARN-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622499#comment-16622499 ] Zian Chen commented on YARN-8805: - Yes, I just checked the latest released doc, [https://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/yarn-service/Examples.html,] and the format needs to be fixed. I also agree with [~shaneku...@gmail.com] that we should perform the conversion automatically when YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE is set to true. Would you like to provide a patch for this, [~shaneku...@gmail.com], or I can help. > Automatically convert the launch command to the exec form when using > entrypoint support > --- > > Key: YARN-8805 > URL: https://issues.apache.org/jira/browse/YARN-8805 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Shane Kumpf >Priority: Major > Labels: Docker > > When {{YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE}} is true, and a > launch command is provided, it is expected that the launch command is > provided by the user in exec form. > For example: > {code:java} > "/usr/bin/sleep 6000"{code} > must be changed to: > {code}"/usr/bin/sleep,6000"{code} > If this is not done, the container will never start and will be in a Created > state. We should automatically do this conversion vs making the user > understand this nuance of using the entrypoint support. Docs should be > updated to reflect this change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
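The shell-form to exec-form conversion described in YARN-8805 amounts to tokenizing the launch command and joining the tokens with commas. A minimal sketch of such a converter follows; the class and method names are illustrative, not the actual YARN implementation, and a real version would also need to handle quoted arguments that contain spaces:

```java
public class ExecFormConverter {

    // Convert a shell-form launch command such as "/usr/bin/sleep 6000"
    // into the comma-delimited exec form "/usr/bin/sleep,6000" expected
    // when YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE is true.
    static String toExecForm(String launchCommand) {
        // Trim, split on runs of whitespace, and rejoin with commas.
        return String.join(",", launchCommand.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(toExecForm("/usr/bin/sleep 6000")); // prints "/usr/bin/sleep,6000"
    }
}
```

Note that naive whitespace splitting would mangle a command like `bash -c "sleep 6000"`, which is why quoting is the tricky part of doing this conversion automatically on the user's behalf.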
[jira] [Commented] (YARN-8785) Error Message "Invalid docker rw mount" not helpful
[ https://issues.apache.org/jira/browse/YARN-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622495#comment-16622495 ] Zian Chen commented on YARN-8785: - Hi [~simonprewo], would you like to work on this Jira and provide a patch? Or I can help with it. > Error Message "Invalid docker rw mount" not helpful > --- > > Key: YARN-8785 > URL: https://issues.apache.org/jira/browse/YARN-8785 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.9.1, 3.1.1 >Reporter: Simon Prewo >Priority: Major > Labels: Docker > Original Estimate: 2h > Remaining Estimate: 2h > > A user receives the error message _Invalid docker rw mount_ when a container > tries to mount a directory which is not configured in property > *docker.allowed.rw-mounts*. > {code:java} > Invalid docker rw mount > '/usr/local/hadoop/logs/userlogs/application_1536476159258_0004/container_1536476159258_0004_02_01:/usr/local/hadoop/logs/userlogs/application_1536476159258_0004/container_1536476159258_0004_02_01', > > realpath=/usr/local/hadoop/logs/userlogs/application_1536476159258_0004/container_1536476159258_0004_02_01{code} > The error message makes the user think "It is not possible due to a docker > issue". My suggestion would be to put there a message like *Configuration of > the container executor does not allow mounting directory.*. > hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c > CURRENT: > {code:java} > permitted_rw = check_mount_permitted((const char **) permitted_rw_mounts, > mount_src); > permitted_ro = check_mount_permitted((const char **) permitted_ro_mounts, > mount_src); > if (permitted_ro == -1 || permitted_rw == -1) { > fprintf(ERRORFILE, "Invalid docker mount '%s', realpath=%s\n", > values[i], mount_src); > ... 
> {code} > NEW: > {code:java} > permitted_rw = check_mount_permitted((const char **) permitted_rw_mounts, > mount_src); > permitted_ro = check_mount_permitted((const char **) permitted_ro_mounts, > mount_src); > if (permitted_ro == -1 || permitted_rw == -1) { > fprintf(ERRORFILE, "Configuration of the container executor does not > allow mounting directory '%s', realpath=%s\n", values[i], mount_src); > ... > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8801) java doc comments in docker-util.h is confusing
[ https://issues.apache.org/jira/browse/YARN-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622438#comment-16622438 ] Zian Chen commented on YARN-8801: - Thank you [~eyang] > java doc comments in docker-util.h is confusing > --- > > Key: YARN-8801 > URL: https://issues.apache.org/jira/browse/YARN-8801 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Minor > Labels: Docker > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-8801.001.patch > > > {code:java} > /** > + * Get the Docker exec command line string. The function will verify that > the params file is meant for the exec command. > + * @param command_file File containing the params for the Docker start > command > + * @param conf Configuration struct containing the container-executor.cfg > details > + * @param out Buffer to fill with the exec command > + * @param outlen Size of the output buffer > + * @return Return code with 0 indicating success and non-zero codes > indicating error > + */ > +int get_docker_exec_command(const char* command_file, const struct > configuration* conf, args *args);{code} > The method param list has out and outlen, which don't match the signature, > and the description for param args is missing. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8801) java doc comments in docker-util.h is confusing
[ https://issues.apache.org/jira/browse/YARN-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621350#comment-16621350 ] Zian Chen commented on YARN-8801: - Provided a patch for the fix. No need to add UTs here. > java doc comments in docker-util.h is confusing > --- > > Key: YARN-8801 > URL: https://issues.apache.org/jira/browse/YARN-8801 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Minor > Labels: Docker > > {code:java} > /** > + * Get the Docker exec command line string. The function will verify that > the params file is meant for the exec command. > + * @param command_file File containing the params for the Docker start > command > + * @param conf Configuration struct containing the container-executor.cfg > details > + * @param out Buffer to fill with the exec command > + * @param outlen Size of the output buffer > + * @return Return code with 0 indicating success and non-zero codes > indicating error > + */ > +int get_docker_exec_command(const char* command_file, const struct > configuration* conf, args *args);{code} > The method param list has out and outlen, which don't match the signature, > and the description for param args is missing. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8801) java doc comments in docker-util.h is confusing
[ https://issues.apache.org/jira/browse/YARN-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8801: Labels: Docker (was: ) > java doc comments in docker-util.h is confusing > --- > > Key: YARN-8801 > URL: https://issues.apache.org/jira/browse/YARN-8801 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Minor > Labels: Docker > > {code:java} > /** > + * Get the Docker exec command line string. The function will verify that > the params file is meant for the exec command. > + * @param command_file File containing the params for the Docker start > command > + * @param conf Configuration struct containing the container-executor.cfg > details > + * @param out Buffer to fill with the exec command > + * @param outlen Size of the output buffer > + * @return Return code with 0 indicating success and non-zero codes > indicating error > + */ > +int get_docker_exec_command(const char* command_file, const struct > configuration* conf, args *args);{code} > The method param list has out and outlen, which don't match the signature, > and the description for param args is missing. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8801) java doc comments in docker-util.h is confusing
Zian Chen created YARN-8801: --- Summary: java doc comments in docker-util.h is confusing Key: YARN-8801 URL: https://issues.apache.org/jira/browse/YARN-8801 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zian Chen Assignee: Zian Chen {code:java} /** + * Get the Docker exec command line string. The function will verify that the params file is meant for the exec command. + * @param command_file File containing the params for the Docker start command + * @param conf Configuration struct containing the container-executor.cfg details + * @param out Buffer to fill with the exec command + * @param outlen Size of the output buffer + * @return Return code with 0 indicating success and non-zero codes indicating error + */ +int get_docker_exec_command(const char* command_file, const struct configuration* conf, args *args);{code} The method param list has out and outlen, which don't match the signature, and the description for param args is missing. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8790) Authentication Filter change to force security check
Zian Chen created YARN-8790: --- Summary: Authentication Filter change to force security check Key: YARN-8790 URL: https://issues.apache.org/jira/browse/YARN-8790 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zian Chen Hadoop node manager REST API is authenticated using AuthenticationFilter from Hadoop-auth project. AuthenticationFilter is added to the new WebSocket URL path spec. The requested remote user is verified to match the container owner to allow WebSocket connection to be established. WebSocket servlet code enforces the username match check. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command
[ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619569#comment-16619569 ] Zian Chen commented on YARN-8777: - Hi [~eyang], thanks for the patch. Some quick suggestions and questions: 1. {code:java} /** + * Get the Docker exec command line string. The function will verify that the params file is meant for the exec command. + * @param command_file File containing the params for the Docker start command + * @param conf Configuration struct containing the container-executor.cfg details + * @param out Buffer to fill with the exec command + * @param outlen Size of the output buffer + * @return Return code with 0 indicating success and non-zero codes indicating error + */ +int get_docker_exec_command(const char* command_file, const struct configuration* conf, args *args);{code} The method param list has out and outlen, which don't match the signature, and the description for param args is missing; is this a typo? 2. For the code reuse you discussed with [~ebadger], my quick thought is that instead of passing parameters from the node manager, we could define an enum indexing several commonly used command options and have the node manager pass only an index matching one of the enum elements. This way we get some flexibility without opening up a bigger attack surface. 3. This patch seems focused on running the docker exec -it command to attach to a running container, but later on when the pipeline is built, should we also take care of passing shell commands into the container? 
> Container Executor C binary change to execute interactive docker command > > > Key: YARN-8777 > URL: https://issues.apache.org/jira/browse/YARN-8777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Eric Yang >Priority: Major > Labels: Docker > Attachments: YARN-8777.001.patch > > > Since Container Executor provides Container execution using the native > container-executor binary, we also need to make changes to accept new > “dockerExec” method to invoke the corresponding native function to execute > docker exec command to the running container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
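The enum idea floated in the comment above (indexing a small, fixed set of permitted commands rather than accepting arbitrary strings from the node manager) could look roughly like the sketch below. The command set, names, and paths are purely illustrative assumptions, not part of any patch:

```java
public class DockerShellCommand {

    // Whitelist of shell commands the node manager may request by index.
    // Anything outside this set is rejected before it ever reaches
    // container-executor, keeping the attack surface small.
    enum AllowedCommand {
        BASH("/bin/bash"),
        SH("/bin/sh");

        private final String path;

        AllowedCommand(String path) {
            this.path = path;
        }

        String path() {
            return path;
        }
    }

    // Resolve an index passed by the node manager into a permitted command,
    // rejecting out-of-range values instead of executing arbitrary input.
    static String resolve(int index) {
        AllowedCommand[] all = AllowedCommand.values();
        if (index < 0 || index >= all.length) {
            throw new IllegalArgumentException("command index not permitted: " + index);
        }
        return all[index].path();
    }
}
```

This makes concrete the trade-off raised in the comment: the flexibility of the interface is capped at whatever the enum enumerates, which is exactly why it cannot serve arbitrary commands.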
[jira] [Resolved] (YARN-8781) back-port YARN-8091 to branch-2.6.4
[ https://issues.apache.org/jira/browse/YARN-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen resolved YARN-8781. - Resolution: Invalid > back-port YARN-8091 to branch-2.6.4 > --- > > Key: YARN-8781 > URL: https://issues.apache.org/jira/browse/YARN-8781 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.4 >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Minor > Fix For: 2.6.4 > > > We suggest a patch that back-ports the change > https://issues.apache.org/jira/browse/YARN-8091 to branch 2.6.4 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8781) back-port YARN-8091 to branch-2.6.4
[ https://issues.apache.org/jira/browse/YARN-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616961#comment-16616961 ] Zian Chen commented on YARN-8781: - Close as invalid. > back-port YARN-8091 to branch-2.6.4 > --- > > Key: YARN-8781 > URL: https://issues.apache.org/jira/browse/YARN-8781 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.4 >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Minor > Fix For: 2.6.4 > > > We suggest a patch that back-ports the change > https://issues.apache.org/jira/browse/YARN-8091 to branch 2.6.4 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8781) back-port YARN-8091 to branch-2.6.4
[ https://issues.apache.org/jira/browse/YARN-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616416#comment-16616416 ] Zian Chen commented on YARN-8781: - Reworked the YARN-8091 patch to fix conflicts with trunk, since a lot of changes have been made since 2.6.4. > back-port YARN-8091 to branch-2.6.4 > --- > > Key: YARN-8781 > URL: https://issues.apache.org/jira/browse/YARN-8781 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.4 >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Minor > Fix For: 2.6.4 > > > We suggest a patch that back-ports the change > https://issues.apache.org/jira/browse/YARN-8091 to branch 2.6.4 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8781) back-port YARN-8091 to branch-2.6.4
Zian Chen created YARN-8781: --- Summary: back-port YARN-8091 to branch-2.6.4 Key: YARN-8781 URL: https://issues.apache.org/jira/browse/YARN-8781 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.4 Reporter: Zian Chen Assignee: Zian Chen Fix For: 2.6.4 We suggest a patch that back-ports the change https://issues.apache.org/jira/browse/YARN-8091 to branch 2.6.4 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8780) back-port YARN-8028 to branch-2.6.4
[ https://issues.apache.org/jira/browse/YARN-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8780: Attachment: YARN-8028-branch-2.6.4-001.patch > back-port YARN-8028 to branch-2.6.4 > --- > > Key: YARN-8780 > URL: https://issues.apache.org/jira/browse/YARN-8780 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Minor > Fix For: 2.6.4 > > Attachments: YARN-8028-branch-2.6.4-001.patch > > > We suggest a patch that back-ports the change > https://issues.apache.org/jira/browse/YARN-8028 to branch 2.6.4 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8780) back-port YARN-802 to branch-2.6.4
[ https://issues.apache.org/jira/browse/YARN-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616409#comment-16616409 ] Zian Chen commented on YARN-8780: - Reworked the YARN-8028 patch to fix conflicts with trunk, since a lot of changes have been made since 2.6.4. > back-port YARN-802 to branch-2.6.4 > -- > > Key: YARN-8780 > URL: https://issues.apache.org/jira/browse/YARN-8780 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Minor > Fix For: 2.6.4 > > > We suggest a patch that back-ports the change > https://issues.apache.org/jira/browse/YARN-8028 to branch 2.6.4 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8780) back-port YARN-8028 to branch-2.6.4
[ https://issues.apache.org/jira/browse/YARN-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8780: Summary: back-port YARN-8028 to branch-2.6.4 (was: back-port YARN-802 to branch-2.6.4) > back-port YARN-8028 to branch-2.6.4 > --- > > Key: YARN-8780 > URL: https://issues.apache.org/jira/browse/YARN-8780 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Minor > Fix For: 2.6.4 > > > We suggest a patch that back-ports the change > https://issues.apache.org/jira/browse/YARN-8028 to branch 2.6.4 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8780) back-port YARN-802 to branch-2.6.4
Zian Chen created YARN-8780: --- Summary: back-port YARN-802 to branch-2.6.4 Key: YARN-8780 URL: https://issues.apache.org/jira/browse/YARN-8780 Project: Hadoop YARN Issue Type: Bug Reporter: Zian Chen Assignee: Zian Chen Fix For: 2.6.4 We suggest a patch that back-ports the change https://issues.apache.org/jira/browse/YARN-8028 to branch 2.6.4 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8778) Add Command Line interface to invoke interactive docker shell
Zian Chen created YARN-8778: --- Summary: Add Command Line interface to invoke interactive docker shell Key: YARN-8778 URL: https://issues.apache.org/jira/browse/YARN-8778 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zian Chen Assignee: Zian Chen CLI will be the mandatory interface we are providing for a user to use interactive docker shell feature. We will need to create a new class “InteractiveDockerShellCLI” to read command line into the servlet and pass all the way down to docker executor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
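As a rough illustration of the entry point YARN-8778 proposes, the sketch below parses a container id out of CLI arguments. The class name comes from the issue description, but the option syntax and helper are hypothetical assumptions, not the committed interface:

```java
public class InteractiveDockerShellCLI {

    // Hypothetical argument parsing: extract the container id that follows a
    // "-container" flag, i.e. the container the shell session is opened against.
    static String parseContainerId(String[] args) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("-container".equals(args[i])) {
                return args[i + 1];
            }
        }
        // Option name shown in the usage string is illustrative only.
        throw new IllegalArgumentException("usage: -container <containerId>");
    }
}
```

The parsed id would then travel through the servlet down to the docker executor, as the description above outlines.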
[jira] [Created] (YARN-8777) Container Executor C binary change to execute interactive docker command
Zian Chen created YARN-8777: --- Summary: Container Executor C binary change to execute interactive docker command Key: YARN-8777 URL: https://issues.apache.org/jira/browse/YARN-8777 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zian Chen Since Container Executor provides Container execution using the native container-executor binary, we also need to make changes to accept new “dockerExec” method to invoke the corresponding native function to execute docker exec command to the running container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8776) Container Executor change to create stdin/stdout pipeline
Zian Chen created YARN-8776: --- Summary: Container Executor change to create stdin/stdout pipeline Key: YARN-8776 URL: https://issues.apache.org/jira/browse/YARN-8776 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zian Chen Assignee: Zian Chen The pipeline is built to connect the stdin/stdout channel from WebSocket servlet through container-executor to docker executor. So when the WebSocket servlet is started, we need to invoke container-executor “dockerExec” method (which will be implemented) to create a new docker executor and use “docker exec -it $ContainerId” command which executes an interactive bash shell on the container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
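At its core, the pipeline described in YARN-8776 forwards bytes in both directions between the WebSocket servlet and the stdin/stdout of the process running "docker exec -it". A minimal sketch of the forwarding loop follows, demonstrated with in-memory streams; in the real pipeline the endpoints would be the WebSocket session and the Process streams, so this is an illustration under assumptions, not the implementation:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StdioPipeline {

    // Copy bytes from source to sink until EOF, flushing after every read
    // so an interactive shell sees keystrokes and output promptly rather
    // than waiting for a buffer to fill.
    static void pump(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for the WebSocket input and the docker exec stdin;
        // a real pipeline would run one pump per direction on its own thread.
        InputStream fromClient = new ByteArrayInputStream("ls -l\n".getBytes());
        ByteArrayOutputStream toShell = new ByteArrayOutputStream();
        pump(fromClient, toShell);
        System.out.print(toShell);
    }
}
```

One pump instance per direction (client-to-stdin and stdout-to-client), each on its own thread, is what makes the channel behave full-duplex end to end.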
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
[ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610186#comment-16610186 ] Zian Chen commented on YARN-8763: - Hi [~eyang], thanks for the detailed suggestions. Makes sense. Let me address these comments as well as the Jenkins errors. > Add WebSocket logic to the Node Manager web server to establish servlet > --- > > Key: YARN-8763 > URL: https://issues.apache.org/jira/browse/YARN-8763 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: Docker > Attachments: YARN-8763-001.patch > > > The reason we want to use WebSocket servlet to serve the backend instead of > establishing the connection through HTTP is that WebSocket solves a few > issues with HTTP which needed for our scenario, > # In HTTP, the request is always initiated by the client and the response is > processed by the server — making HTTP a unidirectional protocol, while web > socket provides the Bi-directional protocol which means either client/server > can send a message to the other party. > # Full-duplex communication — client and server can talk to each other > independently at the same time > # Single TCP connection — After upgrading the HTTP connection in the > beginning, client and server communicate over that same TCP connection > throughout the lifecycle of WebSocket connection -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
[ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609919#comment-16609919 ] Zian Chen commented on YARN-8763: - Hi [~eyang], could you help review the patch? Thanks! > Add WebSocket logic to the Node Manager web server to establish servlet > --- > > Key: YARN-8763 > URL: https://issues.apache.org/jira/browse/YARN-8763 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: Docker > Attachments: YARN-8763-001.patch > > > The reason we want to use WebSocket servlet to serve the backend instead of > establishing the connection through HTTP is that WebSocket solves a few > issues with HTTP which needed for our scenario, > # In HTTP, the request is always initiated by the client and the response is > processed by the server — making HTTP a unidirectional protocol, while web > socket provides the Bi-directional protocol which means either client/server > can send a message to the other party. > # Full-duplex communication — client and server can talk to each other > independently at the same time > # Single TCP connection — After upgrading the HTTP connection in the > beginning, client and server communicate over that same TCP connection > throughout the lifecycle of WebSocket connection -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
[ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8763: Attachment: YARN-8763-001.patch > Add WebSocket logic to the Node Manager web server to establish servlet > --- > > Key: YARN-8763 > URL: https://issues.apache.org/jira/browse/YARN-8763 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: Docker > Attachments: YARN-8763-001.patch > > > The reason we want to use WebSocket servlet to serve the backend instead of > establishing the connection through HTTP is that WebSocket solves a few > issues with HTTP which needed for our scenario, > # In HTTP, the request is always initiated by the client and the response is > processed by the server — making HTTP a unidirectional protocol, while web > socket provides the Bi-directional protocol which means either client/server > can send a message to the other party. > # Full-duplex communication — client and server can talk to each other > independently at the same time > # Single TCP connection — After upgrading the HTTP connection in the > beginning, client and server communicate over that same TCP connection > throughout the lifecycle of WebSocket connection -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
[ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609917#comment-16609917 ] Zian Chen commented on YARN-8763: - Provided an initial patch for this. > Add WebSocket logic to the Node Manager web server to establish servlet > --- > > Key: YARN-8763 > URL: https://issues.apache.org/jira/browse/YARN-8763 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: Docker
[jira] [Created] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
Zian Chen created YARN-8763: --- Summary: Add WebSocket logic to the Node Manager web server to establish servlet Key: YARN-8763 URL: https://issues.apache.org/jira/browse/YARN-8763 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zian Chen Assignee: Zian Chen
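The three WebSocket properties listed above (bi-directional, full-duplex, single TCP connection) can be sketched from the client side with the JDK's built-in java.net.http WebSocket API. This is only an illustration, not the patch's actual servlet code; the endpoint URI and message contents are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.CompletionStage;

// Accumulates text frames pushed by the server; with WebSocket the server can
// send at any time, unlike HTTP where only the client initiates requests.
public class ShellOutputListener implements WebSocket.Listener {
    private final StringBuilder output = new StringBuilder();

    @Override
    public CompletionStage<?> onText(WebSocket ws, CharSequence data, boolean last) {
        output.append(data);
        if (ws != null) {
            ws.request(1); // ask the implementation for the next frame
        }
        return null;
    }

    public String output() {
        return output.toString();
    }

    // Hypothetical wiring: one HTTP upgrade, then both sides share the same
    // TCP connection; sendText() writes while onText() concurrently receives.
    static void connect(String uri) {
        HttpClient.newHttpClient()
            .newWebSocketBuilder()
            .buildAsync(URI.create(uri), new ShellOutputListener())
            .thenAccept(ws -> ws.sendText("ls /", true));
    }
}
```

Here `connect("ws://nm-host:8042/container/shell")` stands in for whatever endpoint the YARN-8763 servlet ends up exposing.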
[jira] [Updated] (YARN-8762) [Umbrella] Support Interactive Docker Shell to running Containers
[ https://issues.apache.org/jira/browse/YARN-8762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8762: Attachment: Interactive Docker Shell design doc.pdf > [Umbrella] Support Interactive Docker Shell to running Containers > - > > Key: YARN-8762 > URL: https://issues.apache.org/jira/browse/YARN-8762 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Zian Chen >Priority: Major > Labels: Docker > Attachments: Interactive Docker Shell design doc.pdf > > > Debugging distributed applications on Hadoop can be challenging; Hadoop provides only limited debugging ability through application log files. One of the most frequently requested features is an interactive shell to assist real-time debugging. This feature is inspired by docker exec, which provides the ability to run arbitrary commands in a docker container.
[jira] [Commented] (YARN-8762) [Umbrella] Support Interactive Docker Shell to running Containers
[ https://issues.apache.org/jira/browse/YARN-8762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609916#comment-16609916 ] Zian Chen commented on YARN-8762: - Provided a design doc for this. > [Umbrella] Support Interactive Docker Shell to running Containers > - > > Key: YARN-8762 > URL: https://issues.apache.org/jira/browse/YARN-8762 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Zian Chen >Priority: Major > Labels: Docker
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609914#comment-16609914 ] Zian Chen commented on YARN-8523: - Discussed offline with Eric and Wangda: this feature involves creating a pipeline among the NM, container-executor, and docker exec, which requires a lot of changes to the container stack. Created umbrella Jira YARN-8762 to track progress. > Interactive docker shell > > > Key: YARN-8523 > URL: https://issues.apache.org/jira/browse/YARN-8523 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Zian Chen >Priority: Major > Labels: Docker > > Some applications might require interactive Unix command execution to carry out operations. Container-executor can interface with docker exec to debug or analyze docker containers while the application is running. It would be nice to support an API that invokes docker exec to run Unix commands and reports the output back to the application master, which can distribute and aggregate execution of the commands and record the results in its log file.
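As a rough sketch of what the pipeline ultimately runs, the argv that container-executor would hand to docker for an interactive shell can be assembled as below. The class and method names are hypothetical; only the `docker exec -it $ContainerId` shape with an interactive bash shell comes from the design notes on these JIRAs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the argv for attaching an interactive shell
// to a running docker container, as the interactive-shell pipeline would.
public class DockerExecCmd {
    public static List<String> interactiveShell(String containerId) {
        List<String> argv = new ArrayList<>();
        argv.add("docker");
        argv.add("exec");
        argv.add("-it");          // interactive, with a pseudo-TTY allocated
        argv.add(containerId);
        argv.add("/bin/bash");    // shell to run inside the container
        return argv;
    }
}
```

In the real feature this command line would be produced inside container-executor rather than in Java, but the argument shape is the same.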
[jira] [Created] (YARN-8762) [Umbrella] Support Interactive Docker Shell to running Containers
Zian Chen created YARN-8762: --- Summary: [Umbrella] Support Interactive Docker Shell to running Containers Key: YARN-8762 URL: https://issues.apache.org/jira/browse/YARN-8762 Project: Hadoop YARN Issue Type: New Feature Reporter: Zian Chen
[jira] [Commented] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592307#comment-16592307 ] Zian Chen commented on YARN-8509: - Hi [~eepayne], sorry for getting back late. I took some time to run SLS tests to evaluate whether this change introduces unnecessary preemption. However, there is no suitable dataset for testing preemption behavior, since almost all the datasets submit applications at the same time, which lets the scheduler consider all the resource requests in the allocation stage and leaves no chance for preemption to come into play. I'll work offline on generating a dataset for an SLS preemption test, which may take several weeks. I'll comment with updates once I have some progress. > Total pending resource calculation in preemption should use user-limit factor > instead of minimum-user-limit-percent > --- > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: capacityscheduler > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch, YARN-8509.004.patch, YARN-8509.005.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate total pending resource based on user-limit percent and user-limit factor, which caps the pending resource for each user to the minimum of the user-limit pending and the actual pending. This prevents a queue from taking more pending resource to achieve queue balance after every queue is satisfied with its ideal allocation. We need to change the logic to let queue pending go beyond the user limit.
[jira] [Commented] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588010#comment-16588010 ] Zian Chen commented on YARN-8509: - Fixed the failed UTs and re-uploaded the patch. > Total pending resource calculation in preemption should use user-limit factor > instead of minimum-user-limit-percent > --- > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: capacityscheduler > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch, YARN-8509.004.patch, YARN-8509.005.patch
[jira] [Updated] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8509: Attachment: YARN-8509.005.patch > Total pending resource calculation in preemption should use user-limit factor > instead of minimum-user-limit-percent > --- > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: capacityscheduler > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch, YARN-8509.004.patch, YARN-8509.005.patch
[jira] [Updated] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8509: Attachment: YARN-8509.004.patch > Total pending resource calculation in preemption should use user-limit factor > instead of minimum-user-limit-percent > --- > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: capacityscheduler > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch, YARN-8509.004.patch
[jira] [Commented] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16581573#comment-16581573 ] Zian Chen commented on YARN-8509: - Discussed offline with Eric and Wangda; will upload a new patch to verify that the algorithm we proposed here works as expected and does not cause any over-preemption. > Total pending resource calculation in preemption should use user-limit factor > instead of minimum-user-limit-percent > --- > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch
[jira] [Commented] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578859#comment-16578859 ] Zian Chen commented on YARN-7417: - Hi [~eyang], thanks for the comments. I don't think we need to add extra UTs; we already have a number of UTs located in hadoop-mapreduce-client, hadoop-yarn-common, and elsewhere that test IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock. As long as all the UTs pass, it means we didn't break anything with this refactoring. > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7417.001.patch, YARN-7417.002.patch, > YARN-7417.003.patch > > > This Jira focuses on refactoring code for IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock: > # We have duplicate code in the current implementations of IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock which can be abstracted into common methods. > # The render method is too long in both of these classes; we want to make it clearer by abstracting some helper methods out.
[jira] [Updated] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-7417: Attachment: YARN-7417.003.patch > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7417.001.patch, YARN-7417.002.patch, > YARN-7417.003.patch
[jira] [Assigned] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen reassigned YARN-8523: --- Assignee: Zian Chen > Interactive docker shell > > > Key: YARN-8523 > URL: https://issues.apache.org/jira/browse/YARN-8523 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Zian Chen >Priority: Major > Labels: Docker
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576945#comment-16576945 ] Zian Chen commented on YARN-8523: - Makes sense. I'll work on providing an initial patch for this idea. Thanks [~eyang]! > Interactive docker shell > > > Key: YARN-8523 > URL: https://issues.apache.org/jira/browse/YARN-8523 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Priority: Major > Labels: Docker
[jira] [Commented] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576917#comment-16576917 ] Zian Chen commented on YARN-7417: - It looks like we can make AggregatedLogFormat.ContainerLogsReader extend InputStream to achieve this. Let me update the patch. > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7417.001.patch, YARN-7417.002.patch
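The idea in the comment above, making the TFile reader usable where IndexedFileAggregatedLogsBlock expects an InputStream, can be sketched as a thin adapter. `LogReader` below is a hypothetical stand-in for AggregatedLogFormat.ContainerLogsReader's read interface, not the real Hadoop class; the point is only that an InputStream subclass can delegate to any byte-producing reader so both log blocks share one render path.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the reader type used by TFileAggregatedLogsBlock.
interface LogReader {
    int read(byte[] buf, int off, int len) throws IOException;
}

// Adapter: presents any LogReader as a java.io.InputStream, so code written
// against InputStream (as in IndexedFileAggregatedLogsBlock) can consume it.
public class LogReaderInputStream extends InputStream {
    private final LogReader reader;

    public LogReaderInputStream(LogReader reader) {
        this.reader = reader;
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        int n = reader.read(one, 0, 1);
        return n <= 0 ? -1 : one[0] & 0xff;   // -1 signals end of stream
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        return reader.read(buf, off, len);    // delegate bulk reads directly
    }
}
```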
[jira] [Commented] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576900#comment-16576900 ] Zian Chen commented on YARN-8509: - Hi [~eepayne], sure, let me address these two questions.

1) The summation is over users: for each user we take the minimum of two expressions, one being the user's pending resource per partition, the other being the user limit (which is queue_capacity * user_limit_factor) minus the user's used resource per partition.

2) I think there is some misunderstanding here. First of all, after the title change, this Jira does not intend to only support balancing of queues after they are satisfied; it intends to change the general strategy of how the user limit is calculated in the preemption scenario. So the queue capacities I mentioned in the example are an initial state, like this:

|| ||queue-a||queue-b||queue-c||queue-d||
|Guaranteed|30|30|30|10|
|Used|10|40|50|0|
|Pending|6|30|30|0|

This configuration should be able to occur if we set user_limit_percent to 50 and user_limit_factor to 1.0f, 3.0f, 3.0f, and 2.0f respectively. But with the current equation, this initial state won't happen:

{code:java}
user_limit = min(max(current_capacity / #active_users, current_capacity * user_limit_percent), queue_capacity * user_limit_factor)
{code}

In the above case, queue-b's queue_capacity * user_limit_factor is 90GB while max(current_capacity / #active_users, current_capacity * user_limit_percent) is 40GB, so user-limit-factor has no effect at all and the headroom for queue-b becomes zero. So the point is, we should let the user limit reach at most queue_capacity * user_limit_factor. > Total pending resource calculation in preemption should use user-limit factor > instead of minimum-user-limit-percent > --- > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch
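The 90GB-vs-40GB claim for queue-b can be checked numerically. The sketch below encodes the old user-limit formula with the example's figures; taking current_capacity as queue-b's 40G of consumed capacity and a single active user is an assumption made here to reproduce the 40GB result quoted above.

```java
// Worked numbers from the example (single user), showing why the old
// user-limit formula zeroes queue-b's headroom despite user_limit_factor = 3.
public class OldUserLimit {
    static long userLimit(long currentCapacity, int activeUsers,
                          double userLimitPercent, long queueCapacity,
                          double userLimitFactor) {
        // lower bound: the larger of an even share and the percent floor
        long lower = (long) Math.max((double) currentCapacity / activeUsers,
                                     currentCapacity * userLimitPercent);
        // capped from above by queue_capacity * user_limit_factor
        return (long) Math.min(lower, queueCapacity * userLimitFactor);
    }

    public static void main(String[] args) {
        // queue-b: guaranteed 30G, factor 3.0 => cap would be 90G,
        // but the formula yields 40G, exactly equal to the 40G already used.
        long limit = userLimit(40, 1, 0.5, 30, 3.0);
        long headroom = limit - 40;              // used = 40G
        System.out.println(limit + " " + headroom);  // 40 0
    }
}
```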
[jira] [Comment Edited] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573997#comment-16573997 ] Zian Chen edited comment on YARN-8509 at 8/10/18 9:21 PM: -- Hi Eric, thanks for the comments. After discussing with Wangda, the patch uploaded before is not correct due to a misunderstanding of the original problem, and I have changed the Jira title. The intention of this Jira is to fix the calculation of pending resource considering the user limit in the preemption scenario. Currently, the pending resource calculation in preemption uses the same algorithm as scheduling:

{code:java}
user_limit = min(max(current_capacity / #active_users, current_capacity * user_limit_percent), queue_capacity * user_limit_factor)
{code}

This is good for scheduling, because we want to make sure users can get at least "minimum-user-limit-percent" of resource to use, which acts as a lower bound on the user limit. However, we should not cap the total pending resource a leaf queue can get by minimum-user-limit-percent; instead, we want to use user-limit-factor, which is the upper bound, to cap pending resource in preemption. If we use minimum-user-limit-percent to cap pending resource, resource under-utilization will happen in the preemption scenario. Thus, we suggest the pending resource calculation for preemption use this formula:

{code:java}
total_pending(partition, queue) = min(Q_max(partition) - Q_used(partition), Σ_users min(User.ulf(partition) - User.used(partition), User.pending(partition)))
{code}

Let me give an example:

{code:java}
        Root
    /   |   \   \
   a    b    c   d
  30   30   30  10

1) Only one node (n1) in the cluster; it has 100G.
2) app1, submitted to queue-a, asks for 10G used, 6G pending.
3) app2, submitted to queue-b, asks for 40G used, 30G pending.
4) app3, submitted to queue-c, asks for 50G used, 30G pending.
{code}

Here we only have one user, and the per-queue limits are:

||Queue name||minimum-user-limit-percent||user-limit-factor||
|a|50|1.0f|
|b|50|3.0f|
|c|50|3.0f|
|d|50|2.0f|

With the old calculation, the user limit for queue-a is 30G, which lets app1 keep its 6G pending, but the user limit for queue-b becomes 40G, which makes the headroom zero after subtracting the 40G used, so the 30G of pending resource being asked for cannot be accepted; the same happens with queue-c. However, if we look at this test case from the preemption point of view, we should allow queue-b and queue-c to take more pending resources. Even though queue-a has 30G guaranteed configured, it is under-utilized, and with pending resource capped by the old algorithm, queue-b and queue-c cannot take the available resource through preemption, which means cluster resource is not used effectively. To summarize, since user-limit-factor maintains the hard limit of how much resource can be used by a user, we should calculate pending resource considering user-limit-factor instead of minimum-user-limit-percent. Could you share your opinion on this, [~eepayne]?
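Applying the proposed total_pending formula to queue-b from the same example gives the intended result: the full 30G pending survives the cap. The sketch below is illustrative only; Q_max is assumed to be the 100G cluster, since the example does not state per-queue maximums.

```java
// Sketch of the proposed formula applied to one queue:
// total_pending = min(Q_max - Q_used, sum over users of
//                     min(user_ulf_cap - user_used, user_pending)).
public class NewTotalPending {
    static long totalPending(long qMax, long qUsed, long[][] users) {
        long sum = 0;
        for (long[] u : users) {                 // u = {ulfCap, used, pending}
            sum += Math.min(u[0] - u[1], u[2]);  // per-user capped pending
        }
        return Math.min(qMax - qUsed, sum);      // queue-level cap
    }

    public static void main(String[] args) {
        // queue-b: one user, ulf cap 90G (30G * 3.0), used 40G, pending 30G.
        // Old formula counted 0G of this pending; new formula counts all 30G.
        System.out.println(totalPending(100, 40, new long[][]{{90, 40, 30}}));  // 30
    }
}
```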
[jira] [Commented] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576815#comment-16576815 ] Zian Chen commented on YARN-7417: - Thanks for the review [~eyang]. That was my original plan, to make it reusable, but after investigating the logic it turns out to be almost impossible. The main reason is that one formal parameter cannot be abstracted into a common class type: the "AggregatedLogFormat.ContainerLogsReader logReader" parameter in TFileAggregatedLogsBlock is a static class which cannot be converted into any parent class of the formal parameter "InputStream in" in IndexedFileAggregatedLogsBlock. > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7417.001.patch, YARN-7417.002.patch
[jira] [Updated] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-7417: Description: This Jira focuses on refactoring code for IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock: # We have duplicate code in the current implementations of IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock which can be abstracted into common methods. # The render method is too long in both of these classes; we want to make it clearer by abstracting some helper methods out. was: This Jira is focus on refactor code for IndexedFileAggregatedLogsBlock We have duplicate code in current implementation of IndexedFileAggregatedLogsBlock and IndexedFileAggregatedLogsBlock which can be abstract into common method. > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7417.001.patch, YARN-7417.002.patch
[jira] [Updated] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-7417: Description: This Jira is focus on refactor code for IndexedFileAggregatedLogsBlock We have duplicate code in current implementation of IndexedFileAggregatedLogsBlock and IndexedFileAggregatedLogsBlock which can be abstract into common method. was:We have duplicate code in current implementation of IndexedFileAggregatedLogsBlock and > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7417.001.patch, YARN-7417.002.patch > > > This Jira is focus on refactor code for IndexedFileAggregatedLogsBlock > We have duplicate code in current implementation of > IndexedFileAggregatedLogsBlock and IndexedFileAggregatedLogsBlock which can > be abstract into common method. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-7417: Description: We have duplicate code in current implementation of IndexedFileAggregatedLogsBlock and > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7417.001.patch, YARN-7417.002.patch > > > We have duplicate code in current implementation of > IndexedFileAggregatedLogsBlock and -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576697#comment-16576697 ] Zian Chen commented on YARN-8523: - Good point. I think we can make this Jira focus on building this pipeline and create a second Jira for persisting docker exec state across NM restart. Two more questions here: # Should we give the user some kind of notification while the NM restarts and we are trying to resume the docker exec? What if several reconnect retries do not succeed? We may need to give the user a friendly reminder so the session does not appear stuck for too long, right? # How do we handle an unexpected NM shutdown (crash, etc.)? > Interactive docker shell > > > Key: YARN-8523 > URL: https://issues.apache.org/jira/browse/YARN-8523 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Priority: Major > Labels: Docker > > Some applications might require interactive unix command execution to carry > out operations. Container-executor can interface with docker exec to debug > or analyze docker containers while the application is running. It would be > nice to support an API to invoke docker exec to perform unix commands and > report the output back to the application master. The application master can > distribute and aggregate execution of the commands and record them in its > log file. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573997#comment-16573997 ] Zian Chen commented on YARN-8509: - Hi Eric, thanks for the comments. After discussing with Wangda, the patch uploaded before is not correct due to a misunderstanding of the original problem, and I have changed the Jira title. The intention of this Jira is to fix the calculation of pending resource, considering user-limit, in the preemption scenario. Currently, pending resource calculation in preemption uses the same algorithm as scheduling, which is:
{code:java}
user_limit = min(max(current_capacity / #active_users, current_capacity * user_limit_percent), queue_capacity * user_limit_factor)
{code}
This is good for scheduling because we want to make sure each user gets at least "minimum-user-limit-percent" of the resource, which acts as a lower bound on the user limit. However, we should not cap the total pending resource a leaf queue can get by minimum-user-limit-percent; instead, we want to use user-limit-factor, the upper bound, to capture pending resource in preemption. If we use minimum-user-limit-percent to capture pending resource, resource under-utilization will happen in the preemption scenario. Thus, we suggest the pending resource calculation for preemption use this formula:
{code:java}
total_pending(partition, queue) = min(Q_max(partition) - Q_used(partition), Σ min(User.ulf(partition) - User.used(partition), User.pending(partition)))
{code}
Let me give an example:
{code:java}
        Root
      / |  \  \
     a  b   c  d
    30 30  30 10

1) Only one node (n1) in the cluster; it has 100G.
2) app1 is submitted to queue-a: 10G used, 6G pending.
3) app2 is submitted to queue-b: 40G used, 30G pending.
4) app3 is submitted to queue-c: 50G used, 30G pending.
{code}
Here we only have one user, and the minimum-user-limit-percent / user-limit-factor settings for the queues are:
||Queue name||minimum-user-limit-percent||user-limit-factor||
|a|1|1.0f|
|b|1|2.0f|
|c|1|2.0f|
|d|1|2.0f|
With the old calculation, the user limit for queue-a is 30G, which lets app1 keep its 6G pending; but the user limit for queue-b becomes 40G, which leaves zero headroom after subtracting the 40G used, so the 30G of pending resource cannot be accepted, and the same happens for queue-c. However, if we look at this test case from the preemption point of view, we should allow queue-b and queue-c to take more pending resources: even though queue-a has 30G guaranteed configured, it is under-utilized, and with pending resource captured by the old algorithm, queue-b and queue-c cannot take the available resource through preemption, so cluster resources are not used effectively. To summarize, since user-limit-factor maintains the hard limit on how much resource a user can use, we should calculate pending resource considering user-limit-factor instead of minimum-user-limit-percent. Could you share your opinion on this, [~eepayne]? > Total pending resource calculation in preemption should use user-limit factor > instead of minimum-user-limit-percent > --- > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate total > pending resource based on user-limit percent and user-limit factor, which caps > pending resource for each user to the minimum of user-limit pending and actual > pending. This prevents a queue from taking more pending resource to achieve > queue balance after all queues are satisfied with their ideal allocation. > We need to change the logic to let queue pending go beyond the user limit.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
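The proposed formula can be checked with a minimal sketch using the numbers from the example above. The method and variable names below are illustrative stand-ins, not the actual LeafQueue API; it contrasts queue-b's pending resource under the new user-limit-factor cap with the old minimum-user-limit-percent-derived cap.

```java
public class PendingResourceDemo {
    // total_pending(queue) = min(Q_max - Q_used,
    //                            Σ over users of min(User.ulfCap - User.used, User.pending))
    static long totalPendingNew(long qMax, long qUsed, long[][] users) {
        long sum = 0;
        for (long[] u : users) {                 // u = {ulfCap, used, pending}
            sum += Math.max(0, Math.min(u[0] - u[1], u[2]));
        }
        return Math.min(qMax - qUsed, sum);
    }

    // Old behavior: pending capped by the scheduling-time user limit.
    static long totalPendingOld(long userLimit, long used, long pending) {
        return Math.max(0, Math.min(userLimit - used, pending));
    }

    public static void main(String[] args) {
        // queue-b from the example: capacity 30G, user-limit-factor 2.0
        // => ulfCap = 60G; one user with 40G used and 30G pending.
        long newPending = totalPendingNew(100, 40, new long[][]{{60, 40, 30}});
        // Old calculation: the user limit worked out to 40G => zero headroom.
        long oldPending = totalPendingOld(40, 40, 30);
        System.out.println("queue-b pending: new=" + newPending + "G old=" + oldPending + "G");
    }
}
```

Under the new formula queue-b can expose 20G of pending resource to preemption, whereas the old cap exposes none, matching the under-utilization argument above.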
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573946#comment-16573946 ] Zian Chen commented on YARN-8523: - [~eyang], thanks for raising this feature. It is very useful for live debugging and container diagnosis. We can add a series of interactive commands to let users debug more effectively, like tailing the container log, checking container resource usage, etc. For handling the nodemanager restart scenario, we can register an event listener for the restart or shutdown signal of the node manager web socket and respond in the xterm.js terminal accordingly (e.g. print an NM restart/shutdown message to the user) and retry the reconnect several times after the typical NM restart interval. If the NM hits an unexpected issue and cannot resume its service, that is something the interactive docker shell cannot solve by itself, and we should just give the user a reasonable alert message describing the situation (e.g. "retry failed with timeout, please check the NM log for more information"). I think passing commands through the NM web socket and reusing the container-executor security check would be a good prototype to build first, without taking on the burden of handling a root daemon by carving out another secure channel. > Interactive docker shell > > > Key: YARN-8523 > URL: https://issues.apache.org/jira/browse/YARN-8523 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Priority: Major > Labels: Docker > > Some applications might require interactive unix command execution to carry > out operations. Container-executor can interface with docker exec to debug > or analyze docker containers while the application is running. It would be > nice to support an API to invoke docker exec to perform unix commands and > report the output back to the application master. The application master can > distribute and aggregate execution of the commands and record them in its > log file.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
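The reconnect-with-alert idea could be outlined roughly as follows. This is a hypothetical sketch only: the connectOnce hook, retry count, and back-off interval are illustrative assumptions, not the actual NM web socket API.

```java
import java.util.function.IntPredicate;

public class ReconnectSketch {
    /**
     * Try to re-attach to the NM web socket after a restart signal and return
     * a user-facing status line for the xterm.js terminal.
     */
    static String reconnect(IntPredicate connectOnce, int maxRetries, long backoffMillis) {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            if (connectOnce.test(attempt)) {
                return "Reconnected to NodeManager (attempt " + attempt + ")";
            }
            try {
                Thread.sleep(backoffMillis); // wait roughly one NM restart interval
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        // Retries exhausted: surface a friendly alert instead of hanging the session.
        return "Retry failed with timeout, please check the NM log for more information";
    }

    public static void main(String[] args) {
        // Simulated NM that comes back on the third attempt.
        System.out.println(reconnect(attempt -> attempt >= 3, 5, 10L));
        // Simulated NM that never comes back.
        System.out.println(reconnect(attempt -> false, 3, 10L));
    }
}
```

Bounding the retries and printing a terminal message covers both questions raised earlier: the user is told a resume is in progress, and is told clearly when it gives up.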
[jira] [Comment Edited] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573946#comment-16573946 ] Zian Chen edited comment on YARN-8523 at 8/8/18 10:07 PM: -- [~eyang], thanks for raising this feature. It is very useful for live debugging and container diagnosis. We can add a series of interactive commands to let users debug more effectively, like tailing the container log, checking container resource usage, etc. For handling the nodemanager restart scenario, we can register an event listener for the restart or shutdown signal of the node manager web socket and respond in the xterm.js terminal accordingly (e.g. print an NM restart/shutdown message to the user) and retry the reconnect several times after the typical NM restart interval. If the NM hits an unexpected issue and cannot resume its service, that is something the interactive docker shell cannot solve by itself, and we should just give the user a reasonable alert message describing the situation (e.g. "retry failed with timeout, please check the NM log for more information"). I think passing commands through the NM web socket and reusing the container-executor security check would be a good prototype to build first, without taking on the burden of handling a root daemon by carving out another secure channel. was (Author: zian chen): [~eyang], thanks for raising this feature. This is very useful for live debug of container diagnosis. we can add a series of interactive commands to let user debug more effectively, like tail -f container log, container resource usage, etc. For handling nodemanager restart scenario, we can register a event listener to listen restart or shutdown signal of node manager web socket and respond in xterm js terminal accordingly, (like print out NM restart/shutdown message to user, etc) and do reconnect retries several times after typical nm restart interval.
Again, if NM meet any unexpected issue which can not resume its service, that's something we can not solve on this interactive docker shell by itself and we should just give user reasonable alert message to inform the current situation (like retry failed with timeout, please check NM log to get more information, etc). I think pass command through NM web socket and reuse container-executor security check would be a good prototype we can build first without have too much burden on handling root daemon by carving another secure channel. > Interactive docker shell > > > Key: YARN-8523 > URL: https://issues.apache.org/jira/browse/YARN-8523 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Priority: Major > Labels: Docker > > Some applications might require interactive unix command execution to carry > out operations. Container-executor can interface with docker exec to debug > or analyze docker containers while the application is running. It would be > nice to support an API to invoke docker exec to perform unix commands and > report the output back to the application master. The application master can > distribute and aggregate execution of the commands and record them in its > log file. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8509: Summary: Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent (was: Fix UserLimit calculation for preemption to balance scenario after queue satisfied ) > Total pending resource calculation in preemption should use user-limit factor > instead of minimum-user-limit-percent > --- > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate total > pending resource based on user-limit percent and user-limit factor which will > cap pending resource for each user to the minimum of user-limit pending and > actual pending. This will prevent queue from taking more pending resource to > achieve queue balance after all queue satisfied with its ideal allocation. > > We need to change the logic to let queue pending can go beyond userlimit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570836#comment-16570836 ] Zian Chen commented on YARN-7417: - Updated patch 003 to address the findbugs issue. > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7417.001.patch, YARN-7417.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570838#comment-16570838 ] Zian Chen commented on YARN-7417: - [~sunilg], could you help review the patch? Thanks > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7417.001.patch, YARN-7417.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-7417: Attachment: YARN-7417.002.patch > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7417.001.patch, YARN-7417.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7089) Mark the log-aggregation-controller APIs as public
[ https://issues.apache.org/jira/browse/YARN-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570780#comment-16570780 ] Zian Chen commented on YARN-7089: - Hi [~leftnoteasy], could you help review this patch? Thanks > Mark the log-aggregation-controller APIs as public > -- > > Key: YARN-7089 > URL: https://issues.apache.org/jira/browse/YARN-7089 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7089.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-7417: Attachment: YARN-7417.001.patch > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7417.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568795#comment-16568795 ] Zian Chen commented on YARN-7417: - Thanks [~xgong] for reporting this issue. I'll work on it and provide a patch shortly. > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes
[ https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen reassigned YARN-7417: --- Assignee: Zian Chen > re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to > remove duplicate codes > > > Key: YARN-7417 > URL: https://issues.apache.org/jira/browse/YARN-7417 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566490#comment-16566490 ] Zian Chen commented on YARN-8509: - Thanks [~csingh] for the review. The failed UTs are not related. [~sunilg], can you help commit the patch if everything looks good? Thanks > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate total > pending resource based on user-limit percent and user-limit factor, which caps > pending resource for each user to the minimum of user-limit pending and actual > pending. This prevents a queue from taking more pending resource to achieve > queue balance after all queues are satisfied with their ideal allocation. > We need to change the logic to let queue pending go beyond the user limit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-7089) Mark the log-aggregation-controller APIs as public
[ https://issues.apache.org/jira/browse/YARN-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen reassigned YARN-7089: --- Assignee: Zian Chen (was: Xuan Gong) > Mark the log-aggregation-controller APIs as public > -- > > Key: YARN-7089 > URL: https://issues.apache.org/jira/browse/YARN-7089 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7089) Mark the log-aggregation-controller APIs as public
[ https://issues.apache.org/jira/browse/YARN-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16565895#comment-16565895 ] Zian Chen commented on YARN-7089: - [~djp] [~xgong], [~rkanter] could you help review the patch? Thanks > Mark the log-aggregation-controller APIs as public > -- > > Key: YARN-7089 > URL: https://issues.apache.org/jira/browse/YARN-7089 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7089.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7089) Mark the log-aggregation-controller APIs as public
[ https://issues.apache.org/jira/browse/YARN-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-7089: Attachment: YARN-7089.001.patch > Mark the log-aggregation-controller APIs as public > -- > > Key: YARN-7089 > URL: https://issues.apache.org/jira/browse/YARN-7089 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Zian Chen >Priority: Major > Attachments: YARN-7089.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8509: Attachment: YARN-8509.003.patch > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate total > pending resource based on user-limit percent and user-limit factor which will > cap pending resource for each user to the minimum of user-limit pending and > actual pending. This will prevent queue from taking more pending resource to > achieve queue balance after all queue satisfied with its ideal allocation. > > We need to change the logic to let queue pending can go beyond userlimit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562513#comment-16562513 ] Zian Chen commented on YARN-8509: - Thanks [~csingh] for reviewing the patch. I fixed the javadoc block, removed the debug level, and changed the comments for item 3 to be more straightforward. Does it look better now? > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch, YARN-8509.002.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate total > pending resource based on user-limit percent and user-limit factor, which caps > pending resource for each user to the minimum of user-limit pending and actual > pending. This prevents a queue from taking more pending resource to achieve > queue balance after all queues are satisfied with their ideal allocation. > We need to change the logic to let queue pending go beyond the user limit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8509: Attachment: (was: YARN-8509.003.patch) > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch, YARN-8509.002.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate total > pending resource based on user-limit percent and user-limit factor which will > cap pending resource for each user to the minimum of user-limit pending and > actual pending. This will prevent queue from taking more pending resource to > achieve queue balance after all queue satisfied with its ideal allocation. > > We need to change the logic to let queue pending can go beyond userlimit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8509: Attachment: YARN-8509.003.patch > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate total > pending resource based on user-limit percent and user-limit factor which will > cap pending resource for each user to the minimum of user-limit pending and > actual pending. This will prevent queue from taking more pending resource to > achieve queue balance after all queue satisfied with its ideal allocation. > > We need to change the logic to let queue pending can go beyond userlimit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8522) Application fails with InvalidResourceRequestException
[ https://issues.apache.org/jira/browse/YARN-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558965#comment-16558965 ] Zian Chen commented on YARN-8522: - [~sunilg], could you help review the latest patch? > Application fails with InvalidResourceRequestException > -- > > Key: YARN-8522 > URL: https://issues.apache.org/jira/browse/YARN-8522 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yesha Vora >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8522.001.patch, YARN-8522.002.patch > > > Launch multiple streaming app simultaneously. Here, sometimes one of the > application fails with below stack trace. > {code} > 18/07/02 07:14:32 INFO retry.RetryInvocationHandler: > java.net.ConnectException: Call From xx.xx.xx.xx/xx.xx.xx.xx to > xx.xx.xx.xx:8032 failed on connection exception: java.net.ConnectException: > Connection refused; For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused, while invoking > ApplicationClientProtocolPBClientImpl.submitApplication over null. Retrying > after sleeping for 3ms. 
> 18/07/02 07:14:32 WARN client.RequestHedgingRMFailoverProxyProvider: > Invocation returned exception: > org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid > resource request, only one resource request with * is allowed > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:502) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:645) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) > on [rm2], so propagating back to caller. 
> 18/07/02 07:14:32 INFO mapreduce.JobSubmitter: Cleaning up the staging area > /user/hrt_qa/.staging/job_1530515284077_0007 > 18/07/02 07:14:32 ERROR streaming.StreamJob: Error Launching job : > org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid > resource request, only one resource request with * is allowed > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:502) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:645) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) > Streaming Command Failed!{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsub
[jira] [Commented] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558936#comment-16558936 ] Zian Chen commented on YARN-8509: - Updated the patch to fix the failed UTs. > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch, YARN-8509.002.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate the total > pending resource based on the user-limit percent and user-limit factor, which > caps the pending resource for each user to the minimum of the user-limit pending and > the actual pending. This prevents the queue from taking more pending resource to > achieve queue balance after all queues are satisfied with their ideal allocation. > > We need to change the logic so that queue pending can go beyond the user limit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
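The capping behavior described in the YARN-8509 issue text above can be sketched as follows. This is a hypothetical simplification, not the actual LeafQueue code; the class, method, and variable names are illustrative assumptions.

```java
// Hypothetical sketch of the per-user pending-resource capping described in
// YARN-8509; NOT the actual LeafQueue implementation. Names are illustrative.
public class UserLimitPendingSketch {

    // Current behavior: each user's counted pending resource is capped at the
    // headroom left under the user limit, i.e. the minimum of the user-limit
    // pending and the actual pending, as the issue description puts it.
    static long cappedPending(long actualPending, long userLimit, long userUsed) {
        long userLimitPending = Math.max(0, userLimit - userUsed);
        return Math.min(actualPending, userLimitPending);
    }

    // Proposed behavior: count the full actual pending, letting the queue's
    // total pending go beyond the user limit.
    static long uncappedPending(long actualPending) {
        return actualPending;
    }

    public static void main(String[] args) {
        // A user with limit 100 and usage 80 asking for 50 more:
        // the old logic counts only 20; the proposed logic counts all 50.
        System.out.println(cappedPending(50, 100, 80));  // 20
        System.out.println(uncappedPending(50));         // 50
    }
}
```

With the capped variant, a queue whose users are all near their limits reports almost no pending resource, which is why preemption-to-balance stalls once every queue reaches its ideal allocation.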
[jira] [Updated] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8509: Attachment: YARN-8509.002.patch > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch, YARN-8509.002.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate the total > pending resource based on the user-limit percent and user-limit factor, which > caps the pending resource for each user to the minimum of the user-limit pending and > the actual pending. This prevents the queue from taking more pending resource to > achieve queue balance after all queues are satisfied with their ideal allocation. > > We need to change the logic so that queue pending can go beyond the user limit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8509: Description: In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate the total pending resource based on the user-limit percent and user-limit factor, which caps the pending resource for each user to the minimum of the user-limit pending and the actual pending. This prevents the queue from taking more pending resource to achieve queue balance after all queues are satisfied with their ideal allocation. We need to change the logic so that queue pending can go beyond the user limit. was: In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate the total pending resource based on the user-limit percent and user-limit factor, which caps the pending resource for each user to the minimum of the user-limit pending and the actual pending. This prevents the queue from taking more pending resource to achieve queue balance after all queues are satisfied with their ideal allocation. We need to change the logic so that queue pending can reach at most (Queue_max_capacity - Queue_used_capacity). > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate the total > pending resource based on the user-limit percent and user-limit factor, which > caps the pending resource for each user to the minimum of the user-limit pending and > the actual pending. This prevents the queue from taking more pending resource to > achieve queue balance after all queues are satisfied with their ideal allocation. > > We need to change the logic so that queue pending can go beyond the user limit. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558882#comment-16558882 ] Zian Chen commented on YARN-8509: - Talked with [~sunilg]: it can go beyond maxCap - usedCap, because a user can ask for 1maps while the cluster can run a max of 1000. In that case, as soon as each map finishes, another pending one gets scheduled. > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate the total > pending resource based on the user-limit percent and user-limit factor, which > caps the pending resource for each user to the minimum of the user-limit pending and > the actual pending. This prevents the queue from taking more pending resource to > achieve queue balance after all queues are satisfied with their ideal allocation. > > We need to change the logic so that queue pending can reach at most > (Queue_max_capacity - Queue_used_capacity). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
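The reasoning in the comment above — that pending demand can legitimately exceed the queue's free capacity because finished containers keep freeing slots for pending ones — can be illustrated with a back-of-the-envelope sketch. The class, method names, and concrete numbers below are made up for illustration only.

```java
// Hypothetical illustration of the comment above: even if a cluster can run
// at most `maxConcurrent` maps at once, an arbitrarily large pending backlog
// still drains, because each finished map frees a slot for a pending one.
public class BacklogDrainSketch {

    // Number of scheduling "waves" needed to run `pending` maps when at most
    // `maxConcurrent` can run at the same time (ceiling division).
    static long wavesToDrain(long pending, long maxConcurrent) {
        return (pending + maxConcurrent - 1) / maxConcurrent;
    }

    public static void main(String[] args) {
        // 10,000 pending maps on a cluster that runs at most 1,000 at a time:
        // the backlog drains in 10 waves, so capping counted pending at the
        // queue's free capacity would undercount legitimate demand.
        System.out.println(wavesToDrain(10_000, 1_000));  // 10
    }
}
```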
[jira] [Updated] (YARN-8522) Application fails with InvalidResourceRequestException
[ https://issues.apache.org/jira/browse/YARN-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8522: Attachment: YARN-8522.002.patch > Application fails with InvalidResourceRequestException > -- > > Key: YARN-8522 > URL: https://issues.apache.org/jira/browse/YARN-8522 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yesha Vora >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8522.001.patch, YARN-8522.002.patch > > > Launch multiple streaming apps simultaneously. Sometimes one of the > applications fails with the stack trace below. > {code} > 18/07/02 07:14:32 INFO retry.RetryInvocationHandler: > java.net.ConnectException: Call From xx.xx.xx.xx/xx.xx.xx.xx to > xx.xx.xx.xx:8032 failed on connection exception: java.net.ConnectException: > Connection refused; For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused, while invoking > ApplicationClientProtocolPBClientImpl.submitApplication over null. Retrying > after sleeping for 3ms. 
> 18/07/02 07:14:32 WARN client.RequestHedgingRMFailoverProxyProvider: > Invocation returned exception: > org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid > resource request, only one resource request with * is allowed > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:502) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:645) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) > on [rm2], so propagating back to caller. 
> 18/07/02 07:14:32 INFO mapreduce.JobSubmitter: Cleaning up the staging area > /user/hrt_qa/.staging/job_1530515284077_0007 > 18/07/02 07:14:32 ERROR streaming.StreamJob: Error Launching job : > org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid > resource request, only one resource request with * is allowed > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:502) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:645) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) > Streaming Command Failed!{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional
[jira] [Commented] (YARN-8522) Application fails with InvalidResourceRequestException
[ https://issues.apache.org/jira/browse/YARN-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558870#comment-16558870 ] Zian Chen commented on YARN-8522: - Thanks for the suggestions, [~sunilg]. Updated patch 002. > Application fails with InvalidResourceRequestException > -- > > Key: YARN-8522 > URL: https://issues.apache.org/jira/browse/YARN-8522 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yesha Vora >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8522.001.patch, YARN-8522.002.patch > > > Launch multiple streaming apps simultaneously. Sometimes one of the > applications fails with the stack trace below. > {code} > 18/07/02 07:14:32 INFO retry.RetryInvocationHandler: > java.net.ConnectException: Call From xx.xx.xx.xx/xx.xx.xx.xx to > xx.xx.xx.xx:8032 failed on connection exception: java.net.ConnectException: > Connection refused; For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused, while invoking > ApplicationClientProtocolPBClientImpl.submitApplication over null. Retrying > after sleeping for 3ms. 
> 18/07/02 07:14:32 WARN client.RequestHedgingRMFailoverProxyProvider: > Invocation returned exception: > org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid > resource request, only one resource request with * is allowed > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:502) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:645) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) > on [rm2], so propagating back to caller. 
> 18/07/02 07:14:32 INFO mapreduce.JobSubmitter: Cleaning up the staging area > /user/hrt_qa/.staging/job_1530515284077_0007 > 18/07/02 07:14:32 ERROR streaming.StreamJob: Error Launching job : > org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid > resource request, only one resource request with * is allowed > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:502) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:645) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) > Streaming Command Failed!{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To
[jira] [Commented] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554956#comment-16554956 ] Zian Chen commented on YARN-8509: - [~sunilg], I found something interesting while addressing the failed UT TestContainerAllocation#testPendingResourcesConsideringUserLimit: after we change the logic in patch 001, we actually allow the pending resource to reach whatever is pending on the current app, regardless of the max-capacity hard limit. I didn't notice this before in my own UTs. My opinion is that the pending resource can go beyond the user limit, but still cannot go beyond the maxCap - usedCap limit. What's your opinion? > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate the total > pending resource based on the user-limit percent and user-limit factor, which > caps the pending resource for each user to the minimum of the user-limit pending and > the actual pending. This prevents the queue from taking more pending resource to > achieve queue balance after all queues are satisfied with their ideal allocation. > > We need to change the logic so that queue pending can reach at most > (Queue_max_capacity - Queue_used_capacity). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
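The middle ground discussed in this comment — letting pending exceed the user limit while still clamping it to the queue's remaining capacity — could be sketched as below. This is a hypothetical sketch of the idea under discussion, not the eventual patch; all names are illustrative.

```java
// Hypothetical sketch of the compromise discussed in this comment: ignore the
// per-user limit when counting pending, but clamp the queue's total pending
// to its remaining capacity (maxCap - usedCap). NOT the actual patch.
public class QueueHeadroomClampSketch {

    static long clampToQueueHeadroom(long totalPending, long maxCap, long usedCap) {
        long headroom = Math.max(0, maxCap - usedCap);
        return Math.min(totalPending, headroom);
    }

    public static void main(String[] args) {
        // Queue max 1000, used 600: a 700-unit backlog counts as only 400.
        System.out.println(clampToQueueHeadroom(700, 1000, 600));   // 400
        // A fully-used queue contributes no pending under this rule.
        System.out.println(clampToQueueHeadroom(700, 1000, 1000));  // 0
    }
}
```

The follow-up comment in this thread argues against even this clamp: since containers finish and free capacity over time, demand beyond maxCap - usedCap is still real demand.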
[jira] [Commented] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16551467#comment-16551467 ] Zian Chen commented on YARN-8509: - [~sunilg], Let me address the failed UTs first. > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate the total > pending resource based on the user-limit percent and user-limit factor, which > caps the pending resource for each user to the minimum of the user-limit pending and > the actual pending. This prevents the queue from taking more pending resource to > achieve queue balance after all queues are satisfied with their ideal allocation. > > We need to change the logic so that queue pending can reach at most > (Queue_max_capacity - Queue_used_capacity). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16551334#comment-16551334 ] Zian Chen commented on YARN-8509: - Uploaded the first patch for review. [~sunilg], could you help review the patch? Thanks > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate the total > pending resource based on the user-limit percent and user-limit factor, which > caps the pending resource for each user to the minimum of the user-limit pending and > the actual pending. This prevents the queue from taking more pending resource to > achieve queue balance after all queues are satisfied with their ideal allocation. > > We need to change the logic so that queue pending can reach at most > (Queue_max_capacity - Queue_used_capacity). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8509) Fix UserLimit calculation for preemption to balance scenario after queue satisfied
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8509: Attachment: YARN-8509.001.patch > Fix UserLimit calculation for preemption to balance scenario after queue > satisfied > > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8509.001.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate the total > pending resource based on the user-limit percent and user-limit factor, which > caps the pending resource for each user to the minimum of the user-limit pending and > the actual pending. This prevents the queue from taking more pending resource to > achieve queue balance after all queues are satisfied with their ideal allocation. > > We need to change the logic so that queue pending can reach at most > (Queue_max_capacity - Queue_used_capacity). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org