[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuqi Wang updated YARN-7872:
Description:

A labeled node (i.e. a node with a non-empty node label) cannot be used to satisfy a locality-specified request (i.e. a container request with a non-ANY resource name and relax locality set to false).

For example, the node with available resource:

[Resource: [MemoryMB: [100] CpuNumber: [12]] NodeLabel: [persistent] HostName: {SRG} RackName: {/default-rack}]

The container request:

[Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] NodeLabel: [null] HostNames: {SRG} RackNames: {} RelaxLocality: [false]]

The current RM capacity scheduler behavior is that the node cannot allocate a container for this request, because the node label fails to match during the leaf queue's container assignment.

However, node locality and node label should be two orthogonal dimensions for selecting candidate nodes for a container request, and node label matching should only be performed for container requests with the ANY resource name, since only that kind of request is allowed to carry a non-empty node label.

So, for a container request with a non-ANY resource name (which therefore cannot carry a node label), we should match the node by resource name instead of by node label. This resource-name matching is safe, because a node whose label is not accessible to the queue is never offered to the leaf queue.

*Attachment is the fix according to this principle; please help to review.*
*Without it, we cannot use locality to request containers on these labeled nodes.*
*If the fix is acceptable, we should also recheck whether the same issue exists in trunk and other Hadoop versions.*

> labeled node cannot be used to satisfy locality specified request
> -----------------------------------------------------------------
>
>                 Key: YARN-7872
>                 URL: https://issues.apache.org/jira/browse/YARN-7872
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler, resourcemanager
>    Affects Versions: 2.7.2
>            Reporter: Yuqi Wang
>            Assignee: Yuqi Wang
>            Priority: Blocker
>             Fix For: 2.7.2
>
>         Attachments: YARN-7872-branch-2.7.2.001.patch
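The orthogonal-dimension rule proposed above can be sketched as a simplified model. This is hypothetical illustration code, not the actual CapacityScheduler implementation; here the resource name "*" stands for the ANY resource name, and the method and parameter names are made up:

```java
// Simplified model of the proposed matching rule: node-label matching
// applies only to ANY-resource-name requests, while host-specific
// requests are matched by resource name alone.
public class LocalityLabelMatch {

    static boolean canAssign(String requestResourceName,
                             String requestNodeLabel,
                             String nodeHostName,
                             String nodeLabel) {
        if ("*".equals(requestResourceName)) {
            // ANY request: only this kind of request may carry a node
            // label, so match on the label dimension.
            String wanted = requestNodeLabel == null ? "" : requestNodeLabel;
            return wanted.equals(nodeLabel == null ? "" : nodeLabel);
        }
        // Host-specific request: it cannot carry a label, so match on
        // the resource name only. Label accessibility was already
        // enforced before the node reached the leaf queue.
        return requestResourceName.equals(nodeHostName);
    }

    public static void main(String[] args) {
        // The example from the report: node "SRG" labeled "persistent",
        // request pinned to host "SRG" with no label.
        System.out.println(canAssign("SRG", null, "SRG", "persistent")); // true under the proposed fix
        System.out.println(canAssign("*", null, "SRG", "persistent"));   // false: ANY request without the label
    }
}
```

Under this model the host-pinned request from the example is satisfiable by the labeled node, while an unlabeled ANY request still cannot land on it.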
[jira] [Commented] (YARN-7446) Docker container privileged mode and --user flag contradict each other
[ https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348557#comment-16348557 ]

Shane Kumpf commented on YARN-7446:
---
Do you believe that all --privileged containers should run as the root user? If so, please hard code --user 0:0 as the user in this patch and we'll get this wrapped up.

> Docker container privileged mode and --user flag contradict each other
> ----------------------------------------------------------------------
>
>                 Key: YARN-7446
>                 URL: https://issues.apache.org/jira/browse/YARN-7446
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>         Attachments: YARN-7446.001.patch
>
>
> In the current implementation, when privileged=true, the --user flag is also
> passed to docker when launching the container. In reality, the container then has
> no way to use root privileges unless the image contains a sticky bit or a sudoers
> entry that lets the specified user regain privileges. To avoid this cycle of
> dropping and reacquiring root privileges, we can stop specifying both flags
> together: when privileged mode is enabled, the --user flag should be omitted;
> when non-privileged mode is used, the --user flag is supplied.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
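The rule under discussion (make --privileged and --user mutually exclusive when building the docker run command line) could be sketched roughly as below. This is an illustrative model only, not YARN's actual container-runtime code, and the user and image values are made-up examples:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: assemble "docker run" arguments so that
// --privileged and --user are never passed together.
public class DockerRunArgs {

    static List<String> buildRunArgs(boolean privileged, String user, String image) {
        List<String> args = new ArrayList<>();
        args.add("docker");
        args.add("run");
        if (privileged) {
            // A privileged container runs as root; passing --user as well
            // would only force the image to re-escalate via setuid/sudo.
            args.add("--privileged");
        } else {
            // Non-privileged mode: drop to the requested user as usual.
            args.add("--user");
            args.add(user);
        }
        args.add(image);
        return args;
    }
}
```

For example, `buildRunArgs(true, "nobody", "centos:7")` yields `docker run --privileged centos:7` with no --user flag, matching the behavior proposed in the issue description.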
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuqi Wang updated YARN-7872:
Attachment: YARN-7872-branch-2.7.2.001.patch
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuqi Wang updated YARN-7872:
Description:

A labeled node (i.e. a node with a non-empty node label) cannot be used to satisfy a locality-specified request (i.e. a container request with a non-ANY resource name and relax locality set to false).

For example, the node with available resource:

[Resource: [MemoryMB: [100] CpuNumber: [12]] NodeLabel: [persistent] HostName: {SRG} RackName: {/default-rack}]

The container request:

[Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] NodeLabel: [null] HostNames: {SRG} RackNames: {} RelaxLocality: [false]]

The current RM capacity scheduler behavior is that the node cannot allocate a container for this request, because the node label fails to match during the leaf queue's container assignment.

However, node locality and node label should be two orthogonal dimensions for selecting candidate nodes for a container request, and node label matching should only be performed for container requests with the ANY resource name, since only that kind of request is allowed to carry a non-empty node label.

So, for a container request with a non-ANY resource name (which therefore cannot carry a node label), we should match the node by resource name instead of by node label. This resource-name matching is safe, because a node whose label is not accessible to the queue is never offered to the leaf queue.

*Attachment is the fix according to this principle; please help to review.*
*Without it, we cannot use locality to request containers on these labeled nodes.*
*If the fix is acceptable, we should also recheck whether the same issue exists in trunk and other Hadoop versions.*
*If it is not acceptable (i.e. the current behavior is by design), then how can we use locality to request containers on labeled nodes?*
[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes
[ https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348579#comment-16348579 ]

Sunil G commented on YARN-7840:
---
I am fine with the latest patch. +1. I'll commit tomorrow if there are no objections.

> Update PB for prefix support of node attributes
> -----------------------------------------------
>
>                 Key: YARN-7840
>                 URL: https://issues.apache.org/jira/browse/YARN-7840
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Weiwei Yang
>            Assignee: Naganarasimha G R
>            Priority: Blocker
>         Attachments: YARN-7840-YARN-3409.001.patch, YARN-7840-YARN-3409.002.patch,
>                      YARN-7840-YARN-3409.003.patch, YARN-7840-YARN-3409.004.patch
>
>
> We need to support a prefix (namespace) for node attributes; this adds the
> flexibility to do proper ACLs, avoid naming conflicts, etc.
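For illustration, prefix (namespace) support of the kind described can be modeled as splitting an attribute key into a namespace part and a name part. This is a hypothetical sketch, not the protobuf/API definition from the patches, and the prefix strings used are only examples:

```java
// Hypothetical model of a namespaced node-attribute key: the prefix
// gives each attribute a namespace so ACLs can be scoped per prefix
// and identically named attributes from different sources don't clash.
public class NodeAttributeKey {
    final String prefix;
    final String name;

    NodeAttributeKey(String prefix, String name) {
        this.prefix = prefix;
        this.name = name;
    }

    // Parse "prefix/name"; fall back to a default namespace when the
    // attribute carries no explicit prefix.
    static NodeAttributeKey parse(String raw, String defaultPrefix) {
        int slash = raw.lastIndexOf('/');
        if (slash < 0) {
            return new NodeAttributeKey(defaultPrefix, raw);
        }
        return new NodeAttributeKey(raw.substring(0, slash), raw.substring(slash + 1));
    }
}
```

With a scheme like this, "rm.yarn.io/hostname" and a bare "hostname" resolve to different namespaces, which is the conflict-avoidance the comment refers to.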
[jira] [Updated] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gergely Novák updated YARN-7850:
Attachment: Screen Shot 2018-02-01 at 11.34.36.png

> New UI does not show status for Log Aggregation
> -----------------------------------------------
>
>                 Key: YARN-7850
>                 URL: https://issues.apache.org/jira/browse/YARN-7850
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-ui-v2
>            Reporter: Yesha Vora
>            Assignee: Gergely Novák
>            Priority: Major
>         Attachments: Screen Shot 2018-02-01 at 11.34.36.png, YARN-7850.001.patch
>
>
> The status of log aggregation is not shown anywhere.
> The new UI should show the log aggregation status for finished applications.
[jira] [Updated] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gergely Novák updated YARN-7850:
Attachment: YARN-7850.001.patch
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behaiour is: The node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (besides, it should not have node label), we should use resource name to match with the node instead of node label to match with the node. And it should be safe, since the node which is not accessible for the queue will not be sent in the leaf queue. Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk. was: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. 
container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that the node cannot allocate a container for the request, because the node label is not matched when the leaf queue assigns a container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for a container request. And node label matching should only be executed for a container request with the ANY resource name, since only this kind of container request is allowed to have a 'not empty' node label. So, for a container request with a 'not ANY' resource name (besides, it should not have a node label), we should use the resource name to match with the node instead of the node label. And it should be safe, since a node which is not accessible for the queue will not be sent to the leaf queue. Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request containers on these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk. > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.8.0, 2.7.2 > > > labeled node (i.e.
node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > For example: > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: > \{/default-rack}] > The container request: > [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] > {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: > \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] > Current RM capacity scheduler's behavior is: > The node cannot allocate container for the request because of the node label > not matched in the leaf queue assign container. > However, node locality and node label should be two
[jira] [Commented] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348591#comment-16348591 ] genericqa commented on YARN-7872: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 8m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} branch-2.7.2 Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 1m 51s{color} | {color:red} root in branch-2.7.2 failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 29s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed with JDK v1.8.0_151. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed with JDK v9-internal. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} branch-2.7.2 passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 17s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed. 
{color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed with JDK v1.8.0_151. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed with JDK v9-internal. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_151. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_151. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v9-internal. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v9-internal. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 816 unchanged - 1 fixed = 818 total (was 817) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 11s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_151. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v9-internal. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v9-internal. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 15m 17s{color} | {color:black} {color} | \\ \\ ||
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that (at least for versions 2.7 and 2.8), the node cannot allocate a container for the request, because the node label is not matched when the leaf queue assigns a container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for a container request. And the node label matching should only be executed for a container request with the ANY resource name, since only this kind of container request is allowed to have a 'not empty' node label. So, for a container request with a 'not ANY' resource name (so, we clearly know it should not have a node label), we should use the requested resource name to match with the node instead of using the requested node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request containers within these labeled nodes.
If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If it is not acceptable (i.e. the current behavior is by design), how can we use locality to request containers within these labeled nodes? was: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that (at least for versions 2.7 and 2.8), the node cannot allocate a container for the request, because the node label is not matched when the leaf queue assigns a container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for a container request. And the node label matching should only be executed for a container request with the ANY resource name, since only this kind of container request is allowed to have a 'not empty' node label. So, for a container request with a 'not ANY' resource name (so, we know it should not have a node label), we should use the requested resource name to match with the node instead of using the requested node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request containers within these labeled nodes.
If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If it is not acceptable (i.e. the current behavior is by design), how can we use locality to request containers within these labeled nodes? > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > *Issue summary:* > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request
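The proposed rule (match by node label only for ANY requests; match by resource name, ignoring the node label, for locality-specified requests) can be sketched as a tiny self-contained Java model. This is not the actual CapacityScheduler code from the patch; the class and method names are hypothetical, chosen only to illustrate the principle:

```java
/**
 * Minimal sketch of the matching principle proposed above.
 * NOT CapacityScheduler code; all names here are hypothetical.
 */
public class NodeMatchSketch {
    static final String ANY = "*"; // ResourceRequest.ANY in YARN is "*"

    /**
     * Decide whether a node can serve a request under the proposed rule:
     * - ANY requests may carry a node label, so match by node label;
     * - host-specific requests carry no label, so match by resource name only.
     */
    static boolean canAssign(String reqResourceName, String reqLabel,
                             String nodeHost, String nodeLabel) {
        if (ANY.equals(reqResourceName)) {
            // ANY request: node label matching applies
            // (null or empty label both mean "no label")
            String want = (reqLabel == null) ? "" : reqLabel;
            String have = (nodeLabel == null) ? "" : nodeLabel;
            return want.equals(have);
        }
        // Locality-specified request: ignore the node label entirely and
        // match on the requested resource name (host) instead.
        return reqResourceName.equals(nodeHost);
    }

    public static void main(String[] args) {
        // The example from the description: labeled node "SRG" (label
        // "persistent"), request pinned to host "SRG" with no label.
        System.out.println(canAssign("SRG", null, "SRG", "persistent")); // true
        // An unlabeled ANY request still must not land on the labeled node.
        System.out.println(canAssign(ANY, null, "SRG", "persistent"));   // false
    }
}
```

Under this sketch the example request in the description is assignable to the labeled node, while label enforcement for ANY requests is unchanged; the queue-level accessibility check (which nodes are sent to the leaf queue at all) is assumed to happen before this point, as the description argues.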
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that (at least for versions 2.7 and 2.8), the node cannot allocate a container for the request, because the node label is not matched when the leaf queue assigns a container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for a container request. And the node label matching should only be executed for a container request with the ANY resource name, since only this kind of container request is allowed to have a 'not empty' node label. So, for a container request with a 'not ANY' resource name (so, we know it should not have a node label), we should use the requested resource name to match with the node instead of using the requested node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request containers within these labeled nodes.
If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If it is not acceptable (i.e. the current behavior is by design), how can we use locality to request containers within these labeled nodes? was: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that (at least for versions 2.7 and 2.8), the node cannot allocate a container for the request, because the node label is not matched when the leaf queue assigns a container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for a container request. And the node label matching should only be executed for a container request with the ANY resource name, since only this kind of container request is allowed to have a 'not empty' node label. So, for a container request with a 'not ANY' resource name (so, we know it should not have a node label), we should use the resource name to match with the node instead of using the node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request containers within these labeled nodes.
If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If it is not acceptable (i.e. the current behavior is by design), how can we use locality to request containers within these labeled nodes? > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > *Issue summary:* > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not
[jira] [Commented] (YARN-7829) Rebalance UI2 cluster overview page
[ https://issues.apache.org/jira/browse/YARN-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348609#comment-16348609 ] Gergely Novák commented on YARN-7829: - [~sunilg] Please find the attached screenshot. All I did was move the Node Managers to the 2nd row; I didn't touch the resources. > Rebalance UI2 cluster overview page > --- > > Key: YARN-7829 > URL: https://issues.apache.org/jira/browse/YARN-7829 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7829.001.patch, YARN-7829.jpg, > ui2-cluster-overview.png > > > The cluster overview page looks like an upside-down triangle. It would be > nice to rebalance the charts to ensure horizontal real estate are utilized > properly. The screenshot attachment includes some suggestion for rebalance. > Node Manager status and cluster resource are closely related, it would be > nice to promote the chart to first row. Application Status, and Resource > Availability are closely related. It would be nice to promote Resource usage > to side by side with Application Status to fill up the horizontal real > estates. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-7850: Attachment: Screen Shot 2018-02-01 at 11.37.30.png > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > Attachments: Screen Shot 2018-02-01 at 11.37.30.png, > YARN-7850.001.patch > > > The status of Log Aggregation is not specified anywhere. > New UI should show the Log aggregation status for finished application.
[jira] [Updated] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-7850: Attachment: (was: Screen Shot 2018-02-01 at 11.34.36.png) > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7850.001.patch > > > The status of Log Aggregation is not specified anywhere. > New UI should show the Log aggregation status for finished application.
[jira] [Updated] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-7850: Attachment: YARN-7850.001.patch > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7850.001.patch > > > The status of Log Aggregation is not specified anywhere. > New UI should show the Log aggregation status for finished application.
[jira] [Updated] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-7850: Attachment: (was: YARN-7850.001.patch) > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7850.001.patch > > > The status of Log Aggregation is not specified anywhere. > New UI should show the Log aggregation status for finished application.
[jira] [Commented] (YARN-7829) Rebalance UI2 cluster overview page
[ https://issues.apache.org/jira/browse/YARN-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348573#comment-16348573 ] Sunil G commented on YARN-7829: --- [~GergelyNovak] Could you please attach a screenshot as per the latest patch? Also, please keep the resources on the same line (GPU etc. to be shown later). > Rebalance UI2 cluster overview page > --- > > Key: YARN-7829 > URL: https://issues.apache.org/jira/browse/YARN-7829 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7829.001.patch, ui2-cluster-overview.png > > > The cluster overview page looks like an upside-down triangle. It would be > nice to rebalance the charts to ensure horizontal real estate are utilized > properly. The screenshot attachment includes some suggestion for rebalance. > Node Manager status and cluster resource are closely related, it would be > nice to promote the chart to first row. Application Status, and Resource > Availability are closely related. It would be nice to promote Resource usage > to side by side with Application Status to fill up the horizontal real > estates.
[jira] [Commented] (YARN-7829) Rebalance UI2 cluster overview page
[ https://issues.apache.org/jira/browse/YARN-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348613#comment-16348613 ] Sunil G commented on YARN-7829: --- Perfect! Thanks [~GergelyNovak]. > Rebalance UI2 cluster overview page > --- > > Key: YARN-7829 > URL: https://issues.apache.org/jira/browse/YARN-7829 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7829.001.patch, YARN-7829.jpg, > ui2-cluster-overview.png > > > The cluster overview page looks like an upside-down triangle. It would be > nice to rebalance the charts to ensure horizontal real estate are utilized > properly. The screenshot attachment includes some suggestion for rebalance. > Node Manager status and cluster resource are closely related, it would be > nice to promote the chart to first row. Application Status, and Resource > Availability are closely related. It would be nice to promote Resource usage > to side by side with Application Status to fill up the horizontal real > estates.
[jira] [Assigned] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák reassigned YARN-7850: --- Assignee: Gergely Novák > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > > The status of Log Aggregation is not specified anywhere. > New UI should show the Log aggregation status for finished application.
[jira] [Commented] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348366#comment-16348366 ] Gergely Novák commented on YARN-7850: - In patch #1 I added Log Aggregation Status to the Logs tab. However, the old UI offers more than that: on \{rm}:8088/cluster/logaggregationstatus/\{app_id} it shows a table with all the affected nodes, their log aggregation statuses, and diagnostic messages. In order to present the same on the new UI we need to add this information to the RM Web Services by creating a new API endpoint. [~yeshavora] Can I open a separate ticket for that? > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > Attachments: Screen Shot 2018-02-01 at 11.37.30.png, > YARN-7850.001.patch > > > The status of Log Aggregation is not specified anywhere. > New UI should show the Log aggregation status for finished application.
[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes
[ https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348377#comment-16348377 ] genericqa commented on YARN-7840: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} YARN-3409 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 52s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 12s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 33s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 58s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 13s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in YARN-3409 has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s{color} | {color:green} YARN-3409 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 56s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 10 unchanged - 1 fixed = 10 total (was 11) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 17s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s{color} | {color:green} hadoop-yarn-api in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 14s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 70m 16s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7840 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908758/YARN-7840-YARN-3409.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux ba3d48b04260 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Target Version/s: 2.7.2 (was: 2.8.0, 2.7.2) > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > For example: > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: > \{/default-rack}] > The container request: > [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] > {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: > \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] > Current RM capacity scheduler's behavior is that, the node cannot allocate > container for the request because of the node label not matched in the leaf > queue assign container. > However, node locality and node label should be two orthogonal dimensions to > select candidate nodes for container request. And the node label matching > should only be executed for container request with ANY resource name, since > only this kind of container request is allowed to have 'not empty' node label. > So, for container request with 'not ANY' resource name (so, we know it should > not have node label), we should use resource name to match with the node > instead of using node label to match with the node. 
And this resource name > matching should be safe, since the node whose node label is not accessible > for the queue will not be sent to the leaf queue. > Attachment is the fix according to this principle, please help to review. > Without it, we cannot use locality to request container within these labeled > nodes. > If the fix is acceptable, we should also recheck whether the same issue > happens in trunk. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Fix Version/s: (was: 2.8.0) > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > For example: > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: > \{/default-rack}] > The container request: > [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] > {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: > \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] > Current RM capacity scheduler's behavior is that, the node cannot allocate > container for the request because of the node label not matched in the leaf > queue assign container. > However, node locality and node label should be two orthogonal dimensions to > select candidate nodes for container request. And the node label matching > should only be executed for container request with ANY resource name, since > only this kind of container request is allowed to have 'not empty' node label. > So, for container request with 'not ANY' resource name (so, we know it should > not have node label), we should use resource name to match with the node > instead of using node label to match with the node. 
And this resource name > matching should be safe, since the node whose node label is not accessible > for the queue will not be sent to the leaf queue. > Attachment is the fix according to this principle, please help to review. > Without it, we cannot use locality to request container within these labeled > nodes. > If the fix is acceptable, we should also recheck whether the same issue > happens in trunk. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that (at least for version 2.7 and 2.8), the node cannot allocate container for the request, because the node label is not matched when the leaf queue assign container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If not acceptable (i.e. 
the current behavior is by designed), so, how can we use locality to request container within these labeled nodes? was: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request, because the node label is not matched when the leaf queue assign container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If not acceptable (i.e. 
the current behavior is by designed), so, how can we use locality to request container within these labeled nodes? > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > *Issue summary:* > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > >
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request, because the node label is not matched when the leaf queue assign container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If not acceptable (i.e. 
the current behavior is by designed), so, how can we use locality to request container within these labeled nodes? was: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If not acceptable (i.e. 
the current behavior is by designed), so, how can we use locality to request container within these labeled nodes? > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > *Issue summary:* > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > > *For example:* > The node with available
[jira] [Updated] (YARN-7829) Rebalance UI2 cluster overview page
[ https://issues.apache.org/jira/browse/YARN-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-7829: Attachment: YARN-7829.jpg > Rebalance UI2 cluster overview page > --- > > Key: YARN-7829 > URL: https://issues.apache.org/jira/browse/YARN-7829 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7829.001.patch, YARN-7829.jpg, > ui2-cluster-overview.png > > > The cluster overview page looks like a upside down triangle. It would be > nice to rebalance the charts to ensure horizontal real estate are utilized > properly. The screenshot attachment includes some suggestion for rebalance. > Node Manager status and cluster resource are closely related, it would be > nice to promote the chart to first row. Application Status, and Resource > Availability are closely related. It would be nice to promote Resource usage > to side by side with Application Status to fill up the horizontal real > estates. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7872) labeled node cannot be used to satisfy locality specified request
Yuqi Wang created YARN-7872: --- Summary: labeled node cannot be used to satisfy locality specified request Key: YARN-7872 URL: https://issues.apache.org/jira/browse/YARN-7872 Project: Hadoop YARN Issue Type: Bug Components: capacity scheduler, capacityscheduler, resourcemanager Affects Versions: 2.7.2 Reporter: Yuqi Wang Assignee: Yuqi Wang Fix For: 2.7.2, 2.8.0 labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behaiour is: The node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (besides, it should not have node label), we should use resource name to match with the node instead of node label to match with the node. And it should be safe, since the node which is not accessible for the queue will not be sent in the leaf queue. Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. 
If the fix is acceptable, we should also recheck whether the same issue happens in trunk. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348684#comment-16348684 ] Jim Brennan commented on YARN-7677: --- Thanks [~jlowe] I will put up a patch for branch-2. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (besides, it should not have node label), we should use resource name to match with the node instead of node label to match with the node. And it should be safe, since the node which is not accessible for the queue will not be sent in the leaf queue. Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk. was: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. 
container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behaiour is: The node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (besides, it should not have node label), we should use resource name to match with the node instead of node label to match with the node. And it should be safe, since the node which is not accessible for the queue will not be sent in the leaf queue. Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk. > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.8.0, 2.7.2 > > > labeled node (i.e. 
node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > For example: > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: > \{/default-rack}] > The container request: > [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] > {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: > \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] > Current RM capacity scheduler's behavior is that, the node cannot allocate > container for the request because of the node label not matched in the leaf > queue assign container. > However, node locality and node label should be two
[jira] [Comment Edited] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348571#comment-16348571 ] Yuqi Wang edited comment on YARN-7872 at 2/1/18 1:26 PM: - Just a init to trigger Jenkins. was (Author: yqwang): Just a init try > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > For example: > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: > \{/default-rack}] > The container request: > [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] > {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: > \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] > Current RM capacity scheduler's behavior is that, the node cannot allocate > container for the request because of the node label not matched in the leaf > queue assign container. > However, node locality and node label should be two orthogonal dimensions to > select candidate nodes for container request. And the node label matching > should only be executed for container request with ANY resource name, since > only this kind of container request is allowed to have 'not empty' node label. 
> So, for container request with 'not ANY' resource name (so, we know it should > not have node label), we should use resource name to match with the node > instead of using node label to match with the node. And this resource name > matching should be safe, since the node whose node label is not accessible > for the queue will not be sent to the leaf queue. > Attachment is the fix according to this principle, please help to review. > Without it, we cannot use locality to request container within these labeled > nodes. > If the fix is acceptable, we should also recheck whether the same issue > happens in trunk and other hadoop versions. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Attachment is the fix according to this principle, please help to review.* *Without it, we cannot use locality to request container within these labeled nodes.* *If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions.* *If not* *acceptable (i.e. 
the current behavior is by designed), so, how can we use* *locality to request container within these labeled nodes?* was: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Attachment is the fix according to this principle, please help to review.* *Without it, we cannot use locality to request container within these labeled nodes.* *If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions.* *If not* *acceptable (i.e. 
the current behavior is by designed), so, how can we use* *locality to request container within labeled nodes?* > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > For example: > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName:
[jira] [Commented] (YARN-7829) Rebalance UI2 cluster overview page
[ https://issues.apache.org/jira/browse/YARN-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348482#comment-16348482 ] Gergely Novák commented on YARN-7829: - ??Small nits that Memory and VCore are also related to system resource, which are more closely related to Cluster Resource on the top row. Would it be better to move Finished Apps and Running Apps to third row??? There are two problems with it: # one might say that Finished/Running Apps are also related to Cluster Resource Usage by Applications in the top left corner # the number of resources in the (current) 3rd line is not necessarily 2. We might have additional resource types (GPUs, etc.; see YARN-7330), so we should leave the opportunity for this last row to "overflow"; we shouldn't add any other charts to it or move it up. > Rebalance UI2 cluster overview page > --- > > Key: YARN-7829 > URL: https://issues.apache.org/jira/browse/YARN-7829 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7829.001.patch, ui2-cluster-overview.png > > > The cluster overview page looks like an upside-down triangle. It would be > nice to rebalance the charts to ensure horizontal real estate is utilized > properly. The screenshot attachment includes some suggestions for rebalancing. > Node Manager status and cluster resource are closely related, so it would be > nice to promote that chart to the first row. Application Status and Resource > Availability are closely related. It would be nice to place Resource usage > side by side with Application Status to fill up the horizontal real > estate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348544#comment-16348544 ] Shane Kumpf commented on YARN-7221: --- I'll just point out that in many organizations the Hadoop administrators are not the same group that has access to manage sudo rules. Enforcing this will make it very challenging and time-consuming to use this feature in some clusters. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch > > > When a Docker container is running with privileges, the majority use case is to have > some program start as root and then drop privileges to another user, e.g. > httpd starts privileged to bind to port 80, then drops privileges to the > www user. > # We should add a security check for submitting users, to verify they have > "sudo" access before running a privileged container. > # We should remove --user=uid:gid for privileged containers. > > Docker can be launched with both the --privileged=true and --user=uid:gid flags. With > this parameter combination, the user will not be able to become the root user. > All docker exec commands will drop to the uid:gid user instead of > being granted privileges. A user can gain root privileges if the container file system > contains files that give the user extra power, but this type of image is > considered dangerous. A non-privileged user can launch a container with > special bits to acquire the same level of root power. Hence, we lose control of > which images should be run with --privileged, and who has sudo rights to use > privileged container images. As a result, we should check for sudo access and > then decide whether to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developers down the wrong path. 
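The sudo-based check proposed in YARN-7221 above — honor --privileged only for users with sudo access, and force --user=uid:gid otherwise — could be sketched roughly as follows. This is a hypothetical simplification, not actual YARN or container-executor code; the class name, method, and the sudo-users lookup are all illustrative assumptions.

```java
import java.util.Set;

// Illustrative sketch of the proposed parameterization (not a real YARN API):
// --privileged=true is only granted to users with sudo rights, and the
// --user=uid:gid remap is applied to everyone else.
class PrivilegedFlagCheck {

    /** Returns the docker run flag to use, or throws if the request is denied. */
    static String pickFlag(String user, boolean requestedPrivileged,
                           Set<String> sudoUsers, String uidGid) {
        if (requestedPrivileged) {
            if (!sudoUsers.contains(user)) {
                // Deny: only users with sudo rights may run privileged containers.
                throw new SecurityException(
                    "user " + user + " is not allowed to run privileged containers");
            }
            // Privileged mode: omit --user so root inside the container is usable.
            return "--privileged=true";
        }
        // Non-privileged mode: always drop to the submitting user's uid:gid.
        return "--user=" + uidGid;
    }
}
```

The key point of the sketch is the either/or decision the description argues for: the two flags are never emitted together.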
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: *Issue summary:* A labeled node (i.e. a node with a non-empty node label) cannot be used to satisfy a locality-specified request (i.e. a container request with a 'not ANY' resource name and relax locality set to false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] The current RM capacity scheduler behavior is that the node cannot allocate a container for the request, because the node label does not match during the leaf queue's container assignment. *Possible solution:* However, node locality and node label should be two orthogonal dimensions for selecting candidate nodes for a container request. Node label matching should only be executed for container requests with the ANY resource name, since only that kind of container request is allowed to carry a non-empty node label. So, for a container request with a 'not ANY' resource name (which therefore cannot have a node label), we should match the node by resource name instead of by node label. This resource name matching should be safe, since a node whose node label is not accessible to the queue will never be sent to the leaf queue. *Discussion:* The attachment is a fix following this principle; please help to review. Without it, we cannot use locality to request containers on these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue exists in trunk and other Hadoop versions. If it is not acceptable (i.e. 
the current behavior is by design), then how can we use locality to request containers within these labeled nodes? was: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Attachment is the fix according to this principle, please help to review.* *Without it, we cannot use locality to request container within these labeled nodes.* *If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions.* *If not* *acceptable (i.e. 
the current behavior is by design), then how can we use locality to request containers within these labeled nodes?* > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > *Issue summary:* > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > > *For example:* > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]]
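The orthogonal matching rule argued for in YARN-7872 above can be sketched as follows. This is a hypothetical simplification of the allocation check, not the attached patch's actual code; the class, method, and parameter names are illustrative. The only YARN-specific fact assumed is that the ANY resource name is the string "*" (ResourceRequest.ANY).

```java
// Illustrative sketch: locality and node label are treated as orthogonal
// dimensions when deciding whether a node can serve a container request.
class LocalityLabelMatch {

    /** Decide whether a node can serve a container request (simplified). */
    static boolean canAllocate(String requestResourceName,
                               String requestNodeLabel,
                               String nodeHostName,
                               String nodeLabel) {
        if (!"*".equals(requestResourceName)) {
            // 'not ANY' resource name: such a request cannot carry a node
            // label, so match by resource name (host) only. The node's label
            // is deliberately not consulted here.
            return requestResourceName.equals(nodeHostName);
        }
        // ANY resource name: only this kind of request may carry a node
        // label, so only here does node-label matching apply.
        String wanted = requestNodeLabel == null ? "" : requestNodeLabel;
        String actual = nodeLabel == null ? "" : nodeLabel;
        return wanted.equals(actual);
    }
}
```

With the example from the report, a request for host SRG with a null label would match the labeled node SRG, because the host name rather than the label is compared; label-based filtering is assumed to have happened earlier, when nodes inaccessible to the queue were excluded.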
[jira] [Reopened] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi reopened YARN-7677: -- > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
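The whitelist behavior YARN-7677 asks for — inject a default such as HADOOP_CONF_DIR only when the container's environment does not already define it — might look roughly like this. This is an illustrative sketch under that assumption, not the NodeManager's actual environment-building code; all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of whitelist-aware environment construction: a
// whitelisted variable's NM default is only applied when the task (or the
// Docker image) has not already set the variable itself.
class EnvWhitelist {

    static Map<String, String> buildEnv(Map<String, String> userEnv,
                                        Map<String, String> nmDefaults,
                                        Set<String> whitelist) {
        Map<String, String> env = new HashMap<>(userEnv);
        for (String var : whitelist) {
            // Only inject the default when the task/image didn't set it, so
            // a Docker image's own HADOOP_CONF_DIR is preserved.
            env.computeIfAbsent(var, nmDefaults::get);
        }
        return env;
    }
}
```

Under this scheme a task that never sets HADOOP_CONF_DIR still gets the NM default, while a Docker image that presets it keeps its own value.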
[jira] [Updated] (YARN-7857) -fstack-check compilation flag causes binary incompatibility for container-executor between RHEL 6 and RHEL 7
[ https://issues.apache.org/jira/browse/YARN-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Brennan updated YARN-7857: -- Attachment: YARN-7857.001.patch > -fstack-check compilation flag causes binary incompatibility for > container-executor between RHEL 6 and RHEL 7 > - > > Key: YARN-7857 > URL: https://issues.apache.org/jira/browse/YARN-7857 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7857.001.patch > > > The segmentation fault in container-executor reported in [YARN-7796] appears > to be due to a binary compatibility issue with the {{-fstack-check}} flag > that was added in [YARN-6721]. > Based on my testing, a container-executor (without the patch from > [YARN-7796]) compiled on RHEL 6 with the -fstack-check flag always hits this > segmentation fault when run on RHEL 7. But if you compile without this flag, > the container-executor runs on RHEL 7 with no problems. I also verified this > with a simple program that just does the copy_file. > I think we need to either remove this flag or find a suitable alternative. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349080#comment-16349080 ] Shane Kumpf commented on YARN-7221: --- Sure. I agree that we need protections in place around the use of --privileged. If sudo is the best way to achieve that goal, I'm fine with that direction. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch > > > When a docker is running with privileges, majority of the use case is to have > some program running with root then drop privileges to another user. i.e. > httpd to start with privileged and bind to port 80, then drop privileges to > www user. > # We should add security check for submitting users, to verify they have > "sudo" access to run privileged container. > # We should remove --user=uid:gid for privileged containers. > > Docker can be launched with --privileged=true, and --user=uid:gid flag. With > this parameter combinations, user will not have access to become root user. > All docker exec command will be drop to uid:gid user to run instead of > granting privileges. User can gain root privileges if container file system > contains files that give user extra power, but this type of image is > considered as dangerous. Non-privileged user can launch container with > special bits to acquire same level of root power. Hence, we lose control of > which image should be run with --privileges, and who have sudo rights to use > privileged container images. As the result, we should check for sudo access > then decide to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developer down the wrong path. 
[jira] [Updated] (YARN-7516) Security check for trusted docker image
[ https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7516: Attachment: YARN-7516.016.patch > Security check for trusted docker image > --- > > Key: YARN-7516 > URL: https://issues.apache.org/jira/browse/YARN-7516 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7516.001.patch, YARN-7516.002.patch, > YARN-7516.003.patch, YARN-7516.004.patch, YARN-7516.005.patch, > YARN-7516.006.patch, YARN-7516.007.patch, YARN-7516.008.patch, > YARN-7516.009.patch, YARN-7516.010.patch, YARN-7516.011.patch, > YARN-7516.012.patch, YARN-7516.013.patch, YARN-7516.014.patch, > YARN-7516.015.patch, YARN-7516.016.patch > > > Hadoop YARN Services can support using private docker registry images or > docker images from Docker Hub. In the current implementation, Hadoop security is > enforced through username and group membership, and enforces uid:gid > consistency in the docker container and the distributed file system. There is a cloud > use case for the ability to run untrusted docker images on the same cluster > for testing. > The basic requirement for an untrusted container is to ensure all kernel and > root privileges are dropped, and that there is no interaction with the distributed > file system, to avoid contamination. We can probably enforce detection of > untrusted docker images by checking the following: > # If the docker image is from a public Docker Hub repository, the container is > automatically flagged as insecure, disk volume mounts are disabled > automatically, and all kernel capabilities are dropped. > # If the docker image is from a private repository on Docker Hub, and there is a > white list allowing the private repository, disk volume mounts are allowed and > kernel capabilities follow the allowed list. 
> # If the docker image is from a private trusted registry with an image name like > "private.registry.local:5000/centos", and the white list allows this private > trusted registry, disk volume mounts are allowed and kernel capabilities > follow the allowed list. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
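The trust decision in the YARN-7516 list above could be approximated by comparing the image name's registry prefix against a configured white list. This is an illustrative sketch only: it treats any bare or namespaced Docker Hub image as untrusted, ignores Docker Hub's implicit `library/` namespace, and is not the attached patch's actual logic.

```java
import java.util.List;

// Illustrative sketch of a registry-prefix trust check (hypothetical names):
// images from a whitelisted private registry are trusted (mounts/capabilities
// allowed per policy); anything else is treated as an untrusted Hub image.
class TrustedImageCheck {

    static boolean isTrusted(String image, List<String> trustedRegistries) {
        // "private.registry.local:5000/centos" -> registry part before first '/'.
        int slash = image.indexOf('/');
        if (slash < 0) {
            // Bare name like "centos" implies public Docker Hub: untrusted.
            return false;
        }
        String registry = image.substring(0, slash);
        // A Hub namespace like "someuser/centos" also fails this lookup,
        // which matches the "public repository is insecure" rule above.
        return trustedRegistries.contains(registry);
    }
}
```

A real implementation would also have to distinguish Hub namespaces from registry host names (e.g. by requiring a '.' or ':' in the registry part), which this sketch deliberately omits.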
[jira] [Updated] (YARN-5028) RMStateStore should trim down app state for completed applications
[ https://issues.apache.org/jira/browse/YARN-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergo Repas updated YARN-5028: -- Attachment: YARN-5028.000.patch > RMStateStore should trim down app state for completed applications > -- > > Key: YARN-5028 > URL: https://issues.apache.org/jira/browse/YARN-5028 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Gergo Repas >Priority: Major > Attachments: YARN-5028.000.patch > > > RMStateStore stores enough information to recover applications in case of a > restart. The store also retains this information for completed applications > to serve their status to REST, WebUI, Java and CLI clients. We don't need all > the information we store today to serve application status; for instance, we > don't need the {{ApplicationSubmissionContext}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7446) Docker container privileged mode and --user flag contradict each other
[ https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349064#comment-16349064 ] Shane Kumpf commented on YARN-7446: --- Thanks for clarifying. That approach sounds good to me. > Docker container privileged mode and --user flag contradict each other > -- > > Key: YARN-7446 > URL: https://issues.apache.org/jira/browse/YARN-7446 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7446.001.patch > > > In the current implementation, when privileged=true, the --user flag is also > passed to docker when launching the container. In reality, the container has no > way to use root privileges unless there is a sticky bit or a sudoers entry in the image > that lets the specified user regain privileges. To avoid > dropping and then reacquiring root privileges, we can avoid > specifying both flags. When privileged mode is enabled, the --user flag should be > omitted. When non-privileged mode is enabled, the --user flag is supplied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7868) Provide improved error message when YARN service is disabled
[ https://issues.apache.org/jira/browse/YARN-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349123#comment-16349123 ] Chandni Singh commented on YARN-7868: - +1 lgtm > Provide improved error message when YARN service is disabled > > > Key: YARN-7868 > URL: https://issues.apache.org/jira/browse/YARN-7868 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-native-services >Affects Versions: 3.1.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7868.001.patch > > > Some YARN CLI command will throw verbose error message when YARN service is > disabled. The error message looks like this: > {code} > Jan 31, 2018 4:24:46 PM com.sun.jersey.api.client.ClientResponse getEntity > SEVERE: A message body reader for Java class > org.apache.hadoop.yarn.service.api.records.ServiceStatus, and Java type class > org.apache.hadoop.yarn.service.api.records.ServiceStatus, and MIME media type > application/octet-stream was not found > Jan 31, 2018 4:24:46 PM com.sun.jersey.api.client.ClientResponse getEntity > SEVERE: The registered message body readers compatible with the MIME media > type are: > application/octet-stream -> > com.sun.jersey.core.impl.provider.entity.ByteArrayProvider > com.sun.jersey.core.impl.provider.entity.FileProvider > com.sun.jersey.core.impl.provider.entity.InputStreamProvider > com.sun.jersey.core.impl.provider.entity.DataSourceProvider > com.sun.jersey.core.impl.provider.entity.RenderedImageProvider > */* -> > com.sun.jersey.core.impl.provider.entity.FormProvider > com.sun.jersey.core.impl.provider.entity.StringProvider > com.sun.jersey.core.impl.provider.entity.ByteArrayProvider > com.sun.jersey.core.impl.provider.entity.FileProvider > com.sun.jersey.core.impl.provider.entity.InputStreamProvider > com.sun.jersey.core.impl.provider.entity.DataSourceProvider > com.sun.jersey.core.impl.provider.entity.XMLJAXBElementProvider$General > com.sun.jersey.core.impl.provider.entity.ReaderProvider > 
com.sun.jersey.core.impl.provider.entity.DocumentProvider > com.sun.jersey.core.impl.provider.entity.SourceProvider$StreamSourceReader > com.sun.jersey.core.impl.provider.entity.SourceProvider$SAXSourceReader > com.sun.jersey.core.impl.provider.entity.SourceProvider$DOMSourceReader > com.sun.jersey.json.impl.provider.entity.JSONJAXBElementProvider$General > com.sun.jersey.json.impl.provider.entity.JSONArrayProvider$General > com.sun.jersey.json.impl.provider.entity.JSONObjectProvider$General > com.sun.jersey.core.impl.provider.entity.XMLRootElementProvider$General > com.sun.jersey.core.impl.provider.entity.XMLListElementProvider$General > com.sun.jersey.core.impl.provider.entity.XMLRootObjectProvider$General > com.sun.jersey.core.impl.provider.entity.EntityHolderReader > com.sun.jersey.json.impl.provider.entity.JSONRootElementProvider$General > com.sun.jersey.json.impl.provider.entity.JSONListElementProvider$General > com.sun.jersey.json.impl.provider.entity.JacksonProviderProxy > com.fasterxml.jackson.jaxrs.json.JacksonJsonProvider > 2018-01-31 16:24:46,415 ERROR client.ApiServiceClient: > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7516) Security check for trusted docker image
[ https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349150#comment-16349150 ] Eric Yang commented on YARN-7516: - - Patch 16 drops the launch command for untrusted images. - For now, drop all privileges for untrusted images, and deny untrusted images from running with the privileged=true flag. The idea behind YARN-7221, YARN-7446, and YARN-7516 is to mimic basic sudo security. Privileged users can gain access to run trusted binaries as another user or in a multi-process container. If the binary is not trusted, run the image with the least amount of privileges in a sandbox. For now, I don't preset capabilities for untrusted images to simulate root in a sandbox. The risk outweighs the benefit, so we err on the side of caution. YARN mode, like docker container mode, is safeguarded by user mapping (YARN-4266); there is no impersonation capability. For this reason, we don't need two ACL lists to track which capabilities to turn on for each mode of docker images. > Security check for trusted docker image > --- > > Key: YARN-7516 > URL: https://issues.apache.org/jira/browse/YARN-7516 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7516.001.patch, YARN-7516.002.patch, > YARN-7516.003.patch, YARN-7516.004.patch, YARN-7516.005.patch, > YARN-7516.006.patch, YARN-7516.007.patch, YARN-7516.008.patch, > YARN-7516.009.patch, YARN-7516.010.patch, YARN-7516.011.patch, > YARN-7516.012.patch, YARN-7516.013.patch, YARN-7516.014.patch, > YARN-7516.015.patch, YARN-7516.016.patch > > > Hadoop YARN Services can support using a private docker registry image or a > docker image from docker hub. In the current implementation, Hadoop security is > enforced through username and group membership, and enforces uid:gid > consistency in the docker container and distributed file system. 
There is a cloud > use case for having the ability to run untrusted docker images on the same cluster > for testing. > The basic requirement for an untrusted container is to ensure all kernel and > root privileges are dropped, and there is no interaction with the distributed > file system, to avoid contamination. We can probably enforce detection of > untrusted docker images by checking the following: > # If the docker image is from a public docker hub repository, the container is > automatically flagged as insecure, disk volume mounts are disabled > automatically, and all kernel capabilities are dropped. > # If the docker image is from a private repository in docker hub, and there is a > white list that allows the private repository, disk volume mounts are allowed and > kernel capabilities follow the allowed list. > # If the docker image is from a private trusted registry, with an image name like > "private.registry.local:5000/centos", and the white list allows this private > trusted registry, disk volume mounts are allowed and kernel capabilities > follow the allowed list. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
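The three detection rules in the YARN-7516 description can be sketched as a simple classifier. This is an illustrative sketch under stated assumptions, not the actual patch: the whitelist contents and the string heuristics for distinguishing a registry host from a Docker Hub namespace are assumptions.

```python
# Illustrative classifier for the three rules above; not the actual patch.
TRUSTED_REGISTRIES = {"private.registry.local:5000"}   # rule 3 whitelist (assumed)
TRUSTED_HUB_REPOS = {"mycompany"}                      # rule 2 whitelist (assumed)

def classify_image(image: str) -> str:
    """Return 'trusted' if mounts/capabilities may be granted, else 'untrusted'."""
    if "/" not in image:
        # Bare names like "centos" resolve to the public Docker Hub library:
        # rule 1 -> untrusted; drop all capabilities, disable volume mounts.
        return "untrusted"
    prefix = image.rsplit("/", 1)[0]
    if ":" in prefix or "." in prefix:
        # Looks like a registry host, e.g. "private.registry.local:5000/centos" (rule 3).
        return "trusted" if prefix in TRUSTED_REGISTRIES else "untrusted"
    # Otherwise a Docker Hub namespace, e.g. "mycompany/centos" (rule 2).
    return "trusted" if prefix in TRUSTED_HUB_REPOS else "untrusted"
```

Anything that does not match a whitelist falls through to "untrusted", which matches the fail-closed stance taken in the comments above.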
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349117#comment-16349117 ] Jim Brennan commented on YARN-7677: --- Thanks [~shaneku...@gmail.com] and [~billie.rinaldi], I will try out that change. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349168#comment-16349168 ] Jim Brennan commented on YARN-7677: --- Agreed - given that we are just processing the hash map in order, it seems like we've just been getting lucky that the variables on which the classpath depends are coming before it in the launch_container.sh script. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7778) Merging of constraints defined at different levels
[ https://issues.apache.org/jira/browse/YARN-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349178#comment-16349178 ] Konstantinos Karanasos commented on YARN-7778: -- +1 on latest patch, thanks [~cheersyang]. I will do a minor fix to say that the "constraint" is coming from the SchedulingRequest when I commit the patch, if you don't mind. > Merging of constraints defined at different levels > -- > > Key: YARN-7778 > URL: https://issues.apache.org/jira/browse/YARN-7778 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Konstantinos Karanasos >Assignee: Weiwei Yang >Priority: Major > Attachments: Merge Constraints Solution.pdf, > YARN-7778-YARN-7812.001.patch, YARN-7778-YARN-7812.002.patch, > YARN-7778.003.patch, YARN-7778.004.patch > > > When we have multiple constraints defined for a given set of allocation tags > at different levels (i.e., at the cluster, the application or the scheduling > request level), we need to merge those constraints. > Defining constraint levels as cluster > application > scheduling request, > constraints defined at lower levels should only be more restrictive than > those of higher levels. Otherwise the allocation should fail. > For example, if there is an application level constraint that allows no more > than 5 HBase containers per rack, a scheduling request can further restrict > that to 3 containers per rack but not to 7 containers per rack. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
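The restriction rule in the YARN-7778 example — a lower level (cluster > application > scheduling request) may only tighten, never loosen, a cardinality constraint — can be sketched like this. A minimal sketch with hypothetical names, not the actual merge implementation in the patch.

```python
# Sketch of merging a max-cardinality constraint across constraint levels:
# cluster > application > scheduling request; lower levels may only tighten.

def merge_max_cardinality(higher: int, lower: int) -> int:
    """Merge a per-rack max-cardinality constraint from two adjacent levels.

    Raises ValueError if the lower level tries to loosen the higher one,
    in which case the allocation should fail."""
    if lower > higher:
        raise ValueError(
            f"lower-level constraint {lower} loosens higher-level {higher}")
    return lower

# App level allows <= 5 HBase containers per rack; a request tightens to 3: OK.
merged = merge_max_cardinality(5, 3)

# Tightening to 7 would actually loosen the app-level constraint and must fail.
try:
    merge_max_cardinality(5, 7)
    loosened_allowed = True
except ValueError:
    loosened_allowed = False
```

The same "only more restrictive" check would apply pairwise down the level hierarchy.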
[jira] [Commented] (YARN-7446) Docker container privileged mode and --user flag contradict each other
[ https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349008#comment-16349008 ] Eric Yang commented on YARN-7446: - [~shaneku...@gmail.com] It would be better to leave --user 0:0 out for a few reasons. 1. If a privileged user uses --privileged and the docker container has a defined service user, e.g. Hive, then removing --user 0:0 allows a system administrator, such as Eric, to have "sudo"-like behavior on a YARN cluster (given that the sudoers check happens in YARN-7221), although the hive user is dropped to normal privileges. This provides a sudo-like mechanism in a secure manner for trusted docker images in YARN-7516. 2. If a privileged user mistakenly runs the --privileged flag with a normal-user container image, he will be able to discover his mistake. 3. If the image does not have a predefined user, then full root capability is given. With the changes in YARN-7446, YARN-7221, and YARN-7516, these three JIRAs provide system administrators a way to run authorized executables on the system with privileges in docker images. This is the same concept as a sudoers list that authorizes users to run authorized binaries. The changes help make the system compliant with Linux security. I think it is better to avoid hard-coding --user 0:0 to make sure the #1 and #2 corner cases are properly supported. > Docker container privileged mode and --user flag contradict each other > -- > > Key: YARN-7446 > URL: https://issues.apache.org/jira/browse/YARN-7446 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7446.001.patch > > > In the current implementation, when privileged=true, --user flag is also > passed to docker for launching container. In reality, the container has no > way to use root privileges unless there is sticky bit or sudoers in the image > for the specified user to gain privileges again. 
To avoid duplication of > dropping and reacquire root privileges, we can reduce the duplication of > specifying both flag. When privileged mode is enabled, --user flag should be > omitted. When non-privileged mode is enabled, --user flag is supplied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
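The rule in the YARN-7446 description — omit --user when privileged mode is enabled, supply it otherwise — can be sketched as argument construction. This is a hypothetical helper for illustration; the real logic lives in YARN's Docker container runtime and container-executor.

```python
# Sketch: build docker run arguments per the rule above. Hypothetical helper,
# not the actual DockerLinuxContainerRuntime code.

def build_docker_run_args(image: str, uid: int, gid: int,
                          privileged: bool) -> list:
    args = ["docker", "run"]
    if privileged:
        # Privileged mode: omit --user so the image's own USER (or root)
        # applies, matching the sudo-like model discussed above.
        args.append("--privileged")
    else:
        # Non-privileged mode: pin the container to the submitting user.
        args.extend(["--user", f"{uid}:{gid}"])
    args.append(image)
    return args
```

With this shape, a privileged launch of an image that defines its own service user (the Hive example above) drops to that user naturally, instead of being forced to 0:0.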
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349119#comment-16349119 ] Shane Kumpf commented on YARN-7677: --- I'm not sure if it will be appropriate to address here, but I think we need to improve how we handle the ordering of the environment variables within the launch script. Right now it depends on hash map ordering... We likely need to ensure that any variable values are expanded out or defined prior to use to avoid this kind of issue. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7446) Docker container privileged mode and --user flag contradict each other
[ https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349008#comment-16349008 ] Eric Yang edited comment on YARN-7446 at 2/1/18 6:00 PM: - [~shaneku...@gmail.com] It would be better to leave --user 0:0 out for a few reasons. 1. If a privileged user uses --privileged and the docker container has a defined service user, e.g. Hive, then removing --user 0:0 allows a system administrator, such as Eric, to have "sudo"-like behavior on a YARN cluster (given that the sudoers check happens in YARN-7221), although the hive user is dropped to normal privileges. This provides a sudo-like mechanism in a secure manner for trusted docker images in YARN-7516. 2. If a privileged user mistakenly runs the --privileged flag with a normal-user container image, he will be able to discover his mistake. 3. If the image does not have a predefined user, then full root capability is given. With the changes in YARN-7446, YARN-7221, and YARN-7516, these three JIRAs provide system administrators a way to run authorized executables on the system with privileges in docker images. This is the same concept as a sudoers list that authorizes users to run authorized binaries with privileges. The changes help make the system compliant with Linux security. I think it is better to avoid hard-coding --user 0:0 to make sure the #1 and #2 corner cases are properly supported. was (Author: eyang): [~shaneku...@gmail.com] It would be better to leave --user 0:0 out for some reasons. 1. If a privileged user use --privileged and docker container has a defined a service user. i.e. Hive. By remove --user 0:0, this allows a system administrator, such as Eric to have "sudo" like behavior on YARN cluster (given that sudoers check happens in YARN-7221). Although hive user is dropped to normal privileges. This provides sudo like mechanism in a secure manner for trusted docker images in YARN-7516. 2. 
If a privileged user made a mistake to run --privileged flag with normal user container image. He will have ability to discover his mistakes. 3. If the image does not have a predefined user, then full root capability is given. With changes in YARN-7446, YARN-7221, and YARN-7516. These three JIRA provides system administrator a way to run authorized executable on the system with privileges in docker images. This is the same concept as sudoers list to authorize users to run authorized binaries. The changes are to help the system compliant with Linux security. I think it is better to avoid hard code --user 0:0 to make sure #1, and #2 corner cases are properly supported. > Docker container privileged mode and --user flag contradict each other > -- > > Key: YARN-7446 > URL: https://issues.apache.org/jira/browse/YARN-7446 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7446.001.patch > > > In the current implementation, when privileged=true, --user flag is also > passed to docker for launching container. In reality, the container has no > way to use root privileges unless there is sticky bit or sudoers in the image > for the specified user to gain privileges again. To avoid duplication of > dropping and reacquire root privileges, we can reduce the duplication of > specifying both flag. When privileged mode is enabled, --user flag should be > omitted. When non-privileged mode is enabled, --user flag is supplied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349024#comment-16349024 ] Jim Brennan commented on YARN-7677: --- [~shaneku...@gmail.com], [~jlowe], [~ebadger] I have verified that this does not fail in the same way when I revert this change, so I agree with [~ebadger], we should revert this change until I can find a fix. It is not immediately clear to me why it is failing. Apologies for not catching this before submitting my patch. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349042#comment-16349042 ] Shane Kumpf commented on YARN-7677: --- [~Jim_Brennan] - Thanks for the update. [~billie.rinaldi] and I have been looking into it as well and we believe we have it figured out. When comparing launch_container.sh with and without the change, HADOOP_CONF_DIR, HADOOP_COMMON_HOME, HADOOP_HDFS_HOME, etc are defined before CLASSPATH. With this change, all of the whitelisted env processing happens last, so those variables are the last to be defined. Moving the whitelist processing before the rest of the environment processing fixed the issue for us. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
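The fix Shane describes — emit the whitelisted variables before the rest so that CLASSPATH's references are already defined instead of relying on hash-map iteration order — can be sketched as ordered script generation. An illustrative sketch, not the actual ContainerLaunch code; the whitelist contents and function name are assumptions.

```python
# Sketch of deterministic env emission for launch_container.sh:
# whitelisted variables (HADOOP_CONF_DIR, HADOOP_COMMON_HOME, ...) first,
# so later variables such as CLASSPATH can reference them.
WHITELIST = ["HADOOP_CONF_DIR", "HADOOP_COMMON_HOME", "HADOOP_HDFS_HOME"]

def emit_env_exports(env: dict) -> list:
    """Return 'export KEY="VALUE"' lines with whitelist entries first,
    instead of relying on hash-map iteration order."""
    lines = []
    for key in WHITELIST:
        if key in env:
            lines.append(f'export {key}="{env[key]}"')
    for key, value in env.items():
        if key not in WHITELIST:
            lines.append(f'export {key}="{value}"')
    return lines

script = emit_env_exports({
    "CLASSPATH": "$HADOOP_CONF_DIR:...",
    "HADOOP_CONF_DIR": "/etc/hadoop/conf",
})
```

Even though CLASSPATH appears first in the input map, HADOOP_CONF_DIR is exported before it, so the shell expansion in CLASSPATH resolves correctly.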
[jira] [Updated] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7677: - Fix Version/s: (was: 3.0.1) (was: 3.1.0) I reverted this from trunk and branch-3.0. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349085#comment-16349085 ] Hudson commented on YARN-7677: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13598 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13598/]) Revert "YARN-7677. Docker image cannot set HADOOP_CONF_DIR. Contributed (jlowe: rev 682ea21f2bbc587e1b727b3c895c2f513a908432) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/runtime/ContainerRuntime.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DefaultLinuxContainerRuntime.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DelegatingLinuxContainerRuntime.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java * (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348832#comment-16348832 ] Yesha Vora commented on YARN-7850: -- yes sure. Thanks for picking up this Jira. > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > Attachments: Screen Shot 2018-02-01 at 11.37.30.png, > YARN-7850.001.patch > > > The status of Log Aggregation is not specified any where. > New UI should show the Log aggregation status for finished application. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348876#comment-16348876 ] Eric Badger commented on YARN-7677: --- If all AMs are failing in [~shaneku...@gmail.com]'s case, shouldn't we revert first, and ask questions later? We don't want to destabilize the build. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler
[ https://issues.apache.org/jira/browse/YARN-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348714#comment-16348714 ] Arun Suresh commented on YARN-7819: --- Updated patch, fixing findbugs and checkstyle issues. [~templedf], bq. Will calling the assignment node-local in the metrics update confuse the metrics? What if it's not actually node local? So, by definition, attemptAllocationOnNode will always be a node-local request; it SHOULD therefore update the metrics. > Allow PlacementProcessor to be used with the FairScheduler > -- > > Key: YARN-7819 > URL: https://issues.apache.org/jira/browse/YARN-7819 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh >Priority: Major > Attachments: YARN-7819-YARN-6592.001.patch, > YARN-7819-YARN-7812.001.patch, YARN-7819.002.patch, YARN-7819.003.patch > > > The FairScheduler needs to implement the > {{ResourceScheduler#attemptAllocationOnNode}} function for the processor to > support the FairScheduler. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1634#comment-1634 ] Eric Yang commented on YARN-7221: - [~shaneku...@gmail.com] How about getting this in, and the community can contribute a separate ACL mechanism when the need arises? This will ensure that we err on the side of caution instead of giving too much power to non-privileged Linux users. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch > > > When a docker is running with privileges, majority of the use case is to have > some program running with root then drop privileges to another user. i.e. > httpd to start with privileged and bind to port 80, then drop privileges to > www user. > # We should add security check for submitting users, to verify they have > "sudo" access to run privileged container. > # We should remove --user=uid:gid for privileged containers. > > Docker can be launched with --privileged=true, and --user=uid:gid flag. With > this parameter combinations, user will not have access to become root user. > All docker exec command will be drop to uid:gid user to run instead of > granting privileges. User can gain root privileges if container file system > contains files that give user extra power, but this type of image is > considered as dangerous. Non-privileged user can launch container with > special bits to acquire same level of root power. Hence, we lose control of > which image should be run with --privileges, and who have sudo rights to use > privileged container images. As the result, we should check for sudo access > then decide to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developer down the wrong path. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348882#comment-16348882 ] Eric Badger commented on YARN-7221: --- bq. I'll just point out that In many organization the Hadoop administrators are not the same group that has access to manage sudo rules. Enforcing this will make it very challenging and time consuming to use this feature in some clusters. This is certainly true and it could/would be a pain to set this up if the relevant users were not already in the sudoers list. However, from the opposite perspective, it would also be bad for users to be granted sudo access when the administrators did not grant that privilege to them. This is 100% a conversation about usability vs. security in my mind. I tend to lean in the direction of secure by default with options to relax those constraints to increase usability. It's ugly, but an idea could be to have different configurable mechanisms to check for privileged users. One could be the sudo check and a different one could be a container-executor.cfg privileged user list check. I'm not sure if I would even support this, but it's an idea of how to make both of these scenarios work. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch > > > When a docker is running with privileges, majority of the use case is to have > some program running with root then drop privileges to another user. i.e. > httpd to start with privileged and bind to port 80, then drop privileges to > www user. > # We should add security check for submitting users, to verify they have > "sudo" access to run privileged container. > # We should remove --user=uid:gid for privileged containers. 
> > Docker can be launched with the --privileged=true and --user=uid:gid flags. With > this parameter combination, the user will not have access to become the root user. > All docker exec commands will drop to the uid:gid user instead of > granting privileges. A user can gain root privileges if the container file system > contains files that grant extra power, but this type of image is > considered dangerous. A non-privileged user can launch a container with > special bits set to acquire the same level of root power. Hence, we lose control of > which images should be run with --privileged, and who has sudo rights to use > privileged container images. As a result, we should check for sudo access and > then decide whether to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developers down the wrong path. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
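[Editor's note] The decision logic described above — check for sudo access, then choose between `--privileged=true` and `--user=uid:gid` — can be sketched as a small shell function. This is only an illustrative sketch, not the actual container-executor code: `check_sudo` and `docker_flags` are hypothetical names, and the real check would shell out to something like `sudo -l -U "$user"` (stubbed here so the example runs standalone).

```shell
# Hypothetical sketch of the proposed flag selection, NOT the real container-executor.
# check_sudo would really run: sudo -l -U "$1" >/dev/null 2>&1
# Stubbed for illustration: treat "alice" as the only sudoer.
check_sudo() {
  [ "$1" = "alice" ]
}

# Emit the docker run flag: privileged for sudoers, uid:gid drop otherwise.
docker_flags() {
  user="$1"; uid="$2"; gid="$3"
  if check_sudo "$user"; then
    echo "--privileged=true"
  else
    echo "--user=${uid}:${gid}"
  fi
}
```

With this shape, a non-sudoer like `bob` would get `--user=1001:1001`, matching the "never both flags together" reasoning in the comment above.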
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348815#comment-16348815 ] Shane Kumpf commented on YARN-7677: --- Docker is enabled, but the applications in question are not leveraging docker. These are simple apps like MR sleep and distributed shell. All mapreduce classpath settings, yarn.application.classpath, and whitelist env are not set and are using the defaults. I've tried setting these in various ways that used to work, but haven't found a working combination yet. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348810#comment-16348810 ] Jason Lowe commented on YARN-7677: -- Is this with Docker containers or without? There are two main changes with this patch: # HADOOP_CONF_DIR needs to be in the whitelist config to be inherited from the NM. (It is in the default whitelist setting already). # In Docker, whitelisted variables that would be inherited from the NM but are also set by the Docker image will use the Docker image setting instead of the NM setting. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348841#comment-16348841 ] Jason Lowe commented on YARN-7677: -- Ultimately one way to debug this would be to compare the container launch scripts between the two scenarios (i.e.: with and without YARN-7677 applied). The only difference should be that some variables in the launch script will have the export var=$\{_var_:-_value_\} syntax that didn't before. In the non-Docker case, all of those variables should be getting the NM settings unless somehow those variables already are set to _different_ values in the environment before the launch script runs. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
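[Editor's note] The `export var=${_var_:-_value_}` syntax Jason Lowe describes is plain shell default-value expansion: the NM value is used only when the variable is not already set, which is exactly how a Docker image's preset `HADOOP_CONF_DIR` can win. A minimal sketch (the paths are made-up examples):

```shell
# Case 1: variable not preset (non-Docker, or image does not set it)
# -> the :- default (the NM value) is used.
unset HADOOP_CONF_DIR
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
NM_CASE="$HADOOP_CONF_DIR"

# Case 2: variable preset (e.g. by the Docker image) before the
# launch script runs -> the preset value survives the same line.
export HADOOP_CONF_DIR="/opt/image/conf"
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
IMAGE_CASE="$HADOOP_CONF_DIR"

echo "$NM_CASE $IMAGE_CASE"
```

This is why, in the non-Docker case, behavior should be unchanged unless something else already set those variables to different values before the launch script ran.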
[jira] [Commented] (YARN-7857) -fstack-check compilation flag causes binary incompatibility for container-executor between RHEL 6 and RHEL 7
[ https://issues.apache.org/jira/browse/YARN-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348926#comment-16348926 ] genericqa commented on YARN-7857: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 28m 5s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 20s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 61m 53s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7857 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908829/YARN-7857.001.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux 0b6933f920b8 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / ae2177d | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19566/testReport/ | | Max. process+thread count | 302 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19566/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > -fstack-check compilation flag causes binary incompatibility for > container-executor between RHEL 6 and RHEL 7 > - > > Key: YARN-7857 > URL: https://issues.apache.org/jira/browse/YARN-7857 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7857.001.patch > > > The segmentation fault in container-executor reported in [YARN-7796]
[jira] [Commented] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler
[ https://issues.apache.org/jira/browse/YARN-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348928#comment-16348928 ] genericqa commented on YARN-7819: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 30s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 12s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 16s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 57s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}112m 47s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | | Unchecked/unconfirmed cast from org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptAllocationOnNode(SchedulerApplicationAttempt, SchedulingRequest, SchedulerNode) At FairScheduler.java:org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptAllocationOnNode(SchedulerApplicationAttempt, SchedulingRequest, SchedulerNode) At FairScheduler.java:[line 1883] | | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.constraint.TestPlacementProcessor | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServiceAppsNodelabel | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7819 | | JIRA Patch URL |
[jira] [Updated] (YARN-7223) Document GPU isolation feature
[ https://issues.apache.org/jira/browse/YARN-7223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7223: - Fix Version/s: 3.1.0 > Document GPU isolation feature > -- > > Key: YARN-7223 > URL: https://issues.apache.org/jira/browse/YARN-7223 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Fix For: 3.1.0 > > Attachments: YARN-7223.wip.001.patch, YARN-7223.wip.001.pdf > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7223) Document GPU isolation feature
[ https://issues.apache.org/jira/browse/YARN-7223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7223: - Priority: Blocker (was: Major) > Document GPU isolation feature > -- > > Key: YARN-7223 > URL: https://issues.apache.org/jira/browse/YARN-7223 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Fix For: 3.1.0 > > Attachments: YARN-7223.wip.001.patch, YARN-7223.wip.001.pdf > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348779#comment-16348779 ] Shane Kumpf commented on YARN-7677: --- [~Jim_Brennan], thanks for putting this together. With this patch in, all AMs are failing to launch with a classpath-related issue in my dev environment. Still looking into the cause, but do you have any thoughts? > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler
[ https://issues.apache.org/jira/browse/YARN-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-7819: -- Attachment: YARN-7819.003.patch > Allow PlacementProcessor to be used with the FairScheduler > -- > > Key: YARN-7819 > URL: https://issues.apache.org/jira/browse/YARN-7819 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh >Priority: Major > Attachments: YARN-7819-YARN-6592.001.patch, > YARN-7819-YARN-7812.001.patch, YARN-7819.002.patch, YARN-7819.003.patch > > > The FairScheduler needs to implement the > {{ResourceScheduler#attemptAllocationOnNode}} function for the processor to > support the FairScheduler. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348795#comment-16348795 ] Jim Brennan commented on YARN-7677: --- [~shaneku...@gmail.com] are you running with docker? > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348898#comment-16348898 ] Jim Brennan commented on YARN-7677: --- [~shaneku...@gmail.com], I am trying to repro locally. In my dev setup, it is currently working, but I typically run with mapreduce.application.framework.path and mapreduce.application.classpath defined in my mapred-site.xml file, pointing to a tarball in my home dir in hdfs. If i remove those, I do get errors like these: {noformat} Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster Please check whether your etc/hadoop/mapred-site.xml contains the below configuration: yarn.app.mapreduce.am.env HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory} mapreduce.map.env HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory} mapreduce.reduce.env HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory} {noformat} Is this what you are seeing? I don't think this behavior was different before my change, but I'm going to revert it locally and double-check. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. 
[jira] [Updated] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code
[ https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-7781: -- Attachment: YARN-7781.03.patch > Update YARN-Services-Examples.md to be in sync with the latest code > --- > > Key: YARN-7781 > URL: https://issues.apache.org/jira/browse/YARN-7781 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gour Saha >Assignee: Jian He >Priority: Major > Attachments: YARN-7781.01.patch, YARN-7781.02.patch, > YARN-7781.03.patch > > > Update YARN-Services-Examples.md to make the following additions/changes: > 1. Add an additional URL and PUT Request JSON to support flex: > Update to flex up/down the no of containers (instances) of a component of a > service > PUT URL – http://localhost:8088/app/v1/services/hello-world > PUT Request JSON > {code} > { > "components" : [ { > "name" : "hello", > "number_of_containers" : 3 > } ] > } > {code} > 2. Modify all occurrences of /ws/ to /app/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5714) ContainerExecutor does not order environment map
[ https://issues.apache.org/jira/browse/YARN-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349304#comment-16349304 ] genericqa commented on YARN-5714: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s{color} | {color:red} YARN-5714 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-5714 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845159/YARN-5714.006.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19571/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > ContainerExecutor does not order environment map > > > Key: YARN-5714 > URL: https://issues.apache.org/jira/browse/YARN-5714 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.4.1, 2.5.2, 2.7.3, 2.6.4, 3.0.0-alpha1 > Environment: all (linux and windows alike) >Reporter: Remi Catherinot >Assignee: Remi Catherinot >Priority: Trivial > Labels: oct16-medium > Attachments: YARN-5714.001.patch, YARN-5714.002.patch, > YARN-5714.003.patch, YARN-5714.004.patch, YARN-5714.005.patch, > YARN-5714.006.patch > > Original Estimate: 120h > Remaining Estimate: 120h > > When dumping the container launch script, environment variables are dumped > based on the order internally used by the map implementation (hash based). It > does not take into consideration that some env variables may reference each > other, so some env variables must be declared before those > referencing them. 
> In my case, I ended up with LD_LIBRARY_PATH, which depends on > HADOOP_COMMON_HOME, being dumped before HADOOP_COMMON_HOME. Thus it had a > wrong value, so native libraries weren't loaded. Jobs were running, but not > at their best efficiency. This is just one use case hitting that bug, but > I'm sure others may happen as well. > I already have a patch running in my production environment; I estimate > 5 days for packaging the patch in the right fashion for JIRA plus my best > effort to add tests. > Note: the patch is not OS-aware and has a default empty implementation. I will > only implement the Unix version in a first release. I'm not used to Windows env > variable syntax, so it will take me more time/research. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
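[Editor's note] The LD_LIBRARY_PATH failure mode described in YARN-5714 is easy to reproduce in plain shell: a variable that references another expands at the moment it is exported, so emission order determines the result. A minimal sketch (the `/opt/hadoop` path is a made-up example):

```shell
# Hash-map ordering: the referencing variable is emitted before its
# dependency, so ${HADOOP_COMMON_HOME} expands to the empty string.
unset HADOOP_COMMON_HOME LD_LIBRARY_PATH
export LD_LIBRARY_PATH="${HADOOP_COMMON_HOME}/lib/native"
export HADOOP_COMMON_HOME="/opt/hadoop"
BROKEN="$LD_LIBRARY_PATH"

# Dependency-respecting ordering: the referenced variable is
# declared first, so the reference expands to the real path.
unset HADOOP_COMMON_HOME LD_LIBRARY_PATH
export HADOOP_COMMON_HOME="/opt/hadoop"
export LD_LIBRARY_PATH="${HADOOP_COMMON_HOME}/lib/native"

echo "broken=$BROKEN correct=$LD_LIBRARY_PATH"
```

The first ordering silently yields `/lib/native`, which is exactly the "wrong value, native libraries not loaded" symptom in the report.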
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349331#comment-16349331 ] Billie Rinaldi commented on YARN-7677: -- bq. In the general case, we're not going to be able to order the variables without doing a dependency analysis between them It seems as if the dependency analysis ticket stalled due to disagreement about approach. I don't think we necessarily need dependency analysis; the primary use case is AM-defined vars being able to reference NM-defined vars, which we could accomplish by writing NM vars to the launch script first. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
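[Editor's note] Billie Rinaldi's suggestion — write NM-defined variables to the launch script before AM-defined ones, instead of full dependency analysis — covers the common case where AM variables reference NM variables. A minimal sketch of that ordering (variable names and paths are illustrative only):

```shell
# NM-defined variable is written to the launch script first...
export HADOOP_CONF_DIR="/etc/hadoop/conf"

# ...so an AM-defined variable that references it sees the NM value.
export APP_CLASSPATH="${HADOOP_CONF_DIR}:app.jar"

echo "$APP_CLASSPATH"
```

This handles AM-on-NM references without any graph analysis, though it would not help if two AM-defined variables referenced each other.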
[jira] [Commented] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code
[ https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349347#comment-16349347 ] genericqa commented on YARN-7781: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 29s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 34m 17s{color} | {color:red} hadoop-yarn-services-core in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 77m 58s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7781 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908858/YARN-7781.03.patch | | Optional Tests | asflicense mvnsite compile javac javadoc mvninstall unit shadedclient findbugs checkstyle | | uname | Linux 400553c11455 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / dd50f53 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | unit |
[jira] [Commented] (YARN-7516) Security check for trusted docker image
[ https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349232#comment-16349232 ] genericqa commented on YARN-7516: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 6s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 34m 5s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 52s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 92m 16s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7516 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12907540/YARN-7516.015.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux 8def665676ab 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6ca7204 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19567/testReport/ | | Max. process+thread count | 410 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19567/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Security check for trusted docker image > --- > > Key: YARN-7516 > URL: https://issues.apache.org/jira/browse/YARN-7516 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major >
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349262#comment-16349262 ] Eric Yang commented on YARN-7221: - Rebased patch to current trunk. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch, > YARN-7221.003.patch > > > When a Docker container runs with privileges, the majority use case is to have > some program start as root and then drop privileges to another user, e.g. > httpd starts privileged to bind to port 80, then drops privileges to the > www user. > # We should add a security check for submitting users, to verify they have > "sudo" access to run a privileged container. > # We should remove --user=uid:gid for privileged containers. > > Docker can be launched with the --privileged=true and --user=uid:gid flags. With > this parameter combination, the user will not be able to become root. > Every docker exec command will drop to the uid:gid user instead of > granting privileges. A user can gain root privileges if the container file system > contains files that give the user extra power, but this type of image is > considered dangerous. A non-privileged user can launch a container with > special bits to acquire the same level of root power. Hence, we lose control of > which images should be run with --privileged, and who has sudo rights to use > privileged container images. As a result, we should check for sudo access and > then decide whether to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developers down the wrong path. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
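The check-then-parameterize logic described in the issue can be sketched as follows. This is an illustrative Python outline of the policy only, not the actual Java implementation in the NodeManager; `docker_run_flags` is a hypothetical helper name.

```python
# Sketch of the YARN-7221 decision: privileged containers require sudo
# rights; every other container is pinned to the submitting user's uid:gid.
def docker_run_flags(user_has_sudo, wants_privileged, uid, gid):
    if wants_privileged:
        if not user_has_sudo:
            # Reject: only users with sudo access may run privileged images.
            raise PermissionError("privileged containers require sudo access")
        return ["--privileged=true"]      # trusted: container keeps root
    return [f"--user={uid}:{gid}"]        # untrusted: drop to uid:gid

print(docker_run_flags(True, True, 1000, 1000))    # → ['--privileged=true']
print(docker_run_flags(False, False, 1000, 1000))  # → ['--user=1000:1000']
```

The key point mirrored from the comment thread is that `--privileged=true` and `--user=uid:gid` are mutually exclusive here: passing both would silently neuter the privilege grant.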
[jira] [Commented] (YARN-5714) ContainerExecutor does not order environment map
[ https://issues.apache.org/jira/browse/YARN-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349312#comment-16349312 ] Jason Lowe commented on YARN-5714: -- Sorry to come to this late. We ran across an instance of this while debugging some environment variable ordering issues in YARN-7677. While the LinkedHashMap solution sounds nice and clean, I don't think it can work in practice. The problem is that we have more than one list of environment variables: the user-provided list and the inherited variables via the NM whitelist. To make things worse, we don't know how they could be interconnected. The problem in YARN-7677 occurred because variables in the user's settings referenced variables in the NM whitelist, and the whitelist variables were listed after the user variables. Theoretically an admin could set up NM variables that have "plugin" variables that could come from user-provided settings, and thus we can't always assume NM whitelist variables come before user variables. In short, I think we'll have to do some sort of dependency analysis. I'm not a fan of parsing the shell syntax to figure out the dependencies in the value, but I don't see a viable alternative to get all the variables, user-specified and otherwise, listed in the proper order. 
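The dependency analysis described above amounts to a topological sort over variable references. A minimal Python sketch of the idea follows; it is purely illustrative (the real fix would live in the Java ContainerExecutor, and `order_env` is a hypothetical helper, not a YARN API):

```python
import re
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Matches $VAR and ${VAR} references inside a variable's value.
VAR_REF = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")

def order_env(env):
    """Return variable names ordered so referenced variables come first."""
    graph = {}
    for name, value in env.items():
        # Track only references to variables defined in this same map; a
        # self-reference like PATH=$PATH:/bin is assumed to mean the
        # inherited value, so it is not treated as a dependency.
        graph[name] = {r for r in VAR_REF.findall(value) if r in env and r != name}
    # TopologicalSorter raises CycleError on genuinely circular definitions.
    return list(TopologicalSorter(graph).static_order())

env = {
    "LD_LIBRARY_PATH": "$HADOOP_COMMON_HOME/lib/native",
    "HADOOP_COMMON_HOME": "/opt/hadoop",
}
print(order_env(env))  # → ['HADOOP_COMMON_HOME', 'LD_LIBRARY_PATH']
```

This reproduces the YARN-5714 scenario: LD_LIBRARY_PATH is emitted after HADOOP_COMMON_HOME regardless of map iteration order. Full shell syntax (command substitution, quoting) is harder than this regex, which is exactly the reluctance voiced in the comment.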
> ContainerExecutor does not order environment map > > > Key: YARN-5714 > URL: https://issues.apache.org/jira/browse/YARN-5714 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.4.1, 2.5.2, 2.7.3, 2.6.4, 3.0.0-alpha1 > Environment: all (linux and windows alike) >Reporter: Remi Catherinot >Assignee: Remi Catherinot >Priority: Trivial > Labels: oct16-medium > Attachments: YARN-5714.001.patch, YARN-5714.002.patch, > YARN-5714.003.patch, YARN-5714.004.patch, YARN-5714.005.patch, > YARN-5714.006.patch > > Original Estimate: 120h > Remaining Estimate: 120h > > When dumping the launch container script, environment variables are dumped > in the order internally used by the map implementation (hash based). It > does not take into consideration that some env variables may refer to each > other, and so some env variables must be declared before those > referencing them. > In my case, I ended up having LD_LIBRARY_PATH, which depended on > HADOOP_COMMON_HOME, dumped before HADOOP_COMMON_HOME. Thus it had a > wrong value, so native libraries weren't loaded. Jobs were running, but not > at their best efficiency. This is just one use case falling into that bug, but > I'm sure others happen as well. > I already have a patch running in my production environment; I estimate > 5 days for packaging the patch in the right fashion for JIRA plus trying my best > to add tests. > Note: the patch is not OS-aware, with a default empty implementation. I will > only implement the unix version in a 1st release. I'm not used to Windows env > variable syntax, so it will take me more time/research for it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4882) Change the log level to DEBUG for recovering completed applications
[ https://issues.apache.org/jira/browse/YARN-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349432#comment-16349432 ] Jason Lowe commented on YARN-4882: -- bq. From the above code, if RM fails to recover an application or an attempt all the other applications won't be loaded. The same was true even before this patch. The proposed code would change the semantics of application recovery, which is out of the scope of this JIRA. Admins may not want a ResourceManager to say it recovered when not all applications were recovered. Otherwise the RM may appear to be up, the admin thinks everything is fine, when one or more (possibly all!) applications are simply gone. > Change the log level to DEBUG for recovering completed applications > --- > > Key: YARN-4882 > URL: https://issues.apache.org/jira/browse/YARN-4882 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Daniel Templeton >Priority: Major > Labels: oct16-easy > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-4882.001.patch, YARN-4882.002.patch, > YARN-4882.003.patch, YARN-4882.004.patch, YARN-4882.005.patch > > > I think recovering completed applications need not be logged at INFO; it can be > logged at DEBUG instead. The problem seen on large clusters is that if any > issue happens during RM startup and the RM keeps switching, the RM logs are > filled mostly with application recovery messages. > Six lines are logged for one application, as shown in the logs below; considering > that the RM default value for max-completed applications is 10K, each > switch adds 10K*6=60K lines, which is not useful, I feel. 
> {noformat} > 2016-03-01 10:20:59,077 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: Default priority > level is set to application:application_1456298208485_21507 > 2016-03-01 10:20:59,094 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Recovering > app: application_1456298208485_21507 with 1 attempts and final state = > FINISHED > 2016-03-01 10:20:59,100 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > Recovering attempt: appattempt_1456298208485_21507_01 with final state: > FINISHED > 2016-03-01 10:20:59,107 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > appattempt_1456298208485_21507_01 State change from NEW to FINISHED > 2016-03-01 10:20:59,111 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: > application_1456298208485_21507 State change from NEW to FINISHED > 2016-03-01 10:20:59,112 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=rohith > OPERATION=Application Finished - Succeeded TARGET=RMAppManager > RESULT=SUCCESS APPID=application_1456298208485_21507 > {noformat} > The main problem is that important information is missing from the logs from > before the RM became unstable. Even if the log rollback count is 50 or 100, in a > short period all these logs will be rolled out, and the remaining logs contain > only RM switching information, mostly application recovery! > I suggest that at least completed-application recovery should be logged at DEBUG. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349449#comment-16349449 ] Billie Rinaldi commented on YARN-7677: -- They'd have to be told the available versions by the admins, so they could just as easily be told the full paths. :) > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349437#comment-16349437 ] Jason Lowe commented on YARN-7677: -- True, but that assumes the user even knows what the path is. The point of such a setup is to decouple desired java version from where admins installed it. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code
[ https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349226#comment-16349226 ] Gour Saha commented on YARN-7781: - Makes sense. +1 for patch 03. > Update YARN-Services-Examples.md to be in sync with the latest code > --- > > Key: YARN-7781 > URL: https://issues.apache.org/jira/browse/YARN-7781 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gour Saha >Assignee: Jian He >Priority: Major > Attachments: YARN-7781.01.patch, YARN-7781.02.patch, > YARN-7781.03.patch > > > Update YARN-Services-Examples.md to make the following additions/changes: > 1. Add an additional URL and PUT Request JSON to support flex: > Update to flex up/down the no of containers (instances) of a component of a > service > PUT URL – http://localhost:8088/app/v1/services/hello-world > PUT Request JSON > {code} > { > "components" : [ { > "name" : "hello", > "number_of_containers" : 3 > } ] > } > {code} > 2. Modify all occurrences of /ws/ to /app/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
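For reference, the flex update shown in the issue description could be driven from a client as sketched below. This is a hedged illustration using only the URL and JSON payload quoted in this JIRA; `build_flex_request` is a hypothetical helper name, and the endpoint has not been verified against a live ResourceManager.

```python
import json
import urllib.request

def build_flex_request(base_url, service, component, count):
    """Build the PUT request that flexes a component's container count."""
    payload = {"components": [{"name": component, "number_of_containers": count}]}
    return urllib.request.Request(
        f"{base_url}/app/v1/services/{service}",  # note /app/, not the old /ws/
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

req = build_flex_request("http://localhost:8088", "hello-world", "hello", 3)
# urllib.request.urlopen(req)  # would send the request to a running RM
```

The `/app/` path segment reflects change 2 in the description (replacing all `/ws/` occurrences).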
[jira] [Commented] (YARN-5028) RMStateStore should trim down app state for completed applications
[ https://issues.apache.org/jira/browse/YARN-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349286#comment-16349286 ] genericqa commented on YARN-5028: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 39s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 5s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 48s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}109m 27s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMHAForAsyncScheduler | | | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart | | | hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServiceAppsNodelabel | | | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-5028 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908846/YARN-5028.000.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e9d06c47935d 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6ca7204 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/19568/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results |
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349301#comment-16349301 ] Jason Lowe commented on YARN-7677: -- YARN-5714 is very relevant here. In the general case, we're not going to be able to order the variables without doing a dependency analysis between them, and that's what YARN-5714 proposes to do. I'll see what I can do to push that forward, since it looks like a more deterministic ordering will be a prerequisite to doing any sort of change relative to how environment variables are handled. Otherwise we'll risk breaking some case where variable ordering happened to work, and the user has little recourse to restore it to a working condition. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349435#comment-16349435 ] Billie Rinaldi commented on YARN-7677: -- It would be much more straightforward for the user to set JAVA_HOME to their desired value in that case. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6648) [GPG] Add SubClusterCleaner in Global Policy Generator
[ https://issues.apache.org/jira/browse/YARN-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349491#comment-16349491 ] Botong Huang commented on YARN-6648: Committed to YARN-7402. Thanks [~curino] for the review! > [GPG] Add SubClusterCleaner in Global Policy Generator > -- > > Key: YARN-6648 > URL: https://issues.apache.org/jira/browse/YARN-6648 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Labels: federation, gpg > Attachments: YARN-6648-YARN-2915.v1.patch, > YARN-6648-YARN-7402.v2.patch, YARN-6648-YARN-7402.v3.patch, > YARN-6648-YARN-7402.v4.patch, YARN-6648-YARN-7402.v5.patch, > YARN-6648-YARN-7402.v6.patch, YARN-6648-YARN-7402.v7.patch, > YARN-6648-YARN-7402.v8.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code
[ https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349256#comment-16349256 ] Gour Saha commented on YARN-7781: - I filed YARN-7836 for the component name path not used issue. > Update YARN-Services-Examples.md to be in sync with the latest code > --- > > Key: YARN-7781 > URL: https://issues.apache.org/jira/browse/YARN-7781 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gour Saha >Assignee: Jian He >Priority: Major > Attachments: YARN-7781.01.patch, YARN-7781.02.patch, > YARN-7781.03.patch > > > Update YARN-Services-Examples.md to make the following additions/changes: > 1. Add an additional URL and PUT Request JSON to support flex: > Update to flex up/down the no of containers (instances) of a component of a > service > PUT URL – http://localhost:8088/app/v1/services/hello-world > PUT Request JSON > {code} > { > "components" : [ { > "name" : "hello", > "number_of_containers" : 3 > } ] > } > {code} > 2. Modify all occurrences of /ws/ to /app/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7857) -fstack-check compilation flag causes binary incompatibility for container-executor between RHEL 6 and RHEL 7
[ https://issues.apache.org/jira/browse/YARN-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349410#comment-16349410 ] Jim Brennan commented on YARN-7857: --- As this is a change to the compilation command line for container-executor, it was tested by compiling on both RHEL6 and RHEL7 and running the mapreduce pi example. Also ran with the RHEL6-compiled container-executor on RHEL7. I manually tested the change by compiling a small program that just includes a main() that calls a stripped-down version of copy_file(). With {{-fstack-check}}, this program fails when compiled on RHEL 6 and run on RHEL 7. With {{-fstack-protector}}, the RHEL6 version runs on RHEL7. Please review. > -fstack-check compilation flag causes binary incompatibility for > container-executor between RHEL 6 and RHEL 7 > - > > Key: YARN-7857 > URL: https://issues.apache.org/jira/browse/YARN-7857 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7857.001.patch > > > The segmentation fault in container-executor reported in [YARN-7796] appears > to be due to a binary compatibility issue with the {{-fstack-check}} flag > that was added in [YARN-6721]. > Based on my testing, a container-executor (without the patch from > [YARN-7796]) compiled on RHEL 6 with the -fstack-check flag always hits this > segmentation fault when run on RHEL 7. But if you compile without this flag, > the container-executor runs on RHEL 7 with no problems. I also verified this > with a simple program that just does the copy_file. > I think we need to either remove this flag, or find a suitable alternative. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349465#comment-16349465 ] Eric Yang commented on YARN-7677: - [~jlowe] I agree with [~billie.rinaldi] and the YARN-5714 approach. The classic unix approach is to source the system environment first; the user can then override it in their own .profile or .bashrc. The system does not reference user environment variables, to prevent users from doing harm to the system. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349474#comment-16349474 ] Eric Yang commented on YARN-7221: - The failed unit test is not related to this patch. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch, > YARN-7221.003.patch > > > When a docker is running with privileges, majority of the use case is to have > some program running with root then drop privileges to another user. i.e. > httpd to start with privileged and bind to port 80, then drop privileges to > www user. > # We should add security check for submitting users, to verify they have > "sudo" access to run privileged container. > # We should remove --user=uid:gid for privileged containers. > > Docker can be launched with --privileged=true, and --user=uid:gid flag. With > this parameter combinations, user will not have access to become root user. > All docker exec command will be drop to uid:gid user to run instead of > granting privileges. User can gain root privileges if container file system > contains files that give user extra power, but this type of image is > considered as dangerous. Non-privileged user can launch container with > special bits to acquire same level of root power. Hence, we lose control of > which image should be run with --privileges, and who have sudo rights to use > privileged container images. As the result, we should check for sudo access > then decide to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developer down the wrong path.
[jira] [Commented] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code
[ https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349218#comment-16349218 ] Jian He commented on YARN-7781: --- Talked with Billie offline; I reverted the changes about the kerberos principal, as that needs more verification. Uploaded patch03. > Update YARN-Services-Examples.md to be in sync with the latest code > --- > > Key: YARN-7781 > URL: https://issues.apache.org/jira/browse/YARN-7781 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gour Saha >Assignee: Jian He >Priority: Major > Attachments: YARN-7781.01.patch, YARN-7781.02.patch, > YARN-7781.03.patch > > > Update YARN-Services-Examples.md to make the following additions/changes: > 1. Add an additional URL and PUT Request JSON to support flex: > Update to flex up/down the no of containers (instances) of a component of a > service > PUT URL – http://localhost:8088/app/v1/services/hello-world > PUT Request JSON
> {code}
> {
>   "components" : [ {
>     "name" : "hello",
>     "number_of_containers" : 3
>   } ]
> }
> {code}
> 2. Modify all occurrences of /ws/ to /app/
[jira] [Comment Edited] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code
[ https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349256#comment-16349256 ] Gour Saha edited comment on YARN-7781 at 2/1/18 8:59 PM: - I filed YARN-7836 for the component name path not used issue. [~jianhe] feel free to take YARN-7836 and work on a single patch for both. was (Author: gsaha): I filed YARN-7836 for the component name path not used issue. > Update YARN-Services-Examples.md to be in sync with the latest code > --- > > Key: YARN-7781 > URL: https://issues.apache.org/jira/browse/YARN-7781 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gour Saha >Assignee: Jian He >Priority: Major > Attachments: YARN-7781.01.patch, YARN-7781.02.patch, > YARN-7781.03.patch > > > Update YARN-Services-Examples.md to make the following additions/changes: > 1. Add an additional URL and PUT Request JSON to support flex: > Update to flex up/down the no of containers (instances) of a component of a > service > PUT URL – http://localhost:8088/app/v1/services/hello-world > PUT Request JSON
> {code}
> {
>   "components" : [ {
>     "name" : "hello",
>     "number_of_containers" : 3
>   } ]
> }
> {code}
> 2. Modify all occurrences of /ws/ to /app/
[jira] [Updated] (YARN-7866) [UI2] Kerberizing the UI doesn't give any warning or content when UI is accessed without kinit
[ https://issues.apache.org/jira/browse/YARN-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-7866: -- Reporter: Sumana Sathish (was: Sunil G) > [UI2] Kerberizing the UI doesn't give any warning or content when UI is > accessed without kinit > -- > > Key: YARN-7866 > URL: https://issues.apache.org/jira/browse/YARN-7866 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Sumana Sathish >Assignee: Sunil G >Priority: Major > Attachments: YARN-7866.001.patch > > > Handle 401 error and show in UI > credit to [~ssath...@hortonworks.com] for finding this issue.
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349305#comment-16349305 ] Jim Brennan commented on YARN-7677: --- I have tested a version of the patch where I write out the whitelisted variables first, and it does work for my test cases. But looking at the launch_container.sh that is produced, the order of other variables is not the same as launch_container.sh from before my changes. Since the whitelisted variables are not added to the environment hash map, the order of traversal is different. I'm not comfortable with putting this change in as-is, because while the ordering differences I'm seeing are not a problem in my test cases, there is no guarantee that others would not run into problems due to this change. I discussed this with [~jlowe], and he pointed me at YARN-5714. I think we need to address that problem before putting this change in. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself.
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349350#comment-16349350 ] genericqa commented on YARN-7221: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 26m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 23s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 59m 17s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7221 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908864/YARN-7221.003.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux a57d2eae51a1 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / dd50f53 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/19570/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19570/testReport/ | | Max. process+thread count | 430 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19570/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch, > YARN-7221.003.patch > > > When a docker is running with privileges,
[jira] [Assigned] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.
[ https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu reassigned YARN-7859: -- Assignee: wangwj > New feature: add queue scheduling deadLine in fairScheduler. > > > Key: YARN-7859 > URL: https://issues.apache.org/jira/browse/YARN-7859 > Project: Hadoop YARN > Issue Type: New Feature > Components: fairscheduler >Affects Versions: 3.0.0 >Reporter: wangwj >Assignee: wangwj >Priority: Major > Labels: fairscheduler, features, patch > Fix For: 3.0.0 > > Attachments: YARN-7859-v1.patch, log, screenshot-1.png, > screenshot-3.png > > Original Estimate: 24h > Remaining Estimate: 24h > > As everyone knows, in FairScheduler the phenomenon of queue scheduling > starvation often occurs when the number of cluster jobs is large, leaving the > apps in one or more queues pending. So I have thought of a way to solve this > problem: add a queue scheduling deadline to FairScheduler. When a queue has not > been scheduled by FairScheduler within the specified time, we forcibly schedule it. > The community's current way of solving queue scheduling starvation is to preempt > containers, but that may increase the failure rate of jobs. > On the basis of the above, I propose this issue...
[jira] [Commented] (YARN-4882) Change the log level to DEBUG for recovering completed applications
[ https://issues.apache.org/jira/browse/YARN-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349243#comment-16349243 ] Giovanni Matteo Fumarola commented on YARN-4882: [~templedf], [~rohithsharma] I have a quick comment about this old patch.
{code:java}
try {
  for (ApplicationStateData appState : appStates.values()) {
    recoverApplication(appState, state);
    count += 1;
  }
} finally {
  LOG.info("Successfully recovered " + count + " out of "
      + appStates.size() + " applications");
}
{code}
From the above code, if the RM fails to recover an application or an attempt, all the other applications won't be loaded. For this reason, the above code should be implemented as:
{code:java}
for (ApplicationStateData appState : appStates.values()) {
  try {
    recoverApplication(appState, state);
    count += 1;
  } catch (Exception e) {
    LOG.error(e);
  }
}
LOG.info("Successfully recovered " + count + " out of "
    + appStates.size() + " applications");
{code}
Thoughts? > Change the log level to DEBUG for recovering completed applications > --- > > Key: YARN-4882 > URL: https://issues.apache.org/jira/browse/YARN-4882 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Daniel Templeton >Priority: Major > Labels: oct16-easy > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-4882.001.patch, YARN-4882.002.patch, > YARN-4882.003.patch, YARN-4882.004.patch, YARN-4882.005.patch > > > I think for recovering completed applications there is no need to log at INFO; > it can be made DEBUG. The problem seen on large clusters is that if any issue > happens during RM start-up with continuous switching, then the RM logs are > filled mostly with recovering applications. > There are 6 lines logged for 1 application, as shown in the logs below; > consider that the RM default value for max-completed applications is 10K. So for > each switch 10K*6=60K lines will be added, which is not useful I feel.
> {noformat}
> 2016-03-01 10:20:59,077 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: Default priority level is set to application:application_1456298208485_21507
> 2016-03-01 10:20:59,094 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Recovering app: application_1456298208485_21507 with 1 attempts and final state = FINISHED
> 2016-03-01 10:20:59,100 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Recovering attempt: appattempt_1456298208485_21507_01 with final state: FINISHED
> 2016-03-01 10:20:59,107 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1456298208485_21507_01 State change from NEW to FINISHED
> 2016-03-01 10:20:59,111 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1456298208485_21507 State change from NEW to FINISHED
> 2016-03-01 10:20:59,112 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=rohith OPERATION=Application Finished - Succeeded TARGET=RMAppManager RESULT=SUCCESS APPID=application_1456298208485_21507
> {noformat}
> The main problem is that important information is missing from the logs before the RM became unstable. Even if the log rollback is 50 or 100 files, in a short period all these logs will be rolled out, and the remaining logs contain only RM switching information, and recovering applications at that!
> I suggest that at least completed-application recovery should be logged at DEBUG.