[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuqi Wang updated YARN-7872:
Description:

A labeled node (i.e. a node with a non-empty node label) cannot be used to satisfy a locality-specified request (i.e. a container request with a non-ANY resource name and relax locality set to false).

For example, the node with available resource:

[Resource: [MemoryMB: [100] CpuNumber: [12]] NodeLabel: [persistent] HostName: {SRG} RackName: {/default-rack}]

The container request:

[Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] NodeLabel: [null] HostNames: {SRG} RackNames: {} RelaxLocality: [false]]

The current RM capacity scheduler behavior is that the node cannot allocate a container for this request, because the node label fails to match during the leaf queue's container assignment.

However, node locality and node label should be two orthogonal dimensions for selecting candidate nodes for a container request, and node label matching should only be performed for container requests with the ANY resource name, since only that kind of request is allowed to carry a non-empty node label.

So, for a container request with a non-ANY resource name (which therefore cannot carry a node label), we should match the node by resource name instead of by node label. This resource-name matching is safe, because a node whose label is not accessible to the queue is never offered to the leaf queue.

*Attachment is the fix according to this principle; please help to review.*
*Without it, we cannot use locality to request containers on these labeled nodes.*
*If the fix is acceptable, we should also recheck whether the same issue exists in trunk and other Hadoop versions.*

> labeled node cannot be used to satisfy locality specified request
> -----------------------------------------------------------------
>
>                 Key: YARN-7872
>                 URL: https://issues.apache.org/jira/browse/YARN-7872
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler, resourcemanager
>    Affects Versions: 2.7.2
>            Reporter: Yuqi Wang
>            Assignee: Yuqi Wang
>            Priority: Blocker
>             Fix For: 2.7.2
>
>         Attachments: YARN-7872-branch-2.7.2.001.patch
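The orthogonal-dimension rule proposed above can be sketched as a simplified model. This is hypothetical illustration code, not the actual CapacityScheduler implementation; here the resource name "*" stands for the ANY resource name, and the method and parameter names are made up:

```java
// Simplified model of the proposed matching rule: node-label matching
// applies only to ANY-resource-name requests, while host-specific
// requests are matched by resource name alone.
public class LocalityLabelMatch {

    static boolean canAssign(String requestResourceName,
                             String requestNodeLabel,
                             String nodeHostName,
                             String nodeLabel) {
        if ("*".equals(requestResourceName)) {
            // ANY request: only this kind of request may carry a node
            // label, so match on the label dimension.
            String wanted = requestNodeLabel == null ? "" : requestNodeLabel;
            return wanted.equals(nodeLabel == null ? "" : nodeLabel);
        }
        // Host-specific request: it cannot carry a label, so match on
        // the resource name only. Label accessibility was already
        // enforced before the node reached the leaf queue.
        return requestResourceName.equals(nodeHostName);
    }

    public static void main(String[] args) {
        // The example from the report: node "SRG" labeled "persistent",
        // request pinned to host "SRG" with no label.
        System.out.println(canAssign("SRG", null, "SRG", "persistent")); // true under the proposed fix
        System.out.println(canAssign("*", null, "SRG", "persistent"));   // false: ANY request without the label
    }
}
```

Under this model the host-pinned request from the example is satisfiable by the labeled node, while an unlabeled ANY request still cannot land on it.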
[jira] [Commented] (YARN-7446) Docker container privileged mode and --user flag contradict each other
[ https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348557#comment-16348557 ]

Shane Kumpf commented on YARN-7446:
---
Do you believe that all --privileged containers should run as the root user? If so, please hard code --user 0:0 as the user in this patch and we'll get this wrapped up.

> Docker container privileged mode and --user flag contradict each other
> ----------------------------------------------------------------------
>
>                 Key: YARN-7446
>                 URL: https://issues.apache.org/jira/browse/YARN-7446
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>         Attachments: YARN-7446.001.patch
>
>
> In the current implementation, when privileged=true, the --user flag is also
> passed to docker when launching the container. In reality, the container then has
> no way to use root privileges unless the image contains a sticky bit or a sudoers
> entry that lets the specified user regain privileges. To avoid this cycle of
> dropping and reacquiring root privileges, we can stop specifying both flags
> together: when privileged mode is enabled, the --user flag should be omitted;
> when non-privileged mode is used, the --user flag is supplied.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
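The rule under discussion (make --privileged and --user mutually exclusive when building the docker run command line) could be sketched roughly as below. This is an illustrative model only, not YARN's actual container-runtime code, and the user and image values are made-up examples:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: assemble "docker run" arguments so that
// --privileged and --user are never passed together.
public class DockerRunArgs {

    static List<String> buildRunArgs(boolean privileged, String user, String image) {
        List<String> args = new ArrayList<>();
        args.add("docker");
        args.add("run");
        if (privileged) {
            // A privileged container runs as root; passing --user as well
            // would only force the image to re-escalate via setuid/sudo.
            args.add("--privileged");
        } else {
            // Non-privileged mode: drop to the requested user as usual.
            args.add("--user");
            args.add(user);
        }
        args.add(image);
        return args;
    }
}
```

For example, `buildRunArgs(true, "nobody", "centos:7")` yields `docker run --privileged centos:7` with no --user flag, matching the behavior proposed in the issue description.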
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuqi Wang updated YARN-7872:
Attachment: YARN-7872-branch-2.7.2.001.patch
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuqi Wang updated YARN-7872:
Description:

A labeled node (i.e. a node with a non-empty node label) cannot be used to satisfy a locality-specified request (i.e. a container request with a non-ANY resource name and relax locality set to false).

For example, the node with available resource:

[Resource: [MemoryMB: [100] CpuNumber: [12]] NodeLabel: [persistent] HostName: {SRG} RackName: {/default-rack}]

The container request:

[Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] NodeLabel: [null] HostNames: {SRG} RackNames: {} RelaxLocality: [false]]

The current RM capacity scheduler behavior is that the node cannot allocate a container for this request, because the node label fails to match during the leaf queue's container assignment.

However, node locality and node label should be two orthogonal dimensions for selecting candidate nodes for a container request, and node label matching should only be performed for container requests with the ANY resource name, since only that kind of request is allowed to carry a non-empty node label.

So, for a container request with a non-ANY resource name (which therefore cannot carry a node label), we should match the node by resource name instead of by node label. This resource-name matching is safe, because a node whose label is not accessible to the queue is never offered to the leaf queue.

*Attachment is the fix according to this principle; please help to review.*
*Without it, we cannot use locality to request containers on these labeled nodes.*
*If the fix is acceptable, we should also recheck whether the same issue exists in trunk and other Hadoop versions.*
*If it is not acceptable (i.e. the current behavior is by design), then how can we use locality to request containers on labeled nodes?*
[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes
[ https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348579#comment-16348579 ]

Sunil G commented on YARN-7840:
---
I am fine with the latest patch. +1. I'll commit tomorrow if there are no objections.

> Update PB for prefix support of node attributes
> -----------------------------------------------
>
>                 Key: YARN-7840
>                 URL: https://issues.apache.org/jira/browse/YARN-7840
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Weiwei Yang
>            Assignee: Naganarasimha G R
>            Priority: Blocker
>         Attachments: YARN-7840-YARN-3409.001.patch, YARN-7840-YARN-3409.002.patch,
>                      YARN-7840-YARN-3409.003.patch, YARN-7840-YARN-3409.004.patch
>
>
> We need to support a prefix (namespace) for node attributes; this adds the
> flexibility to do proper ACLs, avoid naming conflicts, etc.
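For illustration, prefix (namespace) support of the kind described can be modeled as splitting an attribute key into a namespace part and a name part. This is a hypothetical sketch, not the protobuf/API definition from the patches, and the prefix strings used are only examples:

```java
// Hypothetical model of a namespaced node-attribute key: the prefix
// gives each attribute a namespace so ACLs can be scoped per prefix
// and identically named attributes from different sources don't clash.
public class NodeAttributeKey {
    final String prefix;
    final String name;

    NodeAttributeKey(String prefix, String name) {
        this.prefix = prefix;
        this.name = name;
    }

    // Parse "prefix/name"; fall back to a default namespace when the
    // attribute carries no explicit prefix.
    static NodeAttributeKey parse(String raw, String defaultPrefix) {
        int slash = raw.lastIndexOf('/');
        if (slash < 0) {
            return new NodeAttributeKey(defaultPrefix, raw);
        }
        return new NodeAttributeKey(raw.substring(0, slash), raw.substring(slash + 1));
    }
}
```

With a scheme like this, "rm.yarn.io/hostname" and a bare "hostname" resolve to different namespaces, which is the conflict-avoidance the comment refers to.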
[jira] [Updated] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gergely Novák updated YARN-7850:
Attachment: Screen Shot 2018-02-01 at 11.34.36.png

> New UI does not show status for Log Aggregation
> -----------------------------------------------
>
>                 Key: YARN-7850
>                 URL: https://issues.apache.org/jira/browse/YARN-7850
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-ui-v2
>            Reporter: Yesha Vora
>            Assignee: Gergely Novák
>            Priority: Major
>         Attachments: Screen Shot 2018-02-01 at 11.34.36.png, YARN-7850.001.patch
>
>
> The status of log aggregation is not shown anywhere.
> The new UI should show the log aggregation status for finished applications.
[jira] [Updated] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gergely Novák updated YARN-7850:
Attachment: YARN-7850.001.patch
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behaiour is: The node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (besides, it should not have node label), we should use resource name to match with the node instead of node label to match with the node. And it should be safe, since the node which is not accessible for the queue will not be sent in the leaf queue. Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk. was: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. 
container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that the node cannot allocate a container for the request, because the node label is not matched when the leaf queue assigns a container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for a container request. And node label matching should only be executed for a container request with the ANY resource name, since only this kind of container request is allowed to have a 'not empty' node label. So, for a container request with a 'not ANY' resource name (besides, it should not have a node label), we should use the resource name to match with the node instead of the node label. And it should be safe, since a node which is not accessible for the queue will not be sent to the leaf queue. Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request containers on these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk. > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.8.0, 2.7.2 > > > labeled node (i.e.
node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > For example: > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: > \{/default-rack}] > The container request: > [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] > {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: > \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] > Current RM capacity scheduler's behavior is: > The node cannot allocate container for the request because of the node label > not matched in the leaf queue assign container. > However, node locality and node label should be two
[jira] [Commented] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348591#comment-16348591 ] genericqa commented on YARN-7872: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 8m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} branch-2.7.2 Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 1m 51s{color} | {color:red} root in branch-2.7.2 failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 29s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed with JDK v1.8.0_151. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed with JDK v9-internal. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} branch-2.7.2 passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 17s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed. 
{color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed with JDK v1.8.0_151. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed with JDK v9-internal. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_151. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_151. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v9-internal. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v9-internal. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 816 unchanged - 1 fixed = 818 total (was 817) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 11s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_151. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v9-internal. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v9-internal. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 15m 17s{color} | {color:black} {color} | \\ \\ ||
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that (at least for versions 2.7 and 2.8), the node cannot allocate a container for the request, because the node label is not matched when the leaf queue assigns a container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for a container request. And the node label matching should only be executed for a container request with the ANY resource name, since only this kind of container request is allowed to have a 'not empty' node label. So, for a container request with a 'not ANY' resource name (so, we clearly know it should not have a node label), we should use the requested resource name to match with the node instead of using the requested node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request containers within these labeled nodes.
If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If it is not acceptable (i.e. the current behavior is by design), how can we use locality to request containers within these labeled nodes? was: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that (at least for versions 2.7 and 2.8), the node cannot allocate a container for the request, because the node label is not matched when the leaf queue assigns a container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for a container request. And the node label matching should only be executed for a container request with the ANY resource name, since only this kind of container request is allowed to have a 'not empty' node label. So, for a container request with a 'not ANY' resource name (so, we know it should not have a node label), we should use the requested resource name to match with the node instead of using the requested node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request containers within these labeled nodes.
If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If it is not acceptable (i.e. the current behavior is by design), how can we use locality to request containers within these labeled nodes? > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > *Issue summary:* > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request
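The proposed rule (match by node label only for ANY requests; match by resource name, ignoring the node label, for locality-specified requests) can be sketched as a tiny self-contained Java model. This is not the actual CapacityScheduler code from the patch; the class and method names are hypothetical, chosen only to illustrate the principle:

```java
/**
 * Minimal sketch of the matching principle proposed above.
 * NOT CapacityScheduler code; all names here are hypothetical.
 */
public class NodeMatchSketch {
    static final String ANY = "*"; // ResourceRequest.ANY in YARN is "*"

    /**
     * Decide whether a node can serve a request under the proposed rule:
     * - ANY requests may carry a node label, so match by node label;
     * - host-specific requests carry no label, so match by resource name only.
     */
    static boolean canAssign(String reqResourceName, String reqLabel,
                             String nodeHost, String nodeLabel) {
        if (ANY.equals(reqResourceName)) {
            // ANY request: node label matching applies
            // (null or empty label both mean "no label")
            String want = (reqLabel == null) ? "" : reqLabel;
            String have = (nodeLabel == null) ? "" : nodeLabel;
            return want.equals(have);
        }
        // Locality-specified request: ignore the node label entirely and
        // match on the requested resource name (host) instead.
        return reqResourceName.equals(nodeHost);
    }

    public static void main(String[] args) {
        // The example from the description: labeled node "SRG" (label
        // "persistent"), request pinned to host "SRG" with no label.
        System.out.println(canAssign("SRG", null, "SRG", "persistent")); // true
        // An unlabeled ANY request still must not land on the labeled node.
        System.out.println(canAssign(ANY, null, "SRG", "persistent"));   // false
    }
}
```

Under this sketch the example request in the description is assignable to the labeled node, while label enforcement for ANY requests is unchanged; the queue-level accessibility check (which nodes are sent to the leaf queue at all) is assumed to happen before this point, as the description argues.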
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that (at least for versions 2.7 and 2.8), the node cannot allocate a container for the request, because the node label is not matched when the leaf queue assigns a container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for a container request. And the node label matching should only be executed for a container request with the ANY resource name, since only this kind of container request is allowed to have a 'not empty' node label. So, for a container request with a 'not ANY' resource name (so, we know it should not have a node label), we should use the requested resource name to match with the node instead of using the requested node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request containers within these labeled nodes.
If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If it is not acceptable (i.e. the current behavior is by design), how can we use locality to request containers within these labeled nodes? was: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that (at least for versions 2.7 and 2.8), the node cannot allocate a container for the request, because the node label is not matched when the leaf queue assigns a container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for a container request. And the node label matching should only be executed for a container request with the ANY resource name, since only this kind of container request is allowed to have a 'not empty' node label. So, for a container request with a 'not ANY' resource name (so, we know it should not have a node label), we should use the resource name to match with the node instead of using the node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request containers within these labeled nodes.
If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If it is not acceptable (i.e. the current behavior is by design), how can we use locality to request containers within these labeled nodes? > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > *Issue summary:* > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not
[jira] [Commented] (YARN-7829) Rebalance UI2 cluster overview page
[ https://issues.apache.org/jira/browse/YARN-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348609#comment-16348609 ] Gergely Novák commented on YARN-7829: - [~sunilg] Please find the attached screenshot. All I did was move the Node Managers to the 2nd row; I didn't touch the resources. > Rebalance UI2 cluster overview page > --- > > Key: YARN-7829 > URL: https://issues.apache.org/jira/browse/YARN-7829 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7829.001.patch, YARN-7829.jpg, > ui2-cluster-overview.png > > > The cluster overview page looks like an upside-down triangle. It would be > nice to rebalance the charts to ensure horizontal real estate are utilized > properly. The screenshot attachment includes some suggestion for rebalance. > Node Manager status and cluster resource are closely related, it would be > nice to promote the chart to first row. Application Status, and Resource > Availability are closely related. It would be nice to promote Resource usage > to side by side with Application Status to fill up the horizontal real > estates. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-7850: Attachment: Screen Shot 2018-02-01 at 11.37.30.png > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > Attachments: Screen Shot 2018-02-01 at 11.37.30.png, > YARN-7850.001.patch > > > The status of Log Aggregation is not specified anywhere. > New UI should show the Log aggregation status for finished application.
[jira] [Updated] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-7850: Attachment: (was: Screen Shot 2018-02-01 at 11.34.36.png) > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7850.001.patch > > > The status of Log Aggregation is not specified anywhere. > New UI should show the Log aggregation status for finished application.
[jira] [Updated] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-7850: Attachment: YARN-7850.001.patch > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7850.001.patch > > > The status of Log Aggregation is not specified anywhere. > New UI should show the Log aggregation status for finished application.
[jira] [Updated] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-7850: Attachment: (was: YARN-7850.001.patch) > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7850.001.patch > > > The status of Log Aggregation is not specified anywhere. > New UI should show the Log aggregation status for finished application.
[jira] [Commented] (YARN-7829) Rebalance UI2 cluster overview page
[ https://issues.apache.org/jira/browse/YARN-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348573#comment-16348573 ] Sunil G commented on YARN-7829: --- [~GergelyNovak] Could you please attach a screenshot as per the latest patch? Also, please keep the resources on the same line (GPU etc. to be shown later). > Rebalance UI2 cluster overview page > --- > > Key: YARN-7829 > URL: https://issues.apache.org/jira/browse/YARN-7829 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7829.001.patch, ui2-cluster-overview.png > > > The cluster overview page looks like an upside-down triangle. It would be > nice to rebalance the charts to ensure horizontal real estate are utilized > properly. The screenshot attachment includes some suggestion for rebalance. > Node Manager status and cluster resource are closely related, it would be > nice to promote the chart to first row. Application Status, and Resource > Availability are closely related. It would be nice to promote Resource usage > to side by side with Application Status to fill up the horizontal real > estates.
[jira] [Commented] (YARN-7829) Rebalance UI2 cluster overview page
[ https://issues.apache.org/jira/browse/YARN-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348613#comment-16348613 ] Sunil G commented on YARN-7829: --- Perfect! Thanks [~GergelyNovak]. > Rebalance UI2 cluster overview page > --- > > Key: YARN-7829 > URL: https://issues.apache.org/jira/browse/YARN-7829 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7829.001.patch, YARN-7829.jpg, > ui2-cluster-overview.png > > > The cluster overview page looks like an upside-down triangle. It would be > nice to rebalance the charts to ensure horizontal real estate are utilized > properly. The screenshot attachment includes some suggestion for rebalance. > Node Manager status and cluster resource are closely related, it would be > nice to promote the chart to first row. Application Status, and Resource > Availability are closely related. It would be nice to promote Resource usage > to side by side with Application Status to fill up the horizontal real > estates.
[jira] [Assigned] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák reassigned YARN-7850: --- Assignee: Gergely Novák > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > > The status of Log Aggregation is not specified anywhere. > New UI should show the Log aggregation status for finished application.
[jira] [Commented] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348366#comment-16348366 ] Gergely Novák commented on YARN-7850: - In patch #1 I added Log Aggregation Status to the Logs tab. However, the old UI offers more than that: on \{rm}:8088/cluster/logaggregationstatus/\{app_id} it shows a table with all the affected nodes, their log aggregation statuses, and diagnostic messages. In order to present the same on the new UI we need to add this information to the RM Web Services by creating a new API endpoint. [~yeshavora] Can I open a separate ticket for that? > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > Attachments: Screen Shot 2018-02-01 at 11.37.30.png, > YARN-7850.001.patch > > > The status of Log Aggregation is not specified anywhere. > New UI should show the Log aggregation status for finished application.
[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes
[ https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348377#comment-16348377 ] genericqa commented on YARN-7840: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} YARN-3409 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 52s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 12s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 33s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 58s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 13s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in YARN-3409 has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s{color} | {color:green} YARN-3409 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 56s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 10 unchanged - 1 fixed = 10 total (was 11) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 17s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s{color} | {color:green} hadoop-yarn-api in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 14s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 70m 16s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7840 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908758/YARN-7840-YARN-3409.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux ba3d48b04260 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Target Version/s: 2.7.2 (was: 2.8.0, 2.7.2) > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > For example: > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: > \{/default-rack}] > The container request: > [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] > {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: > \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] > Current RM capacity scheduler's behavior is that, the node cannot allocate > container for the request because of the node label not matched in the leaf > queue assign container. > However, node locality and node label should be two orthogonal dimensions to > select candidate nodes for container request. And the node label matching > should only be executed for container request with ANY resource name, since > only this kind of container request is allowed to have 'not empty' node label. > So, for container request with 'not ANY' resource name (so, we know it should > not have node label), we should use resource name to match with the node > instead of using node label to match with the node. 
And this resource name > matching should be safe, since the node whose node label is not accessible > for the queue will not be sent to the leaf queue. > Attachment is the fix according to this principle, please help to review. > Without it, we cannot use locality to request container within these labeled > nodes. > If the fix is acceptable, we should also recheck whether the same issue > happens in trunk. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Fix Version/s: (was: 2.8.0) > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > For example: > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: > \{/default-rack}] > The container request: > [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] > {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: > \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] > Current RM capacity scheduler's behavior is that, the node cannot allocate > container for the request because of the node label not matched in the leaf > queue assign container. > However, node locality and node label should be two orthogonal dimensions to > select candidate nodes for container request. And the node label matching > should only be executed for container request with ANY resource name, since > only this kind of container request is allowed to have 'not empty' node label. > So, for container request with 'not ANY' resource name (so, we know it should > not have node label), we should use resource name to match with the node > instead of using node label to match with the node. 
And this resource name > matching should be safe, since the node whose node label is not accessible > for the queue will not be sent to the leaf queue. > Attachment is the fix according to this principle, please help to review. > Without it, we cannot use locality to request container within these labeled > nodes. > If the fix is acceptable, we should also recheck whether the same issue > happens in trunk. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that (at least for version 2.7 and 2.8), the node cannot allocate container for the request, because the node label is not matched when the leaf queue assign container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If not acceptable (i.e. 
the current behavior is by designed), so, how can we use locality to request container within these labeled nodes? was: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request, because the node label is not matched when the leaf queue assign container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If not acceptable (i.e. 
the current behavior is by designed), so, how can we use locality to request container within these labeled nodes? > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > *Issue summary:* > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > >
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request, because the node label is not matched when the leaf queue assign container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If not acceptable (i.e. 
the current behavior is by designed), so, how can we use locality to request container within these labeled nodes? was: *Issue summary:* labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. *Possible solution:* However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Discussion:* Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions. If not acceptable (i.e. 
the current behavior is by designed), so, how can we use locality to request container within these labeled nodes? > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > *Issue summary:* > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > > *For example:* > The node with available
[jira] [Updated] (YARN-7829) Rebalance UI2 cluster overview page
[ https://issues.apache.org/jira/browse/YARN-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-7829: Attachment: YARN-7829.jpg > Rebalance UI2 cluster overview page > --- > > Key: YARN-7829 > URL: https://issues.apache.org/jira/browse/YARN-7829 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7829.001.patch, YARN-7829.jpg, > ui2-cluster-overview.png > > > The cluster overview page looks like a upside down triangle. It would be > nice to rebalance the charts to ensure horizontal real estate are utilized > properly. The screenshot attachment includes some suggestion for rebalance. > Node Manager status and cluster resource are closely related, it would be > nice to promote the chart to first row. Application Status, and Resource > Availability are closely related. It would be nice to promote Resource usage > to side by side with Application Status to fill up the horizontal real > estates. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7872) labeled node cannot be used to satisfy locality specified request
Yuqi Wang created YARN-7872: --- Summary: labeled node cannot be used to satisfy locality specified request Key: YARN-7872 URL: https://issues.apache.org/jira/browse/YARN-7872 Project: Hadoop YARN Issue Type: Bug Components: capacity scheduler, capacityscheduler, resourcemanager Affects Versions: 2.7.2 Reporter: Yuqi Wang Assignee: Yuqi Wang Fix For: 2.7.2, 2.8.0 labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behaiour is: The node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (besides, it should not have node label), we should use resource name to match with the node instead of node label to match with the node. And it should be safe, since the node which is not accessible for the queue will not be sent in the leaf queue. Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. 
If the fix is acceptable, we should also recheck whether the same issue happens in trunk. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348684#comment-16348684 ] Jim Brennan commented on YARN-7677: --- Thanks [~jlowe] I will put up a patch for branch-2. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (besides, it should not have node label), we should use resource name to match with the node instead of node label to match with the node. And it should be safe, since the node which is not accessible for the queue will not be sent in the leaf queue. Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk. was: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. 
container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behaiour is: The node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (besides, it should not have node label), we should use resource name to match with the node instead of node label to match with the node. And it should be safe, since the node which is not accessible for the queue will not be sent in the leaf queue. Attachment is the fix according to this principle, please help to review. Without it, we cannot use locality to request container within these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue happens in trunk. > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.8.0, 2.7.2 > > > labeled node (i.e. 
node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > For example: > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: > \{/default-rack}] > The container request: > [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] > {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: > \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] > Current RM capacity scheduler's behavior is that, the node cannot allocate > container for the request because of the node label not matched in the leaf > queue assign container. > However, node locality and node label should be two
[jira] [Comment Edited] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348571#comment-16348571 ] Yuqi Wang edited comment on YARN-7872 at 2/1/18 1:26 PM: - Just a init to trigger Jenkins. was (Author: yqwang): Just a init try > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > For example: > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: > \{/default-rack}] > The container request: > [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] > {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: > \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] > Current RM capacity scheduler's behavior is that, the node cannot allocate > container for the request because of the node label not matched in the leaf > queue assign container. > However, node locality and node label should be two orthogonal dimensions to > select candidate nodes for container request. And the node label matching > should only be executed for container request with ANY resource name, since > only this kind of container request is allowed to have 'not empty' node label. 
> So, for container request with 'not ANY' resource name (so, we know it should > not have node label), we should use resource name to match with the node > instead of using node label to match with the node. And this resource name > matching should be safe, since the node whose node label is not accessible > for the queue will not be sent to the leaf queue. > Attachment is the fix according to this principle, please help to review. > Without it, we cannot use locality to request container within these labeled > nodes. > If the fix is acceptable, we should also recheck whether the same issue > happens in trunk and other hadoop versions. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Attachment is the fix according to this principle, please help to review.* *Without it, we cannot use locality to request container within these labeled nodes.* *If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions.* *If not* *acceptable (i.e. 
the current behavior is by designed), so, how can we use* *locality to request container within these labeled nodes?* was: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Attachment is the fix according to this principle, please help to review.* *Without it, we cannot use locality to request container within these labeled nodes.* *If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions.* *If not* *acceptable (i.e. 
the current behavior is by designed), so, how can we use* *locality to request container within labeled nodes?* > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > For example: > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName:
[jira] [Commented] (YARN-7829) Rebalance UI2 cluster overview page
[ https://issues.apache.org/jira/browse/YARN-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348482#comment-16348482 ] Gergely Novák commented on YARN-7829: - ??Small nits that Memory and VCore are also related to system resource, which are more closely related to Cluster Resource on the top row. Would it be better to move Finished Apps and Running Apps to third row??? There are two problems with it: # one might say that Finished/Running Apps are also related to Cluster Resource Usage by Applications in the top left corner # the number of resources in the (current) 3rd line is not necessarily 2. We might have additional resource types (GPUs, etc.; see YARN-7330), so we should leave the opportunity for this last row to "overflow"; we shouldn't add any other charts to it or move it up. > Rebalance UI2 cluster overview page > --- > > Key: YARN-7829 > URL: https://issues.apache.org/jira/browse/YARN-7829 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Gergely Novák >Priority: Major > Attachments: YARN-7829.001.patch, ui2-cluster-overview.png > > > The cluster overview page looks like an upside-down triangle. It would be > nice to rebalance the charts to ensure horizontal real estate is utilized > properly. The screenshot attachment includes some suggestions for rebalancing. > Node Manager status and cluster resource are closely related, so it would be > nice to promote that chart to the first row. Application Status and Resource > Availability are closely related. It would be nice to place Resource usage > side by side with Application Status to fill up the horizontal real > estate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348544#comment-16348544 ] Shane Kumpf commented on YARN-7221: --- I'll just point out that in many organizations the Hadoop administrators are not the same group that has access to manage sudo rules. Enforcing this will make it very challenging and time-consuming to use this feature in some clusters. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch > > > When a Docker container is running with privileges, the majority use case is to have > some program start as root and then drop privileges to another user, e.g. > httpd starts privileged to bind to port 80, then drops privileges to the > www user. > # We should add a security check for submitting users, to verify they have > "sudo" access before running a privileged container. > # We should remove --user=uid:gid for privileged containers. > > Docker can be launched with both the --privileged=true and --user=uid:gid flags. With > this parameter combination, the user will not be able to become the root user. > All docker exec commands will drop to the uid:gid user instead of > being granted privileges. A user can gain root privileges if the container file system > contains files that give the user extra power, but this type of image is > considered dangerous. A non-privileged user can launch a container with > special bits to acquire the same level of root power. Hence, we lose control of > which images should be run with --privileged, and who has sudo rights to use > privileged container images. As a result, we should check for sudo access and > then decide whether to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developers down the wrong path. 
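The sudo-based check proposed in YARN-7221 above — honor --privileged only for users with sudo access, and force --user=uid:gid otherwise — could be sketched roughly as follows. This is a hypothetical simplification, not actual YARN or container-executor code; the class name, method, and the sudo-users lookup are all illustrative assumptions.

```java
import java.util.Set;

// Illustrative sketch of the proposed parameterization (not a real YARN API):
// --privileged=true is only granted to users with sudo rights, and the
// --user=uid:gid remap is applied to everyone else.
class PrivilegedFlagCheck {

    /** Returns the docker run flag to use, or throws if the request is denied. */
    static String pickFlag(String user, boolean requestedPrivileged,
                           Set<String> sudoUsers, String uidGid) {
        if (requestedPrivileged) {
            if (!sudoUsers.contains(user)) {
                // Deny: only users with sudo rights may run privileged containers.
                throw new SecurityException(
                    "user " + user + " is not allowed to run privileged containers");
            }
            // Privileged mode: omit --user so root inside the container is usable.
            return "--privileged=true";
        }
        // Non-privileged mode: always drop to the submitting user's uid:gid.
        return "--user=" + uidGid;
    }
}
```

The key point of the sketch is the either/or decision the description argues for: the two flags are never emitted together.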
[jira] [Updated] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuqi Wang updated YARN-7872: Description: *Issue summary:* A labeled node (i.e. a node with a non-empty node label) cannot be used to satisfy a locality-specified request (i.e. a container request with a 'not ANY' resource name and relax locality set to false). *For example:* The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] The current RM capacity scheduler behavior is that the node cannot allocate a container for the request, because the node label does not match during the leaf queue's container assignment. *Possible solution:* However, node locality and node label should be two orthogonal dimensions for selecting candidate nodes for a container request. Node label matching should only be executed for container requests with the ANY resource name, since only that kind of container request is allowed to carry a non-empty node label. So, for a container request with a 'not ANY' resource name (which therefore cannot have a node label), we should match the node by resource name instead of by node label. This resource name matching should be safe, since a node whose node label is not accessible to the queue will never be sent to the leaf queue. *Discussion:* The attachment is a fix following this principle; please help to review. Without it, we cannot use locality to request containers on these labeled nodes. If the fix is acceptable, we should also recheck whether the same issue exists in trunk and other Hadoop versions. If it is not acceptable (i.e. 
the current behavior is by design), then how can we use locality to request containers within these labeled nodes? was: labeled node (i.e. node with 'not empty' node label) cannot be used to satisfy locality specified request (i.e. container request with 'not ANY' resource name and the relax locality is false). For example: The node with available resource: [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: \{/default-rack}] The container request: [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] Current RM capacity scheduler's behavior is that, the node cannot allocate container for the request because of the node label not matched in the leaf queue assign container. However, node locality and node label should be two orthogonal dimensions to select candidate nodes for container request. And the node label matching should only be executed for container request with ANY resource name, since only this kind of container request is allowed to have 'not empty' node label. So, for container request with 'not ANY' resource name (so, we know it should not have node label), we should use resource name to match with the node instead of using node label to match with the node. And this resource name matching should be safe, since the node whose node label is not accessible for the queue will not be sent to the leaf queue. *Attachment is the fix according to this principle, please help to review.* *Without it, we cannot use locality to request container within these labeled nodes.* *If the fix is acceptable, we should also recheck whether the same issue happens in trunk and other hadoop versions.* *If not* *acceptable (i.e. 
the current behavior is by design), then how can we use locality to request containers within these labeled nodes?* > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > *Issue summary:* > labeled node (i.e. node with 'not empty' node label) cannot be used to > satisfy locality specified request (i.e. container request with 'not ANY' > resource name and the relax locality is false). > > *For example:* > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]]
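The orthogonal matching rule argued for in YARN-7872 above can be sketched as follows. This is a hypothetical simplification of the allocation check, not the attached patch's actual code; the class, method, and parameter names are illustrative. The only YARN-specific fact assumed is that the ANY resource name is the string "*" (ResourceRequest.ANY).

```java
// Illustrative sketch: locality and node label are treated as orthogonal
// dimensions when deciding whether a node can serve a container request.
class LocalityLabelMatch {

    /** Decide whether a node can serve a container request (simplified). */
    static boolean canAllocate(String requestResourceName,
                               String requestNodeLabel,
                               String nodeHostName,
                               String nodeLabel) {
        if (!"*".equals(requestResourceName)) {
            // 'not ANY' resource name: such a request cannot carry a node
            // label, so match by resource name (host) only. The node's label
            // is deliberately not consulted here.
            return requestResourceName.equals(nodeHostName);
        }
        // ANY resource name: only this kind of request may carry a node
        // label, so only here does node-label matching apply.
        String wanted = requestNodeLabel == null ? "" : requestNodeLabel;
        String actual = nodeLabel == null ? "" : nodeLabel;
        return wanted.equals(actual);
    }
}
```

With the example from the report, a request for host SRG with a null label would match the labeled node SRG, because the host name rather than the label is compared; label-based filtering is assumed to have happened earlier, when nodes inaccessible to the queue were excluded.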
[jira] [Reopened] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi reopened YARN-7677: -- > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
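The whitelist behavior YARN-7677 asks for — inject a default such as HADOOP_CONF_DIR only when the container's environment does not already define it — might look roughly like this. This is an illustrative sketch under that assumption, not the NodeManager's actual environment-building code; all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of whitelist-aware environment construction: a
// whitelisted variable's NM default is only applied when the task (or the
// Docker image) has not already set the variable itself.
class EnvWhitelist {

    static Map<String, String> buildEnv(Map<String, String> userEnv,
                                        Map<String, String> nmDefaults,
                                        Set<String> whitelist) {
        Map<String, String> env = new HashMap<>(userEnv);
        for (String var : whitelist) {
            // Only inject the default when the task/image didn't set it, so
            // a Docker image's own HADOOP_CONF_DIR is preserved.
            env.computeIfAbsent(var, nmDefaults::get);
        }
        return env;
    }
}
```

Under this scheme a task that never sets HADOOP_CONF_DIR still gets the NM default, while a Docker image that presets it keeps its own value.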
[jira] [Updated] (YARN-7857) -fstack-check compilation flag causes binary incompatibility for container-executor between RHEL 6 and RHEL 7
[ https://issues.apache.org/jira/browse/YARN-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Brennan updated YARN-7857: -- Attachment: YARN-7857.001.patch > -fstack-check compilation flag causes binary incompatibility for > container-executor between RHEL 6 and RHEL 7 > - > > Key: YARN-7857 > URL: https://issues.apache.org/jira/browse/YARN-7857 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7857.001.patch > > > The segmentation fault in container-executor reported in [YARN-7796] appears > to be due to a binary compatibility issue with the {{-fstack-check}} flag > that was added in [YARN-6721]. > Based on my testing, a container-executor (without the patch from > [YARN-7796]) compiled on RHEL 6 with the -fstack-check flag always hits this > segmentation fault when run on RHEL 7. But if you compile without this flag, > the container-executor runs on RHEL 7 with no problems. I also verified this > with a simple program that just does the copy_file. > I think we need to either remove this flag or find a suitable alternative. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349080#comment-16349080 ] Shane Kumpf commented on YARN-7221: --- Sure. I agree that we need protections in place around the use of --privileged. If sudo is the best way to achieve that goal, I'm fine with that direction. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch > > > When a docker is running with privileges, majority of the use case is to have > some program running with root then drop privileges to another user. i.e. > httpd to start with privileged and bind to port 80, then drop privileges to > www user. > # We should add security check for submitting users, to verify they have > "sudo" access to run privileged container. > # We should remove --user=uid:gid for privileged containers. > > Docker can be launched with --privileged=true, and --user=uid:gid flag. With > this parameter combinations, user will not have access to become root user. > All docker exec command will be drop to uid:gid user to run instead of > granting privileges. User can gain root privileges if container file system > contains files that give user extra power, but this type of image is > considered as dangerous. Non-privileged user can launch container with > special bits to acquire same level of root power. Hence, we lose control of > which image should be run with --privileges, and who have sudo rights to use > privileged container images. As the result, we should check for sudo access > then decide to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developer down the wrong path. 
[jira] [Updated] (YARN-7516) Security check for trusted docker image
[ https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7516: Attachment: YARN-7516.016.patch > Security check for trusted docker image > --- > > Key: YARN-7516 > URL: https://issues.apache.org/jira/browse/YARN-7516 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7516.001.patch, YARN-7516.002.patch, > YARN-7516.003.patch, YARN-7516.004.patch, YARN-7516.005.patch, > YARN-7516.006.patch, YARN-7516.007.patch, YARN-7516.008.patch, > YARN-7516.009.patch, YARN-7516.010.patch, YARN-7516.011.patch, > YARN-7516.012.patch, YARN-7516.013.patch, YARN-7516.014.patch, > YARN-7516.015.patch, YARN-7516.016.patch > > > Hadoop YARN Services can support using private docker registry images or > docker images from Docker Hub. In the current implementation, Hadoop security is > enforced through username and group membership, and enforces uid:gid > consistency in the docker container and the distributed file system. There is a cloud > use case for the ability to run untrusted docker images on the same cluster > for testing. > The basic requirement for an untrusted container is to ensure all kernel and > root privileges are dropped, and that there is no interaction with the distributed > file system, to avoid contamination. We can probably enforce detection of > untrusted docker images by checking the following: > # If the docker image is from a public Docker Hub repository, the container is > automatically flagged as insecure, disk volume mounts are disabled > automatically, and all kernel capabilities are dropped. > # If the docker image is from a private repository on Docker Hub, and there is a > white list allowing the private repository, disk volume mounts are allowed and > kernel capabilities follow the allowed list. 
> # If the docker image is from a private trusted registry with an image name like > "private.registry.local:5000/centos", and the white list allows this private > trusted registry, disk volume mounts are allowed and kernel capabilities > follow the allowed list. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
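The trust decision in the YARN-7516 list above could be approximated by comparing the image name's registry prefix against a configured white list. This is an illustrative sketch only: it treats any bare or namespaced Docker Hub image as untrusted, ignores Docker Hub's implicit `library/` namespace, and is not the attached patch's actual logic.

```java
import java.util.List;

// Illustrative sketch of a registry-prefix trust check (hypothetical names):
// images from a whitelisted private registry are trusted (mounts/capabilities
// allowed per policy); anything else is treated as an untrusted Hub image.
class TrustedImageCheck {

    static boolean isTrusted(String image, List<String> trustedRegistries) {
        // "private.registry.local:5000/centos" -> registry part before first '/'.
        int slash = image.indexOf('/');
        if (slash < 0) {
            // Bare name like "centos" implies public Docker Hub: untrusted.
            return false;
        }
        String registry = image.substring(0, slash);
        // A Hub namespace like "someuser/centos" also fails this lookup,
        // which matches the "public repository is insecure" rule above.
        return trustedRegistries.contains(registry);
    }
}
```

A real implementation would also have to distinguish Hub namespaces from registry host names (e.g. by requiring a '.' or ':' in the registry part), which this sketch deliberately omits.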
[jira] [Updated] (YARN-5028) RMStateStore should trim down app state for completed applications
[ https://issues.apache.org/jira/browse/YARN-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergo Repas updated YARN-5028: -- Attachment: YARN-5028.000.patch > RMStateStore should trim down app state for completed applications > -- > > Key: YARN-5028 > URL: https://issues.apache.org/jira/browse/YARN-5028 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Gergo Repas >Priority: Major > Attachments: YARN-5028.000.patch > > > RMStateStore stores enough information to recover applications in case of a > restart. The store also retains this information for completed applications > to serve their status to REST, WebUI, Java and CLI clients. We don't need all > the information we store today to serve application status; for instance, we > don't need the {{ApplicationSubmissionContext}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7446) Docker container privileged mode and --user flag contradict each other
[ https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349064#comment-16349064 ] Shane Kumpf commented on YARN-7446: --- Thanks for clarifying. That approach sounds good to me. > Docker container privileged mode and --user flag contradict each other > -- > > Key: YARN-7446 > URL: https://issues.apache.org/jira/browse/YARN-7446 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7446.001.patch > > > In the current implementation, when privileged=true, the --user flag is also > passed to docker when launching the container. In reality, the container has no > way to use root privileges unless there is a sticky bit or a sudoers entry in the image > that lets the specified user regain privileges. To avoid > dropping and then reacquiring root privileges, we can avoid > specifying both flags. When privileged mode is enabled, the --user flag should be > omitted. When non-privileged mode is enabled, the --user flag is supplied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7868) Provide improved error message when YARN service is disabled
[ https://issues.apache.org/jira/browse/YARN-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349123#comment-16349123 ] Chandni Singh commented on YARN-7868: - +1 lgtm > Provide improved error message when YARN service is disabled > > > Key: YARN-7868 > URL: https://issues.apache.org/jira/browse/YARN-7868 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-native-services >Affects Versions: 3.1.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7868.001.patch > > > Some YARN CLI command will throw verbose error message when YARN service is > disabled. The error message looks like this: > {code} > Jan 31, 2018 4:24:46 PM com.sun.jersey.api.client.ClientResponse getEntity > SEVERE: A message body reader for Java class > org.apache.hadoop.yarn.service.api.records.ServiceStatus, and Java type class > org.apache.hadoop.yarn.service.api.records.ServiceStatus, and MIME media type > application/octet-stream was not found > Jan 31, 2018 4:24:46 PM com.sun.jersey.api.client.ClientResponse getEntity > SEVERE: The registered message body readers compatible with the MIME media > type are: > application/octet-stream -> > com.sun.jersey.core.impl.provider.entity.ByteArrayProvider > com.sun.jersey.core.impl.provider.entity.FileProvider > com.sun.jersey.core.impl.provider.entity.InputStreamProvider > com.sun.jersey.core.impl.provider.entity.DataSourceProvider > com.sun.jersey.core.impl.provider.entity.RenderedImageProvider > */* -> > com.sun.jersey.core.impl.provider.entity.FormProvider > com.sun.jersey.core.impl.provider.entity.StringProvider > com.sun.jersey.core.impl.provider.entity.ByteArrayProvider > com.sun.jersey.core.impl.provider.entity.FileProvider > com.sun.jersey.core.impl.provider.entity.InputStreamProvider > com.sun.jersey.core.impl.provider.entity.DataSourceProvider > com.sun.jersey.core.impl.provider.entity.XMLJAXBElementProvider$General > com.sun.jersey.core.impl.provider.entity.ReaderProvider > 
com.sun.jersey.core.impl.provider.entity.DocumentProvider > com.sun.jersey.core.impl.provider.entity.SourceProvider$StreamSourceReader > com.sun.jersey.core.impl.provider.entity.SourceProvider$SAXSourceReader > com.sun.jersey.core.impl.provider.entity.SourceProvider$DOMSourceReader > com.sun.jersey.json.impl.provider.entity.JSONJAXBElementProvider$General > com.sun.jersey.json.impl.provider.entity.JSONArrayProvider$General > com.sun.jersey.json.impl.provider.entity.JSONObjectProvider$General > com.sun.jersey.core.impl.provider.entity.XMLRootElementProvider$General > com.sun.jersey.core.impl.provider.entity.XMLListElementProvider$General > com.sun.jersey.core.impl.provider.entity.XMLRootObjectProvider$General > com.sun.jersey.core.impl.provider.entity.EntityHolderReader > com.sun.jersey.json.impl.provider.entity.JSONRootElementProvider$General > com.sun.jersey.json.impl.provider.entity.JSONListElementProvider$General > com.sun.jersey.json.impl.provider.entity.JacksonProviderProxy > com.fasterxml.jackson.jaxrs.json.JacksonJsonProvider > 2018-01-31 16:24:46,415 ERROR client.ApiServiceClient: > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7516) Security check for trusted docker image
[ https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349150#comment-16349150 ] Eric Yang commented on YARN-7516: - - Patch 16 drops the launch command for untrusted images. - For now, drop all privileges for untrusted images, and deny untrusted images from running with the privileged=true flag. The idea behind YARN-7221, YARN-7446, and YARN-7516 is to mimic basic sudo security. Privileged users can gain access to run trusted binaries as another user or in a multi-process container. If the binary is not trusted, run the image with the least amount of privileges in a sandbox. For now, I don't preset capabilities for untrusted images to simulate root in a sandbox. The risk outweighs the benefit, so we err on the side of caution. YARN mode, like docker container mode, is safeguarded by user mapping (YARN-4266); there is no impersonation capability. For this reason, we don't need two ACL lists to track which capabilities to turn on for each mode of docker images. > Security check for trusted docker image > --- > > Key: YARN-7516 > URL: https://issues.apache.org/jira/browse/YARN-7516 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7516.001.patch, YARN-7516.002.patch, > YARN-7516.003.patch, YARN-7516.004.patch, YARN-7516.005.patch, > YARN-7516.006.patch, YARN-7516.007.patch, YARN-7516.008.patch, > YARN-7516.009.patch, YARN-7516.010.patch, YARN-7516.011.patch, > YARN-7516.012.patch, YARN-7516.013.patch, YARN-7516.014.patch, > YARN-7516.015.patch, YARN-7516.016.patch > > > Hadoop YARN Services can support using a private docker registry image or a > docker image from docker hub. In the current implementation, Hadoop security is > enforced through username and group membership, and enforces uid:gid > consistency in the docker container and distributed file system. 
There is a cloud > use case for having the ability to run untrusted docker images on the same cluster > for testing. > The basic requirement for an untrusted container is to ensure all kernel and > root privileges are dropped, and there is no interaction with the distributed > file system, to avoid contamination. We can probably enforce detection of > untrusted docker images by checking the following: > # If the docker image is from a public docker hub repository, the container is > automatically flagged as insecure, disk volume mounts are disabled > automatically, and all kernel capabilities are dropped. > # If the docker image is from a private repository in docker hub, and there is a > white list that allows the private repository, disk volume mounts are allowed and > kernel capabilities follow the allowed list. > # If the docker image is from a private trusted registry, with an image name like > "private.registry.local:5000/centos", and the white list allows this private > trusted registry, disk volume mounts are allowed and kernel capabilities > follow the allowed list. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
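The three detection rules in the YARN-7516 description can be sketched as a simple classifier. This is an illustrative sketch under stated assumptions, not the actual patch: the whitelist contents and the string heuristics for distinguishing a registry host from a Docker Hub namespace are assumptions.

```python
# Illustrative classifier for the three rules above; not the actual patch.
TRUSTED_REGISTRIES = {"private.registry.local:5000"}   # rule 3 whitelist (assumed)
TRUSTED_HUB_REPOS = {"mycompany"}                      # rule 2 whitelist (assumed)

def classify_image(image: str) -> str:
    """Return 'trusted' if mounts/capabilities may be granted, else 'untrusted'."""
    if "/" not in image:
        # Bare names like "centos" resolve to the public Docker Hub library:
        # rule 1 -> untrusted; drop all capabilities, disable volume mounts.
        return "untrusted"
    prefix = image.rsplit("/", 1)[0]
    if ":" in prefix or "." in prefix:
        # Looks like a registry host, e.g. "private.registry.local:5000/centos" (rule 3).
        return "trusted" if prefix in TRUSTED_REGISTRIES else "untrusted"
    # Otherwise a Docker Hub namespace, e.g. "mycompany/centos" (rule 2).
    return "trusted" if prefix in TRUSTED_HUB_REPOS else "untrusted"
```

Anything that does not match a whitelist falls through to "untrusted", which matches the fail-closed stance taken in the comments above.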
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349117#comment-16349117 ] Jim Brennan commented on YARN-7677: --- Thanks [~shaneku...@gmail.com] and [~billie.rinaldi], I will try out that change. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349168#comment-16349168 ] Jim Brennan commented on YARN-7677: --- Agreed - given that we are just processing the hash map in order, it seems like we've just been getting lucky that the variables on which the classpath depends are coming before it in the launch_container.sh script. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7778) Merging of constraints defined at different levels
[ https://issues.apache.org/jira/browse/YARN-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349178#comment-16349178 ] Konstantinos Karanasos commented on YARN-7778: -- +1 on latest patch, thanks [~cheersyang]. I will do a minor fix to say that the "constraint" is coming from the SchedulingRequest when I commit the patch, if you don't mind. > Merging of constraints defined at different levels > -- > > Key: YARN-7778 > URL: https://issues.apache.org/jira/browse/YARN-7778 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Konstantinos Karanasos >Assignee: Weiwei Yang >Priority: Major > Attachments: Merge Constraints Solution.pdf, > YARN-7778-YARN-7812.001.patch, YARN-7778-YARN-7812.002.patch, > YARN-7778.003.patch, YARN-7778.004.patch > > > When we have multiple constraints defined for a given set of allocation tags > at different levels (i.e., at the cluster, the application or the scheduling > request level), we need to merge those constraints. > Defining constraint levels as cluster > application > scheduling request, > constraints defined at lower levels should only be more restrictive than > those of higher levels. Otherwise the allocation should fail. > For example, if there is an application level constraint that allows no more > than 5 HBase containers per rack, a scheduling request can further restrict > that to 3 containers per rack but not to 7 containers per rack. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
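The restriction rule in the YARN-7778 example — a lower level (cluster > application > scheduling request) may only tighten, never loosen, a cardinality constraint — can be sketched like this. A minimal sketch with hypothetical names, not the actual merge implementation in the patch.

```python
# Sketch of merging a max-cardinality constraint across constraint levels:
# cluster > application > scheduling request; lower levels may only tighten.

def merge_max_cardinality(higher: int, lower: int) -> int:
    """Merge a per-rack max-cardinality constraint from two adjacent levels.

    Raises ValueError if the lower level tries to loosen the higher one,
    in which case the allocation should fail."""
    if lower > higher:
        raise ValueError(
            f"lower-level constraint {lower} loosens higher-level {higher}")
    return lower

# App level allows <= 5 HBase containers per rack; a request tightens to 3: OK.
merged = merge_max_cardinality(5, 3)

# Tightening to 7 would actually loosen the app-level constraint and must fail.
try:
    merge_max_cardinality(5, 7)
    loosened_allowed = True
except ValueError:
    loosened_allowed = False
```

The same "only more restrictive" check would apply pairwise down the level hierarchy.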
[jira] [Commented] (YARN-7446) Docker container privileged mode and --user flag contradict each other
[ https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349008#comment-16349008 ] Eric Yang commented on YARN-7446: - [~shaneku...@gmail.com] It would be better to leave --user 0:0 out for a few reasons. 1. If a privileged user uses --privileged and the docker container has a defined service user, e.g. Hive, then removing --user 0:0 allows a system administrator, such as Eric, to have "sudo"-like behavior on a YARN cluster (given that the sudoers check happens in YARN-7221), although the hive user is dropped to normal privileges. This provides a sudo-like mechanism in a secure manner for trusted docker images in YARN-7516. 2. If a privileged user mistakenly runs the --privileged flag with a normal-user container image, he will be able to discover his mistake. 3. If the image does not have a predefined user, then full root capability is given. With the changes in YARN-7446, YARN-7221, and YARN-7516, these three JIRAs provide system administrators a way to run authorized executables on the system with privileges in docker images. This is the same concept as a sudoers list that authorizes users to run authorized binaries. The changes help make the system compliant with Linux security. I think it is better to avoid hard-coding --user 0:0 to make sure the #1 and #2 corner cases are properly supported. > Docker container privileged mode and --user flag contradict each other > -- > > Key: YARN-7446 > URL: https://issues.apache.org/jira/browse/YARN-7446 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7446.001.patch > > > In the current implementation, when privileged=true, --user flag is also > passed to docker for launching container. In reality, the container has no > way to use root privileges unless there is sticky bit or sudoers in the image > for the specified user to gain privileges again. 
To avoid duplication of > dropping and reacquire root privileges, we can reduce the duplication of > specifying both flag. When privileged mode is enabled, --user flag should be > omitted. When non-privileged mode is enabled, --user flag is supplied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
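The rule in the YARN-7446 description — omit --user when privileged mode is enabled, supply it otherwise — can be sketched as argument construction. This is a hypothetical helper for illustration; the real logic lives in YARN's Docker container runtime and container-executor.

```python
# Sketch: build docker run arguments per the rule above. Hypothetical helper,
# not the actual DockerLinuxContainerRuntime code.

def build_docker_run_args(image: str, uid: int, gid: int,
                          privileged: bool) -> list:
    args = ["docker", "run"]
    if privileged:
        # Privileged mode: omit --user so the image's own USER (or root)
        # applies, matching the sudo-like model discussed above.
        args.append("--privileged")
    else:
        # Non-privileged mode: pin the container to the submitting user.
        args.extend(["--user", f"{uid}:{gid}"])
    args.append(image)
    return args
```

With this shape, a privileged launch of an image that defines its own service user (the Hive example above) drops to that user naturally, instead of being forced to 0:0.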
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349119#comment-16349119 ] Shane Kumpf commented on YARN-7677: --- I'm not sure if it will be appropriate to address here, but I think we need to improve how we handle the ordering of the environment variables within the launch script. Right now it depends on hash map ordering... We likely need to ensure that any variable values are expanded out or defined prior to use to avoid this kind of issue. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7446) Docker container privileged mode and --user flag contradict each other
[ https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349008#comment-16349008 ] Eric Yang edited comment on YARN-7446 at 2/1/18 6:00 PM: - [~shaneku...@gmail.com] It would be better to leave --user 0:0 out for a few reasons. 1. If a privileged user uses --privileged and the docker container has a defined service user, e.g. Hive, then removing --user 0:0 allows a system administrator, such as Eric, to have "sudo"-like behavior on a YARN cluster (given that the sudoers check happens in YARN-7221), although the hive user is dropped to normal privileges. This provides a sudo-like mechanism in a secure manner for trusted docker images in YARN-7516. 2. If a privileged user mistakenly runs the --privileged flag with a normal-user container image, he will be able to discover his mistake. 3. If the image does not have a predefined user, then full root capability is given. With the changes in YARN-7446, YARN-7221, and YARN-7516, these three JIRAs provide system administrators a way to run authorized executables on the system with privileges in docker images. This is the same concept as a sudoers list that authorizes users to run authorized binaries with privileges. The changes help make the system compliant with Linux security. I think it is better to avoid hard-coding --user 0:0 to make sure the #1 and #2 corner cases are properly supported. was (Author: eyang): [~shaneku...@gmail.com] It would be better to leave --user 0:0 out for some reasons. 1. If a privileged user use --privileged and docker container has a defined a service user. i.e. Hive. By remove --user 0:0, this allows a system administrator, such as Eric to have "sudo" like behavior on YARN cluster (given that sudoers check happens in YARN-7221). Although hive user is dropped to normal privileges. This provides sudo like mechanism in a secure manner for trusted docker images in YARN-7516. 2. 
If a privileged user made a mistake to run --privileged flag with normal user container image. He will have ability to discover his mistakes. 3. If the image does not have a predefined user, then full root capability is given. With changes in YARN-7446, YARN-7221, and YARN-7516. These three JIRA provides system administrator a way to run authorized executable on the system with privileges in docker images. This is the same concept as sudoers list to authorize users to run authorized binaries. The changes are to help the system compliant with Linux security. I think it is better to avoid hard code --user 0:0 to make sure #1, and #2 corner cases are properly supported. > Docker container privileged mode and --user flag contradict each other > -- > > Key: YARN-7446 > URL: https://issues.apache.org/jira/browse/YARN-7446 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7446.001.patch > > > In the current implementation, when privileged=true, --user flag is also > passed to docker for launching container. In reality, the container has no > way to use root privileges unless there is sticky bit or sudoers in the image > for the specified user to gain privileges again. To avoid duplication of > dropping and reacquire root privileges, we can reduce the duplication of > specifying both flag. When privileged mode is enabled, --user flag should be > omitted. When non-privileged mode is enabled, --user flag is supplied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349024#comment-16349024 ] Jim Brennan commented on YARN-7677: --- [~shaneku...@gmail.com], [~jlowe], [~ebadger] I have verified that this does not fail in the same way when I revert this change, so I agree with [~ebadger], we should revert this change until I can find a fix. It is not immediately clear to me why it is failing. Apologies for not catching this before submitting my patch. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349042#comment-16349042 ] Shane Kumpf commented on YARN-7677: --- [~Jim_Brennan] - Thanks for the update. [~billie.rinaldi] and I have been looking into it as well and we believe we have it figured out. When comparing launch_container.sh with and without the change, HADOOP_CONF_DIR, HADOOP_COMMON_HOME, HADOOP_HDFS_HOME, etc are defined before CLASSPATH. With this change, all of the whitelisted env processing happens last, so those variables are the last to be defined. Moving the whitelist processing before the rest of the environment processing fixed the issue for us. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
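The fix Shane describes — emit the whitelisted variables before the rest so that CLASSPATH's references are already defined instead of relying on hash-map iteration order — can be sketched as ordered script generation. An illustrative sketch, not the actual ContainerLaunch code; the whitelist contents and function name are assumptions.

```python
# Sketch of deterministic env emission for launch_container.sh:
# whitelisted variables (HADOOP_CONF_DIR, HADOOP_COMMON_HOME, ...) first,
# so later variables such as CLASSPATH can reference them.
WHITELIST = ["HADOOP_CONF_DIR", "HADOOP_COMMON_HOME", "HADOOP_HDFS_HOME"]

def emit_env_exports(env: dict) -> list:
    """Return 'export KEY="VALUE"' lines with whitelist entries first,
    instead of relying on hash-map iteration order."""
    lines = []
    for key in WHITELIST:
        if key in env:
            lines.append(f'export {key}="{env[key]}"')
    for key, value in env.items():
        if key not in WHITELIST:
            lines.append(f'export {key}="{value}"')
    return lines

script = emit_env_exports({
    "CLASSPATH": "$HADOOP_CONF_DIR:...",
    "HADOOP_CONF_DIR": "/etc/hadoop/conf",
})
```

Even though CLASSPATH appears first in the input map, HADOOP_CONF_DIR is exported before it, so the shell expansion in CLASSPATH resolves correctly.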
[jira] [Updated] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7677: - Fix Version/s: (was: 3.0.1) (was: 3.1.0) I reverted this from trunk and branch-3.0. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349085#comment-16349085 ] Hudson commented on YARN-7677: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13598 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13598/]) Revert "YARN-7677. Docker image cannot set HADOOP_CONF_DIR. Contributed (jlowe: rev 682ea21f2bbc587e1b727b3c895c2f513a908432) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/runtime/ContainerRuntime.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DefaultLinuxContainerRuntime.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DelegatingLinuxContainerRuntime.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java * (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7850) New UI does not show status for Log Aggregation
[ https://issues.apache.org/jira/browse/YARN-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348832#comment-16348832 ] Yesha Vora commented on YARN-7850: -- yes sure. Thanks for picking up this Jira. > New UI does not show status for Log Aggregation > --- > > Key: YARN-7850 > URL: https://issues.apache.org/jira/browse/YARN-7850 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Gergely Novák >Priority: Major > Attachments: Screen Shot 2018-02-01 at 11.37.30.png, > YARN-7850.001.patch > > > The status of Log Aggregation is not specified any where. > New UI should show the Log aggregation status for finished application. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348876#comment-16348876 ] Eric Badger commented on YARN-7677: --- If all AMs are failing in [~shaneku...@gmail.com]'s case, shouldn't we revert first, and ask questions later? We don't want to destabilize the build. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler
[ https://issues.apache.org/jira/browse/YARN-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348714#comment-16348714 ] Arun Suresh commented on YARN-7819: --- Updated patch, fixing findbugs and checkstyle issues. [~templedf], bq. Will calling the assignment node-local in the metrics update confuse the metrics? What if it's not actually node local? So, by definition, attemptAllocationOnNode will always be a node-local request; it SHOULD therefore update the metrics. > Allow PlacementProcessor to be used with the FairScheduler > -- > > Key: YARN-7819 > URL: https://issues.apache.org/jira/browse/YARN-7819 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh >Priority: Major > Attachments: YARN-7819-YARN-6592.001.patch, > YARN-7819-YARN-7812.001.patch, YARN-7819.002.patch, YARN-7819.003.patch > > > The FairScheduler needs to implement the > {{ResourceScheduler#attemptAllocationOnNode}} function for the processor to > support the FairScheduler. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1634#comment-1634 ] Eric Yang commented on YARN-7221: - [~shaneku...@gmail.com] How about getting this in, and the community can contribute a separate ACL mechanism when the need arises? This will ensure that we err on the side of caution instead of giving too much power to non-privileged Linux users. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch > > > When a docker is running with privileges, majority of the use case is to have > some program running with root then drop privileges to another user. i.e. > httpd to start with privileged and bind to port 80, then drop privileges to > www user. > # We should add security check for submitting users, to verify they have > "sudo" access to run privileged container. > # We should remove --user=uid:gid for privileged containers. > > Docker can be launched with --privileged=true, and --user=uid:gid flag. With > this parameter combinations, user will not have access to become root user. > All docker exec command will be drop to uid:gid user to run instead of > granting privileges. User can gain root privileges if container file system > contains files that give user extra power, but this type of image is > considered as dangerous. Non-privileged user can launch container with > special bits to acquire same level of root power. Hence, we lose control of > which image should be run with --privileges, and who have sudo rights to use > privileged container images. As the result, we should check for sudo access > then decide to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developer down the wrong path. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348882#comment-16348882 ] Eric Badger commented on YARN-7221: --- bq. I'll just point out that In many organization the Hadoop administrators are not the same group that has access to manage sudo rules. Enforcing this will make it very challenging and time consuming to use this feature in some clusters. This is certainly true and it could/would be a pain to set this up if the relevant users were not already in the sudoers list. However, from the opposite perspective, it would also be bad for users to be granted sudo access when the administrators did not grant that privilege to them. This is 100% a conversation about usability vs. security in my mind. I tend to lean in the direction of secure by default with options to relax those constraints to increase usability. It's ugly, but an idea could be to have different configurable mechanisms to check for privileged users. One could be the sudo check and a different one could be a container-executor.cfg privileged user list check. I'm not sure if I would even support this, but it's an idea of how to make both of these scenarios work. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch > > > When a docker is running with privileges, majority of the use case is to have > some program running with root then drop privileges to another user. i.e. > httpd to start with privileged and bind to port 80, then drop privileges to > www user. > # We should add security check for submitting users, to verify they have > "sudo" access to run privileged container. > # We should remove --user=uid:gid for privileged containers. 
> > Docker can be launched with the --privileged=true and --user=uid:gid flags. With > this parameter combination, the user will not have access to become the root user. > All docker exec commands will drop to the uid:gid user instead of > granting privileges. A user can gain root privileges if the container file system > contains files that grant extra power, but this type of image is > considered dangerous. A non-privileged user can launch a container with > special bits set to acquire the same level of root power. Hence, we lose control of > which images should be run with --privileged, and who has sudo rights to use > privileged container images. As a result, we should check for sudo access and > then decide whether to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developers down the wrong path. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
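[Editor's note] The decision logic described above — check for sudo access, then choose between `--privileged=true` and `--user=uid:gid` — can be sketched as a small shell function. This is only an illustrative sketch, not the actual container-executor code: `check_sudo` and `docker_flags` are hypothetical names, and the real check would shell out to something like `sudo -l -U "$user"` (stubbed here so the example runs standalone).

```shell
# Hypothetical sketch of the proposed flag selection, NOT the real container-executor.
# check_sudo would really run: sudo -l -U "$1" >/dev/null 2>&1
# Stubbed for illustration: treat "alice" as the only sudoer.
check_sudo() {
  [ "$1" = "alice" ]
}

# Emit the docker run flag: privileged for sudoers, uid:gid drop otherwise.
docker_flags() {
  user="$1"; uid="$2"; gid="$3"
  if check_sudo "$user"; then
    echo "--privileged=true"
  else
    echo "--user=${uid}:${gid}"
  fi
}
```

With this shape, a non-sudoer like `bob` would get `--user=1001:1001`, matching the "never both flags together" reasoning in the comment above.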
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348815#comment-16348815 ] Shane Kumpf commented on YARN-7677: --- Docker is enabled, but the applications in question are not leveraging docker. These are simple apps like MR sleep and distributed shell. All mapreduce classpath settings, yarn.application.classpath, and whitelist env are not set and are using the defaults. I've tried setting these in various ways that used to work, but haven't found a working combination yet. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348810#comment-16348810 ] Jason Lowe commented on YARN-7677: -- Is this with Docker containers or without? There are two main changes with this patch: # HADOOP_CONF_DIR needs to be in the whitelist config to be inherited from the NM. (It is in the default whitelist setting already). # In Docker, whitelisted variables that would be inherited from the NM but are also set by the Docker image will use the Docker image setting instead of the NM setting. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348841#comment-16348841 ] Jason Lowe commented on YARN-7677: -- Ultimately one way to debug this would be to compare the container launch scripts between the two scenarios (i.e.: with and without YARN-7677 applied). The only difference should be that some variables in the launch script will have the export var=$\{_var_:-_value_\} syntax that didn't before. In the non-Docker case, all of those variables should be getting the NM settings unless somehow those variables already are set to _different_ values in the environment before the launch script runs. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
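[Editor's note] The `export var=${_var_:-_value_}` syntax Jason Lowe describes is plain shell default-value expansion: the NM value is used only when the variable is not already set, which is exactly how a Docker image's preset `HADOOP_CONF_DIR` can win. A minimal sketch (the paths are made-up examples):

```shell
# Case 1: variable not preset (non-Docker, or image does not set it)
# -> the :- default (the NM value) is used.
unset HADOOP_CONF_DIR
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
NM_CASE="$HADOOP_CONF_DIR"

# Case 2: variable preset (e.g. by the Docker image) before the
# launch script runs -> the preset value survives the same line.
export HADOOP_CONF_DIR="/opt/image/conf"
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
IMAGE_CASE="$HADOOP_CONF_DIR"

echo "$NM_CASE $IMAGE_CASE"
```

This is why, in the non-Docker case, behavior should be unchanged unless something else already set those variables to different values before the launch script ran.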
[jira] [Commented] (YARN-7857) -fstack-check compilation flag causes binary incompatibility for container-executor between RHEL 6 and RHEL 7
[ https://issues.apache.org/jira/browse/YARN-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348926#comment-16348926 ] genericqa commented on YARN-7857: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 28m 5s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 20s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 61m 53s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7857 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908829/YARN-7857.001.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux 0b6933f920b8 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / ae2177d | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19566/testReport/ | | Max. process+thread count | 302 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19566/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > -fstack-check compilation flag causes binary incompatibility for > container-executor between RHEL 6 and RHEL 7 > - > > Key: YARN-7857 > URL: https://issues.apache.org/jira/browse/YARN-7857 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7857.001.patch > > > The segmentation fault in container-executor reported in [YARN-7796]
[jira] [Commented] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler
[ https://issues.apache.org/jira/browse/YARN-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348928#comment-16348928 ] genericqa commented on YARN-7819: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 30s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 12s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 16s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 57s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}112m 47s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | | Unchecked/unconfirmed cast from org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptAllocationOnNode(SchedulerApplicationAttempt, SchedulingRequest, SchedulerNode) At FairScheduler.java:org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptAllocationOnNode(SchedulerApplicationAttempt, SchedulingRequest, SchedulerNode) At FairScheduler.java:[line 1883] | | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.constraint.TestPlacementProcessor | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServiceAppsNodelabel | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7819 | | JIRA Patch URL |
[jira] [Updated] (YARN-7223) Document GPU isolation feature
[ https://issues.apache.org/jira/browse/YARN-7223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7223: - Fix Version/s: 3.1.0 > Document GPU isolation feature > -- > > Key: YARN-7223 > URL: https://issues.apache.org/jira/browse/YARN-7223 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Fix For: 3.1.0 > > Attachments: YARN-7223.wip.001.patch, YARN-7223.wip.001.pdf > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7223) Document GPU isolation feature
[ https://issues.apache.org/jira/browse/YARN-7223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7223: - Priority: Blocker (was: Major) > Document GPU isolation feature > -- > > Key: YARN-7223 > URL: https://issues.apache.org/jira/browse/YARN-7223 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Fix For: 3.1.0 > > Attachments: YARN-7223.wip.001.patch, YARN-7223.wip.001.pdf > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348779#comment-16348779 ] Shane Kumpf commented on YARN-7677: --- [~Jim_Brennan], thanks for putting this together. With this patch in, all AMs are failing to launch with a classpath-related issue in my dev environment. Still looking into the cause, but do you have any thoughts? > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler
[ https://issues.apache.org/jira/browse/YARN-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-7819: -- Attachment: YARN-7819.003.patch > Allow PlacementProcessor to be used with the FairScheduler > -- > > Key: YARN-7819 > URL: https://issues.apache.org/jira/browse/YARN-7819 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh >Priority: Major > Attachments: YARN-7819-YARN-6592.001.patch, > YARN-7819-YARN-7812.001.patch, YARN-7819.002.patch, YARN-7819.003.patch > > > The FairScheduler needs to implement the > {{ResourceScheduler#attemptAllocationOnNode}} function for the processor to > support the FairScheduler. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348795#comment-16348795 ] Jim Brennan commented on YARN-7677: --- [~shaneku...@gmail.com] are you running with docker? > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348898#comment-16348898 ] Jim Brennan commented on YARN-7677: --- [~shaneku...@gmail.com], I am trying to repro locally. In my dev setup, it is currently working, but I typically run with mapreduce.application.framework.path and mapreduce.application.classpath defined in my mapred-site.xml file, pointing to a tarball in my home dir in hdfs. If i remove those, I do get errors like these: {noformat} Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster Please check whether your etc/hadoop/mapred-site.xml contains the below configuration: yarn.app.mapreduce.am.env HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory} mapreduce.map.env HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory} mapreduce.reduce.env HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory} {noformat} Is this what you are seeing? I don't think this behavior was different before my change, but I'm going to revert it locally and double-check. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. 
[jira] [Updated] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code
[ https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-7781: -- Attachment: YARN-7781.03.patch > Update YARN-Services-Examples.md to be in sync with the latest code > --- > > Key: YARN-7781 > URL: https://issues.apache.org/jira/browse/YARN-7781 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gour Saha >Assignee: Jian He >Priority: Major > Attachments: YARN-7781.01.patch, YARN-7781.02.patch, > YARN-7781.03.patch > > > Update YARN-Services-Examples.md to make the following additions/changes: > 1. Add an additional URL and PUT Request JSON to support flex: > Update to flex up/down the no of containers (instances) of a component of a > service > PUT URL – http://localhost:8088/app/v1/services/hello-world > PUT Request JSON > {code} > { > "components" : [ { > "name" : "hello", > "number_of_containers" : 3 > } ] > } > {code} > 2. Modify all occurrences of /ws/ to /app/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5714) ContainerExecutor does not order environment map
[ https://issues.apache.org/jira/browse/YARN-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349304#comment-16349304 ] genericqa commented on YARN-5714: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s{color} | {color:red} YARN-5714 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-5714 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845159/YARN-5714.006.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19571/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > ContainerExecutor does not order environment map > > > Key: YARN-5714 > URL: https://issues.apache.org/jira/browse/YARN-5714 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.4.1, 2.5.2, 2.7.3, 2.6.4, 3.0.0-alpha1 > Environment: all (linux and windows alike) >Reporter: Remi Catherinot >Assignee: Remi Catherinot >Priority: Trivial > Labels: oct16-medium > Attachments: YARN-5714.001.patch, YARN-5714.002.patch, > YARN-5714.003.patch, YARN-5714.004.patch, YARN-5714.005.patch, > YARN-5714.006.patch > > Original Estimate: 120h > Remaining Estimate: 120h > > When dumping the container launch script, environment variables are dumped > based on the order internally used by the map implementation (hash based). It > does not take into consideration that some env variables may reference each > other, so some env variables must be declared before those > referencing them. 
> In my case, I ended up with LD_LIBRARY_PATH, which depends on > HADOOP_COMMON_HOME, being dumped before HADOOP_COMMON_HOME. Thus it had a > wrong value, so native libraries weren't loaded. Jobs were running, but not > at their best efficiency. This is just one use case hitting that bug, but > I'm sure others may happen as well. > I already have a patch running in my production environment; I estimate > 5 days for packaging the patch in the right fashion for JIRA plus my best > effort to add tests. > Note: the patch is not OS-aware and has a default empty implementation. I will > only implement the Unix version in a first release. I'm not used to Windows env > variable syntax, so it will take me more time/research. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
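[Editor's note] The LD_LIBRARY_PATH failure mode described in YARN-5714 is easy to reproduce in plain shell: a variable that references another expands at the moment it is exported, so emission order determines the result. A minimal sketch (the `/opt/hadoop` path is a made-up example):

```shell
# Hash-map ordering: the referencing variable is emitted before its
# dependency, so ${HADOOP_COMMON_HOME} expands to the empty string.
unset HADOOP_COMMON_HOME LD_LIBRARY_PATH
export LD_LIBRARY_PATH="${HADOOP_COMMON_HOME}/lib/native"
export HADOOP_COMMON_HOME="/opt/hadoop"
BROKEN="$LD_LIBRARY_PATH"

# Dependency-respecting ordering: the referenced variable is
# declared first, so the reference expands to the real path.
unset HADOOP_COMMON_HOME LD_LIBRARY_PATH
export HADOOP_COMMON_HOME="/opt/hadoop"
export LD_LIBRARY_PATH="${HADOOP_COMMON_HOME}/lib/native"

echo "broken=$BROKEN correct=$LD_LIBRARY_PATH"
```

The first ordering silently yields `/lib/native`, which is exactly the "wrong value, native libraries not loaded" symptom in the report.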
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349331#comment-16349331 ] Billie Rinaldi commented on YARN-7677: -- bq. In the general case, we're not going to be able to order the variables without doing a dependency analysis between them It seems as if the dependency analysis ticket stalled due to disagreement about approach. I don't think we necessarily need dependency analysis; the primary use case is AM-defined vars being able to reference NM-defined vars, which we could accomplish by writing NM vars to the launch script first. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
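[Editor's note] Billie Rinaldi's suggestion — write NM-defined variables to the launch script before AM-defined ones, instead of full dependency analysis — covers the common case where AM variables reference NM variables. A minimal sketch of that ordering (variable names and paths are illustrative only):

```shell
# NM-defined variable is written to the launch script first...
export HADOOP_CONF_DIR="/etc/hadoop/conf"

# ...so an AM-defined variable that references it sees the NM value.
export APP_CLASSPATH="${HADOOP_CONF_DIR}:app.jar"

echo "$APP_CLASSPATH"
```

This handles AM-on-NM references without any graph analysis, though it would not help if two AM-defined variables referenced each other.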
[jira] [Commented] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code
[ https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349347#comment-16349347 ] genericqa commented on YARN-7781: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 29s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 34m 17s{color} | {color:red} hadoop-yarn-services-core in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 77m 58s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7781 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908858/YARN-7781.03.patch | | Optional Tests | asflicense mvnsite compile javac javadoc mvninstall unit shadedclient findbugs checkstyle | | uname | Linux 400553c11455 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / dd50f53 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | unit |
[jira] [Commented] (YARN-7516) Security check for trusted docker image
[ https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349232#comment-16349232 ] genericqa commented on YARN-7516: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 6s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 34m 5s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 52s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 92m 16s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7516 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12907540/YARN-7516.015.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux 8def665676ab 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6ca7204 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19567/testReport/ | | Max. process+thread count | 410 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19567/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Security check for trusted docker image > --- > > Key: YARN-7516 > URL: https://issues.apache.org/jira/browse/YARN-7516 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major >
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349262#comment-16349262 ] Eric Yang commented on YARN-7221: - Rebased patch to current trunk. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch, > YARN-7221.003.patch > > > When a Docker container runs with privileges, the majority use case is to have > some program start as root and then drop privileges to another user, e.g. > httpd starts privileged to bind to port 80, then drops privileges to the > www user. > # We should add a security check for submitting users, to verify they have > "sudo" access to run a privileged container. > # We should remove --user=uid:gid for privileged containers. > > Docker can be launched with the --privileged=true and --user=uid:gid flags. With > this parameter combination, the user will not be able to become root. > Every docker exec command will drop to the uid:gid user instead of > granting privileges. A user can gain root privileges if the container file system > contains files that give the user extra power, but this type of image is > considered dangerous. A non-privileged user can launch a container with > special bits to acquire the same level of root power. Hence, we lose control of > which images should be run with --privileged, and who has sudo rights to use > privileged container images. As a result, we should check for sudo access and > then decide whether to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developers down the wrong path. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
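The check-then-parameterize logic described in the issue can be sketched as follows. This is an illustrative Python outline of the policy only, not the actual Java implementation in the NodeManager; `docker_run_flags` is a hypothetical helper name.

```python
# Sketch of the YARN-7221 decision: privileged containers require sudo
# rights; every other container is pinned to the submitting user's uid:gid.
def docker_run_flags(user_has_sudo, wants_privileged, uid, gid):
    if wants_privileged:
        if not user_has_sudo:
            # Reject: only users with sudo access may run privileged images.
            raise PermissionError("privileged containers require sudo access")
        return ["--privileged=true"]      # trusted: container keeps root
    return [f"--user={uid}:{gid}"]        # untrusted: drop to uid:gid

print(docker_run_flags(True, True, 1000, 1000))    # → ['--privileged=true']
print(docker_run_flags(False, False, 1000, 1000))  # → ['--user=1000:1000']
```

The key point mirrored from the comment thread is that `--privileged=true` and `--user=uid:gid` are mutually exclusive here: passing both would silently neuter the privilege grant.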
[jira] [Commented] (YARN-5714) ContainerExecutor does not order environment map
[ https://issues.apache.org/jira/browse/YARN-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349312#comment-16349312 ] Jason Lowe commented on YARN-5714: -- Sorry to come to this late. We ran across an instance of this while debugging some environment variable ordering issues in YARN-7677. While the LinkedHashMap solution sounds nice and clean, I don't think it can work in practice. The problem is that we have more than one list of environment variables: the user-provided list and the inherited variables via the NM whitelist. To make things worse, we don't know how they could be interconnected. The problem in YARN-7677 occurred because variables in the user's settings referenced variables in the NM whitelist, and the whitelist variables were listed after the user variables. Theoretically an admin could set up NM variables that have "plugin" variables that could come from user-provided settings, and thus we can't always assume NM whitelist variables come before user variables. In short, I think we'll have to do some sort of dependency analysis. I'm not a fan of parsing the shell syntax to figure out the dependencies in the value, but I don't see a viable alternative to get all the variables, user-specified and otherwise, listed in the proper order. 
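The dependency analysis described above amounts to a topological sort over variable references. A minimal Python sketch of the idea follows; it is purely illustrative (the real fix would live in the Java ContainerExecutor, and `order_env` is a hypothetical helper, not a YARN API):

```python
import re
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Matches $VAR and ${VAR} references inside a variable's value.
VAR_REF = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")

def order_env(env):
    """Return variable names ordered so referenced variables come first."""
    graph = {}
    for name, value in env.items():
        # Track only references to variables defined in this same map; a
        # self-reference like PATH=$PATH:/bin is assumed to mean the
        # inherited value, so it is not treated as a dependency.
        graph[name] = {r for r in VAR_REF.findall(value) if r in env and r != name}
    # TopologicalSorter raises CycleError on genuinely circular definitions.
    return list(TopologicalSorter(graph).static_order())

env = {
    "LD_LIBRARY_PATH": "$HADOOP_COMMON_HOME/lib/native",
    "HADOOP_COMMON_HOME": "/opt/hadoop",
}
print(order_env(env))  # → ['HADOOP_COMMON_HOME', 'LD_LIBRARY_PATH']
```

This reproduces the YARN-5714 scenario: LD_LIBRARY_PATH is emitted after HADOOP_COMMON_HOME regardless of map iteration order. Full shell syntax (command substitution, quoting) is harder than this regex, which is exactly the reluctance voiced in the comment.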
> ContainerExecutor does not order environment map > > > Key: YARN-5714 > URL: https://issues.apache.org/jira/browse/YARN-5714 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.4.1, 2.5.2, 2.7.3, 2.6.4, 3.0.0-alpha1 > Environment: all (linux and windows alike) >Reporter: Remi Catherinot >Assignee: Remi Catherinot >Priority: Trivial > Labels: oct16-medium > Attachments: YARN-5714.001.patch, YARN-5714.002.patch, > YARN-5714.003.patch, YARN-5714.004.patch, YARN-5714.005.patch, > YARN-5714.006.patch > > Original Estimate: 120h > Remaining Estimate: 120h > > When dumping the launch container script, environment variables are dumped > in the order internally used by the map implementation (hash based). It > does not take into consideration that some env variables may refer to each > other, and so some env variables must be declared before those > referencing them. > In my case, I ended up having LD_LIBRARY_PATH, which depended on > HADOOP_COMMON_HOME, dumped before HADOOP_COMMON_HOME. Thus it had a > wrong value, so native libraries weren't loaded. Jobs were running, but not > at their best efficiency. This is just one use case falling into that bug, but > I'm sure others happen as well. > I already have a patch running in my production environment; I estimate > 5 days for packaging the patch in the right fashion for JIRA plus trying my best > to add tests. > Note: the patch is not OS-aware, with a default empty implementation. I will > only implement the unix version in a 1st release. I'm not used to Windows env > variable syntax, so it will take me more time/research for it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4882) Change the log level to DEBUG for recovering completed applications
[ https://issues.apache.org/jira/browse/YARN-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349432#comment-16349432 ] Jason Lowe commented on YARN-4882: -- bq. From the above code, if RM fails to recover an application or an attempt all the other applications won't be loaded. The same was true even before this patch. The proposed code would change the semantics of application recovery, which is out of the scope of this JIRA. Admins may not want a ResourceManager to say it recovered when not all applications were recovered. Otherwise the RM may appear to be up, the admin thinks everything is fine, when one or more (possibly all!) applications are simply gone. > Change the log level to DEBUG for recovering completed applications > --- > > Key: YARN-4882 > URL: https://issues.apache.org/jira/browse/YARN-4882 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Daniel Templeton >Priority: Major > Labels: oct16-easy > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-4882.001.patch, YARN-4882.002.patch, > YARN-4882.003.patch, YARN-4882.004.patch, YARN-4882.005.patch > > > I think recovering completed applications need not be logged at INFO; it can be > logged at DEBUG instead. The problem seen on large clusters is that if any > issue happens during RM startup and the RM keeps switching, the RM logs are > filled mostly with application recovery messages. > Six lines are logged for one application, as shown in the logs below; considering > that the RM default value for max-completed applications is 10K, each > switch adds 10K*6=60K lines, which is not useful, I feel. 
> {noformat} > 2016-03-01 10:20:59,077 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: Default priority > level is set to application:application_1456298208485_21507 > 2016-03-01 10:20:59,094 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Recovering > app: application_1456298208485_21507 with 1 attempts and final state = > FINISHED > 2016-03-01 10:20:59,100 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > Recovering attempt: appattempt_1456298208485_21507_01 with final state: > FINISHED > 2016-03-01 10:20:59,107 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > appattempt_1456298208485_21507_01 State change from NEW to FINISHED > 2016-03-01 10:20:59,111 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: > application_1456298208485_21507 State change from NEW to FINISHED > 2016-03-01 10:20:59,112 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=rohith > OPERATION=Application Finished - Succeeded TARGET=RMAppManager > RESULT=SUCCESS APPID=application_1456298208485_21507 > {noformat} > The main problem is that important information is missing from the logs from > before the RM became unstable. Even if the log rollback count is 50 or 100, in a > short period all these logs will be rolled out, and the remaining logs contain > only RM switching information, mostly application recovery! > I suggest that at least completed-application recovery should be logged at DEBUG. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349449#comment-16349449 ] Billie Rinaldi commented on YARN-7677: -- They'd have to be told the available versions by the admins, so they could just as easily be told the full paths. :) > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349437#comment-16349437 ] Jason Lowe commented on YARN-7677: -- True, but that assumes the user even knows what the path is. The point of such a setup is to decouple desired java version from where admins installed it. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code
[ https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349226#comment-16349226 ] Gour Saha commented on YARN-7781: - Makes sense. +1 for patch 03. > Update YARN-Services-Examples.md to be in sync with the latest code > --- > > Key: YARN-7781 > URL: https://issues.apache.org/jira/browse/YARN-7781 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gour Saha >Assignee: Jian He >Priority: Major > Attachments: YARN-7781.01.patch, YARN-7781.02.patch, > YARN-7781.03.patch > > > Update YARN-Services-Examples.md to make the following additions/changes: > 1. Add an additional URL and PUT Request JSON to support flex: > Update to flex up/down the no of containers (instances) of a component of a > service > PUT URL – http://localhost:8088/app/v1/services/hello-world > PUT Request JSON > {code} > { > "components" : [ { > "name" : "hello", > "number_of_containers" : 3 > } ] > } > {code} > 2. Modify all occurrences of /ws/ to /app/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
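For reference, the flex update shown in the issue description could be driven from a client as sketched below. This is a hedged illustration using only the URL and JSON payload quoted in this JIRA; `build_flex_request` is a hypothetical helper name, and the endpoint has not been verified against a live ResourceManager.

```python
import json
import urllib.request

def build_flex_request(base_url, service, component, count):
    """Build the PUT request that flexes a component's container count."""
    payload = {"components": [{"name": component, "number_of_containers": count}]}
    return urllib.request.Request(
        f"{base_url}/app/v1/services/{service}",  # note /app/, not the old /ws/
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

req = build_flex_request("http://localhost:8088", "hello-world", "hello", 3)
# urllib.request.urlopen(req)  # would send the request to a running RM
```

The `/app/` path segment reflects change 2 in the description (replacing all `/ws/` occurrences).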
[jira] [Commented] (YARN-5028) RMStateStore should trim down app state for completed applications
[ https://issues.apache.org/jira/browse/YARN-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349286#comment-16349286 ] genericqa commented on YARN-5028: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 39s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 5s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 48s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}109m 27s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMHAForAsyncScheduler | | | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart | | | hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServiceAppsNodelabel | | | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-5028 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908846/YARN-5028.000.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e9d06c47935d 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6ca7204 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/19568/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results |
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349301#comment-16349301 ] Jason Lowe commented on YARN-7677: -- YARN-5714 is very relevant here. In the general case, we're not going to be able to order the variables without doing a dependency analysis between them, and that's what YARN-5714 proposes to do. I'll see what I can do to push that forward, since it looks like a more deterministic ordering will be a prerequisite to doing any sort of change relative to how environment variables are handled. Otherwise we'll risk breaking some case where variable ordering happened to work, and the user has little recourse to restore it to a working condition. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349435#comment-16349435 ] Billie Rinaldi commented on YARN-7677: -- It would be much more straightforward for the user to set JAVA_HOME to their desired value in that case. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6648) [GPG] Add SubClusterCleaner in Global Policy Generator
[ https://issues.apache.org/jira/browse/YARN-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349491#comment-16349491 ] Botong Huang commented on YARN-6648: Committed to YARN-7402. Thanks [~curino] for the review! > [GPG] Add SubClusterCleaner in Global Policy Generator > -- > > Key: YARN-6648 > URL: https://issues.apache.org/jira/browse/YARN-6648 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Labels: federation, gpg > Attachments: YARN-6648-YARN-2915.v1.patch, > YARN-6648-YARN-7402.v2.patch, YARN-6648-YARN-7402.v3.patch, > YARN-6648-YARN-7402.v4.patch, YARN-6648-YARN-7402.v5.patch, > YARN-6648-YARN-7402.v6.patch, YARN-6648-YARN-7402.v7.patch, > YARN-6648-YARN-7402.v8.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code
[ https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349256#comment-16349256 ] Gour Saha commented on YARN-7781: - I filed YARN-7836 for the component name path not used issue. > Update YARN-Services-Examples.md to be in sync with the latest code > --- > > Key: YARN-7781 > URL: https://issues.apache.org/jira/browse/YARN-7781 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gour Saha >Assignee: Jian He >Priority: Major > Attachments: YARN-7781.01.patch, YARN-7781.02.patch, > YARN-7781.03.patch > > > Update YARN-Services-Examples.md to make the following additions/changes: > 1. Add an additional URL and PUT Request JSON to support flex: > Update to flex up/down the no of containers (instances) of a component of a > service > PUT URL – http://localhost:8088/app/v1/services/hello-world > PUT Request JSON > {code} > { > "components" : [ { > "name" : "hello", > "number_of_containers" : 3 > } ] > } > {code} > 2. Modify all occurrences of /ws/ to /app/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7857) -fstack-check compilation flag causes binary incompatibility for container-executor between RHEL 6 and RHEL 7
[ https://issues.apache.org/jira/browse/YARN-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349410#comment-16349410 ] Jim Brennan commented on YARN-7857: --- As this is a change to the compilation command line for container-executor, it was tested by compiling on both RHEL6 and RHEL7 and running the mapreduce pi example. Also ran with the RHEL6-compiled container-executor on RHEL7. I manually tested the change by compiling a small program that just includes a main() that calls a stripped-down version of copy_file(). With {{-fstack-check}}, this program fails when compiled on RHEL 6 and run on RHEL 7. With {{-fstack-protector}}, the RHEL6 version runs on RHEL7. Please review. > -fstack-check compilation flag causes binary incompatibility for > container-executor between RHEL 6 and RHEL 7 > - > > Key: YARN-7857 > URL: https://issues.apache.org/jira/browse/YARN-7857 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7857.001.patch > > > The segmentation fault in container-executor reported in [YARN-7796] appears > to be due to a binary compatibility issue with the {{-fstack-check}} flag > that was added in [YARN-6721]. > Based on my testing, a container-executor (without the patch from > [YARN-7796]) compiled on RHEL 6 with the -fstack-check flag always hits this > segmentation fault when run on RHEL 7. But if you compile without this flag, > the container-executor runs on RHEL 7 with no problems. I also verified this > with a simple program that just does the copy_file. > I think we need to either remove this flag, or find a suitable alternative. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349465#comment-16349465 ] Eric Yang commented on YARN-7677: - [~jlowe] I agree with [~billie.rinaldi] and the YARN-5714 approach. The classic unix approach is to source the system environment first; the user can then override it in their own .profile or .bashrc. The system does not reference user environment variables, to prevent users from doing harm to the system. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349474#comment-16349474 ] Eric Yang commented on YARN-7221: - The failed unit test is not related to this patch. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch, > YARN-7221.003.patch > > > When a docker is running with privileges, majority of the use case is to have > some program running with root then drop privileges to another user. i.e. > httpd to start with privileged and bind to port 80, then drop privileges to > www user. > # We should add security check for submitting users, to verify they have > "sudo" access to run privileged container. > # We should remove --user=uid:gid for privileged containers. > > Docker can be launched with --privileged=true, and --user=uid:gid flag. With > this parameter combinations, user will not have access to become root user. > All docker exec command will be drop to uid:gid user to run instead of > granting privileges. User can gain root privileges if container file system > contains files that give user extra power, but this type of image is > considered as dangerous. Non-privileged user can launch container with > special bits to acquire same level of root power. Hence, we lose control of > which image should be run with --privileges, and who have sudo rights to use > privileged container images. As the result, we should check for sudo access > then decide to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developer down the wrong path.
[jira] [Commented] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code
[ https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349218#comment-16349218 ] Jian He commented on YARN-7781: --- Talked with Billie offline; I reverted the changes about the kerberos principal, as that needs more verification. Uploaded patch03. > Update YARN-Services-Examples.md to be in sync with the latest code > --- > > Key: YARN-7781 > URL: https://issues.apache.org/jira/browse/YARN-7781 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gour Saha >Assignee: Jian He >Priority: Major > Attachments: YARN-7781.01.patch, YARN-7781.02.patch, > YARN-7781.03.patch > > > Update YARN-Services-Examples.md to make the following additions/changes: > 1. Add an additional URL and PUT Request JSON to support flex: > Update to flex up/down the no of containers (instances) of a component of a > service > PUT URL – http://localhost:8088/app/v1/services/hello-world > PUT Request JSON
> {code}
> {
>   "components" : [ {
>     "name" : "hello",
>     "number_of_containers" : 3
>   } ]
> }
> {code}
> 2. Modify all occurrences of /ws/ to /app/
[jira] [Comment Edited] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code
[ https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349256#comment-16349256 ] Gour Saha edited comment on YARN-7781 at 2/1/18 8:59 PM: - I filed YARN-7836 for the component name path not used issue. [~jianhe] feel free to take YARN-7836 and work on a single patch for both. was (Author: gsaha): I filed YARN-7836 for the component name path not used issue. > Update YARN-Services-Examples.md to be in sync with the latest code > --- > > Key: YARN-7781 > URL: https://issues.apache.org/jira/browse/YARN-7781 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gour Saha >Assignee: Jian He >Priority: Major > Attachments: YARN-7781.01.patch, YARN-7781.02.patch, > YARN-7781.03.patch > > > Update YARN-Services-Examples.md to make the following additions/changes: > 1. Add an additional URL and PUT Request JSON to support flex: > Update to flex up/down the no of containers (instances) of a component of a > service > PUT URL – http://localhost:8088/app/v1/services/hello-world > PUT Request JSON
> {code}
> {
>   "components" : [ {
>     "name" : "hello",
>     "number_of_containers" : 3
>   } ]
> }
> {code}
> 2. Modify all occurrences of /ws/ to /app/
[jira] [Updated] (YARN-7866) [UI2] Kerberizing the UI doesn't give any warning or content when UI is accessed without kinit
[ https://issues.apache.org/jira/browse/YARN-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-7866: -- Reporter: Sumana Sathish (was: Sunil G) > [UI2] Kerberizing the UI doesn't give any warning or content when UI is > accessed without kinit > -- > > Key: YARN-7866 > URL: https://issues.apache.org/jira/browse/YARN-7866 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Sumana Sathish >Assignee: Sunil G >Priority: Major > Attachments: YARN-7866.001.patch > > > Handle 401 error and show in UI > credit to [~ssath...@hortonworks.com] for finding this issue.
[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349305#comment-16349305 ] Jim Brennan commented on YARN-7677: --- I have tested a version of the patch where I write out the whitelisted variables first, and it does work for my test cases. But looking at the launch_container.sh that is produced, the order of other variables is not the same as launch_container.sh from before my changes. Since the whitelisted variables are not added to the environment hash map, the order of traversal is different. I'm not comfortable with putting this change in as-is, because while the ordering differences I'm seeing are not a problem in my test cases, there is no guarantee that others would not run into problems due to this change. I discussed this with [~jlowe], and he pointed me at YARN-5714. I think we need to address that problem before putting this change in. > Docker image cannot set HADOOP_CONF_DIR > --- > > Key: YARN-7677 > URL: https://issues.apache.org/jira/browse/YARN-7677 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Eric Badger >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-7677.001.patch, YARN-7677.002.patch > > > Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether > it's set by the user or not. It completely bypasses the whitelist and so > there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes > problems in the Docker use case where Docker containers will set up their own > environment and have their own {{HADOOP_CONF_DIR}} preset in the image > itself.
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349350#comment-16349350 ] genericqa commented on YARN-7221: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 26m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 23s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 59m 17s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7221 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908864/YARN-7221.003.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux a57d2eae51a1 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / dd50f53 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/19570/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19570/testReport/ | | Max. process+thread count | 430 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19570/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch, > YARN-7221.003.patch > > > When a docker is running with privileges,
[jira] [Assigned] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.
[ https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu reassigned YARN-7859: -- Assignee: wangwj > New feature: add queue scheduling deadLine in fairScheduler. > > > Key: YARN-7859 > URL: https://issues.apache.org/jira/browse/YARN-7859 > Project: Hadoop YARN > Issue Type: New Feature > Components: fairscheduler >Affects Versions: 3.0.0 >Reporter: wangwj >Assignee: wangwj >Priority: Major > Labels: fairscheduler, features, patch > Fix For: 3.0.0 > > Attachments: YARN-7859-v1.patch, log, screenshot-1.png, > screenshot-3.png > > Original Estimate: 24h > Remaining Estimate: 24h > > As everyone knows, in FairScheduler the phenomenon of queue scheduling > starvation often occurs when the number of cluster jobs is large, leaving the > apps in one or more queues pending. So I have thought of a way to solve this > problem: add a queue scheduling deadline to FairScheduler. When a queue has not > been scheduled by FairScheduler within the specified time, we forcibly schedule it. > The community's current way of solving queue scheduling starvation is to preempt > containers, but that may increase the failure rate of jobs. > On the basis of the above, I propose this issue...
[jira] [Commented] (YARN-4882) Change the log level to DEBUG for recovering completed applications
[ https://issues.apache.org/jira/browse/YARN-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349243#comment-16349243 ] Giovanni Matteo Fumarola commented on YARN-4882: [~templedf], [~rohithsharma] I have a quick comment about this old patch.
{code:java}
try {
  for (ApplicationStateData appState : appStates.values()) {
    recoverApplication(appState, state);
    count += 1;
  }
} finally {
  LOG.info("Successfully recovered " + count + " out of "
      + appStates.size() + " applications");
}
{code}
From the above code, if the RM fails to recover an application or an attempt, all the other applications won't be loaded. For this reason, the above code should be implemented as:
{code:java}
for (ApplicationStateData appState : appStates.values()) {
  try {
    recoverApplication(appState, state);
    count += 1;
  } catch (Exception e) {
    LOG.error(e);
  }
}
LOG.info("Successfully recovered " + count + " out of "
    + appStates.size() + " applications");
{code}
Thoughts? > Change the log level to DEBUG for recovering completed applications > --- > > Key: YARN-4882 > URL: https://issues.apache.org/jira/browse/YARN-4882 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Daniel Templeton >Priority: Major > Labels: oct16-easy > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-4882.001.patch, YARN-4882.002.patch, > YARN-4882.003.patch, YARN-4882.004.patch, YARN-4882.005.patch > > > I think for recovering completed applications there is no need to log at INFO; > it can be made DEBUG. The problem seen on large clusters is that if any issue > happens during RM start-up with continuous switching, then the RM logs are > filled mostly with recovering applications. > There are 6 lines logged for 1 application, as shown in the logs below; > consider that the RM default value for max-completed applications is 10K. So for > each switch 10K*6=60K lines will be added, which is not useful I feel.
> {noformat}
> 2016-03-01 10:20:59,077 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: Default priority level is set to application:application_1456298208485_21507
> 2016-03-01 10:20:59,094 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Recovering app: application_1456298208485_21507 with 1 attempts and final state = FINISHED
> 2016-03-01 10:20:59,100 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Recovering attempt: appattempt_1456298208485_21507_01 with final state: FINISHED
> 2016-03-01 10:20:59,107 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1456298208485_21507_01 State change from NEW to FINISHED
> 2016-03-01 10:20:59,111 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1456298208485_21507 State change from NEW to FINISHED
> 2016-03-01 10:20:59,112 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=rohith OPERATION=Application Finished - Succeeded TARGET=RMAppManager RESULT=SUCCESS APPID=application_1456298208485_21507
> {noformat}
> The main problem is that important information is missing from the logs before the RM became unstable. Even if the log rollback is 50 or 100 files, in a short period all these logs will be rolled out, and the remaining logs contain only RM switching information, and recovering applications at that!
> I suggest that at least completed-application recovery should be logged at DEBUG.