[jira] [Comment Edited] (YARN-7494) Add muti-node lookup mechanism and pluggable nodes sorting policies to optimize placement decision

2019-05-30 Thread Juanjuan Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851660#comment-16851660
 ] 

Juanjuan Tian  edited comment on YARN-7494 at 5/30/19 9:02 AM:
---

Thanks Weiwei for your reply. There seems to be another issue in 
RegularContainerAllocator#allocate.

Referring to the code below: it iterates through all candidate nodes, but the 
reservedContainer does not change along with the node being iterated. Under a 
multi-node policy, the reservedContainer and the current node can therefore 
become inconsistent and may produce an incorrect ContainerAllocation (even 
though that ContainerAllocation is eventually abandoned, it still wastes a 
scheduling opportunity). [~cheersyang], what's your thought on this situation?

while (iter.hasNext()) {
  FiCaSchedulerNode node = iter.next();

  if (reservedContainer == null) {
    result = preCheckForNodeCandidateSet(clusterResource, node,
        schedulingMode, resourceLimits, schedulerKey);
    if (null != result) {
      continue;
    }
  } else {
    // pre-check when allocating reserved container
    if (application.getOutstandingAsksCount(schedulerKey) == 0) {
      // Release
      result = new ContainerAllocation(reservedContainer, null,
          AllocationState.QUEUE_SKIPPED);
      continue;
    }
  }

  result = tryAllocateOnNode(clusterResource, node, schedulingMode,
      resourceLimits, schedulerKey, reservedContainer);

  if (AllocationState.ALLOCATED == result.getAllocationState()
      || AllocationState.RESERVED == result.getAllocationState()) {
    result = doAllocation(result, node, schedulerKey, reservedContainer);
    break;
  }
}
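
A rough sketch of one possible direction (hypothetical and untested, only to 
illustrate the idea, not a proposed patch): resolve the reservation from the 
node currently being visited via SchedulerNode#getReservedContainer, so that 
the pre-check and tryAllocateOnNode always see a matching node/reservation 
pair:

while (iter.hasNext()) {
  FiCaSchedulerNode node = iter.next();

  // Hypothetical change: take the reservation from the node under
  // iteration instead of reusing one reservedContainer for every node.
  RMContainer nodeReserved = node.getReservedContainer();

  if (nodeReserved == null) {
    result = preCheckForNodeCandidateSet(clusterResource, node,
        schedulingMode, resourceLimits, schedulerKey);
    if (null != result) {
      continue;
    }
  } else {
    // pre-check when allocating reserved container
    if (application.getOutstandingAsksCount(schedulerKey) == 0) {
      // Release
      result = new ContainerAllocation(nodeReserved, null,
          AllocationState.QUEUE_SKIPPED);
      continue;
    }
  }

  result = tryAllocateOnNode(clusterResource, node, schedulingMode,
      resourceLimits, schedulerKey, nodeReserved);

  if (AllocationState.ALLOCATED == result.getAllocationState()
      || AllocationState.RESERVED == result.getAllocationState()) {
    result = doAllocation(result, node, schedulerKey, nodeReserved);
    break;
  }
}

A real fix would also need to verify that the node's reservation actually 
belongs to this application and schedulerKey; the sketch only shows where the 
node/reservation inconsistency comes from.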
 



> Add muti-node lookup mechanism and pluggable nodes sorting policies to 
> optimize placement decision
> --
>
> Key: YARN-7494
> URL: https://issues.apache.org/jira/browse/YARN-7494
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7494.001.patch, YARN-7494.002.patch, 
> YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, 
> YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, 
> YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, 
> YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, 
> YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, 
> YARN-7494.18.patch, YARN-7494.19.patch, YARN-7494.20.patch, 
> YARN-7494.v0.patch, YARN-7494.v1.patch, multi-node-designProposal.png
>
>
> Instead of single node, for effectiveness we can consider a multi node lookup 
> based on partition to start with.



[jira] [Comment Edited] (YARN-7494) Add muti-node lookup mechanism and pluggable nodes sorting policies to optimize placement decision

2019-05-22 Thread tianjuan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845536#comment-16845536
 ] 

tianjuan edited comment on YARN-7494 at 5/23/19 2:22 AM:
-

It seems that ResourceUsageMultiNodeLookupPolicy may cause an application to 
starve forever.

For example, suppose the cluster has 10 nodes (h1, h2, ... h9, h10), each with 
8G of memory, and two queues A and B, each configured with 50% capacity.

First, 10 jobs (each requesting 6G of resource) are submitted to queue A, and 
each of the 10 nodes gets one container allocated.

Afterwards, another job, JobB, requesting 3G of resource is submitted to queue 
B, and one container of 3G is reserved on node h1.

With ResourceUsageMultiNodeLookupPolicy, the node order will always be 
h1, h2, ... h9, h10, so one container is always re-reserved on node h1; no 
other reservation happens, and no preemption happens either, so JobB hangs 
forever. [~sunilg], what's your thought about this situation?
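
To make the arithmetic concrete: each node has 8G - 6G = 2G free, which cannot 
fit JobB's 3G ask, so reserving is the only option; and because every node 
reports identical usage, a usage-based ordering never reorders the candidates. 
A toy sketch of that effect (hypothetical names, not the actual YARN classes):

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class StarvationDemo {
  // Toy stand-in for a node: name plus allocated memory in GB.
  record Node(String name, int allocatedGB) {}

  public static void main(String[] args) {
    List<Node> nodes = new ArrayList<>();
    for (int i = 1; i <= 10; i++) {
      nodes.add(new Node("h" + i, 6)); // every node: 6G of 8G used
    }
    // Usage-based ordering: every key is equal and List.sort is stable,
    // so h1 stays at the head of the candidate list on every heartbeat.
    nodes.sort(Comparator.comparingInt(Node::allocatedGB));
    System.out.println(nodes.get(0).name()); // always h1, so JobB's 3G
                                             // ask is re-reserved there
  }
}

Since the candidate order never changes and h1 never frees 3G at once, the 
same reservation is renewed on every cycle and JobB never runs.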



> Add muti-node lookup mechanism and pluggable nodes sorting policies to 
> optimize placement decision
> --
>
> Key: YARN-7494
> URL: https://issues.apache.org/jira/browse/YARN-7494
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7494.001.patch, YARN-7494.002.patch, 
> YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, 
> YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, 
> YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, 
> YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, 
> YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, 
> YARN-7494.18.patch, YARN-7494.19.patch, YARN-7494.20.patch, 
> YARN-7494.v0.patch, YARN-7494.v1.patch, multi-node-designProposal.png
>
>
> Instead of single node, for effectiveness we can consider a multi node lookup 
> based on partition to start with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


