[jira] [Commented] (MESOS-9806) Address allocator performance regression due to the addition of quota limits.

2019-08-23 Thread Meng Zhu (Jira)


[ https://issues.apache.org/jira/browse/MESOS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914674#comment-16914674 ]

Meng Zhu commented on MESOS-9806:
---------------------------------

As of now, the performance is close to 1.8.1 even with the addition of limits 
enforcement. There will be further improvements as we deprecate the framework 
sorter and optimize the role sorter (MESOS-9942 and MESOS-9943).

> Address allocator performance regression due to the addition of quota limits.
> ------------------------------------------------------------------------------
>
> Key: MESOS-9806
> URL: https://issues.apache.org/jira/browse/MESOS-9806
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Meng Zhu
>Assignee: Meng Zhu
>Priority: Critical
>  Labels: resource-management
>
> In MESOS-9802, we removed the quota role sorter, which was tech debt.
> However, this slows down the allocator. The problem is that in the first
> stage, even though a cluster might have no active roles with non-default
> quota, the allocator will now have to sort and go through each and every
> role in the cluster. Benchmark results show that for 1k roles with 2k
> frameworks, the allocator could experience ~50% performance degradation.
> There are a couple of ways to address this issue. For example, we could
> make the sorter aware of quota and add a method, say `sortQuotaRoles`, to
> return all the roles with non-default quota. Alternatively, an even better
> approach would be to deprecate the sorter concept and just have two
> standalone functions, e.g. sortRoles() and sortQuotaRoles(), that take in
> the role tree structure (which does not yet exist in the allocator) and
> return the sorted roles.
> In addition, when implementing MESOS-8068, we need to do more during the
> allocation cycle. In particular, we need to call shrink many more times
> than before. These all contribute to the performance slowdown.
> Specifically, for the quota-oriented benchmark
> `HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2` we can observe
> a 2-3x slowdown compared to the previous release (1.8.1):
> Current master:
> QuotaParam/BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
> Made 3500 allocations in 32.051382735secs
> Made 0 allocation in 27.976022773secs
> 1.8.1:
> HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
> Made 3500 allocations in 13.810811063secs
> Made 0 allocation in 9.885972984secs
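
To illustrate the `sortQuotaRoles` idea proposed in the description above, here is a minimal sketch in C++. All of the types (`Quota`, `Role`, the `std::map`-based role tree) are hypothetical stand-ins, not the actual Mesos allocator API; in particular, the role tree does not yet exist in the allocator.

{noformat}
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for allocator types.
struct Quota
{
  bool hasGuarantees = false;
  bool hasLimits = false;

  // Default quota: no guarantees and no limits configured.
  bool isDefault() const { return !hasGuarantees && !hasLimits; }
};

struct Role
{
  double drfShare = 0.0;  // DRF dominant share used for ordering.
  Quota quota;
};

// Return only the roles with non-default quota, sorted by DRF share.
// The first allocation stage can then skip the (typically many) roles
// with default quota instead of sorting and visiting every role.
std::vector<std::string> sortQuotaRoles(
    const std::map<std::string, Role>& roleTree)
{
  std::vector<std::string> result;

  for (const auto& entry : roleTree) {
    if (!entry.second.quota.isDefault()) {
      result.push_back(entry.first);
    }
  }

  std::sort(result.begin(), result.end(),
            [&roleTree](const std::string& a, const std::string& b) {
              return roleTree.at(a).drfShare < roleTree.at(b).drfShare;
            });

  return result;
}
{noformat}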





[jira] [Commented] (MESOS-9806) Address allocator performance regression due to the addition of quota limits.

2019-08-23 Thread Meng Zhu (Jira)


[ https://issues.apache.org/jira/browse/MESOS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914673#comment-16914673 ]

Meng Zhu commented on MESOS-9806:
---------------------------------

Taken together, the optimizations improved performance by roughly 50%:

1.8.1:
HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
Made 3500 allocations in 13.810811063secs
Made 0 allocation in 9.885972984secs

Before the optimization:
QuotaParam/BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with drf sorter
Made 3500 allocations in 32.051382735secs
Made 0 allocation in 27.976022773secs

After the optimization:
HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
Made 3500 allocations in 15.385276405secs
Made 0 allocation in 13.718502414secs






[jira] [Commented] (MESOS-9806) Address allocator performance regression due to the addition of quota limits.

2019-08-23 Thread Meng Zhu (Jira)


[ https://issues.apache.org/jira/browse/MESOS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914672#comment-16914672 ]

Meng Zhu commented on MESOS-9806:
---------------------------------

Small vector optimization for ResourceQuantities, ResourceLimits and Resources:

{noformat}
commit 73033130de7872c6f240b9b05dced039d7666138
Author: Meng Zhu 
Date:   Thu Aug 22 17:19:30 2019 -0700

Used boost `small_vector` in `Resources`.

Master + previous patch:
*HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
Made 3500 allocations in 16.307044003secs
Made 0 allocation in 14.948262599secs

Master + previous patch + this patch:
*HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
Made 3500 allocations in 15.385276405secs
Made 0 allocation in 13.718502414secs

Review: https://reviews.apache.org/r/71357

commit 95201cbe4dc87eae2fde5754d16f5effbb6c1974
Author: Meng Zhu 
Date:   Thu Aug 22 16:55:34 2019 -0700

Used boost `small_vector` in Resource Quantities and Limits.

Master + previous patch
*HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
Made 3500 allocations in 16.831380548secs
Made 0 allocation in 15.102885644secs

Master + previous patch + this patch:
*HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
Made 3500 allocations in 16.307044003secs
Made 0 allocation in 14.948262599secs

Review: https://reviews.apache.org/r/71355

commit 25070f232a9bb97d1b78f8a7e5b774bbd50654f9
Author: Meng Zhu 
Date:   Thu Aug 22 16:54:42 2019 -0700

Updated the boost library.

This update includes adding `container/small_vector.hpp`.

Review: https://reviews.apache.org/r/71356
{noformat}
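
The gist of the `small_vector` patches above, sketched under assumptions (the real `ResourceQuantities` interface differs): `boost::container::small_vector<T, N>` stores up to N elements inline, so the common case of only a few resource kinds (cpus, mem, disk, gpus, ports) incurs no heap allocation, which matters when the allocator copies these containers on every cycle.

{noformat}
#include <string>
#include <utility>

#include <boost/container/small_vector.hpp>

// Sketch of the idea behind the patches above, not the actual Mesos code.
class ResourceQuantitiesSketch
{
public:
  // Add `value` to the named quantity, creating the entry if absent.
  void add(const std::string& name, double value)
  {
    for (auto& pair : quantities) {
      if (pair.first == name) {
        pair.second += value;
        return;
      }
    }
    quantities.emplace_back(name, value);
  }

private:
  // Inline storage for up to 7 entries; spills to the heap only beyond
  // that, so typical objects never touch the heap allocator.
  boost::container::small_vector<std::pair<std::string, double>, 7> quantities;
};
{noformat}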







[jira] [Commented] (MESOS-9806) Address allocator performance regression due to the addition of quota limits.

2019-08-23 Thread Meng Zhu (Jira)


[ https://issues.apache.org/jira/browse/MESOS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914670#comment-16914670 ]

Meng Zhu commented on MESOS-9806:
---------------------------------

Optimized the allocation loop:

{noformat}
commit ec6b7b34215e821a63cb79e7d52d94ff08c1e110
Author: Meng Zhu 
Date:   Thu Aug 22 17:54:25 2019 -0700

Optimized the allocation loop.

Master:

HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
Made 3500 allocations in 23.37 secs
Made 0 allocation in 19.72 secs

Master + this patch:

HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/2
Made 3500 allocations in 16.831380548secs
Made 0 allocation in 15.102885644secs

Review: https://reviews.apache.org/r/71359
{noformat}
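
The review link above has the actual change; without claiming this is it, the sketch below shows the general technique such loop optimizations rely on: hoisting work that is invariant across agents out of the per-agent loop, and skipping satisfied roles cheaply. All names are hypothetical.

{noformat}
#include <map>
#include <string>
#include <vector>

struct Agent
{
  std::string id;
};

// Hypothetical: computing a role's remaining quota headroom is costly.
// Stubbed for illustration.
double computeHeadroom(const std::string& role)
{
  return role.empty() ? 0.0 : 1.0;
}

void allocateFirstStage(
    const std::vector<Agent>& agents,
    const std::vector<std::string>& quotaRoles)
{
  // Hoisted: compute each role's headroom once per allocation cycle
  // instead of once per (agent, role) pair.
  std::map<std::string, double> headroom;
  for (const std::string& role : quotaRoles) {
    headroom[role] = computeHeadroom(role);
  }

  for (const Agent& agent : agents) {
    (void)agent;  // offer logic elided
    for (const std::string& role : quotaRoles) {
      if (headroom[role] <= 0.0) {
        continue;  // role already satisfied; skip without further work
      }
      // ... offer resources from `agent` to `role`; decrement headroom ...
    }
  }
}
{noformat}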







[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.

2019-08-23 Thread Qian Zhang (Jira)


[ https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914034#comment-16914034 ]

Qian Zhang commented on MESOS-9836:
-----------------------------------

Master:

commit 5db1835531edd16b8a7b550a944c27b0dacafbf7
Author: Qian Zhang
Date: Wed Aug 21 16:50:06 2019 +0800

Used cached cgroups for updating resources in Docker containerizer.

Review: https://reviews.apache.org/r/71335

1.8.x:

commit ee01c8d479b34ced35ee6bd172108a128086277e
Author: Qian Zhang
Date: Wed Aug 21 16:50:06 2019 +0800

Used cached cgroups for updating resources in Docker containerizer.

Review: https://reviews.apache.org/r/71335

1.7.x:

commit 7b01d0d35e8e17434d6f8e8840c8586565fe8d6c
Author: Qian Zhang
Date: Wed Aug 21 16:50:06 2019 +0800

Used cached cgroups for updating resources in Docker containerizer.

Review: https://reviews.apache.org/r/71335

1.6.x:

commit 1c70f29bdd270cdff8ce55bdfaab56581829017c (HEAD -> ci/qzhang/bp_9836_1.6.x)
Author: Qian Zhang
Date: Wed Aug 21 16:50:06 2019 +0800

Used cached cgroups for updating resources in Docker containerizer.

Review: https://reviews.apache.org/r/71335
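
Per the commit message, the fix is to cache the container's cgroups instead of re-resolving them from the (possibly reused) pid at update time. A minimal sketch of that idea follows; the names are hypothetical and the real change is in src/slave/containerizer/docker.cpp (see the review link above).

{noformat}
#include <sys/types.h>

#include <map>
#include <string>

// Hypothetical helper: the real code determines a process's cgroup by
// parsing /proc/<pid>/cgroup. Stubbed here for illustration.
std::string cgroupOfPid(pid_t pid, const std::string& subsystem)
{
  return "/sys/fs/cgroup/" + subsystem + "/mesos/" + std::to_string(pid);
}

struct Container
{
  pid_t pid;

  // Cached right after launch, while the pid is known to be valid.
  std::map<std::string, std::string> cgroups;
};

void onLaunched(Container& container)
{
  // Resolve and cache the cgroups exactly once, at launch time.
  container.cgroups["cpu"] = cgroupOfPid(container.pid, "cpu");
  container.cgroups["memory"] = cgroupOfPid(container.pid, "memory");
}

void update(const Container& container)
{
  // Use the cached cgroup. Re-resolving via container.pid here is what
  // caused the bug: the pid may have been reused by an unrelated process
  // (e.g. one living under /mesos/slave, as in the log below).
  const std::string& cpuCgroup = container.cgroups.at("cpu");
  (void)cpuCgroup;
  // ... write cpu.cfs_period_us / cpu.cfs_quota_us under cpuCgroup ...
}
{noformat}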

> Docker containerizer overwrites `/mesos/slave` cgroups.
> -------------------------------------------------------
>
> Key: MESOS-9836
> URL: https://issues.apache.org/jira/browse/MESOS-9836
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Chun-Hung Hsiao
>Assignee: Qian Zhang
>Priority: Critical
>  Labels: docker, mesosphere
>
> The following bug was observed on our internal testing cluster.
> The docker containerizer launched a container on an agent:
> {noformat}
> I0523 06:00:53.888579 21815 docker.cpp:1195] Starting container 
> 'f69c8a8c-eba4-4494-a305-0956a44a6ad2' for task 
> 'apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1' (and executor 
> 'apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1') of framework 
> 415284b7-2967-407d-b66f-f445e93f064e-0011
> I0523 06:00:54.524171 21815 docker.cpp:783] Checkpointing pid 13716 to 
> '/var/lib/mesos/slave/meta/slaves/60c42ab7-eb1a-4cec-b03d-ea06bff00c3f-S2/frameworks/415284b7-2967-407d-b66f-f445e93f064e-0011/executors/apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1/runs/f69c8a8c-eba4-4494-a305-0956a44a6ad2/pids/forked.pid'
> {noformat}
> After the container was launched, the docker containerizer did a {{docker 
> inspect}} on the container and cached the pid:
> [https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1764]
> The pid should be slightly greater than 13716.
> The docker executor sent a {{TASK_FINISHED}} status update around 16 minutes 
> later:
> {noformat}
> I0523 06:16:17.287595 21809 slave.cpp:5566] Handling status update 
> TASK_FINISHED (Status UUID: 4e00b786-b773-46cd-8327-c7deb08f1de9) for task 
> apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1 of framework 
> 415284b7-2967-407d-b66f-f445e93f064e-0011 from executor(1)@172.31.1.7:36244
> {noformat}
> After receiving the terminal status update, the agent asked the docker 
> containerizer to update {{cpu.cfs_period_us}}, {{cpu.cfs_quota_us}} and 
> {{memory.soft_limit_in_bytes}} of the container through the cached pid:
> [https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1696]
> {noformat}
> I0523 06:16:17.290447 21815 docker.cpp:1868] Updated 'cpu.shares' to 102 at 
> /sys/fs/cgroup/cpu,cpuacct/mesos/slave for container 
> f69c8a8c-eba4-4494-a305-0956a44a6ad2
> I0523 06:16:17.290660 21815 docker.cpp:1895] Updated 'cpu.cfs_period_us' to 
> 100ms and 'cpu.cfs_quota_us' to 10ms (cpus 0.1) for container 
> f69c8a8c-eba4-4494-a305-0956a44a6ad2
> I0523 06:16:17.889816 21815 docker.cpp:1937] Updated 
> 'memory.soft_limit_in_bytes' to 32MB for container 
> f69c8a8c-eba4-4494-a305-0956a44a6ad2
> {noformat}
> Note that the cgroup of {{cpu.shares}} was {{/mesos/slave}}. This was
> possibly because the pid got reused over the 16 minutes:
> {noformat}
> # zgrep 'systemd.cpp:98\]' /var/log/mesos/archive/mesos-agent.log.12.gz
> ...
> I0523 06:00:54.525178 21815 systemd.cpp:98] Assigned child process '13716' to 
> 'mesos_executors.slice'
> I0523 06:00:55.078546 21808 systemd.cpp:98] Assigned child process '13798' to 
> 'mesos_executors.slice'
> I0523 06:00:55.134096 21808 systemd.cpp:98] Assigned child process '13799' to 
> 'mesos_executors.slice'
> ...
> I0523 06:06:30.997439 21808 systemd.cpp:98] Assigned child process '32689' to 
> 'mesos_executors.slice'
> I0523 06:06:31.050976 21808 systemd.cpp:98] Assigned child process '32690' to 
> 'mesos_executors.slice'
> I0523 06:06:31.110514 21815 systemd.cpp:98] Assigned child process '32692' to 
> 'mesos_executors.slice'
> I0523 06:06:33.143726 21818 systemd.cpp:98