[
https://issues.apache.org/jira/browse/MESOS-6162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143301#comment-16143301
]
Qian Zhang commented on MESOS-6162:
-----------------------------------
For the [performance issue|https://github.com/opencontainers/runc/issues/861]
mentioned in the description of this ticket, after some experiments I found that
it happens only when the disk's IO scheduler is set to {{cfq}} and the
filesystem is {{ext4}}/{{ext3}} mounted with the {{data=ordered}} option.
{code}
# pwd
/mnt
# mount | grep mnt
/dev/sdb on /mnt type ext4 (rw,relatime,data=ordered)
# cat /sys/block/sdb/queue/scheduler
noop deadline [cfq]
# echo $$ > /sys/fs/cgroup/blkio/cgroup.procs
# dd if=/dev/zero of=test.bin bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 1.51425 s, 338 kB/s
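# mkdir /sys/fs/cgroup/blkio/test    <--- create the "test" cgroup (assumed step; not captured in the original transcript)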
# echo $$ >/sys/fs/cgroup/blkio/test/cgroup.procs
# dd if=/dev/zero of=test.bin bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 16.0301 s, 31.9 kB/s <--- Performance degradation when we put the process into the "test" blkio cgroup
{code}
If we change the IO scheduler to {{deadline}}, we will not have this
performance issue. (See [this
doc|https://www.kernel.org/doc/Documentation/block/switching-sched.txt] for how
to switch the IO scheduler, and the [CFQ
scheduler|https://www.kernel.org/doc/Documentation/block/cfq-iosched.txt] and
[deadline
scheduler|https://www.kernel.org/doc/Documentation/block/deadline-iosched.txt]
docs for more info about the two schedulers.)
{code}
# echo deadline > /sys/block/sdb/queue/scheduler
# cat /sys/block/sdb/queue/scheduler
noop [deadline] cfq
# echo $$ > /sys/fs/cgroup/blkio/cgroup.procs
# dd if=/dev/zero of=test.bin bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 1.21094 s, 423 kB/s
# echo $$ > /sys/fs/cgroup/blkio/test/cgroup.procs
# dd if=/dev/zero of=test.bin bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 1.19367 s, 429 kB/s <--- No performance degradation
{code}
I also tested disks formatted with other filesystems (e.g., {{xfs}},
{{btrfs}}) and disks mounted without the {{data=ordered}} option
({{data=ordered}} is enabled by default for {{ext4}} and {{ext3}}; we can
disable it by specifying a different option when mounting the disk, e.g.,
{{data=journal}}); in both cases the performance issue does not occur. See
[this doc|https://www.ibm.com/developerworks/library/l-fs8/index.html] for the
difference between {{data=ordered}} and {{data=journal}}:
{quote}
Theoretically, data=journal mode is the slowest journaling mode of all, since
data gets written to disk twice rather than once. However, it turns out that in
certain situations, data=journal mode can be blazingly fast.
{quote}
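For example, remounting the disk from the transcript above with {{data=journal}} would look roughly like this (a sketch; the device and mount point are carried over from the earlier setup):
{code}
# umount /mnt
# mount -o data=journal /dev/sdb /mnt
# mount | grep mnt
/dev/sdb on /mnt type ext4 (rw,relatime,data=journal)
{code}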
It seems only SUSE has this performance issue, since by default it sets the
disk's IO scheduler to {{cfq}} and formats the filesystem as {{ext4}} with the
{{data=ordered}} option. I tested other distros (CoreOS, CentOS 7.2 and Ubuntu
16.04), and they do not have this issue: some set the disk's IO scheduler to
{{deadline}} by default (Ubuntu 16.04), and some format the disk as {{xfs}} by
default (CentOS 7.2).
So I think this should not be treated as a general performance issue, since
most distros do not hit it, and it can be fixed on the fly by switching the IO
scheduler to {{deadline}}. But in the future, when we support blkio control
functionalities
([MESOS-7843|https://issues.apache.org/jira/browse/MESOS-7843]), setting the IO
scheduler to {{deadline}} will be a problem: the blkio proportional weight
policy requires the {{cfq}} scheduler, so if the scheduler is set to
{{deadline}}, none of the {{blkio.weight}}, {{blkio.weight_device}} and
{{blkio.leaf_weight\[_device\]}} files will take effect.
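For illustration, here is a minimal sketch of those proportional weight knobs (the cgroup path reuses the "test" cgroup from above; the weight values are just examples):
{code}
# cat /sys/fs/cgroup/blkio/test/blkio.weight
500
# echo 200 > /sys/fs/cgroup/blkio/test/blkio.weight    <--- honored only under cfq; a no-op under deadline
{code}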
> Add support for cgroups blkio subsystem blkio statistics.
> ---------------------------------------------------------
>
> Key: MESOS-6162
> URL: https://issues.apache.org/jira/browse/MESOS-6162
> Project: Mesos
> Issue Type: Task
> Components: cgroups, containerization
> Reporter: haosdent
> Assignee: Jason Lai
> Labels: cgroups, containerizer, mesosphere
> Fix For: 1.4.0
>
>
> Noted that cgroups blkio subsystem may have performance issue, refer to
> https://github.com/opencontainers/runc/issues/861