[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-19 Thread Bug Watch Updater
** Changed in: zfs
   Status: Unknown => New

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1567557

Title: Performance degradation of "zfs clone" when under load

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-19 Thread Colin Ian King
Based on measurements, we get an x^3 polynomial in clone time. Below 2000 clones it's not too bad, but over 2000 clones the delay in the ioctl() ramps up pretty quickly. I've got a fairly solid set of stats and drew some estimates based on the trend line. See the attached datasheet. ** Attachment
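The trend-line extrapolation above can be sketched with a one-term least-squares fit: for a model t(n) = a*n^3, the coefficient is a = Σ(t·n³)/Σ(n⁶). The sample points below are illustrative, not the bug's measured data:

```
# Fit t(n) = a*n^3 to hypothetical (clones, seconds) points, then
# extrapolate to 4000 clones. a = sum(t*n^3) / sum(n^6).
pred=$(awk 'BEGIN {
  n[1] = 1000; t[1] = 2    # hypothetical: 1000 clones took 2 s
  n[2] = 2000; t[2] = 16   # cost ramps up past 2000 clones
  n[3] = 3000; t[3] = 54
  for (i = 1; i <= 3; i++) { num += t[i] * n[i]^3; den += n[i]^6 }
  a = num / den
  printf "%.0f", a * 4000^3  # predicted seconds for 4000 clones
}')
echo "predicted t(4000) = ${pred}s"
```

On these exactly-cubic sample points the fit recovers a = 2e-9 and predicts 128 s for 4000 clones; with real, noisy data the same formula gives the trend-line estimate.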

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-19 Thread Colin Ian King
Running strace against zfs create I see the following ioctl() taking the time:

1500475028.118005 ioctl(3, _IOC(0, 0x5a, 0x12, 0x00), 0x7ffc7c2184f0) = -1 ENOMEM (Cannot allocate memory) <0.390093>
1500475028.508153 mmap(NULL, 290816, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
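A hypothetical way to reproduce that per-call timing (dataset names assumed; strace's -T flag appends the time spent inside each syscall, -ttt prints absolute microsecond timestamps):

```
# Restricting the trace to ioctl keeps the output focused on the
# ZFS control channel (/dev/zfs).
strace -ttt -T -e trace=ioctl zfs create castiana/testzfs/clone-test
```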

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-05 Thread Colin Ian King
So compound that with the huge amount of CPU suckage I'm seeing with lxd...

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-05 Thread Stéphane Graber
That's pure ZFS completely outside of LXD.

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-05 Thread Stéphane Graber
(but doing something pretty similar to what LXD does internally as far as clones and mountpoint handling)

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-05 Thread Stéphane Graber
I'm seeing similar performance degradation on xenial, except that the server I used seems to be pretty slow to start with too:

root@athos:~# sh test.sh
Creating 100 clones
Took: 11 seconds (9/s)
Creating 200 clones
Took: 46 seconds (4/s)
Creating 400 clones
Took: 297 seconds (1/s)

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-05 Thread Stéphane Graber
Creating 100 clones
Took: 4 seconds (25/s)
Creating 200 clones
Took: 13 seconds (15/s)
Creating 400 clones
Took: 46 seconds (8/s)
Creating 600 clones
Took: 156 seconds (3/s)

```
#!/bin/sh
zfs destroy -R castiana/testzfs
rm -Rf /tmp/testzfs
zfs create castiana/testzfs -o mountpoint=none
zfs
```
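The quoted script is cut off in the archive. A minimal sketch of a comparable benchmark loop, assuming a pool named castiana and hypothetical dataset names (requires a real ZFS pool and root, so it won't run elsewhere as-is):

```
#!/bin/sh
# Hypothetical reconstruction: snapshot a base dataset once, then time
# batches of clones, mirroring the "Creating N clones / Took: X seconds"
# output above.
POOL=castiana
zfs destroy -R $POOL/testzfs 2>/dev/null
zfs create -o mountpoint=none $POOL/testzfs
zfs create -o mountpoint=none $POOL/testzfs/base
zfs snapshot $POOL/testzfs/base@snap
batchno=0
for batch in 100 200 400 600; do
    echo "Creating $batch clones"
    start=$(date +%s)
    i=0
    while [ $i -lt $batch ]; do
        zfs clone $POOL/testzfs/base@snap $POOL/testzfs/clone-$batchno-$i
        i=$((i + 1))
    done
    end=$(date +%s)
    took=$((end - start))
    [ $took -gt 0 ] || took=1   # avoid divide-by-zero on fast runs
    echo "Took: $took seconds ($((batch / took))/s)"
    batchno=$((batchno + 1))
done
```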

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-05 Thread Colin Ian King
Looks like there is a lot of contention on a futex and the underlying database too. Perf record and perf report can show that most of the issues are more to do with lxd, database and futex lock contention. Once these are resolved, I'll be happy to re-analyze the entire ZFS + lxd stack, but I
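The profiling workflow described above can be sketched as (run while the clone test is going; futex and lock contention in lxd shows up under its comm in the report):

```
# System-wide sampling with call graphs for 30 seconds.
perf record -a -g -- sleep 30
# Break the samples down by process and shared object.
perf report --sort comm,dso
```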

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-05 Thread Colin Ian King
Running health-check against lxd I see that there is a lot of contention on a futex and a load of context switches going on in lxd. ** Attachment added: "output from health-check when examining lxd"

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-05 Thread Colin Ian King
I've now compared ZFS with BTRFS and the overall CPU load profile from lxd in both scenarios is surprisingly similar: lxd is eating a lot of CPU, and I think there is an underlying issue there rather than a fundamentally broken performance problem in ZFS. See the attached data sheet; the CPU

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-05 Thread Colin Ian King
After a load of tests, I've found the same anomaly with btrfs. I'm using an 80GB btrfs raw device, 16GB memory, 16 CPU threads:

BTRFS:
  Clones  Time (s)
      32    24.026
      64    54.584
     128   125.520
     256   272.835
     512   809.038
    1024  3451.172

So the pinch point happens later than with ZFS, but there seems to

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-05 Thread Colin Ian King
Most of the CPU being consumed is in strcmp():

- 99.39%  0.00%  zfs  [kernel.kallsyms]  [k] zfsdev_ioctl
   - 99.39% zfsdev_ioctl

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-05 Thread Colin Ian King
I've created 4096 zfs clones using the same commands that lxd is using, and I have observed that the biggest pinch point is when fetching the zfs clone stats, as this has to lock using dsl_dataset_hold_obj, fetch the data, and then unlock with dsl_dataset_rele. Traversing hundreds of clones is
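If fetching stats for a new clone means traversing every existing clone under that hold/release pair, then creating the k-th clone does roughly k-1 lookups, and the aggregate cost of n sequential creates is quadratic. A back-of-envelope check for the 4096-clone case (the model is an assumption, not a measurement):

```
# Total traversal steps if creating the k-th clone scans the k-1
# existing clones: n*(n-1)/2 for n creates.
lookups=$(awk 'BEGIN { n = 4096; printf "%d", n * (n - 1) / 2 }')
echo "creating 4096 clones one by one => $lookups traversal steps"
```

That is roughly 8.4 million lock/fetch/unlock cycles over the run, which is consistent with the cost ramping up as the clone count grows.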

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-05-31 Thread Stéphane Graber
I'm trying to remember if we had to bump any of the sysctls to actually reach 1024 containers; I don't think any of the usual suspects would be in play until you reach 2000+ Alpine containers, though. If you do run out of some kernel resources, you can try applying the following sysctls to get
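The list of sysctls is cut off in the archive. The "usual suspects" for dense container hosts are typically along these lines (values mirror LXD's production-setup notes from memory; verify against the current docs before relying on them):

```
# /etc/sysctl.d/99-containers.conf -- typical limits raised for hosts
# running many containers
fs.inotify.max_queued_events = 1048576
fs.inotify.max_user_instances = 1048576
fs.inotify.max_user_watches = 1048576
kernel.keys.maxkeys = 2000
net.ipv4.neigh.default.gc_thresh3 = 8192
vm.max_map_count = 262144
```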

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-05-31 Thread Stéphane Graber
Our test machines aren't particularly impressive, just 12GB of RAM or so. Note that as can be seen above, we're using Alpine (busybox) images rather than Ubuntu to limit the resource usage and get us to a lot more containers per system.

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-05-31 Thread Colin Ian King
I've tried to reproduce this, but when I pass more than 256 containers I run out of resources. What kind of configuration are you using to test this? Once I can reproduce this I'll try and see what the performance constraint is.

** Changed in: zfs-linux (Ubuntu)
   Status: In Progress =>

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-04-20 Thread Colin Ian King
** Changed in: zfs-linux (Ubuntu)
   Importance: Undecided => Medium

** Changed in: zfs-linux (Ubuntu)
   Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: zfs-linux (Ubuntu)
   Status: New => In Progress