Tomas,

This is really interesting data, thanks a lot for collecting all of it and
putting together the helpful graphs.

Jerry


On Sun, Aug 26, 2018 at 4:14 PM, Tomas Vondra <tomas.von...@2ndquadrant.com>
wrote:

>
>
> On 08/25/2018 12:11 AM, Jerry Jelinek wrote:
> > Alvaro,
> >
> > I have previously posted ZFS numbers for SmartOS and FreeBSD to this
> > thread, although not with the exact same benchmark runs that Tomas did.
> >
> > I think the main purpose of running the benchmarks is to demonstrate
> > that there is no significant performance regression with wal recycling
> > disabled on a COW filesystem such as ZFS (which might just be intuitive
> > for a COW filesystem). I've tried to make it clear in the doc change
> > included with this patch that this tunable is only applicable to COW
> > filesystems. I do not think the benchmarks will be able to recreate the
> > problematic performance state that was originally described in Dave's
> > email thread here:
> >
> > https://www.postgresql.org/message-id/flat/CACukRjO7DJvub8e2AijOayj8BfKK3XXBTwu3KKARiTr67M3E3w%40mail.gmail.com#cacukrjo7djvub8e2aijoayj8bfkk3xxbtwu3kkaritr67m3...@mail.gmail.com
> >
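> > To make the knob itself concrete, a minimal sketch of the intended usage
> > (assuming the GUC ends up being named wal_recycle; turning it off only
> > makes sense on COW filesystems such as ZFS):
> >
> >   # postgresql.conf
> >   wal_recycle = off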
>
> I agree - the benchmarks are valuable both to show the improvement and the
> lack of regression. I do have some numbers from LVM/ext4 (with a snapshot
> recreated every minute to trigger COW-like behavior, and without snapshots),
> and from ZFS on Linux (using zfsonlinux 0.7.9 on kernel 4.17.17).
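>
> For reference, a minimal sketch of that snapshot-recreation loop (the volume
> group / LV names and snapshot size here are hypothetical, not the exact
> script):
>
>   while true; do
>     lvremove -f /dev/vg0/pgsnap 2>/dev/null   # drop the previous snapshot, if any
>     lvcreate --snapshot --size 50G --name pgsnap /dev/vg0/pgdata
>     sleep 60                                  # recreate roughly once a minute
>   done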
>
> Attached are PDFs with summary charts, more detailed results are
> available at
>
>   https://bitbucket.org/tvondra/wal-recycle-test-xeon/src/master/
>
>
>
> lvm/ext4 (no snapshots)
> -----------------------
> This pretty much behaves like plain ext4, at least for scales 200 and
> 2000. I don't have results for scale 8000, because the test ran out of
> disk space (I had used part of the device for snapshots, and that was
> enough to trigger the disk space issue).
>
>
> lvm/ext4 (snapshots)
> ---------------------
> On the smallest scale (200), there's no visible difference. On scale
> 2000, disabling WAL reuse gives about a 10% improvement (21468 vs. 23517
> tps), although it's not obvious from the chart. On the largest scale
> (6000, reduced to avoid the disk space issues) the improvement is about
> 10% again, but this time it's much clearer.
>
>
> zfs (Linux)
> -----------
> On scale 200, there's pretty much no difference. On scale 2000, the
> throughput actually decreased a bit, by about 5% - from the chart it
> seems disabling WAL reuse somewhat amplifies the impact of checkpoints,
> for some reason.
>
> I have no idea what happened at the largest scale (8000) - on master
> there's a huge drop after ~120 minutes, which somewhat recovers at ~220
> minutes (but not fully). Without WAL reuse there's no such drop,
> although there seems to be some degradation after ~220 minutes (i.e. at
> about the same time the master partially recovers). I'm not sure what to
> think about this; I wonder if it might be caused by almost filling the
> disk space, or something like that. I'm rerunning this with scale 600.
>
> I'm also not sure how much we can extrapolate this to other ZFS configs
> (I mean, this is ZFS on a single SSD device, while I'd generally expect
> ZFS to be used on multiple devices, etc.).
>
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
