Re: FreeBSD UFS & fsync
On Fri, Mar 12, 2021 at 10:34 AM Thomas Munro wrote:
> It looks like your system is performing very badly for some other
> reason, so that synchronous I/O waits are only a small proportion of
> the time, and thus fsync=off doesn't speed things up very much. I'd
> look into profiling the system to try to figure out what it's doing...
> maybe it's suffering from super slow hypercalls for gettimeofday(), or
> something like that?

Let me get this straight to see if I understand it correctly:
pg_test_fsync reports roughly 278000 ops/sec in non-synced mode, and that is
(nearly) what I should expect from turning off fsync. However, something
else is eating my resources, so I'm not getting the expected results.

Now, what do you mean by profiling the system? Since I'm on FreeBSD I
could use dtrace to look for clues about where the time is spent, even
though I'm no dtrace expert.

Please also note that pg_test_timing seems fine to me (I've tried
several times with pretty much the same results):

% pg_test_timing
Testing timing overhead for 3 seconds.
Per loop time including overhead: 37.68 ns
Histogram of timing durations:
  < us   % of total      count
     1     96.46399   76796834
     2      3.52417    2805657
     4      0.00400       3183
     8      0.00320       2546
    16      0.00235       1871
    32      0.00124        988
    64      0.00065        517
   128      0.00024        189
   256      0.00007         58
   512      0.00003         26
  1024      0.00002         18
  2048      0.00002         19
  4096      0.00001          9
  8192      0.00000          1

So apparently gettimeofday should not be the problem here.

Luca
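For readers who want a quick cross-check of clock-read cost without pg_test_timing, here is a rough Python sketch of the same idea: read the clock in a tight loop and average the cost per read. It is not pg_test_timing; the function name and the default duration are made up for illustration, and the Python interpreter adds overhead of its own, so the absolute number will be higher than what the C tool reports. Only a result in the microsecond range or worse would hint at slow clock reads.

```python
import time


def timing_overhead_ns(duration_s=0.2):
    """Average cost of one loop iteration that reads the monotonic clock."""
    loops = 0
    start = time.perf_counter_ns()
    deadline = start + int(duration_s * 1e9)
    now = start
    while now < deadline:
        now = time.perf_counter_ns()  # the call being measured
        loops += 1
    return (now - start) / loops


if __name__ == "__main__":
    print(f"Per loop time including overhead: {timing_overhead_ns():.2f} ns")
```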
Re: FreeBSD UFS & fsync
On Fri, Mar 12, 2021 at 10:33:29PM +1300, Thomas Munro wrote:
> On Fri, Mar 12, 2021 at 10:09 PM Luca Ferrari wrote:
> > fdatasync                16269.365 ops/sec      61 usecs/op
> > fsync                     8471.429 ops/sec     118 usecs/op
>
> > Non-sync'ed 8kB writes:
> > write                   278484.510 ops/sec       4 usecs/op
>
> > tps = 136.963971 (excluding connections establishing)
>
> It looks like your system is performing very badly for some other
> reason, so that synchronous I/O waits are only a small proportion of
> the time, and thus fsync=off doesn't speed things up very much. I'd
> look into profiling the system to try to figure out what it's doing...
> maybe it's suffering from super slow hypercalls for gettimeofday(), or
> something like that?

And we have pg_test_timing for gettimeofday() testing.

-- 
Bruce Momjian https://momjian.us
EDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee
Re: FreeBSD UFS & fsync
On Fri, Mar 12, 2021 at 10:09 PM Luca Ferrari wrote:
> fdatasync                16269.365 ops/sec      61 usecs/op
> fsync                     8471.429 ops/sec     118 usecs/op

> Non-sync'ed 8kB writes:
> write                   278484.510 ops/sec       4 usecs/op

> tps = 136.963971 (excluding connections establishing)

It looks like your system is performing very badly for some other
reason, so that synchronous I/O waits are only a small proportion of
the time, and thus fsync=off doesn't speed things up very much. I'd
look into profiling the system to try to figure out what it's doing...
maybe it's suffering from super slow hypercalls for gettimeofday(), or
something like that?
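To put rough numbers on that reasoning, here is a back-of-the-envelope sketch in Python. It uses only figures reported in this thread (4 pgbench clients, about 121 tps with fsync on, 118 usecs per fsync, 61 usecs per fdatasync). The assumption that each transaction waits for exactly one WAL flush is a simplification made for this sketch: it ignores group commit, checkpoints and everything else the server does.

```python
# Figures quoted in this thread.
CLIENTS = 4                  # pgbench -c 4
TPS_FSYNC_ON = 121.429075    # pgbench tps with fsync = on
FSYNC_USECS = 118.0          # pg_test_fsync, one 8kB write + fsync
FDATASYNC_USECS = 61.0       # pg_test_fsync, one 8kB write + fdatasync


def per_txn_latency_ms(clients, tps):
    """Average time each client spends on one transaction."""
    return clients / tps * 1000.0


def sync_share(sync_usecs, latency_ms):
    """Fraction of one transaction's latency taken by a single flush.

    Assumes one flush per transaction (a simplification, see above).
    """
    return (sync_usecs / 1000.0) / latency_ms


latency = per_txn_latency_ms(CLIENTS, TPS_FSYNC_ON)
fsync_share = sync_share(FSYNC_USECS, latency)
fdatasync_share = sync_share(FDATASYNC_USECS, latency)

print(f"latency per transaction: {latency:.1f} ms")
print(f"one fsync():     {fsync_share:.2%} of that")
print(f"one fdatasync(): {fdatasync_share:.2%} of that")
```

Under these assumptions a single flush is well under 1% of the time a transaction takes, so removing it cannot change tps much; the remaining time has to be spent somewhere else.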
Re: FreeBSD UFS & fsync
On Thu, Mar 11, 2021 at 3:29 PM Bruce Momjian wrote:
>
> You should really be running pg_test_fsync for this kind of testing.
>

Sorry Bruce, but it is not clear to me: pg_test_fsync compares
different fsync implementations, but not the fsync on/off setting of a
cluster.
Now, pg_test_fsync reports the "non-synced writes", which are
effectively 15x faster (that is close to what I was expecting from
turning off fsync):

% pg_test_fsync
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                                   n/a
        fdatasync                         16269.365 ops/sec      61 usecs/op
        fsync                              8471.429 ops/sec     118 usecs/op
        fsync_writethrough                              n/a
        open_sync                          5664.861 ops/sec     177 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                                   n/a
        fdatasync                         15196.244 ops/sec      66 usecs/op
        fsync                              7754.729 ops/sec     129 usecs/op
        fsync_writethrough                              n/a
        open_sync                          2670.645 ops/sec     374 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
         1 * 16kB open_sync write          5486.140 ops/sec     182 usecs/op
         2 *  8kB open_sync writes         2344.310 ops/sec     427 usecs/op
         4 *  4kB open_sync writes         1323.548 ops/sec     756 usecs/op
         8 *  2kB open_sync writes          659.449 ops/sec    1516 usecs/op
        16 *  1kB open_sync writes          332.844 ops/sec    3004 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
        write, fsync, close                7515.006 ops/sec     133 usecs/op
        write, close, fsync                7107.698 ops/sec     141 usecs/op

Non-sync'ed 8kB writes:
        write                            278484.510 ops/sec       4 usecs/op

However, these are not the results I'm getting via pgbench.

% sudo -u postgres postgres -C fsync -D /postgres/12/data
on
% sudo -u postgres postgres -C checkpoint_timeout -D /postgres/12/data
30
% pgbench -T 60 -c 4 -r -n -U luca pgbench
...
number of transactions actually processed: 7347
latency average = 32.947 ms
tps = 121.405308 (including connections establishing)
tps = 121.429075 (excluding connections establishing)

% sudo -u postgres postgres -C checkpoint_timeout -D /postgres/12/data
30
% sudo -u postgres postgres -C fsync -D /postgres/12/data
off
% pgbench -T 60 -c 4 -r -n -U luca pgbench
...
number of transactions actually processed: 8220
latency average = 29.212 ms
tps = 136.929481 (including connections establishing)
tps = 136.963971 (excluding connections establishing)

Of course, the above test is really quick (and covers at least one
checkpoint), but even longer tests provide similar results, which are
somewhat at odds with the pg_test_fsync result.
However, apparently the problem is not related to the disk cache, since
pg_test_fsync reports correct times (as far as I understand).
Am I missing something?

Luca
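The mismatch between the two tools can be made explicit with a few lines of Python. This is only a sketch over the numbers pasted in this message; the variable names are chosen for illustration, and comparing raw write rates with pgbench tps is itself a simplification, since a pgbench transaction does far more than one 8kB write.

```python
# pg_test_fsync, one 8kB write (ops/sec).
NON_SYNCED_WRITE = 278484.510
FDATASYNC = 16269.365
FSYNC = 8471.429

# pgbench, tps excluding connections establishing.
PGBENCH_FSYNC_ON = 121.429075
PGBENCH_FSYNC_OFF = 136.963971


def speedup(fast, slow):
    """How many times faster `fast` is than `slow`."""
    return fast / slow


raw_vs_fdatasync = speedup(NON_SYNCED_WRITE, FDATASYNC)
raw_vs_fsync = speedup(NON_SYNCED_WRITE, FSYNC)
observed = speedup(PGBENCH_FSYNC_OFF, PGBENCH_FSYNC_ON)

print(f"pg_test_fsync, non-synced vs fdatasync: {raw_vs_fdatasync:.1f}x")
print(f"pg_test_fsync, non-synced vs fsync:     {raw_vs_fsync:.1f}x")
print(f"pgbench, fsync=off vs fsync=on:         {observed:.2f}x")
```

The raw write path gets more than an order of magnitude faster without syncing, while pgbench gains only a small fraction, which is the gap being asked about here.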
Re: FreeBSD UFS & fsync
On Tue, Feb 23, 2021 at 12:57:22PM +0100, Luca Ferrari wrote:
> On Tue, Feb 23, 2021 at 8:46 AM Luca Ferrari wrote:
> > I'm using sata disks, not scsi. Assuming I'm not looking at the wrong
> > parameter, I will attach a scsi disk to do the same test and see if
> > something changes.
>
> I've tested the same version of PostgreSQL, same benchmark, on a scsi
> disk. However, turning off fsync does not provide any increment at all
> (something that spans in less than 1% tps).
> I've checked and I have WCE enabled on such disk, but apparently I
> cannot modify (I suspect this is due to the virtualization of the
> disk):

You should really be running pg_test_fsync for this kind of testing.

-- 
Bruce Momjian https://momjian.us
EDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee
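For anyone without pg_test_fsync at hand, the kind of measurement it makes can be approximated with a short Python sketch: rewrite one 8kB block in a loop, with and without fsync(), and compare the rates. This is not pg_test_fsync and does not cover its other sync methods; the function name, the `directory` parameter and the default duration are assumptions made for this example. Point `directory` at the filesystem under test, because the default temporary directory may be on a different (or memory-backed) filesystem.

```python
import os
import tempfile
import time

BLOCK = b"\0" * 8192  # one 8kB block, the write size used in the thread


def ops_per_sec(sync, seconds=0.2, directory=None):
    """Rewrite one 8kB block repeatedly, calling fsync() each time if `sync`."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        ops = 0
        elapsed = 0.0
        start = time.perf_counter()
        while elapsed < seconds:
            os.pwrite(fd, BLOCK, 0)  # always overwrite the same block
            if sync:
                os.fsync(fd)         # wait until the kernel reports it flushed
            ops += 1
            elapsed = time.perf_counter() - start
        return ops / elapsed
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(f"write + fsync:   {ops_per_sec(True):12.3f} ops/sec")
    print(f"non-synced write: {ops_per_sec(False):11.3f} ops/sec")
```

If the two rates come out nearly identical, fsync() is probably not reaching stable storage (or the file is on a memory-backed filesystem); a large gap suggests the flush is real.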
Re: FreeBSD UFS & fsync
On Tue, Feb 23, 2021 at 8:46 AM Luca Ferrari wrote:
> I'm using sata disks, not scsi. Assuming I'm not looking at the wrong
> parameter, I will attach a scsi disk to do the same test and see if
> something changes.

I've tested the same version of PostgreSQL, same benchmark, on a scsi
disk. However, turning off fsync does not provide any improvement at all
(the difference is less than 1% tps).
I've checked and I have WCE enabled on that disk, but apparently I
cannot modify it (I suspect this is due to the virtualization of the
disk):

# echo "WCE: 0" | camcontrol modepage da0 -m 0x08 -e
camcontrol: error sending mode select command
# camcontrol modepage da0 -m 0x08 | grep WCE
WCE: 1

and the filesystem has everything disabled:

# tunefs -p da0p1
tunefs: Can't stat da0p1: No such file or directory
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 disabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         disabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            6408
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)

I think I will not be able to test this in a virtual environment, unless
I'm missing something.

Thanks,
Luca
Re: FreeBSD UFS & fsync
On Mon, Feb 22, 2021 at 10:38 PM Thomas Munro wrote:
> Do you have WCE enabled? In that case, modern Linux file systems
> would do a synchronous SYNCHRONIZE CACHE for our WAL fdatasync(), but
> FreeBSD UFS wouldn't as far as I know. It does know how to do that
> (there's a BIO_FLUSH operation, also used by ZFS), but as far as I can
> see UFS uses it just for its own file system meta-data crash safety
> currently (see softdep_synchronize()). (There is also no FUA flag for
> O_[D]SYNC writes, an even more modern invention.)

Apparently no WCE, but I could be looking at the wrong piece:

% sysctl kern.cam.ada | grep write_cache
kern.cam.ada.2.write_cache: -1
kern.cam.ada.1.write_cache: -1
kern.cam.ada.0.write_cache: -1
kern.cam.ada.write_cache: -1

I'm using sata disks, not scsi. Assuming I'm not looking at the wrong
parameter, I will attach a scsi disk to do the same test and see if
something changes.
Or if you have any other suggestion about what to inspect, please advise.

Thanks,
Luca
Re: FreeBSD UFS & fsync
On Tue, Feb 23, 2021 at 5:49 AM Luca Ferrari wrote:
> I'm running a virtual machine with FreeBSD 12.2, PostgreSQL 12.5 and
> UFS as filesystem.
> I was experimenting with fsync = off and pgbench, and I see no
> particular difference in tps having fsync enabled or disabled.
> Now, the same tiny test on a linux box provides a 10x tps, while on
> FreeBSD is a 1% increase.
> I'm trying to figure out why, and I suspect there is something related
> to how UFS handles writes.

Do you have WCE enabled? In that case, modern Linux file systems
would do a synchronous SYNCHRONIZE CACHE for our WAL fdatasync(), but
FreeBSD UFS wouldn't as far as I know. It does know how to do that
(there's a BIO_FLUSH operation, also used by ZFS), but as far as I can
see UFS uses it just for its own file system meta-data crash safety
currently (see softdep_synchronize()). (There is also no FUA flag for
O_[D]SYNC writes, an even more modern invention.)
FreeBSD UFS & fsync
Hi all,
I'm running a virtual machine with FreeBSD 12.2, PostgreSQL 12.5 and
UFS as filesystem.
I was experimenting with fsync = off and pgbench, and I see no
particular difference in tps with fsync enabled or disabled.
Now, the same tiny test on a Linux box yields a 10x tps increase, while
on FreeBSD the increase is about 1%.
I'm trying to figure out why, and I suspect it is related to how UFS
handles writes.
Any particular advice about tuning and parameters that could explain
the "no difference" with fsync turned off?

% sudo tunefs -p /dev/gpt/DATA
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 disabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  8192
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            6408
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)                                 DATA