Re: FreeBSD UFS & fsync

2021-03-15 Thread Luca Ferrari
On Fri, Mar 12, 2021 at 10:34 AM Thomas Munro  wrote:
> It looks like your system is performing very badly for some other
> reason, so that synchronous I/O waits are only a small proportion of
> the time, and thus fsync=off doesn't speed things up very much.  I'd
> look into profiling the system to try to figure out what it's doing...
> maybe it's suffering from super slow hypercalls for gettimeofday(), or
> something like that?


Let me check that I understand this correctly:
pg_test_fsync reports about 278000 ops/sec for non-synced writes, and that
is (nearly) what I should expect from turning off fsync.
However, something else is eating my resources, so I'm not getting the
expected results.
Now, what do you mean by profiling the system? Since I'm on FreeBSD I
could use dtrace to look for clues about where the time is spent,
even though I'm no dtrace expert.
Please also note that pg_test_timing looks fine to me (I've run it
several times with pretty much the same results):

% pg_test_timing
Testing timing overhead for 3 seconds.
Per loop time including overhead: 37.68 ns
Histogram of timing durations:
  < us   % of total      count
     1     96.46399   76796834
     2      3.52417    2805657
     4      0.00400       3183
     8      0.00320       2546
    16      0.00235       1871
    32      0.00124        988
    64      0.00065        517
   128      0.00024        189
   256      0.00007         58
   512      0.00003         26
  1024      0.00002         18
  2048      0.00002         19
  4096      0.00001          9
  8192      0.00000          1

So apparently gettimeofday() is not the problem here.
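
As a rough check of that conclusion, here is a back-of-the-envelope sketch in
Python. It uses the 37.68 ns per-loop figure from pg_test_timing and the
29.212 ms average latency from the fsync=off pgbench run. The number of clock
reads per transaction (CLOCK_READS_PER_TX) is a made-up, deliberately
pessimistic assumption, not something that was measured.

```python
# Back-of-the-envelope check: could clock-read overhead explain slow transactions?
PER_LOOP_NS = 37.68        # per-loop time reported by pg_test_timing
TX_LATENCY_MS = 29.212     # pgbench average latency with fsync = off
CLOCK_READS_PER_TX = 1000  # hypothetical and deliberately pessimistic


def timing_share(per_loop_ns, reads_per_tx, tx_latency_ms):
    """Fraction of one transaction's latency spent reading the clock."""
    overhead_ms = per_loop_ns * reads_per_tx / 1e6
    return overhead_ms / tx_latency_ms


share = timing_share(PER_LOOP_NS, CLOCK_READS_PER_TX, TX_LATENCY_MS)
print(f"clock reads: {share:.2%} of transaction latency")
```

Even under that pessimistic assumption the share stays well below 1%, which is
consistent with timing calls not being the bottleneck on this machine.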

Luca




Re: FreeBSD UFS & fsync

2021-03-12 Thread Bruce Momjian
On Fri, Mar 12, 2021 at 10:33:29PM +1300, Thomas Munro wrote:
> On Fri, Mar 12, 2021 at 10:09 PM Luca Ferrari  wrote:
> >        fdatasync                         16269.365 ops/sec      61 usecs/op
> >        fsync                              8471.429 ops/sec     118 usecs/op
>
> > Non-sync'ed 8kB writes:
> >        write                            278484.510 ops/sec       4 usecs/op
> 
> > tps = 136.963971 (excluding connections establishing)
> 
> It looks like your system is performing very badly for some other
> reason, so that synchronous I/O waits are only a small proportion of
> the time, and thus fsync=off doesn't speed things up very much.  I'd
> look into profiling the system to try to figure out what it's doing...
> maybe it's suffering from super slow hypercalls for gettimeofday(), or
> something like that?

And we have pg_test_timing for gettimeofday() testing.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: FreeBSD UFS & fsync

2021-03-12 Thread Thomas Munro
On Fri, Mar 12, 2021 at 10:09 PM Luca Ferrari  wrote:
>        fdatasync                         16269.365 ops/sec      61 usecs/op
>        fsync                              8471.429 ops/sec     118 usecs/op

> Non-sync'ed 8kB writes:
>        write                            278484.510 ops/sec       4 usecs/op

> tps = 136.963971 (excluding connections establishing)

It looks like your system is performing very badly for some other
reason, so that synchronous I/O waits are only a small proportion of
the time, and thus fsync=off doesn't speed things up very much.  I'd
look into profiling the system to try to figure out what it's doing...
maybe it's suffering from super slow hypercalls for gettimeofday(), or
something like that?
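
To put rough numbers on that proportion, here is a minimal Python sketch. It
uses only figures quoted above plus the client count from the pgbench command
line (-c 4), and it assumes exactly one sync wait per transaction, which is a
simplification made for illustration and not a measured property of the
workload.

```python
CLIENTS = 4                 # from "pgbench -c 4"
TPS_FSYNC_OFF = 136.963971  # quoted pgbench result with fsync = off
FSYNC_USECS = 118           # pg_test_fsync: fsync, one 8kB write
FDATASYNC_USECS = 61        # pg_test_fsync: fdatasync, one 8kB write


def latency_ms(clients, tps):
    """Average per-transaction latency implied by a tps figure."""
    return clients / tps * 1000.0


def sync_share(sync_usecs, base_latency_ms):
    """Share of total latency if one sync wait is added to each transaction."""
    sync_ms = sync_usecs / 1000.0
    return sync_ms / (base_latency_ms + sync_ms)


base = latency_ms(CLIENTS, TPS_FSYNC_OFF)
for name, usecs in (("fsync", FSYNC_USECS), ("fdatasync", FDATASYNC_USECS)):
    print(f"{name}: {sync_share(usecs, base):.2%} of a {base:.1f} ms transaction")
```

Under that assumption a sync wait of about 0.1 ms is a tiny slice of a
transaction that takes roughly 29 ms, so removing it cannot change tps much.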




Re: FreeBSD UFS & fsync

2021-03-12 Thread Luca Ferrari
On Thu, Mar 11, 2021 at 3:29 PM Bruce Momjian  wrote:
>
> You should really be running pg_test_fsync for this kind of testing.
>

Sorry Bruce, but it is not clear to me: pg_test_fsync compares
different fsync implementations, but not the fsync on/off setting of a
cluster.

Now, pg_test_fsync also reports the non-synced writes, which are
effectively 15x faster (close to what I was expecting from turning
off fsync):

% pg_test_fsync
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                                   n/a
        fdatasync                         16269.365 ops/sec      61 usecs/op
        fsync                              8471.429 ops/sec     118 usecs/op
        fsync_writethrough                              n/a
        open_sync                          5664.861 ops/sec     177 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                                   n/a
        fdatasync                         15196.244 ops/sec      66 usecs/op
        fsync                              7754.729 ops/sec     129 usecs/op
        fsync_writethrough                              n/a
        open_sync                          2670.645 ops/sec     374 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
         1 * 16kB open_sync write          5486.140 ops/sec     182 usecs/op
         2 *  8kB open_sync writes         2344.310 ops/sec     427 usecs/op
         4 *  4kB open_sync writes         1323.548 ops/sec     756 usecs/op
         8 *  2kB open_sync writes          659.449 ops/sec    1516 usecs/op
        16 *  1kB open_sync writes          332.844 ops/sec    3004 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
        write, fsync, close                7515.006 ops/sec     133 usecs/op
        write, close, fsync                7107.698 ops/sec     141 usecs/op

Non-sync'ed 8kB writes:
        write                            278484.510 ops/sec       4 usecs/op
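
For readers unfamiliar with what these numbers measure, the following Python
sketch is written in the spirit of the simplest tests above: it rewrites one
8kB block in a loop, with and without a sync call after each write. It is not
how pg_test_fsync is implemented; the function name, the 0.2 second duration
and the use of a temporary file are all choices made for this illustration.

```python
import os
import tempfile
import time

BLOCK = b"\0" * 8192  # one 8kB block, the size used in the tests above


def ops_per_sec(sync, seconds=0.2, directory=None):
    """Rewrite one 8kB block in a loop, calling sync(fd) after each write if given."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        ops = 0
        start = time.perf_counter()
        deadline = start + seconds
        while time.perf_counter() < deadline:
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, BLOCK)
            if sync is not None:
                sync(fd)
            ops += 1
        return ops / (time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)


results = {"write only": ops_per_sec(None), "fsync": ops_per_sec(os.fsync)}
if hasattr(os, "fdatasync"):  # not available on every platform
    results["fdatasync"] = ops_per_sec(os.fdatasync)
for name, rate in results.items():
    print(f"{name:12s} {rate:13.3f} ops/sec {1e6 / rate:7.0f} usecs/op")
```

Pass a directory on the volume under test (for example the one holding the
data directory), since the default temporary directory may live on a
different or memory-backed file system.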



However, these are not the results I'm getting via pgbench.

% sudo -u postgres postgres -C fsync  -D /postgres/12/data
on
% sudo -u postgres postgres -C checkpoint_timeout  -D /postgres/12/data
30
%  pgbench -T 60 -c 4  -r -n  -U luca pgbench
...
number of transactions actually processed: 7347
latency average = 32.947 ms
tps = 121.405308 (including connections establishing)
tps = 121.429075 (excluding connections establishing)


% sudo -u postgres postgres -C checkpoint_timeout  -D /postgres/12/data
30
% sudo -u postgres postgres -C fsync  -D /postgres/12/data
off
%  pgbench -T 60 -c 4  -r -n  -U luca pgbench
...
number of transactions actually processed: 8220
latency average = 29.212 ms
tps = 136.929481 (including connections establishing)
tps = 136.963971 (excluding connections establishing)


Of course, the above test is really quick (and covers at least one
checkpoint), but even longer tests give similar results, which
somewhat contradict the pg_test_fsync result.
However, the problem is apparently not related to the disk cache, since
pg_test_fsync reports plausible times (as far as I understand).
Am I missing something?
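
To make the mismatch concrete, here is a small Python sketch that extracts the
tps figure from the two pgbench outputs above and compares the resulting
speedup with the ratio pg_test_fsync reports between non-synced writes and
fdatasync. The helper name and the regular expression are illustrative; the
numbers are copied from the outputs in this message.

```python
import re

TPS_RE = re.compile(
    r"^tps = ([0-9.]+) \(excluding connections establishing\)", re.MULTILINE
)


def tps_excluding_connections(pgbench_output):
    """Return the 'excluding connections establishing' tps from pgbench output."""
    match = TPS_RE.search(pgbench_output)
    if match is None:
        raise ValueError("no 'tps = ... (excluding connections establishing)' line")
    return float(match.group(1))


FSYNC_ON = """\
latency average = 32.947 ms
tps = 121.405308 (including connections establishing)
tps = 121.429075 (excluding connections establishing)
"""
FSYNC_OFF = """\
latency average = 29.212 ms
tps = 136.929481 (including connections establishing)
tps = 136.963971 (excluding connections establishing)
"""

speedup = tps_excluding_connections(FSYNC_OFF) / tps_excluding_connections(FSYNC_ON)
raw_ratio = 278484.510 / 16269.365  # non-synced write vs fdatasync, ops/sec
print(f"pgbench speedup with fsync=off: {speedup:.2f}x")
print(f"pg_test_fsync write/fdatasync:  {raw_ratio:.1f}x")
```

The two ratios differ by more than an order of magnitude, which is the gap
this message is asking about.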

Luca




Re: FreeBSD UFS & fsync

2021-03-11 Thread Bruce Momjian
On Tue, Feb 23, 2021 at 12:57:22PM +0100, Luca Ferrari wrote:
> On Tue, Feb 23, 2021 at 8:46 AM Luca Ferrari  wrote:
> > I'm using SATA disks, not SCSI. Assuming I'm not looking at the wrong
> > parameter, I will attach a SCSI disk to do the same test and see if
> > something changes.
> 
> I've tested the same version of PostgreSQL, with the same benchmark, on
> a SCSI disk. However, turning off fsync gives no real improvement
> (the difference is less than 1% in tps).
> I've checked and WCE is enabled on that disk, but apparently I cannot
> change it (I suspect this is due to the virtualization of the disk):

You should really be running pg_test_fsync for this kind of testing.






Re: FreeBSD UFS & fsync

2021-02-23 Thread Luca Ferrari
On Tue, Feb 23, 2021 at 8:46 AM Luca Ferrari  wrote:
> I'm using SATA disks, not SCSI. Assuming I'm not looking at the wrong
> parameter, I will attach a SCSI disk to do the same test and see if
> something changes.

I've tested the same version of PostgreSQL, with the same benchmark, on
a SCSI disk. However, turning off fsync gives no real improvement
(the difference is less than 1% in tps).
I've checked and WCE is enabled on that disk, but apparently I cannot
change it (I suspect this is due to the virtualization of the disk):

# echo "WCE: 0" | camcontrol modepage da0 -m 0x08 -e
camcontrol: error sending mode select command
# camcontrol modepage da0 -m 0x08 | grep WCE
WCE:  1

and the filesystem has everything disabled:

# tunefs -p da0p1
tunefs: Can't stat da0p1: No such file or directory
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 disabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         disabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            6408
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)

I think I will not be able to test in a virtual environment, unless
I'm missing something.

Thanks,
Luca




Re: FreeBSD UFS & fsync

2021-02-22 Thread Luca Ferrari
On Mon, Feb 22, 2021 at 10:38 PM Thomas Munro  wrote:
> Do you have WCE enabled?  In that case, modern Linux file systems
> would do a synchronous SYNCHRONIZE CACHE for our WAL fdatasync(), but
> FreeBSD UFS wouldn't as far as I know.  It does know how to do that
> (there's a BIO_FLUSH operation, also used by ZFS), but as far as I can
> see UFS uses it just for its own file system meta-data crash safety
> currently (see softdep_synchronize()).  (There is also no FUA flag for
> O_[D]SYNC writes, an even more modern invention.)

Apparently no WCE, but I could be looking at the wrong setting:

% sysctl kern.cam.ada | grep write_cache
kern.cam.ada.2.write_cache: -1
kern.cam.ada.1.write_cache: -1
kern.cam.ada.0.write_cache: -1
kern.cam.ada.write_cache: -1

I'm using SATA disks, not SCSI. Assuming I'm not looking at the wrong
parameter, I will attach a SCSI disk to do the same test and see if
something changes.
If you have any other suggestions about what to inspect, please advise.
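
For reading the sysctl output above, here is a small Python sketch that parses
the write_cache lines and labels each value. The meanings in MEANING, and the
rule that a per-device -1 defers to the global value, reflect my reading of
ada(4) and should be treated as assumptions to verify against the man page on
the system in question. If that reading is right, -1 means the OS leaves the
drive's own setting alone, which is not the same as WCE being off.

```python
import re

LINE_RE = re.compile(r"^kern\.cam\.ada\.(?:(\d+)\.)?write_cache:\s*(-?\d+)$")

# Assumed meanings, based on my reading of ada(4); verify locally.
MEANING = {1: "enable write cache", 0: "disable write cache", -1: "leave as-is"}


def parse_write_cache(sysctl_output):
    """Return (global_value, {unit: value}) from `sysctl kern.cam.ada` output."""
    global_value, per_device = None, {}
    for line in sysctl_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        unit, value = match.group(1), int(match.group(2))
        if unit is None:
            global_value = value
        else:
            per_device[int(unit)] = value
    return global_value, per_device


def effective(global_value, device_value):
    """Assumed semantics: a per-device -1 defers to the global setting."""
    return global_value if device_value == -1 else device_value


SAMPLE = """\
kern.cam.ada.2.write_cache: -1
kern.cam.ada.1.write_cache: -1
kern.cam.ada.0.write_cache: -1
kern.cam.ada.write_cache: -1
"""

global_value, per_device = parse_write_cache(SAMPLE)
for unit in sorted(per_device):
    value = effective(global_value, per_device[unit])
    print(f"ada{unit}: {value} ({MEANING.get(value, 'unknown')})")
```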

Thanks,
Luca




Re: FreeBSD UFS & fsync

2021-02-22 Thread Thomas Munro
On Tue, Feb 23, 2021 at 5:49 AM Luca Ferrari  wrote:
> I'm running a virtual machine with FreeBSD 12.2, PostgreSQL 12.5 and
> UFS as filesystem.
> I was experimenting with fsync = off and pgbench, and I see no
> particular difference in tps with fsync enabled or disabled.
> Now, the same tiny test on a Linux box gives a 10x increase in tps,
> while on FreeBSD the increase is about 1%.
> I'm trying to figure out why, and I suspect it is related to how UFS
> handles writes.

Do you have WCE enabled?  In that case, modern Linux file systems
would do a synchronous SYNCHRONIZE CACHE for our WAL fdatasync(), but
FreeBSD UFS wouldn't as far as I know.  It does know how to do that
(there's a BIO_FLUSH operation, also used by ZFS), but as far as I can
see UFS uses it just for its own file system meta-data crash safety
currently (see softdep_synchronize()).  (There is also no FUA flag for
O_[D]SYNC writes, an even more modern invention.)




FreeBSD UFS & fsync

2021-02-22 Thread Luca Ferrari
Hi all,
I'm running a virtual machine with FreeBSD 12.2, PostgreSQL 12.5 and
UFS as filesystem.
I was experimenting with fsync = off and pgbench, and I see no
particular difference in tps with fsync enabled or disabled.
Now, the same tiny test on a Linux box gives a 10x increase in tps,
while on FreeBSD the increase is about 1%.
I'm trying to figure out why, and I suspect it is related to how UFS
handles writes.

Any advice about tuning or parameters that could explain why turning
fsync off makes no difference?


% sudo tunefs -p /dev/gpt/DATA
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 disabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  8192
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            6408
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)                                 DATA