Re: Abysmally slow write to geom class volume over network

Michael Osipov Mon, 17 Oct 2016 15:14:25 -0700

On 2016-10-15 at 13:22, Fabian Keil wrote:

Michael Osipov <[email protected]> wrote:

On 2016-10-14 at 11:43, Fabian Keil wrote:

Michael Osipov <[email protected]> wrote:

On 2016-10-13 at 03:41, John-Mark Gurney wrote:

Michael Osipov wrote this message on Wed, Oct 12, 2016 at 20:54 +0200:

As if there is a bottleneck between socket read and geom write to
FS.

Is that better?


Have you run gstat on the system to see if there is an I/O
bottleneck?  Since you are using graid3, you want to look to see if
its %busy is ~100, while the underlying components are not.


This is hardly possible because as soon as I start an SFTP
transfer, all of my SSH sessions freeze or receive a connection
timeout/abort.  Doing an SFTP transfer from FreeBSD to FreeBSD gives me
a %busy of zero to one percent on both the physical disks and the RAID3
volume. In other words, the drives are bored.


Try checking the FAIL and SLEEP columns in the "vmstat -z" output.


I assume that you expect a rise in those numbers. I have made several
runs: rebooted the machine and then started an SFTP transfer. Within
seconds my SSH sessions locked up. I aborted the transfer manually after
10 minutes, which should have saturated the entire connection. After
that, I reran vmstat -z: no or minimal rise in FAIL and SLEEP.


IIRC the SLEEP column only shows currently sleeping requests,
therefore you may want to run "vmstat -z" multiple times while
the transfer is ongoing. Having said that, a custom DTrace script
would probably be a better tool to diagnose the issue anyway.
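
Sampling "vmstat -z" repeatedly during the transfer could be scripted
roughly as follows. This is only a sketch, not something anyone in this
thread ran: it assumes the output starts with a header line such as
"ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP" followed by rows of the form
"zone name: v1, v2, ...", and the function names parse_vmstat_z, changed
and watch are made up for illustration.

```python
import subprocess
import time


def parse_vmstat_z(text):
    """Return {zone: {"FAIL": int, "SLEEP": int}} from `vmstat -z` text.

    Assumes the first non-empty line is the column header and every
    following row looks like "zone name: v1, v2, ...".
    """
    zones = {}
    columns = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if columns is None:
            # Drop the leading ITEM column; the rest name the values.
            columns = line.split()[1:]
            continue
        name, sep, rest = line.rpartition(":")
        if not sep:
            continue
        row = dict(zip(columns, (v.strip() for v in rest.split(","))))
        try:
            zones[name.strip()] = {
                "FAIL": int(row["FAIL"]),
                "SLEEP": int(row["SLEEP"]),
            }
        except (KeyError, ValueError):
            continue  # row does not match the assumed layout
    return zones


def changed(before, after):
    """Zones whose FAIL count rose, or that show sleepers in `after`.

    Maps zone name to (FAIL increase, current SLEEP value).
    """
    result = {}
    for name, cur in after.items():
        prev = before.get(name, {"FAIL": 0, "SLEEP": 0})
        if cur["FAIL"] > prev["FAIL"] or cur["SLEEP"] > 0:
            result[name] = (cur["FAIL"] - prev["FAIL"], cur["SLEEP"])
    return result


def watch(interval=1.0, samples=60):
    """Poll `vmstat -z` and print zones that changed between samples."""
    prev = parse_vmstat_z(subprocess.check_output(["vmstat", "-z"], text=True))
    for _ in range(samples):
        time.sleep(interval)
        cur = parse_vmstat_z(subprocess.check_output(["vmstat", "-z"], text=True))
        for name, (fail_delta, sleeping) in sorted(changed(prev, cur).items()):
            print(f"{name}: FAIL +{fail_delta}, SLEEP {sleeping}")
        prev = cur
```

Calling watch() in a second terminal just before starting the SFTP
transfer would print any zone whose FAIL counter rises or that has
sleeping requests at the moment of sampling.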

Interestingly, this happens when the target is a UFS volume on a
gconcat/graid3/gvinum/gstripe configuration. A regular gpart setup with
GPT has no performance penalty. Additionally, it is not limited to SSH
but affects virtually everything that uses sockets: nc, ggate, smb.

This could be related to:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209680#c2


It pretty much sounds like it, though I do not use ipfw, pf or any NAT
stuff. I will try your first patch and let you know.

Do you want me to add my use case to the issue?


If the patch helps, that could be useful once a committer
finds the time to look at the PR.


Just finished testing your patch.

Switched to 11-STABLE. First tests were w/o the patch:

1. gstripe, slow, SSH connection drops
2. graid3, slow, SSH connection drops

3. raidz, varies from 6 to 11 MB/s, SSH responding, CPU is at maximum.
   zfs is too much for this machine.

Tests with the patch: absolutely no change. All three tests yielded the
same results. It is not a socket-related issue, I think. If zfs works
w/o dropouts after several gigabytes, it must be some geom class bug
causing this. I am back where I was: at the beginning of the quest.

Unless someone else has a good idea, I will bury any multidisk geom
class and will likely resort to plain GPT with UFS SU+J partitions.

zfs is probably not an option on this old Pentium 4 machine.

Michael
_______________________________________________
[email protected] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-geom
To unsubscribe, send any mail to "[email protected]"
