Re: TRIM, iSCSI and %busy waves

2018-04-05 Thread Eugene M. Zheganin

Hello,

On 05.04.2018 20:00, Warner Losh wrote:


I'm also having a couple of iSCSI issues that I'm dealing with through
a bounty, so maybe this is related somehow. Or maybe not. Due to some
issues in the iSCSI stack my system sometimes reboots, and then these
"waves" stop for some time.

So, my question is: can I fine-tune TRIM operations so they don't
consume the whole disk at 100%? I see several sysctl OIDs, but they
aren't well documented.


You might be able to set the delete method.


Set it to what? It's not like I'm seeing FreeBSD for the first time,
but from what I see in sysctl, all of those "sysctl -a | grep trim"
OIDs are numeric.


P.S. This is 11.x, the disks are Toshibas, and they are attached via
an LSI HBA.


Which LSI HBA?


A SAS9300-4i4e one.

Thanks.

Eugene.


Re: TRIM, iSCSI and %busy waves

2018-04-05 Thread Eugene M. Zheganin

Hello,

On 05.04.2018 19:57, Steven Hartland wrote:

You can indeed tune things; here are the relevant sysctls:
sysctl -a | grep trim | grep -v kstat
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vdev.trim_max_pending: 1
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.trim_on_init: 1

Well, I've already seen these. How do I tune them? The idea of just 
tampering with them and seeing what happens doesn't look like a 
bright one to me. Do I increase or decrease them? Which ones do I have 
to avoid?


Eugene.



Re: TRIM, iSCSI and %busy waves

2018-04-05 Thread Warner Losh
On Thu, Apr 5, 2018 at 8:08 AM, Eugene M. Zheganin wrote:

> Hi,
>
> I have a production iSCSI system (on ZFS, of course) with 15 SSD disks,
> and it's often suffering from TRIMs.
>
> Well, I know what TRIM is for, and I know it's a good thing, but sometimes
> (actually often) I see in gstat that my disks are overwhelmed by TRIM
> "waves": a wave of 20K 100%-busy delete operations starts on the first
> pool disk, then reaches the second, then the third, ... By the time it
> reaches the 15th disk, the first one is freed from TRIM operations, and in
> 20-40 seconds the wave begins again.
>

There are two issues here. First, %busy doesn't necessarily mean what you
think it means. Back in the days of one operation at a time, it might
have been a reasonable indicator that the drive was busy. However, today,
with command queueing, a disk showing 100% busy can often take additional
load.
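
A quick way to see the difference is to watch the queue depth next to
%busy. For example (an illustrative invocation; see gstat(8) for the
flags):

# -d adds delete (BIO_DELETE) stats, -p limits output to physical disks
gstat -dp -I 1s

If L(q) stays small while %busy pegs at 100, the disk is still absorbing
the load; if L(q) grows and ms/d climbs, the deletes really are backing
things up.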

The second problem is that TRIMs suck for a lot of reasons. FFS (I don't
know about ZFS) sends lots of TRIMs at once when you delete a file. These
TRIMs are UFS-block-sized, so they need to be combined in the ada/da
layer. The combining in the ada and da drivers isn't optimal: it
implements a 'greedy' method where we pack as much as possible into each
TRIM, which makes each TRIM take longer. Plus, TRIMs are non-NCQ
commands, so they force a drain of all the other commands in order to
run. And we don't have any throttling in 11.x (at the moment), so they
tend to flood the device and starve out other traffic when there are a
lot of them. Not all controllers support NCQ TRIM (I don't think LSI does
at the moment). With NCQ TRIM we only queue one at a time, and that
helps.
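
You can check what a given drive advertises. As a sketch (assuming a
SATA disk visible as ada0; for disks behind a SAS HBA the ATA identify
data may not be reachable this way):

# look for the "data set management (DSM/TRIM)" line
camcontrol identify ada0 | grep -i trim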

I'm working on TRIM shaping in -current right now. It's focused on NVMe,
but since I'm doing the bulk of it in cam_iosched.c, it will eventually
be available for ada and da. The notion is to measure how long the TRIMs
take, and only send them at 80% of that rate when there's other traffic
in the queue (so if TRIMs are taking 100ms each, send them no faster than
8/s). While this will allow for better read/write traffic, it does slow
the TRIMs down, which slows down whatever they may be blocking in the
upper layers. I can't speak to ZFS much, but for UFS that's the freeing
of blocks, so things like new block allocation may be delayed if we're
almost out of disk space (which we have no signal for, so there's no way
for the lower layers to prioritize TRIMs or not).


> I'm also having a couple of iSCSI issues that I'm dealing with through a
> bounty, so maybe this is related somehow. Or maybe not. Due to some issues
> in the iSCSI stack my system sometimes reboots, and then these "waves"
> stop for some time.
>
> So, my question is: can I fine-tune TRIM operations so they don't consume
> the whole disk at 100%? I see several sysctl OIDs, but they aren't well
> documented.
>

You might be able to set the delete method.
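
For a da(4) device that would be something like the following sketch
(assuming the disk is da0; the valid methods are listed in da(4)):

# show the current BIO_DELETE translation method
sysctl kern.cam.da.0.delete_method
# DISABLE is a blunt instrument, but handy to confirm TRIM is the culprit
sysctl kern.cam.da.0.delete_method=DISABLE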


> P.S. This is 11.x, the disks are Toshibas, and they are attached via an
> LSI HBA.
>

Which LSI HBA?

Warner


Re: TRIM, iSCSI and %busy waves

2018-04-05 Thread Steven Hartland

You can indeed tune things; here are the relevant sysctls:
sysctl -a | grep trim | grep -v kstat
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vdev.trim_max_pending: 1
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.trim_on_init: 1
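
For example, to cut how many TRIMs ZFS keeps in flight per vdev, you
could try something like this (an illustrative value, not a
recommendation; if the OID isn't writable at runtime on your system, set
it from /boot/loader.conf instead):

# fewer concurrent TRIMs per vdev leaves more room for reads and writes
sysctl vfs.zfs.vdev.trim_max_active=8

Add the same line to /etc/sysctl.conf to keep it across reboots.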

    Regards
    Steve

On 05/04/2018 15:08, Eugene M. Zheganin wrote:

Hi,

I have a production iSCSI system (on ZFS, of course) with 15 SSD disks, 
and it's often suffering from TRIMs.


Well, I know what TRIM is for, and I know it's a good thing, but 
sometimes (actually often) I see in gstat that my disks are overwhelmed 
by TRIM "waves": a wave of 20K 100%-busy delete operations starts on the 
first pool disk, then reaches the second, then the third, ... By the 
time it reaches the 15th disk, the first one is freed from TRIM 
operations, and in 20-40 seconds the wave begins again.


I'm also having a couple of iSCSI issues that I'm dealing with through a 
bounty, so maybe this is related somehow. Or maybe not. Due to some 
issues in the iSCSI stack my system sometimes reboots, and then these 
"waves" stop for some time.


So, my question is: can I fine-tune TRIM operations so they don't 
consume the whole disk at 100%? I see several sysctl OIDs, but they 
aren't well documented.


P.S. This is 11.x, the disks are Toshibas, and they are attached via an 
LSI HBA.


Thanks.

Eugene.



Problems with ifconfig when starting all jails after 10.3 -> 10.4 upgrade

2018-04-05 Thread Marc Branchaud

Hi all,

I just upgraded from 10.3 to 10.4, and "/etc/rc.d/jail start" is having 
problems starting all of my jails:


# /etc/rc.d/jail start
Starting jails:xipbuild_3_3: created
ifconfig:: bad value
jail: xipbuild_3_3_8: /sbin/ifconfig lo1 inet 10.1.1.38/32 alias: failed
xipbuild_3_4: created
ifconfig:: bad value
jail: xipbuild_4_0: /sbin/ifconfig lo1 inet 10.1.1.5/32 alias: failed
xipbuild: created
xipbuild_4_9: created
ifconfig:: bad value
jail: xipbuild9: /sbin/ifconfig lo1 inet 10.1.1.209/32 alias: failed
.

This worked fine in 10.3.  I can individually start each jail, e.g. 
"/etc/rc.d/jail start xipbuild9".


All the jails configure the same set of parameters.  Here's my jail.conf:

--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---
xipbuild_3_3 {
  path="/usr/build-jails/jails/3.3";
  host.hostname="xipbuild_3_3";
  ip4.addr="10.1.1.3/32";

  allow.chflags;
  allow.mount;
  mount.devfs;

  persist;

  mount="/usr/home  /usr/build-jails/jails/3.3/usr/home nullfs rw 0 0";
  interface="lo1";
}
xipbuild_3_3_8 {
  path="/usr/build-jails/jails/3.3.8";
  host.hostname="xipbuild_3_3_8";
  ip4.addr="10.1.1.38/32";

  allow.chflags;
  allow.mount;
  mount.devfs;

  persist;

  mount="/usr/home  /usr/build-jails/jails/3.3.8/usr/home nullfs rw 0 0";
  interface="lo1";
}
xipbuild_3_4 {
  path="/usr/build-jails/jails/3.4";
  host.hostname="xipbuild_3_4";
  ip4.addr="10.1.1.4/32";

  allow.chflags;
  allow.mount;
  mount.devfs;

  persist;

  mount="/usr/home  /usr/build-jails/jails/3.4/usr/home nullfs rw 0 0";
  interface="lo1";
}
xipbuild_4_0 {
  path="/usr/build-jails/jails/4.0";
  host.hostname="xipbuild_4_0";
  ip4.addr="10.1.1.5/32";

  allow.chflags;
  allow.mount;
  mount.devfs;

  persist;

  mount="/usr/home  /usr/build-jails/jails/4.0/usr/home nullfs rw 0 0";
  interface="lo1";
}
xipbuild {
  path="/usr/build-jails/jails/latest";
  host.hostname="xipbuild";
  ip4.addr="10.1.1.200/32";

  allow.chflags;
  allow.mount;
  mount.devfs;

  persist;

  mount="/usr/home  /usr/build-jails/jails/latest/usr/home nullfs rw 0 0";
  interface="lo1";
}
xipbuild_4_9 {
  path="/usr/build-jails/jails/4.9";
  host.hostname="xipbuild_4_9";
  ip4.addr="10.1.1.90/32";

  allow.chflags;
  allow.mount;
  mount.devfs;

  persist;

  mount="/usr/home  /usr/build-jails/jails/4.9/usr/home nullfs rw 0 0";
  interface="lo1";
}
xipbuild9 {
  path="/usr/build-jails/jails/latest9";
  host.hostname="xipbuild9";
  ip4.addr="10.1.1.209/32";

  allow.chflags;
  allow.mount;
  mount.devfs;

  persist;

  mount="/usr/home  /usr/build-jails/jails/latest9/usr/home nullfs rw 0 0";
  interface="lo1";
}
--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---

I use ipnat to give the jails network access.  Here's ipnat.rules:

--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---
map em0 10.1.1.0/24 -> 0/32 proxy port ftp ftp/tcp
map em0 10.1.1.0/24 -> 0/32
--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---

And here's my rc.conf:

--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---
# Generated by Ansible

# hostname must be FQDN
hostname="devastator.xiplink.com"

zfs_enable="False"

# FIXME: previously auto-created?
ifconfig_lo1="create"


ifconfig_em0="DHCP SYNCDHCP"

network_interfaces="em0"
gateway_enable="YES"

# Prevent rpc
rpcbind_enable="NO"

# Prevent sendmail to try to connect to localhost
sendmail_enable="NO"
sendmail_submit_enable="NO"
sendmail_outbound_enable="NO"
sendmail_msp_queue_enable="NO"

# Bring up sshd, it takes some time and uses some entropy on first startup
sshd_enable="YES"

netwait_enable="YES"
netwait_ip="10.10.0.35"
netwait_if="em0"

jenkins_swarm_enable="YES"
jenkins_swarm_opts="-executors 8"

# --- Build jails ---
build_jails_enable="YES"
jail_enable="YES"

# Set rules in /etc/ipnat.rules
ipnat_enable="YES"

# Set interface name for ipnat
network_interfaces="${network_interfaces} lo1"

# Each jail needs to specify its IP address and mask bits in ipv4_addrs_lo1
ipv4_addrs_lo1="10.1.1.1/32"

jail_chflags_allow="yes"

varmfs="NO"
--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---
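
If it's useful, the exact ifconfig arguments the script builds can be
captured with plain shell tracing, e.g.:

# trace the rc script and keep only the ifconfig invocations
sh -x /etc/rc.d/jail start 2>&1 | grep ifconfig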

Any insight would be deeply appreciated!

M.


TRIM, iSCSI and %busy waves

2018-04-05 Thread Eugene M. Zheganin

Hi,

I have a production iSCSI system (on ZFS, of course) with 15 SSD disks 
and it's often suffering from TRIMs.


Well, I know what TRIM is for, and I know it's a good thing, but 
sometimes (actually often) I see in gstat that my disks are overwhelmed 
by TRIM "waves": a wave of 20K 100%-busy delete operations starts on the 
first pool disk, then reaches the second, then the third, ... By the 
time it reaches the 15th disk, the first one is freed from TRIM 
operations, and in 20-40 seconds the wave begins again.


I'm also having a couple of iSCSI issues that I'm dealing with through a 
bounty, so maybe this is related somehow. Or maybe not. Due to some 
issues in the iSCSI stack my system sometimes reboots, and then these 
"waves" stop for some time.


So, my question is: can I fine-tune TRIM operations so they don't 
consume the whole disk at 100%? I see several sysctl OIDs, but they 
aren't well documented.


P.S. This is 11.x, the disks are Toshibas, and they are attached via an 
LSI HBA.


Thanks.

Eugene.
