Re: TRIM, iSCSI and %busy waves
Hello,

On 05.04.2018 20:00, Warner Losh wrote:
>> I'm also having a couple of iSCSI issues that I'm dealing with through a bounty, so maybe this is related somehow. Or maybe not. Due to some issues in the iSCSI stack my system sometimes reboots, and then these "waves" stop for some time.
>>
>> So, my question is - can I fine-tune TRIM operations? So they don't consume the whole disk at 100%. I see several sysctl oids, but they aren't well-documented.
>
> You might be able to set the delete method.

Set it to what? It's not like I'm seeing FreeBSD for the first time, but from what I see in sysctl - all of those "sysctl -a | grep trim" oids are numeric.

>> P.S. This is 11.x, disks are Toshibas, and they are attached via an LSI HBA.
>
> Which LSI HBA?

A SAS9300-4i4e one.

Eugene.

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
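[Archive note: the "delete method" Warner refers to is the per-device setting exposed by the da(4) driver, not one of the numeric ZFS oids. A sketch of what changing it could look like, assuming the SSDs attach as da devices and the first one is da0 (adjust the numbering per disk); the method names are from da(4), and whether changing the method actually helps this workload is untested:]

```
# /etc/sysctl.conf fragment - illustrative only, assumes the SSD is da0
# da(4) exposes the BIO_DELETE method per device; typical values include
# UNMAP or WS16 (SCSI), ATA_TRIM, or DISABLE to stop issuing deletes.
kern.cam.da.0.delete_method=UNMAP
```

[On a running system the same oid can be read or changed with sysctl(8), e.g. `sysctl kern.cam.da.0.delete_method` shows the method currently in use.]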
Re: TRIM, iSCSI and %busy waves
Hello,

On 05.04.2018 19:57, Steven Hartland wrote:
> You can indeed tune things, here are the relevant sysctls:
>
> sysctl -a | grep trim | grep -v kstat
> vfs.zfs.trim.max_interval: 1
> vfs.zfs.trim.timeout: 30
> vfs.zfs.trim.txg_delay: 32
> vfs.zfs.trim.enabled: 1
> vfs.zfs.vdev.trim_max_pending: 1
> vfs.zfs.vdev.trim_max_active: 64
> vfs.zfs.vdev.trim_min_active: 1
> vfs.zfs.vdev.trim_on_init: 1

Well, I've already seen these. How do I tune them? The idea of just tampering with them and seeing what happens doesn't look like a bright one to me. Do I increase or decrease them? Which ones do I have to avoid?

Eugene.
Re: TRIM, iSCSI and %busy waves
On Thu, Apr 5, 2018 at 8:08 AM, Eugene M. Zheganin wrote:
> Hi,
>
> I have a production iSCSI system (on zfs of course) with 15 ssd disks and it's often suffering from TRIMs.
>
> Well, I know what TRIM is for, and I know it's a good thing, but sometimes (actually often) I'm seeing my disks in gstat overwhelmed by the TRIM waves. This looks like a "wave" of 20K 100%-busy delete operations starting on the first pool disk, then reaching the second, then the third... - by the time it reaches the 15th disk the first one is freed from TRIM operations, and in 20-40 seconds this wave begins again.

There are two issues here. First, %busy doesn't necessarily mean what you think it means. Back in the days of one operation at a time, it might have been a reasonable indicator that the drive is busy. Today, with queueing, a 100%-busy disk can often take additional load.

The second problem is that TRIMs suck for a lot of reasons. FFS (I don't know about ZFS) sends lots of TRIMs at once when you delete a file. These TRIMs are UFS block sized, so they need to be combined in the ada/da layer. The combining in the ada and da drivers isn't optimal, but implements a 'greedy' method where we pack as much as possible into each TRIM, which makes each TRIM take longer. Plus, TRIMs are non-NCQ commands, so they force a drain of all the other commands in order to be done. And we don't have any throttling in 11.x (at the moment), so they tend to flood the device and starve out other traffic when there's a lot of them. Not all controllers support NCQ TRIM (LSI doesn't at the moment, I don't think). With NCQ TRIM we only queue one at a time, and that helps.

I'm working on TRIM shaping in -current right now. It's focused on NVMe, but since I'm doing the bulk of it in cam_iosched.c, it will eventually be available for ada and da.
The notion is to measure how long the TRIMs take, and only send them at 80% of that rate when there's other traffic in the queue (so if TRIMs are taking 100ms, send them no faster than 8/s). While this allows for better read/write traffic, it does slow the TRIMs down, which slows down whatever they may be blocking in the upper layers. I can't speak to ZFS much, but for UFS that's the freeing of blocks, so things like new block allocation may be delayed if we're almost out of disk (which we have no signal for, so there's no way for the lower layers to prioritize TRIMs or not).

> I'm also having a couple of iSCSI issues that I'm dealing with through a bounty, so maybe this is related somehow. Or maybe not. Due to some issues in the iSCSI stack my system sometimes reboots, and then these "waves" stop for some time.
>
> So, my question is - can I fine-tune TRIM operations? So they don't consume the whole disk at 100%. I see several sysctl oids, but they aren't well-documented.

You might be able to set the delete method.

> P.S. This is 11.x, disks are Toshibas, and they are attached via an LSI HBA.

Which LSI HBA?

Warner
Re: TRIM, iSCSI and %busy waves
You can indeed tune things, here are the relevant sysctls:

sysctl -a | grep trim | grep -v kstat
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vdev.trim_max_pending: 1
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.trim_on_init: 1

Regards
Steve

On 05/04/2018 15:08, Eugene M. Zheganin wrote:
> Hi,
>
> I have a production iSCSI system (on zfs of course) with 15 ssd disks and it's often suffering from TRIMs.
>
> Well, I know what TRIM is for, and I know it's a good thing, but sometimes (actually often) I'm seeing my disks in gstat overwhelmed by the TRIM waves. This looks like a "wave" of 20K 100%-busy delete operations starting on the first pool disk, then reaching the second, then the third... - by the time it reaches the 15th disk the first one is freed from TRIM operations, and in 20-40 seconds this wave begins again.
>
> I'm also having a couple of iSCSI issues that I'm dealing with through a bounty, so maybe this is related somehow. Or maybe not. Due to some issues in the iSCSI stack my system sometimes reboots, and then these "waves" stop for some time.
>
> So, my question is - can I fine-tune TRIM operations? So they don't consume the whole disk at 100%. I see several sysctl oids, but they aren't well-documented.
>
> P.S. This is 11.x, disks are Toshibas, and they are attached via an LSI HBA.
>
> Thanks.
> Eugene.
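[Archive note: the thread never spells out concrete values, so here is a conservative illustration only. These numbers are guesses aimed at smoothing the bursts, not measured recommendations; some of these oids may be boot-time tunables (loader.conf) rather than runtime-writable, and any change should be tested on a non-production pool first:]

```
# /etc/sysctl.conf fragment - illustrative values only
# Cap concurrent TRIMs per vdev (default above is 64); fewer in flight
# at once should flatten the 100%-busy "waves" at the cost of a longer
# TRIM backlog:
vfs.zfs.vdev.trim_max_active=8
# Last resort: stop ZFS from issuing TRIMs entirely and rely on the
# SSDs' internal garbage collection and overprovisioning:
# vfs.zfs.trim.enabled=0
```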
Problems with ifconfig when starting all jails after 10.3 -> 10.4 upgrade
Hi all,

I just upgraded from 10.3 to 10.4, and "/etc/rc.d/jail start" is having problems starting all of my jails:

# /etc/rc.d/jail start
Starting jails:xipbuild_3_3: created
ifconfig:: bad value
jail: xipbuild_3_3_8: /sbin/ifconfig lo1 inet 10.1.1.38/32 alias: failed
xipbuild_3_4: created
ifconfig:: bad value
jail: xipbuild_4_0: /sbin/ifconfig lo1 inet 10.1.1.5/32 alias: failed
xipbuild: created
xipbuild_4_9: created
ifconfig:: bad value
jail: xipbuild9: /sbin/ifconfig lo1 inet 10.1.1.209/32 alias: failed
.

This worked fine in 10.3. I can individually start each jail, e.g. "/etc/rc.d/jail start xipbuild9". All the jails configure the same set of parameters. Here's my jail.conf:

--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---
xipbuild_3_3 {
    path="/usr/build-jails/jails/3.3";
    host.hostname="xipbuild_3_3";
    ip4.addr="10.1.1.3/32";
    allow.chflags;
    allow.mount;
    mount.devfs;
    persist;
    mount="/usr/home /usr/build-jails/jails/3.3/usr/home nullfs rw 0 0";
    interface="lo1";
}
xipbuild_3_3_8 {
    path="/usr/build-jails/jails/3.3.8";
    host.hostname="xipbuild_3_3_8";
    ip4.addr="10.1.1.38/32";
    allow.chflags;
    allow.mount;
    mount.devfs;
    persist;
    mount="/usr/home /usr/build-jails/jails/3.3.8/usr/home nullfs rw 0 0";
    interface="lo1";
}
xipbuild_3_4 {
    path="/usr/build-jails/jails/3.4";
    host.hostname="xipbuild_3_4";
    ip4.addr="10.1.1.4/32";
    allow.chflags;
    allow.mount;
    mount.devfs;
    persist;
    mount="/usr/home /usr/build-jails/jails/3.4/usr/home nullfs rw 0 0";
    interface="lo1";
}
xipbuild_4_0 {
    path="/usr/build-jails/jails/4.0";
    host.hostname="xipbuild_4_0";
    ip4.addr="10.1.1.5/32";
    allow.chflags;
    allow.mount;
    mount.devfs;
    persist;
    mount="/usr/home /usr/build-jails/jails/4.0/usr/home nullfs rw 0 0";
    interface="lo1";
}
xipbuild {
    path="/usr/build-jails/jails/latest";
    host.hostname="xipbuild";
    ip4.addr="10.1.1.200/32";
    allow.chflags;
    allow.mount;
    mount.devfs;
    persist;
    mount="/usr/home /usr/build-jails/jails/latest/usr/home nullfs rw 0 0";
    interface="lo1";
}
xipbuild_4_9 {
    path="/usr/build-jails/jails/4.9";
    host.hostname="xipbuild_4_9";
    ip4.addr="10.1.1.90/32";
    allow.chflags;
    allow.mount;
    mount.devfs;
    persist;
    mount="/usr/home /usr/build-jails/jails/4.9/usr/home nullfs rw 0 0";
    interface="lo1";
}
xipbuild9 {
    path="/usr/build-jails/jails/latest9";
    host.hostname="xipbuild9";
    ip4.addr="10.1.1.209/32";
    allow.chflags;
    allow.mount;
    mount.devfs;
    persist;
    mount="/usr/home /usr/build-jails/jails/latest9/usr/home nullfs rw 0 0";
    interface="lo1";
}
--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---

I use ipnat to give the jails network access. Here's ipnat.rules:

--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---
map em0 10.1.1.0/24 -> 0/32 proxy port ftp ftp/tcp
map em0 10.1.1.0/24 -> 0/32
--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---

And here's my rc.conf:

--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---
# Generated by Ansible
# hostname must be FQDN
hostname="devastator.xiplink.com"
zfs_enable="False"
# FIXME: previously auto-created?
ifconfig_lo1="create"
ifconfig_em0="DHCP SYNCDHCP"
network_interfaces="em0"
gateway_enable="YES"
# Prevent rpc
rpcbind_enable="NO"
# Prevent sendmail to try to connect to localhost
sendmail_enable="NO"
sendmail_submit_enable="NO"
sendmail_outbound_enable="NO"
sendmail_msp_queue_enable="NO"
# Bring up sshd, it takes some time and uses some entropy on first startup
sshd_enable="YES"
netwait_enable="YES"
netwait_ip="10.10.0.35"
netwait_if="em0"
jenkins_swarm_enable="YES"
jenkins_swarm_opts="-executors 8"
# --- Build jails ---
build_jails_enable="YES"
jail_enable="YES"
# Set rules in /etc/ipnat.rules
ipnat_enable="YES"
# Set interface name for ipnat
network_interfaces="${network_interfaces} lo1"
# Each jail needs to specify its IP address and mask bits in ipv4_addrs_lo1
ipv4_addrs_lo1="10.1.1.1/32"
jail_chflags_allow="yes"
varmfs="NO"
--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---

Any insight would be deeply appreciated!

M.
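[Archive note, not an answer: the doubled colon in `ifconfig:: bad value` suggests an empty argument is reaching ifconfig when all jails start at once. Since /etc/rc.d/jail is a plain sh script, tracing it shows the exact command line it builds in each case. The commands below are an illustrative debugging sketch to run on the affected host, not a fix:]

```shell
# Trace the rc script while starting all jails, keeping the ifconfig
# invocations and the lines immediately before them:
sh -x /etc/rc.d/jail start 2>&1 | grep -B2 ifconfig

# Compare with a single jail, which reportedly works fine:
sh -x /etc/rc.d/jail start xipbuild9 2>&1 | grep -B2 ifconfig
```

[Comparing the two traces should show which variable is empty in the all-jails path; that would be the thing to report or work around.]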
TRIM, iSCSI and %busy waves
Hi,

I have a production iSCSI system (on zfs of course) with 15 ssd disks and it's often suffering from TRIMs.

Well, I know what TRIM is for, and I know it's a good thing, but sometimes (actually often) I'm seeing my disks in gstat overwhelmed by the TRIM waves. This looks like a "wave" of 20K 100%-busy delete operations starting on the first pool disk, then reaching the second, then the third... - by the time it reaches the 15th disk the first one is freed from TRIM operations, and in 20-40 seconds this wave begins again.

I'm also having a couple of iSCSI issues that I'm dealing with through a bounty, so maybe this is related somehow. Or maybe not. Due to some issues in the iSCSI stack my system sometimes reboots, and then these "waves" stop for some time.

So, my question is - can I fine-tune TRIM operations? So they don't consume the whole disk at 100%. I see several sysctl oids, but they aren't well-documented.

P.S. This is 11.x, disks are Toshibas, and they are attached via an LSI HBA.

Thanks.
Eugene.
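[Archive note: for anyone reproducing this, the delete waves can be watched directly - gstat(8) has a flag that adds delete-operation (BIO_DELETE, i.e. TRIM) columns to the display. Illustrative invocation on a FreeBSD host:]

```shell
# -d adds delete ops/s and delete bandwidth columns to the display;
# -p limits the output to physical providers (here, the 15 SSDs)
gstat -dp
```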