On Wed, 2018-07-25 at 16:58 +0200, Len Weincier wrote:
> Hi
>
> We have a very strange situation trying to upgrade to a newer SmartOS
> image where the disk I/O is *very* slow.
>
> I have been working through the released images, and the last one that
> works 100% is 20180329T002644Z. From 20180412T003259Z onwards (the
> release with the new ZFS features such as the spacemap changes), the
> hosts become unusable in terms of disk I/O.
>
> In our testing on the lab machine with only 128 GB of RAM we see no
> pathologies.
>
> Hosts are running all SSDs (RAIDZ2) and 2x Intel Gold 6150 processors
> on an SMC X11DPH-T board. The lab machine with 128 GB of RAM has
> exactly the same processors, board, and SSD-only setup, differing only
> in the amount of RAM.
>
> On a production machine with 768 GB of RAM and the newer image, for
> example, `zfs create -V 10G zones/test` takes 2 minutes, while at the
> same time iostat shows the disks as relatively idle (%b = 10).
>
> Likewise, inside an Ubuntu KVM guest running Postgres we are seeing
> 40% wait time for any disk I/O, with only 2 VMs running and the
> underlying disks essentially idle.
>
> Is there anything we can look at to get to the bottom of this? It is
> pretty critical and is affecting our customers.
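A quick way to narrow this down is to confirm which of the new ZFS feature flags are actually enabled on the affected pool, and to time the slow operation precisely. This is a sketch for a SmartOS global zone; it assumes the pool is named `zones` (the SmartOS default, matching the `zones/test` dataset above) and that the suspect feature is `spacemap_v2`, which landed in that timeframe.

```shell
# List all feature flags on the pool and their state
# (disabled / enabled / active):
zpool get all zones | grep 'feature@'

# Check the space-map feature specifically (assumption: this is the
# relevant new feature on the post-20180412 images):
zpool get feature@spacemap_v2 zones

# Reproduce the slow zvol creation under ptime(1) to get exact
# real/user/sys timings rather than a wall-clock estimate:
ptime zfs create -V 10G zones/test

# Watch per-disk utilization in another terminal while it runs:
iostat -xnz 1
```

Note that feature flags only report `active` once the on-disk structures are in use; a pool that was merely imported on a newer image may still show `enabled`, which would point the investigation elsewhere.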
Hi

I have managed to grab a bunch of stack traces from a DTrace script on the fbt:zfs::entry probes and generated a flame graph while the system was behaving badly:

https://static.prod.cloudafrica.net/out.svg

This shows a bunch of activity in metaslab allocation, if I read it correctly? Any ideas, or anything else I can look at, please let me know. I have confirmed that this only occurs when the system is under I/O load.

Thanks
Len
-------------------------------------------
smartos-discuss Archives: https://www.listbox.com/member/archive/184463/=now
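To turn the "lots of metaslab activity" observation into a number, it may help to measure how long each allocation actually takes rather than just counting stack samples. A minimal sketch, again assuming a SmartOS global zone with DTrace; `metaslab_alloc` is the ZFS kernel entry point visible to the fbt provider, and the 30-second window is an arbitrary choice:

```shell
# Distribution of metaslab_alloc() latency in nanoseconds over 30s.
# A long tail here, with disks idle in iostat, would point at the
# allocator itself (e.g. metaslabs being loaded/unloaded) rather
# than the devices.
dtrace -n '
fbt::metaslab_alloc:entry { self->ts = timestamp; }
fbt::metaslab_alloc:return /self->ts/
{
    @["alloc latency (ns)"] = quantize(timestamp - self->ts);
    self->ts = 0;
}
tick-30s { exit(0); }'
```

Comparing this histogram between a 128 GB host and a 768 GB host on the same image, and between the 20180329 and 20180412 images on the same host, should isolate whether the regression tracks RAM size, the image, or both.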
