On Thu, Jul 26, 2018 at 5:02 AM, Len Weincier <l...@cloudafrica.net> wrote:
> On Wed, 2018-07-25 at 16:58 +0200, Len Weincier wrote:
>
> Hi
>
> We have a very strange situation trying to upgrade to a newer smartos
> image where the disk I/O is *very* slow.
>
> I have been working through the released images and the last one that
> works 100% is 20180329T002644Z
>
> From 20180412T003259Z onwards, the release with the new zfs features like
> spacemaps etc, the hosts become unusable in terms of disk i/o
>
> In our testing with the lab machine with only 128G ram we see no
> pathologies.
>
> Hosts are running ALL SSDs (RAIDZ2), and Intel Gold 6150 x2 processors on
> an SMC X11DPH-T board.
> The lab machine with 128GB RAM has exactly the same processors, board, and
> SSD-only setup - except for RAM.
>
> On a production machine with 768G ram and the newer image, for example,
> zfs create -V 10G zones/test takes 2 minutes while at the same time iostat
> is showing the disks as relatively idle (%b = 10)
>
> For example inside an ubuntu kvm with postgres we are seeing 40% wait time
> for any disk i/o and there are only 2 vm's running, underlying disks
> essentially idle.
>
> Is there anything we can look at to get to the bottom of this, as it is
> pretty critical and affecting our customers
>
>
>
> Hi
>
> I have managed to grab a bunch of stack traces from a dtrace script on the
> fbt:zfs::entry events and generated a flamegraph while the system was
> behaving badly
>
> https://static.prod.cloudafrica.net/out.svg
>
> This shows a bunch of activity in the metaslab allocation, if I read it
> correctly?
>
> Any ideas or anything I can look at please let me know.
>
> I have confirmed that this only occurs when the system is under i/o load.
>

I've created a platform image based on 20180329T002644Z with this change
that you mentioned removed:
commit f78cdc34af236a6199dd9e21376f4a46348c0d56
Author: Paul Dagnelie <p...@delphix.com>
Date:   Mon Feb 12 12:56:06 2018 -0800

    9112 Improve allocation performance on high-end systems
    Reviewed by: Matthew Ahrens <mahr...@delphix.com>
    Reviewed by: George Wilson <george.wil...@delphix.com>
    Reviewed by: Serapheim Dimitropoulos <serapheim.dimi...@delphix.com>
    Reviewed by: Alexander Motin <m...@freebsd.org>
    Approved by: Gordon Ross <g...@nexenta.com>

My testing has involved booting the iso under vmware and verifying that it
could import an existing single disk pool and run the VMs on it.

Can you give this PI a try? As a reminder, my testing has been quite
superficial. I hope it won't eat your data, but can offer no guarantees.

https://us-east.manta.joyent.com/mgerdts/public/pi/len/platform-20180726T160921Z.tgz
https://us-east.manta.joyent.com/mgerdts/public/pi/len/platform-20180726T160921Z.iso
https://us-east.manta.joyent.com/mgerdts/public/pi/len/platform-20180726T160921Z.usb.bz2

Regards,
Mike

-------------------------------------------
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now