On Wed, 2018-07-25 at 16:58 +0200, Len Weincier wrote:
> Hi 
> We have a very strange situation trying to upgrade to a newer SmartOS
> image where the disk I/O is *very* slow.
> I have been working through the released images, and the last one that
> works 100% is 20180329T002644Z.
> From 20180412T003259Z onwards (the releases with the new ZFS features
> such as spacemaps), the hosts become unusable in terms of disk I/O.
> In our testing on the lab machine, which has only 128 GB of RAM, we see
> no pathologies.
> Hosts are running all SSDs (RAIDZ2) and 2x Intel Gold 6150 processors
> on an SMC X11DPH-T board. The lab machine with 128 GB of RAM has
> exactly the same processors, board, and SSD-only setup - except for
> the RAM.
> On a production machine with 768 GB of RAM and the newer image, for
> example, "zfs create -V 10G zones/test" takes 2 minutes, while at the
> same time iostat shows the disks as relatively idle (%b = 10).
> For example, inside an Ubuntu KVM guest running Postgres we are seeing
> 40% wait time for any disk I/O, with only 2 VMs running and the
> underlying disks essentially idle.
> Is there anything we can look at to get to the bottom of this? It is
> pretty critical and affecting our customers.

Hi 

I have managed to grab a bunch of stack traces with a DTrace script on
the fbt:zfs::entry probes and generated a flame graph while the system
was behaving badly:

https://static.prod.cloudafrica.net/out.svg

This shows a bunch of activity in metaslab allocation, if I read it
correctly?
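For anyone who wants to reproduce the capture: below is a minimal sketch
of the kind of DTrace invocation that aggregates kernel stacks on ZFS
function entry. The probe name matches what I traced; the 30-second
window, output filenames, and the use of Brendan Gregg's FlameGraph
scripts are assumptions, not necessarily the exact steps I ran.

```shell
# Aggregate kernel stack traces on every ZFS function entry for 30 seconds.
# Requires root on SmartOS/illumos; fbt probes on a hot path like zfs can
# add noticeable overhead, so keep the window short.
dtrace -x stackframes=100 \
  -n 'fbt:zfs::entry { @[stack()] = count(); }' \
  -n 'tick-30s { exit(0); }' \
  -o zfs.stacks

# Fold the stacks and render a flame graph with the FlameGraph tools
# (stackcollapse.pl / flamegraph.pl from the FlameGraph repo, assumed to
# be in the current directory).
./stackcollapse.pl zfs.stacks > zfs.folded
./flamegraph.pl zfs.folded > out.svg
```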

Any ideas or anything I can look at please let me know.

I have confirmed that this only occurs when the system is under I/O
load.
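Since the regression lines up with the images that introduced the new
spacemap-related feature flags, it may also be worth confirming which of
those features are actually active on the pool, and measuring how long
metaslab loads take while the slowdown is happening. A rough sketch
(the pool name "zones" and the 30-second window are assumptions):

```shell
# Which feature flags are enabled/active on the pool? The spacemap-related
# ones are the interesting entries here.
zpool get all zones | grep 'feature@'

# Latency distribution of metaslab loads during the slowdown.
# metaslab_load is a kernel function in the illumos ZFS code; the 30s
# window is arbitrary.
dtrace -n '
  fbt::metaslab_load:entry { self->ts = timestamp; }
  fbt::metaslab_load:return /self->ts/ {
    @["metaslab_load (ns)"] = quantize(timestamp - self->ts);
    self->ts = 0;
  }
  tick-30s { exit(0); }'
```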

Thanks
Len



-------------------------------------------
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
