On 07/25/18 08:49 PM, Len Weincier wrote:
Hi

I see this commit, and the hosts where we see issues have 2 NVMe devices used as slogs, if that helps:

https://github.com/joyent/illumos-joyent/commit/f78cdc34af236a6199dd9e21376f4a46348c0d56

Len


On 25 Jul 2018, at 16:58, Len Weincier <[email protected]> wrote:

Hi

We have a very strange situation trying to upgrade to a newer SmartOS image, where the disk I/O is *very* slow.

I have been working through the released images, and the last one that works 100% is 20180329T002644Z.

From 20180412T003259Z onwards (the release with the new ZFS features such as spacemaps), the hosts become unusable in terms of disk I/O.

In our testing on the lab machine, which has only 128G of RAM, we see no pathologies.

Hosts are running all SSDs (RAIDZ2) and 2x Intel Gold 6150 processors on an SMC X11DPH-T board. The lab machine with 128GB RAM has exactly the same processors, board, and SSD-only setup; only the RAM differs.

On a production machine with 768G of RAM and the newer image, for example, zfs create -V 10G zones/test takes 2 minutes, while at the same time iostat shows the disks as relatively idle (%b = 10).

For example, inside an Ubuntu KVM guest running Postgres we are seeing 40% wait time for any disk I/O, with only 2 VMs running and the underlying disks essentially idle.

Is there anything we can look at to get to the bottom of this? It is pretty critical and is affecting our customers.

Any help appreciated
Thanks
Len

The log (https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/smartos.html#20180412T003259Z) also says "Implement KPTI". Does the problem go away when you boot with kpti=0? I've seen disk I/O go haywire on very old CPUs without process-context identifiers (PCID).
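
One way to compare the two boots (default and kpti=0) would be to time the same command under each and compare the wall-clock durations. Below is a minimal sketch in Python, assuming python3 is available on the host; the helper name time_command is made up for illustration, and the command in the usage comment is the one from the report above.

```python
import subprocess
import time


def time_command(argv, runs=1):
    """Run argv `runs` times; return each wall-clock duration in seconds.

    time_command is a hypothetical helper, not part of any SmartOS tooling.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


# Usage (run as root on the affected host, once per boot configuration):
#   print(time_command(["zfs", "create", "-V", "10G", "zones/test"]))
```

For a command that changes state, such as zfs create, keep runs=1 and destroy the test volume before repeating the measurement on the other boot.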

Michal


-------------------------------------------
smartos-discuss