On 07/25/18 08:49 PM, Len Weincier wrote:
Hi
I see this commit, and the hosts where we see issues have 2 NVMe devs
used as slogs, if that helps:
https://github.com/joyent/illumos-joyent/commit/f78cdc34af236a6199dd9e21376f4a46348c0d56
Len
On 25 Jul 2018, at 16:58, Len Weincier wrote:
Hi
We have a very strange situation trying to upgrade to a newer SmartOS image,
where the disk I/O is *very* slow.
I have been working through the released images and the last one that
works 100% is 20180329T002644Z
From 20180412T003259Z onwards (the release with the new ZFS features
such as spacemaps), the hosts become unusable in terms of disk I/O.
In our testing with the lab machine with only 128G ram we see no
pathologies.
Hosts are running all SSDs (RAIDZ2) and Intel Gold 6150 x2 processors
on an SMC X11DPH-T board.
The lab machine with 128GB RAM has exactly the same processors, board,
and SSD-only setup; only the RAM differs.
On a production machine with 768G RAM and the newer image, for example,
zfs create -V 10G zones/test takes 2 minutes, while at the same time iostat
shows the disks as relatively idle (%b = 10).
Similarly, inside an Ubuntu KVM with Postgres we are seeing 40% wait
time for any disk I/O, even though only 2 VMs are running and the
underlying disks are essentially idle.
Is there anything we can look at to get to the bottom of this? It is
pretty critical and is affecting our customers.
Any help appreciated
Thanks
Len
The log
(https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/smartos.html#20180412T003259Z)
also says "Implement KPTI". Does the problem go away when you boot with
kpti=0? I've seen disk I/O go haywire on very old CPUs without
process-context identifiers (PCID).
Michal