That is interesting!!

I'd have been inclined to:
 - crack off a dtrace fbt profile run looking at what was on cpu most of the
   time, and what the most common stacks were
 - grab a crash dump so you can poke around later...
 - check out the cpu power states (using powertop or similar)

If you were stuck in the lowest c-state, all the things you described could
well happen, but I have never seen that happen...

I'd also be looking closely at something having clamped the arc. I have seen
arc_no_grow stuck as 1 in the past, and once in that state, the arc shrinks to
nearly nothing, causing much pain. mdb -k and then a ::arc for some arc
details and perhaps an arc_no_grow::print -t

Nothing much else comes immediately to mind...

Cheers!
Nathan.

Sent from my Samsung Galaxy smartphone.
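
For anyone wanting to watch for the clamped-ARC condition without sitting in mdb, here is a minimal sketch that reads the output of `kstat -p zfs:0:arcstats` and flags an ARC target that has collapsed relative to its maximum. It assumes the `c` (target size) and `c_max` statistics are present under those names on the release in question, which should be checked first. The function names, the sample values and the 10% threshold are all made up for illustration, and this does not read arc_no_grow itself, only its likely side effect.

```python
import sys

# Illustrative sample only (invented numbers), in `kstat -p` format:
# module:instance:name:statistic<TAB>value
SAMPLE = (
    "zfs:0:arcstats:c\t67108864\n"
    "zfs:0:arcstats:c_max\t17179869184\n"
    "zfs:0:arcstats:class\tmisc\n"
    "zfs:0:arcstats:size\t70254592\n"
)


def parse_kstat(text):
    """Parse `kstat -p` text into {statistic: int}, skipping non-integers."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank or malformed line
        key, value = parts
        stat = key.rsplit(":", 1)[-1]
        try:
            stats[stat] = int(value)
        except ValueError:
            continue  # e.g. "class misc" or floating-point timestamps
    return stats


def arc_looks_clamped(stats, threshold=0.10):
    """True if the ARC target `c` is below `threshold` * `c_max`.

    The threshold is a hypothetical rule of thumb, not a documented limit.
    """
    c_max = stats.get("c_max", 0)
    if c_max <= 0:
        return False  # cannot judge without a sane c_max
    return stats.get("c", 0) / c_max < threshold


if __name__ == "__main__":
    # Usage: kstat -p zfs:0:arcstats | python arc_check.py
    text = SAMPLE if sys.stdin.isatty() else sys.stdin.read()
    stats = parse_kstat(text)
    print("ARC looks clamped" if arc_looks_clamped(stats) else "ARC target looks sane")
```

A positive result only means the target size is small; confirming that arc_no_grow is actually stuck still needs the mdb session described above.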

-------- Original message --------
From: Andre van Eyssen <[email protected]>
Date: 30/3/20 10:43 am (GMT+10:00)
To: [email protected]
Cc: Melbourne Solaris and Oracle Systems User Group <[email protected]>
Subject: [msosug] Interesting case

Heya,

Since we're having Fun Times with Solaris while all locked-down, I'll add one
that popped up over the weekend.

Host is running 11.3/x86. Good performance, normally stable and running a
near-idle CPU burn as this is primarily a storage host with moderate demands
(only writing about 2MB/sec average over time apart from bursts).

In an absolute instant with no warning, CPU usage starts running high. Here's
a graph from Zabbix:

http://mexico.purplecow.org/p/data/images/7/7.png

Simple operations like an ssh connection to the host are very sluggish.
Normally low-burning processes are now consuming a real % of CPU. For example,
nmbd was burning 5% according to prstat.

The kernel process for running the main data zpool was now burning about 50%
of the user time on the host and I/O was taking a long time to dequeue to
disk, leading to measurable performance changes on client systems.

FMA is reporting no activity and there is nothing in any logs, including the
non-default logfile that takes debug.* from syslog.

DRAC reports system temperature of 23 degrees and CPU temperature of 42
degrees, fluctuating only mildly.

Symptom-wise, for all intents and purposes it looked like the CPU had
throttled down to some insanely slow speed and was dragging everything through
the mud, now running up at 90% systime just servicing interrupts and ZFS.

Nothing interesting in lockstat. Nothing interesting in intrstat other than
the sheer % of CPU burned servicing mpt_sas and qlt.

After an extended period of trying to work out the root cause, the machine was
dealt an init 5 (after zpool offline on a number of clients, good thing they
were mirrored across heads...) and after a very long shutdown the machine was
re-powered with all symptoms gone.

Ideas?

CPU burn graph including pre-event and post-reboot:

http://mexico.purplecow.org/p/data/images/9/9.png

Graph legend:
    blue = user time
    red = sys time
    green = idle

Andre.

-- 
Andre van Eyssen.                  Phone:     +61 417 211 788
mail:     [email protected]      http://andre.purplecow.org
About & Contact:          http://www.purplecow.org/andre.html
_______________________________________________
msosug mailing list
[email protected]
http://mexico.purplecow.org/m/listinfo/msosug
Delivered for: [email protected]