Doh.. the problems of it being late :) .. there should be a ‘public’ in there. 

Try

https://us-east.manta.joyent.com/jbk/public/OS-7079/platform-20180719T001516Z.iso


From: Youzhong Yang <[email protected]>
Reply: Youzhong Yang <[email protected]>
Date: August 13, 2018 at 12:12:52 AM
To: Jason King <[email protected]>
Cc: [email protected] <[email protected]>
Subject:  Re: [smartos-discuss] still hang at boot - OS-7079 mp_startup_common 
races itself  

I got this:

{"code":"ResourceNotFound","message":"/jbk/OS-7079/platform-20180719T001516Z.iso does not exist"}

In our /etc/system, I have:
set pcplusmp:apic_panic_on_nmi=1
set apix:apic_panic_on_nmi=1

If I set them to 0, and boot with -k, an NMI should drop into kmdb, right? I 
will build an image now and test.


On Mon, Aug 13, 2018 at 1:04 AM, Jason King <[email protected]> wrote:
There are a couple of ways: you can boot -kd and set a breakpoint to set it.  
You can also set it in etc/system in the proto area when building an image.
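
For the etc/system route, a minimal sketch of what the entries might look 
like, assuming the tunable is named apic_kmdb_on_nmi (the name is assumed 
here, not confirmed above) and that it is set for both the pcplusmp and apix 
modules:

```
* Assumed tunable name; check it against the source tree before building.
set pcplusmp:apic_kmdb_on_nmi=1
set apix:apic_kmdb_on_nmi=1
```

kmdb still has to be loaded for this to have any effect, hence booting with -k.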

If you want, I do have an image of 20180719 with OS-7079 applied and kmdb on 
NMI already set (you’d still want to boot -k). You can grab it at
https://us-east.manta.joyent.com/jbk/OS-7079/platform-20180719T001516Z.{iso,tgz,usb.bz2}

If you do, it’d be interesting to see what ::cpustack on each core looks like.


From: Youzhong Yang <[email protected]>
Reply: Youzhong Yang <[email protected]>
Date: August 12, 2018 at 11:58:48 PM
To: Jason King <[email protected]>
Cc: [email protected] <[email protected]>
Subject:  Re: [smartos-discuss] still hang at boot - OS-7079 mp_startup_common 
races itself

I sent NMI, but it printed out a stack trace plus a message "no dump device" or 
something then rebooted. I tried -v on my old supermicro system, on the console 
I saw message about sd## devices, then it hung. The console still responded to 
keyboard, but just stayed that way forever.

What change is needed to drop into kmdb when the OS receives an NMI?

On Mon, Aug 13, 2018 at 12:06 AM, Jason King <[email protected]> wrote:
Was that with boot -v?  Are you able to send the system an NMI after it hangs 
(or get the boot -v output up to the hang)?

Prior to OS-7079, the system would begin starting the next CPU before it had 
completely finished initializing the ‘current’ CPU, which could deadlock 
depending on which CPU obtained a particular lock first. The change makes it 
wait until the current CPU has finished starting up before proceeding to the 
next CPU.

It’s certainly possible it could have revealed another bug. The OS-7079 bug 
itself was introduced almost 10 years ago, but didn’t seem to be easy to 
trigger until recent CPUs.


From: Youzhong Yang <[email protected]>
Reply: [email protected] <[email protected]>
Date: August 12, 2018 at 10:46:05 PM
To: [email protected] <[email protected]>
Subject:  [smartos-discuss] still hang at boot - OS-7079 mp_startup_common 
races itself

Today I built a smartos image (with all git repos synced to master) and 
rebooted the host with that image. It hung after the banner message + one more 
line about power management or something.

Then I reverted OS-7079, built an image, rebooted, it worked perfectly.

So does it mean OS-7079 fixed one issue, but caused another? My host is an old 
Supermicro X8DAH, Intel(R) Xeon(R) CPU X5570  @ 2.93GHz. Tomorrow I will try on 
a new all NVMe system and see if it works.

Thanks.