I got this:

{"code":"ResourceNotFound","message":"/jbk/OS-7079/platform-20180719T001516Z.iso does not exist"}

In our /etc/system, I have
set pcplusmp:apic_panic_on_nmi=1
set apix:apic_panic_on_nmi=1

If I set them to 0 and boot with -k, an NMI should drop into kmdb, right? I
will build an image now and test.
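
For reference, the change I plan to test looks roughly like the sketch
below. Clearing the two existing settings is what I described above; the
apic_kmdb_on_nmi lines are my assumption about the tunable that actually
requests the debugger, so treat them as unverified until the test is done.

```
* /etc/system sketch (lines starting with '*' are comments)
* Stop panicking on NMI: the two settings already in our file, set to 0.
set pcplusmp:apic_panic_on_nmi=0
set apix:apic_panic_on_nmi=0
* ASSUMPTION: a separate tunable asks for kmdb on NMI; its name is not
* confirmed anywhere in this thread.
set pcplusmp:apic_kmdb_on_nmi=1
set apix:apic_kmdb_on_nmi=1
```

Either way, the system still has to be booted with -k so that kmdb is
loaded when the NMI arrives.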


On Mon, Aug 13, 2018 at 1:04 AM, Jason King <[email protected]>
wrote:

> There are a couple of ways: you can boot -kd and set a breakpoint to set
> it.  You can also set it in etc/system in the proto area when building an
> image.
>
> If you want, I do have an image of 20180719 w/ OS-7079 applied and kmdb on
> NMI already set (you’d still want to boot -k)  — you can grab it at
> https://us-east.manta.joyent.com/jbk/OS-7079/platform-20180719T001516Z.{iso,tgz,usb.bz2}
>
> If you do, it’d be interesting to see what ::cpustack on each core looks
> like.
>
>
> From: Youzhong Yang <[email protected]>
> Reply: Youzhong Yang <[email protected]>
> Date: August 12, 2018 at 11:58:48 PM
> To: Jason King <[email protected]>
> Cc: [email protected]
> Subject: Re: [smartos-discuss] still hang at boot - OS-7079
> mp_startup_common races itself
>
> I sent an NMI, but it printed a stack trace plus a message like "no dump
> device", then rebooted. I tried -v on my old Supermicro system: on the
> console I saw messages about sd## devices, then it hung. The console
> still responded to the keyboard, but just stayed that way forever.
>
> What change is needed to drop into kmdb when the OS receives an NMI?
>
> On Mon, Aug 13, 2018 at 12:06 AM, Jason King <[email protected]>
> wrote:
>
>> Was that with boot -v?  Are you able to send the system an NMI after it
>> hangs (or get the boot -v output up to the hang)?
>>
>> Prior to OS-7079, the system would begin starting the next CPU before
>> it had completely finished initializing the ‘current’ CPU (which could
>> deadlock depending on which CPU obtained a particular lock first). The
>> change makes it wait until the current CPU has finished starting up
>> before proceeding to the next CPU.
>>
>> It’s certainly possible it could have revealed another bug. The bug
>> fixed by OS-7079 was introduced almost 10 years ago, but didn’t seem to
>> be easy to trigger until recent CPUs.
>>
>>
>> From: Youzhong Yang <[email protected]>
>> Reply: [email protected]
>> Date: August 12, 2018 at 10:46:05 PM
>> To: [email protected]
>> Subject: [smartos-discuss] still hang at boot - OS-7079
>> mp_startup_common races itself
>>
>> Today I built a SmartOS image (with all git repos synced to master) and
>> rebooted the host with that image. It hung after the banner message plus
>> one more line about power management or something.
>>
>> Then I reverted OS-7079, built an image, and rebooted; it worked perfectly.
>>
>> So does it mean OS-7079 fixed one issue, but caused another? My host is
>> an old Supermicro X8DAH, Intel(R) Xeon(R) CPU X5570  @ 2.93GHz. Tomorrow I
>> will try on a new all NVMe system and see if it works.
>>
>> Thanks.
>> *smartos-discuss* | Archives
>> <https://www.listbox.com/member/archive/184463/=now> | Modify
>> <https://www.listbox.com/member/?> Your Subscription
>> <https://www.listbox.com>
>>
>>
>



