Re: !EARLY_AP_STARTUP and -CURRENT

2017-08-31 Thread Kevin Bowling
panic: mutex sched lock 0 not owned at /d0/kev/freebsd/sys/kern/sched_ule.c:2379

On Thu, Aug 31, 2017 at 7:38 AM, John Baldwin wrote:
> On Wednesday, August 30, 2017 04:54:07 PM Kevin Bowling wrote:
>> I'm dealing with a shit sandwich right now where the mps(4) or cam_da
>> reorders drives on a few thousand legacy MBR machines I have (and I
>> can't easily install glabel ATM), and !EARLY_AP_STARTUP seems to have
>> regressed.  I'd like to be able to run w/o EARLY_AP_STARTUP right
>> quick so I can take a more leisurely approach to fixing mps(4) boot
>> probe correctly (freebsd-scsi@ has that thread).
>>
>> With WITNESS and !EARLY_AP_STARTUP I hit an assert in sched_setpreempt
>> in kern/sched_ule.c 100% of the time.  Here are a couple invocations,
>> with oddness around a different CPU holding the curthread lock but
>> somehow a different AP is runnable in the function:
>
> Do you have the panic messages?
>
>> Tracing pid 11 tid 100020 td 0xf80128cd1560
>> kdb_enter() at kdb_enter+0x3b/frame 0xfe3e653dcc10
>> vpanic() at vpanic+0x1b9/frame 0xfe3e653dcc90
>> panic() at panic+0x43/frame 0xfe3e653dccf0
>> __mtx_assert() at __mtx_assert+0xb4/frame 0xfe3e653dcd00
>> sched_add() at sched_add+0x152/frame 0xfe3e653dcd40
>> intr_event_schedule_thread() at intr_event_schedule_thread+0xca/frame 0xfe3e653dcd80
>> swi_sched() at swi_sched+0x6c/frame 0xfe3e653dcdc0
>> softclock_call_cc() at softclock_call_cc+0x155/frame 0xfe3e653dce70
>> callout_process() at callout_process+0x1f9/frame 0xfe3e653dcef0
>> handleevents() at handleevents+0x1a4/frame 0xfe3e653dcf30
>> cpu_initclocks_ap() at cpu_initclocks_ap+0xc8/frame 0xfe3e653dcf60
>> init_secondary_tail() at init_secondary_tail+0x1e3/frame 0xfe3e653dcf90
>> init_secondary() at init_secondary+0x2b3/frame 0xfe3e653dcff0
>>
>>
>> db> show thread 0xf80128cd1560
>> Thread 100020 at 0xf80128cd1560:
>>  proc (pid 11): 0xf80128cb5000
>>  name: idle: cpu17
>>  stack: 0xfe3e5cd88000-0xfe3e5cd8bfff
>>  flags: 0x40024  pflags: 0x20
>>  state: CAN RUN
>>  priority: 255
>>  container lock: sched lock 0 (0x81c39800)
>> db> show lock 0x81c39800
>>  class: spin mutex
>>  name: sched lock 0
>>  flags: {SPIN, RECURSE}
>>  state: {OWNED}
>>  owner: 0xf80128cca000 (tid 100017, pid 11, "idle: cpu14")
>>
>>
>> db> bt
>> Tracing pid 11 tid 100021 td 0xf80128cd2000
>> kdb_enter() at kdb_enter+0x3b/frame 0xfe3e655e4c10
>> vpanic() at vpanic+0x1b9/frame 0xfe3e655e4c90
>> panic() at panic+0x43/frame 0xfe3e655e4cf0
>> __mtx_assert() at __mtx_assert+0xb4/frame 0xfe3e655e4d00
>> sched_add() at sched_add+0x152/frame 0xfe3e655e4d40
>> intr_event_schedule_thread() at intr_event_schedule_thread+0xca/frame 0xfe3e655e4d80
>> swi_sched() at swi_sched+0x6c/frame 0xfe3e655e4dc0
>> softclock_call_cc() at softclock_call_cc+0x155/frame 0xfe3e655e4e70
>> callout_process() at callout_process+0x1f9/frame 0xfe3e655e4ef0
>> handleevents() at handleevents+0x1a4/frame 0xfe3e655e4f30
>> cpu_initclocks_ap() at cpu_initclocks_ap+0xc8/frame 0xfe3e655e4f60
>> init_secondary_tail() at init_secondary_tail+0x1e3/frame 0xfe3e655e4f90
>> init_secondary() at init_secondary+0x2b3/frame 0xfe3e655e4ff0
>> db> show thread 0xf80128cd2000
>> Thread 100021 at 0xf80128cd2000:
>>  proc (pid 11): 0xf80128cb6000
>>  name: idle: cpu18
>>  stack: 0xfe3e5cf17000-0xfe3e5cf1afff
>>  flags: 0x40024  pflags: 0x20
>>  state: CAN RUN
>>  priority: 255
>>  container lock: sched lock 0 (0x81c39800)
>> db> show lock 0x81c39800
>>  class: spin mutex
>>  name: sched lock 0
>>  flags: {SPIN, RECURSE}
>>  state: {OWNED}
>>  owner: 0xf80128cdb560 (tid 100028, pid 11, "idle: cpu25")
>>
>> Regards,
>> Kevin
>
>
> --
> John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: !EARLY_AP_STARTUP and -CURRENT

2017-08-31 Thread John Baldwin
On Wednesday, August 30, 2017 04:54:07 PM Kevin Bowling wrote:
> I'm dealing with a shit sandwich right now where the mps(4) or cam_da
> reorders drives on a few thousand legacy MBR machines I have (and I
> can't easily install glabel ATM), and !EARLY_AP_STARTUP seems to have
> regressed.  I'd like to be able to run w/o EARLY_AP_STARTUP right
> quick so I can take a more leisurely approach to fixing mps(4) boot
> probe correctly (freebsd-scsi@ has that thread).
> 
> With WITNESS and !EARLY_AP_STARTUP I hit an assert in sched_setpreempt
> in kern/sched_ule.c 100% of the time.  Here are a couple invocations,
> with oddness around a different CPU holding the curthread lock but
> somehow a different AP is runnable in the function:

Do you have the panic messages?

> Tracing pid 11 tid 100020 td 0xf80128cd1560
> kdb_enter() at kdb_enter+0x3b/frame 0xfe3e653dcc10
> vpanic() at vpanic+0x1b9/frame 0xfe3e653dcc90
> panic() at panic+0x43/frame 0xfe3e653dccf0
> __mtx_assert() at __mtx_assert+0xb4/frame 0xfe3e653dcd00
> sched_add() at sched_add+0x152/frame 0xfe3e653dcd40
> intr_event_schedule_thread() at intr_event_schedule_thread+0xca/frame 0xfe3e653dcd80
> swi_sched() at swi_sched+0x6c/frame 0xfe3e653dcdc0
> softclock_call_cc() at softclock_call_cc+0x155/frame 0xfe3e653dce70
> callout_process() at callout_process+0x1f9/frame 0xfe3e653dcef0
> handleevents() at handleevents+0x1a4/frame 0xfe3e653dcf30
> cpu_initclocks_ap() at cpu_initclocks_ap+0xc8/frame 0xfe3e653dcf60
> init_secondary_tail() at init_secondary_tail+0x1e3/frame 0xfe3e653dcf90
> init_secondary() at init_secondary+0x2b3/frame 0xfe3e653dcff0
> 
> 
> db> show thread 0xf80128cd1560
> Thread 100020 at 0xf80128cd1560:
>  proc (pid 11): 0xf80128cb5000
>  name: idle: cpu17
>  stack: 0xfe3e5cd88000-0xfe3e5cd8bfff
>  flags: 0x40024  pflags: 0x20
>  state: CAN RUN
>  priority: 255
>  container lock: sched lock 0 (0x81c39800)
> db> show lock 0x81c39800
>  class: spin mutex
>  name: sched lock 0
>  flags: {SPIN, RECURSE}
>  state: {OWNED}
>  owner: 0xf80128cca000 (tid 100017, pid 11, "idle: cpu14")
> 
> 
> db> bt
> Tracing pid 11 tid 100021 td 0xf80128cd2000
> kdb_enter() at kdb_enter+0x3b/frame 0xfe3e655e4c10
> vpanic() at vpanic+0x1b9/frame 0xfe3e655e4c90
> panic() at panic+0x43/frame 0xfe3e655e4cf0
> __mtx_assert() at __mtx_assert+0xb4/frame 0xfe3e655e4d00
> sched_add() at sched_add+0x152/frame 0xfe3e655e4d40
> intr_event_schedule_thread() at intr_event_schedule_thread+0xca/frame 0xfe3e655e4d80
> swi_sched() at swi_sched+0x6c/frame 0xfe3e655e4dc0
> softclock_call_cc() at softclock_call_cc+0x155/frame 0xfe3e655e4e70
> callout_process() at callout_process+0x1f9/frame 0xfe3e655e4ef0
> handleevents() at handleevents+0x1a4/frame 0xfe3e655e4f30
> cpu_initclocks_ap() at cpu_initclocks_ap+0xc8/frame 0xfe3e655e4f60
> init_secondary_tail() at init_secondary_tail+0x1e3/frame 0xfe3e655e4f90
> init_secondary() at init_secondary+0x2b3/frame 0xfe3e655e4ff0
> db> show thread 0xf80128cd2000
> Thread 100021 at 0xf80128cd2000:
>  proc (pid 11): 0xf80128cb6000
>  name: idle: cpu18
>  stack: 0xfe3e5cf17000-0xfe3e5cf1afff
>  flags: 0x40024  pflags: 0x20
>  state: CAN RUN
>  priority: 255
>  container lock: sched lock 0 (0x81c39800)
> db> show lock 0x81c39800
>  class: spin mutex
>  name: sched lock 0
>  flags: {SPIN, RECURSE}
>  state: {OWNED}
>  owner: 0xf80128cdb560 (tid 100028, pid 11, "idle: cpu25")
> 
> Regards,
> Kevin


-- 
John Baldwin


Re: !EARLY_AP_STARTUP and -CURRENT

2017-08-31 Thread Gary Jennejohn
On Wed, 30 Aug 2017 16:54:07 -0700
Kevin Bowling wrote:

> I'm dealing with a shit sandwich right now where the mps(4) or cam_da
> reorders drives on a few thousand legacy MBR machines I have (and I
> can't easily install glabel ATM), and !EARLY_AP_STARTUP seems to have
> regressed.  I'd like to be able to run w/o EARLY_AP_STARTUP right
> quick so I can take a more leisurely approach to fixing mps(4) boot
> probe correctly (freebsd-scsi@ has that thread).
> 
> With WITNESS and !EARLY_AP_STARTUP I hit an assert in sched_setpreempt
> in kern/sched_ule.c 100% of the time.  Here are a couple invocations,
> with oddness around a different CPU holding the curthread lock but
> somehow a different AP is runnable in the function:
> 

You might consider using SCHED_4BSD until this problem can be
resolved.

I tried booting a system with WITNESS and !EARLY_AP_STARTUP, but
using SCHED_4BSD, and it at least reaches multi-user mode.

Since it isn't clear under what circumstances your problem arises
I didn't pursue it any further, but this approach may at least
allow you to continue on.
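
For reference, a kernel configuration along these lines might look
like the fragment below. This is only a sketch: it assumes an amd64
GENERIC that already sets SCHED_ULE and EARLY_AP_STARTUP, and the
ident NOEARLYAP is a made-up name.

```
include		GENERIC
ident		NOEARLYAP	# hypothetical name

nooptions	SCHED_ULE	# drop the ULE scheduler
options 	SCHED_4BSD	# use the 4BSD scheduler instead
nooptions	EARLY_AP_STARTUP	# start the APs late
```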

> Tracing pid 11 tid 100020 td 0xf80128cd1560
> kdb_enter() at kdb_enter+0x3b/frame 0xfe3e653dcc10
> vpanic() at vpanic+0x1b9/frame 0xfe3e653dcc90
> panic() at panic+0x43/frame 0xfe3e653dccf0
> __mtx_assert() at __mtx_assert+0xb4/frame 0xfe3e653dcd00
> sched_add() at sched_add+0x152/frame 0xfe3e653dcd40
> intr_event_schedule_thread() at intr_event_schedule_thread+0xca/frame 0xfe3e653dcd80
> swi_sched() at swi_sched+0x6c/frame 0xfe3e653dcdc0
> softclock_call_cc() at softclock_call_cc+0x155/frame 0xfe3e653dce70
> callout_process() at callout_process+0x1f9/frame 0xfe3e653dcef0
> handleevents() at handleevents+0x1a4/frame 0xfe3e653dcf30
> cpu_initclocks_ap() at cpu_initclocks_ap+0xc8/frame 0xfe3e653dcf60
> init_secondary_tail() at init_secondary_tail+0x1e3/frame 0xfe3e653dcf90
> init_secondary() at init_secondary+0x2b3/frame 0xfe3e653dcff0
> 
> 
> db> show thread 0xf80128cd1560  
> Thread 100020 at 0xf80128cd1560:
>  proc (pid 11): 0xf80128cb5000
>  name: idle: cpu17
>  stack: 0xfe3e5cd88000-0xfe3e5cd8bfff
>  flags: 0x40024  pflags: 0x20
>  state: CAN RUN
>  priority: 255
>  container lock: sched lock 0 (0x81c39800)
> db> show lock 0x81c39800  
>  class: spin mutex
>  name: sched lock 0
>  flags: {SPIN, RECURSE}
>  state: {OWNED}
>  owner: 0xf80128cca000 (tid 100017, pid 11, "idle: cpu14")
> 
> 
> db> bt  
> Tracing pid 11 tid 100021 td 0xf80128cd2000
> kdb_enter() at kdb_enter+0x3b/frame 0xfe3e655e4c10
> vpanic() at vpanic+0x1b9/frame 0xfe3e655e4c90
> panic() at panic+0x43/frame 0xfe3e655e4cf0
> __mtx_assert() at __mtx_assert+0xb4/frame 0xfe3e655e4d00
> sched_add() at sched_add+0x152/frame 0xfe3e655e4d40
> intr_event_schedule_thread() at intr_event_schedule_thread+0xca/frame 0xfe3e655e4d80
> swi_sched() at swi_sched+0x6c/frame 0xfe3e655e4dc0
> softclock_call_cc() at softclock_call_cc+0x155/frame 0xfe3e655e4e70
> callout_process() at callout_process+0x1f9/frame 0xfe3e655e4ef0
> handleevents() at handleevents+0x1a4/frame 0xfe3e655e4f30
> cpu_initclocks_ap() at cpu_initclocks_ap+0xc8/frame 0xfe3e655e4f60
> init_secondary_tail() at init_secondary_tail+0x1e3/frame 0xfe3e655e4f90
> init_secondary() at init_secondary+0x2b3/frame 0xfe3e655e4ff0
> db> show thread 0xf80128cd2000  
> Thread 100021 at 0xf80128cd2000:
>  proc (pid 11): 0xf80128cb6000
>  name: idle: cpu18
>  stack: 0xfe3e5cf17000-0xfe3e5cf1afff
>  flags: 0x40024  pflags: 0x20
>  state: CAN RUN
>  priority: 255
>  container lock: sched lock 0 (0x81c39800)
> db> show lock 0x81c39800  
>  class: spin mutex
>  name: sched lock 0
>  flags: {SPIN, RECURSE}
>  state: {OWNED}
>  owner: 0xf80128cdb560 (tid 100028, pid 11, "idle: cpu25")
> 
> Regards,
> Kevin


-- 
Gary Jennejohn


!EARLY_AP_STARTUP and -CURRENT

2017-08-30 Thread Kevin Bowling
I'm dealing with a shit sandwich right now where the mps(4) or cam_da
reorders drives on a few thousand legacy MBR machines I have (and I
can't easily install glabel ATM), and !EARLY_AP_STARTUP seems to have
regressed.  I'd like to be able to run w/o EARLY_AP_STARTUP right
quick so I can take a more leisurely approach to fixing mps(4) boot
probe correctly (freebsd-scsi@ has that thread).

With WITNESS and !EARLY_AP_STARTUP I hit an assert in sched_setpreempt
in kern/sched_ule.c 100% of the time.  Here are a couple invocations,
with oddness around a different CPU holding the curthread lock but
somehow a different AP is runnable in the function:

Tracing pid 11 tid 100020 td 0xf80128cd1560
kdb_enter() at kdb_enter+0x3b/frame 0xfe3e653dcc10
vpanic() at vpanic+0x1b9/frame 0xfe3e653dcc90
panic() at panic+0x43/frame 0xfe3e653dccf0
__mtx_assert() at __mtx_assert+0xb4/frame 0xfe3e653dcd00
sched_add() at sched_add+0x152/frame 0xfe3e653dcd40
intr_event_schedule_thread() at intr_event_schedule_thread+0xca/frame 0xfe3e653dcd80
swi_sched() at swi_sched+0x6c/frame 0xfe3e653dcdc0
softclock_call_cc() at softclock_call_cc+0x155/frame 0xfe3e653dce70
callout_process() at callout_process+0x1f9/frame 0xfe3e653dcef0
handleevents() at handleevents+0x1a4/frame 0xfe3e653dcf30
cpu_initclocks_ap() at cpu_initclocks_ap+0xc8/frame 0xfe3e653dcf60
init_secondary_tail() at init_secondary_tail+0x1e3/frame 0xfe3e653dcf90
init_secondary() at init_secondary+0x2b3/frame 0xfe3e653dcff0


db> show thread 0xf80128cd1560
Thread 100020 at 0xf80128cd1560:
 proc (pid 11): 0xf80128cb5000
 name: idle: cpu17
 stack: 0xfe3e5cd88000-0xfe3e5cd8bfff
 flags: 0x40024  pflags: 0x20
 state: CAN RUN
 priority: 255
 container lock: sched lock 0 (0x81c39800)
db> show lock 0x81c39800
 class: spin mutex
 name: sched lock 0
 flags: {SPIN, RECURSE}
 state: {OWNED}
 owner: 0xf80128cca000 (tid 100017, pid 11, "idle: cpu14")


db> bt
Tracing pid 11 tid 100021 td 0xf80128cd2000
kdb_enter() at kdb_enter+0x3b/frame 0xfe3e655e4c10
vpanic() at vpanic+0x1b9/frame 0xfe3e655e4c90
panic() at panic+0x43/frame 0xfe3e655e4cf0
__mtx_assert() at __mtx_assert+0xb4/frame 0xfe3e655e4d00
sched_add() at sched_add+0x152/frame 0xfe3e655e4d40
intr_event_schedule_thread() at intr_event_schedule_thread+0xca/frame 0xfe3e655e4d80
swi_sched() at swi_sched+0x6c/frame 0xfe3e655e4dc0
softclock_call_cc() at softclock_call_cc+0x155/frame 0xfe3e655e4e70
callout_process() at callout_process+0x1f9/frame 0xfe3e655e4ef0
handleevents() at handleevents+0x1a4/frame 0xfe3e655e4f30
cpu_initclocks_ap() at cpu_initclocks_ap+0xc8/frame 0xfe3e655e4f60
init_secondary_tail() at init_secondary_tail+0x1e3/frame 0xfe3e655e4f90
init_secondary() at init_secondary+0x2b3/frame 0xfe3e655e4ff0
db> show thread 0xf80128cd2000
Thread 100021 at 0xf80128cd2000:
 proc (pid 11): 0xf80128cb6000
 name: idle: cpu18
 stack: 0xfe3e5cf17000-0xfe3e5cf1afff
 flags: 0x40024  pflags: 0x20
 state: CAN RUN
 priority: 255
 container lock: sched lock 0 (0x81c39800)
db> show lock 0x81c39800
 class: spin mutex
 name: sched lock 0
 flags: {SPIN, RECURSE}
 state: {OWNED}
 owner: 0xf80128cdb560 (tid 100028, pid 11, "idle: cpu25")
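
To spell out the oddness in both dumps: the running idle thread's
container lock is sched lock 0, but that lock is held by a different
CPU's idle thread, so an "is this lock owned by me" check cannot
pass. The sketch below is a user-space toy model of that check, not
the kernel code; every toy_* name is hypothetical, and it assumes
only that an ownership assert compares the lock's recorded owner
against the running thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model, not FreeBSD code. All names are hypothetical. */
struct toy_thread {
	int tid;
	const char *name;
};

struct toy_mtx {
	const char *name;
	const struct toy_thread *owner;	/* NULL when unlocked */
};

/*
 * True only when `td` is the recorded owner of `m`.  An ownership
 * assert made on behalf of `td` would fail whenever this is false.
 */
static bool
toy_mtx_owned(const struct toy_mtx *m, const struct toy_thread *td)
{
	return (m->owner != NULL && m->owner == td);
}
```

Plugging in the first dump (owner tid 100017 "idle: cpu14", running
thread tid 100020 "idle: cpu17"), the check is false for cpu17 and
true only for cpu14.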

Regards,
Kevin