Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started
On Mon, Apr 25, 2011 at 2:58 PM, John Baldwin j...@freebsd.org wrote: Yes, I would perhaps tweak the comment to reflect the full if statement though. Maybe something like: /* * If SMP is started and the thread is pinned or otherwise limited to * a specific set of CPUs, queue the thread to a per-CPU run queue. * Otherwise, queue the thread to the global run queue. */ That looks fine, but I'm going to add a sentence explaining why the smp_started condition is necessary: /* * If SMP is started and the thread is pinned or otherwise limited to * a specific set of CPUs, queue the thread to a per-CPU run queue. * Otherwise, queue the thread to the global run queue. * * If SMP has not yet been started we must use the global run queue * as per-CPU state may not be initialized yet and we may crash if we * try to access the per-CPU run queues. */ ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started
On Tuesday, April 26, 2011 3:46:06 pm Ryan Stone wrote: On Mon, Apr 25, 2011 at 2:58 PM, John Baldwin j...@freebsd.org wrote: Yes, I would perhaps tweak the comment to reflect the full if statement though. Maybe something like: /* * If SMP is started and the thread is pinned or otherwise limited to * a specific set of CPUs, queue the thread to a per-CPU run queue. * Otherwise, queue the thread to the global run queue. */ That looks fine, but I'm going to add a sentence explaining why the smp_started condition is necessary: /* * If SMP is started and the thread is pinned or otherwise limited to * a specific set of CPUs, queue the thread to a per-CPU run queue. * Otherwise, queue the thread to the global run queue. * * If SMP has not yet been started we must use the global run queue * as per-CPU state may not be initialized yet and we may crash if we * try to access the per-CPU run queues. */ Sure, commit away. -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started
On Wednesday, April 20, 2011 6:02:42 pm Ryan Stone wrote: On Wed, Apr 6, 2011 at 2:29 PM, John Baldwin j...@freebsd.org wrote: I guess one other option would be something like this: if (smp_started (td-td_pinned != 0 || td-td_flags TDF_BOUND || ts-ts_flags TSF_AFFINITY)) { if (td-td_pinned != 0) cpu = td-td_lastcpu; else if (td-td_flags TDF_BOUND) { /* Find CPU from bound runq. */ KASSERT(...); cpu = ts-ts_runq - runq_pcpu[0]; } else /* Find a valid CPU for our cpuset. */ cpu = sched_pickcpu(td); ts-ts_runq = runq_pcpu[cpu]; single_cpu = 1; CTR3(KTR_RUNQ, ...); } else { /* Global runq case. */ } This also avoids duplicating some common code to all the single_cpu cases. -- John Baldwin I went with this option. Does this look right? Yes, I would perhaps tweak the comment to reflect the full if statement though. Maybe something like: /* * If SMP is started and the thread is pinned or otherwise limited to * a specific set of CPUs, queue the thread to a per-CPU run queue. * Otherwise, queue the thread to the global run queue. */ Index: sys/kern/sched_4bsd.c === --- sys/kern/sched_4bsd.c (revision 220603) +++ sys/kern/sched_4bsd.c (working copy) @@ -1246,30 +1246,28 @@ } TD_SET_RUNQ(td); - if (td-td_pinned != 0) { - cpu = td-td_lastcpu; + /* +* If SMP is not started, don't obey any requested CPU pinning as that +* CPU has either not yet started or it is curcpu. Trying to run a +* thread on a CPU that has not yet started will panic the system. +*/ + if (smp_started (td-td_pinned != 0 || td-td_flags TDF_BOUND || + ts-ts_flags TSF_AFFINITY)) { + if (td-td_pinned != 0) + cpu = td-td_lastcpu; + else if (td-td_flags TDF_BOUND) { + /* Find CPU from bound runq. */ + KASSERT(SKE_RUNQ_PCPU(ts), + (sched_add: bound td_sched not on cpu runq)); + cpu = ts-ts_runq - runq_pcpu[0]; + } else + /* Find a valid CPU for our cpuset */ + cpu = sched_pickcpu(td); ts-ts_runq = runq_pcpu[cpu]; single_cpu = 1; CTR3(KTR_RUNQ, sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td, cpu); - } else if (td-td_flags TDF_BOUND) { - /* Find CPU from bound runq. */ - KASSERT(SKE_RUNQ_PCPU(ts), - (sched_add: bound td_sched not on cpu runq)); - cpu = ts-ts_runq - runq_pcpu[0]; - single_cpu = 1; - CTR3(KTR_RUNQ, - sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td, - cpu); - } else if (ts-ts_flags TSF_AFFINITY) { - /* Find a valid CPU for our cpuset */ - cpu = sched_pickcpu(td); - ts-ts_runq = runq_pcpu[cpu]; - single_cpu = 1; - CTR3(KTR_RUNQ, - sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td, - cpu); } else { CTR2(KTR_RUNQ, sched_add: adding td_sched:%p (td:%p) to gbl runq, ts, -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started
On Wed, Apr 6, 2011 at 2:29 PM, John Baldwin j...@freebsd.org wrote: I guess one other option would be something like this: if (smp_started (td-td_pinned != 0 || td-td_flags TDF_BOUND || ts-ts_flags TSF_AFFINITY)) { if (td-td_pinned != 0) cpu = td-td_lastcpu; else if (td-td_flags TDF_BOUND) { /* Find CPU from bound runq. */ KASSERT(...); cpu = ts-ts_runq - runq_pcpu[0]; } else /* Find a valid CPU for our cpuset. */ cpu = sched_pickcpu(td); ts-ts_runq = runq_pcpu[cpu]; single_cpu = 1; CTR3(KTR_RUNQ, ...); } else { /* Global runq case. */ } This also avoids duplicating some common code to all the single_cpu cases. -- John Baldwin I went with this option. Does this look right? Index: sys/kern/sched_4bsd.c === --- sys/kern/sched_4bsd.c (revision 220603) +++ sys/kern/sched_4bsd.c (working copy) @@ -1246,30 +1246,28 @@ } TD_SET_RUNQ(td); - if (td-td_pinned != 0) { - cpu = td-td_lastcpu; + /* +* If SMP is not started, don't obey any requested CPU pinning as that +* CPU has either not yet started or it is curcpu. Trying to run a +* thread on a CPU that has not yet started will panic the system. +*/ + if (smp_started (td-td_pinned != 0 || td-td_flags TDF_BOUND || + ts-ts_flags TSF_AFFINITY)) { + if (td-td_pinned != 0) + cpu = td-td_lastcpu; + else if (td-td_flags TDF_BOUND) { + /* Find CPU from bound runq. */ + KASSERT(SKE_RUNQ_PCPU(ts), + (sched_add: bound td_sched not on cpu runq)); + cpu = ts-ts_runq - runq_pcpu[0]; + } else + /* Find a valid CPU for our cpuset */ + cpu = sched_pickcpu(td); ts-ts_runq = runq_pcpu[cpu]; single_cpu = 1; CTR3(KTR_RUNQ, sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td, cpu); - } else if (td-td_flags TDF_BOUND) { - /* Find CPU from bound runq. */ - KASSERT(SKE_RUNQ_PCPU(ts), - (sched_add: bound td_sched not on cpu runq)); - cpu = ts-ts_runq - runq_pcpu[0]; - single_cpu = 1; - CTR3(KTR_RUNQ, - sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td, - cpu); - } else if (ts-ts_flags TSF_AFFINITY) { - /* Find a valid CPU for our cpuset */ - cpu = sched_pickcpu(td); - ts-ts_runq = runq_pcpu[cpu]; - single_cpu = 1; - CTR3(KTR_RUNQ, - sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td, - cpu); } else { CTR2(KTR_RUNQ, sched_add: adding td_sched:%p (td:%p) to gbl runq, ts, ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started
On Monday, April 04, 2011 5:10:20 pm Ryan Stone wrote: I'm running into a bootup crash under sched_4bsd on HEAD. The crash happens when I have a thread bound to a single CPU that isn't the BSP, and that thread is scheduled. If the AP that the thread is bound hasn't been started up, kick_other_cpu() crashes because pcpu-pc_curthread is NULL for the AP. I've put a small test kld in http://people.freebsd.org/~rstone/4bsd_bind/ that reproduces the problem. I'm not sure what the best way to address the crash is. ULE is not affected by the problem; it seems to run the swi thread on CPU 0 until CPU 1 is running. Hummm. Patching 4BSD to use the same route as ULE may be the best solution for now if that is easiest. Alternatively, you could change 4BSD's sched_add() to not try to kick other CPUs until smp_started is true. -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started
On Wed, Apr 6, 2011 at 8:36 AM, John Baldwin j...@freebsd.org wrote: Hummm. Patching 4BSD to use the same route as ULE may be the best solution for now if that is easiest. Alternatively, you could change 4BSD's sched_add() to not try to kick other CPUs until smp_started is true. At first I thought that it was a consequence of the way it does CPU affinity, but now I see that it shortcuts if smp_started is not true. How about something like the following for 4BSD? --- sched_4bsd.c(revision 220222) +++ sched_4bsd.c(working copy) @@ -1242,14 +1242,14 @@ } TD_SET_RUNQ(td); - if (td-td_pinned != 0) { + if (smp_started td-td_pinned != 0) { cpu = td-td_lastcpu; ts-ts_runq = runq_pcpu[cpu]; single_cpu = 1; CTR3(KTR_RUNQ, sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td, cpu); - } else if (td-td_flags TDF_BOUND) { + } else if (smp_started (td-td_flags TDF_BOUND)) { /* Find CPU from bound runq. */ KASSERT(SKE_RUNQ_PCPU(ts), (sched_add: bound td_sched not on cpu runq)); @@ -1258,7 +1258,7 @@ CTR3(KTR_RUNQ, sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td, cpu); - } else if (ts-ts_flags TSF_AFFINITY) { + } else if (smp_started (ts-ts_flags TSF_AFFINITY)) { /* Find a valid CPU for our cpuset */ cpu = sched_pickcpu(td); ts-ts_runq = runq_pcpu[cpu]; The flow control is a bit awkward because of the multiple affinity/bound cpu cases. If somebody prefers the code to be structured differently I'd be open to suggestions. Ryan ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started
2011/4/6 Ryan Stone ryst...@gmail.com: On Wed, Apr 6, 2011 at 8:36 AM, John Baldwin j...@freebsd.org wrote: Hummm. Patching 4BSD to use the same route as ULE may be the best solution for now if that is easiest. Alternatively, you could change 4BSD's sched_add() to not try to kick other CPUs until smp_started is true. At first I thought that it was a consequence of the way it does CPU affinity, but now I see that it shortcuts if smp_started is not true. How about something like the following for 4BSD? --- sched_4bsd.c (revision 220222) +++ sched_4bsd.c (working copy) @@ -1242,14 +1242,14 @@ } TD_SET_RUNQ(td); - if (td-td_pinned != 0) { + if (smp_started td-td_pinned != 0) { cpu = td-td_lastcpu; ts-ts_runq = runq_pcpu[cpu]; single_cpu = 1; CTR3(KTR_RUNQ, sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td, cpu); - } else if (td-td_flags TDF_BOUND) { + } else if (smp_started (td-td_flags TDF_BOUND)) { /* Find CPU from bound runq. */ KASSERT(SKE_RUNQ_PCPU(ts), (sched_add: bound td_sched not on cpu runq)); @@ -1258,7 +1258,7 @@ CTR3(KTR_RUNQ, sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td, cpu); - } else if (ts-ts_flags TSF_AFFINITY) { + } else if (smp_started (ts-ts_flags TSF_AFFINITY)) { /* Find a valid CPU for our cpuset */ cpu = sched_pickcpu(td); ts-ts_runq = runq_pcpu[cpu]; The flow control is a bit awkward because of the multiple affinity/bound cpu cases. If somebody prefers the code to be structured differently I'd be open to suggestions. That is more or less what ULE does -- in ULE it is simpler because it goes via sched_pickcpu(), which still returns always CPU0 if APs still didn't kick off. I would also add a comment on top explaining the check, eventually, but otherwise looks fine. Attilio -- Peace can only be achieved by understanding - A. Einstein ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started
On Wednesday, April 06, 2011 1:08:20 pm Ryan Stone wrote: On Wed, Apr 6, 2011 at 8:36 AM, John Baldwin j...@freebsd.org wrote: Hummm. Patching 4BSD to use the same route as ULE may be the best solution for now if that is easiest. Alternatively, you could change 4BSD's sched_add() to not try to kick other CPUs until smp_started is true. At first I thought that it was a consequence of the way it does CPU affinity, but now I see that it shortcuts if smp_started is not true. How about something like the following for 4BSD? --- sched_4bsd.c(revision 220222) +++ sched_4bsd.c(working copy) @@ -1242,14 +1242,14 @@ } TD_SET_RUNQ(td); - if (td-td_pinned != 0) { + if (smp_started td-td_pinned != 0) { cpu = td-td_lastcpu; ts-ts_runq = runq_pcpu[cpu]; single_cpu = 1; CTR3(KTR_RUNQ, sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td, cpu); - } else if (td-td_flags TDF_BOUND) { + } else if (smp_started (td-td_flags TDF_BOUND)) { /* Find CPU from bound runq. */ KASSERT(SKE_RUNQ_PCPU(ts), (sched_add: bound td_sched not on cpu runq)); @@ -1258,7 +1258,7 @@ CTR3(KTR_RUNQ, sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td, cpu); - } else if (ts-ts_flags TSF_AFFINITY) { + } else if (smp_started (ts-ts_flags TSF_AFFINITY)) { /* Find a valid CPU for our cpuset */ cpu = sched_pickcpu(td); ts-ts_runq = runq_pcpu[cpu]; The flow control is a bit awkward because of the multiple affinity/bound cpu cases. If somebody prefers the code to be structured differently I'd be open to suggestions. Maybe it could do this: if (!smp_started) { cpu = NOCPU; ts-runq = runq; } else if (td-td_pinned) { ... That would be a smaller patch and I think more obvious to the reader even though it duplicates the global runq selection. I would even be ok with a goto for this case that if !smp_started it just jumps to the global runq bit in the last else. I guess one other option would be something like this: if (smp_started (td-td_pinned != 0 || td-td_flags TDF_BOUND || ts-ts_flags TSF_AFFINITY)) { if (td-td_pinned != 0) cpu = td-td_lastcpu; else if (td-td_flags TDF_BOUND) { /* Find CPU from bound runq. */ KASSERT(...); cpu = ts-ts_runq - runq_pcpu[0]; } else /* Find a valid CPU for our cpuset. */ cpu = sched_pickcpu(td); ts-ts_runq = runq_pcpu[cpu]; single_cpu = 1; CTR3(KTR_RUNQ, ...); } else { /* Global runq case. */ } This also avoids duplicating some common code to all the single_cpu cases. -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started
I'm running into a bootup crash under sched_4bsd on HEAD. The crash happens when I have a thread bound to a single CPU that isn't the BSP, and that thread is scheduled. If the AP that the thread is bound hasn't been started up, kick_other_cpu() crashes because pcpu-pc_curthread is NULL for the AP. I've put a small test kld in http://people.freebsd.org/~rstone/4bsd_bind/ that reproduces the problem. I'm not sure what the best way to address the crash is. ULE is not affected by the problem; it seems to run the swi thread on CPU 0 until CPU 1 is running. Here's a sample backtrace: Fatal trap 12: page fault while in kernel mode^M^M cpuid = 0; apic id = 00^M^M fault virtual address = 0x2fa^M^M fault code = supervisor read data, page not present^M^M instruction pointer = 0x20:0x803b473b^M^M stack pointer = 0x28:0x80a2c740^M^M frame pointer = 0x28:0x80a2c790^M^M code segment= base 0x0, limit 0xf, type 0x1b^M^M = DPL 0, pres 1, long 1, def32 0, gran 1^M^M processor eflags= resume, IOPL = 0^M^M current process = 0 (swapper)^M^M trap number = 12^M^M panic: page fault^M^M cpuid = 0^M^M KDB: stack backtrace:^M^M db_trace_self_wrapper() at 0x801cac8a = db_trace_self_wrapper+0x2a^M^M panic() at 0x8038ef92 = panic+0x182^M^M trap_fatal() at 0x8057d32d = trap_fatal+0x2ad^M^M trap() at 0x8057e01f = trap+0x29f^M^M calltrap() at 0x80561397 = calltrap+0x8^M^M --- trap 0xc, rip = 0x803b473b, rsp = 0x80a2c740, rbp = 0xfff f80a2c790 ---^M^M sched_add() at 0x803b473b = sched_add+0xeb^M^M intr_event_schedule_thread() at 0x803633e0 = intr_event_schedule_thread+0 xa0^M^M hardclock_device_poll() at 0x8037f9a5 = hardclock_device_poll+0x35^M^M hardclock() at 0x80342dd3 = hardclock+0x43^M^M lapic_handle_timer() at 0x80568033 = lapic_handle_timer+0xf3^M^M Xtimerint() at 0x80561ecc = Xtimerint+0x8c^M^M ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org