Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started

2011-04-26 Thread Ryan Stone
On Mon, Apr 25, 2011 at 2:58 PM, John Baldwin j...@freebsd.org wrote:

 Yes, I would perhaps tweak the comment to reflect the full if statement
 though.  Maybe something like:

 /*
  * If SMP is started and the thread is pinned or otherwise limited to
  * a specific set of CPUs, queue the thread to a per-CPU run queue.
  * Otherwise, queue the thread to the global run queue.
  */

That looks fine, but I'm going to add a sentence explaining why the
smp_started condition is necessary:

/*
 * If SMP is started and the thread is pinned or otherwise limited to
 * a specific set of CPUs, queue the thread to a per-CPU run queue.
 * Otherwise, queue the thread to the global run queue.
 *
 * If SMP has not yet been started we must use the global run queue
 * as per-CPU state may not be initialized yet and we may crash if we
 * try to access the per-CPU run queues.
 */
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started

2011-04-26 Thread John Baldwin
On Tuesday, April 26, 2011 3:46:06 pm Ryan Stone wrote:
 On Mon, Apr 25, 2011 at 2:58 PM, John Baldwin j...@freebsd.org wrote:
 
  Yes, I would perhaps tweak the comment to reflect the full if statement
  though.  Maybe something like:
 
  /*
   * If SMP is started and the thread is pinned or otherwise limited to
   * a specific set of CPUs, queue the thread to a per-CPU run queue.
   * Otherwise, queue the thread to the global run queue.
   */
 
 That looks fine, but I'm going to add a sentence explaining why the
 smp_started condition is necessary:
 
 /*
  * If SMP is started and the thread is pinned or otherwise limited to
  * a specific set of CPUs, queue the thread to a per-CPU run queue.
  * Otherwise, queue the thread to the global run queue.
  *
  * If SMP has not yet been started we must use the global run queue
  * as per-CPU state may not be initialized yet and we may crash if we
  * try to access the per-CPU run queues.
  */

Sure, commit away.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started

2011-04-25 Thread John Baldwin
On Wednesday, April 20, 2011 6:02:42 pm Ryan Stone wrote:
 On Wed, Apr 6, 2011 at 2:29 PM, John Baldwin j...@freebsd.org wrote:
  I guess one other option would be something like this:
 
 if (smp_started  (td-td_pinned != 0 || td-td_flags  
  TDF_BOUND ||
 ts-ts_flags  TSF_AFFINITY)) {
 if (td-td_pinned != 0)
 cpu = td-td_lastcpu;
 else if (td-td_flags  TDF_BOUND) {
 /* Find CPU from bound runq. */
 KASSERT(...);
 cpu = ts-ts_runq - runq_pcpu[0];
 } else
 /* Find a valid CPU for our cpuset. */
 cpu = sched_pickcpu(td);
 ts-ts_runq = runq_pcpu[cpu];
 single_cpu = 1;
 CTR3(KTR_RUNQ, ...);
 } else {
 /* Global runq case. */
 }
 
  This also avoids duplicating some common code to all the single_cpu cases.
 
  --
  John Baldwin
 
 
 I went with this option.  Does this look right?

Yes, I would perhaps tweak the comment to reflect the full if statement
though.  Maybe something like:

/*
 * If SMP is started and the thread is pinned or otherwise limited to
 * a specific set of CPUs, queue the thread to a per-CPU run queue.
 * Otherwise, queue the thread to the global run queue.
 */

 
 Index: sys/kern/sched_4bsd.c
 ===
 --- sys/kern/sched_4bsd.c   (revision 220603)
 +++ sys/kern/sched_4bsd.c   (working copy)
 @@ -1246,30 +1246,28 @@
 }
 TD_SET_RUNQ(td);
 
 -   if (td-td_pinned != 0) {
 -   cpu = td-td_lastcpu;
 +   /*
 +* If SMP is not started, don't obey any requested CPU pinning as that
 +* CPU has either not yet started or it is curcpu.  Trying to run a
 +* thread on a CPU that has not yet started will panic the system.
 +*/
 +   if (smp_started  (td-td_pinned != 0 || td-td_flags  TDF_BOUND ||
 +   ts-ts_flags  TSF_AFFINITY)) {
 +   if (td-td_pinned != 0)
 +   cpu = td-td_lastcpu;
 +   else if (td-td_flags  TDF_BOUND) {
 +   /* Find CPU from bound runq. */
 +   KASSERT(SKE_RUNQ_PCPU(ts),
 +   (sched_add: bound td_sched not on cpu runq));
 +   cpu = ts-ts_runq - runq_pcpu[0];
 +   } else
 +   /* Find a valid CPU for our cpuset */
 +   cpu = sched_pickcpu(td);
 ts-ts_runq = runq_pcpu[cpu];
 single_cpu = 1;
 CTR3(KTR_RUNQ,
 sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td,
 cpu);
 -   } else if (td-td_flags  TDF_BOUND) {
 -   /* Find CPU from bound runq. */
 -   KASSERT(SKE_RUNQ_PCPU(ts),
 -   (sched_add: bound td_sched not on cpu runq));
 -   cpu = ts-ts_runq - runq_pcpu[0];
 -   single_cpu = 1;
 -   CTR3(KTR_RUNQ,
 -   sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td,
 -   cpu);
 -   } else if (ts-ts_flags  TSF_AFFINITY) {
 -   /* Find a valid CPU for our cpuset */
 -   cpu = sched_pickcpu(td);
 -   ts-ts_runq = runq_pcpu[cpu];
 -   single_cpu = 1;
 -   CTR3(KTR_RUNQ,
 -   sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td,
 -   cpu);
 } else {
 CTR2(KTR_RUNQ,
 sched_add: adding td_sched:%p (td:%p) to gbl runq, ts,
 

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started

2011-04-20 Thread Ryan Stone
On Wed, Apr 6, 2011 at 2:29 PM, John Baldwin j...@freebsd.org wrote:
 I guess one other option would be something like this:

                if (smp_started  (td-td_pinned != 0 || td-td_flags  
 TDF_BOUND ||
                    ts-ts_flags  TSF_AFFINITY)) {
                        if (td-td_pinned != 0)
                                cpu = td-td_lastcpu;
                        else if (td-td_flags  TDF_BOUND) {
                                /* Find CPU from bound runq. */
                                KASSERT(...);
                                cpu = ts-ts_runq - runq_pcpu[0];
                        } else
                                /* Find a valid CPU for our cpuset. */
                                cpu = sched_pickcpu(td);
                        ts-ts_runq = runq_pcpu[cpu];
                        single_cpu = 1;
                        CTR3(KTR_RUNQ, ...);
                } else {
                        /* Global runq case. */
                }

 This also avoids duplicating some common code to all the single_cpu cases.

 --
 John Baldwin


I went with this option.  Does this look right?

Index: sys/kern/sched_4bsd.c
===
--- sys/kern/sched_4bsd.c   (revision 220603)
+++ sys/kern/sched_4bsd.c   (working copy)
@@ -1246,30 +1246,28 @@
}
TD_SET_RUNQ(td);

-   if (td-td_pinned != 0) {
-   cpu = td-td_lastcpu;
+   /*
+* If SMP is not started, don't obey any requested CPU pinning as that
+* CPU has either not yet started or it is curcpu.  Trying to run a
+* thread on a CPU that has not yet started will panic the system.
+*/
+   if (smp_started  (td-td_pinned != 0 || td-td_flags  TDF_BOUND ||
+   ts-ts_flags  TSF_AFFINITY)) {
+   if (td-td_pinned != 0)
+   cpu = td-td_lastcpu;
+   else if (td-td_flags  TDF_BOUND) {
+   /* Find CPU from bound runq. */
+   KASSERT(SKE_RUNQ_PCPU(ts),
+   (sched_add: bound td_sched not on cpu runq));
+   cpu = ts-ts_runq - runq_pcpu[0];
+   } else
+   /* Find a valid CPU for our cpuset */
+   cpu = sched_pickcpu(td);
ts-ts_runq = runq_pcpu[cpu];
single_cpu = 1;
CTR3(KTR_RUNQ,
sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td,
cpu);
-   } else if (td-td_flags  TDF_BOUND) {
-   /* Find CPU from bound runq. */
-   KASSERT(SKE_RUNQ_PCPU(ts),
-   (sched_add: bound td_sched not on cpu runq));
-   cpu = ts-ts_runq - runq_pcpu[0];
-   single_cpu = 1;
-   CTR3(KTR_RUNQ,
-   sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td,
-   cpu);
-   } else if (ts-ts_flags  TSF_AFFINITY) {
-   /* Find a valid CPU for our cpuset */
-   cpu = sched_pickcpu(td);
-   ts-ts_runq = runq_pcpu[cpu];
-   single_cpu = 1;
-   CTR3(KTR_RUNQ,
-   sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td,
-   cpu);
} else {
CTR2(KTR_RUNQ,
sched_add: adding td_sched:%p (td:%p) to gbl runq, ts,
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started

2011-04-06 Thread John Baldwin
On Monday, April 04, 2011 5:10:20 pm Ryan Stone wrote:
 I'm running into a bootup crash under sched_4bsd on HEAD.  The crash
 happens when I have a thread bound to a single CPU that isn't the BSP,
 and that thread is scheduled.  If the AP that the thread is bound
 hasn't been started up, kick_other_cpu() crashes because
 pcpu-pc_curthread is NULL for the AP.
 
 I've put a small test kld in
 http://people.freebsd.org/~rstone/4bsd_bind/ that reproduces the
 problem.  I'm not sure what the best way to address the crash is.  ULE
 is not affected by the problem; it seems to run the swi thread on CPU
 0 until CPU 1 is running.

Hummm.  Patching 4BSD to use the same route as ULE may be the best solution 
for now if that is easiest.  Alternatively, you could change 4BSD's 
sched_add() to not try to kick other CPUs until smp_started is true.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started

2011-04-06 Thread Ryan Stone
On Wed, Apr 6, 2011 at 8:36 AM, John Baldwin j...@freebsd.org wrote:
 Hummm.  Patching 4BSD to use the same route as ULE may be the best solution
 for now if that is easiest.  Alternatively, you could change 4BSD's
 sched_add() to not try to kick other CPUs until smp_started is true.

At first I thought that it was a consequence of the way it does CPU
affinity, but now I see that it shortcuts if smp_started is not true.
How about something like the following for 4BSD?

--- sched_4bsd.c(revision 220222)
+++ sched_4bsd.c(working copy)
@@ -1242,14 +1242,14 @@
}
TD_SET_RUNQ(td);

-   if (td-td_pinned != 0) {
+   if (smp_started  td-td_pinned != 0) {
cpu = td-td_lastcpu;
ts-ts_runq = runq_pcpu[cpu];
single_cpu = 1;
CTR3(KTR_RUNQ,
sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td,
cpu);
-   } else if (td-td_flags  TDF_BOUND) {
+   } else if (smp_started  (td-td_flags  TDF_BOUND)) {
/* Find CPU from bound runq. */
KASSERT(SKE_RUNQ_PCPU(ts),
(sched_add: bound td_sched not on cpu runq));
@@ -1258,7 +1258,7 @@
CTR3(KTR_RUNQ,
sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td,
cpu);
-   } else if (ts-ts_flags  TSF_AFFINITY) {
+   } else if (smp_started  (ts-ts_flags  TSF_AFFINITY)) {
/* Find a valid CPU for our cpuset */
cpu = sched_pickcpu(td);
ts-ts_runq = runq_pcpu[cpu];

The flow control is a bit awkward because of the multiple
affinity/bound cpu cases.  If somebody prefers the code to be
structured differently I'd be open to suggestions.

Ryan
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started

2011-04-06 Thread Attilio Rao
2011/4/6 Ryan Stone ryst...@gmail.com:
 On Wed, Apr 6, 2011 at 8:36 AM, John Baldwin j...@freebsd.org wrote:
 Hummm.  Patching 4BSD to use the same route as ULE may be the best solution
 for now if that is easiest.  Alternatively, you could change 4BSD's
 sched_add() to not try to kick other CPUs until smp_started is true.

 At first I thought that it was a consequence of the way it does CPU
 affinity, but now I see that it shortcuts if smp_started is not true.
 How about something like the following for 4BSD?

 --- sched_4bsd.c        (revision 220222)
 +++ sched_4bsd.c        (working copy)
 @@ -1242,14 +1242,14 @@
        }
        TD_SET_RUNQ(td);

 -       if (td-td_pinned != 0) {
 +       if (smp_started  td-td_pinned != 0) {
                cpu = td-td_lastcpu;
                ts-ts_runq = runq_pcpu[cpu];
                single_cpu = 1;
                CTR3(KTR_RUNQ,
                    sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td,
                    cpu);
 -       } else if (td-td_flags  TDF_BOUND) {
 +       } else if (smp_started  (td-td_flags  TDF_BOUND)) {
                /* Find CPU from bound runq. */
                KASSERT(SKE_RUNQ_PCPU(ts),
                    (sched_add: bound td_sched not on cpu runq));
 @@ -1258,7 +1258,7 @@
                CTR3(KTR_RUNQ,
                    sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, td,
                    cpu);
 -       } else if (ts-ts_flags  TSF_AFFINITY) {
 +       } else if (smp_started  (ts-ts_flags  TSF_AFFINITY)) {
                /* Find a valid CPU for our cpuset */
                cpu = sched_pickcpu(td);
                ts-ts_runq = runq_pcpu[cpu];

 The flow control is a bit awkward because of the multiple
 affinity/bound cpu cases.  If somebody prefers the code to be
 structured differently I'd be open to suggestions.

That is more or less what ULE does -- in ULE it is simpler because it
goes via sched_pickcpu(), which still returns always CPU0 if APs still
didn't kick off.

I would also add a comment on top explaining the check, eventually,
but otherwise looks fine.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started

2011-04-06 Thread John Baldwin
On Wednesday, April 06, 2011 1:08:20 pm Ryan Stone wrote:
 On Wed, Apr 6, 2011 at 8:36 AM, John Baldwin j...@freebsd.org wrote:
  Hummm.  Patching 4BSD to use the same route as ULE may be the best 
solution
  for now if that is easiest.  Alternatively, you could change 4BSD's
  sched_add() to not try to kick other CPUs until smp_started is true.
 
 At first I thought that it was a consequence of the way it does CPU
 affinity, but now I see that it shortcuts if smp_started is not true.
 How about something like the following for 4BSD?
 
 --- sched_4bsd.c(revision 220222)
 +++ sched_4bsd.c(working copy)
 @@ -1242,14 +1242,14 @@
 }
 TD_SET_RUNQ(td);
 
 -   if (td-td_pinned != 0) {
 +   if (smp_started  td-td_pinned != 0) {
 cpu = td-td_lastcpu;
 ts-ts_runq = runq_pcpu[cpu];
 single_cpu = 1;
 CTR3(KTR_RUNQ,
 sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, 
td,
 cpu);
 -   } else if (td-td_flags  TDF_BOUND) {
 +   } else if (smp_started  (td-td_flags  TDF_BOUND)) {
 /* Find CPU from bound runq. */
 KASSERT(SKE_RUNQ_PCPU(ts),
 (sched_add: bound td_sched not on cpu runq));
 @@ -1258,7 +1258,7 @@
 CTR3(KTR_RUNQ,
 sched_add: Put td_sched:%p(td:%p) on cpu%d runq, ts, 
td,
 cpu);
 -   } else if (ts-ts_flags  TSF_AFFINITY) {
 +   } else if (smp_started  (ts-ts_flags  TSF_AFFINITY)) {
 /* Find a valid CPU for our cpuset */
 cpu = sched_pickcpu(td);
 ts-ts_runq = runq_pcpu[cpu];
 
 The flow control is a bit awkward because of the multiple
 affinity/bound cpu cases.  If somebody prefers the code to be
 structured differently I'd be open to suggestions.

Maybe it could do this:

if (!smp_started) {
cpu = NOCPU;
ts-runq = runq;
} else if (td-td_pinned) {
...

That would be a smaller patch and I think more obvious to the reader even
though it duplicates the global runq selection.  I would even be ok with a
goto for this case that if !smp_started it just jumps to the global runq
bit in the last else.

I guess one other option would be something like this:

if (smp_started  (td-td_pinned != 0 || td-td_flags  
TDF_BOUND ||
ts-ts_flags  TSF_AFFINITY)) {
if (td-td_pinned != 0)
cpu = td-td_lastcpu;
else if (td-td_flags  TDF_BOUND) {
/* Find CPU from bound runq. */
KASSERT(...);
cpu = ts-ts_runq - runq_pcpu[0];
} else
/* Find a valid CPU for our cpuset. */
cpu = sched_pickcpu(td);
ts-ts_runq = runq_pcpu[cpu];
single_cpu = 1;
CTR3(KTR_RUNQ, ...);
} else {
/* Global runq case. */
}

This also avoids duplicating some common code to all the single_cpu cases.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started

2011-04-04 Thread Ryan Stone
I'm running into a bootup crash under sched_4bsd on HEAD.  The crash
happens when I have a thread bound to a single CPU that isn't the BSP,
and that thread is scheduled.  If the AP that the thread is bound
hasn't been started up, kick_other_cpu() crashes because
pcpu-pc_curthread is NULL for the AP.

I've put a small test kld in
http://people.freebsd.org/~rstone/4bsd_bind/ that reproduces the
problem.  I'm not sure what the best way to address the crash is.  ULE
is not affected by the problem; it seems to run the swi thread on CPU
0 until CPU 1 is running.

Here's a sample backtrace:
Fatal trap 12: page fault while in kernel mode^M^M
cpuid = 0; apic id = 00^M^M
fault virtual address   = 0x2fa^M^M
fault code  = supervisor read data, page not present^M^M
instruction pointer = 0x20:0x803b473b^M^M
stack pointer   = 0x28:0x80a2c740^M^M
frame pointer   = 0x28:0x80a2c790^M^M
code segment= base 0x0, limit 0xf, type 0x1b^M^M
= DPL 0, pres 1, long 1, def32 0, gran 1^M^M
processor eflags= resume, IOPL = 0^M^M
current process = 0 (swapper)^M^M
trap number = 12^M^M
panic: page fault^M^M
cpuid = 0^M^M
KDB: stack backtrace:^M^M
db_trace_self_wrapper() at 0x801cac8a = db_trace_self_wrapper+0x2a^M^M
panic() at 0x8038ef92 = panic+0x182^M^M
trap_fatal() at 0x8057d32d = trap_fatal+0x2ad^M^M
trap() at 0x8057e01f = trap+0x29f^M^M
calltrap() at 0x80561397 = calltrap+0x8^M^M
--- trap 0xc, rip = 0x803b473b, rsp = 0x80a2c740, rbp
= 0xfff
f80a2c790 ---^M^M
sched_add() at 0x803b473b = sched_add+0xeb^M^M
intr_event_schedule_thread() at 0x803633e0 =
intr_event_schedule_thread+0
xa0^M^M
hardclock_device_poll() at 0x8037f9a5 = hardclock_device_poll+0x35^M^M
hardclock() at 0x80342dd3 = hardclock+0x43^M^M
lapic_handle_timer() at 0x80568033 = lapic_handle_timer+0xf3^M^M
Xtimerint() at 0x80561ecc = Xtimerint+0x8c^M^M
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org