Re: [osol-code] Re: Intel Core Duo: "cpu 1 failed to start" problem

Andrei Dorofeev Mon, 10 Jul 2006 10:50:43 -0700

On 7/10/06, Jürgen Keil <[EMAIL PROTECTED]> wrote:

> I've filed a new bug for this problem:
> 6446729 "cpu 1 failed to start" when TSC counters are not in sync
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6446729


By the way, I've raised this bug's priority to P2.

> To fix the problem, I've moved the tsc clock synchronization code for the slave
> cpus in mp_startup() up a few lines, so that the gethrtime() call [via cmn_err()
> -> gethrestime() -> pc_gethrestime()] runs only after the tsc clock delta has
> been initialized (see the bug report).
>
> The ASUS N4L-VM mainboard doesn't hang any more during mp cpu startup.


This is a good fix, and it solves the problem of early gethrtime() calls that
result from cmn_err() calls in mp_startup().  But I do think that slave TSC clock
synchronization could be moved even higher on the list of things done by
mp_startup().  I'm worried that if, for example, somebody were to change
cpuid_pass1() to call cmn_err() or anything else that could call gethrtime(),
we'd run into this exact problem again.  I'm wondering whether the procset
bitmap setting, together with the tsc_sync_slave() call, should be the very
first thing mp_startup() does after the splx(ipltospl(LOCK_LEVEL)) call.  I
think MTRR sync can safely be done after TSC sync.  I'm not so sure about the
syscall handlers, though.  I'll take a closer look at this and update the
bug report.
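
To make the ordering concern concrete, here is a minimal user-space sketch in
C.  It is a toy model, not the actual mp_startup() code: every toy_* name, the
struct layout, and the simple "raw counter plus delta" clock are assumptions
made purely for illustration.  It only shows that a timestamp taken before the
per-CPU delta is set is off by the full skew between the two counters, while
one taken after the sync is not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: every toy_* name is hypothetical, not Solaris code. */
typedef struct toy_cpu {
	int64_t	raw_tsc;	/* this CPU's unsynchronized counter value */
	int64_t	tsc_delta;	/* correction to the master's timebase */
	bool	tsc_synced;	/* set once toy_tsc_sync_slave() has run */
} toy_cpu_t;

/* Stand-in for gethrtime(): raw counter plus the per-CPU correction. */
int64_t
toy_gethrtime(const toy_cpu_t *cpu)
{
	assert(cpu != NULL);
	return (cpu->raw_tsc + cpu->tsc_delta);
}

/* Stand-in for tsc_sync_slave(): record the offset to the master. */
void
toy_tsc_sync_slave(toy_cpu_t *cpu, int64_t master_tsc)
{
	assert(cpu != NULL);
	cpu->tsc_delta = master_tsc - cpu->raw_tsc;
	cpu->tsc_synced = true;
}

/*
 * Stand-in for the slave startup path.  The clock read in the middle
 * represents any early step that might log and therefore read the clock.
 * Returns how far that first timestamp was from the master's time.
 */
int64_t
toy_mp_startup(toy_cpu_t *cpu, int64_t master_tsc, bool sync_first)
{
	int64_t first_stamp;

	if (sync_first)
		toy_tsc_sync_slave(cpu, master_tsc);

	first_stamp = toy_gethrtime(cpu);	/* e.g. reached via a log call */

	if (!sync_first)
		toy_tsc_sync_slave(cpu, master_tsc);

	return (first_stamp - master_tsc);
}
```

With raw_tsc = 100 and master_tsc = 5000, calling toy_mp_startup() with
sync_first = false returns -4900 (the early timestamp is off by the whole
skew), while sync_first = true returns 0.  Doing the sync first means nothing
added to the early part of startup later on can observe the unsynchronized
clock.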

- Andrei