Re: [XenPPC] [PATCH/RFC] Schedule idle domain on secondary processors

2006-08-29 Thread Amos Waterland
On Tue, Aug 29, 2006 at 09:02:58AM +0200, Segher Boessenkool wrote:
 It is quite stable in that the secondary processors reliably join the
 idle domain and wait for free pages to scrub, handling 0x980 interrupts
 with no problem.
 
 What's this 980 exception?

Perhaps my phrasing is bad.  I was referring to the hypervisor
decrementer interrupt (hdec).
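
For reference, 0x980 is the vector offset that the architecture assigns
to the hypervisor decrementer, next to the ordinary decrementer at 0x900.
A minimal sketch of naming vectors by offset follows; vector_name() is a
hypothetical helper for illustration only, not a function from the Xen
tree, and the table lists just a few well-known offsets.

```c
#include <stddef.h>

/* Hypothetical helper: map a PowerPC exception vector offset to a name.
 * Only a handful of well-known offsets are listed; this is not Xen code. */
const char *vector_name(unsigned int offset)
{
    switch (offset) {
    case 0x500: return "external";
    case 0x900: return "decrementer";
    case 0x980: return "hypervisor decrementer (hdec)";
    case 0xc00: return "system call";
    default:    return NULL;  /* not in this short table */
    }
}
```

With this table, vector_name(0x980) names the hdec, which is the
interrupt the secondary processors are taking while they sit in the idle
domain.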

 However, the domU's sometimes hang during initialization.  When the
 domU hangs, it seems the whole machine freezes, including the serial
 console.
 
 Most common cause of this is hanging the U3/U4.  Do you have a
 hardware debugger to see where this happens?

I had a friend take a look at the state of cpu 0, but everything seems ok.
It looks like there is a race and occasionally one of the secondary
processors is hanging the U4.


___
Xen-ppc-devel mailing list
Xen-ppc-devel@lists.xensource.com
http://lists.xensource.com/xen-ppc-devel


Re: [XenPPC] [PATCH/RFC] Schedule idle domain on secondary processors

2006-08-29 Thread Segher Boessenkool
It is quite stable in that the secondary processors reliably join the
idle domain and wait for free pages to scrub, handling 0x980 interrupts
with no problem.


What's this 980 exception?


Perhaps my phrasing is bad.  I was referring to the hypervisor
decrementer interrupt (hdec).


Ah yes, I forgot, thanks.


However, the domU's sometimes hang during initialization.  When the
domU hangs, it seems the whole machine freezes, including the serial
console.


Most common cause of this is hanging the U3/U4.  Do you have a
hardware debugger to see where this happens?


I had a friend take a look at the state of cpu 0, but everything seems ok.
It looks like there is a race and occasionally one of the secondary
processors is hanging the U4.


Doing a cacheable load/store to HT (or something else on U4)
perhaps?


Segher




Re: [XenPPC] [PATCH/RFC] Schedule idle domain on secondary processors

2006-08-29 Thread Segher Boessenkool
Most common cause of this is hanging the U3/U4.  Do you have a hardware
debugger to see where this happens?


It's been my experience that RISCWatch isn't very helpful in these
situations (e.g. can't stop the processor). When the northbridge goes,
JTAG becomes unhappy.


Works fine for me, don't know what the difference is -- different
debugger?  The CPU JTAG chain is not connected to the U4 in any
way, fwiw.

What happens is, the API (EI, whatever -- the CPU bus) becomes
unusable after the bad I/O to U4; the CPU waits forever for the
bad load/store to finish.
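
To illustrate the kind of mistake being discussed: device space on
PowerPC is normally mapped cache-inhibited and guarded, and a cacheable
mapping of it is the sort of "bad I/O" that can wedge the bus. A minimal
sketch, assuming a made-up I/O window (IO_BASE, IO_SIZE) and a
hypothetical helper wimg_for(); the WIMG bit values follow the usual
hashed page table encoding, and none of this is taken from the Xen/PPC
patch under discussion.

```c
#include <stdint.h>

/* WIMG storage-control bits as they appear in a hashed PTE
 * (assumed encoding, shown for illustration). */
#define PTE_W 0x40u  /* write-through */
#define PTE_I 0x20u  /* cache-inhibited */
#define PTE_M 0x10u  /* memory coherence required */
#define PTE_G 0x08u  /* guarded */

/* Hypothetical I/O window; real U4/HT ranges would come from the
 * device tree, not from constants like these. */
#define IO_BASE 0x80000000ull
#define IO_SIZE 0x80000000ull

/* Pick storage-control bits for a physical address. */
unsigned int wimg_for(uint64_t paddr)
{
    if (paddr >= IO_BASE && paddr - IO_BASE < IO_SIZE)
        return PTE_I | PTE_G;  /* device space: never cacheable */
    return PTE_M;              /* ordinary RAM: cacheable, coherent */
}
```

A check like this at mapping time would catch a secondary processor
being handed a cacheable mapping of an address inside the I/O window,
which is one candidate for the race described earlier in the thread.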


Segher

