>>>>> On Thu, 24 Mar 2005 17:17:49 -0800, David Mosberger <[EMAIL PROTECTED]>
>>>>> said:
>>>>> On Tue, 22 Mar 2005 14:04:55 +1100, Keith Owens <[EMAIL PROTECTED]> said:
>>> I don't see off-hand why this wouldn't work as intended.
Keith> It's got me puzzled as well. On my test system, single
Keith> stepping the offending instruction _WILL_ cause a fault, but
Keith> letting it run normally does not cause an error. A normal
Keith> run (without single step) definitely uses lfetch with an
Keith> invalid address, however ia64_fault() is not invoked, not
Keith> even for isr.na.
Keith> I am trying to get some time on the big system to reproduce
Keith> the problem and see why lfetch is faulting there. Is there
Keith> any chance that a concurrent interrupt (the failing system
Keith> does a lot of I/O) can lose the lfetch status?
David> Hmmh, odd indeed. I changed prefetch()/prefetchw() to use
David> lfetch.fault and now the kernel dies early on on an lfetch.fault that
David> goes to address 0 (triggered by find_pid()). Since that's a NaT page,
David> you'd expect a general exception (NaT consumption). However, the CPU
David> seems to get stuck in an infinite loop of general exceptions. From
David> what I can tell, it gets to "dispatch_to_fault_handler" and as soon as
David> it re-enables PSR.IC or perhaps PSR.I (not sure which), it gets
David> another general exception fault.
After some more digging, it appears that we do get a vhpt-miss fault
first and for some reason, that handler triggers a (nested) general
exception fault with ISR.code{7:4}==3 ("IA-64 Reserved Register/Field
fault, Unimplemented Data Address fault"). Not sure yet what triggers
the nested fault. The odd thing is that the same kernel works fine
with the Ski simulator (where I do get the expected
ia64_do_page_fault() when find_pid prefetches address 0).
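
For anyone following along, here is a minimal sketch of how that
subfield can be pulled out of a raw ISR value. This is not the kernel's
code: the function names are made up for illustration, and the only
reason string filled in is the one for code 3 quoted above. Everything
else is deliberately left as "other".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return ISR.code{7:4}, i.e. bits 7..4 of the
 * interruption status register value. */
unsigned isr_code_7_4(uint64_t isr)
{
    return (unsigned)((isr >> 4) & 0xf);
}

/* Hypothetical helper: map the 4-bit general-exception code to a
 * description. Only code 3 is taken from this thread; the other
 * values are not decoded here. */
const char *general_exception_reason(unsigned code)
{
    assert(code <= 0xf);   /* the field is only 4 bits wide */
    if (code == 3)
        return "IA-64 Reserved Register/Field fault, "
               "Unimplemented Data Address fault";
    return "other general exception";
}
```

With something like this, general_exception_reason(isr_code_7_4(isr))
gives a readable label for the nested fault when dumping the ISR from a
debugging printk.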
--david