On Thu, 2006-11-09 at 17:06 -0500, Vivek Goyal wrote:
> On Thu, Nov 09, 2006 at 03:39:17PM -0500, Ben Romer wrote:
> > For the last several weeks I've been trying to get to the root of a
> > kexec clock problem we've been seeing on our ES7000/ONE systems. I've
> > been down various routes, verifying that the system is going into
> > virtual wire mode correctly and trying to turn the PM clock back on
> > before rebooting, but I have been unable to get a kexec dump without
> > passing in a loops-per-jiffy value on the kexec command line. The clock
> > seems to re-activate once the APIC is initialized, but before that I get
> > nothing at all. Without the lpj= value, the kexec'd kernel hangs while
> > trying to compute lpj.
> > 
> 
> So timer interrupts are not coming in second kernel hence jiffies
> don't get updated and you hang in calibrate_delay_loop()?
> 

Yep, that's exactly the problem. I've checked the APIC routing and the
virtual wire mode code puts us back into the exact same state that we
had when we booted, but the clock doesn't work. 

> Is it in any way related to the boot CPU? If you boot your first kernel
> with only one processor, do you still see the issue? Does kexec work on
> this machine? If kexec works then it's probably more of a software
> setting issue.
> 
> Does it work if second kernel is passed with command line option
> "nolapic" ?
> 

I'll try both of these suggestions right away. :)

> Sorry, I am not very well versed with timers, hence a stupid question.
> What is a PM timer? Can it drive the timer interrupt like an 8253/8254
> chipset or an HPET can? If yes, how is the interrupt routed to the CPU
> on your machine?
> 

I'm not all that well versed with timers either, which I suspect is why
I'm having trouble. ;) In the code I've read the kernel treats the PM
timer identically to how a PIT timer works, so I believe they're
basically the same thing. What I saw was that the kernel switches over
to the APIC timer later in the boot process, and it shuts off the PM
timer. So, I attempted to re-enable the PM timer inside of
machine_kexec() but that didn't work. With lpj set the system makes it
past calibrate_delay_loop() and the clock comes back on when we get to
the IOAPIC initialization. 

> Any idea how the BIOS initially sets up the LAPIC/IOAPIC on your system
> to deliver the timer interrupt to the CPU? Is it done directly through
> the LAPIC or routed through the IOAPIC?
> 

I'm pretty sure when we come up we're in virtual wire mode B, and the
timer interrupt is routed through the IOAPIC.

> Have you been able to verify that after kdump, the timer interrupts
> are not being generated at all or due to some routing issues, they are
> not being delivered to the cpu? I think apic=verbose and using the
> print_local_APIC() to print the states of local APIC and IOAPIC might
> be of some help.  
> 

Yep, what I did was print out the value of jiffies between each function
in init/main.c, and what seems to happen in the case when I pass in an
lpj= value is that the clock doesn't work at all until we re-initialize
the IOAPIC. The APIC and IOAPIC tables looked correct.

> > So, I have two questions I'd like to throw out here for discussion:
> > 
> > First, what do you think of adding a command-line parameter to the kexec
> > program, that would grab an lpj value from the currently running kernel
> > and append it to the kexec kernel's command line? As it stands now,
> > customers who want to run SLES10 with kexec-dumps on our systems have to
> > manually find the lpj value using dmesg, and customize their command
> > line to fit, which could be a problem if they decide to upgrade CPUs and
> > forget to update their kexec command line. I'm sure there are other
> > platforms that might benefit from automating this, and I'm willing to
> > write and submit the patch myself.
> > 
> - Most likely it is a software issue somewhere related to settings of
>   the LAPIC/IOAPIC/timer chip etc. Then IMHO, we should fix the issue
>   instead of adding a workaround. If it boils down to some hardware
>   limitation then of course we don't have a way out.
> 
> - How would you find the lpj value in kexec? dig out dmesg?
> 

I wasn't able to find lpj exposed anywhere in /proc, so I was planning
on finding its location in /proc/kallsyms and then pulling it out
of /dev/mem. I haven't tried this yet, though, so I don't know if that's
even feasible. :)
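
To make the two routes concrete, here is a rough Python sketch of the
userspace side only. It assumes the boot log still contains the usual
"(lpj=N)" calibration message and that the kernel variable is named
loops_per_jiffy. The helper names are made up for illustration and are
not part of kexec-tools. The address in /proc/kallsyms is a kernel
virtual address, so reading it through /dev/mem would still need a
virtual-to-physical translation, which this sketch does not attempt.

```python
import re

# Matches the calibration message, e.g. "... 4800.15 BogoMIPS (lpj=2400075)".
LPJ_RE = re.compile(r"\(lpj=(\d+)\)")


def lpj_from_dmesg(dmesg_text):
    """Return the last lpj value found in boot-log text, or None."""
    matches = LPJ_RE.findall(dmesg_text)
    return int(matches[-1]) if matches else None


def symbol_address(kallsyms_text, name="loops_per_jiffy"):
    """Return the address of `name` from /proc/kallsyms-style text, or None."""
    for line in kallsyms_text.splitlines():
        fields = line.split()
        # Each line is "<hex address> <type> <symbol> [module]".
        if len(fields) >= 3 and fields[2] == name:
            return int(fields[0], 16)
    return None


def append_lpj(cmdline, lpj):
    """Append lpj=<value> unless the command line already sets one."""
    if any(arg.startswith("lpj=") for arg in cmdline.split()):
        return cmdline
    return f"{cmdline} lpj={lpj}".strip()
```

Feeding lpj_from_dmesg() the output of dmesg and passing the result to
append_lpj() would produce the command line that customers currently
have to build by hand.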

> > Secondly, speaking in general - if the system clock is actually broken,
> > using kexec for dumps won't ever work, will it? When the crash kernel
> > tries to boot, having no clock will break the scheduler, even with an
> > lpj value set. While a dead system clock may not be a very likely
> > situation, it would seem that diskdump and/or LKCD had better chances of
> > being able to take the dump under those conditions. 
> > 
> 
> - Can diskdump or LKCD capture the dump if system clock is not working?
> - Capturing the kernel core dump in the event of hardware failures,
>   might not always be possible.  
> 

Well, in diskdump I believe there were special driver modifications to
operate the I/O devices in a polling mode rather than interrupt-driven,
and the clock didn't really play into things at all. I suspect that a
broken clock would break msleep and usleep, which were used a lot in the
crashdump drivers, but it's certainly possible to re-implement those in
a way that wouldn't use the clock. *shrug* I don't think it's all that
likely to be a problem, though. 
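
One clock-free way to get such delays is a calibrated busy loop: with a
known lpj, the number of loop iterations for a given delay is plain
arithmetic. The Python sketch below only shows that arithmetic. The
function name and the HZ value of 250 are assumptions for illustration,
and this is not a description of how diskdump actually did it.

```python
def delay_loops(usecs, lpj, hz=250):
    """Busy-wait iterations approximating `usecs` microseconds.

    lpj is loops per jiffy and there are `hz` jiffies per second
    (hz=250 is an assumed value), so loops per second is lpj * hz.
    """
    return usecs * lpj * hz // 1_000_000
```

With lpj=2400075 and HZ=250, a 1000-microsecond delay works out to
600018 iterations.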

> Thanks
> Vivek

