For the last several weeks I've been trying to get to the root of a
kexec clock problem we've been seeing on our ES7000/ONE systems. I've
been down various routes, verifying that the system is going into
virtual wire mode correctly and trying to turn the PM clock back on
before rebooting, but I have been unable to get a kexec dump without
passing in a loops-per-jiffy value on the kexec command line. The clock
seems to re-activate once the APIC is initialized, but before that I get
nothing at all. Without the lpj= value, the kexec'd kernel hangs while
trying to compute lpj.

So, I have two questions I'd like to throw out here for discussion:

First, what do you think of adding a command-line parameter to the kexec
program that would grab an lpj value from the currently running kernel
and append it to the kexec kernel's command line? As it stands now,
customers who want to run SLES10 with kexec-dumps on our systems have to
manually find the lpj value using dmesg, and customize their command
line to fit, which could be a problem if they decide to upgrade CPUs and
forget to update their kexec command line. I'm sure there are other
platforms that might benefit from automating this, and I'm willing to
write and submit the patch myself.
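
To make the idea concrete, here is a rough sketch of the logic such an
option might follow. It is written in Python purely for illustration (a
real patch would be C inside kexec-tools), the function names
extract_lpj and append_lpj are hypothetical, and it assumes the running
kernel's log contains a calibration message with "lpj=<number>" in it,
which is what I currently pick out of dmesg by hand.

```python
import re

# Assumption: the kernel's delay-calibration message contains
# "lpj=<digits>", e.g. "... BogoMIPS (lpj=9582828)".
_LPJ_RE = re.compile(r"\blpj=(\d+)")


def extract_lpj(dmesg_text):
    """Return the first lpj value found in the kernel log text, or None."""
    match = _LPJ_RE.search(dmesg_text)
    return int(match.group(1)) if match else None


def append_lpj(cmdline, lpj):
    """Append lpj=<value> unless the command line already sets lpj."""
    if any(arg.startswith("lpj=") for arg in cmdline.split()):
        return cmdline  # respect a value the user passed explicitly
    return (cmdline + " lpj=%d" % lpj).strip()
```

One caveat with this approach: the calibration message is printed early
in boot, so on a long-running system it may already have scrolled out of
the kernel log buffer, in which case extract_lpj returns None and the
tool would have to fall back to some other source or give up.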

Second, speaking in general: if the system clock is actually broken,
using kexec for dumps won't ever work, will it? When the crash kernel
tries to boot, having no clock will break the scheduler, even with an
lpj value set. While a dead system clock may not be a very likely
situation, it would seem that diskdump and/or LKCD would have a better
chance of being able to take the dump under those conditions.

Thanks!

-- Ben

_______________________________________________
fastboot mailing list
[email protected]
https://lists.osdl.org/mailman/listinfo/fastboot
