We saw a similar problem on i386. Try replacing load_segments() in arch/x86_64/kernel/machine_kexec.c with the following code. The fix hasn't made it upstream yet, but will soon. I'm not sure it will fix your problem, but it's worth a try. The only real change is the addition of the "memory" clobber at the end of the asm statement.
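static void load_segments(void)
{
	/* Reload every data segment register with the kernel data selector. */
	__asm__ __volatile__ (
		"\tmovl %0,%%ds\n"
		"\tmovl %0,%%es\n"
		"\tmovl %0,%%ss\n"
		"\tmovl %0,%%fs\n"
		"\tmovl %0,%%gs\n"
		/* "memory" keeps gcc from moving memory accesses across the asm */
		: : "a" (__KERNEL_DS) : "memory"
		);
}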
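To illustrate what the "memory" clobber does in general, here is a minimal user-space sketch, assuming gcc or a compatible compiler. The names compiler_barrier(), shared_flag and store_around_barrier() are made up for this example and are not kernel code. An asm statement with a "memory" clobber tells the compiler that the asm may read or write arbitrary memory, so it must not cache values in registers across it or move loads and stores past it.

```c
#include <stdio.h>

/*
 * Compiler-only barrier: emits no instructions, but the "memory" clobber
 * tells the compiler that the asm may touch any memory. Pending stores must
 * be emitted before it and values must be reloaded from memory after it.
 */
#define compiler_barrier() __asm__ __volatile__ ("" : : : "memory")

/* Hypothetical variable, for illustration only. */
int shared_flag;

/*
 * Two stores separated by the barrier. Without the barrier the compiler
 * would be free to drop the first store; with it, both stores must be
 * emitted in program order.
 */
int store_around_barrier(void)
{
	shared_flag = 1;	/* must be written before the barrier */
	compiler_barrier();
	shared_flag = 2;	/* may not be hoisted above the barrier */
	return shared_flag;
}
```

Comparing the output of "gcc -O2 -S" with and without the compiler_barrier() line should show the difference in the emitted stores. In load_segments() the same clobber is attached to an asm that also does real work, so the segment reloads are treated as a point that memory accesses may not be moved across.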
Mike Mason
IBM Linux Technology Center
Beaverton, Oregon, USA
[EMAIL PROTECTED]
Sampathkumar, Kishore (STSD) wrote:
Hi,
I am backporting kexec/kdump as it existed on 2.6.13, along with the
x86_64 patches (both kernel and kexec-tool patches) posted in Nov 2005
on fastboot mailing list onto a 2.6.9 system for x86_64 architecture for
an Opteron based system. Specifically, I'm doing this back-port onto
RHEL U2.
I have generated the patches, and am trying to test them.
The "first" kernel is named "2.6.9-22.EL-prep".
The "capture" kernel is named with the suffix "2.6.9-22.EL-kdump-1".
I have followed the kernel instructions in Documentation/kdump.txt
After re-booting the system on the "first" kernel (I have added
[EMAIL PROTECTED] in the kernel line in grub.conf), I issued the
following to test "kexec":
Scenario #1
-----------
# kexec -l /boot/vmlinuz-2.6.9-22.EL-kdump-1 \
    --initrd=/boot/initrd-2.6.9-22.EL-kdump-1.img \
    --append="root=/dev/sda2 [EMAIL PROTECTED]"
# kexec -e
The system rebooted fine.
I then rebooted the system again on the "first" kernel, and then tried
the following:
Scenario #2
-----------
# kexec -p /usr/src/redhat/BUILD/kernel-2.6.9-22.EL-kdump-1/linux-2.6.9/vmlinux \
    --initrd=/boot/initrd-2.6.9-22.EL-kdump-1.img --args-linux \
    --append="root=/dev/sda2 maxcpus=1 init 1"
The above loaded fine, without any errors whatsoever.
# insmod /root/kishore/mypanic.ko
When the system panics, the kdump kernel isn't booting up. The system is
just getting reset and it goes through the BIOS sequence all over again.
So, something must be wrong here.
To debug, I printed the following values in the kernel from
machine_kexec() in arch/x86_64/kernel/machine_kexec.c. These printk()
statements are placed just after:
set_idt(phys_to_virt(0),0);
and just before the following:
/* now call it */
rnk = (relocate_new_kernel_t) control_code_buffer;
(*rnk)(page_list, control_code_buffer, image->start, start_pgtable);
The printk output I'm capturing is the following:
Printing kimage structure ...
image->entry = 0x7d806400
image->last_entry = 0x7d806400
image->destination = 0x0
image->start = 0x1467550
image->control_code_page = 0x5049cd0
image->nr_segments = 9
image->control_page = 0x151afff
image->type = 0x1
I checked the same values for Scenario #1. The output was:
Printing kimage structure ...
image->entry = 0x41f5b628
image->last_entry = 0x41f5bff8
image->destination = 0x7ff20000
image->start = 0x8e550
image->control_code_page = 0x5f5c4e0
image->nr_segments = 4
image->control_page = 0xffffffff
image->type = 0x0
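For reference, dumps in the format above could be produced by a helper along the following lines. This is only a user-space sketch: struct kimage_dump, kimage_type_name() and dump_kimage() are hypothetical names, the struct holds just the fields printed above rather than the full struct kimage, and inside the kernel printk() would take the place of fprintf(). The type values 0 and 1 correspond to KEXEC_TYPE_DEFAULT and KEXEC_TYPE_CRASH in include/linux/kexec.h, so the two dumps show a normal kexec image (Scenario #1) and a crash image (Scenario #2).

```c
#include <stdio.h>

/* Values of image->type, as defined in include/linux/kexec.h. */
#define KEXEC_TYPE_DEFAULT 0
#define KEXEC_TYPE_CRASH   1

/*
 * Hypothetical stand-in holding only the fields printed in the dumps.
 * The real struct kimage has more members and different field types.
 */
struct kimage_dump {
	unsigned long entry;
	unsigned long last_entry;
	unsigned long destination;
	unsigned long start;
	unsigned long control_code_page;
	unsigned long nr_segments;
	unsigned long control_page;
	unsigned int type;
};

/* Map image->type to a readable name. */
const char *kimage_type_name(unsigned int type)
{
	switch (type) {
	case KEXEC_TYPE_DEFAULT:
		return "KEXEC_TYPE_DEFAULT";
	case KEXEC_TYPE_CRASH:
		return "KEXEC_TYPE_CRASH";
	default:
		return "unknown";
	}
}

/* Print the fields in the same layout as the dumps above. */
void dump_kimage(FILE *out, const struct kimage_dump *img)
{
	fprintf(out, "Printing kimage structure ...\n");
	fprintf(out, "image->entry = 0x%lx\n", img->entry);
	fprintf(out, "image->last_entry = 0x%lx\n", img->last_entry);
	fprintf(out, "image->destination = 0x%lx\n", img->destination);
	fprintf(out, "image->start = 0x%lx\n", img->start);
	fprintf(out, "image->control_code_page = 0x%lx\n",
		img->control_code_page);
	fprintf(out, "image->nr_segments = %lu\n", img->nr_segments);
	fprintf(out, "image->control_page = 0x%lx\n", img->control_page);
	fprintf(out, "image->type = 0x%x (%s)\n", img->type,
		kimage_type_name(img->type));
}
```

Filling a struct kimage_dump with the Scenario #2 values and calling dump_kimage(stdout, &img) would label the last line as KEXEC_TYPE_CRASH, which makes the two scenarios easier to tell apart when comparing logs.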
How should I proceed to debug? Is image->destination supposed to be
non-zero in Scenario #2? How can I validate that the kimage structure
holds valid values before testing with "kexec -p" followed by a panic
of the system? Is there any specific information that I can capture
that would help with debugging?
Thanks,
- Kishore
------------------------------------------------------------------------
_______________________________________________
fastboot mailing list
[email protected]
https://lists.osdl.org/mailman/listinfo/fastboot