I tried the below suggestion. It doesn't seem to work.

What I now notice is that the system reboots immediately on executing
load_segments() with the "memory" clobber. It never gets as far as
set_idt() or the printk statements I have placed after it.

I wanted to know: by placing "memory" in the clobber list in
load_segments(), you are essentially inserting a memory barrier, aren't
you? So, is this just to ensure that the memory stores in the code
before the asm statement are ordered ahead of those in the code after it?

- Kishore
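
On the "memory" question above: under GCC's inline-assembly rules, a
"memory" clobber tells the compiler that the asm statement may read or
write arbitrary memory. The compiler therefore must not keep memory
values cached in registers across the statement, nor move loads and
stores across it. It is a compiler-level barrier only and emits no fence
instruction by itself. The following is a minimal userspace sketch of
that idea, not kernel code; compiler_barrier(), shared_data,
shared_flag, publish(), consume() and self_check() are hypothetical
names used only for illustration.

```c
#include <assert.h>

/* Compiler-only barrier: the asm body is empty, so no instruction is
 * emitted, but the "memory" clobber makes GCC assume any memory may
 * have been read or written by the statement. */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical variables for illustration only. */
static int shared_data;
static int shared_flag;

/* Store the data first, then the flag. The barrier stops the compiler
 * from reordering or merging the two stores across it. */
static void publish(int value)
{
        shared_data = value;
        compiler_barrier();
        shared_flag = 1;
}

/* Return the data if the flag is set, or -1 if nothing was published.
 * The barrier forces shared_data to be re-read from memory after the
 * flag check rather than reused from a register. */
static int consume(void)
{
        if (!shared_flag)
                return -1;
        compiler_barrier();
        return shared_data;
}

/* Small self-check that exercises both helpers in order. */
static void self_check(void)
{
        assert(consume() == -1);   /* nothing published yet */
        publish(42);
        assert(consume() == 42);   /* data visible once the flag is set */
}
```

In load_segments() the asm body is not empty, so there the clobber
additionally keeps surrounding memory accesses from being moved across
the segment-register loads. Whether that is related to the reboot
described above is not something this sketch can show.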

-----Original Message-----
From: Mike Mason [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, March 08, 2006 5:35 AM
To: Sampathkumar, Kishore (STSD)
Cc: [email protected]
Subject: Re: [Fastboot] Needed help to debug this kdump problem on x86_64 systems

We saw a similar problem on i386.  Try replacing load_segments() in
arch/x86_64/kernel/machine_kexec.c with the following code.  The fix
hasn't made it upstream yet, but will soon.  I'm not sure it will fix
your problem, but it's worth a try.  The only real change is the
addition of "memory" in the last line.  

static void load_segments(void)
{
        __asm__ __volatile__ (
                "\tmovl %0,%%ds\n"
                "\tmovl %0,%%es\n"
                "\tmovl %0,%%ss\n"
                "\tmovl %0,%%fs\n"
                "\tmovl %0,%%gs\n"
                : : "a" (__KERNEL_DS) : "memory"
                );
}

Mike Mason
IBM Linux Technology Center
Beaverton, Oregon, USA
[EMAIL PROTECTED]

Sampathkumar, Kishore (STSD) wrote:
> Hi,
>  
> I am backporting kexec/kdump as it existed on 2.6.13, along with the 
> x86_64 patches (both kernel and kexec-tools patches) posted in Nov 2005
> on the fastboot mailing list, onto a 2.6.9 system for the x86_64
> architecture, on an Opteron-based system. Specifically, I'm doing this
> back-port onto RHEL U2.
>  
> I have generated the patches, and am trying to test them.
>  
> The "first" kernel is named "2.6.9-22.EL-prep".
> The "capture" kernel is named with the suffix "2.6.9-22.EL-kdump-1".
> I have followed the kernel instructions in Documentation/kdump.txt.
>  
> After re-booting the system on the "first" kernel (I have added 
> [EMAIL PROTECTED] to the kernel line in grub.conf), I issued the
> following to test "kexec":
>  
> Scenario #1
> -----------
> # kexec -l /boot/vmlinuz-2.6.9-22.EL-kdump-1 
> --initrd=/boot/initrd-2.6.9-22.EL-kdump-1.img --append="root=/dev/sda2
> [EMAIL PROTECTED]"
>  
> # kexec -e
>  
> The system rebooted fine.
>  
> I then rebooted the system again on the "first" kernel, and then tried
> the following:
>  
> Scenario #2
> -----------
> # kexec -p 
> /usr/src/redhat/BUILD/kernel-2.6.9-22.EL-kdump-1/linux-2.6.9/vmlinux 
> --initrd=/boot/initrd-2.6.9-22.EL-kdump-1.img --args-linux 
> --append="root=/dev/sda2 maxcpus=1 init 1"
>  
> The above loaded fine, without any errors whatsoever.
>  
> # insmod /root/kishore/mypanic.ko
>  
> When the system panics, the kdump kernel doesn't boot. The system
> just resets and goes through the BIOS sequence all over again.
> So, something must be wrong here.
>  
> To debug, I printed the following values in the kernel from 
> machine_kexec() in arch/x86_64/kernel/machine_kexec.c. These printk() 
> statements are placed just after:
>         set_idt(phys_to_virt(0),0);
>  
> and just before the following:
>         /* now call it */
>         rnk = (relocate_new_kernel_t) control_code_buffer;
>         (*rnk)(page_list, control_code_buffer, image->start,
>                start_pgtable);
>  
> The printk output I'm capturing is the following:
>  
> Printing kimage structure ...
> image->entry = 0x7d806400
> image->last_entry = 0x7d806400
> image->destination = 0x0
> image->start = 0x1467550
> image->control_code_page = 0x5049cd0
> image->nr_segments = 9
> image->control_page = 0x151afff
> image->type = 0x1
>  
> I checked the same above values for Scenario #1. The output was:
>  
> Printing kimage structure ...
> image->entry = 0x41f5b628
> image->last_entry = 0x41f5bff8
> image->destination = 0x7ff20000
> image->start = 0x8e550
> image->control_code_page = 0x5f5c4e0
> image->nr_segments = 4
> image->control_page = 0xffffffff
> image->type = 0x0
>  
> How should I proceed to debug? Is "image->destination = 0x0" 
> supposed to be non-zero in Scenario #2? How can I validate that the
> kimage structure has valid values before even testing with "kexec -p"
> followed by a panic of the system? Is there any specific information
> that I can capture that would help with debugging?
>  
> Thanks,
> - Kishore
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> fastboot mailing list
> [email protected]
> https://lists.osdl.org/mailman/listinfo/fastboot


