Bug#657014: [Pkg-xen-devel] Bug#657014: Bug#657014: Wheezy, Squeeze and Ubuntu server 11.10 has missing boot option to Xen

2012-01-25 Thread Erik Hjelmås

On 01/24/2012 01:31 PM, Pasi Kärkkäinen wrote:

On Mon, Jan 23, 2012 at 06:26:29PM +0100, Erik Hjelmås wrote:

When installing xen-linux-system on a new Dell R810 server, Wheezy,
Squeeze and Ubuntu server 11.10 all boot fine when booted directly into
the standard kernel, but when booting through Xen (after installing the
package xen-linux-system) the boot process hangs as soon as Xen passes
control to the Dom0 kernel:

Gave up waiting for root device etc

and it gives me Busybox, but it is also frozen, so there's no other
option than to reboot.


This might be the same issue as #649923. But please could you provide
full console logs so we can verify. If you are able to try the patch in
that bug or perhaps a backported 4.1 hypervisor that would also be
potentially interesting.


it's the same behaviour on both squeeze (4.0) and wheezy (4.1)

since everything freezes on boot, I don't have any logs other than the
attached screenshot from the moment it freezes (or does Xen log anything
this early in the boot process? I can reboot without Xen and inspect
other logs)



Please set up a serial console so you can capture the full boot messages
from both Xen hypervisor and dom0 kernel.

In your case you can also use the SOL (Serial Over LAN),
provided by the iDRAC management processor.

http://wiki.xen.org/xenwiki/XenSerialConsole
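
As a rough illustration of the serial console setup suggested above, here is a minimal, untested sketch of an /etc/default/grub fragment. It assumes the installed grub's 20_linux_xen script honors GRUB_CMDLINE_XEN and GRUB_CMDLINE_LINUX_DEFAULT. It also assumes the serial port (or the iDRAC SOL redirection) shows up as COM1 at 115200 baud; port and speed are guesses and must match the BIOS/iDRAC settings.

```shell
# /etc/default/grub (fragment; a sketch under the assumptions stated above)

# Hypervisor options: send Xen's own messages to the first serial port
# as well as the VGA screen, and log at the most verbose level.
# ASSUMPTION: COM1 at 115200 baud, 8 data bits, no parity, 1 stop bit.
GRUB_CMDLINE_XEN="com1=115200,8n1 console=com1,vga loglvl=all guest_loglvl=all"

# Dom0 kernel options: hvc0 is the Xen virtual console, which the
# hypervisor forwards to the serial port configured above.
GRUB_CMDLINE_LINUX_DEFAULT="console=hvc0 earlyprintk=xen"
```

After editing the file, running update-grub regenerates grub.cfg. If the installed grub does not read these variables, the same options would have to go on the multiboot (hypervisor) and module (kernel) lines directly.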


sorry, this wasn't straightforward and I'm running out of time, but I
captured a video showing the full boot process

http://www.ansatt.hig.no/erikh/capture-1.avi

(if you really need the text output I can try to set up SOL, but that
will have to be next week)



After several days of troubleshooting it turns out that adding the
dom0_mem option (e.g. dom0_mem=8192M) to the multiboot line of
/etc/grub.d/20_linux_xen FIXES THE PROBLEM!

maybe this has to be fixed in the package?


That file is provided by grub, not the hypervisor but I don't think that
fix will work since a) really it is a workaround not a fix and b) it is
not really possible to determine what is the right number to use for any
given system.
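
For readers who want to try the workaround without editing /etc/grub.d/20_linux_xen by hand, the following is a minimal sketch. It assumes the installed grub's 20_linux_xen script appends GRUB_CMDLINE_XEN_DEFAULT to the hypervisor (multiboot) line; older grub versions may not, in which case the option has to go on the multiboot line as described above. The 8192M value is only the example from this report, not a recommendation for any particular machine.

```shell
# /etc/default/grub (fragment; a sketch, not a packaged fix)

# Limit the memory Xen hands to Dom0 at boot.
# ASSUMPTION: 20_linux_xen reads this variable; 8192M is the example
# value from the bug report and must be chosen per system.
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=8192M"
```

Running update-grub afterwards regenerates grub.cfg, and the resulting multiboot line can be inspected there to confirm that dom0_mem was actually added.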


I agree, my solution is a workaround and not a fix

I first suspected that there was a problem with the initramfs, so I added
the megasas driver to /etc/initramfs-tools/modules and updated the
initramfs. This resulted in a "Cannot allocate memory" error (when trying
to load the megasas driver) at the same stage in the boot process. And
since Busybox is not able to run either, Xen seems not to make any memory
available to the Dom0 kernel when it passes control to it during boot.
That would mean the Dom0 kernel fails immediately, since I guess accessing
the root device is one of the first things it tries to do.

I will attempt to install squeeze and wheezy on a different, older server
(Dell R900) during the next couple of days to see if the same error
pops up, and I'll look for more log data then.



I assume you have already installed all the latest BIOS/firmware updates etc?


no, I haven't updated any firmware, so that may be something I should try

at least the BIOS seems to be the most recent version, 2.4.4
(the server is brand new btw)

but I see that there might be a firmware update for the RAID controller.

/Erik



--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#657014: [Pkg-xen-devel] Bug#657014: Bug#657014: Wheezy, Squeeze and Ubuntu server 11.10 has missing boot option to Xen

2012-01-24 Thread Pasi Kärkkäinen
On Mon, Jan 23, 2012 at 06:26:29PM +0100, Erik Hjelmås wrote:
 When installing xen-linux-system on a new Dell R810 server, Wheezy,
 Squeeze and Ubuntu server 11.10 all boot fine when booted directly into
 the standard kernel, but when booting through Xen (after installing the
 package xen-linux-system) the boot process hangs as soon as Xen passes
 control to the Dom0 kernel:

 Gave up waiting for root device etc

 and it gives me Busybox, but it is also frozen, so there's no other
 option than to reboot.

 This might be the same issue as #649923. But please could you provide
 full console logs so we can verify. If you are able to try the patch in
 that bug or perhaps a backported 4.1 hypervisor that would also be
 potentially interesting.

 it's the same behaviour on both squeeze (4.0) and wheezy (4.1)

 since everything freezes on boot, I don't have any logs other than the
 attached screenshot from the moment it freezes (or does Xen log anything
 this early in the boot process? I can reboot without Xen and inspect
 other logs)


Please set up a serial console so you can capture the full boot messages
from both Xen hypervisor and dom0 kernel.

In your case you can also use the SOL (Serial Over LAN),
provided by the iDRAC management processor.

http://wiki.xen.org/xenwiki/XenSerialConsole


 After several days of troubleshooting it turns out that adding the
 dom0_mem option (e.g. dom0_mem=8192M) to the multiboot line of
 /etc/grub.d/20_linux_xen FIXES THE PROBLEM!

 maybe this has to be fixed in the package?

 That file is provided by grub, not the hypervisor but I don't think that
 fix will work since a) really it is a workaround not a fix and b) it is
 not really possible to determine what is the right number to use for any
 given system.

 I agree, my solution is a workaround and not a fix

 I first suspected that there was a problem with the initramfs, so I added
 the megasas driver to /etc/initramfs-tools/modules and updated the
 initramfs. This resulted in a "Cannot allocate memory" error (when trying
 to load the megasas driver) at the same stage in the boot process. And
 since Busybox is not able to run either, Xen seems not to make any memory
 available to the Dom0 kernel when it passes control to it during boot.
 That would mean the Dom0 kernel fails immediately, since I guess accessing
 the root device is one of the first things it tries to do.

 I will attempt to install squeeze and wheezy on a different, older server
 (Dell R900) during the next couple of days to see if the same error
 pops up, and I'll look for more log data then.


I assume you have already installed all the latest BIOS/firmware updates etc?

-- Pasi



