Bug#606964:
I'm running into a similar issue on a fully updated Debian 6 system (standard and -updates repos). It's an IBM x3850 with 16G of RAM. I tried setting max_dom0_mem as described here [1], but that didn't solve the issue: the server reboots after scrubbing free RAM. I haven't had a chance to connect a serial console yet.

I also tried setting:

    GRUB_CMDLINE_XEN="dom0_mem=512M dom0_max_vcpus=1 dom0_vcpus_pin"

but no go. Is there any other possible workaround? A few months ago I installed two other servers with Debian 6 and much more RAM and didn't hit the issue.

[1] http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1698#c4

--
Lorenzo Milesi - lorenzo.mil...@yetopen.it
GPG/PGP Key-Id: 0xE704E230 - http://keyserver.linux.it

--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1072254951.33983.1342592898230.javamail.r...@yetopen.it
Bug#606964: linux-image-2.6.32-5-xen-amd64: Xen fails to boot dom0 kernel on system with lots of RAM
On Thu, 2011-01-06 at 10:17 +0100, Rik Theys wrote:
> Ian,
>
> On 01/06/2011 10:09 AM, Ian Campbell wrote:
> >> I can test some more options on this system for a few more days, but
> >> then I have to reinstall it.
> >
> > If you are able it would be useful to know if the tip of
> > xen-4.0-testing.hg (which is just about to become 4.0.2-rc1) works or
> > not. You would only need to build the hypervisor binary itself not all
> > the tools etc which is relatively straight forward. I can give more
> > detailed instructions if necessary.
>
> Please do (provide detailed instructions).

You'll need to install mercurial, gcc, make and python (I think those are the only dependencies for the hypervisor itself, hopefully any errors due to missing tools will be fairly straightforward) and then:

    $ hg clone http://xenbits.xen.org/xen-4.0-testing.hg
    $ cd xen-4.0-testing.hg
    $ make -C xen                # Add -j as desired.

Then copy xen/xen.gz to /boot/xen-4.0-testing.gz (or whatever name you prefer) and update your bootloader (update-grub should do it?).

To rebuild with debugging enabled, which would be useful if the above doesn't result in a successful boot:

    $ make -C xen clean
    $ make -C xen debug=y        # Add -j as desired.

If 4.0-testing doesn't work then doing the same thing with http://xenbits.xen.org/xen-unstable.hg would also be informative.

Thanks!
Ian.

--
Ian Campbell
Current Noise: Death - Flesh And The Power It Holds

Pandora's Rule: Never open a box you didn't close.

Archive: http://lists.debian.org/1294307075.3831.3619.ca...@zakaz.uk.xensource.com
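The copy step described above can be sketched as a small shell helper. This is only an illustration: the function name install_test_xen and its two arguments are hypothetical, the destination name xen-4.0-testing.gz is the one suggested in the message, and the bootloader still has to be refreshed afterwards (as root).

```shell
#!/bin/sh
# Hypothetical helper: copy a freshly built hypervisor image into a boot
# directory under a distinct name, leaving the packaged Xen untouched.
# Usage: install_test_xen <build-tree> <boot-dir>
install_test_xen() {
    src="$1/xen/xen.gz"
    dest="$2/xen-4.0-testing.gz"

    if [ ! -f "$src" ]; then
        echo "no hypervisor image at $src (did 'make -C xen' succeed?)" >&2
        return 1
    fi

    cp "$src" "$dest" || return 1
    echo "$dest"
}

# Example (as root), followed by a bootloader refresh:
#   install_test_xen ./xen-4.0-testing.hg /boot && update-grub
```

Keeping the test build under its own file name means the packaged hypervisor remains available as a fallback boot entry.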
Bug#606964: linux-image-2.6.32-5-xen-amd64: Xen fails to boot dom0 kernel on system with lots of RAM
Ian,

On 01/06/2011 10:09 AM, Ian Campbell wrote:
>> I can test some more options on this system for a few more days, but
>> then I have to reinstall it.
>
> If you are able it would be useful to know if the tip of
> xen-4.0-testing.hg (which is just about to become 4.0.2-rc1) works or
> not. You would only need to build the hypervisor binary itself not all
> the tools etc which is relatively straight forward. I can give more
> detailed instructions if necessary.

Please do (provide detailed instructions).

Regards,
Rik

Archive: http://lists.debian.org/4d2588b0.2050...@esat.kuleuven.be
Bug#606964: linux-image-2.6.32-5-xen-amd64: Xen fails to boot dom0 kernel on system with lots of RAM
On Thu, 2011-01-06 at 10:01 +0100, Rik Theys wrote:
> Hi,
>
> I've tried the following combinations:
[...]

Thanks, I'll dig through the logs in a bit and see if anything shows up.

> This looks like a bug in the Xen hypervisor and not a kernel parameter?

I think so, the CONFIG_XEN_MAX_DOMAIN_MEMORY thing is probably a red herring.

> I can test some more options on this system for a few more days, but
> then I have to reinstall it.

If you are able it would be useful to know if the tip of xen-4.0-testing.hg (which is just about to become 4.0.2-rc1) works or not. You would only need to build the hypervisor binary itself not all the tools etc which is relatively straight forward. I can give more detailed instructions if necessary.

Ian.

--
Ian Campbell
Current Noise: Death - Scavenger Of Human Sorrow

"Once they go up, who cares where they come down? That's not my department."
-- Werner von Braun

Archive: http://lists.debian.org/1294304981.3831.3588.ca...@zakaz.uk.xensource.com
Bug#606964: linux-image-2.6.32-5-xen-amd64: Xen fails to boot dom0 kernel on system with lots of RAM
On Tue, 2011-01-04 at 21:46 +0100, Rik Theys wrote:
> Ian,
>
> >> Now I think of it can you also collect a similar log with the dom0_mem
> >> workaround in place, for comparison's sake.
>
> > Are you able to rebuild the kernel with
> > CONFIG_XEN_MAX_DOMAIN_MEMORY=128? If so then a log of that boot for
> > comparison would be very interesting.
>
> > If you don't know how let me know and I'll build one for you.
>
> If you find the time, please build one. I hope to be able to test it soon.

Please try the kernel from http://xenbits.xen.org/people/ianc/2.6.32-29+mem128/ and collect a log for comparison with the failing case.

Thanks,
Ian.

--
Ian Campbell

It's not the men in my life, but the life in my men that counts.
-- Mae West

Archive: http://lists.debian.org/1294176887.13733.4.ca...@localhost.localdomain
Bug#606964: linux-image-2.6.32-5-xen-amd64: Xen fails to boot dom0 kernel on system with lots of RAM
Ian,

>> Now I think of it can you also collect a similar log with the dom0_mem
>> workaround in place, for comparison's sake.

> Are you able to rebuild the kernel with
> CONFIG_XEN_MAX_DOMAIN_MEMORY=128? If so then a log of that boot for
> comparison would be very interesting.

> If you don't know how let me know and I'll build one for you.

If you find the time, please build one. I hope to be able to test it soon.

--
Rik

Archive: http://lists.debian.org/alpine.lrh.2.00.1101042142500.20...@helium.esat.kuleuven.be
Bug#606964: linux-image-2.6.32-5-xen-amd64: Xen fails to boot dom0 kernel on system with lots of RAM
On Tue, 2011-01-04 at 14:08 +0000, Ian Campbell wrote:
> Now I think of it can you also collect a similar log with the dom0_mem
> workaround in place, for comparison's sake.

Are you able to rebuild the kernel with CONFIG_XEN_MAX_DOMAIN_MEMORY=128? If so then a log of that boot for comparison would be very interesting.

If you don't know how let me know and I'll build one for you.

Ian.

--
Ian Campbell
Current Noise: The Hidden Hand - Sons Of Kings

If you are too busy to read, then you are too busy.

Archive: http://lists.debian.org/1294150698.3831.264.ca...@zakaz.uk.xensource.com
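Before rebuilding, it can help to confirm what value the installed kernel was built with. A minimal sketch, assuming the kernel configuration is available as /boot/config-$(uname -r) (where Debian kernel packages normally install it); the function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of CONFIG_XEN_MAX_DOMAIN_MEMORY
# from a kernel config file, or fail if the option is not set there.
# Usage: xen_max_domain_memory <config-file>
xen_max_domain_memory() {
    config_file="$1"
    value=$(sed -n 's/^CONFIG_XEN_MAX_DOMAIN_MEMORY=//p' "$config_file")
    [ -n "$value" ] || return 1
    echo "$value"
}

# Example:
#   xen_max_domain_memory "/boot/config-$(uname -r)"
```

Running the same check against the config of a rebuilt kernel is a quick way to verify that the new setting actually made it into the build.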
Bug#606964: linux-image-2.6.32-5-xen-amd64: Xen fails to boot dom0 kernel on system with lots of RAM
On Mon, 2011-01-03 at 09:59 +0100, Rik Theys wrote:
> Hi,
>
> On 12/22/2010 10:34 AM, Ian Campbell wrote:
> > I was wondering about clamping something in the kernel to correspond to
> > CONFIG_XEN_MAX_DOMAIN_MEMORY and avoid the issue but you say the crash
> > is after "x VCPUS" and before "Scrubbing Free RAM" so I'm surprised the
> > dom0 kernel has run at this point and hence changing it would not help.
> > The hypervisor doesn't know about this guest configuration item so there
> > isn't much which can be done there.
> >
> > To try and confirm what is going on is there any chance you could
> > collect a serial console log with verbose debugging enabled by using the
> > command line parameters described in "Are there more debugging options I
> > could enable to troubleshoot booting problems?" of
> > http://wiki.xensource.com/xenwiki/XenParavirtOps (without the
> > recommended dom0_mem).
>
> The console log is attached.

Thanks. Did you use earlyprintk=xen on the kernel command line as well as the loglvl=all stuff on the hypervisor command line?

Now I think of it can you also collect a similar log with the dom0_mem workaround in place, for comparison's sake.

> > Assuming that shows that the domain 0 kernel did actually get a chance
> > to run and crash I'll take a look at what else needs to be clamped to
> > have it just work.
>
> From what I can see in the log, the dom0 kernel doesn't seem to get
> started.

It does rather look that way. If you press Ctrl-A three times on the hypervisor serial console then you might be able to inject some hypervisor debug keys, such as 'q', which should print a dump of interesting state about dom0. '0' and 'd' might also print something of interest. Press 'h' for a list of available keys.

Ian.

--
Ian Campbell
Current Noise: The Hidden Hand - Magdalene

Never get into fights with ugly people because they have nothing to lose.

Archive: http://lists.debian.org/1294150116.3831.248.ca...@zakaz.uk.xensource.com
Bug#606964: linux-image-2.6.32-5-xen-amd64: Xen fails to boot dom0 kernel on system with lots of RAM
Hi,

On 12/22/2010 10:34 AM, Ian Campbell wrote:
> I was wondering about clamping something in the kernel to correspond to
> CONFIG_XEN_MAX_DOMAIN_MEMORY and avoid the issue but you say the crash
> is after "x VCPUS" and before "Scrubbing Free RAM" so I'm surprised the
> dom0 kernel has run at this point and hence changing it would not help.
> The hypervisor doesn't know about this guest configuration item so there
> isn't much which can be done there.
>
> To try and confirm what is going on is there any chance you could
> collect a serial console log with verbose debugging enabled by using the
> command line parameters described in "Are there more debugging options I
> could enable to troubleshoot booting problems?" of
> http://wiki.xensource.com/xenwiki/XenParavirtOps (without the
> recommended dom0_mem).

The console log is attached.

> Assuming that shows that the domain 0 kernel did actually get a chance
> to run and crash I'll take a look at what else needs to be clamped to
> have it just work.

From what I can see in the log, the dom0 kernel doesn't seem to get started.

Hope this helps.

Regards,
Rik

(XEN) Xen version 4.0.1 (Debian 4.0.1-1) (wa...@debian.org) (gcc version 4.4.5 20100824 (prerelease) (Debian 4.4.4-11) ) Fri Sep 3 15:38:12 UTC 2010
(XEN) Console output is synchronous.
(XEN) Bootloader: GRUB 1.98+20100804-10
(XEN) Command line: placeholder loglvl=all guest_loglvl=all sync_console console_to_ring com1=115200,8n1 console=com1
(XEN) Video information:
(XEN) VGA is text mode 80x25, font 8x16
(XEN) VBE/DDC methods: none; EDID transfer time: 0 seconds
(XEN) EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN) Found 1 MBR signatures
(XEN) Found 1 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN) - 000a (usable)
(XEN) 0010 - bf379000 (usable)
(XEN) bf379000 - bf38f000 (reserved)
(XEN) bf38f000 - bf3ce000 (ACPI data)
(XEN) bf3ce000 - c000 (reserved)
(XEN) e000 - f000 (reserved)
(XEN) fe00 - 0001 (reserved)
(XEN) 0001 - 00124000 (usable)
(XEN) ACPI: RSDP 000F1240, 0024 (r2 DELL )
(XEN) ACPI: XSDT 000F1344, 009C (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: FACP BF3B3F9C, 00F4 (r3 DELL PE_SC3 1 DELL1)
(XEN) ACPI: DSDT BF38F000, 3D72 (r1 DELL PE_SC3 1 INTL 20050624)
(XEN) ACPI: FACS BF3B6000, 0040
(XEN) ACPI: APIC BF3B3478, 015E (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: SPCR BF3B35D8, 0050 (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: HPET BF3B362C, 0038 (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: DMAR BF3B3668, 01C0 (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: MCFG BF3B38C4, 003C (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: WD__ BF3B3904, 0134 (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: SLIC BF3B3A3C, 0024 (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: ERST BF392EF4, 0270 (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: HEST BF393164, 03A8 (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: BERT BF392D74, 0030 (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: EINJ BF392DA4, 0150 (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: SRAT BF3B3BC0, 0370 (r1 DELL PE_SC3 1 DELL1)
(XEN) ACPI: TCPA BF3B3F34, 0064 (r2 DELL PE_SC3 1 DELL1)
(XEN) ACPI: SSDT BF3B7000, 6C0C (r1 INTEL PPM RCM 8001 INTL 20061109)
(XEN) System RAM: 73715MB (75484260kB)
(XEN) SRAT: PXM 1 -> APIC 32 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 0 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 34 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 2 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 36 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 4 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 48 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 16 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 50 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 18 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 52 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 20 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 33 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 1 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 35 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 3 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 37 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 5 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 49 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 17 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 51 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 19 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 53 -> Node 0
(XEN) SRAT: PXM 2 -> APIC 21 -> Node 1
(XEN) SRAT: Node 1 PXM 2 0-c000
(XEN) SRAT: Node 1 PXM 2 1-94000
(XEN) SRAT: Node 0 PXM 1 94000-124000
(XEN) NUMA: Allocated memnodemap from 123fdfe000 - 123fdff000
(XEN) NUMA: Using 18 for the hash shift.
(XEN) Domain heap initialised DMA width 32 bits
(XEN) found SMP MP-table at 000fe710
(XEN) DMI 2.6 present.
(XEN) Using APIC driver default
(XEN) ACPI: PM-Timer IO Port: 0x808
(XEN) ACPI: ACPI SLEEP INFO: pm1x_cnt[804,0], pm1x_evt[80
Bug#606964: linux-image-2.6.32-5-xen-amd64: Xen fails to boot dom0 kernel on system with lots of RAM
On Wed, 2010-12-22 at 07:20 +0100, Rik Theys wrote:
> Hi,
>
> On 12/14/2010 02:41 PM, Ian Campbell wrote:
> >> When adding dom0_mem=2G to the boot line, the system boots OK.
> >
> > I expect it will work ok with everything up to and including
> > dom0_mem=32G?
>
> It does.

Thanks.

> > It depends a bit on your usecase but it is often best recommended to use
> > dom0_mem= anyway, since you are most often going to balloon dom0 down
> > significantly anyway to make room for guest domains.
>
> I agree and I was planning to use the dom0_mem parameter. But I did
> expect the system to actually boot. Maybe it should be mentioned
> somewhere that the dom0_mem _must_ be set on systems with >32GB memory?

Yes, I should at least mention it in the wiki or something.

I was wondering about clamping something in the kernel to correspond to CONFIG_XEN_MAX_DOMAIN_MEMORY and avoid the issue but you say the crash is after "x VCPUS" and before "Scrubbing Free RAM" so I'm surprised the dom0 kernel has run at this point and hence changing it would not help. The hypervisor doesn't know about this guest configuration item so there isn't much which can be done there.

To try and confirm what is going on is there any chance you could collect a serial console log with verbose debugging enabled by using the command line parameters described in "Are there more debugging options I could enable to troubleshoot booting problems?" of http://wiki.xensource.com/xenwiki/XenParavirtOps (without the recommended dom0_mem).

The primary options of interest in this case are "loglvl=all guest_loglvl=all sync_console console_to_ring" on the hypervisor plus "com1=115200,8n1 console=com1" to enable serial logging and "console=hvc0 earlyprintk=xen nomodeset initcall_debug debug loglevel=10" on the kernel.

Assuming that shows that the domain 0 kernel did actually get a chance to run and crash I'll take a look at what else needs to be clamped to have it just work.

Thanks,
Ian.

--
Ian Campbell
Current Noise: Suffocation - Demise Of The Clone

Conscience doth make cowards of us all.
-- Shakespeare

Archive: http://lists.debian.org/1293010491.4500.3233.ca...@zakaz.uk.xensource.com
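For readers who boot through GRUB 2, the debugging options listed in the message above could be collected in /etc/default/grub roughly as follows. This is a sketch rather than a tested configuration: it assumes your GRUB Xen script honours GRUB_CMDLINE_XEN (the variable used in another report in this thread) and passes GRUB_CMDLINE_LINUX to the dom0 kernel. The option strings themselves are copied from the message.

```shell
# /etc/default/grub (fragment): a sketch, not a tested configuration.
# Hypervisor options: verbose logging plus a serial console on com1.
GRUB_CMDLINE_XEN="loglvl=all guest_loglvl=all sync_console console_to_ring com1=115200,8n1 console=com1"
# dom0 kernel options: early console via Xen and verbose kernel logging.
GRUB_CMDLINE_LINUX="console=hvc0 earlyprintk=xen nomodeset initcall_debug debug loglevel=10"
```

After editing the file, regenerate the boot configuration (update-grub on Debian) and capture the output on the serial line.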
Bug#606964: linux-image-2.6.32-5-xen-amd64: Xen fails to boot dom0 kernel on system with lots of RAM
Hi,

On 12/14/2010 02:41 PM, Ian Campbell wrote:
>> When adding dom0_mem=2G to the boot line, the system boots OK.
>
> I expect it will work ok with everything up to and including
> dom0_mem=32G?

It does.

> It depends a bit on your usecase but it is often best recommended to use
> dom0_mem= anyway, since you are most often going to balloon dom0 down
> significantly anyway to make room for guest domains.

I agree and I was planning to use the dom0_mem parameter. But I did expect the system to actually boot. Maybe it should be mentioned somewhere that the dom0_mem _must_ be set on systems with >32GB memory?

Regards,
Rik

Archive: http://lists.debian.org/4d1198a0.7030...@esat.kuleuven.be
Bug#606964: linux-image-2.6.32-5-xen-amd64: Xen fails to boot dom0 kernel on system with lots of RAM
On Mon, 2010-12-13 at 11:42 +0100, Rik Theys wrote:
> Package: linux-2.6
> Version: 2.6.32-29
> Severity: normal
>
> When booting Xen 4.0.1 with the Debian dom0 kernel, Xen hangs on startup
> after saying that the
> system has x VCPUS and before starting the "Scrubbing Free RAM".
> This looks related to the bug mentioned here:
>
> http://lists.xensource.com/archives/html/xen-devel/2010-08/msg00625.html
>
> When adding dom0_mem=2G to the boot line, the system boots OK.

I expect it will work ok with everything up to and including dom0_mem=32G?

> I've added it to both the kernel command line and the xen command line
> to make sure I had
> the parameter set.
>
> The system has 72GB ram.
>
> The thread mentions a kernel config that can be changed to fix this?
> http://lists.xensource.com/archives/html/xen-devel/2010-08/msg00884.html

Changing this option imposes a static (at compile time) memory overhead on the kernel when running under Xen. It has become more dynamic in later kernels but those changes are not suitable for backporting to Squeeze at this stage. 32GB seems like a reasonable compromise in the meantime.

It depends a bit on your use case but it is usually best to use dom0_mem= anyway, since you are most often going to balloon dom0 down significantly to make room for guest domains.

Note also that the smallest size you can balloon dom0 to is proportional to the initial size of the domain, due to various data structures (such as the frame table) which are sized on boot according to memory size. IIRC a 72GB domain 0 would have several hundred megabytes of such data structures which would therefore not be available for guest use and effectively be wasted. If you use dom0_mem then those data structures are much smaller, which leaves more memory for guests.

I would recommend that you determine how big you want/need domain 0 to be and pass an appropriate dom0_mem option.

Ian.

--
Ian Campbell
Current Noise: Devin Townsend - Addicted

I consider the day misspent that I am not either charged with a crime, or arrested for one.
-- "Ratsy" Tourbillon

Archive: http://lists.debian.org/1292334083.32368.73.ca...@zakaz.uk.xensource.com
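To get a feel for why the initial size of domain 0 matters, here is a back-of-the-envelope sketch. It assumes 4 KiB pages and treats the per-page bookkeeping cost as a parameter; the 8 bytes-per-page figure used below is purely an illustrative assumption, not a number taken from Xen or from the Debian kernel, and the function name is made up.

```shell
#!/bin/sh
# Rough estimate of the memory consumed by a structure that needs a fixed
# number of bytes for every 4 KiB page of domain memory.
# Usage: per_page_overhead_mib <domain-size-in-GiB> <bytes-per-page>
per_page_overhead_mib() {
    mem_gib="$1"
    bytes_per_page="$2"
    pages=$(( mem_gib * 1024 * 1024 / 4 ))    # GiB -> number of 4 KiB pages
    echo $(( pages * bytes_per_page / 1024 / 1024 ))
}

per_page_overhead_mib 72 8    # 72 GiB domain, assumed 8 bytes per page
per_page_overhead_mib 2 8     # 2 GiB domain, same assumed cost per page
```

Under that assumption a single such structure costs 144 MiB for a 72 GiB domain but only 4 MiB for a 2 GiB one, and every additional per-page structure scales the same way.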