Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2016-01-22 Thread Andrew Cooper
On 22/01/16 12:56, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> On 22.09.2015 10:53, Ian Campbell wrote:
>> Hi Vladimir & grub-devel,
>>
>> Do you have any thoughts on this issue with i386 pv-grub2?
>>
> Is it still an issue? If so I'll try to replicate it. From the stack dump
> I see that it has jumped to NULL. GRUB has no threads, so it's not a race
> condition with itself, but it may be one with some part of Xen. An
> alternative possibility is that GRUB forgets to flush the cache at some
> point in the boot process.

Looks like GRUB doesn't have a traptable registered with Xen (the PV
equivalent of the IDT).

First, Xen tried to inject a #GP fault and found that the entry EIP was
at 0 (which is sadly the default if nothing is specified).  It then took
a pagefault while attempting to inject the #GP, and crashed the domain.
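
As a cross-check on the "jumped to NULL" reading, the `unhandled page fault
(ec=0010)` line in the reporter's `xl dmesg` output can be decoded bit by bit.
The sketch below is not taken from Xen or GRUB; the table and the function
name `decode_pf_error_code` are illustrative, and it assumes only the
architectural x86 page-fault error code layout and that Xen prints the code
in hexadecimal.

```python
# x86 page-fault error code bits (architectural layout).
# Each entry: bit -> (name, meaning if set, meaning if clear or None).
PF_BITS = {
    0: ("P", "protection violation", "page not present"),
    1: ("W/R", "write access", "read access"),
    2: ("U/S", "user-mode access", "supervisor-mode access"),
    3: ("RSVD", "reserved bit set in a paging entry", None),
    4: ("I/D", "instruction fetch", None),
}


def decode_pf_error_code(ec):
    """Return a human-readable note for each relevant bit of `ec`."""
    notes = []
    for bit, (name, if_set, if_clear) in PF_BITS.items():
        text = if_set if ec & (1 << bit) else if_clear
        if text is not None:
            notes.append(f"{name}: {text}")
    return notes


# "ec=0010" in the log is read here as hexadecimal 0x10.
for line in decode_pf_error_code(0x0010):
    print(line)
```

Under those assumptions the fault is an instruction fetch from a page that is
not present, which fits a transfer of control to address 0.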

~Andrew

>> Thanks, Ian.

Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2016-01-22 Thread Vladimir 'φ-coder/phcoder' Serbinenko
On 22.09.2015 10:53, Ian Campbell wrote:
> Hi Vladimir & grub-devel,
> 
> Do you have any thoughts on this issue with i386 pv-grub2?
> 
Is it still an issue? If so I'll try to replicate it. From the stack dump
I see that it has jumped to NULL. GRUB has no threads, so it's not a race
condition with itself, but it may be one with some part of Xen. An
alternative possibility is that GRUB forgets to flush the cache at some
point in the boot process.
> Thanks, Ian.

Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2016-01-22 Thread Vladimir 'φ-coder/phcoder' Serbinenko
On 22.01.2016 14:01, Andrew Cooper wrote:
> On 22/01/16 12:56, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>> On 22.09.2015 10:53, Ian Campbell wrote:
>>> Hi Vladimir & grub-devel,
>>>
>>> Do you have any thoughts on this issue with i386 pv-grub2?
>>>
>> Is it still an issue? If so I'll try to replicate it. From the stack dump
>> I see that it has jumped to NULL. GRUB has no threads, so it's not a race
>> condition with itself, but it may be one with some part of Xen. An
>> alternative possibility is that GRUB forgets to flush the cache at some
>> point in the boot process.
> 
> Looks like GRUB doesn't have a traptable registered with Xen (the PV
> equivalent of the IDT).
> 
> First, Xen tried to inject a #GP fault and found that the entry EIP was
> at 0 (which is sadly the default if nothing is specified).  It then took
> a pagefault while attempting to inject the #GP, and crashed the domain.
> 
Do you have a link how to add one? We can put a catch-stacktrace-abort
on it.
> ~Andrew
> 
>>> Thanks, Ian.

Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2016-01-22 Thread Andrew Cooper
On 22/01/16 13:08, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> On 22.01.2016 14:01, Andrew Cooper wrote:
>> On 22/01/16 12:56, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>>> On 22.09.2015 10:53, Ian Campbell wrote:
>>>> Hi Vladimir & grub-devel,
>>>>
>>>> Do you have any thoughts on this issue with i386 pv-grub2?
>>>>
>>> Is it still an issue? If so I'll try to replicate it. From the stack dump
>>> I see that it has jumped to NULL. GRUB has no threads, so it's not a race
>>> condition with itself, but it may be one with some part of Xen. An
>>> alternative possibility is that GRUB forgets to flush the cache at some
>>> point in the boot process.
>> Looks like GRUB doesn't have a traptable registered with Xen (the PV
>> equivalent of the IDT).
>>
>> First, Xen tried to inject a #GP fault and found that the entry EIP was
>> at 0 (which is sadly the default if nothing is specified).  It then took
>> a pagefault while attempting to inject the #GP, and crashed the domain.
>>
> Do you have a link how to add one? We can put a catch-stacktrace-abort
> on it.

This is from my microkernel framework, and is probably the most succinct
code implementation:

http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen-test-framework.git;a=blob;f=arch/x86/pv/traps.c;h=7f9a1908d260659c10f5cbb1d2d234c9fea1edb5;hb=HEAD#l31

The hypercall ABI documentation is:

http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/include/public/arch-x86/xen.h;h=cdd93c1c6446a92e89188c6a5132538188825d27;hb=refs/heads/staging#l126

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2016-01-22 Thread Andreas Sundstrom
On 2016-01-22 13:56, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> On 22.09.2015 10:53, Ian Campbell wrote:
>> Hi Vladimir & grub-devel,
>>
>> Do you have any thoughts on this issue with i386 pv-grub2?
>>
> Is it still an issue? If so I'll try to replicate it. From the stack dump
> I see that it has jumped to NULL. GRUB has no threads, so it's not a race
> condition with itself, but it may be one with some part of Xen. An
> alternative possibility is that GRUB forgets to flush the cache at some
> point in the boot process.

I can still reproduce the issue.
I don't think much has changed in my setup since the report.
I run the current version of Xen and GRUB from Debian stable.

/Andreas



Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-25 Thread Ian Campbell
On Thu, 2015-09-24 at 19:28 +0200, Andreas Sundstrom wrote:
> On 2015-09-23 16:18, Ian Campbell wrote:
> > On Wed, 2015-09-23 at 12:47 +, Andreas Sundstrom wrote:
> > > Quoting Ian Campbell:
> > > 
> > > > Along those lines, if the _host_ has buckets of RAM then might it
> > > > be worth restricting it in case the issue is with getting MFNs
> > > > which are not representable by the 32-bit PV interfaces? (IIRC the
> > > > limit is ~160G due to the size of the m2p hole, a 32-bit MFN spans
> > > > 16TB so it's unlikely to be that).
> > > > 
> > > > Likewise maybe the issue is with full addresses which don't fit in
> > > > a 32-bit number (which is maybe more likely to happen if grub uses
> > > > a 1:1 mapping like I would guess it does), so limiting the host to
> > > > <4GB might also be interesting?
> > > > 
> > > 
> > > If this was meant for me I will need more information to understand
> > > what to test.
> > > dom0 has either 12G or 8G memory in my test machines if that makes a
> > > difference.
> > 
> > It was, sorry for not being clear.
> > 
> > How much memory do the test machines have?
> > 
> > If it is more than 160G then try booting with "mem=160G" on the
> > hypervisor (not Linux) command line. You can just edit that in via grub.
> > 
> > Then try with mem=4G (which might require shrinking dom0 too of
> > course).
> 
> Well as I said my test machines only have 12 and 8G of memory.

You said dom0 did, from which I wasn't able to tell how much RAM the host
had; giving 12GB to dom0 on a 256G machine would be a plausible
configuration. But this is a confusing distinction for many, and I should
have made my reasoning for including the 160G test clearer, sorry.

> I did a quick test with mem=2G though just to be sure, it failed on
> first attempt.

OK, so it is unlikely to be any of the possible integer overflow type
things I was thinking of then, thanks for testing.

Ian.



Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-25 Thread Andreas Sundstrom


Quoting Ian Campbell:

> On Thu, 2015-09-24 at 19:28 +0200, Andreas Sundstrom wrote:
>> On 2015-09-23 16:18, Ian Campbell wrote:
>>> On Wed, 2015-09-23 at 12:47 +, Andreas Sundstrom wrote:
>>>> Quoting Ian Campbell:
>>>>
>>>>> Along those lines, if the _host_ has buckets of RAM then might it be
>>>>> worth restricting it in case the issue is with getting MFNs which are
>>>>> not representable by the 32-bit PV interfaces? (IIRC the limit is
>>>>> ~160G due to the size of the m2p hole, a 32-bit MFN spans 16TB so
>>>>> it's unlikely to be that).
>>>>>
>>>>> Likewise maybe the issue is with full addresses which don't fit in a
>>>>> 32-bit number (which is maybe more likely to happen if grub uses a
>>>>> 1:1 mapping like I would guess it does), so limiting the host to <4GB
>>>>> might also be interesting?
>>>>
>>>> If this was meant for me I will need more information to understand
>>>> what to test.
>>>> dom0 has either 12G or 8G memory in my test machines if that makes a
>>>> difference.
>>>
>>> It was, sorry for not being clear.
>>>
>>> How much memory do the test machines have?
>>>
>>> If it is more than 160G then try booting with "mem=160G" on the
>>> hypervisor (not Linux) command line. You can just edit that in via grub.
>>>
>>> Then try with mem=4G (which might require shrinking dom0 too of
>>> course).
>>
>> Well as I said my test machines only have 12 and 8G of memory.
>
> You said dom0 did, from which I wasn't able to tell how much RAM the host
> had, giving 12GB to dom0 on a 256G machine would be a plausible
> configuration. But this is a confusing distinction for many and I should
> have made that reasoning for including the 160G test clearer, sorry.

No worries, I was equally unclear; I should have said that the host, not
dom0, had that amount of RAM.

>> I did a quick test with mem=2G though just to be sure, it failed on
>> first attempt.
>
> OK, so it is unlikely to be any of the possible integer overflow type
> things I was thinking of then, thanks for testing.

No worries. I have not received anything regarding grub yet, but I
think that is where further debugging needs to happen.




Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-24 Thread Andreas Sundstrom
On 2015-09-23 16:18, Ian Campbell wrote:
> On Wed, 2015-09-23 at 12:47 +, Andreas Sundstrom wrote:
>> Quoting Ian Campbell:
>>
>>> Along those lines, if the _host_ has buckets of RAM then might it be
>>> worth restricting it in case the issue is with getting MFNs which are
>>> not representable by the 32-bit PV interfaces? (IIRC the limit is ~160G
>>> due to the size of the m2p hole, a 32-bit MFN spans 16TB so it's
>>> unlikely to be that).
>>>
>>> Likewise maybe the issue is with full addresses which don't fit in a
>>> 32-bit number (which is maybe more likely to happen if grub uses a 1:1
>>> mapping like I would guess it does), so limiting the host to <4GB might
>>> also be interesting?
>>>
>>
>> If this was meant for me I will need more information to understand  
>> what to test.
>> dom0 has either 12G or 8G memory in my test machines if that makes a  
>> difference.
> 
> It was, sorry for not being clear.
> 
> How much memory do the test machines have?
> 
> If it is more than 160G then try booting with "mem=160G" on the hypervisor
> (not Linux) command line. You can just edit that in via grub.
> 
> Then try with mem=4G (which might require shrinking dom0 too of course).

Well, as I said, my test machines only have 12G and 8G of memory.
I did a quick test with mem=2G though, just to be sure; it failed on the
first attempt.

/Andreas




Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-23 Thread Andreas Sundstrom

Quoting Ian Campbell:

> On Wed, 2015-09-23 at 00:37 +0200, Samuel Thibault wrote:
>> Andreas Sundstrom wrote on Mon 21 Sep 2015 22:03:22 +0200:
>>> Note that my original thought was that this bug probably is within
>>> GRUB.
>>
>> It's probably in the GRUB implementation of loading the domU GRUB, since
>> you say that pvgrub1 loads it fine.
>>
>>> (XEN) domain_crash_sync called from entry.S: fault at 82d08021feb0
>>> compat_create_bounce_frame+0xc6/0xde
>>
>> So it's inside xen entry code...
>>
>>> (XEN) Guest stack trace from esp=005a5ff0:
>>
>> This looks like the end of the stack
>>
>>> (XEN) 0010  0001e019 00010046 0016b38b 0016b38a 0016b389
>>> 0016b388
>>> (XEN) 0016b387 0016b386 0016b385 0016b384 0016b383 0016b382 0016b381
>>> 0016b380
>> [...]
>>
>> and this looks like a lot of consecutive numbers.  Perhaps we simply
>> somehow overflow?  Did you try giving less memory to the domU?
>
> Along those lines, if the _host_ has buckets of RAM then might it be worth
> restricting it in case the issue is with getting MFNs which are not
> representable by the 32-bit PV interfaces? (IIRC the limit is ~160G due to
> the size of the m2p hole, a 32-bit MFN spans 16TB so it's unlikely to be
> that).
>
> Likewise maybe the issue is with full addresses which don't fit in a 32-bit
> number (which is maybe more likely to happen if grub uses a 1:1 mapping
> like I would guess it does), so limiting the host to <4GB might also be
> interesting?

If this was meant for me, I will need more information to understand
what to test.
dom0 has either 12G or 8G of memory in my test machines, if that makes a
difference.


/Andreas




Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-23 Thread Ian Campbell
On Wed, 2015-09-23 at 12:47 +, Andreas Sundstrom wrote:
> Quoting Ian Campbell:
> 
> > Along those lines, if the _host_ has buckets of RAM then might it be
> > worth restricting it in case the issue is with getting MFNs which are
> > not representable by the 32-bit PV interfaces? (IIRC the limit is ~160G
> > due to the size of the m2p hole, a 32-bit MFN spans 16TB so it's
> > unlikely to be that).
> > 
> > Likewise maybe the issue is with full addresses which don't fit in a
> > 32-bit number (which is maybe more likely to happen if grub uses a 1:1
> > mapping like I would guess it does), so limiting the host to <4GB might
> > also be interesting?
> > 
> 
> If this was meant for me I will need more information to understand  
> what to test.
> dom0 has either 12G or 8G memory in my test machines if that makes a  
> difference.

It was, sorry for not being clear.

How much memory do the test machines have?

If it is more than 160G then try booting with "mem=160G" on the hypervisor
(not Linux) command line. You can just edit that in via grub.

Then try with mem=4G (which might require shrinking dom0 too of course).

Ian.





Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-23 Thread Ian Campbell
On Wed, 2015-09-23 at 00:37 +0200, Samuel Thibault wrote:
> Andreas Sundstrom wrote on Mon 21 Sep 2015 22:03:22 +0200:
> > Note that my original thought was that this bug probably is within
> > GRUB.
> 
> It's probably in the GRUB implementation of loading the domU GRUB, since
> you say that pvgrub1 loads it fine.
> 
> > (XEN) domain_crash_sync called from entry.S: fault at 82d08021feb0
> > compat_create_bounce_frame+0xc6/0xde
> 
> So it's inside xen entry code...
> 
> > (XEN) Guest stack trace from esp=005a5ff0:
> 
> This looks like the end of the stack
> 
> > (XEN) 0010  0001e019 00010046 0016b38b 0016b38a 0016b389
> > 0016b388
> > (XEN) 0016b387 0016b386 0016b385 0016b384 0016b383 0016b382 0016b381
> > 0016b380
> [...]
> 
> and this looks like a lot of consecutive numbers.  Perhaps we simply
> somehow overflow?  Did you try giving less memory to the domU?

Along those lines, if the _host_ has buckets of RAM then might it be worth
restricting it in case the issue is with getting MFNs which are not
representable by the 32-bit PV interfaces? (IIRC the limit is ~160G due to
the size of the m2p hole; a 32-bit MFN spans 16TB, so it's unlikely to be
that.)

Likewise, maybe the issue is with full addresses which don't fit in a 32-bit
number (which is maybe more likely to happen if grub uses a 1:1 mapping,
as I would guess it does), so limiting the host to <4GB might also be
interesting?
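
The two size limits mentioned above follow from simple arithmetic. The sketch
below assumes 4 KiB page frames; the helper name `addressable_bytes` is made
up for illustration. It does not attempt to derive the ~160G m2p figure,
which is quoted from memory above.

```python
PAGE_SIZE = 4096  # bytes per page frame, assuming 4 KiB x86 pages

TIB = 1 << 40
GIB = 1 << 30


def addressable_bytes(frame_number_bits, page_size=PAGE_SIZE):
    """Bytes of machine memory reachable with frame numbers of this width."""
    return (1 << frame_number_bits) * page_size


# A 32-bit machine frame number (MFN) indexes 2**32 frames of 4 KiB each.
mfn32_span = addressable_bytes(32)
print(mfn32_span // TIB, "TiB reachable with a 32-bit MFN")

# A full byte address held in 32 bits only reaches 4 GiB, which is
# this many page frames:
frames_below_4g = (4 * GIB) // PAGE_SIZE
print(frames_below_4g, "frames below 4 GiB")
```

So a truncated MFN would only matter on a host with more than 16 TiB of RAM,
whereas a truncated byte address could matter on any host with frames above
4 GiB, which is why the mem=4G test is the more interesting of the two.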

Ian.



Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-23 Thread Andreas Sundstrom

Quoting Samuel Thibault:

> Andreas Sundstrom wrote on Mon 21 Sep 2015 22:03:22 +0200:
>> Note that my original thought was that this bug probably is within GRUB.
>
> It's probably in the GRUB implementation of loading the domU GRUB, since
> you say that pvgrub1 loads it fine.
>
>> (XEN) domain_crash_sync called from entry.S: fault at 82d08021feb0
>> compat_create_bounce_frame+0xc6/0xde
>
> So it's inside xen entry code...
>
>> (XEN) Guest stack trace from esp=005a5ff0:
>
> This looks like the end of the stack
>
>> (XEN) 0010  0001e019 00010046 0016b38b 0016b38a 0016b389
>> 0016b388
>> (XEN) 0016b387 0016b386 0016b385 0016b384 0016b383 0016b382 0016b381
>> 0016b380
> [...]
>
> and this looks like a lot of consecutive numbers.  Perhaps we simply
> somehow overflow?  Did you try giving less memory to the domU?

No, I had not tried that. One of the machines that I have used to replicate
the problem had:
maxmem = 1024
memory = 512

First I just removed the maxmem part, as that is probably quite unusual.
No difference.

Then I set memory to 128, and at first I was not able to reproduce the
problem. But I did some more tests while writing this response, and
eventually it failed with 128M as well.
Is there any reason to try lower?

>> I see no output from the domU grub (except when it works as it should
>> of course).
>
> Yes, as explained in another mail domU has to get to connect to the
> console before getting messages from there.  Another way is to make
> console_io hypercalls, which should end up into xl dmesg.
>
> You may also want to enable grub debugging prints, by setting the debug
> variable to "all".

I just tried with "set debug=all" at the top of the grub.cfg file,
and I could not see any difference in the output from the first grub when
comparing a working chainload to a non-working one (by diffing the output).

Adding the debug statement to the grub.cfg that is loaded by the second
grub (loaded from domU) gives no output at all when booting fails and,
of course, a lot of output when booting works.

So it seems quite clear to me that the actual chainloading/handover to
the second grub is where something goes wrong.

/Andreas





[Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-22 Thread Andreas Sundstrom
This is using Debian Jessie and grub 2.02~beta2-22 (with Debian patches
applied) and Xen 4.4.1

I originally posted a bug report with Debian but got the suggestion to
file bugs with upstream as well.
Debian bug report:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=799480

Note that my original thought was that this bug probably is within GRUB.
But Ian asked me to file a bug with Xen as well, so you will have to live
with the fact that it is centered around GRUB.

Here's the information from my original bug report:

Using a 64-bit dom0 and a 32-bit domU, PV (para-virtualized) grub sometimes
fails when chainloading the domU's grub. A 64-bit domU seems to work 100%
of the time.

My understanding of the process:

 * dom0 launches domU with grub that is loaded from dom0's disk.
 * Grub reads config file from memdisk, and then looks for grub binary in
domU filesystem.
 * If grub is found in domU it then chainloads (multiboot) that grub binary
and the domU grub reads grub.cfg and continues booting.
 * If grub is not found in domU it reads grub.cfg and continues with boot.

It fails at step 3 in my list of the boot process, but sometimes it
does work so it may be something like a race condition that causes the
problem?

A workaround is to either not install /boot/xen in domU or rename it, so
that the first grub that is loaded from dom0's disk will not find the grub
binary in the domU filesystem and hence continues to read grub.cfg and
boot. The drawback of this is of course that the two versions can't
differ too much, as one setup creates grub.cfg and a different one
reads and parses it at boot time.

I am not sure at this point whether this is a problem in Xen or a
problem in grub, but I compiled the legacy pvgrub that uses some minios
from Xen (I don't really know much more about it), and when that legacy
pvgrub chainloads the domU grub it seems to work 100% of the time.
However, the legacy pvgrub is not a real alternative, as it's not packaged
for Debian.

When it fails "xl create vm -c" outputs this:
Parsing config from /etc/xen/vm
libxl: error: libxl_dom.c:35:libxl__domain_type: unable to get domain type for domid=16
Unable to attach console
libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: console child [0] exited with error status 1

And "xl dmesg" shows errors like this:
(XEN) traps.c:2514:d15 Domain attempted WRMSR c0010201 from
0x to 0x.
(XEN) d16:v0: unhandled page fault (ec=0010)
(XEN) Pagetable walk from :
(XEN) L4[0x000] = 000200256027 049c
(XEN) L3[0x000] = 000200255027 049d
(XEN) L2[0x000] = 000200251023 04a1
(XEN) L1[0x000] =  
(XEN) domain_crash_sync called from entry.S: fault at 82d08021feb0
compat_create_bounce_frame+0xc6/0xde
(XEN) Domain 16 (vcpu#0) crashed on cpu#0:
(XEN) [ Xen-4.4.1 x86_64 debug=n Not tainted ]
(XEN) CPU: 0
(XEN) RIP: e019:[<>]
(XEN) RFLAGS: 0246 EM: 1 CONTEXT: pv guest
(XEN) rax:  rbx:  rcx: 
(XEN) rdx:  rsi: 00499000 rdi: 0080
(XEN) rbp: 000a rsp: 005a5ff0 r8: 
(XEN) r9:  r10: 83023e9b9000 r11: 83023e9b9000
(XEN) r12: 033f3d335bfb r13: 82d080300800 r14: 82d0802ea940
(XEN) r15: 83005e819000 cr0: 8005003b cr4: 000506f0
(XEN) cr3: 000200b7a000 cr2: 
(XEN) ds: e021 es: e021 fs: e021 gs: e021 ss: e021 cs: e019
(XEN) Guest stack trace from esp=005a5ff0:
(XEN) 0010  0001e019 00010046 0016b38b 0016b38a 0016b389
0016b388
(XEN) 0016b387 0016b386 0016b385 0016b384 0016b383 0016b382 0016b381
0016b380
(XEN) 0016b37f 0016b37e 0016b37d 0016b37c 0016b37b 0016b37a 0016b379
0016b378
(XEN) 0016b377 0016b376 0016b375 0016b374 0016b373 0016b372 0016b371
0016b370
(XEN) 0016b36f 0016b36e 0016b36d 0016b36c 0016b36b 0016b36a 0016b369
0016b368
(XEN) 0016b367 0016b366 0016b365 0016b364 0016b363 0016b362 0016b361
0016b360
(XEN) 0016b35f 0016b35e 0016b35d 0016b35c 0016b35b 0016b35a 0016b359
0016b358
(XEN) 0016b357 0016b356 0016b355 0016b354 0016b353 0016b352 0016b351
0016b350
(XEN) 0016b34f 0016b34e 0016b34d 0016b34c 0016b34b 0016b34a 0016b349
0016b348
(XEN) 0016b347 0016b346 0016b345 0016b344 0016b343 0016b342 0016b341
0016b340
(XEN) 0016b33f 0016b33e 0016b33d 0016b33c 0016b33b 0016b33a 0016b339
0016b338
(XEN) 0016b337 0016b336 0016b335 0016b334 0016b333 0016b332 0016b331
0016b330
(XEN) 0016b32f 0016b32e 0016b32d 0016b32c 0016b32b 0016b32a 0016b329
0016b328
(XEN) 0016b327 0016b326 0016b325 0016b324 0016b323 0016b322 0016b321
0016b320
(XEN) 0016b31f 0016b31e 0016b31d 0016b31c 0016b31b 0016b31a 0016b319
0016b318
(XEN) 0016b317 0016b316 0016b315 0016b314 0016b313 0016b312 0016b311
0016b310
(XEN) 0016b30f 0016b30e 0016b30d 0016b30c 0016b30b 0016b30a 0016b309
0016b308
(XEN) 0016b307 0016b306 0016b305 0016b304 0016b303 

Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-22 Thread Samuel Thibault
Andreas Sundstrom, on Mon 21 Sep 2015 22:03:22 +0200, wrote:
> Note that my original thought was that this bug probably is within GRUB.

It's probably in the GRUB implementation of loading the domU GRUB, since
you say that pvgrub1 loads it fine.

> (XEN) domain_crash_sync called from entry.S: fault at 82d08021feb0
> compat_create_bounce_frame+0xc6/0xde

So it's inside xen entry code...
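
The same dump reports "unhandled page fault (ec=0010)". Assuming that
value is printed in hex and follows the architectural x86 page-fault
error code layout, it can be decoded as in the sketch below. The table
and function names are made up for this example.

```python
# Meaning of each architectural x86 page-fault error code bit:
# (meaning when the bit is clear, meaning when the bit is set).
PF_FLAGS = {
    0: ("page not present", "protection violation"),
    1: ("read access", "write access"),
    2: ("supervisor mode", "user mode"),
    3: ("no reserved-bit violation", "reserved bit set in a page-table entry"),
    4: ("data access", "instruction fetch"),
}


def decode_pf_error_code(ec):
    """Return one human-readable description per error code bit, bit 0 first."""
    return [names[(ec >> bit) & 1] for bit, names in sorted(PF_FLAGS.items())]


# ec=0010 from the dump, read as hexadecimal (an assumption).
for line in decode_pf_error_code(int("0010", 16)):
    print(line)
```

Only bit 4 is set, so under these assumptions the fault was an
instruction fetch from a page that is not present. That would fit a
jump to an unmapped address, though the dump alone does not prove it.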

> (XEN) Guest stack trace from esp=005a5ff0:

This looks like the end of the stack

> (XEN) 0010  0001e019 00010046 0016b38b 0016b38a 0016b389
> 0016b388
> (XEN) 0016b387 0016b386 0016b385 0016b384 0016b383 0016b382 0016b381
> 0016b380
[...]

and this looks like a lot of consecutive numbers.  Perhaps we simply
somehow overflow?  Did you try giving less memory to the domU?
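
To make the "consecutive numbers" observation concrete, here is a small
sketch that measures the longest run of stack words in which each word
is exactly one less than the previous one. The helper name is made up,
and the sample is just the first quoted dump line as it appears in the
archive.

```python
def longest_descending_run(words):
    """Length of the longest run where each hex word is the previous one minus 1."""
    values = [int(word, 16) for word in words]
    best = run = 1 if values else 0
    for prev, cur in zip(values, values[1:]):
        # Extend the current run, or start a new one at this word.
        run = run + 1 if cur == prev - 1 else 1
        best = max(best, run)
    return best


# First line of the guest stack trace quoted above.
sample = "0010 0001e019 00010046 0016b38b 0016b38a 0016b389 0016b388".split()
print(longest_descending_run(sample))
```

Feeding it the complete dump instead of one line would show how far the
pattern extends, which may help tell a counter-like fill pattern apart
from ordinary stack contents.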

> I see no output from the domU grub (except when it works as it should
> of course).

Yes, as explained in another mail, the domU has to connect to the
console before any messages show up there.  Another way is to make
console_io hypercalls, whose output should end up in xl dmesg.

You may also want to enable grub debugging prints, by setting the debug
variable to "all".

Samuel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-22 Thread Andrew Cooper
On 21/09/2015 21:03, Andreas Sundstrom wrote:
> This is using Debian Jessie and grub 2.02~beta2-22 (with Debian patches
> applied) and Xen 4.4.1
>
> I originally posted a bug report with Debian but got the suggestion to
> file bugs with upstream as well.
> Debian bug report:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=799480
>
> Note that my original thought was that this bug probably is within GRUB.
> But Ian asked me to file a bug with Xen as well, you have to live with the
> fact that it is centered around GRUB though.
>
> Here's the information from my original bug report:
>
> Using 64-bit dom0 and 32-bit domU PV (para-virtualized) grub sometimes
> fail when chainloading the domU's grub. 64-bit domU seem to work 100%
> of the time.

You say sometimes.  Do you mean that repeated attempts to boot a 32bit
domU cause it to either boot correctly, or die in the manner below?
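
One way to put a number on "sometimes" would be to repeat the boot and
count failures, roughly as sketched below. Treat it as a sketch under
assumptions: the domain name "vm", the settle time, and the idea that a
crashed domain no longer shows up in `xl list` (which depends on the
on_crash setting) are not taken from the report. Only the config path
/etc/xen/vm is.

```python
import subprocess
import time


def count_failures(attempt_boot, attempts):
    """Call attempt_boot() `attempts` times and count how often it returns False."""
    return sum(1 for _ in range(attempts) if not attempt_boot())


def boot_once_with_xl(config="/etc/xen/vm", name="vm", settle_seconds=10):
    """Hypothetical single attempt: create the domU, wait, check it survived."""
    subprocess.run(["xl", "create", config], check=True)
    time.sleep(settle_seconds)  # assumed long enough for the chainload to happen
    # Assumption: `xl list <name>` fails once the crashed domain is gone.
    alive = subprocess.run(
        ["xl", "list", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    ).returncode == 0
    if alive:
        subprocess.run(["xl", "destroy", name], check=True)
    return alive
```

On a test host this would be driven as
count_failures(boot_once_with_xl, 20), and the result compared between
32bit and 64bit guests.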

>
> My understanding of the process:
>
>  * dom0 launches domU with grub that is loaded from dom0's disk.
>  * Grub reads config file from memdisk, and then looks for grub binary in
> domU filesystem.
>  * If grub is found in domU it then chainloads (multiboot) that grub binary
> and the domU grub reads grub.cfg and continue booting.
>  * If grub is not found in domU it reads grub.cfg and continues with boot.
>
> It fails at step 3 in my list of the boot process, but sometimes it
> does work so it may be something like a race condition that causes the
> problem?
>
> A workaround is to not install or rename /boot/xen in domU so that the
> first grub that is loaded from dom0's disk will not find the grub
> binary in the domU filesystem and hence continues to read grub.cfg and
> boot. The drawback of this is of course that the two versions can't
> differ too much as there are different setups creating grub.cfg and
> then reading/parsing it at boot time.
>
> I am not sure at this point whether this is a problem in XEN or a
> problem in grub but I compiled the legacy pvgrub that uses some minios
> from XEN (don't really know much more about it) and when that legacy
> pvgrub chainloads the domU grub it seems to work 100% of the time. Now
> the legacy pvgrub is not a real alternative as it's not packaged for
> Debian though.
>
> When it fails "xl create vm -c" outputs this:
> Parsing config from /etc/xen/vm
> libxl: error: libxl_dom.c:35:libxl__domain_type: unable to get domain
> type for domid=16
> Unable to attach console
> libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: console
> child [0] exited with error status 1

These error messages are just because the domain crashes sufficiently
early that libxl can't find the console information.  Running `xl
create` without '-c' would remove the libxl errors.

>
> And "xl dmesg" shows errors like this:
> (XEN) traps.c:2514:d15 Domain attempted WRMSR c0010201 from
> 0x to 0x.
> (XEN) d16:v0: unhandled page fault (ec=0010)
> (XEN) Pagetable walk from :
> (XEN) L4[0x000] = 000200256027 049c
> (XEN) L3[0x000] = 000200255027 049d
> (XEN) L2[0x000] = 000200251023 04a1
> (XEN) L1[0x000] =  
> (XEN) domain_crash_sync called from entry.S: fault at 82d08021feb0
> compat_create_bounce_frame+0xc6/0xde
> (XEN) Domain 16 (vcpu#0) crashed on cpu#0:
> (XEN) [ Xen-4.4.1 x86_64 debug=n Not tainted ]
> (XEN) CPU: 0
> (XEN) RIP: e019:[<>]
> (XEN) RFLAGS: 0246 EM: 1 CONTEXT: pv guest
> (XEN) rax:  rbx:  rcx: 
> (XEN) rdx:  rsi: 00499000 rdi: 0080
> (XEN) rbp: 000a rsp: 005a5ff0 r8: 
> (XEN) r9:  r10: 83023e9b9000 r11: 83023e9b9000
> (XEN) r12: 033f3d335bfb r13: 82d080300800 r14: 82d0802ea940
> (XEN) r15: 83005e819000 cr0: 8005003b cr4: 000506f0
> (XEN) cr3: 000200b7a000 cr2: 
> (XEN) ds: e021 es: e021 fs: e021 gs: e021 ss: e021 cs: e019
> (XEN) Guest stack trace from esp=005a5ff0:
> (XEN) 0010  0001e019 00010046 0016b38b 0016b38a 0016b389
> 0016b388
> [...]

Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-22 Thread Ian Campbell
On Tue, 2015-09-22 at 08:22 +0100, Andrew Cooper wrote:
> 
> The segment registers indicate that the domU is executing in ring1 which
> makes it a 32bit guest (also why 32bit words are used for the stack
> dump), but r10 through r14 have 64bit values in.

r10..r14 are not visible to 32-bit guests, but it appears that this dumping
function in Xen doesn't check for that and so prints them anyway.

I suspect that if these were zeroed or poisoned upon domain creation you
would see those values unmodified here.
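
For reference, the ring-1 reading of the segment registers quoted above
can be checked by splitting a selector into its architectural x86
fields. This is a minimal sketch; the function name is made up, and the
two values are the cs and ds selectors from the dump.

```python
def decode_selector(selector):
    """Split an x86 segment selector into (index, table indicator, RPL)."""
    rpl = selector & 0x3               # bits 0-1: requested privilege level
    table = (selector >> 2) & 0x1      # bit 2: 0 = GDT, 1 = LDT
    index = selector >> 3              # bits 3-15: descriptor table index
    return index, table, rpl


for name, value in (("cs", 0xE019), ("ds", 0xE021)):
    index, table, rpl = decode_selector(value)
    print(f"{name}={value:#06x}: index={index:#x} table={table} rpl={rpl}")
```

Both selectors come out with RPL 1, matching the statement that the
guest is running in ring1.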

> It does appear to be an intermittent bug in 32bit grub-xen in the
> eventual domU, but I have no help to offer with respect to debugging
> grub-xen further.

Me neither. I did suggest to Andreas that he also take it to grub-devel.
I'll reply to the original with a full quote and copy that list.

Ian.




Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-22 Thread Ian Campbell
Hi Vladimir & grub-devel,

Do you have any thoughts on this issue with i386 pv-grub2?

Thanks, Ian.

On Mon, 2015-09-21 at 22:03 +0200, Andreas Sundstrom wrote:
> This is using Debian Jessie and grub 2.02~beta2-22 (with Debian patches
> applied) and Xen 4.4.1
> 
> I originally posted a bug report with Debian but got the suggestion to
> file bugs with upstream as well.
> Debian bug report:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=799480
> 
> Note that my original thought was that this bug probably is within GRUB.
> But Ian asked me to file a bug with Xen as well, you have to live with the
> fact that it is centered around GRUB though.
> 
> Here's the information from my original bug report:
> 
> Using 64-bit dom0 and 32-bit domU PV (para-virtualized) grub sometimes
> fail when chainloading the domU's grub. 64-bit domU seem to work 100%
> of the time.
> 
> My understanding of the process:
> 
>  * dom0 launches domU with grub that is loaded from dom0's disk.
>  * Grub reads config file from memdisk, and then looks for grub binary in
> domU filesystem.
>  * If grub is found in domU it then chainloads (multiboot) that grub binary
> and the domU grub reads grub.cfg and continue booting.
>  * If grub is not found in domU it reads grub.cfg and continues with boot.
> 
> It fails at step 3 in my list of the boot process, but sometimes it
> does work so it may be something like a race condition that causes the
> problem?
> 
> A workaround is to not install or rename /boot/xen in domU so that the
> first grub that is loaded from dom0's disk will not find the grub
> binary in the domU filesystem and hence continues to read grub.cfg and
> boot. The drawback of this is of course that the two versions can't
> differ too much as there are different setups creating grub.cfg and
> then reading/parsing it at boot time.
> 
> I am not sure at this point whether this is a problem in XEN or a
> problem in grub but I compiled the legacy pvgrub that uses some minios
> from XEN (don't really know much more about it) and when that legacy
> pvgrub chainloads the domU grub it seems to work 100% of the time. Now
> the legacy pvgrub is not a real alternative as it's not packaged for
> Debian though.
> 
> When it fails "xl create vm -c" outputs this:
> Parsing config from /etc/xen/vm
> libxl: error: libxl_dom.c:35:libxl__domain_type: unable to get domain
> type for domid=16
> Unable to attach console
> libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: console
> child [0] exited with error status 1
> 
> And "xl dmesg" shows errors like this:
> (XEN) traps.c:2514:d15 Domain attempted WRMSR c0010201 from
> 0x to 0x.
> (XEN) d16:v0: unhandled page fault (ec=0010)
> (XEN) Pagetable walk from :
> (XEN) L4[0x000] = 000200256027 049c
> (XEN) L3[0x000] = 000200255027 049d
> (XEN) L2[0x000] = 000200251023 04a1
> (XEN) L1[0x000] =  
> (XEN) domain_crash_sync called from entry.S: fault at 82d08021feb0
> compat_create_bounce_frame+0xc6/0xde
> (XEN) Domain 16 (vcpu#0) crashed on cpu#0:
> (XEN) [ Xen-4.4.1 x86_64 debug=n Not tainted ]
> (XEN) CPU: 0
> (XEN) RIP: e019:[<>]
> (XEN) RFLAGS: 0246 EM: 1 CONTEXT: pv guest
> (XEN) rax:  rbx:  rcx: 
> (XEN) rdx:  rsi: 00499000 rdi: 0080
> (XEN) rbp: 000a rsp: 005a5ff0 r8: 
> (XEN) r9:  r10: 83023e9b9000 r11: 83023e9b9000
> (XEN) r12: 033f3d335bfb r13: 82d080300800 r14: 82d0802ea940
> (XEN) r15: 83005e819000 cr0: 8005003b cr4: 000506f0
> (XEN) cr3: 000200b7a000 cr2: 
> (XEN) ds: e021 es: e021 fs: e021 gs: e021 ss: e021 cs: e019
> (XEN) Guest stack trace from esp=005a5ff0:
> (XEN) 0010  0001e019 00010046 0016b38b 0016b38a 0016b389
> 0016b388
> [...]

Re: [Xen-devel] [BUG] XEN domU crash when PV grub chainloads 32-bit domU grub

2015-09-22 Thread Andreas Sundstrom


Quoting Andrew Cooper:

> On 21/09/2015 21:03, Andreas Sundstrom wrote:
>> Using 64-bit dom0 and 32-bit domU PV (para-virtualized) grub sometimes
>> fail when chainloading the domU's grub. 64-bit domU seem to work 100%
>> of the time.
>
> You say sometimes.  Do you mean that repeated attempts to boot a 32bit
> domU cause it to either boot correctly, or die in the manner below?

Yes, that is correct; it may or may not be able to complete the boot.

>> When it fails "xl create vm -c" outputs this:
>> Parsing config from /etc/xen/vm
>> libxl: error: libxl_dom.c:35:libxl__domain_type: unable to get domain
>> type for domid=16
>> Unable to attach console
>> libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: console
>> child [0] exited with error status 1
>
> These error messages are just because the domain crashes sufficiently
> early that libxl can't find the console information.  Running `xl
> create` without '-c' would remove the libxl errors.

Correct, they only appear because connecting to the console failed, so
they are not really relevant to the actual issue.

>> I am unable to understand how to debug grub further on my own, I have
>> printed out text from grub so that I understood that it is the
>> chainload that fails. I see no output from the domU grub (except when
>> it works as it should of course). I can help with further testing if
>> needed.
>
> It does appear to be an intermittent bug in 32bit grub-xen in the
> eventual domU, but I have no help to offer with respect to debugging
> grub-xen further.
>
> ~Andrew

As Ian Campbell suggested I have also filed a bug with upstream grub:
http://savannah.gnu.org/bugs/?46014

/Andreas


