Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-10-25 Thread Wei Liu
On Tue, Oct 25, 2016 at 07:25:06PM +0200, Sander Eikelenboom wrote:
> On 2016-10-25 16:49, Wei Liu wrote:
> >On Tue, Oct 25, 2016 at 01:37:45PM +0200, Sander Eikelenboom wrote:
> >>
> >>Tuesday, October 25, 2016, 1:24:12 PM, you wrote:
> >>
> >>> I investigated this a bit. The root cause is the memory accounting is
> >>> wrong in QEMU. It would try to allocate more ram than allowed. I haven't
> >>> tried to figure out exactly what is wrong, though.
> >>
> >>That confirms what I was thinking in the end, but bisecting the qemu-tree
> >>changes between the xen-4.6.0 and xen-4.7.0 releases proved to be pretty
> >>difficult, as I explained. So if you have a hunch as to which code it
> >>resides in, debugging instead of bisecting would probably be better.
> >>(So one of the questions is what changes in the memory accounting when you
> >>supply the kernel from the host instead of the guest, since booting a
> >>kernel with grub from within the guest doesn't give any memory accounting
> >>issues.)
> >>
> >>Thanks for investigating !
> >
> >I think I hunted down the offending function.
> >
> >Mind trying this patch for me?
> 
> Hi Wei,
> 
> This seems to help :)
> 
> With a Linux 4.8 kernel the HVM guest now boots fine with direct kernel boot!
> 
> But there seems to be a gotcha which I think is not in the Xen docs/wiki:
> when trying a Linux 4.3 kernel the guest still didn't boot and I got a
> "qemu: linux kernel too old to load a ram disk" in the qemu log.
> I don't know what qemu regards as "old" in this case.
> 

QEMU checks for a signature / version in the kernel header or whatnot. I
can't tell why that specific number is chosen, though.
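
For anyone who wants to look at the header field in question, here is a
minimal Python sketch. It assumes the check is against the Linux x86 boot
protocol header: the "HdrS" signature at offset 0x202 and a 16-bit
little-endian version at offset 0x206, as laid out in the kernel's boot
protocol documentation. The 2.00 threshold for loading a ramdisk is an
assumption about QEMU's loader, and the function names are made up for
illustration; none of this is taken from the QEMU source.

```python
import struct

# Offsets from the Linux x86 boot protocol header layout.
HDRS_OFFSET = 0x202      # 4-byte signature "HdrS"
VERSION_OFFSET = 0x206   # 16-bit little-endian boot protocol version
HDRS_MAGIC = b"HdrS"


def boot_protocol_version(image):
    """Return (major, minor) of the boot protocol, or None without "HdrS"."""
    if len(image) < VERSION_OFFSET + 2:
        return None
    if image[HDRS_OFFSET:HDRS_OFFSET + 4] != HDRS_MAGIC:
        return None
    (version,) = struct.unpack_from("<H", image, VERSION_OFFSET)
    return version >> 8, version & 0xFF


def can_load_ramdisk(image):
    """Assumed rule: a ramdisk needs boot protocol 2.00 or newer."""
    version = boot_protocol_version(image)
    return version is not None and version >= (2, 0)
```

Feeding it the first kilobyte of the kernel image passed to the guest (for
example `open(path, "rb").read(1024)`) shows whether the header is readable
at all; a missing signature would look the same to such a check as a very
old kernel.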

> Another consideration: would it be 

Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-10-25 Thread Sander Eikelenboom

On 2016-10-25 16:49, Wei Liu wrote:
> On Tue, Oct 25, 2016 at 01:37:45PM +0200, Sander Eikelenboom wrote:
>> Tuesday, October 25, 2016, 1:24:12 PM, you wrote:
>>
>>> I investigated this a bit. The root cause is the memory accounting is
>>> wrong in QEMU. It would try to allocate more ram than allowed. I haven't
>>> tried to figure out exactly what is wrong, though.
>>
>> That confirms what I was thinking in the end, but bisecting the qemu-tree
>> changes between the xen-4.6.0 and xen-4.7.0 releases proved to be pretty
>> difficult, as I explained. So if you have a hunch as to which code it
>> resides in, debugging instead of bisecting would probably be better.
>> (So one of the questions is what changes in the memory accounting when you
>> supply the kernel from the host instead of the guest, since booting a
>> kernel with grub from within the guest doesn't give any memory accounting
>> issues.)
>>
>> Thanks for investigating!
>
> I think I hunted down the offending function.
>
> Mind trying this patch for me?

Hi Wei,

This seems to help :)

With a Linux 4.8 kernel the HVM guest now boots fine with direct kernel boot!

But there seems to be a gotcha which I think is not in the Xen docs/wiki:
when trying a Linux 4.3 kernel the guest still didn't boot and I got a
"qemu: linux kernel too old to load a ram disk" in the qemu log.
I don't know what qemu regards as "old" in this case.

Another consideration: would it be worthwhile to add an osstest test for
direct kernel boot?
(Under the assumption that the host kernel that gets built can also boot in
an HVM guest, it's probably a very cheap test not requiring any additional
builds.)

Thanks again!

--
Sander



> ---8<---
> From 3c7f8b55109959cf470d452f452f7c0ade51 Mon Sep 17 00:00:00 2001
> From: Wei Liu
> Date: Tue, 25 Oct 2016 15:45:04 +0100
> Subject: [PATCH] acpi: don't build acpi tables for xen guests
>
> Xen's toolstack is in charge of building ACPI tables. Skip acpi table
> building if running on Xen.
>
> This issue is discovered due to direct kernel boot on Xen doesn't boot
> anymore, because the new ACPI tables cause the guest to exceed its
> memory allocation limit.
>
> Reported-by: Sander Eikelenboom

Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-10-25 Thread Wei Liu
On Tue, Oct 25, 2016 at 03:49:59PM +0100, Wei Liu wrote:
> On Tue, Oct 25, 2016 at 01:37:45PM +0200, Sander Eikelenboom wrote:
> > 
> > Tuesday, October 25, 2016, 1:24:12 PM, you wrote:
> > 
> > > I investigated this a bit. The root cause is the memory accounting is
> > > wrong in QEMU. It would try to allocate more ram than allowed. I haven't
> > > tried to figure out exactly what is wrong, though.
> > 
> > That confirms what I was thinking in the end, but bisecting the qemu-tree
> > changes between the xen-4.6.0 and xen-4.7.0 releases proved to be pretty
> > difficult, as I explained. So if you have a hunch as to which code it
> > resides in, debugging instead of bisecting would probably be better.
> > (So one of the questions is what changes in the memory accounting when you
> > supply the kernel from the host instead of the guest, since booting a kernel
> > with grub from within the guest doesn't give any memory accounting issues.)
> > 
> > Thanks for investigating !
> 
> I think I hunted down the offending function.
> 
> Mind trying this patch for me?
> 
> ---8<---
> From 3c7f8b55109959cf470d452f452f7c0ade51 Mon Sep 17 00:00:00 2001
> From: Wei Liu 
> Date: Tue, 25 Oct 2016 15:45:04 +0100
> Subject: [PATCH] acpi: don't build acpi tables for xen guests
> 
> Xen's toolstack is in charge of building ACPI tables. Skip acpi table
> building if running on Xen.
> 
> This issue is discovered due to direct kernel boot on Xen doesn't boot
> anymore, because the new ACPI tables cause the guest to exceed its
> memory allocation limit.
> 
> Reported-by: Sander Eikelenboom 
> Signed-off-by: Wei Liu 

Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-10-25 Thread Wei Liu
On Tue, Oct 25, 2016 at 01:37:45PM +0200, Sander Eikelenboom wrote:
> 
> Tuesday, October 25, 2016, 1:24:12 PM, you wrote:
> 
> > I investigated this a bit. The root cause is the memory accounting is
> > wrong in QEMU. It would try to allocate more ram than allowed. I haven't
> > tried to figure out exactly what is wrong, though.
> 
> That confirms what I was thinking in the end, but bisecting the qemu-tree
> changes between the xen-4.6.0 and xen-4.7.0 releases proved to be pretty
> difficult, as I explained. So if you have a hunch as to which code it
> resides in, debugging instead of bisecting would probably be better.
> (So one of the questions is what changes in the memory accounting when you
> supply the kernel from the host instead of the guest, since booting a kernel
> with grub from within the guest doesn't give any memory accounting issues.)
> 
> Thanks for investigating !

I think I hunted down the offending function.

Mind trying this patch for me?

---8<---
From 3c7f8b55109959cf470d452f452f7c0ade51 Mon Sep 17 00:00:00 2001
From: Wei Liu 
Date: Tue, 25 Oct 2016 15:45:04 +0100
Subject: [PATCH] acpi: don't build acpi tables for xen guests

Xen's toolstack is in charge of building ACPI tables. Skip acpi table
building if running on Xen.

This issue was discovered because direct kernel boot on Xen no longer
works: the new ACPI tables cause the guest to exceed its memory
allocation limit.

Reported-by: Sander Eikelenboom 
Signed-off-by: Wei Liu 
---
Cc: Anthony PERARD 
Cc: Stefano Stabellini 

RFC because I'm not sure this is the best way to fix it.
---
 hw/i386/acpi-build.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c

Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-10-25 Thread Sander Eikelenboom

Tuesday, October 25, 2016, 1:24:12 PM, you wrote:

> I investigated this a bit. The root cause is the memory accounting is
> wrong in QEMU. It would try to allocate more ram than allowed. I haven't
> tried to figure out exactly what is wrong, though.

That confirms what I was thinking in the end, but bisecting the qemu-tree
changes between the xen-4.6.0 and xen-4.7.0 releases proved to be pretty
difficult, as I explained. So if you have a hunch as to which code it
resides in, debugging instead of bisecting would probably be better.
(So one of the questions is what changes in the memory accounting when you
supply the kernel from the host instead of the guest, since booting a kernel
with grub from within the guest doesn't give any memory accounting issues.)

Thanks for investigating!
--

Sander

> Wei.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-10-25 Thread Wei Liu
On Tue, Oct 18, 2016 at 01:48:23PM +0100, Wei Liu wrote:
> Thanks for trying.
> 
> > So perhaps some dev can at least verify that the issue is there (since 
> > 4.7.0)
> > and put it on the "known broken" list of things.
> > 
> 
> I will put this into the list of things I need to look at.
> 

I investigated this a bit. The root cause is the memory accounting is
wrong in QEMU. It would try to allocate more ram than allowed. I haven't
tried to figure out exactly what is wrong, though.

Wei.



Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-10-18 Thread Håkon Alstadheim
On 18 Oct 2016 14:48, Wei Liu wrote:
> Thanks for trying.
>
>> So perhaps some dev can at least verify that the issue is there (since 4.7.0)
>> and put it on the "known broken" list of things.
>>
> I will put this into the list of things I need to look at.
>
> Wei.
>
In the meantime, a viable workaround is to use PXE boot if one needs
external boot for HVM under xen-4.7.

Still, the effort is appreciated :-).
Regards, Håkon.

P.S.: I had some difficulty with PXE boot and the serial console; feel free
to email me directly if anyone wants to compare notes.




Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-10-18 Thread Wei Liu
On Mon, Oct 17, 2016 at 05:28:17PM +0200, Sander Eikelenboom wrote:
> Thursday, October 13, 2016, 4:43:31 PM, you wrote:
> 
> > Hi Jan / Wei,
> 
> > Took a while before I had the chance to fiddle some more to find the
> > actual culprit.
> > After analyzing the output of xl -v create somewhat more, I came to the
> > insight that it was probably Qemu and not Xen causing the fault.
> 
> > As a test I just used a qemu-xen binary built with xen-4.6.0, booting up
> > a guest with direct kernel boot mode on xen-unstable. And that old qemu
> > binary works fine.
> 
> > After testing I can conclude Jan was right: the bisection was a red
> > herring, and the problem is caused by some change in Qemu and not by
> > something in the Xen tree.
> > (The strange thing is that, as far as I know, I did a "make distclean"
> > between every build (taking a lot of time), which should have pulled a
> > fresh qemu-xen tree, and therefore the bisection should have led to a
> > commit with a Config.mk hash change for the qemu-xen version.)
> 
> > Will see if I can find some more time to bisect qemu and find the culprit.
> 
> > --
> > Sander
> 
> 
> Unfortunately I have to give up on this issue; for me it's impossible to
> bisect it with my present git-fu.
> 
> The first try, bisecting the whole xen-tree, seems to have hit the issue
> that the qemu-revision that gets pulled on a fresh build is "master" during
> the whole dev period. That creates havoc when trying to bisect, since you
> are testing combinations that were never developed (nor auto tested)
> together (especially when a xen-tree and qemu-tree change have a dependency,
> like Roger's "xen: fix usage of xc_domain_create in domain builder").
> 
> While trying to bisect only qemu (keeping xen itself on RELEASE-4.6.0 and
> seabios on rel-1.8.2) it gets stuck on issues with that tree.
> Between 4.6.0 and 4.7.0 the qemu tree switched from
> git://xenbits.xen.org/qemu-upstream-4.6-testing.git
> to git://xenbits.xen.org/qemu-xen.git; after that there seem to have
> been a lot of merges going back and forth, and to me it seems a mess (but as
> I said, it could also be a lack of git-fu). I tried manual bisecting,
> removing and cloning trees again etc., but that doesn't suffice; it's all
> going nowhere.
> (Meanwhile the known good build (plain RELEASE-4.6.0) always works, so it
> doesn't seem to be some random problem.)
> 

Thanks for trying.

> So perhaps some dev can at least verify that the issue is there (since 4.7.0)
> and put it on the "known broken" list of things.
> 

I will put this into the list of things I need to look at.

Wei.

> --
> Sander
> 



Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-10-17 Thread Sander Eikelenboom
Thursday, October 13, 2016, 4:43:31 PM, you wrote:

> Hi Jan / Wei,

> Took a while before i had the chance to fiddle some more to find the actual 
> culprit.
> After analyzing the output of xl -v create somewhat more i came to the 
> insight it was probably Qemu and not Xen causing the fault.

> As a test I just used a qemu-xen binary build with xen-4.6.0 booting up a 
> guest with
> direct kernel boot mode on xen-unstable. And that old qemu binary works fine.

> After testing i can conclude, Jan was right, the bisection was a red herring,
> the problem is caused by some change in Qemu and not by something in the Xen 
> tree.
> (strange thing is that for as far as i know i did a "make distclean" between 
> every build (taking a lot of time), which should have pulled a fresh qemu-xen 
> tree and therefor the bisection should have lead to a commit with a Config.mk 
> hash change for qemu-xen version.)

> Will see if i can find some more time and bisect qemu and find the culprit.

> --
> Sander


Unfortunately I have to give up on this issue; for me it's impossible to
bisect with my present git-fu.

The first try, bisecting the whole xen tree, seems to have hit the problem
that the qemu revision that gets pulled on a fresh build is "master" during
the whole dev period. That creates havoc when trying to bisect, since you are
testing combinations that were never developed (nor auto-tested) together
(especially when a xen-tree change and a qemu-tree change depend on each
other, like Roger's "xen: fix usage of xc_domain_create in domain builder").

While trying to bisect only qemu (keeping xen itself on RELEASE-4.6.0 and
seabios on rel-1.8.2) I got stuck on issues with that tree.
Between 4.6.0 and 4.7.0 the qemu tree switched from
git://xenbits.xen.org/qemu-upstream-4.6-testing.git
to git://xenbits.xen.org/qemu-xen.git, and after that there seem to have
been a lot of merges going back and forth; to me it seems a mess (but as I
said, it could also be a lack of git-fu). I tried bisecting manually, removing
and cloning the trees again etc., but that doesn't suffice; it's all going
nowhere.
(Meanwhile the known good build (plain RELEASE-4.6.0) always works, so it
doesn't seem to be some random problem.)

So perhaps some dev can at least verify that the issue is there (since 4.7.0)
and put it on the "known broken" list of things.
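
One way to make the "master" problem described above visible while bisecting
is to check, at each step, which qemu revision the checked-out Config.mk would
pull. What follows is only a minimal Python sketch: it assumes the revision is
set through a variable named QEMU_UPSTREAM_REVISION, and the helper names, the
sample lines and the tag "qemu-xen-4.6.0" are illustrative, not copied from
any particular commit.

```python
import re


def pinned_revision(config_mk_text, var="QEMU_UPSTREAM_REVISION"):
    """Return the first value assigned to `var` in Config.mk text, or None.

    Matches both `VAR = value` and `VAR ?= value` assignments.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(var) + r"\s*\??=\s*(\S+)", re.MULTILINE
    )
    match = pattern.search(config_mk_text)
    return match.group(1) if match else None


def is_pinned(revision):
    """A revision of 'master' (or none at all) follows a moving branch."""
    return revision is not None and revision != "master"


# Illustrative inputs (assumptions, not real Config.mk contents).
sample_dev = "QEMU_UPSTREAM_REVISION ?= master\n"
sample_release = "QEMU_UPSTREAM_REVISION ?= qemu-xen-4.6.0\n"

for name, text in (("dev", sample_dev), ("release", sample_release)):
    rev = pinned_revision(text)
    print(name, rev, "pinned" if is_pinned(rev) else "NOT pinned")
```

A bisect step that reports "NOT pinned" builds against whatever the qemu
branch points to on the day of the build, so its result says little about the
xen commit under test.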

--
Sander




Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-10-13 Thread Sander Eikelenboom
Hi Jan / Wei,

Took a while before I had the chance to fiddle some more to find the actual
culprit.
After analyzing the output of xl -v create somewhat more, I came to the
insight that it was probably Qemu and not Xen causing the fault.

As a test I just used a qemu-xen binary built with xen-4.6.0 to boot a guest
in direct kernel boot mode on xen-unstable. And that old qemu binary works fine.

After testing I can conclude Jan was right: the bisection was a red herring,
and the problem is caused by some change in Qemu and not by something in the
Xen tree.
(The strange thing is that, as far as I know, I did a "make distclean" between
every build (taking a lot of time), which should have pulled a fresh qemu-xen
tree, and therefore the bisection should have led to a commit with a Config.mk
hash change for the qemu-xen version.)

Will see if I can find some more time to bisect qemu and find the culprit.

--
Sander




Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-09-05 Thread linux

On 2016-09-05 13:43, Jan Beulich wrote:
> On 05.09.16 at 13:19,  wrote:
>> On 2016-09-05 12:25, Jan Beulich wrote:
>>> Anyway - with you quite clearly having used HAP before, I can't
>>> see how this commit would matter for you at all. In case you want
>>> to double check you could try with a hypervisor built without
>>> shadow paging code (which we've been allowing for quite a
>>> while).
>>
>> I just tried that and without shadow paging code the guest boots fine,
>> so that's interesting.
>
> Indeed. Was that try with plain staging/master, or with much of
> the reverts in place (from the bisection)? It seems to me that
> investigating this odd difference would perhaps be a better
> route than trying to guess what's wrong with said commit.
>
> Jan

It was a try with a tree at the culprit commit, with
xen/arch/x86/Rules.mk edited to keep the shadow paging code from being
built.

I have now also tried with unstable, disabling it through Kconfig, but with
that build the guest doesn't boot.

So either:
- the Kconfig option doesn't work, or
- the reproduction isn't 100% reliable after all (but I should have noticed
  that earlier on, I would say), or
- there is something else (did the semantics around the disabling change?)

*sigh*, seems it's not going to be an easy one :-\

My /boot/xen-4.8-unstable.config gives:
#
# Architecture Features
#
CONFIG_NR_CPUS=256
# CONFIG_SHADOW_PAGING is not set
# CONFIG_BIGMEM is not set
CONFIG_HVM_FEP=y
CONFIG_TBOOT=y

So it should be off i guess.
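
To double check a config like the one above without reading it by eye, a
small Python sketch can parse the Kconfig-style text. The helper name
parse_kconfig is made up for illustration, and the check only shows what the
config file says, not what was actually compiled into the running hypervisor.

```python
import re


def parse_kconfig(text):
    """Map CONFIG_* names to their values.

    Lines of the form '# CONFIG_FOO is not set' are recorded as 'n';
    other comments and blank lines are ignored.
    """
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = "n"
            continue
        assigned = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


# The fragment quoted in the message above.
config_text = """\
#
# Architecture Features
#
CONFIG_NR_CPUS=256
# CONFIG_SHADOW_PAGING is not set
# CONFIG_BIGMEM is not set
CONFIG_HVM_FEP=y
CONFIG_TBOOT=y
"""

opts = parse_kconfig(config_text)
print("shadow paging:", opts.get("CONFIG_SHADOW_PAGING", "absent"))
```

For the fragment above this reports shadow paging as "n", which matches the
reading that the option is off in the config file.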

--
Sander



Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-09-05 Thread Jan Beulich
>>> On 05.09.16 at 13:19,  wrote:
> On 2016-09-05 12:25, Jan Beulich wrote:
>> Anyway - with you quite clearly having used HAP before, I can't
>> see how this commit would matter for you at all. In case you want
>> to double check you could try with a hypervisor built without
>> shadow paging code (which we've been allowing for quite a
>> while).
> 
> I just tried that and without shadow paging code the guest boots fine, 
> so that's interesting.

Indeed. Was that try with plain staging/master, or with much of
the reverts in place (from the bisection)? It seems to me that
investigating this odd difference would perhaps be a better
route than trying to guess what's wrong with said commit.

Jan




Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-09-05 Thread linux

On 2016-09-05 12:25, Jan Beulich wrote:
> On 05.09.16 at 12:02,  wrote:
>> On 2016-09-05 11:46, Jan Beulich wrote:
>>> On 05.09.16 at 11:20,  wrote:
>>>> Hmm it seems my thread was kind of hijacked and i was dropped from the
>>>> CC.
>>>>
>>>> I had some time and bisected the issue and it resulted in:
>>>>
>>>> 5a3ce8f85e7e7bdd339d259daa19f6bc5cb4735f is the first bad commit
>>>> commit 5a3ce8f85e7e7bdd339d259daa19f6bc5cb4735f
>>>> Author: Jan Beulich
>>>> Date:   Wed Oct 21 10:56:31 2015 +0200
>>>>
>>>>  x86/shadow: drop stray name tags from sh_{guest_get,map}_eff_l1e()
>>>
>>> Hmm, as Wei already indicated - that's rather odd. The commit isn't
>>> really supposed to have any effect on functionality (and going
>>> through it again I also can't spot any now). And are you indeed
>>> using shadow mode, and if so does your problem not occur when
>>> you use HAP instead?
>>>
>>> In any event, if there was some hidden (and unintended) change
>>> in functionality here, then the most likely result would seem to be
>>> a crash, yet from the log fragment you posted it doesn't look like
>>> there's _any_ relevant hypervisor output.
>>
>> Hmm i was already afraid of that.
>> Attached is the output of xl dmesg, HAP is supported and should be
>> enabled by default (and i didn't disable it explicitly in my guest.cfg).
>>
>> I just tried the opposite and specified hap=0 in my guest.cfg and this
>> case leads to 2 lines of additional output:
>>
>> (XEN) [2016-09-05 09:58:22.201] sh error: sh_remove_all_mappings(): can't find all mappings of mfn 471b69: c=8003 t=7401
>> (XEN) [2016-09-05 09:58:22.201] sh error: sh_remove_all_mappings(): can't find all mappings of mfn 471b68: c=8003 t=7401
>
> And these two messages are relevant here? I.e. do they go away
> when you use a commit ahead of the one your bisect spotted?

Just double checked with a build one commit ahead of the culprit the
bisection reported and hap=0, and those messages are there as well and
the guest boots fine now.
So they don't seem to be relevant.

> Anyway - with you quite clearly having used HAP before, I can't
> see how this commit would matter for you at all. In case you want
> to double check you could try with a hypervisor built without
> shadow paging code (which we've been allowing for quite a
> while).

I just tried that and without shadow paging code the guest boots fine,
so that's interesting.

> Is it possible that the reproduction of the issue isn't 100% reliable?

Nope it seems 100% reliable.

> I.e. did you verify with a couple of runs each that it really is this
> commit, and not just some spurious effect? If it is, then from all I
> know so far I'd suspect an effect from code / data arrangement
> rather than the commit itself to be the actual culprit.

Well at least there is one other independent user running into the same
issue, so it doesn't seem specifically related to my machine or my builds.
It also happens when running all my guests (and this is the last to
start) and with only this guest.

> Which reminds
> me of another possible way of double checking: If said commit
> reverts reasonably cleanly at the tip of staging or master, maybe
> you could try with just this change reverted, instead of with
> everything subsequent to it reverted too?

Nope, I tried that already and it didn't revert cleanly (and i didn't
see how to correctly fix it up).

--
Sander

> Jan




Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-09-05 Thread Jan Beulich
>>> On 05.09.16 at 12:02,  wrote:
> On 2016-09-05 11:46, Jan Beulich wrote:
> On 05.09.16 at 11:20,  wrote:
>>> Hmm it seems my thread was kind of hijacked and i was dropped from the
>>> CC.
>>> 
>>> I had some time and bisected the issue and it resulted in:
>>> 
>>> 5a3ce8f85e7e7bdd339d259daa19f6bc5cb4735f is the first bad commit
>>> commit 5a3ce8f85e7e7bdd339d259daa19f6bc5cb4735f
>>> Author: Jan Beulich 
>>> Date:   Wed Oct 21 10:56:31 2015 +0200
>>> 
>>>  x86/shadow: drop stray name tags from 
>>> sh_{guest_get,map}_eff_l1e()
>> 
>> Hmm, as Wei already indicated - that's rather odd. The commit isn't
>> really supposed to have any effect on functionality (and going
>> through it again I also can't spot any now). And are you indeed
>> using shadow mode, and if so does your problem not occur when
>> you use HAP instead?
>> 
>> In any event, if there was some hidden (and unintended) change
>> in functionality here, then the most likely result would seem to be
>> a crash, yet from the log fragment you posted it doesn't look like
>> there's _any_ relevant hypervisor output.
> 
> Hmm i was already afraid of that.
> Attached is the output of xl dmesg, HAP is supported and should be 
> enabled by default (and i didn't disable it explicitly in my guest.cfg).
> 
> I just tried the opposite and specified hap=0 in my guest.cfg and this 
> case leads to 2 lines of additional output:
> 
> (XEN) [2016-09-05 09:58:22.201] sh error: sh_remove_all_mappings(): can't find all mappings of mfn 471b69: c=8003 t=7401
> (XEN) [2016-09-05 09:58:22.201] sh error: sh_remove_all_mappings(): can't find all mappings of mfn 471b68: c=8003 t=7401

And these two messages are relevant here? I.e. do they go away
when you use a commit ahead of the one your bisect spotted?

Anyway - with you quite clearly having used HAP before, I can't
see how this commit would matter for you at all. In case you want
to double check you could try with a hypervisor built without
shadow paging code (which we've been allowing for quite a
while).

Is it possible that the reproduction of the issue isn't 100% reliable?
I.e. did you verify with a couple of runs each that it really is this
commit, and not just some spurious effect? If it is, then from all I
know so far I'd suspect an effect from code / data arrangement
rather than the commit itself to be the actual culprit. Which reminds
me of another possible way of double checking: If said commit
reverts reasonably cleanly at the tip of staging or master, maybe
you could try with just this change reverted, instead of with
everything subsequent to it reverted too?

Jan




Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-09-05 Thread linux

On 2016-09-05 11:46, Jan Beulich wrote:
> On 05.09.16 at 11:20,  wrote:
>> Hmm it seems my thread was kind of hijacked and i was dropped from the
>> CC.
>>
>> I had some time and bisected the issue and it resulted in:
>>
>> 5a3ce8f85e7e7bdd339d259daa19f6bc5cb4735f is the first bad commit
>> commit 5a3ce8f85e7e7bdd339d259daa19f6bc5cb4735f
>> Author: Jan Beulich
>> Date:   Wed Oct 21 10:56:31 2015 +0200
>>
>>  x86/shadow: drop stray name tags from sh_{guest_get,map}_eff_l1e()
>
> Hmm, as Wei already indicated - that's rather odd. The commit isn't
> really supposed to have any effect on functionality (and going
> through it again I also can't spot any now). And are you indeed
> using shadow mode, and if so does your problem not occur when
> you use HAP instead?
>
> In any event, if there was some hidden (and unintended) change
> in functionality here, then the most likely result would seem to be
> a crash, yet from the log fragment you posted it doesn't look like
> there's _any_ relevant hypervisor output.
>
> Jan

Hmm i was already afraid of that.
Attached is the output of xl dmesg, HAP is supported and should be
enabled by default (and i didn't disable it explicitly in my guest.cfg).

I just tried the opposite and specified hap=0 in my guest.cfg and this
case leads to 2 lines of additional output:

(XEN) [2016-09-05 09:58:22.201] sh error: sh_remove_all_mappings(): can't find all mappings of mfn 471b69: c=8003 t=7401
(XEN) [2016-09-05 09:58:22.201] sh error: sh_remove_all_mappings(): can't find all mappings of mfn 471b68: c=8003 t=7401
(XEN) [2016-09-05 09:58:22.334] d0v5 Over-allocation for domain 3: 262401 > 262400
(XEN) [2016-09-05 09:58:22.334] memory.c:163:d0v5 Could not allocate order=0 extent: id=3 memflags=0 (192 of 512)


--
Sander
(XEN) Xen version 4.7-unstable (r...@dyndns.org) (gcc-4.9.real (Debian 
4.9.2-10) 4.9.2) debug=y Mon Sep  5 11:03:14 CEST 2016
(XEN) Latest ChangeSet: Wed Oct 21 10:56:31 2015 +0200 git:5a3ce8f
(XEN) Bootloader: GRUB 2.02~beta2-22+deb8u1
(XEN) Command line: dom0_mem=1536M,max:1536M loglvl=all loglvl_guest=all 
console_timestamps=datems vga=gfx-1280x1024x32 no-cpuidle cpufreq=xen 
com1=38400,8n1 console=vga,com1 ivrs_ioapic[6]=00:14.0 
iommu=on,verbose,debug,amd-iommu-debug conring_size=128k sched=credit2 ucode=-1
(XEN) Video information:
(XEN)  VGA is graphics mode 1280x1024, 32 bpp
(XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN)  EDID info not retrieved because of reasons unknown
(XEN) Disc information:
(XEN)  Found 2 MBR signatures
(XEN)  Found 2 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)   - 00099400 (usable)
(XEN)  00099400 - 000a (reserved)
(XEN)  000e4000 - 0010 (reserved)
(XEN)  0010 - 9ff9 (usable)
(XEN)  9ff9 - 9ff9e000 (ACPI data)
(XEN)  9ff9e000 - 9ffe (ACPI NVS)
(XEN)  9ffe - a000 (reserved)
(XEN)  ffe0 - 0001 (reserved)
(XEN)  0001 - 00056000 (usable)
(XEN) ACPI: RSDP 000FB100, 0014 (r0 ACPIAM)
(XEN) ACPI: RSDT 9FF9, 0048 (r1 MSIOEMSLIC  20100913 MSFT   97)
(XEN) ACPI: FACP 9FF90200, 0084 (r1 7640MS A7640100 20100913 MSFT   97)
(XEN) ACPI: DSDT 9FF905E0, 9427 (r1  A7640 A7640100  100 INTL 20051117)
(XEN) ACPI: FACS 9FF9E000, 0040
(XEN) ACPI: APIC 9FF90390, 0088 (r1 7640MS A7640100 20100913 MSFT   97)
(XEN) ACPI: MCFG 9FF90420, 003C (r1 7640MS OEMMCFG  20100913 MSFT   97)
(XEN) ACPI: SLIC 9FF90460, 0176 (r1 MSIOEMSLIC  20100913 MSFT   97)
(XEN) ACPI: OEMB 9FF9E040, 0072 (r1 7640MS A7640100 20100913 MSFT   97)
(XEN) ACPI: SRAT 9FF9A5E0, 0108 (r3 AMDFAM_F_102 AMD 1)
(XEN) ACPI: HPET 9FF9A6F0, 0038 (r1 7640MS OEMHPET  20100913 MSFT   97)
(XEN) ACPI: IVRS 9FF9A730, 0110 (r1  AMD RD890S   202031 AMD 0)
(XEN) ACPI: SSDT 9FF9A840, 0DA4 (r1 A M I  POWERNOW1 AMD 1)
(XEN) System RAM: 20479MB (20970660kB)
(XEN) SRAT: PXM 0 -> APIC 00 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 01 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 02 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 03 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 04 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 05 -> Node 0
(XEN) SRAT: Node 0 PXM 0 0-a
(XEN) SRAT: Node 0 PXM 0 10-a000
(XEN) SRAT: Node 0 PXM 0 1-56000
(XEN) NUMA: Allocated memnodemap from 55c797000 - 55c79d000
(XEN) NUMA: Using 8 for the hash shift.
(XEN) Domain heap initialised

Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-09-05 Thread Jan Beulich
>>> On 05.09.16 at 11:20,  wrote:
> Hmm it seems my thread was kind of hijacked and i was dropped from the 
> CC.
> 
> I had some time and bisected the issue and it resulted in:
> 
> 5a3ce8f85e7e7bdd339d259daa19f6bc5cb4735f is the first bad commit
> commit 5a3ce8f85e7e7bdd339d259daa19f6bc5cb4735f
> Author: Jan Beulich 
> Date:   Wed Oct 21 10:56:31 2015 +0200
> 
>  x86/shadow: drop stray name tags from sh_{guest_get,map}_eff_l1e()

Hmm, as Wei already indicated - that's rather odd. The commit isn't
really supposed to have any effect on functionality (and going
through it again I also can't spot any now). And are you indeed
using shadow mode, and if so does your problem not occur when
you use HAP instead?

In any event, if there was some hidden (and unintended) change
in functionality here, then the most likely result would seem to be
a crash, yet from the log fragment you posted it doesn't look like
there's _any_ relevant hypervisor output.

Jan




Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-09-05 Thread Wei Liu
On Mon, Sep 05, 2016 at 11:20:30AM +0200, li...@eikelenboom.it wrote:
> On 2016-08-25 23:18, li...@eikelenboom.it wrote:
> >On 2016-08-25 22:34, Doug Goldstein wrote:
> >>On 8/25/16 4:21 PM, li...@eikelenboom.it wrote:
> >>>Today i tried to switch some of my HVM guests (qemu-xen) from booting
> >>>of
> >>>a kernel *inside* the guest, to a dom0 supplied kernel, which is
> >>>described as "Direct Kernel Boot" here:
> >>>https://xenbits.xen.org/docs/unstable/man/xl.cfg.5.html :
> >>>
> >>>Direct Kernel Boot
> >>>
> >>>Direct kernel boot allows booting directly from a kernel and
> >>>initrd
> >>>stored in the host physical
> >>>machine OS, allowing command line arguments to be passed directly.
> >>>PV guest direct kernel boot
> >>>is supported. HVM guest direct kernel boot is supported with
> >>>limitation (it's supported when
> >>>using qemu-xen and default BIOS 'seabios'; not supported in case
> >>>of
> >>>stubdom-dm and old rombios.)
> >>>
> >>>kernel="PATHNAME"Load the specified file as the kernel image.
> >>>ramdisk="PATHNAME"   Load the specified file as the ramdisk.
> >>>
> >>>But qemu fails to start, output appended below.
> >>>
> >>>I tested with:
> >>>- current Xen-unstable, which fails.
> >>>- xen-stable-4.7.0 release, which fails.
> >>>- xen-stable-4.6.0 release, works fine.
> >>
> >>Can you include the logs from xl dmesg around that time frame as well?
> >
> >Ah i thought there wasn't any, but didn't check thoroughly or wasn't there
> >since the release builds are non-debug by default.
> >
> >However, back on xen-unstable:
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: CPU
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: PIC
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: IOAPIC
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: LAPIC
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: LAPIC_REGS
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: PCI_IRQ
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: ISA_IRQ
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: PCI_LINK
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: PIT
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: RTC
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: HPET
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: PMTIMER
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: MTRR
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: VIRIDIAN_DOMAIN
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: CPU_XSAVE
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: VIRIDIAN_VCPU
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: VMCE_VCPU
> >(XEN) [2016-08-25 21:09:15.172] HVM19 save: TSC_ADJUST
> >(XEN) [2016-08-25 21:09:15.172] HVM19 restore: CPU 0
> >(XEN) [2016-08-25 21:09:16.126] d0v1 Over-allocation for domain 19:
> >262401 > 262400
> >(XEN) [2016-08-25 21:09:16.126] memory.c:213:d0v1 Could not allocate
> >order=0 extent: id=19 memflags=0 (192 of 512)
> >
> >Hmm some off by one issue ?
> >
> >
> >>Just wondering how much RAM you're domain is defined with as well?
> >
> >1024 Mb, there is more than enough unallocated memory for xen to start
> >the guest (and dom0 is fixed with dom0_mem=1536M,max:1536M and
> >ballooning is off)
> 
> 
> Hmm it seems my thread was kind of hijacked and i was dropped from the CC.
> 

Oops, I thought you were CC'ed. Sorry.

> I had some time and bisected the issue and it resulted in:
> 
> 5a3ce8f85e7e7bdd339d259daa19f6bc5cb4735f is the first bad commit
> commit 5a3ce8f85e7e7bdd339d259daa19f6bc5cb4735f
> Author: Jan Beulich 
> Date:   Wed Oct 21 10:56:31 2015 +0200
> 
> x86/shadow: drop stray name tags from sh_{guest_get,map}_eff_l1e()
> 
> They (as a now being removed comment validly says) depend only on Xen's
> number of page table levels, and hence their tags didn't serve any
> useful purpose (there could only ever be one instance in a single
> binary, even back in the x86-32 days).
> 
> Further conditionalize the inclusion of PV-specific hook pointers, at
> once making sure that PV guests can't ever get other than 4-level mode
> enabled for them.
> 
> For consistency reasons shadow_{write,cmpxchg}_guest_entry() also get
> moved next to the other PV-only actors, allowing them to become static
> just like the $subject ones do.
> 
> Signed-off-by: Jan Beulich 
> Acked-by: Tim Deegan 
> 
> :04 04 0c2e3475f81547f934a5960d9f1ac4849707d4ed
> f17f5ff17ca50d6ab908afe9a2d8555d954d3d0a M  xen
> 

Unfortunately I can't see immediately why this would affect QEMU direct
boot. It also suggests that it only affects shadow code -- what kind of
hardware are you using?

Wei.

> 
> --
> Sander
> 
> 
> >
> >--
> >Sander



Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-09-05 Thread linux

On 2016-08-25 23:18, li...@eikelenboom.it wrote:
> On 2016-08-25 22:34, Doug Goldstein wrote:
>> On 8/25/16 4:21 PM, li...@eikelenboom.it wrote:
>>> Today i tried to switch some of my HVM guests (qemu-xen) from booting of
>>> a kernel *inside* the guest, to a dom0 supplied kernel, which is
>>> described as "Direct Kernel Boot" here:
>>> https://xenbits.xen.org/docs/unstable/man/xl.cfg.5.html :
>>>
>>> Direct Kernel Boot
>>>
>>> Direct kernel boot allows booting directly from a kernel and initrd
>>> stored in the host physical machine OS, allowing command line
>>> arguments to be passed directly. PV guest direct kernel boot is
>>> supported. HVM guest direct kernel boot is supported with limitation
>>> (it's supported when using qemu-xen and default BIOS 'seabios'; not
>>> supported in case of stubdom-dm and old rombios.)
>>>
>>> kernel="PATHNAME"    Load the specified file as the kernel image.
>>> ramdisk="PATHNAME"   Load the specified file as the ramdisk.
>>>
>>> But qemu fails to start, output appended below.
>>>
>>> I tested with:
>>> - current Xen-unstable, which fails.
>>> - xen-stable-4.7.0 release, which fails.
>>> - xen-stable-4.6.0 release, works fine.
>>
>> Can you include the logs from xl dmesg around that time frame as well?
>
> Ah i thought there wasn't any, but didn't check thoroughly or wasn't there
> since the release builds are non-debug by default.
>
> However, back on xen-unstable:
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: CPU
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: PIC
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: IOAPIC
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: LAPIC
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: LAPIC_REGS
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: PCI_IRQ
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: ISA_IRQ
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: PCI_LINK
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: PIT
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: RTC
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: HPET
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: PMTIMER
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: MTRR
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: VIRIDIAN_DOMAIN
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: CPU_XSAVE
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: VIRIDIAN_VCPU
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: VMCE_VCPU
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: TSC_ADJUST
> (XEN) [2016-08-25 21:09:15.172] HVM19 restore: CPU 0
> (XEN) [2016-08-25 21:09:16.126] d0v1 Over-allocation for domain 19: 262401 > 262400
> (XEN) [2016-08-25 21:09:16.126] memory.c:213:d0v1 Could not allocate order=0 extent: id=19 memflags=0 (192 of 512)
>
> Hmm some off by one issue ?
>
>> Just wondering how much RAM you're domain is defined with as well?
>
> 1024 Mb, there is more than enough unallocated memory for xen to start
> the guest (and dom0 is fixed with dom0_mem=1536M,max:1536M and
> ballooning is off)
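
As a rough check of the "off by one" guess, the page counts in the
"Over-allocation" message can be compared with the 1024 MB the guest is
configured with. The Python sketch below assumes 4 KiB pages; the helper name
is made up, and why the limit sits above the configured size is not
established anywhere in this thread.

```python
PAGE_SIZE = 4096  # bytes per page (assumption: 4 KiB x86 pages)


def mib_to_pages(mib):
    """Number of PAGE_SIZE pages needed to hold `mib` MiB."""
    return mib * 1024 * 1024 // PAGE_SIZE


guest_pages = mib_to_pages(1024)  # guest memory as stated above
limit = 262400                    # right-hand side of the log message
attempted = 262401                # left-hand side of the log message

slack_pages = limit - guest_pages
slack_mib = slack_pages * PAGE_SIZE // (1024 * 1024)

print("configured pages:", guest_pages)
print("limit above configured:", slack_pages, "pages =", slack_mib, "MiB")
print("attempt above limit:", attempted - limit, "page(s)")
```

Under these assumptions the limit is 256 pages (1 MiB) above the configured
1024 MB, and the failing allocation asks for exactly one page more than that
limit, which fits the off-by-one impression but does not say where the extra
page comes from.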



Hmm it seems my thread was kind of hijacked and i was dropped from the CC.

I had some time and bisected the issue and it resulted in:

5a3ce8f85e7e7bdd339d259daa19f6bc5cb4735f is the first bad commit
commit 5a3ce8f85e7e7bdd339d259daa19f6bc5cb4735f
Author: Jan Beulich
Date:   Wed Oct 21 10:56:31 2015 +0200

    x86/shadow: drop stray name tags from sh_{guest_get,map}_eff_l1e()

    They (as a now being removed comment validly says) depend only on Xen's
    number of page table levels, and hence their tags didn't serve any
    useful purpose (there could only ever be one instance in a single
    binary, even back in the x86-32 days).

    Further conditionalize the inclusion of PV-specific hook pointers, at
    once making sure that PV guests can't ever get other than 4-level mode
    enabled for them.

    For consistency reasons shadow_{write,cmpxchg}_guest_entry() also get
    moved next to the other PV-only actors, allowing them to become static
    just like the $subject ones do.

    Signed-off-by: Jan Beulich
    Acked-by: Tim Deegan

:04 04 0c2e3475f81547f934a5960d9f1ac4849707d4ed f17f5ff17ca50d6ab908afe9a2d8555d954d3d0a M  xen



--
Sander




--
Sander




Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-08-30 Thread Wei Liu
Could you please use xl -vvv create to create the guest and collect the
output?

Wei.



Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-08-26 Thread Håkon Alstadheim
Den 25. aug. 2016 23:18, skrev li...@eikelenboom.it:
> On 2016-08-25 22:34, Doug Goldstein wrote:
>> On 8/25/16 4:21 PM, li...@eikelenboom.it wrote:
>>> Today i tried to switch some of my HVM guests (qemu-xen) from
>>> booting of
>>> a kernel *inside* the guest, to a dom0 supplied kernel, which is
>>> described as "Direct Kernel Boot" here:
>>> https://xenbits.xen.org/docs/unstable/man/xl.cfg.5.html :
>>>
>>> Direct Kernel Boot
>>>
>>> Direct kernel boot allows booting directly from a kernel and initrd
>>> stored in the host physical
>>> machine OS, allowing command line arguments to be passed directly.
>>> PV guest direct kernel boot
>>> is supported. HVM guest direct kernel boot is supported with
>>> limitation (it's supported when
>>> using qemu-xen and default BIOS 'seabios'; not supported in case of
>>> stubdom-dm and old rombios.)
>>>
>>> kernel="PATHNAME"Load the specified file as the kernel image.
>>> ramdisk="PATHNAME"   Load the specified file as the ramdisk.
>>>
>>> But qemu fails to start, output appended below.
>>>
>>> I tested with:
>>> - current Xen-unstable, which fails.
>>> - xen-stable-4.7.0 release, which fails.
>>> - xen-stable-4.6.0 release, works fine.
>>
>> Can you include the logs from xl dmesg around that time frame as well?
>
> Ah i thought there wasn't any, but didn't check thoroughly or wasn't
> there
> since the release builds are non-debug by default.
>
> However, back on xen-unstable:
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: CPU
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: PIC
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: IOAPIC
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: LAPIC
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: LAPIC_REGS
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: PCI_IRQ
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: ISA_IRQ
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: PCI_LINK
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: PIT
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: RTC
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: HPET
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: PMTIMER
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: MTRR
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: VIRIDIAN_DOMAIN
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: CPU_XSAVE
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: VIRIDIAN_VCPU
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: VMCE_VCPU
> (XEN) [2016-08-25 21:09:15.172] HVM19 save: TSC_ADJUST
> (XEN) [2016-08-25 21:09:15.172] HVM19 restore: CPU 0
> (XEN) [2016-08-25 21:09:16.126] d0v1 Over-allocation for domain 19: 262401 > 262400
> (XEN) [2016-08-25 21:09:16.126] memory.c:213:d0v1 Could not allocate order=0 extent: id=19 memflags=0 (192 of 512)
>
> Hmm, some off-by-one issue?
>
>
>> Just wondering how much RAM your domain is defined with as well?
>
> 1024 MB; there is more than enough unallocated memory for Xen to start
> the guest (and dom0 is fixed with dom0_mem=1536M,max:1536M and
> ballooning is off).
>
> -- 
> Sander
>
>
>

I've got the same issue, and reported it on xen-users some time ago. I never
caught on that an internal/external kernel would trigger it. I'll just
paste my entire message from xen-users below:
--

I have been trying for some time now to upgrade from Xen 4.6.* to 4.7,
trying several different dom0 kernel versions and jiggling the xl.cfg
files, all to no avail.

I am unable to launch most of my guests under 4.7, though they run fine
under 4.6 (except for some USB/PCI-passthrough-related issues). As
seen from the device-model log below, qemu claims it is unable to
allocate RAM: "qemu: hardware error: xen: failed to populate ram at
28005", but I have plenty of RAM available, and this same VM (and many
more) launch fine under 4.6.*

I admit I am a rank amateur at this, so my config is probably pretty
weird, possibly leading to a set-up that nobody knowledgeable would run.
If somebody can give me a hint on how to work around this issue I'll
happily test patches and provide logs.

Example VM which does not start under 4.7:

--xl.cfg for media.hvm (I pass PCI passthrough for the USB card on the
command line. Works OK)
name = "media.hvm"
builder = "hvm"
xen_platform_pci = '1'
pvh=1
memory = 7168
mmio_hole=3072
vcpus = 6
cap=600
cpus_soft="node:0"
cpu_weight=6144
device_model_version="qemu-xen"
serial = 'pty'
disk = [ 'vdev=xvda, format=raw, target=/dev/system/media-backend'
,'vdev=xvdb, format=raw, target=/dev/system/media-backend-swap'
,'vdev=xvdd, format=raw, target=/dev/system/apub'
,'vdev=xvde, format=raw, target=/dev/system/apub1'
,'vdev=xvdf, format=raw, target=/dev/system/apub2'
,'vdev=xvdg, format=raw, target=/dev/system/apub3'
,'vdev=xvdh, format=raw, target=/dev/system/apub4'
,'vdev=xvdi, format=raw, target=/dev/system/apub5'
,'vdev=xvdj, format=raw, target=/dev/system/apub6'
,'vdev=xvdk, format=raw, target=/dev/system/apub7' ]
kernel = "/etc/xen/media-boot/vmlinuz-4.1.12-gentoo"
extra = 

Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-08-25 Thread linux

On 2016-08-25 22:34, Doug Goldstein wrote:

> On 8/25/16 4:21 PM, li...@eikelenboom.it wrote:
>> Today i tried to switch some of my HVM guests (qemu-xen) from booting of
>> a kernel *inside* the guest, to a dom0 supplied kernel, which is
>> described as "Direct Kernel Boot" here:
>> https://xenbits.xen.org/docs/unstable/man/xl.cfg.5.html :
>>
>> Direct Kernel Boot
>>
>> Direct kernel boot allows booting directly from a kernel and initrd
>> stored in the host physical machine OS, allowing command line arguments
>> to be passed directly. PV guest direct kernel boot is supported. HVM
>> guest direct kernel boot is supported with limitation (it's supported
>> when using qemu-xen and default BIOS 'seabios'; not supported in case of
>> stubdom-dm and old rombios.)
>>
>> kernel="PATHNAME"    Load the specified file as the kernel image.
>> ramdisk="PATHNAME"   Load the specified file as the ramdisk.
>>
>> But qemu fails to start, output appended below.
>>
>> I tested with:
>> - current Xen-unstable, which fails.
>> - xen-stable-4.7.0 release, which fails.
>> - xen-stable-4.6.0 release, works fine.
>
> Can you include the logs from xl dmesg around that time frame as well?


Ah, I thought there weren't any, but I didn't check thoroughly, or they
weren't there since the release builds are non-debug by default.

However, back on xen-unstable:
(XEN) [2016-08-25 21:09:15.172] HVM19 save: CPU
(XEN) [2016-08-25 21:09:15.172] HVM19 save: PIC
(XEN) [2016-08-25 21:09:15.172] HVM19 save: IOAPIC
(XEN) [2016-08-25 21:09:15.172] HVM19 save: LAPIC
(XEN) [2016-08-25 21:09:15.172] HVM19 save: LAPIC_REGS
(XEN) [2016-08-25 21:09:15.172] HVM19 save: PCI_IRQ
(XEN) [2016-08-25 21:09:15.172] HVM19 save: ISA_IRQ
(XEN) [2016-08-25 21:09:15.172] HVM19 save: PCI_LINK
(XEN) [2016-08-25 21:09:15.172] HVM19 save: PIT
(XEN) [2016-08-25 21:09:15.172] HVM19 save: RTC
(XEN) [2016-08-25 21:09:15.172] HVM19 save: HPET
(XEN) [2016-08-25 21:09:15.172] HVM19 save: PMTIMER
(XEN) [2016-08-25 21:09:15.172] HVM19 save: MTRR
(XEN) [2016-08-25 21:09:15.172] HVM19 save: VIRIDIAN_DOMAIN
(XEN) [2016-08-25 21:09:15.172] HVM19 save: CPU_XSAVE
(XEN) [2016-08-25 21:09:15.172] HVM19 save: VIRIDIAN_VCPU
(XEN) [2016-08-25 21:09:15.172] HVM19 save: VMCE_VCPU
(XEN) [2016-08-25 21:09:15.172] HVM19 save: TSC_ADJUST
(XEN) [2016-08-25 21:09:15.172] HVM19 restore: CPU 0
(XEN) [2016-08-25 21:09:16.126] d0v1 Over-allocation for domain 19: 262401 > 262400
(XEN) [2016-08-25 21:09:16.126] memory.c:213:d0v1 Could not allocate order=0 extent: id=19 memflags=0 (192 of 512)


Hmm, some off-by-one issue?



> Just wondering how much RAM your domain is defined with as well?


1024 MB; there is more than enough unallocated memory for Xen to start
the guest (and dom0 is fixed with dom0_mem=1536M,max:1536M and
ballooning is off).
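
For reference, the page counts in the "Over-allocation" line can be put next
to the configured guest memory with a small sketch like the one below. It is
only a rough check, not anything taken from the Xen sources: the 4 KiB page
size is an assumption (the usual x86 value), and the helper name
pages_to_mib is made up for illustration.

```python
# Rough arithmetic on the numbers from the xl dmesg output above.
# Assumption: the counts in the "Over-allocation" message are 4 KiB pages.
PAGE_SIZE_KIB = 4


def pages_to_mib(pages: int) -> float:
    """Convert a page count to MiB (hypothetical helper, for illustration)."""
    return pages * PAGE_SIZE_KIB / 1024


limit_pages = 262400      # right-hand side of "262401 > 262400"
attempted_pages = 262401  # left-hand side of the same message
configured_mib = 1024     # guest memory stated in this mail

limit_mib = pages_to_mib(limit_pages)
print(f"limit: {limit_mib} MiB")
print(f"limit minus configured memory: {limit_mib - configured_mib} MiB")
print(f"pages over the limit: {attempted_pages - limit_pages}")
```

Under that assumption the limit works out to 1025 MiB, i.e. 1 MiB above the
configured 1024, and the failing allocation exceeded it by a single page. The
log does not say how many further pages qemu would have gone on to request.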


--
Sander





___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-08-25 Thread Doug Goldstein
On 8/25/16 4:21 PM, li...@eikelenboom.it wrote:
> Today i tried to switch some of my HVM guests (qemu-xen) from booting of
> a kernel *inside* the guest, to a dom0 supplied kernel, which is
> described as "Direct Kernel Boot" here:
> https://xenbits.xen.org/docs/unstable/man/xl.cfg.5.html :
> 
> Direct Kernel Boot
> 
> Direct kernel boot allows booting directly from a kernel and initrd
> stored in the host physical machine OS, allowing command line arguments
> to be passed directly. PV guest direct kernel boot is supported. HVM
> guest direct kernel boot is supported with limitation (it's supported
> when using qemu-xen and default BIOS 'seabios'; not supported in case of
> stubdom-dm and old rombios.)
> 
> kernel="PATHNAME"    Load the specified file as the kernel image.
> ramdisk="PATHNAME"   Load the specified file as the ramdisk.
> 
> But qemu fails to start, output appended below.
> 
> I tested with:
> - current Xen-unstable, which fails.
> - xen-stable-4.7.0 release, which fails.
> - xen-stable-4.6.0 release, works fine.

Can you include the logs from xl dmesg around that time frame as well?
Just wondering how much RAM your domain is defined with as well?

-- 
Doug Goldstein





[Xen-devel] Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.

2016-08-25 Thread linux
Today i tried to switch some of my HVM guests (qemu-xen) from booting of 
a kernel *inside* the guest, to a dom0 supplied kernel, which is 
described as "Direct Kernel Boot" here: 
https://xenbits.xen.org/docs/unstable/man/xl.cfg.5.html :


Direct Kernel Boot

Direct kernel boot allows booting directly from a kernel and initrd
stored in the host physical machine OS, allowing command line arguments
to be passed directly. PV guest direct kernel boot is supported. HVM
guest direct kernel boot is supported with limitation (it's supported
when using qemu-xen and default BIOS 'seabios'; not supported in case of
stubdom-dm and old rombios.)


kernel="PATHNAME"    Load the specified file as the kernel image.
ramdisk="PATHNAME"   Load the specified file as the ramdisk.

But qemu fails to start, output appended below.

I tested with:
- current Xen-unstable, which fails.
- xen-stable-4.7.0 release, which fails.
- xen-stable-4.6.0 release, works fine.

So it's a regression somewhere between 4.6.0 and 4.7.0, but hopefully 
someone has a hunch before trying to do a whole bisect between those two 
releases.


--
Sander

From the qemu log:

qemu: hardware error: xen: failed to populate ram at 4005
CPU #0:
EAX= EBX= ECX= EDX=0663
ESI= EDI= EBP= ESP=
EIP=fff0 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =   9300
CS =f000   9b00
SS =   9300
DS =   9300
FS =   9300
GS =   9300
LDT=   8200
TR =   8b00
GDT=  
IDT=  
CR0=6010 CR2= CR3= CR4=
DR0= DR1= DR2= DR3=
DR6=0ff0 DR7=0400
EFER=
FCW=037f FSW= [ST=0] FTW=00 MXCSR=1f80
FPR0=  FPR1= 
FPR2=  FPR3= 
FPR4=  FPR5= 
FPR6=  FPR7= 
XMM00= 
XMM01=
XMM02= 
XMM03=
XMM04= 
XMM05=
XMM06= 
XMM07=

CPU #1:
EAX= EBX= ECX= EDX=0663
ESI= EDI= EBP= ESP=
EIP=fff0 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =   9300
CS =f000   9b00
SS =   9300
DS =   9300
FS =   9300
GS =   9300
LDT=   8200
TR =   8b00
GDT=  
IDT=  
CR0=6010 CR2= CR3= CR4=
DR0= DR1= DR2= DR3=
DR6=0ff0 DR7=0400
EFER=
FCW=037f FSW= [ST=0] FTW=00 MXCSR=1f80
FPR0=  FPR1= 
FPR2=  FPR3= 
FPR4=  FPR5= 
FPR6=  FPR7= 
XMM00= 
XMM01=
XMM02= 
XMM03=
XMM04= 
XMM05=
XMM06= 
XMM07=

CPU #2:
EAX= EBX= ECX= EDX=0663
ESI= EDI= EBP= ESP=
EIP=fff0 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =   9300
CS =f000   9b00
SS =   9300
DS =   9300
FS =   9300
GS =   9300
LDT=   8200
TR =   8b00
GDT=  
IDT=  
CR0=6010 CR2= CR3= CR4=
DR0= DR1= DR2= DR3=
DR6=0ff0 DR7=0400
EFER=
FCW=037f FSW= [ST=0] FTW=00 MXCSR=1f80
FPR0=  FPR1= 
FPR2=  FPR3= 
FPR4=  FPR5= 
FPR6=  FPR7= 
XMM00= 
XMM01=
XMM02= 
XMM03=
XMM04= 
XMM05=
XMM06= 
XMM07=

CPU #3: