Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-13 Thread Julien Grall

Hello,

On 13/06/16 01:55, Chenxiao Zhao wrote:



On 6/12/2016 11:31 PM, Julien Grall wrote:

On 12/06/2016 10:46, Chenxiao Zhao wrote:

I finally got save/restore working on arm64, but it only works when I
assign a single vCPU to the VM. If I set vcpus=4 in the config file, the
restored VM does not work properly.


Can you describe what you mean by "does not work properly"? What are the
symptoms?


After restoring a VM with more than one vCPU, the VM stays in the "b" (blocked) state.


This happens if all the vCPUs of the guest are waiting on an event. For
instance, if the guest is executing the WFI instruction, the vCPU will
be blocked until an interrupt comes in.


I would not worry about this.


[   32.530490] Xen: initializing cpu0
[   32.530490] xen:grant_table: Grant tables using version 1 layout
[   32.531034] PM: noirq restore of devices complete after 0.382 msecs
[   32.531382] PM: early restore of devices complete after 0.300 msecs
[   32.531430] Xen: initializing cpu1
[   32.569028] PM: restore of devices complete after 24.663 msecs
[   32.569304] Restarting tasks ...
[   32.569903] systemd-journal[800]: undefined instruction: pc=007fa37dd4c8
[   32.569975] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.571530] done.
[   32.571631] systemd[1]: undefined instruction: pc=007f8a9ea4c8
[   32.571650] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.573527] auditd[1365]: undefined instruction: pc=007f8aca24c8
[   32.573553] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.636573] systemd-cgroups[2210]: undefined instruction: pc=007f99ad14c8
[   32.636633] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.636726] audit: *NO* daemon at audit_pid=1365
[   32.636741] audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=320
[   32.636755] audit: auditd disappeared
[   32.638545] systemd-logind[1387]: undefined instruction: pc=007f86e5b4c8
[   32.638594] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)


[...]


[   32.638673] audit: type=1701 audit(68.167:214): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:systemd_logind_t:s0 pid=1387 comm="systemd-logind" exe="/usr/lib/systemd/systemd-logind" sig=4
[   32.647972] systemd-cgroups[2211]: undefined instruction: pc=007fa7f414c8
[   32.648017] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.648087] audit: type=1701 audit(68.177:215): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:init_t:s0 pid=2211 comm="systemd-cgroups" exe="/usr/lib/systemd/systemd-cgroups-agent" sig=4
[   61.401838] do_undefinstr: 8 callbacks suppressed
[   61.401882] crond[1550]: undefined instruction: pc=007f8d15d4c8
[   61.401903] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)


[...]



Also, I would start by debugging with 2 vCPUs and then increasing the
number step by step.


It's the same issue when restoring the VM with more than one vCPU. What
I see is the guest reporting "undefined instruction" with a random PC
that depends on the save point.


My point was that it is easier to debug with 2 vCPUs than 4 vCPUs. There 
is less concurrency involved.


The PC is the program counter of the application, which might be fully 
randomized.




Can you advise how I would start debugging this issue?


The undefined instructions are always the same in your log (d53be04f).
This is the encoding for "mrs x15, cntvct_el0". This register is only
accessible from EL0 if CNTKCTL_EL1.EL0VCTEN is set.


I guess that this register has not been saved/restored correctly.
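
For reference, a minimal EL0 check (my sketch, not code from this
thread): it reads the virtual counter the same way the guest's
userspace/vDSO does, so if CNTKCTL_EL1.EL0VCTEN is clear after the
restore it should die with the same "undefined instruction" (SIGILL)
as the processes in your log. Build it with gcc inside the aarch64
guest, run it before the save and again after the restore:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint64_t cntvct;

        /* d53be04f is "mrs x15, cntvct_el0"; the compiler may pick a
         * different Xn here, but the access (and the trap) is the same. */
        __asm__ volatile("mrs %0, cntvct_el0" : "=r"(cntvct));
        printf("CNTVCT_EL0 = %llu\n", (unsigned long long)cntvct);
        return 0;
    }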

Regards,

--
Julien Grall



Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-12 Thread Chenxiao Zhao



On 6/12/2016 11:31 PM, Julien Grall wrote:

On 12/06/2016 10:46, Chenxiao Zhao wrote:

Hi all,


Hello,


I finally got save/restore working on arm64, but it only works when I
assign a single vCPU to the VM. If I set vcpus=4 in the config file, the
restored VM does not work properly.


Can you describe what you mean by "does not work properly"? What are the
symptoms?


After restoring a VM with more than one vCPU, the VM stays in the "b" (blocked) state.

I'm running CentOS in the guest; the console log after restore is listed below.

[   32.530490] Xen: initializing cpu0
[   32.530490] xen:grant_table: Grant tables using version 1 layout
[   32.531034] PM: noirq restore of devices complete after 0.382 msecs
[   32.531382] PM: early restore of devices complete after 0.300 msecs
[   32.531430] Xen: initializing cpu1
[   32.569028] PM: restore of devices complete after 24.663 msecs
[   32.569304] Restarting tasks ...
[   32.569903] systemd-journal[800]: undefined instruction: pc=007fa37dd4c8
[   32.569975] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.571530] done.
[   32.571631] systemd[1]: undefined instruction: pc=007f8a9ea4c8
[   32.571650] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.573527] auditd[1365]: undefined instruction: pc=007f8aca24c8
[   32.573553] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.636573] systemd-cgroups[2210]: undefined instruction: pc=007f99ad14c8
[   32.636633] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.636726] audit: *NO* daemon at audit_pid=1365
[   32.636741] audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=320
[   32.636755] audit: auditd disappeared
[   32.638545] systemd-logind[1387]: undefined instruction: pc=007f86e5b4c8
[   32.638594] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.638673] audit: type=1701 audit(68.167:214): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:systemd_logind_t:s0 pid=1387 comm="systemd-logind" exe="/usr/lib/systemd/systemd-logind" sig=4
[   32.647972] systemd-cgroups[2211]: undefined instruction: pc=007fa7f414c8
[   32.648017] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.648087] audit: type=1701 audit(68.177:215): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:init_t:s0 pid=2211 comm="systemd-cgroups" exe="/usr/lib/systemd/systemd-cgroups-agent" sig=4
[   61.401838] do_undefinstr: 8 callbacks suppressed
[   61.401882] crond[1550]: undefined instruction: pc=007f8d15d4c8
[   61.401903] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   61.402077] audit: type=1701 audit(96.947:218): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 pid=1550 comm="crond" exe="/usr/sbin/crond" sig=4
[   61.407024] dbus-daemon[1390]: undefined instruction: pc=007f87fae4c8
[   61.407064] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   61.407212] audit: type=1701 audit(96.947:219): auid=4294967295 uid=81 gid=81 ses=4294967295 subj=system_u:system_r:system_dbusd_t:s0-s0:c0.c1023 pid=1390 comm="dbus-daemon" exe="/usr/bin/dbus-daemon" sig=4
[   61.408311] systemd-cgroups-agent[2214]: Failed to process message [type=error sender=org.freedesktop.DBus path=n/a interface=n/a member=n/a signature=s]: Connection timed out
[   61.416815] systemd-cgroups-agent[2216]: Failed to get D-Bus connection: Connection refused
[   61.421499] systemd-cgroups-agent[2215]: Failed to get D-Bus connection: Connection refused
[   61.429413] systemd-cgroups-agent[2217]: Failed to get D-Bus connection: Connection refused
[   61.434301] systemd-cgroups-agent[2218]: Failed to get D-Bus connection: Connection refused
[   61.435016] systemd-cgroups-agent[2219]: Failed to get D-Bus connection: Connection refused

[  110.095570] audit: type=1701 audit(145.637:220): auid=0 uid=0 gid=0 ses=1 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=2189 comm="bash" exe="/usr/bin/bash" sig=4
[  110.098120] audit: type=1104 audit(145.637:221): pid=1602 uid=0 auid=0 ses=1 subj=system_u:system_r:local_login_t:s0-s0:c0.c1023 msg='op=PAM:setcred grantors=pam_securetty,pam_unix acct="root" exe="/usr/bin/login" hostname=? addr=? terminal=hvc0 res=success'
[  110.102730] audit: type=1106 audit(145.637:222): pid=1602 uid=0 auid=0 ses=1 subj=system_u:system_r:local_login_t:s0-s0:c0.c1023 msg='op=PAM:session_close grantors=? acct="root" exe="/usr/bin/login" hostname=? addr=? terminal=hvc0 res=failed'
[  110.112341] systemd-cgroups-agent[2220]: Failed to get D-Bus connection: Connection refused



Also, I would start by debugging with 2 vCPUs and then increasing the
number step by step.


It's the same issue when restoring the VM with more than one vCPU. What
I see is the guest reporting "undefined instruction" with a random PC
that depends on the save point.


Can you advise how I would start debugging this issue?

Thanks.



Regards,




Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-12 Thread Julien Grall

On 12/06/2016 10:46, Chenxiao Zhao wrote:

Hi all,


Hello,


I finally got save/restore working on arm64, but it only works when I
assign a single vCPU to the VM. If I set vcpus=4 in the config file, the
restored VM does not work properly.


Can you describe what you mean by "does not work properly"? What are the 
symptoms?


Also, I would start by debugging with 2 vCPUs and then increasing the 
number step by step.


Regards,

--
Julien Grall



Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-12 Thread Chenxiao Zhao



On 6/7/2016 9:17 AM, Chenxiao Zhao wrote:



On 6/6/2016 7:58 PM, Stefano Stabellini wrote:

On Fri, 3 Jun 2016, Chenxiao Zhao wrote:

On 6/3/2016 4:02 AM, Julien Grall wrote:

Hello,

First thing, the time in the mail headers seems to be wrong. Maybe
because of a wrong timezone?

I got 04/06/16 02:32; however, we are still on the 3rd in my timezone.

On 04/06/16 02:32, Chenxiao Zhao wrote:



On 6/3/2016 3:16 AM, Julien Grall wrote:

Hello,

On 03/06/16 18:05, Chenxiao Zhao wrote:

I finally found out that the problem is that the toolstack did not get
the correct p2m_size while sending all pages on save (it was always
zero). After I fixed that, the guest could be restored, but the guest
kernel caught handle_mm_fault().

Where do you think I should investigate: guest kernel hibernation
restore, or Xen?


The hibernation support for ARM64 has only been merged recently in the
kernel. Which kernel are you using?


Hi Julien,

I'm using a Linaro-ported Linux 4.1 kernel for hikey, from this link.

https://github.com/96boards/linux/tree/android-hikey-linaro-4.1

I also applied the following patches to make the kernel support
hibernation.


This looks like the wrong way to do it, as this series may require some
patches which were upstreamed beforehand.

Upstream Linux seems to support the hikey board [1]. Any reason not to
use it?


I tried a newer kernel, 4.4, but had no luck starting dom0 with Xen,
so I decided to stay on 4.1 for now.




[1] http://www.spinics.net/lists/arm-kernel/msg477769.html
[2] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01068.html



Also, what are the modifications you have made to support Xen
suspend/resume for ARM64?


I believe I posted my Xen modifications in the first mail of
this thread.


I mean in Linux. The patch from Ian Campbell does not have any kind of
support for ARM64.

For instance, arch/arm/xen/suspend.c needs to be built for ARM64, so I am
wondering if your kernel has support for hibernation...


Oh, yes, I almost forgot: I added this file to arch/arm64/xen/Makefile
so that it builds for arm64.




From my understanding, kernel hibernation causes the kernel to save
memory to disk (the swap partition). But during the guest save process,
hibernation for a domU does not make the guest save its memory to disk;
it is more like suspending all processes in the guest, while the Xen
toolstack actually saves the pages to a file. Am I correct?


You are using an older tree with a patch series based on a newer tree.

So I would recommend that you move to a newer tree. If that is not
possible, please test that hibernation works on bare metal.


I think suspend/resume in the guest is working, because I can use the
pause/unpause commands in the toolstack to suspend/resume the guest
without problems. I can also see the suspend/resume kernel messages on
the guest's console. The only problem is that it cannot resume from a
restore.


But can you still connect to the guest after resume, maybe over the
network?
If you cannot, then something is likely wrong.


Hi Stefano,

I can connect to the guest after resume via the Xen console. It responds
to the 'return' key, but I cannot run any other commands, e.g. ls or ps.
I think the guest is not 'fully' restored.





One thing that confused me is that the kernel's hibernation means the
guest kernel saves the memory state to disk and finally powers off the
VM; the guest also takes care of restoring the memory itself. But I do
not see Xen's save/restore working that way. So my question is: why does
it require hibernation (aka suspend to disk) instead of real suspend
(aka suspend to RAM and standby)?


Xen suspend/resume has nothing to do with guest suspend to RAM or guest
hibernation.

Xen suspend/resume is a way for the hypervisor to save to file the
entire state of the VM, including RAM and the state of any devices.
Guest suspend to RAM and guest hibernation are two guest driven
technologies to save the state of the operating system to RAM or to
disk. The only link between Xen suspend and guest suspend is that when
Xen issues a domain suspend, it notifies the guest of it so that it can
ease the process.  The code in Linux to support Xen suspend/resume is:

drivers/xen/manage.c:do_suspend

and makes use of some of the Linux internal hooks provided for
hibernations (see CONFIG_HIBERNATE_CALLBACKS). But that's just for
better integration with the rest of Linux: hibernation is NOT what is
happening.

I hope that this clarifies things a bit, I realize that it is confusing.



Thanks for your explanation. It is clear enough and matches my
understanding of the code. I think the problem might be caused by an
incompatibility between the ARM p2m and the Xen save/restore mechanism.
I'll take core dumps and compare the memory after save and restore; I
suppose the two dumps should be identical, but some pages already
differ. I'll let you know if I make some progress.

Regards.


Hi all,

I finally got save/restore working on arm64, but it only works when I
assign a single vCPU to the VM. If I set vcpus=4 in the config file, the
restored VM does not work properly.

Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-06 Thread Chenxiao Zhao



On 6/6/2016 7:58 PM, Stefano Stabellini wrote:

On Fri, 3 Jun 2016, Chenxiao Zhao wrote:

On 6/3/2016 4:02 AM, Julien Grall wrote:

Hello,

First thing, the time in the mail headers seems to be wrong. Maybe
because of a wrong timezone?

I got 04/06/16 02:32; however, we are still on the 3rd in my timezone.

On 04/06/16 02:32, Chenxiao Zhao wrote:



On 6/3/2016 3:16 AM, Julien Grall wrote:

Hello,

On 03/06/16 18:05, Chenxiao Zhao wrote:

I finally found out that the problem is that the toolstack did not get
the correct p2m_size while sending all pages on save (it was always
zero). After I fixed that, the guest could be restored, but the guest
kernel caught handle_mm_fault().

Where do you think I should investigate: guest kernel hibernation
restore, or Xen?


The hibernation support for ARM64 has only been merged recently in the
kernel. Which kernel are you using?


Hi Julien,

I'm using a Linaro-ported Linux 4.1 kernel for hikey, from this link.

https://github.com/96boards/linux/tree/android-hikey-linaro-4.1

I also applied the following patches to make the kernel support hibernation.


This looks like the wrong way to do it, as this series may require some
patches which were upstreamed beforehand.

Upstream Linux seems to support the hikey board [1]. Any reason not to
use it?


I tried a newer kernel, 4.4, but had no luck starting dom0 with Xen,
so I decided to stay on 4.1 for now.




[1] http://www.spinics.net/lists/arm-kernel/msg477769.html
[2] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01068.html



Also, what are the modifications you have made to support Xen
suspend/resume for ARM64?


I believe I posted my Xen modifications in the first mail of
this thread.


I mean in Linux. The patch from Ian Campbell does not have any kind of
support for ARM64.

For instance, arch/arm/xen/suspend.c needs to be built for ARM64, so I am
wondering if your kernel has support for hibernation...


Oh, yes, I almost forgot: I added this file to arch/arm64/xen/Makefile
so that it builds for arm64.




From my understanding, kernel hibernation causes the kernel to save
memory to disk (the swap partition). But during the guest save process,
hibernation for a domU does not make the guest save its memory to disk;
it is more like suspending all processes in the guest, while the Xen
toolstack actually saves the pages to a file. Am I correct?


You are using an older tree with a patch series based on a newer tree.

So I would recommend that you move to a newer tree. If that is not
possible, please test that hibernation works on bare metal.


I think suspend/resume in the guest is working, because I can use the
pause/unpause commands in the toolstack to suspend/resume the guest
without problems. I can also see the suspend/resume kernel messages on
the guest's console. The only problem is that it cannot resume from a
restore.


But can you still connect to the guest after resume, maybe over the network?
If you cannot, then something is likely wrong.


Hi Stefano,

I can connect to the guest after resume via the Xen console. It responds
to the 'return' key, but I cannot run any other commands, e.g. ls or ps.
I think the guest is not 'fully' restored.






One thing that confused me is that the kernel's hibernation means the
guest kernel saves the memory state to disk and finally powers off the
VM; the guest also takes care of restoring the memory itself. But I do
not see Xen's save/restore working that way. So my question is: why does
it require hibernation (aka suspend to disk) instead of real suspend
(aka suspend to RAM and standby)?


Xen suspend/resume has nothing to do with guest suspend to RAM or guest
hibernation.

Xen suspend/resume is a way for the hypervisor to save to file the
entire state of the VM, including RAM and the state of any devices.
Guest suspend to RAM and guest hibernation are two guest driven
technologies to save the state of the operating system to RAM or to
disk. The only link between Xen suspend and guest suspend is that when
Xen issues a domain suspend, it notifies the guest of it so that it can
ease the process.  The code in Linux to support Xen suspend/resume is:

drivers/xen/manage.c:do_suspend

and makes use of some of the Linux internal hooks provided for
hibernations (see CONFIG_HIBERNATE_CALLBACKS). But that's just for
better integration with the rest of Linux: hibernation is NOT what is
happening.

I hope that this clarifies things a bit, I realize that it is confusing.



Thanks for your explanation. It is clear enough and matches my
understanding of the code. I think the problem might be caused by an
incompatibility between the ARM p2m and the Xen save/restore mechanism.
I'll take core dumps and compare the memory after save and restore; I
suppose the two dumps should be identical, but some pages already
differ. I'll let you know if I make some progress.
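
In case it is useful, this is the kind of comparison I have in mind (a
rough sketch with hypothetical file names; the dumps would come from
"xl dump-core" before the save and after the restore, and this ignores
whatever header the dump format adds, so the page indexes are only
approximate):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define PAGE_SIZE 4096

    int main(int argc, char **argv)
    {
        FILE *a, *b;
        unsigned char pa[PAGE_SIZE], pb[PAGE_SIZE];
        unsigned long page = 0, diffs = 0;
        size_t ra, rb;

        if (argc != 3) {
            fprintf(stderr, "usage: %s dump-before dump-after\n", argv[0]);
            return 1;
        }
        a = fopen(argv[1], "rb");
        b = fopen(argv[2], "rb");
        if (!a || !b) { perror("fopen"); return 1; }

        for (;;) {
            ra = fread(pa, 1, PAGE_SIZE, a);
            rb = fread(pb, 1, PAGE_SIZE, b);
            if (ra == 0 && rb == 0)
                break;
            /* report every 4K page whose contents (or length) differ */
            if (ra != rb || memcmp(pa, pb, ra) != 0) {
                printf("page %lu differs\n", page);
                diffs++;
            }
            page++;
        }
        printf("%lu page(s) differ\n", diffs);
        fclose(a);
        fclose(b);
        return diffs != 0;
    }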


Regards.



Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-06 Thread Stefano Stabellini
On Fri, 3 Jun 2016, Chenxiao Zhao wrote:
> On 6/3/2016 4:02 AM, Julien Grall wrote:
> > Hello,
> > 
> > First thing, the time in the mail headers seems to be wrong. Maybe
> > because of a wrong timezone?
> > 
> > I got 04/06/16 02:32; however, we are still on the 3rd in my timezone.
> > 
> > On 04/06/16 02:32, Chenxiao Zhao wrote:
> > > 
> > > 
> > > On 6/3/2016 3:16 AM, Julien Grall wrote:
> > > > Hello,
> > > > 
> > > > On 03/06/16 18:05, Chenxiao Zhao wrote:
> > > > > I finally found out that the problem is that the toolstack did not
> > > > > get the correct p2m_size while sending all pages on save (it was
> > > > > always zero). After I fixed that, the guest could be restored, but
> > > > > the guest kernel caught handle_mm_fault().
> > > > > 
> > > > > Where do you think I should investigate: guest kernel hibernation
> > > > > restore, or Xen?
> > > > 
> > > > The hibernation support for ARM64 has only been merged recently in the
> > > > kernel. Which kernel are you using?
> > > 
> > > Hi Julien,
> > > 
> > > I'm using a Linaro-ported Linux 4.1 kernel for hikey, from this link.
> > > 
> > > https://github.com/96boards/linux/tree/android-hikey-linaro-4.1
> > > 
> > > I also applied the following patches to make the kernel support hibernation.
> > 
> > This looks like the wrong way to do it, as this series may require some
> > patches which were upstreamed beforehand.
> > 
> > Upstream Linux seems to support the hikey board [1]. Any reason not to
> > use it?
> 
> I tried a newer kernel, 4.4, but had no luck starting dom0 with Xen, so I
> decided to stay on 4.1 for now.
> 
> > 
> > > [1] http://www.spinics.net/lists/arm-kernel/msg477769.html
> > > [2] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01068.html
> > > 
> > > > 
> > > > Also, what are the modifications you have made to support Xen
> > > > suspend/resume for ARM64?
> > > 
> > > I believe I posted my Xen modifications in the first mail of
> > > this thread.
> > 
> > I mean in Linux. The patch from Ian Campbell does not have any kind of
> > support for ARM64.
> > 
> > For instance, arch/arm/xen/suspend.c needs to be built for ARM64, so I am
> > wondering if your kernel has support for hibernation...
> 
> Oh, yes, I almost forgot: I added this file to arch/arm64/xen/Makefile so
> that it builds for arm64.
> > 
> > > 
> > > From my understanding, kernel hibernation causes the kernel to save
> > > memory to disk (the swap partition). But during the guest save process,
> > > hibernation for a domU does not make the guest save its memory to disk;
> > > it is more like suspending all processes in the guest, while the Xen
> > > toolstack actually saves the pages to a file. Am I correct?
> > 
> > You are using an older tree with a patch series based on a newer tree.
> > 
> > So I would recommend that you move to a newer tree. If that is not possible,
> > please test that hibernation works on bare metal.
> 
> I think suspend/resume in the guest is working, because I can use the
> pause/unpause commands in the toolstack to suspend/resume the guest
> without problems. I can also see the suspend/resume kernel messages on
> the guest's console. The only problem is that it cannot resume from a
> restore.

But can you still connect to the guest after resume, maybe over the network?
If you cannot, then something is likely wrong.


> One thing that confused me is that the kernel's hibernation means the guest
> kernel saves the memory state to disk and finally powers off the VM; the
> guest also takes care of restoring the memory itself. But I do not see Xen's
> save/restore working that way. So my question is: why does it require
> hibernation (aka suspend to disk) instead of real suspend (aka suspend to
> RAM and standby)?

Xen suspend/resume has nothing to do with guest suspend to RAM or guest
hibernation.

Xen suspend/resume is a way for the hypervisor to save to file the
entire state of the VM, including RAM and the state of any devices.
Guest suspend to RAM and guest hibernation are two guest driven
technologies to save the state of the operating system to RAM or to
disk. The only link between Xen suspend and guest suspend is that when
Xen issues a domain suspend, it notifies the guest of it so that it can
ease the process.  The code in Linux to support Xen suspend/resume is:

drivers/xen/manage.c:do_suspend

and makes use of some of the Linux internal hooks provided for
hibernations (see CONFIG_HIBERNATE_CALLBACKS). But that's just for
better integration with the rest of Linux: hibernation is NOT what is
happening.

I hope that this clarifies things a bit, I realize that it is confusing.
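
For the archive: the toolstack side of this is what "xl save" drives. A
minimal sketch, assuming the Xen 4.7-era libxl API (an illustration, not
the actual xl code):

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <xentoollog.h>
    #include <libxl.h>

    int main(int argc, char **argv)
    {
        libxl_ctx *ctx = NULL;
        xentoollog_logger_stdiostream *lg;
        uint32_t domid;
        int fd, rc;

        if (argc != 3) {
            fprintf(stderr, "usage: %s <domid> <savefile>\n", argv[0]);
            return 1;
        }
        domid = strtoul(argv[1], NULL, 0);

        lg = xtl_createlogger_stdiostream(stderr, XTL_ERROR, 0);
        if (libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0,
                            (xentoollog_logger *)lg))
            return 1;

        fd = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); libxl_ctx_free(ctx); return 1; }

        /* Notifies the guest (the do_suspend path above) and then
         * streams the whole VM state -- RAM and device state -- to fd. */
        rc = libxl_domain_suspend(ctx, domid, fd, 0, NULL);
        fprintf(stderr, "libxl_domain_suspend: rc=%d\n", rc);

        libxl_ctx_free(ctx);
        xtl_logger_destroy((xentoollog_logger *)lg);
        return rc != 0;
    }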



Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-03 Thread Julien Grall

Hello,

On 04/06/16 03:37, Chenxiao Zhao wrote:



On 6/3/2016 4:02 AM, Julien Grall wrote:

Hello,

First thing, the time in the mail headers seems to be wrong. Maybe
because of a wrong timezone?

I got 04/06/16 02:32; however, we are still on the 3rd in my timezone.

On 04/06/16 02:32, Chenxiao Zhao wrote:



On 6/3/2016 3:16 AM, Julien Grall wrote:

Hello,

On 03/06/16 18:05, Chenxiao Zhao wrote:

I finally found out that the problem is that the toolstack did not get
the correct p2m_size while sending all pages on save (it was always
zero). After I fixed that, the guest could be restored, but the guest
kernel caught handle_mm_fault().

Where do you think I should investigate: guest kernel hibernation
restore, or Xen?


The hibernation support for ARM64 has only been merged recently in the
kernel. Which kernel are you using?


Hi Julien,

I'm using a Linaro-ported Linux 4.1 kernel for hikey, from this link.

https://github.com/96boards/linux/tree/android-hikey-linaro-4.1

I also applied the following patches to make the kernel support hibernation.


This looks like the wrong way to do it, as this series may require some
patches which were upstreamed beforehand.

Upstream Linux seems to support the hikey board [1]. Any reason not to
use it?


I tried a newer kernel, 4.4, but had no luck starting dom0 with Xen,
so I decided to stay on 4.1 for now.


The current upstream is 4.7-rc1, not 4.4.

However, the kernel for the guest does not require any support for your
board, so you can use upstream (i.e. linus/master).


[...]





From my understanding, kernel hibernation causes the kernel to save
memory to disk (the swap partition). But during the guest save process,
hibernation for a domU does not make the guest save its memory to disk;
it is more like suspending all processes in the guest, while the Xen
toolstack actually saves the pages to a file. Am I correct?


You are using an older tree with a patch series based on a newer tree.

So I would recommend that you move to a newer tree. If that is not
possible, please test that hibernation works on bare metal.


I think suspend/resume in the guest is working, because I can use the
pause/unpause commands in the toolstack to suspend/resume the guest
without problems. I can also see the suspend/resume kernel messages on
the guest's console. The only problem is that it cannot resume from a
restore.


The pause/unpause commands do not require any kind of cooperation from 
the kernel. They are only requests to the hypervisor to put the vCPUs 
to sleep or to wake them up.


You can look at the implementation of libxl_domain_{,un}pause.
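
As a tiny illustration (my sketch against the Xen 4.7-era libxl
signatures, not the xl code itself), the whole of "xl pause"/"xl
unpause" is essentially:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <xentoollog.h>
    #include <libxl.h>

    int main(int argc, char **argv)
    {
        libxl_ctx *ctx = NULL;
        xentoollog_logger_stdiostream *lg;
        uint32_t domid;

        if (argc != 2) {
            fprintf(stderr, "usage: %s <domid>\n", argv[0]);
            return 1;
        }
        domid = strtoul(argv[1], NULL, 0);

        lg = xtl_createlogger_stdiostream(stderr, XTL_ERROR, 0);
        if (libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0,
                            (xentoollog_logger *)lg))
            return 1;

        libxl_domain_pause(ctx, domid);   /* vCPUs stop being scheduled */
        sleep(1);
        libxl_domain_unpause(ctx, domid); /* vCPUs are runnable again */

        libxl_ctx_free(ctx);
        xtl_logger_destroy((xentoollog_logger *)lg);
        return 0;
    }

Nothing in that path touches the guest kernel, which is why
pause/unpause working says nothing about suspend/resume support inside
the guest.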


One thing that confused me is that the kernel's hibernation means the
guest kernel saves the memory state to disk and finally powers off the
VM; the guest also takes care of restoring the memory itself. But I do
not see Xen's save/restore working that way. So my question is: why does
it require hibernation (aka suspend to disk) instead of real suspend
(aka suspend to RAM and standby)?


I am not an expert in Xen suspend/resume. However, by looking at 
the code, Xen has a specific path to suspend (see drivers/xen/manage.c). 
I guess this code requires features which are only present when 
CONFIG_HIBERNATION is selected.


In any case, please use upstream Linux for development in the guest. 
If there is still a bug, then we know that it is not because you are 
using a 4.5-based patch series on a 4.1 kernel.


Regards,

--
Julien Grall



Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-03 Thread Chenxiao Zhao



On 6/3/2016 4:02 AM, Julien Grall wrote:

Hello,

First thing, the time in the mail headers seems to be wrong. Maybe
because of a wrong timezone?

I got 04/06/16 02:32; however, we are still on the 3rd in my timezone.

On 04/06/16 02:32, Chenxiao Zhao wrote:



On 6/3/2016 3:16 AM, Julien Grall wrote:

Hello,

On 03/06/16 18:05, Chenxiao Zhao wrote:

I finally found out that the problem is that the toolstack did not get
the correct p2m_size while sending all pages on save (it was always
zero). After I fixed that, the guest could be restored, but the guest
kernel caught handle_mm_fault().

Where do you think I should investigate: guest kernel hibernation
restore, or Xen?


The hibernation support for ARM64 has only been merged recently in the
kernel. Which kernel are you using?


Hi Julien,

I'm using a Linaro-ported Linux 4.1 kernel for hikey, from this link.

https://github.com/96boards/linux/tree/android-hikey-linaro-4.1

I also applied the following patches to make the kernel support hibernation.


This looks like the wrong way to do it, as this series may require some
patches which were upstreamed beforehand.

Upstream Linux seems to support the hikey board [1]. Any reason not to
use it?


I tried a newer kernel, 4.4, but had no luck starting dom0 with Xen,
so I decided to stay on 4.1 for now.





[1] http://www.spinics.net/lists/arm-kernel/msg477769.html
[2] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01068.html



Also, what are the modifications you have made to support Xen
suspend/resume for ARM64?


I believe I posted my Xen modifications in the first mail of
this thread.


I mean in Linux. The patch from Ian Campbell does not have any kind of
support for ARM64.

For instance, arch/arm/xen/suspend.c needs to be built for ARM64, so I am
wondering if your kernel has support for hibernation...


Oh, yes, I almost forgot: I added this file to arch/arm64/xen/Makefile 
so that it builds for arm64.




From my understanding, kernel hibernation causes the kernel to save
memory to disk (the swap partition). But during the guest save process,
hibernation for a domU does not make the guest save its memory to disk;
it is more like suspending all processes in the guest, while the Xen
toolstack actually saves the pages to a file. Am I correct?


You are using an older tree with a patch series based on a newer tree.

So I would recommend that you move to a newer tree. If that is not possible,
please test that hibernation works on bare metal.


I think suspend/resume in the guest is working, because I can use the 
pause/unpause commands in the toolstack to suspend/resume the guest 
without problems. I can also see the suspend/resume kernel messages on 
the guest's console. The only problem is that it cannot resume from a 
restore.


One thing that confused me is that the kernel's hibernation means the 
guest kernel saves the memory state to disk and finally powers off the 
VM; the guest also takes care of restoring the memory itself. But I do 
not see Xen's save/restore working that way. So my question is: why does 
it require hibernation (aka suspend to disk) instead of real suspend 
(aka suspend to RAM and standby)?





Regards,

[1] https://lists.96boards.org/pipermail/dev/2016-May/000933.html





Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-03 Thread Julien Grall

Hello,

First thing, the time in the mail headers seems to be wrong. Maybe 
because of a wrong timezone?


I got 04/06/16 02:32; however, we are still on the 3rd in my timezone.

On 04/06/16 02:32, Chenxiao Zhao wrote:



On 6/3/2016 3:16 AM, Julien Grall wrote:

Hello,

On 03/06/16 18:05, Chenxiao Zhao wrote:

I finally found out that the problem is that the toolstack did not get
the correct p2m_size while sending all pages on save (it was always
zero). After I fixed that, the guest could be restored, but the guest
kernel caught handle_mm_fault().

Where do you think I should investigate: guest kernel hibernation
restore, or Xen?


The hibernation support for ARM64 has only been merged recently in the
kernel. Which kernel are you using?


Hi Julien,

I'm using a Linaro-ported Linux 4.1 kernel for hikey, from this link.

https://github.com/96boards/linux/tree/android-hikey-linaro-4.1

I also applied the following patches to make the kernel support hibernation.


This looks like the wrong way to do it, as this series may require some 
patches which were upstreamed beforehand.

Upstream Linux seems to support the hikey board [1]. Any reason not to 
use it?



[1] http://www.spinics.net/lists/arm-kernel/msg477769.html
[2] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01068.html



Also, what are the modifications you have made to support Xen
suspend/resume for ARM64?


I believe I posted my Xen modifications in the first mail of
this thread.


I mean in Linux. The patch from Ian Campbell does not have any kind of 
support for ARM64.


For instance, arch/arm/xen/suspend.c needs to be built for ARM64, so I am 
wondering if your kernel has support for hibernation...




From my understanding, kernel hibernation causes the kernel to save
memory to disk (the swap partition). But during the guest save process,
hibernation for a domU does not make the guest save its memory to disk;
it is more like suspending all processes in the guest, while the Xen
toolstack actually saves the pages to a file. Am I correct?


You are using an older tree with a patch series based on a newer tree.

So I would recommend that you move to a newer tree. If that is not possible, 
please test that hibernation works on bare metal.


Regards,

[1] https://lists.96boards.org/pipermail/dev/2016-May/000933.html

--
Julien Grall



Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-03 Thread Chenxiao Zhao



On 6/3/2016 3:16 AM, Julien Grall wrote:

Hello,

On 03/06/16 18:05, Chenxiao Zhao wrote:

I finally found out that the problem is that the toolstack did not get
the correct p2m_size while sending all pages on save (it was always
zero). After I fixed that, the guest could be restored, but the guest
kernel caught handle_mm_fault().

Where do you think I should investigate: guest kernel hibernation
restore, or Xen?


The hibernation support for ARM64 has only been merged recently in the
kernel. Which kernel are you using?


Hi Julien,

I'm using a Linaro-ported Linux 4.1 kernel for hikey, from this link.

https://github.com/96boards/linux/tree/android-hikey-linaro-4.1

I also applied the following patches to make the kernel support hibernation.

[1] http://www.spinics.net/lists/arm-kernel/msg477769.html
[2] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01068.html



Also, what are the modifications you have made to support Xen
suspend/resume for ARM64?


I believe I posted my Xen modifications in the first mail of 
this thread.


From my understanding, kernel hibernation causes the kernel to save 
memory to disk (the swap partition). But during the guest save process, 
hibernation for a domU does not make the guest save its memory to disk; 
it is more like suspending all processes in the guest, while the Xen 
toolstack actually saves the pages to a file. Am I correct?


Looking forward to your advice.

Thanks.



Regards,





Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-03 Thread Julien Grall

Hello,

On 03/06/16 18:05, Chenxiao Zhao wrote:

I finally found out that the problem is that the toolstack did not get
the correct p2m_size while sending all pages on save (it was always
zero). After I fixed that, the guest could be restored, but the guest
kernel caught handle_mm_fault().

Where do you think I should investigate: guest kernel hibernation
restore, or Xen?


The hibernation support for ARM64 has only been merged recently in the 
kernel. Which kernel are you using?


Also, what are the modifications you have made to support Xen 
suspend/resume for ARM64?


Regards,

--
Julien Grall



Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-02 Thread Chenxiao Zhao



On 6/2/2016 5:29 AM, Julien Grall wrote:

Hello,

On 01/06/16 01:28, Chenxiao Zhao wrote:



On 5/30/2016 4:40 AM, Stefano Stabellini wrote:

On Fri, 27 May 2016, Chenxiao Zhao wrote:

Hi,

My board is a Hikey, which has an octa-core ARM Cortex-A53. I have
applied patches [1] to try VM save/restore on ARM.
These patches originally did not work on arm64, so I have made some
changes based on patch set [2].


Hello Chenxiao,

thanks for your interest in Xen on ARM save/restore.


Hi Stefano,

Thanks for your advice.

I found a possible reason for the restore failure: Xen always fails in
p2m_lookup for the guest domain.


Who is calling p2m_lookup?


It is called by handle_hvm_params (tools/libxc/xc_sr_restore_arm.c) while 
restoring HVM_PARAM_STORE_PFN.






I called dump_p2m_lookup() in p2m_lookup() and got the output below:

(XEN) dom1 IPA 0x39001000


Looking at the memory layout (see include/public/arch-arm.h) 0x39001000
is part of the magic region. It contains pages for the console,
xenstore, memaccess (see the list in tools/libxc/xc_dom_arm.c).


yes, I also noticed that.



0x39001000 is the base address of the xenstore page.


(XEN) P2M @ 000801e7ce80 mfn:0x79f3a
(XEN) Using concatenated root table 0
(XEN) 1ST[0x0] = 0x004079f3c77f
(XEN) 2ND[0x1c8] = 0x

My questions are:

1. Who is responsible for restoring the p2m table, Xen or the guest kernel?


AFAIK, the toolstack restores most of the memory.


2. After restore, the VM always gets zero memory space, but there is no


What do you mean by "zero memory space"?


I mean xl does not assign any memory to the restored VM.

Name                                    ID   Mem VCPUs      State   Time(s)
Domain-0                                 0  1024     8     r-----      76.3
guest                                    2     0     1     r-----       1.9






error reported during the restore process. Is the memory requested by the
guest kernel, or should it be allocated early by the hypervisor?


Bear in mind that the patch series you are working on is an RFC and
ARM64 was not supported. There might be some issues in the save/restore
path.

For instance, xc_clear_domain_page (tools/libxc/xc_sr_restore_arm.c) may
return an error if the memory is not mapped, but the caller does not
check the return value.

Regards,


I finally found out that the problem is that the toolstack did not get
the correct p2m_size while sending all pages on save (it was always
zero). After I fixed that, the guest could be restored, but the guest
kernel caught handle_mm_fault().

Where do you think I should investigate: guest kernel hibernation
restore, or Xen?


Best regards.







Re: [Xen-devel] questions of vm save/restore on arm64

2016-06-02 Thread Julien Grall

Hello,

On 01/06/16 01:28, Chenxiao Zhao wrote:



On 5/30/2016 4:40 AM, Stefano Stabellini wrote:

On Fri, 27 May 2016, Chenxiao Zhao wrote:

Hi,

My board is Hikey on which have octa-core of arm cortex-a53. I have
applied patches [1] to try vm save/restore on arm.
These patches originally do not working on arm64. I have made some
changes based on patch set [2].


Hello Chenxiao,

thanks for your interest in Xen on ARM save/restore.


Hi Stefano,

Thanks for your advice.

I found a possible reason for the restore failure: Xen always fails in
p2m_lookup for the guest domain.


Who is calling p2m_lookup?



I called dump_p2m_lookup() in p2m_lookup() and got the output below:

(XEN) dom1 IPA 0x39001000


Looking at the memory layout (see include/public/arch-arm.h) 0x39001000 
is part of the magic region. It contains pages for the console, 
xenstore, memaccess (see the list in tools/libxc/xc_dom_arm.c).


0x39001000 is the base address of the xenstore page.
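
If you want to cross-check this from dom0, you can query the magic PFNs
with libxc (a sketch using standard libxc calls; error handling
trimmed). For the layout above it should print 0x39001 for the xenstore
page:

    #include <stdio.h>
    #include <stdlib.h>
    #include <xenctrl.h>
    #include <xen/hvm/params.h>

    int main(int argc, char **argv)
    {
        xc_interface *xch;
        uint64_t store_pfn = 0, console_pfn = 0;
        uint32_t domid;

        if (argc != 2) {
            fprintf(stderr, "usage: %s <domid>\n", argv[0]);
            return 1;
        }
        domid = strtoul(argv[1], NULL, 0);

        xch = xc_interface_open(NULL, NULL, 0);
        if (!xch) { perror("xc_interface_open"); return 1; }

        /* The same parameters handle_hvm_params restores. */
        xc_hvm_param_get(xch, domid, HVM_PARAM_STORE_PFN, &store_pfn);
        xc_hvm_param_get(xch, domid, HVM_PARAM_CONSOLE_PFN, &console_pfn);
        printf("xenstore pfn 0x%llx, console pfn 0x%llx\n",
               (unsigned long long)store_pfn,
               (unsigned long long)console_pfn);

        xc_interface_close(xch);
        return 0;
    }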


(XEN) P2M @ 000801e7ce80 mfn:0x79f3a
(XEN) Using concatenated root table 0
(XEN) 1ST[0x0] = 0x004079f3c77f
(XEN) 2ND[0x1c8] = 0x

My questions are:

1. Who is responsible for restoring the p2m table, Xen or the guest kernel?


AFAIK, the toolstack restores most of the memory.


2. After restore, the VM always gets zero memory space, but there is no


What do you mean by "zero memory space"?


error reported during the restore process. Is the memory requested by the
guest kernel, or should it be allocated early by the hypervisor?


Bear in mind that the patch series you are working on is an RFC and 
ARM64 was not supported. There might be some issues in the save/restore path.


For instance, xc_clear_domain_page (tools/libxc/xc_sr_restore_arm.c) may 
return an error if the memory is not mapped, but the caller does not 
check the return value.
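
Something along these lines would at least make the failure visible (a
sketch; clear_magic_page is a hypothetical wrapper, not a function from
the series):

    #include <stdio.h>
    #include <xenctrl.h>

    static int clear_magic_page(xc_interface *xch, uint32_t domid,
                                unsigned long pfn)
    {
        /* xc_clear_domain_page() fails if the page is not mapped;
         * propagate that instead of silently continuing the restore. */
        int rc = xc_clear_domain_page(xch, domid, pfn);
        if (rc)
            fprintf(stderr, "failed to clear magic page at pfn 0x%lx\n",
                    pfn);
        return rc;
    }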


Regards,

--
Julien Grall



Re: [Xen-devel] questions of vm save/restore on arm64

2016-05-31 Thread Chenxiao Zhao



On 5/30/2016 4:40 AM, Stefano Stabellini wrote:

On Fri, 27 May 2016, Chenxiao Zhao wrote:

Hi,

My board is a Hikey, which has an octa-core ARM Cortex-A53. I have applied 
patches [1] to try VM save/restore on ARM.
These patches originally did not work on arm64, so I have made some changes 
based on patch set [2].


Hello Chenxiao,

thanks for your interest in Xen on ARM save/restore.


Hi Stefano,

Thanks for your advice.

I found a possible reason for the restore failure: Xen always fails in 
p2m_lookup for the guest domain.


I called dump_p2m_lookup() in p2m_lookup() and got the output below:

(XEN) dom1 IPA 0x39001000
(XEN) P2M @ 000801e7ce80 mfn:0x79f3a
(XEN) Using concatenated root table 0
(XEN) 1ST[0x0] = 0x004079f3c77f
(XEN) 2ND[0x1c8] = 0x

My questions are:

1. Who is responsible for restoring the p2m table, Xen or the guest kernel?
2. After restore, the VM always gets zero memory space, but no error is
reported during the restore process. Is the memory requested by the guest
kernel, or should it be allocated early by the hypervisor?



Name                                    ID   Mem VCPUs      State   Time(s)
Domain-0                                 0  1024     8     r-----      15.0
guest                                    1     0     1     --p---       0.0







What I have got so far:

1. If I run 'xl save -p guest memState' to leave the guest in the
suspended state, then run 'xl unpause guest', the guest can resume
successfully, so I suppose the guest works fine on suspend/resume.

2. If I run 'xl restore -p memState' to restore the guest and use xenctx
to dump all the vCPUs' registers, all the registers are identical to the
state at save. After I run 'xl unpause guest', I get no error but cannot
connect to the console. After restore, the guest's PC is in a function
called user_disable_single_step(), which is called by
single_step_handler().

My questions are:

1. How can I debug the guest during the restore process? Are there any tools available?


Nothing special. You can use ctrl-AAA on the console to switch to the
hypervisor console and see the state of the guest. You can also add some
debug printks; if the console doesn't work you can use
dom0_write_console in Linux to get messages out of your guest (you need
to compile Xen with debug=y for that to work).



2. From my understanding, the restore does not work because some state is
missing at save time. E.g. in cpu_save it knows the domain is 64-bit, but
in cpu_load it always thinks it is a 32-bit domain, so I have hard-coded
the domain type to DOMAIN_64BIT.
Am I correct?


If Xen thinks the domain is 32-bit at restore, it must be a bug.



3. How can I dump all of the VM's state? I have only found that xenctx can 
dump the vCPUs' registers.


You can use the hypervisor console via the ctrl-aaa menu.



I have attached my patch and log below.

Looking forward to your feedback.
Thanks

xl list
Name                                    ID   Mem VCPUs      State   Time(s)
Domain-0                                 0  1024     8     r-----      11.7
root@linaro-alip:~# xl create guest.cfg
Parsing config from guest.cfg
[   39.238723] xen-blkback: ring-ref 8, event-channel 3, protocol 1 (arm-abi) persistent grants

root@linaro-alip:~# xl save -p guest memState
Saving to memState new xl format (info 0x3/0x0/931)
xc: info: Saving domain 1, type ARM
(XEN) HVM1 save: VCPU
(XEN) HVM1 save: A15_TIMER
(XEN) HVM1 save: GICV2_GICD
(XEN) HVM1 save: GICV2_GICC
(XEN) HVM1 save: GICV3
root@linaro-alip:~# /usr/lib/xen/bin/xenctx -a 1
PC:   ffcab028
LR:   ffc00050458c
ELR_EL1:  ffc86b34
CPSR: 21c5
SPSR_EL1: 6145
SP_EL0:   007ff6f2a850
SP_EL1:   ffc0140a7ca0

 x0: 0001x1: deadbeefx2: 0002
 x3: 0002x4: 0004x5: 
 x6: 001bx7: 0001x8: 00618e589e00
 x9:    x10:    x11: 
x12: 01a3   x13: 1911a7d9   x14: 2ee0
x15: 0005   x16: deadbeef   x17: 0001
x18: 0007   x19:    x20: ffc014163d58
x21: ffc014163cd8   x22: 0001   x23: 0140
x24: ffc000d5bb18   x25: ffc014163cd8   x26: 
x27:    x28:    x29: ffc0140a7ca0

SCTLR: 34d5d91d
TTBCR: 0032b5193519
TTBR0: 002d54876000
TTBR1: 40dcf000
root@linaro-alip:~# xl destroy guest
(XEN) mm.c:1265:d0v1 gnttab_mark_dirty not implemented yet
root@linaro-alip:~# xl restore -p memState
Loading new save file memState (new xl fmt info 0x3/0x0/931)
 Savefile contains xl domain config in JSON format
Parsing config from 
xc: info: (XEN) HVM2 restore: VCPU 0
Found ARM domain from Xen 4.7
xc: info: Restoring domain
(XEN) HVM2 restore: A15_TIMER 0
(XEN) HVM2 restore: A15_TIMER 0
(XEN) HVM2 restore: GICV2_GICD 0
(XEN) HVM2 restore: GICV2_GICC 0
(XEN) 

Re: [Xen-devel] questions of vm save/restore on arm64

2016-05-30 Thread Stefano Stabellini
On Fri, 27 May 2016, Chenxiao Zhao wrote:
> Hi, 
> 
> My board is a Hikey, which has an octa-core ARM Cortex-A53. I have applied 
> patches [1] to try VM save/restore on ARM.
> These patches originally did not work on arm64, so I have made some changes 
> based on patch set [2].

Hello Chenxiao,

thanks for your interest in Xen on ARM save/restore.


> What I have got so far:
> 
> 1. If I run 'xl save -p guest memState' to leave the guest in the suspended 
> state, then run 'xl unpause guest', the guest can resume successfully, so I 
> suppose the guest works fine on suspend/resume.
> 
> 2. If I run 'xl restore -p memState' to restore the guest and use xenctx to 
> dump all the vCPUs' registers, all the registers are identical to the state 
> at save. After I run 'xl unpause guest', I get no error but cannot connect 
> to the console. After restore, the guest's PC is in a function called 
> user_disable_single_step(), which is called by single_step_handler().
> 
> My questions are:
> 
> 1. How can I debug the guest during the restore process? Are there any tools available?

Nothing special. You can use ctrl-AAA on the console to switch to the
hypervisor console and see the state of the guest. You can also add some
debug printks; if the console doesn't work you can use
dom0_write_console in Linux to get messages out of your guest (you need
to compile Xen with debug=y for that to work).


> 2. From my understanding, the restore does not work because some state is 
> missing at save time. E.g. in cpu_save it knows the domain is 64-bit, but in 
> cpu_load it always thinks it is a 32-bit domain, so I have hard-coded the 
> domain type to DOMAIN_64BIT.
> Am I correct?

If Xen thinks the domain is 32-bit at restore, it must be a bug.


> 3. How can I dump all of the VM's state? I have only found that xenctx can 
> dump the vCPUs' registers.

You can use the hypervisor console via the ctrl-aaa menu.


> I have attached my patch and log below.
> 
> Looking forward to your feedback.
> Thanks
> 
> xl list
> Name                                        ID   Mem VCPUs      State   Time(s)
> Domain-0                                     0  1024     8     r-----      11.7
> root@linaro-alip:~# xl create guest.cfg
> Parsing config from guest.cfg
> [   39.238723] xen-blkback: ring-ref 8, event-channel 3, protocol 1 (arm-abi) persistent grants
> 
> root@linaro-alip:~# xl save -p guest memState
> Saving to memState new xl format (info 0x3/0x0/931)
> xc: info: Saving domain 1, type ARM
> (XEN) HVM1 save: VCPU
> (XEN) HVM1 save: A15_TIMER
> (XEN) HVM1 save: GICV2_GICD
> (XEN) HVM1 save: GICV2_GICC
> (XEN) HVM1 save: GICV3
> root@linaro-alip:~# /usr/lib/xen/bin/xenctx -a 1
> PC:       ffcab028
> LR:       ffc00050458c
> ELR_EL1:  ffc86b34
> CPSR:     21c5
> SPSR_EL1: 6145
> SP_EL0:   007ff6f2a850
> SP_EL1:   ffc0140a7ca0
> 
>  x0: 0001    x1: deadbeef    x2: 0002
>  x3: 0002    x4: 0004    x5: 
>  x6: 001b    x7: 0001    x8: 00618e589e00
>  x9:    x10:    x11: 
> x12: 01a3   x13: 1911a7d9   x14: 2ee0
> x15: 0005   x16: deadbeef   x17: 0001
> x18: 0007   x19:    x20: ffc014163d58
> x21: ffc014163cd8   x22: 0001   x23: 0140
> x24: ffc000d5bb18   x25: ffc014163cd8   x26: 
> x27:    x28:    x29: ffc0140a7ca0
> 
> SCTLR: 34d5d91d
> TTBCR: 0032b5193519
> TTBR0: 002d54876000
> TTBR1: 40dcf000
> root@linaro-alip:~# xl destroy guest
> (XEN) mm.c:1265:d0v1 gnttab_mark_dirty not implemented yet
> root@linaro-alip:~# xl restore -p memState
> Loading new save file memState (new xl fmt info 0x3/0x0/931)
>  Savefile contains xl domain config in JSON format
> Parsing config from 
> xc: info: (XEN) HVM2 restore: VCPU 0
> Found ARM domain from Xen 4.7
> xc: info: Restoring domain
> (XEN) HVM2 restore: A15_TIMER 0
> (XEN) HVM2 restore: A15_TIMER 0
> (XEN) HVM2 restore: GICV2_GICD 0
> (XEN) HVM2 restore: GICV2_GICC 0
> (XEN) GICH_LRs (vcpu 0) mask=0
> (XEN)    VCPU_LR[0]=0
> (XEN)    VCPU_LR[1]=0
> (XEN)    VCPU_LR[2]=0
> (XEN)    VCPU_LR[3]=0
> xc: info: Restore successful
> xc: info: XenStore: mfn 0x39001, dom 0, evt 1
> xc: info: Console: mfn 0x39000, dom 0, evt 2
> root@linaro-alip:~# /usr/lib/xen/bin/xenctx -a 2
> PC:       ffcab028
> LR:       ffc00050458c
> ELR_EL1:  ffc86b34
> CPSR:     21c5
> SPSR_EL1: 6145
> SP_EL0:   007ff6f2a850
> SP_EL1:   ffc0140a7ca0
> 
>  x0:     x1: deadbeef    x2: 0002
>  x3: 0002    x4: 0004    x5: 
>  x6: 001b    x7: 0001    x8: 00618e589e00
>  x9:    x10:    x11: 
> x12: 

[Xen-devel] questions of vm save/restore on arm64

2016-05-27 Thread Chenxiao Zhao
Hi,

My board is a Hikey, which has an octa-core ARM Cortex-A53. I have applied
patches [1] to try VM save/restore on ARM.
These patches originally did not work on arm64, so I have made some changes
based on patch set [2].

What I have got so far:

1. If I run 'xl save -p guest memState' to leave the guest in the
suspended state, then run 'xl unpause guest', the guest can resume
successfully, so I suppose the guest works fine on suspend/resume.

2. If I run 'xl restore -p memState' to restore the guest and use xenctx
to dump all the vCPUs' registers, all the registers are identical to the
state at save. After I run 'xl unpause guest', I get no error but cannot
connect to the console. After restore, the guest's PC is in a function
called user_disable_single_step(), which is called by
single_step_handler().

My questions are:

1. How can I debug the guest during the restore process? Are there any
tools available?
2. From my understanding, the restore does not work because some state
is missing at save time. E.g. in cpu_save it knows the domain is 64-bit,
but in cpu_load it always thinks it is a 32-bit domain, so I have
hard-coded the domain type to DOMAIN_64BIT. Am I correct?
3. How can I dump all of the VM's state? I have only found that xenctx
can dump the vCPUs' registers.

I have attached my patch and log below.

Looking forward to your feedback.
Thanks

xl list
Name                                    ID   Mem VCPUs      State   Time(s)
Domain-0                                 0  1024     8     r-----      11.7
root@linaro-alip:~# xl create guest.cfg
Parsing config from guest.cfg
[   39.238723] xen-blkback: ring-ref 8, event-channel 3, protocol 1 (arm-abi) persistent grants

root@linaro-alip:~# xl save -p guest memState
Saving to memState new xl format (info 0x3/0x0/931)
xc: info: Saving domain 1, type ARM
(XEN) HVM1 save: VCPU
(XEN) HVM1 save: A15_TIMER
(XEN) HVM1 save: GICV2_GICD
(XEN) HVM1 save: GICV2_GICC
(XEN) HVM1 save: GICV3
root@linaro-alip:~# /usr/lib/xen/bin/xenctx -a 1
PC:   ffcab028
LR:   ffc00050458c
ELR_EL1:  ffc86b34
CPSR: 21c5
SPSR_EL1: 6145
SP_EL0:   007ff6f2a850
SP_EL1:   ffc0140a7ca0

 x0: 0001x1: deadbeefx2: 0002
 x3: 0002x4: 0004x5: 
 x6: 001bx7: 0001x8: 00618e589e00
 x9:    x10:    x11: 
x12: 01a3   x13: 1911a7d9   x14: 2ee0
x15: 0005   x16: deadbeef   x17: 0001
x18: 0007   x19:    x20: ffc014163d58
x21: ffc014163cd8   x22: 0001   x23: 0140
x24: ffc000d5bb18   x25: ffc014163cd8   x26: 
x27:    x28:    x29: ffc0140a7ca0

SCTLR: 34d5d91d
TTBCR: 0032b5193519
TTBR0: 002d54876000
TTBR1: 40dcf000
root@linaro-alip:~# xl destroy guest
(XEN) mm.c:1265:d0v1 gnttab_mark_dirty not implemented yet
root@linaro-alip:~# xl restore -p memState
Loading new save file memState (new xl fmt info 0x3/0x0/931)
 Savefile contains xl domain config in JSON format
Parsing config from 
xc: info: (XEN) HVM2 restore: VCPU 0
Found ARM domain from Xen 4.7
xc: info: Restoring domain
(XEN) HVM2 restore: A15_TIMER 0
(XEN) HVM2 restore: A15_TIMER 0
(XEN) HVM2 restore: GICV2_GICD 0
(XEN) HVM2 restore: GICV2_GICC 0
(XEN) GICH_LRs (vcpu 0) mask=0
(XEN)    VCPU_LR[0]=0
(XEN)    VCPU_LR[1]=0
(XEN)    VCPU_LR[2]=0
(XEN)    VCPU_LR[3]=0
xc: info: Restore successful
xc: info: XenStore: mfn 0x39001, dom 0, evt 1
xc: info: Console: mfn 0x39000, dom 0, evt 2
root@linaro-alip:~# /usr/lib/xen/bin/xenctx -a 2
PC:   ffcab028
LR:   ffc00050458c
ELR_EL1:  ffc86b34
CPSR: 21c5
SPSR_EL1: 6145
SP_EL0:   007ff6f2a850
SP_EL1:   ffc0140a7ca0

 x0: x1: deadbeefx2: 0002
 x3: 0002x4: 0004x5: 
 x6: 001bx7: 0001x8: 00618e589e00
 x9:    x10:    x11: 
x12: 01a3   x13: 1911a7d9   x14: 2ee0
x15: 0005   x16: deadbeef   x17: 0001
x18: 0007   x19:    x20: ffc014163d58
x21: ffc014163cd8   x22: 0001   x23: 0140
x24: ffc000d5bb18   x25: ffc014163cd8   x26: 
x27:    x28:    x29: ffc0140a7ca0

SCTLR: 34d5d91d
TTBCR: b5193519
TTBR0: 002d54876000
TTBR1: 40dcf000
root@linaro-alip:~# xl unpause guest
root@linaro-alip:~# xl list
Name                                    ID   Mem VCPUs      State   Time(s)
Domain-0                                 0  1024     8     r-----      22.2
guest                                    2     0     1     r-----       4.8