Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-02-01 Thread Dr. David Alan Gilbert
* Wei Huang (w...@redhat.com) wrote:
> 
> 
> On 01/25/2018 02:05 PM, Dr. David Alan Gilbert wrote:
> > * Wei Huang (w...@redhat.com) wrote:
> >> This patch adds the migration test support for aarch64. The test code,
> >> which implements the same functionality as x86, is compiled into a binary
> >> and booted as a kernel in qemu. Here are the design ideas:
> >>
> >>  * We choose this -kernel design because aarch64 QEMU doesn't provide a
> >>    built-in fw like x86 does. So instead of relying on a boot loader, we
> >>    use -kernel approach for aarch64.
> >>  * The serial output is sent to PL011 directly.
> >>  * The physical memory base for mach-virt machine is 0x40000000. We change
> >>    the start_address and end_address for aarch64.
> >>
> >> RFC->V1:
> >>  * aarch64 kernel blob is defined as an uint32_t array
> >>  * The test code is re-written to address a data caching issue under KVM.
> >>    Tests passed under both x86 and aarch64.
> >>  * Re-use init_bootfile_x86() for both x86 and aarch64
> >>  * Other minor fixes
> >>
> >> Note that the test code is as the following:
> >>
> >> .section .text
> >>
> >> .globl  start
> >> 
> >> start:
> >> /* disable MMU to use phys mem address */
> >> mrs x0, sctlr_el1
> >> bic x0, x0, #(1<<0)
> >> msr sctlr_el1, x0
> >> isb
> >>
> >> /* output char 'A' to PL011 */
> >> mov w4, #65
> >> mov x5, #0x9000000
> >> strb w4, [x5]
> >>
> >> /* w6 keeps a counter so we can limit the output speed */
> >> mov w6, #0
> >>
> >> /* phys mem base addr = 0x40000000 */
> >> mov x3, #(0x40000000 + 100 *1024*1024) /* traverse 1M-100M */
> >> mov x2, #(0x40000000 + 1 * 1024*1024)
> >>
> >> /* clean up memory first */
> >> mov w1, #0  
> >> clean:
> >> strb w1, [x2]
> >> add x2, x2, #(4096)
> >> cmp x2, x3
> >> ble clean
> >>
> >> /* main body */
> >> mainloop:
> >> mov x2, #(0x40000000 + 1 * 1024*1024)
> >> 
> >> innerloop:
> >> /* clean cache because el2 might still cache guest data under KVM */
> >> dc  civac, x2
> > 
> > Can you explain a bit more about that please;  if it's guest
> > visible across migration, doesn't that suggest we need the cache clear
> > in KVM or QEMU?
> 
> I think this is ARM specific. This is caused by the inconsistency
> between guest VM's data accesses and userspace's accesses (in
> check_guest_ram) at the destination:
> 
> 1) Because uncacheable (guest) + cacheable (host) ==> uncacheable. So
> the data accesses from guest VM go directly into memory.
> 2) QEMU user space will use the cacheable version. So it is possible
> that data can come from cache, instead of RAM. The difference between
> (1) and (2) obviously can cause check_guest_ram() to fail sometimes.
> 
> x86's EPT has the capability to ignore guest-provided memory type. So it
> is possible to have consistent data views between guest and QEMU user-space.
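
The two access paths described in (1) and (2) can be illustrated with a toy
model. This is only a sketch under simplifying assumptions: a single
write-back cache sits in front of RAM, guest accesses made with the MMU off
bypass it, and QEMU's userspace accesses go through it. The names ToyMemory,
write_uncached, read_cached and clean_invalidate are invented for
illustration; none of them is a QEMU or KVM interface.

```python
# Toy model only: one write-back cache in front of RAM.
# All names are hypothetical; this is not QEMU or KVM code.

class ToyMemory:
    def __init__(self, size):
        self.ram = bytearray(size)
        self.cache = {}  # addr -> (value, dirty)

    def write_uncached(self, addr, value):
        """Guest access with the MMU off: goes straight to RAM."""
        self.ram[addr] = value & 0xFF

    def read_cached(self, addr):
        """Host (QEMU userspace) access through a cacheable mapping."""
        if addr not in self.cache:
            self.cache[addr] = (self.ram[addr], False)  # line fill from RAM
        return self.cache[addr][0]

    def clean_invalidate(self, addr):
        """Modelled on 'dc civac': write back if dirty, then drop the line."""
        line = self.cache.pop(addr, None)
        if line is not None and line[1]:
            self.ram[addr] = line[0]


mem = ToyMemory(4096)
first = mem.read_cached(0)   # host reads once, so the line is now cached
mem.write_uncached(0, 1)     # guest updates the byte directly in RAM
stale = mem.read_cached(0)   # host still sees its cached copy
mem.clean_invalidate(0)
fresh = mem.read_cached(0)   # refilled from RAM after the maintenance
print(first, stale, fresh)
```

In this model the host only reads stale data when the line happens to be
cached already, which is consistent with check_guest_ram() failing sometimes
rather than always.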

I'll admit to not quite understanding where this thread ended up;
it seems to have fallen into other ARM consistency questions.
Other than that it looks fine to me, but see the separate patch I posted
yesterday that includes the x86 source.

Peter: What's your view on the right fix for the memory consistency?

Dave

> 
> > 
> > Dave
> > 
> >> ldrb w1, [x2]
> >> add w1, w1, #1
> >> and w1, w1, #(0xff)
> >> strb w1, [x2]
> >>
> >> add x2, x2, #(4096)
> >> cmp x2, x3
> >> blt innerloop
> >>
> >> add w6, w6, #1
> >> and w6, w6, #(0xff)
> >> cmp w6, #0
> >> bne mainloop
> >> 
> >> /* output char 'B' to PL011 */
> >> mov w4, #66
> >> mov x5, #0x9000000
> >> strb w4, [x5]
> >>
> >> bl  mainloop
> >>
> >> The code is compiled with the following commands:
> >>  > gcc -c -o fill.o fill.s
> >>  > gcc -O2 -o fill.elf -Wl,-T,/tmp/flat.lds,--build-id=none,-Ttext=40080000 \
> >>       -nostdlib fill.o
> >>  > objcopy -O binary fill.elf fill.flat
> >>  > truncate -c -s 144 ./fill.flat
> >>  > xxd -g 4 -c 24 -e fill.flat | awk '{print "0x"$2 ", " "0x"$3 ", " 
> >> "0x"C$4 ", " "0x"C$5 ", " "0x"$6 ", " "0x"C$7 "," }'
> >>
> >> The linker file (borrowed from KVM unit test) is defined as:
> >>
> >> SECTIONS
> >> {
> >> .text : { *(.init) *(.text) *(.text.*) }
> >> . = ALIGN(64K);
> >> etext = .;
> >> .data : {
> >> *(.data)
> >> }
> >> . = ALIGN(16);
> >> .rodata : { *(.rodata) }
> >> . = ALIGN(16);
> >> .bss : { *(.bss) }
> >> . = ALIGN(64K);
> >> edata = .;
> >> . += 64K;
> >> . = ALIGN(64K);
> >> /*
> >>  * stack depth is 16K for arm and PAGE_SIZE for arm64, see THREAD_SIZE
> >>  * sp must be 16 byte aligned for 

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-02-01 Thread Christoffer Dall
On Thu, Feb 01, 2018 at 01:25:20PM +0100, Andrew Jones wrote:
> On Thu, Feb 01, 2018 at 11:48:31AM +0100, Christoffer Dall wrote:
> > On Thu, Feb 01, 2018 at 11:42:22AM +0100, Andrew Jones wrote:
> > > On Thu, Feb 01, 2018 at 10:59:54AM +0100, Christoffer Dall wrote:
> > > > On Thu, Feb 1, 2018 at 10:33 AM, Ard Biesheuvel
> > > > > If you could use dirty page tracking to only perform the cache
> > > > > invalidation when the framebuffer memory has been updated, you can at
> > > > > least limit the impact to cases where the framebuffer is actually
> > > > > used, rather than sitting idle with a nice wallpaper image.
> > > 
> > > Yes, this is the exact approach I took back when I experimented with
> > > this. I must have screwed up my PoC in some way (like using the gcc
> > > builtin), because it wasn't working for me...
> > > 
> > Does that mean you have some code you feel like reviving and use to send
> > out an RFC based on? ;)
> >
> 
> I don't usually delete things, but I do do some sort of copy on use
> from old homedirs to new ones from time to time, letting stuff that
> gets really old completely drop at some point. I might be able to
> find the old work, or just hack a quick and ugly version from my
> memory to share - hopefully whatever I did to cause it to fail last
> time has been forgotten :-)
> 

If you have bandwidth to look at this that's great.  Otherwise let us
know, and I'll see if I can devote some time to it.

Thanks,
-Christoffer



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-02-01 Thread Andrew Jones
On Thu, Feb 01, 2018 at 11:48:31AM +0100, Christoffer Dall wrote:
> On Thu, Feb 01, 2018 at 11:42:22AM +0100, Andrew Jones wrote:
> > On Thu, Feb 01, 2018 at 10:59:54AM +0100, Christoffer Dall wrote:
> > > On Thu, Feb 1, 2018 at 10:33 AM, Ard Biesheuvel
> > > > If you could use dirty page tracking to only perform the cache
> > > > invalidation when the framebuffer memory has been updated, you can at
> > > > least limit the impact to cases where the framebuffer is actually
> > > > used, rather than sitting idle with a nice wallpaper image.
> > 
> > Yes, this is the exact approach I took back when I experimented with
> > this. I must have screwed up my PoC in some way (like using the gcc
> > builtin), because it wasn't working for me...
> > 
> Does that mean you have some code you feel like reviving and use to send
> out an RFC based on? ;)
>

I don't usually delete things, but I do do some sort of copy on use
from old homedirs to new ones from time to time, letting stuff that
gets really old completely drop at some point. I might be able to
find the old work, or just hack a quick and ugly version from my
memory to share - hopefully whatever I did to cause it to fail last
time has been forgotten :-)

drew



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-02-01 Thread Christoffer Dall
On Thu, Feb 01, 2018 at 11:42:22AM +0100, Andrew Jones wrote:
> On Thu, Feb 01, 2018 at 10:59:54AM +0100, Christoffer Dall wrote:
> > On Thu, Feb 1, 2018 at 10:33 AM, Ard Biesheuvel
> >  wrote:
> > > On 1 February 2018 at 09:17, Christoffer Dall
> > >  wrote:
> > >> On Wed, Jan 31, 2018 at 9:15 PM, Ard Biesheuvel
> > >>  wrote:
> > >>> On 31 January 2018 at 19:12, Christoffer Dall
> > >>>  wrote:
> >  On Wed, Jan 31, 2018 at 7:00 PM, Ard Biesheuvel
> >   wrote:
> > > On 31 January 2018 at 17:39, Christoffer Dall
> > >  wrote:
> > >> On Wed, Jan 31, 2018 at 5:59 PM, Ard Biesheuvel
> > >>  wrote:
> > >>> On 31 January 2018 at 16:53, Christoffer Dall
> > >>>  wrote:
> >  On Wed, Jan 31, 2018 at 4:18 PM, Ard Biesheuvel
> >   wrote:
> > > On 31 January 2018 at 09:53, Christoffer Dall
> > >  wrote:
> > > >> On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
> > >>> On 29/01/18 10:04, Peter Maydell wrote:
> > >>> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert 
> > >>> >  wrote:
> > >>> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
> > >>> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert 
> > >>> >>>  wrote:
> > >>>  * Peter Maydell (peter.mayd...@linaro.org) wrote:
> > >>> > I think the correct fix here is that your test code 
> > >>> > should turn
> > >>> > its MMU on. Trying to treat guest RAM as uncacheable 
> > >>> > doesn't work
> > >>> > for Arm KVM guests (for the same reason that VGA device 
> > >>> > video memory
> > >>> > doesn't work). If it's RAM your guest has to arrange to 
> > >>> > map it as
> > >>> > Normal Cacheable, and then everything should work fine.
> > >>> 
> > >>>  Does this cause problems with migrating at just the wrong 
> > >>>  point during
> > >>>  a VM boot?
> > >>> >>>
> > >>> >>> It wouldn't surprise me if it did, but I don't think I've 
> > >>> >>> ever
> > >>> >>> tried to provoke that problem...
> > >>> >>
> > >>> >> If you think it'll get the RAM contents wrong, it might be 
> > >>> >> best to fail
> > >>> >> the migration if you can detect the cache is disabled in the 
> > >>> >> guest.
> > >>> >
> > >>> > I guess QEMU could look at the value of the "MMU 
> > >>> > disabled/enabled" bit
> > >>> > in the guest's system registers, and refuse migration if it's 
> > >>> > off...
> > >>> >
> > >>> > (cc'd Marc, Christoffer to check that I don't have the wrong 
> > >>> > end
> > >>> > of the stick about how thin the ice is in the period before 
> > >>> > the
> > >>> > guest turns on its MMU...)
> > >>>
> > >>> Once MMU and caches are on, we should be in a reasonable place 
> > >>> for QEMU
> > >>> to have a consistent view of the memory. The trick is to 
> > >>> prevent the
> > >>> vcpus from changing that. A guest could perfectly turn off its 
> > >>> MMU at
> > >>> any given time if it needs to (and it is actually required on 
> > >>> some HW if
> > >>> you want to mitigate headlining CVEs), and KVM won't know about 
> > >>> that.
> > >>>
> > >>
> > > >> (Clarification: KVM can detect this if it bothers to check the 
> > >> VCPU's
> > >> system registers, but we don't trap to KVM when the VCPU turns 
> > >> off its
> > >> caches, right?)
> > >>
> > >>> You may have to pause the vcpus before starting the migration, 
> > >>> or
> > >>> introduce a new KVM feature that would automatically pause a 
> > >>> vcpu that
> > >>> is trying to disable its MMU while the migration is on. This 
> > >>> would
> > >>> involve trapping all the virtual memory related system 
> > >>> registers, with
> > >>> an obvious cost. But that cost would be limited to the time it 
> > >>> takes to
> > >>> migrate the memory, so maybe that's acceptable.
> > >>>
> > >> Is that even sufficient?
> > >>
> > >> What if the following happened. (1) guest turns off MMU, (2) 
> > >> guest
> > >> writes some data directly to ram (3) qemu stops the vcpu (4) 
> > >> qemu reads
> > >> guest ram.  QEMU's view of guest ram is now incorrect 

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-02-01 Thread Andrew Jones
On Thu, Feb 01, 2018 at 10:59:54AM +0100, Christoffer Dall wrote:
> On Thu, Feb 1, 2018 at 10:33 AM, Ard Biesheuvel
>  wrote:
> > On 1 February 2018 at 09:17, Christoffer Dall
> >  wrote:
> >> On Wed, Jan 31, 2018 at 9:15 PM, Ard Biesheuvel
> >>  wrote:
> >>> On 31 January 2018 at 19:12, Christoffer Dall
> >>>  wrote:
>  On Wed, Jan 31, 2018 at 7:00 PM, Ard Biesheuvel
>   wrote:
> > On 31 January 2018 at 17:39, Christoffer Dall
> >  wrote:
> >> On Wed, Jan 31, 2018 at 5:59 PM, Ard Biesheuvel
> >>  wrote:
> >>> On 31 January 2018 at 16:53, Christoffer Dall
> >>>  wrote:
>  On Wed, Jan 31, 2018 at 4:18 PM, Ard Biesheuvel
>   wrote:
> > On 31 January 2018 at 09:53, Christoffer Dall
> >  wrote:
> > >> On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
> >>> On 29/01/18 10:04, Peter Maydell wrote:
> >>> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert 
> >>> >  wrote:
> >>> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
> >>> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert 
> >>> >>>  wrote:
> >>>  * Peter Maydell (peter.mayd...@linaro.org) wrote:
> >>> > I think the correct fix here is that your test code should 
> >>> > turn
> >>> > its MMU on. Trying to treat guest RAM as uncacheable 
> >>> > doesn't work
> >>> > for Arm KVM guests (for the same reason that VGA device 
> >>> > video memory
> >>> > doesn't work). If it's RAM your guest has to arrange to map 
> >>> > it as
> >>> > Normal Cacheable, and then everything should work fine.
> >>> 
> >>>  Does this cause problems with migrating at just the wrong 
> >>>  point during
> >>>  a VM boot?
> >>> >>>
> >>> >>> It wouldn't surprise me if it did, but I don't think I've ever
> >>> >>> tried to provoke that problem...
> >>> >>
> >>> >> If you think it'll get the RAM contents wrong, it might be 
> >>> >> best to fail
> >>> >> the migration if you can detect the cache is disabled in the 
> >>> >> guest.
> >>> >
> >>> > I guess QEMU could look at the value of the "MMU 
> >>> > disabled/enabled" bit
> >>> > in the guest's system registers, and refuse migration if it's 
> >>> > off...
> >>> >
> >>> > (cc'd Marc, Christoffer to check that I don't have the wrong end
> >>> > of the stick about how thin the ice is in the period before the
> >>> > guest turns on its MMU...)
> >>>
> >>> Once MMU and caches are on, we should be in a reasonable place 
> >>> for QEMU
> >>> to have a consistent view of the memory. The trick is to prevent 
> >>> the
> >>> vcpus from changing that. A guest could perfectly turn off its 
> >>> MMU at
> >>> any given time if it needs to (and it is actually required on 
> >>> some HW if
> >>> you want to mitigate headlining CVEs), and KVM won't know about 
> >>> that.
> >>>
> >>
> > >> (Clarification: KVM can detect this if it bothers to check the 
> >> VCPU's
> >> system registers, but we don't trap to KVM when the VCPU turns off 
> >> its
> >> caches, right?)
> >>
> >>> You may have to pause the vcpus before starting the migration, or
> >>> introduce a new KVM feature that would automatically pause a vcpu 
> >>> that
> >>> is trying to disable its MMU while the migration is on. This would
> >>> involve trapping all the virtual memory related system registers, 
> >>> with
> >>> an obvious cost. But that cost would be limited to the time it 
> >>> takes to
> >>> migrate the memory, so maybe that's acceptable.
> >>>
> >> Is that even sufficient?
> >>
> >> What if the following happened. (1) guest turns off MMU, (2) guest
> >> writes some data directly to ram (3) qemu stops the vcpu (4) qemu 
> >> reads
> >> guest ram.  QEMU's view of guest ram is now incorrect (stale,
> >> incoherent, ...).
> >>
> >> I'm also not really sure if pausing one VCPU because it turned off 
> >> its
> >> MMU will go very well when trying to migrate a large VM (wouldn't 
> >> this
> >> ask for all the other VCPUs beginning to complain that the stopped 
> >> VCPU
> >> 

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-02-01 Thread Ard Biesheuvel
On 1 February 2018 at 09:59, Christoffer Dall
 wrote:
> On Thu, Feb 1, 2018 at 10:33 AM, Ard Biesheuvel
>  wrote:
>> On 1 February 2018 at 09:17, Christoffer Dall
>>  wrote:
...
>>>
>>> Unless I'm missing something fundamental, I think we should add
>>> functionality in QEMU to clean+invalidate pages on aarch64 hosts and
>>> see if that solves both VGA and migration, and if so, what the
>>> potential performance impacts are.
>>>
>>
>> Indeed.
>>
>> Identifying the framebuffer region is easy for QEMU, so there we can
>> invalidate (without clean) selectively.
>
> Do you think there's a material win of doing invalidate rather than
> clean+invalidate or could we just aim for "access coherent" ==
> "clean+invalidate" and we don't have to make any assumptions about how
> the guest is accessing a particular piece of memory?
>

In general, invalidate should be cheaper than clean, given that just
discarding the data involves less work than writing it back to memory.
Whether that actually holds in current SMP designs, and whether it
still applies when the memory can be assumed to be clean in the first
place are things I can't really answer.

>> However, although we'd
>> probably need to capture the state of the MMU bit anyway when handling
>> the stage 2 fault for the dirty page tracking, I don't think we can
>> generally infer whether a faulting access was made via an uncached
>> mapping while the MMU was on, and so this would require
>> clean+invalidate on all dirty pages when doing a migration.
>
> My gut feeling here is that there might be some very clever scheme to
> optimize cache maintenance of the 'hot pages' when dirty page tracking
> is enabled, but since the set of hot pages by virtue of migration
> algorithms has to be small, the win is not likely to impact anything
> real-life, and it's therefore not worth looking into.
>

Sounds reasonable.
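
The trade-off discussed above (invalidate versus clean+invalidate, applied
only to pages that dirty page tracking reports) can be sketched with a toy
model. This assumes the framebuffer-style case where the host only reads
through its cacheable mapping, and it tracks one byte per page for brevity.
ToyHostCache and maintain_dirty_pages are hypothetical names, not QEMU or KVM
APIs.

```python
# Toy model; every name here is hypothetical, not a QEMU or KVM API.
PAGE_SIZE = 4096


class ToyHostCache:
    def __init__(self, ram_size):
        self.ram = bytearray(ram_size)
        self.lines = {}      # addr -> [value, dirty]
        self.writebacks = 0  # counts the extra work done by "clean"

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = [self.ram[addr], False]
        return self.lines[addr][0]

    def invalidate(self, addr):
        """Drop the line without writing it back."""
        self.lines.pop(addr, None)

    def clean_invalidate(self, addr):
        """Write the line back if it is dirty, then drop it."""
        line = self.lines.pop(addr, None)
        if line is not None and line[1]:
            self.ram[addr] = line[0]
            self.writebacks += 1


def maintain_dirty_pages(dirty_pages, op):
    """Apply a maintenance op to the first byte of each dirty page only."""
    for page in sorted(dirty_pages):
        op(page * PAGE_SIZE)
    return len(dirty_pages)


cache = ToyHostCache(16 * PAGE_SIZE)
for page in range(16):
    cache.read(page * PAGE_SIZE)   # host holds a clean copy of every page
cache.ram[3 * PAGE_SIZE] = 7       # guest writes page 3 through an uncached mapping
dirty = {3}                        # what dirty page tracking would report
touched = maintain_dirty_pages(dirty, cache.invalidate)
print(touched, cache.read(3 * PAGE_SIZE), cache.read(0), cache.writebacks)
```

Only one of the sixteen pages is maintained, and because the host lines are
clean, passing cache.clean_invalidate instead gives the same result in this
model with no write-backs. The model says nothing about the relative cost of
the two operations on real SMP hardware.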



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-02-01 Thread Christoffer Dall
On Thu, Feb 1, 2018 at 10:33 AM, Ard Biesheuvel
 wrote:
> On 1 February 2018 at 09:17, Christoffer Dall
>  wrote:
>> On Wed, Jan 31, 2018 at 9:15 PM, Ard Biesheuvel
>>  wrote:
>>> On 31 January 2018 at 19:12, Christoffer Dall
>>>  wrote:
 On Wed, Jan 31, 2018 at 7:00 PM, Ard Biesheuvel
  wrote:
> On 31 January 2018 at 17:39, Christoffer Dall
>  wrote:
>> On Wed, Jan 31, 2018 at 5:59 PM, Ard Biesheuvel
>>  wrote:
>>> On 31 January 2018 at 16:53, Christoffer Dall
>>>  wrote:
 On Wed, Jan 31, 2018 at 4:18 PM, Ard Biesheuvel
  wrote:
> On 31 January 2018 at 09:53, Christoffer Dall
>  wrote:
>> On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
>>> On 29/01/18 10:04, Peter Maydell wrote:
>>> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert 
>>> >  wrote:
>>> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert 
>>> >>>  wrote:
>>>  * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>> > I think the correct fix here is that your test code should 
>>> > turn
>>> > its MMU on. Trying to treat guest RAM as uncacheable doesn't 
>>> > work
>>> > for Arm KVM guests (for the same reason that VGA device video 
>>> > memory
>>> > doesn't work). If it's RAM your guest has to arrange to map 
>>> > it as
>>> > Normal Cacheable, and then everything should work fine.
>>> 
>>>  Does this cause problems with migrating at just the wrong 
>>>  point during
>>>  a VM boot?
>>> >>>
>>> >>> It wouldn't surprise me if it did, but I don't think I've ever
>>> >>> tried to provoke that problem...
>>> >>
>>> >> If you think it'll get the RAM contents wrong, it might be best 
>>> >> to fail
>>> >> the migration if you can detect the cache is disabled in the 
>>> >> guest.
>>> >
>>> > I guess QEMU could look at the value of the "MMU 
>>> > disabled/enabled" bit
>>> > in the guest's system registers, and refuse migration if it's 
>>> > off...
>>> >
>>> > (cc'd Marc, Christoffer to check that I don't have the wrong end
>>> > of the stick about how thin the ice is in the period before the
>>> > guest turns on its MMU...)
>>>
>>> Once MMU and caches are on, we should be in a reasonable place for 
>>> QEMU
>>> to have a consistent view of the memory. The trick is to prevent the
>>> vcpus from changing that. A guest could perfectly turn off its MMU 
>>> at
>>> any given time if it needs to (and it is actually required on some 
>>> HW if
>>> you want to mitigate headlining CVEs), and KVM won't know about 
>>> that.
>>>
>>
>> (Clarification: KVM can detect this if it bothers to check the VCPU's
>> system registers, but we don't trap to KVM when the VCPU turns off 
>> its
>> caches, right?)
>>
>>> You may have to pause the vcpus before starting the migration, or
>>> introduce a new KVM feature that would automatically pause a vcpu 
>>> that
>>> is trying to disable its MMU while the migration is on. This would
>>> involve trapping all the virtual memory related system registers, 
>>> with
>>> an obvious cost. But that cost would be limited to the time it 
>>> takes to
>>> migrate the memory, so maybe that's acceptable.
>>>
>> Is that even sufficient?
>>
>> What if the following happened. (1) guest turns off MMU, (2) guest
>> writes some data directly to ram (3) qemu stops the vcpu (4) qemu 
>> reads
>> guest ram.  QEMU's view of guest ram is now incorrect (stale,
>> incoherent, ...).
>>
>> I'm also not really sure if pausing one VCPU because it turned off 
>> its
>> MMU will go very well when trying to migrate a large VM (wouldn't 
>> this
>> ask for all the other VCPUs beginning to complain that the stopped 
>> VCPU
>> appears to be dead?).  As a short-term 'fix' it's probably better to
>> refuse migration if you detect that a VCPU had begun turning off its
>> MMU.
>>
>> On the larger scale of things; this appears to me to be another case 
>> of
>> us really needing 

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-02-01 Thread Ard Biesheuvel
On 1 February 2018 at 09:17, Christoffer Dall
 wrote:
> On Wed, Jan 31, 2018 at 9:15 PM, Ard Biesheuvel
>  wrote:
>> On 31 January 2018 at 19:12, Christoffer Dall
>>  wrote:
>>> On Wed, Jan 31, 2018 at 7:00 PM, Ard Biesheuvel
>>>  wrote:
 On 31 January 2018 at 17:39, Christoffer Dall
  wrote:
> On Wed, Jan 31, 2018 at 5:59 PM, Ard Biesheuvel
>  wrote:
>> On 31 January 2018 at 16:53, Christoffer Dall
>>  wrote:
>>> On Wed, Jan 31, 2018 at 4:18 PM, Ard Biesheuvel
>>>  wrote:
 On 31 January 2018 at 09:53, Christoffer Dall
  wrote:
> On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
>> On 29/01/18 10:04, Peter Maydell wrote:
>> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert 
>> >  wrote:
>> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert 
>> >>>  wrote:
>>  * Peter Maydell (peter.mayd...@linaro.org) wrote:
>> > I think the correct fix here is that your test code should turn
>> > its MMU on. Trying to treat guest RAM as uncacheable doesn't 
>> > work
>> > for Arm KVM guests (for the same reason that VGA device video 
>> > memory
>> > doesn't work). If it's RAM your guest has to arrange to map it 
>> > as
>> > Normal Cacheable, and then everything should work fine.
>> 
>>  Does this cause problems with migrating at just the wrong point 
>>  during
>>  a VM boot?
>> >>>
>> >>> It wouldn't surprise me if it did, but I don't think I've ever
>> >>> tried to provoke that problem...
>> >>
>> >> If you think it'll get the RAM contents wrong, it might be best 
>> >> to fail
>> >> the migration if you can detect the cache is disabled in the 
>> >> guest.
>> >
>> > I guess QEMU could look at the value of the "MMU disabled/enabled" 
>> > bit
>> > in the guest's system registers, and refuse migration if it's 
>> > off...
>> >
>> > (cc'd Marc, Christoffer to check that I don't have the wrong end
>> > of the stick about how thin the ice is in the period before the
>> > guest turns on its MMU...)
>>
>> Once MMU and caches are on, we should be in a reasonable place for 
>> QEMU
>> to have a consistent view of the memory. The trick is to prevent the
>> vcpus from changing that. A guest could perfectly turn off its MMU at
>> any given time if it needs to (and it is actually required on some 
>> HW if
>> you want to mitigate headlining CVEs), and KVM won't know about that.
>>
>
> (Clarification: KVM can detect this if it bothers to check the VCPU's
> system registers, but we don't trap to KVM when the VCPU turns off its
> caches, right?)
>
>> You may have to pause the vcpus before starting the migration, or
>> introduce a new KVM feature that would automatically pause a vcpu 
>> that
>> is trying to disable its MMU while the migration is on. This would
>> involve trapping all the virtual memory related system registers, 
>> with
>> an obvious cost. But that cost would be limited to the time it takes 
>> to
>> migrate the memory, so maybe that's acceptable.
>>
> Is that even sufficient?
>
> What if the following happened. (1) guest turns off MMU, (2) guest
> writes some data directly to ram (3) qemu stops the vcpu (4) qemu 
> reads
> guest ram.  QEMU's view of guest ram is now incorrect (stale,
> incoherent, ...).
>
> I'm also not really sure if pausing one VCPU because it turned off its
> MMU will go very well when trying to migrate a large VM (wouldn't this
> ask for all the other VCPUs beginning to complain that the stopped 
> VCPU
> appears to be dead?).  As a short-term 'fix' it's probably better to
> refuse migration if you detect that a VCPU had begun turning off its
> MMU.
>
> On the larger scale of things; this appears to me to be another case of
> us really needing some way to coherently access memory between QEMU 
> and
> the VM, but in the case of the VCPU turning off the MMU prior to
> migration, we don't even know where it may have written data, and I'm
> therefore not really sure what the 'proper' 

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-02-01 Thread Christoffer Dall
On Wed, Jan 31, 2018 at 9:15 PM, Ard Biesheuvel
 wrote:
> On 31 January 2018 at 19:12, Christoffer Dall
>  wrote:
>> On Wed, Jan 31, 2018 at 7:00 PM, Ard Biesheuvel
>>  wrote:
>>> On 31 January 2018 at 17:39, Christoffer Dall
>>>  wrote:
 On Wed, Jan 31, 2018 at 5:59 PM, Ard Biesheuvel
  wrote:
> On 31 January 2018 at 16:53, Christoffer Dall
>  wrote:
>> On Wed, Jan 31, 2018 at 4:18 PM, Ard Biesheuvel
>>  wrote:
>>> On 31 January 2018 at 09:53, Christoffer Dall
>>>  wrote:
 On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
> On 29/01/18 10:04, Peter Maydell wrote:
> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert 
> >  wrote:
> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert 
> >>>  wrote:
>  * Peter Maydell (peter.mayd...@linaro.org) wrote:
> > I think the correct fix here is that your test code should turn
> > its MMU on. Trying to treat guest RAM as uncacheable doesn't 
> > work
> > for Arm KVM guests (for the same reason that VGA device video 
> > memory
> > doesn't work). If it's RAM your guest has to arrange to map it 
> > as
> > Normal Cacheable, and then everything should work fine.
> 
>  Does this cause problems with migrating at just the wrong point 
>  during
>  a VM boot?
> >>>
> >>> It wouldn't surprise me if it did, but I don't think I've ever
> >>> tried to provoke that problem...
> >>
> >> If you think it'll get the RAM contents wrong, it might be best to 
> >> fail
> >> the migration if you can detect the cache is disabled in the guest.
> >
> > I guess QEMU could look at the value of the "MMU disabled/enabled" 
> > bit
> > in the guest's system registers, and refuse migration if it's off...
> >
> > (cc'd Marc, Christoffer to check that I don't have the wrong end
> > of the stick about how thin the ice is in the period before the
> > guest turns on its MMU...)
>
> Once MMU and caches are on, we should be in a reasonable place for 
> QEMU
> to have a consistent view of the memory. The trick is to prevent the
> vcpus from changing that. A guest could perfectly turn off its MMU at
> any given time if it needs to (and it is actually required on some HW 
> if
> you want to mitigate headlining CVEs), and KVM won't know about that.
>

 (Clarification: KVM can detect this if it bothers to check the VCPU's
 system registers, but we don't trap to KVM when the VCPU turns off its
 caches, right?)

> You may have to pause the vcpus before starting the migration, or
> introduce a new KVM feature that would automatically pause a vcpu that
> is trying to disable its MMU while the migration is on. This would
> involve trapping all the virtual memory related system registers, with
> an obvious cost. But that cost would be limited to the time it takes 
> to
> migrate the memory, so maybe that's acceptable.
>
 Is that even sufficient?

 What if the following happened. (1) guest turns off MMU, (2) guest
 writes some data directly to ram (3) qemu stops the vcpu (4) qemu reads
 guest ram.  QEMU's view of guest ram is now incorrect (stale,
 incoherent, ...).

 I'm also not really sure if pausing one VCPU because it turned off its
 MMU will go very well when trying to migrate a large VM (wouldn't this
 ask for all the other VCPUs beginning to complain that the stopped VCPU
 appears to be dead?).  As a short-term 'fix' it's probably better to
 refuse migration if you detect that a VCPU had begun turning off its
 MMU.

 On the larger scale of things; this appears to me to be another case of
 us really needing some way to coherently access memory between QEMU and
 the VM, but in the case of the VCPU turning off the MMU prior to
 migration, we don't even know where it may have written data, and I'm
 therefore not really sure what the 'proper' solution would be.

 (cc'ing Ard who has thought about this problem before in the 
 context
 of UEFI and VGA.)

>>>
>>> Actually, the VGA case is much simpler because the host is not
>>> expected to write to 

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-31 Thread Ard Biesheuvel
On 31 January 2018 at 19:12, Christoffer Dall
 wrote:
> On Wed, Jan 31, 2018 at 7:00 PM, Ard Biesheuvel
>  wrote:
>> On 31 January 2018 at 17:39, Christoffer Dall
>>  wrote:
>>> On Wed, Jan 31, 2018 at 5:59 PM, Ard Biesheuvel
>>>  wrote:
 On 31 January 2018 at 16:53, Christoffer Dall
  wrote:
> On Wed, Jan 31, 2018 at 4:18 PM, Ard Biesheuvel
>  wrote:
>> On 31 January 2018 at 09:53, Christoffer Dall
>>  wrote:
>>> On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
 On 29/01/18 10:04, Peter Maydell wrote:
 > On 29 January 2018 at 09:53, Dr. David Alan Gilbert 
 >  wrote:
 >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
 >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert 
 >>>  wrote:
  * Peter Maydell (peter.mayd...@linaro.org) wrote:
 > I think the correct fix here is that your test code should turn
 > its MMU on. Trying to treat guest RAM as uncacheable doesn't work
 > for Arm KVM guests (for the same reason that VGA device video 
 > memory
 > doesn't work). If it's RAM your guest has to arrange to map it as
 > Normal Cacheable, and then everything should work fine.
 
  Does this cause problems with migrating at just the wrong point 
  during
  a VM boot?
 >>>
 >>> It wouldn't surprise me if it did, but I don't think I've ever
 >>> tried to provoke that problem...
 >>
 >> If you think it'll get the RAM contents wrong, it might be best to 
 >> fail
 >> the migration if you can detect the cache is disabled in the guest.
 >
 > I guess QEMU could look at the value of the "MMU disabled/enabled" 
 > bit
 > in the guest's system registers, and refuse migration if it's off...
 >
 > (cc'd Marc, Christoffer to check that I don't have the wrong end
 > of the stick about how thin the ice is in the period before the
 > guest turns on its MMU...)

 Once MMU and caches are on, we should be in a reasonable place for QEMU
 to have a consistent view of the memory. The trick is to prevent the
 vcpus from changing that. A guest could perfectly turn off its MMU at
 any given time if it needs to (and it is actually required on some HW 
 if
 you want to mitigate headlining CVEs), and KVM won't know about that.

>>>
>>> (Clarification: KVM can detect this if it bothers to check the VCPU's
>>> system registers, but we don't trap to KVM when the VCPU turns off its
>>> caches, right?)
>>>
>>>> You may have to pause the vcpus before starting the migration, or
>>>> introduce a new KVM feature that would automatically pause a vcpu that
>>>> is trying to disable its MMU while the migration is on. This would
>>>> involve trapping all the virtual memory related system registers, with
>>>> an obvious cost. But that cost would be limited to the time it takes to
>>>> migrate the memory, so maybe that's acceptable.
>>>>
>>> Is that even sufficient?
>>>
>>> What if the following happened. (1) guest turns off MMU, (2) guest
>>> writes some data directly to ram (3) qemu stops the vcpu (4) qemu reads
>>> guest ram.  QEMU's view of guest ram is now incorrect (stale,
>>> incoherent, ...).
>>>
>>> I'm also not really sure if pausing one VCPU because it turned off its
>>> MMU will go very well when trying to migrate a large VM (wouldn't this
>>> ask for all the other VCPUs beginning to complain that the stopped VCPU
>>> appears to be dead?).  As a short-term 'fix' it's probably better to
>>> refuse migration if you detect that a VCPU had begun turning off its
>>> MMU.
>>>
>>> On the larger scale of things, this appears to me to be another case of
>>> us really needing some way to coherently access memory between QEMU and
>>> the VM, but in the case of the VCPU turning off the MMU prior to
>>> migration, we don't even know where it may have written data, and I'm
>>> therefore not really sure what the 'proper' solution would be.
>>>
>>> (cc'ing Ard who has thought about this problem before in the context
>>> of UEFI and VGA.)
>>>
>>
>> Actually, the VGA case is much simpler because the host is not
>> expected to write to the framebuffer, only read from it, and the guest
>> is not expected to create a cacheable mapping for it, so any
>> incoherency can be trivially solved by cache invalidation on the host
>> side. (Note that this has nothing to do 

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-31 Thread Christoffer Dall
On Wed, Jan 31, 2018 at 7:00 PM, Ard Biesheuvel
 wrote:
> On 31 January 2018 at 17:39, Christoffer Dall
>  wrote:
>> On Wed, Jan 31, 2018 at 5:59 PM, Ard Biesheuvel
>>  wrote:
>>> On 31 January 2018 at 16:53, Christoffer Dall
>>>  wrote:
>>>> On Wed, Jan 31, 2018 at 4:18 PM, Ard Biesheuvel
>>>>  wrote:
>>>>> On 31 January 2018 at 09:53, Christoffer Dall
>>>>>  wrote:
>>>>>> On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
>>>>>>> On 29/01/18 10:04, Peter Maydell wrote:
>>>>>>> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert
>>>>>>> >  wrote:
>>>>>>> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>>>>>> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert
>>>>>>> >>>  wrote:
>>>>>>> >>>> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>>>>>> >>>>> I think the correct fix here is that your test code should turn
>>>>>>> >>>>> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
>>>>>>> >>>>> for Arm KVM guests (for the same reason that VGA device video memory
>>>>>>> >>>>> doesn't work). If it's RAM your guest has to arrange to map it as
>>>>>>> >>>>> Normal Cacheable, and then everything should work fine.
>>>>>>> >>>>
>>>>>>> >>>> Does this cause problems with migrating at just the wrong point during
>>>>>>> >>>> a VM boot?
>>>>>>> >>>
>>>>>>> >>> It wouldn't surprise me if it did, but I don't think I've ever
>>>>>>> >>> tried to provoke that problem...
>>>>>>> >>
>>>>>>> >> If you think it'll get the RAM contents wrong, it might be best to fail
>>>>>>> >> the migration if you can detect the cache is disabled in the guest.
>>>>>>> >
>>>>>>> > I guess QEMU could look at the value of the "MMU disabled/enabled" bit
>>>>>>> > in the guest's system registers, and refuse migration if it's off...
>>>>>>> >
>>>>>>> > (cc'd Marc, Christoffer to check that I don't have the wrong end
>>>>>>> > of the stick about how thin the ice is in the period before the
>>>>>>> > guest turns on its MMU...)
>>>>>>>
>>>>>>> Once MMU and caches are on, we should be in a reasonable place for QEMU
>>>>>>> to have a consistent view of the memory. The trick is to prevent the
>>>>>>> vcpus from changing that. A guest could perfectly turn off its MMU at
>>>>>>> any given time if it needs to (and it is actually required on some HW if
>>>>>>> you want to mitigate headlining CVEs), and KVM won't know about that.
>>>>>>>
>>>>>>
>>>>>> (Clarification: KVM can detect this if it bothers to check the VCPU's
>>>>>> system registers, but we don't trap to KVM when the VCPU turns off its
>>>>>> caches, right?)
>>>>>>
>>>>>>> You may have to pause the vcpus before starting the migration, or
>>>>>>> introduce a new KVM feature that would automatically pause a vcpu that
>>>>>>> is trying to disable its MMU while the migration is on. This would
>>>>>>> involve trapping all the virtual memory related system registers, with
>>>>>>> an obvious cost. But that cost would be limited to the time it takes to
>>>>>>> migrate the memory, so maybe that's acceptable.
>>>>>>>
>>>>>> Is that even sufficient?
>>>>>>
>>>>>> What if the following happened. (1) guest turns off MMU, (2) guest
>>>>>> writes some data directly to ram (3) qemu stops the vcpu (4) qemu reads
>>>>>> guest ram.  QEMU's view of guest ram is now incorrect (stale,
>>>>>> incoherent, ...).
>>>>>>
>>>>>> I'm also not really sure if pausing one VCPU because it turned off its
>>>>>> MMU will go very well when trying to migrate a large VM (wouldn't this
>>>>>> ask for all the other VCPUs beginning to complain that the stopped VCPU
>>>>>> appears to be dead?).  As a short-term 'fix' it's probably better to
>>>>>> refuse migration if you detect that a VCPU had begun turning off its
>>>>>> MMU.
>>>>>>
>>>>>> On the larger scale of things, this appears to me to be another case of
>>>>>> us really needing some way to coherently access memory between QEMU and
>>>>>> the VM, but in the case of the VCPU turning off the MMU prior to
>>>>>> migration, we don't even know where it may have written data, and I'm
>>>>>> therefore not really sure what the 'proper' solution would be.
>>>>>>
>>>>>> (cc'ing Ard who has thought about this problem before in the context
>>>>>> of UEFI and VGA.)
>>>>>>
>>>>>
>>>>> Actually, the VGA case is much simpler because the host is not
>>>>> expected to write to the framebuffer, only read from it, and the guest
>>>>> is not expected to create a cacheable mapping for it, so any
>>>>> incoherency can be trivially solved by cache invalidation on the host
>>>>> side. (Note that this has nothing to do with DMA coherency, but only
>>>>> with PCI MMIO BARs that are backed by DRAM in the host)
>>>>
>>>> In case of the running guest, the host will also only read from the
>>>> cached mapping.  Of

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-31 Thread Ard Biesheuvel
On 31 January 2018 at 17:39, Christoffer Dall
 wrote:
> On Wed, Jan 31, 2018 at 5:59 PM, Ard Biesheuvel
>  wrote:
>> On 31 January 2018 at 16:53, Christoffer Dall
>>  wrote:
>>> On Wed, Jan 31, 2018 at 4:18 PM, Ard Biesheuvel
>>>  wrote:
>>>> On 31 January 2018 at 09:53, Christoffer Dall
>>>>  wrote:
>>>>> On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
>>>>>> On 29/01/18 10:04, Peter Maydell wrote:
>>>>>> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert
>>>>>> >  wrote:
>>>>>> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>>>>> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert
>>>>>> >>>  wrote:
>>>>>> >>>> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>>>>> >>>>> I think the correct fix here is that your test code should turn
>>>>>> >>>>> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
>>>>>> >>>>> for Arm KVM guests (for the same reason that VGA device video memory
>>>>>> >>>>> doesn't work). If it's RAM your guest has to arrange to map it as
>>>>>> >>>>> Normal Cacheable, and then everything should work fine.
>>>>>> >>>>
>>>>>> >>>> Does this cause problems with migrating at just the wrong point during
>>>>>> >>>> a VM boot?
>>>>>> >>>
>>>>>> >>> It wouldn't surprise me if it did, but I don't think I've ever
>>>>>> >>> tried to provoke that problem...
>>>>>> >>
>>>>>> >> If you think it'll get the RAM contents wrong, it might be best to fail
>>>>>> >> the migration if you can detect the cache is disabled in the guest.
>>>>>> >
>>>>>> > I guess QEMU could look at the value of the "MMU disabled/enabled" bit
>>>>>> > in the guest's system registers, and refuse migration if it's off...
>>>>>> >
>>>>>> > (cc'd Marc, Christoffer to check that I don't have the wrong end
>>>>>> > of the stick about how thin the ice is in the period before the
>>>>>> > guest turns on its MMU...)
>>>>>>
>>>>>> Once MMU and caches are on, we should be in a reasonable place for QEMU
>>>>>> to have a consistent view of the memory. The trick is to prevent the
>>>>>> vcpus from changing that. A guest could perfectly turn off its MMU at
>>>>>> any given time if it needs to (and it is actually required on some HW if
>>>>>> you want to mitigate headlining CVEs), and KVM won't know about that.
>>>>>>
>>>>>
>>>>> (Clarification: KVM can detect this if it bothers to check the VCPU's
>>>>> system registers, but we don't trap to KVM when the VCPU turns off its
>>>>> caches, right?)
>>>>>
>>>>>> You may have to pause the vcpus before starting the migration, or
>>>>>> introduce a new KVM feature that would automatically pause a vcpu that
>>>>>> is trying to disable its MMU while the migration is on. This would
>>>>>> involve trapping all the virtual memory related system registers, with
>>>>>> an obvious cost. But that cost would be limited to the time it takes to
>>>>>> migrate the memory, so maybe that's acceptable.
>>>>>>
>>>>> Is that even sufficient?
>>>>>
>>>>> What if the following happened. (1) guest turns off MMU, (2) guest
>>>>> writes some data directly to ram (3) qemu stops the vcpu (4) qemu reads
>>>>> guest ram.  QEMU's view of guest ram is now incorrect (stale,
>>>>> incoherent, ...).
>>>>>
>>>>> I'm also not really sure if pausing one VCPU because it turned off its
>>>>> MMU will go very well when trying to migrate a large VM (wouldn't this
>>>>> ask for all the other VCPUs beginning to complain that the stopped VCPU
>>>>> appears to be dead?).  As a short-term 'fix' it's probably better to
>>>>> refuse migration if you detect that a VCPU had begun turning off its
>>>>> MMU.
>>>>>
>>>>> On the larger scale of things, this appears to me to be another case of
>>>>> us really needing some way to coherently access memory between QEMU and
>>>>> the VM, but in the case of the VCPU turning off the MMU prior to
>>>>> migration, we don't even know where it may have written data, and I'm
>>>>> therefore not really sure what the 'proper' solution would be.
>>>>>
>>>>> (cc'ing Ard who has thought about this problem before in the context
>>>>> of UEFI and VGA.)
>>>>>
>>>>
>>>> Actually, the VGA case is much simpler because the host is not
>>>> expected to write to the framebuffer, only read from it, and the guest
>>>> is not expected to create a cacheable mapping for it, so any
>>>> incoherency can be trivially solved by cache invalidation on the host
>>>> side. (Note that this has nothing to do with DMA coherency, but only
>>>> with PCI MMIO BARs that are backed by DRAM in the host)
>>>
>>> In case of the running guest, the host will also only read from the
>>> cached mapping.  Of course, at restore, the host will also write
>>> through a cached mapping, but shouldn't the latter case be solvable by
>>> having KVM clean the cache lines when faulting in any page?

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-31 Thread Christoffer Dall
On Wed, Jan 31, 2018 at 5:59 PM, Ard Biesheuvel
 wrote:
> On 31 January 2018 at 16:53, Christoffer Dall
>  wrote:
>> On Wed, Jan 31, 2018 at 4:18 PM, Ard Biesheuvel
>>  wrote:
>>> On 31 January 2018 at 09:53, Christoffer Dall
>>>  wrote:
>>>> On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
>>>>> On 29/01/18 10:04, Peter Maydell wrote:
>>>>> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert
>>>>> >  wrote:
>>>>> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>>>> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert
>>>>> >>>  wrote:
>>>>> >>>> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>>>> >>>>> I think the correct fix here is that your test code should turn
>>>>> >>>>> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
>>>>> >>>>> for Arm KVM guests (for the same reason that VGA device video memory
>>>>> >>>>> doesn't work). If it's RAM your guest has to arrange to map it as
>>>>> >>>>> Normal Cacheable, and then everything should work fine.
>>>>> >>>>
>>>>> >>>> Does this cause problems with migrating at just the wrong point during
>>>>> >>>> a VM boot?
>>>>> >>>
>>>>> >>> It wouldn't surprise me if it did, but I don't think I've ever
>>>>> >>> tried to provoke that problem...
>>>>> >>
>>>>> >> If you think it'll get the RAM contents wrong, it might be best to fail
>>>>> >> the migration if you can detect the cache is disabled in the guest.
>>>>> >
>>>>> > I guess QEMU could look at the value of the "MMU disabled/enabled" bit
>>>>> > in the guest's system registers, and refuse migration if it's off...
>>>>> >
>>>>> > (cc'd Marc, Christoffer to check that I don't have the wrong end
>>>>> > of the stick about how thin the ice is in the period before the
>>>>> > guest turns on its MMU...)
>>>>>
>>>>> Once MMU and caches are on, we should be in a reasonable place for QEMU
>>>>> to have a consistent view of the memory. The trick is to prevent the
>>>>> vcpus from changing that. A guest could perfectly turn off its MMU at
>>>>> any given time if it needs to (and it is actually required on some HW if
>>>>> you want to mitigate headlining CVEs), and KVM won't know about that.
>>>>>
>>>>
>>>> (Clarification: KVM can detect this if it bothers to check the VCPU's
>>>> system registers, but we don't trap to KVM when the VCPU turns off its
>>>> caches, right?)
>>>>
>>>>> You may have to pause the vcpus before starting the migration, or
>>>>> introduce a new KVM feature that would automatically pause a vcpu that
>>>>> is trying to disable its MMU while the migration is on. This would
>>>>> involve trapping all the virtual memory related system registers, with
>>>>> an obvious cost. But that cost would be limited to the time it takes to
>>>>> migrate the memory, so maybe that's acceptable.
>>>>>
>>>> Is that even sufficient?
>>>>
>>>> What if the following happened. (1) guest turns off MMU, (2) guest
>>>> writes some data directly to ram (3) qemu stops the vcpu (4) qemu reads
>>>> guest ram.  QEMU's view of guest ram is now incorrect (stale,
>>>> incoherent, ...).
>>>>
>>>> I'm also not really sure if pausing one VCPU because it turned off its
>>>> MMU will go very well when trying to migrate a large VM (wouldn't this
>>>> ask for all the other VCPUs beginning to complain that the stopped VCPU
>>>> appears to be dead?).  As a short-term 'fix' it's probably better to
>>>> refuse migration if you detect that a VCPU had begun turning off its
>>>> MMU.
>>>>
>>>> On the larger scale of things, this appears to me to be another case of
>>>> us really needing some way to coherently access memory between QEMU and
>>>> the VM, but in the case of the VCPU turning off the MMU prior to
>>>> migration, we don't even know where it may have written data, and I'm
>>>> therefore not really sure what the 'proper' solution would be.
>>>>
>>>> (cc'ing Ard who has thought about this problem before in the context
>>>> of UEFI and VGA.)
>>>>
>>>
>>> Actually, the VGA case is much simpler because the host is not
>>> expected to write to the framebuffer, only read from it, and the guest
>>> is not expected to create a cacheable mapping for it, so any
>>> incoherency can be trivially solved by cache invalidation on the host
>>> side. (Note that this has nothing to do with DMA coherency, but only
>>> with PCI MMIO BARs that are backed by DRAM in the host)
>>
>> In case of the running guest, the host will also only read from the
>> cached mapping.  Of course, at restore, the host will also write
>> through a cached mapping, but shouldn't the latter case be solvable by
>> having KVM clean the cache lines when faulting in any page?
>>
>
> We are still talking about the contents of the framebuffer, right? In
> that case, yes, afaict
>

I was talking about normal RAM actually... not sure if that changes anything?

>>>
>>> In the 

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-31 Thread Ard Biesheuvel
On 31 January 2018 at 16:53, Christoffer Dall
 wrote:
> On Wed, Jan 31, 2018 at 4:18 PM, Ard Biesheuvel
>  wrote:
>> On 31 January 2018 at 09:53, Christoffer Dall
>>  wrote:
>>> On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
>>>> On 29/01/18 10:04, Peter Maydell wrote:
>>>> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert
>>>> >  wrote:
>>>> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>>> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert
>>>> >>>  wrote:
>>>> >>>> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>>> >>>>> I think the correct fix here is that your test code should turn
>>>> >>>>> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
>>>> >>>>> for Arm KVM guests (for the same reason that VGA device video memory
>>>> >>>>> doesn't work). If it's RAM your guest has to arrange to map it as
>>>> >>>>> Normal Cacheable, and then everything should work fine.
>>>> >>>>
>>>> >>>> Does this cause problems with migrating at just the wrong point during
>>>> >>>> a VM boot?
>>>> >>>
>>>> >>> It wouldn't surprise me if it did, but I don't think I've ever
>>>> >>> tried to provoke that problem...
>>>> >>
>>>> >> If you think it'll get the RAM contents wrong, it might be best to fail
>>>> >> the migration if you can detect the cache is disabled in the guest.
>>>> >
>>>> > I guess QEMU could look at the value of the "MMU disabled/enabled" bit
>>>> > in the guest's system registers, and refuse migration if it's off...
>>>> >
>>>> > (cc'd Marc, Christoffer to check that I don't have the wrong end
>>>> > of the stick about how thin the ice is in the period before the
>>>> > guest turns on its MMU...)
>>>>
>>>> Once MMU and caches are on, we should be in a reasonable place for QEMU
>>>> to have a consistent view of the memory. The trick is to prevent the
>>>> vcpus from changing that. A guest could perfectly turn off its MMU at
>>>> any given time if it needs to (and it is actually required on some HW if
>>>> you want to mitigate headlining CVEs), and KVM won't know about that.
>>>>
>>>
>>> (Clarification: KVM can detect this if it bothers to check the VCPU's
>>> system registers, but we don't trap to KVM when the VCPU turns off its
>>> caches, right?)
>>>
>>>> You may have to pause the vcpus before starting the migration, or
>>>> introduce a new KVM feature that would automatically pause a vcpu that
>>>> is trying to disable its MMU while the migration is on. This would
>>>> involve trapping all the virtual memory related system registers, with
>>>> an obvious cost. But that cost would be limited to the time it takes to
>>>> migrate the memory, so maybe that's acceptable.
>>>>
>>> Is that even sufficient?
>>>
>>> What if the following happened. (1) guest turns off MMU, (2) guest
>>> writes some data directly to ram (3) qemu stops the vcpu (4) qemu reads
>>> guest ram.  QEMU's view of guest ram is now incorrect (stale,
>>> incoherent, ...).
>>>
>>> I'm also not really sure if pausing one VCPU because it turned off its
>>> MMU will go very well when trying to migrate a large VM (wouldn't this
>>> ask for all the other VCPUs beginning to complain that the stopped VCPU
>>> appears to be dead?).  As a short-term 'fix' it's probably better to
>>> refuse migration if you detect that a VCPU had begun turning off its
>>> MMU.
>>>
>>> On the larger scale of things, this appears to me to be another case of
>>> us really needing some way to coherently access memory between QEMU and
>>> the VM, but in the case of the VCPU turning off the MMU prior to
>>> migration, we don't even know where it may have written data, and I'm
>>> therefore not really sure what the 'proper' solution would be.
>>>
>>> (cc'ing Ard who has thought about this problem before in the context
>>> of UEFI and VGA.)
>>>
>>
>> Actually, the VGA case is much simpler because the host is not
>> expected to write to the framebuffer, only read from it, and the guest
>> is not expected to create a cacheable mapping for it, so any
>> incoherency can be trivially solved by cache invalidation on the host
>> side. (Note that this has nothing to do with DMA coherency, but only
>> with PCI MMIO BARs that are backed by DRAM in the host)
>
> In case of the running guest, the host will also only read from the
> cached mapping.  Of course, at restore, the host will also write
> through a cached mapping, but shouldn't the latter case be solvable by
> having KVM clean the cache lines when faulting in any page?
>

We are still talking about the contents of the framebuffer, right? In
that case, yes, afaict

>>
>> In the migration case, it is much more complicated, and I think
>> capturing the state of the VM in a way that takes incoherency between
>> caches and main memory into account is simply infeasible (i.e., the
>> act of recording the state of guest RAM via a cached mapping may evict
>> 

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-31 Thread Christoffer Dall
On Wed, Jan 31, 2018 at 4:18 PM, Ard Biesheuvel
 wrote:
> On 31 January 2018 at 09:53, Christoffer Dall
>  wrote:
>> On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
>>> On 29/01/18 10:04, Peter Maydell wrote:
>>> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert
>>> > wrote:
>>> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert
>>> >>>  wrote:
>>> >>>> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>> >>>>> I think the correct fix here is that your test code should turn
>>> >>>>> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
>>> >>>>> for Arm KVM guests (for the same reason that VGA device video memory
>>> >>>>> doesn't work). If it's RAM your guest has to arrange to map it as
>>> >>>>> Normal Cacheable, and then everything should work fine.
>>> >>>>
>>> >>>> Does this cause problems with migrating at just the wrong point during
>>> >>>> a VM boot?
>>> >>>
>>> >>> It wouldn't surprise me if it did, but I don't think I've ever
>>> >>> tried to provoke that problem...
>>> >>
>>> >> If you think it'll get the RAM contents wrong, it might be best to fail
>>> >> the migration if you can detect the cache is disabled in the guest.
>>> >
>>> > I guess QEMU could look at the value of the "MMU disabled/enabled" bit
>>> > in the guest's system registers, and refuse migration if it's off...
>>> >
>>> > (cc'd Marc, Christoffer to check that I don't have the wrong end
>>> > of the stick about how thin the ice is in the period before the
>>> > guest turns on its MMU...)
>>>
>>> Once MMU and caches are on, we should be in a reasonable place for QEMU
>>> to have a consistent view of the memory. The trick is to prevent the
>>> vcpus from changing that. A guest could perfectly turn off its MMU at
>>> any given time if it needs to (and it is actually required on some HW if
>>> you want to mitigate headlining CVEs), and KVM won't know about that.
>>>
>>
>> (Clarification: KVM can detect this if it bothers to check the VCPU's
>> system registers, but we don't trap to KVM when the VCPU turns off its
>> caches, right?)
>>
>>> You may have to pause the vcpus before starting the migration, or
>>> introduce a new KVM feature that would automatically pause a vcpu that
>>> is trying to disable its MMU while the migration is on. This would
>>> involve trapping all the virtual memory related system registers, with
>>> an obvious cost. But that cost would be limited to the time it takes to
>>> migrate the memory, so maybe that's acceptable.
>>>
>> Is that even sufficient?
>>
>> What if the following happened. (1) guest turns off MMU, (2) guest
>> writes some data directly to ram (3) qemu stops the vcpu (4) qemu reads
>> guest ram.  QEMU's view of guest ram is now incorrect (stale,
>> incoherent, ...).
>>
>> I'm also not really sure if pausing one VCPU because it turned off its
>> MMU will go very well when trying to migrate a large VM (wouldn't this
>> ask for all the other VCPUs beginning to complain that the stopped VCPU
>> appears to be dead?).  As a short-term 'fix' it's probably better to
>> refuse migration if you detect that a VCPU had begun turning off its
>> MMU.
>>
>> On the larger scale of things, this appears to me to be another case of
>> us really needing some way to coherently access memory between QEMU and
>> the VM, but in the case of the VCPU turning off the MMU prior to
>> migration, we don't even know where it may have written data, and I'm
>> therefore not really sure what the 'proper' solution would be.
>>
>> (cc'ing Ard who has thought about this problem before in the context
>> of UEFI and VGA.)
>>
>
> Actually, the VGA case is much simpler because the host is not
> expected to write to the framebuffer, only read from it, and the guest
> is not expected to create a cacheable mapping for it, so any
> incoherency can be trivially solved by cache invalidation on the host
> side. (Note that this has nothing to do with DMA coherency, but only
> with PCI MMIO BARs that are backed by DRAM in the host)

In case of the running guest, the host will also only read from the
cached mapping.  Of course, at restore, the host will also write
through a cached mapping, but shouldn't the latter case be solvable by
having KVM clean the cache lines when faulting in any page?
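
To make the two cases in the paragraph above concrete, here is a toy model in C, offered only as a sketch. It assumes a single byte of RAM shadowed by one write-back cache line; the struct and function names are hypothetical and correspond to nothing in QEMU or KVM. It shows a cacheable reader seeing stale data after an uncached write, and an uncached reader seeing stale RAM until the dirty line is cleaned. Real hardware can also evict or fill lines on its own, which this model deliberately ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model (an assumption, not real hardware): one byte of RAM
 * shadowed by a single write-back cache line. */
struct line_model {
    uint8_t ram;    /* what an uncached (MMU-off) access sees */
    uint8_t cache;  /* what a cacheable access sees while valid */
    bool valid;
    bool dirty;
};

/* Cacheable read: fill the line from RAM on a miss, then return it. */
uint8_t cached_read(struct line_model *m)
{
    assert(m != NULL);
    if (!m->valid) {
        m->cache = m->ram;
        m->valid = true;
        m->dirty = false;
    }
    return m->cache;
}

/* Cacheable write: lands in the cache only and marks the line dirty. */
void cached_write(struct line_model *m, uint8_t value)
{
    assert(m != NULL);
    m->cache = value;
    m->valid = true;
    m->dirty = true;
}

/* Uncached accesses bypass the cache line entirely. */
uint8_t uncached_read(const struct line_model *m)
{
    assert(m != NULL);
    return m->ram;
}

void uncached_write(struct line_model *m, uint8_t value)
{
    assert(m != NULL);
    m->ram = value;
}

/* Clean: write a dirty line back to RAM, keep it valid. */
void cache_clean(struct line_model *m)
{
    assert(m != NULL);
    if (m->valid && m->dirty) {
        m->ram = m->cache;
        m->dirty = false;
    }
}

/* Invalidate: drop the line without writing it back. */
void cache_invalidate(struct line_model *m)
{
    assert(m != NULL);
    m->valid = false;
    m->dirty = false;
}
```

In this model the save side is repaired by an invalidate before the cacheable read, and the restore side by a clean after the cacheable write, which is the role the question above assigns to KVM at fault-in time.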

>
> In the migration case, it is much more complicated, and I think
> capturing the state of the VM in a way that takes incoherency between
> caches and main memory into account is simply infeasible (i.e., the
> act of recording the state of guest RAM via a cached mapping may evict
> clean cachelines that are out of sync, and so it is impossible to
> record both the cached *and* the delta with the uncached state)

This may be an incredibly stupid question (and I may have asked it
before), but why can't we clean+invalidate the guest page before
reading it and 

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-31 Thread Dr. David Alan Gilbert
* Ard Biesheuvel (ard.biesheu...@linaro.org) wrote:
> On 31 January 2018 at 09:53, Christoffer Dall
>  wrote:
> > On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
> >> On 29/01/18 10:04, Peter Maydell wrote:
> >> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert
> >> >  wrote:
> >> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
> >> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert
> >> >>>  wrote:
> >> >>>> * Peter Maydell (peter.mayd...@linaro.org) wrote:
> >> >>>>> I think the correct fix here is that your test code should turn
> >> >>>>> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
> >> >>>>> for Arm KVM guests (for the same reason that VGA device video memory
> >> >>>>> doesn't work). If it's RAM your guest has to arrange to map it as
> >> >>>>> Normal Cacheable, and then everything should work fine.
> >> >>>>
> >> >>>> Does this cause problems with migrating at just the wrong point during
> >> >>>> a VM boot?
> >> >>>
> >> >>> It wouldn't surprise me if it did, but I don't think I've ever
> >> >>> tried to provoke that problem...
> >> >>
> >> >> If you think it'll get the RAM contents wrong, it might be best to fail
> >> >> the migration if you can detect the cache is disabled in the guest.
> >> >
> >> > I guess QEMU could look at the value of the "MMU disabled/enabled" bit
> >> > in the guest's system registers, and refuse migration if it's off...
> >> >
> >> > (cc'd Marc, Christoffer to check that I don't have the wrong end
> >> > of the stick about how thin the ice is in the period before the
> >> > guest turns on its MMU...)
> >>
> >> Once MMU and caches are on, we should be in a reasonable place for QEMU
> >> to have a consistent view of the memory. The trick is to prevent the
> >> vcpus from changing that. A guest could perfectly turn off its MMU at
> >> any given time if it needs to (and it is actually required on some HW if
> >> you want to mitigate headlining CVEs), and KVM won't know about that.
> >>
> >
> > (Clarification: KVM can detect this if it bothers to check the VCPU's
> > system registers, but we don't trap to KVM when the VCPU turns off its
> > caches, right?)
> >
> >> You may have to pause the vcpus before starting the migration, or
> >> introduce a new KVM feature that would automatically pause a vcpu that
> >> is trying to disable its MMU while the migration is on. This would
> >> involve trapping all the virtual memory related system registers, with
> >> an obvious cost. But that cost would be limited to the time it takes to
> >> migrate the memory, so maybe that's acceptable.
> >>
> > Is that even sufficient?
> >
> > What if the following happened. (1) guest turns off MMU, (2) guest
> > writes some data directly to ram (3) qemu stops the vcpu (4) qemu reads
> > guest ram.  QEMU's view of guest ram is now incorrect (stale,
> > incoherent, ...).
> >
> > I'm also not really sure if pausing one VCPU because it turned off its
> > MMU will go very well when trying to migrate a large VM (wouldn't this
> > ask for all the other VCPUs beginning to complain that the stopped VCPU
> > appears to be dead?).  As a short-term 'fix' it's probably better to
> > refuse migration if you detect that a VCPU had begun turning off its
> > MMU.
> >
> > On the larger scale of things, this appears to me to be another case of
> > us really needing some way to coherently access memory between QEMU and
> > the VM, but in the case of the VCPU turning off the MMU prior to
> > migration, we don't even know where it may have written data, and I'm
> > therefore not really sure what the 'proper' solution would be.
> >
> > (cc'ing Ard who has thought about this problem before in the context
> > of UEFI and VGA.)
> >
> 
> Actually, the VGA case is much simpler because the host is not
> expected to write to the framebuffer, only read from it, and the guest
> is not expected to create a cacheable mapping for it, so any
> incoherency can be trivially solved by cache invalidation on the host
> side. (Note that this has nothing to do with DMA coherency, but only
> with PCI MMIO BARs that are backed by DRAM in the host)
> 
> In the migration case, it is much more complicated, and I think
> capturing the state of the VM in a way that takes incoherency between
> caches and main memory into account is simply infeasible (i.e., the
> act of recording the state of guest RAM via a cached mapping may evict
> clean cachelines that are out of sync, and so it is impossible to
> record both the cached *and* the delta with the uncached state)
> 
> I wonder how difficult it would be to
> a) enable trapping of the MMU system register when a guest CPU is
> found to have its MMU off at migration time
> b) allow the guest CPU to complete whatever it thinks it needs to be
> doing with the MMU off
> c) once it re-enables the MMU, proceed with capturing the memory state
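
Steps a) to c) above can be read as a small state machine. The following C sketch is only an illustration under stated assumptions: the types, the `trap_sctlr` flag and both function names are hypothetical, and no such interface is known to exist in KVM or QEMU. The one architectural fact it relies on is that SCTLR_EL1.M (bit 0) is the MMU enable bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCTLR_EL1_M (UINT64_C(1) << 0)  /* MMU enable bit */

/* Hypothetical migration states for one vcpu. */
enum mig_state {
    MIG_WAIT_MMU_ON,  /* MMU found off: let the guest finish first */
    MIG_CAPTURE       /* safe (in this model) to capture memory state */
};

/* Hypothetical per-vcpu model; not a KVM or QEMU structure. */
struct vcpu_model {
    uint64_t sctlr_el1;
    bool trap_sctlr;  /* true while SCTLR_EL1 writes are trapped */
};

/* Step a): at migration time, enable trapping if the MMU is off. */
enum mig_state migration_requested(struct vcpu_model *v)
{
    assert(v != NULL);
    if (v->sctlr_el1 & SCTLR_EL1_M) {
        return MIG_CAPTURE;
    }
    v->trap_sctlr = true;
    return MIG_WAIT_MMU_ON;
}

/* Steps b) and c): called for each trapped SCTLR_EL1 write. The guest
 * keeps running with its MMU off until it sets the M bit again, at
 * which point trapping stops and capture may proceed. */
enum mig_state sctlr_write_trapped(struct vcpu_model *v,
                                   enum mig_state state,
                                   uint64_t new_value)
{
    assert(v != NULL);
    v->sctlr_el1 = new_value;
    if (state == MIG_WAIT_MMU_ON && (new_value & SCTLR_EL1_M)) {
        v->trap_sctlr = false;
        return MIG_CAPTURE;
    }
    return state;
}
```

The sketch tracks a single vcpu; a real implementation would have to wait until every vcpu reaches the capture state, and it leaves open the question of a guest that never turns its MMU back on.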
> 
> Full disclosure: I know very little 

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-31 Thread Ard Biesheuvel
On 31 January 2018 at 09:53, Christoffer Dall
 wrote:
> On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
>> On 29/01/18 10:04, Peter Maydell wrote:
>> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert
>> > wrote:
>> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert
>> >>>  wrote:
>> >>>> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>> >>>>> I think the correct fix here is that your test code should turn
>> >>>>> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
>> >>>>> for Arm KVM guests (for the same reason that VGA device video memory
>> >>>>> doesn't work). If it's RAM your guest has to arrange to map it as
>> >>>>> Normal Cacheable, and then everything should work fine.
>> >>>>
>> >>>> Does this cause problems with migrating at just the wrong point during
>> >>>> a VM boot?
>> >>>
>> >>> It wouldn't surprise me if it did, but I don't think I've ever
>> >>> tried to provoke that problem...
>> >>
>> >> If you think it'll get the RAM contents wrong, it might be best to fail
>> >> the migration if you can detect the cache is disabled in the guest.
>> >
>> > I guess QEMU could look at the value of the "MMU disabled/enabled" bit
>> > in the guest's system registers, and refuse migration if it's off...
>> >
>> > (cc'd Marc, Christoffer to check that I don't have the wrong end
>> > of the stick about how thin the ice is in the period before the
>> > guest turns on its MMU...)
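
The check suggested above (look at the guest's MMU enable bit and refuse migration if it is off) might look roughly like the following C sketch. It assumes the caller has already read SCTLR_EL1 for every vcpu into an array; the function names are hypothetical and are not QEMU or KVM APIs. The bit positions are architectural: M is bit 0 (the same bit the test code clears with `bic x0, x0, #(1<<0)`), and C, the data cacheability control, is bit 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCTLR_EL1_M (UINT64_C(1) << 0)  /* MMU enable */
#define SCTLR_EL1_C (UINT64_C(1) << 2)  /* data cacheability control */

/* Hypothetical helper: true if this vcpu has both MMU and data cache on. */
bool vcpu_mmu_and_cache_on(uint64_t sctlr_el1)
{
    const uint64_t need = SCTLR_EL1_M | SCTLR_EL1_C;
    return (sctlr_el1 & need) == need;
}

/* Hypothetical policy: allow migration only if every vcpu passes. */
bool migration_allowed(const uint64_t *sctlr_el1, size_t nr_vcpus)
{
    assert(sctlr_el1 != NULL || nr_vcpus == 0);
    for (size_t i = 0; i < nr_vcpus; i++) {
        if (!vcpu_mmu_and_cache_on(sctlr_el1[i])) {
            return false;
        }
    }
    return true;
}
```

Such a check is only a snapshot of the register state: it says nothing about writes the guest made earlier with its MMU off, which is the limitation the rest of this thread goes on to discuss.
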
>>
>> Once MMU and caches are on, we should be in a reasonable place for QEMU
>> to have a consistent view of the memory. The trick is to prevent the
>> vcpus from changing that. A guest could perfectly turn off its MMU at
>> any given time if it needs to (and it is actually required on some HW if
>> you want to mitigate headlining CVEs), and KVM won't know about that.
>>
>
> (Clarification: KVM can detect this if it bothers to check the VCPU's
> system registers, but we don't trap to KVM when the VCPU turns off its
> caches, right?)
>
>> You may have to pause the vcpus before starting the migration, or
>> introduce a new KVM feature that would automatically pause a vcpu that
>> is trying to disable its MMU while the migration is on. This would
>> involve trapping all the virtual memory related system registers, with
>> an obvious cost. But that cost would be limited to the time it takes to
>> migrate the memory, so maybe that's acceptable.
>>
> Is that even sufficient?
>
> What if the following happened. (1) guest turns off MMU, (2) guest
> writes some data directly to ram (3) qemu stops the vcpu (4) qemu reads
> guest ram.  QEMU's view of guest ram is now incorrect (stale,
> incoherent, ...).
>
> I'm also not really sure if pausing one VCPU because it turned off its
> MMU will go very well when trying to migrate a large VM (wouldn't this
> ask for all the other VCPUs beginning to complain that the stopped VCPU
> appears to be dead?).  As a short-term 'fix' it's probably better to
> refuse migration if you detect that a VCPU had begun turning off its
> MMU.
>
> On the larger scale of things, this appears to me to be another case of
> us really needing some way to coherently access memory between QEMU and
> the VM, but in the case of the VCPU turning off the MMU prior to
> migration, we don't even know where it may have written data, and I'm
> therefore not really sure what the 'proper' solution would be.
>
> (cc'ing Ard who has thought about this problem before in the context
> of UEFI and VGA.)
>

Actually, the VGA case is much simpler because the host is not
expected to write to the framebuffer, only read from it, and the guest
is not expected to create a cacheable mapping for it, so any
incoherency can be trivially solved by cache invalidation on the host
side. (Note that this has nothing to do with DMA coherency, but only
with PCI MMIO BARs that are backed by DRAM in the host)

In the migration case, it is much more complicated, and I think
capturing the state of the VM in a way that takes incoherency between
caches and main memory into account is simply infeasible (i.e., the
act of recording the state of guest RAM via a cached mapping may evict
clean cachelines that are out of sync, and so it is impossible to
record both the cached *and* the delta with the uncached state)

I wonder how difficult it would be to
a) enable trapping of the MMU system register when a guest CPU is
found to have its MMU off at migration time
b) allow the guest CPU to complete whatever it thinks it needs to be
doing with the MMU off
c) once it re-enables the MMU, proceed with capturing the memory state
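
Steps a) to c) can be read as a small state machine. The following is only a sketch of that idea, not existing KVM or QEMU code: `struct vcpu_model`, `migration_requested()` and `sctlr_write_trapped()` are hypothetical names, and the trap itself is assumed to be delivered by some new KVM feature that does not exist in this form today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where the (hypothetical) RAM capture currently stands. */
enum capture_state {
    CAPTURE_WAIT_MMU_ON,   /* guest runs with its MMU off, capture deferred */
    CAPTURE_RUNNING,       /* guest MMU is on, RAM capture may proceed */
};

/* Minimal model of one guest CPU; both fields are assumptions. */
struct vcpu_model {
    bool mmu_on;             /* current value of the guest's MMU enable bit */
    bool trap_sctlr_writes;  /* is trapping of the MMU system register on? */
};

/* Step a): at migration time, enable trapping only if the MMU is off. */
static enum capture_state migration_requested(struct vcpu_model *v)
{
    assert(v != NULL);
    if (v->mmu_on) {
        return CAPTURE_RUNNING;
    }
    v->trap_sctlr_writes = true;
    return CAPTURE_WAIT_MMU_ON;
}

/* Steps b) and c): the guest keeps running until a trapped write turns
 * the MMU back on; only then is trapping dropped and capture started. */
static enum capture_state sctlr_write_trapped(struct vcpu_model *v,
                                              enum capture_state state,
                                              bool new_mmu_on)
{
    assert(v != NULL);
    v->mmu_on = new_mmu_on;
    if (state == CAPTURE_WAIT_MMU_ON && new_mmu_on) {
        v->trap_sctlr_writes = false;
        return CAPTURE_RUNNING;
    }
    return state;
}
```

The sketch leaves open what happens if the guest turns its MMU off again after the capture has started.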

Full disclosure: I know very little about KVM migration ...



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-31 Thread Christoffer Dall
On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
> On 29/01/18 10:04, Peter Maydell wrote:
> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert  
> > wrote:
> >> * Peter Maydell (peter.mayd...@linaro.org) wrote:
> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert  
> >>> wrote:
> >>>> * Peter Maydell (peter.mayd...@linaro.org) wrote:
> >>>>> I think the correct fix here is that your test code should turn
> >>>>> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
> >>>>> for Arm KVM guests (for the same reason that VGA device video memory
> >>>>> doesn't work). If it's RAM your guest has to arrange to map it as
> >>>>> Normal Cacheable, and then everything should work fine.
> >>>>
> >>>> Does this cause problems with migrating at just the wrong point during
> >>>> a VM boot?
> >>>
> >>> It wouldn't surprise me if it did, but I don't think I've ever
> >>> tried to provoke that problem...
> >>
> >> If you think it'll get the RAM contents wrong, it might be best to fail
> >> the migration if you can detect the cache is disabled in the guest.
> > 
> > I guess QEMU could look at the value of the "MMU disabled/enabled" bit
> > in the guest's system registers, and refuse migration if it's off...
> > 
> > (cc'd Marc, Christoffer to check that I don't have the wrong end
> > of the stick about how thin the ice is in the period before the
> > guest turns on its MMU...)
> 
> Once MMU and caches are on, we should be in a reasonable place for QEMU
> to have a consistent view of the memory. The trick is to prevent the
> vcpus from changing that. A guest could perfectly turn off its MMU at
> any given time if it needs to (and it is actually required on some HW if
> you want to mitigate headlining CVEs), and KVM won't know about that.
> 

(Clarification: KVM can detect this if it bothers to check the VCPU's
system registers, but we don't trap to KVM when the VCPU turns off its
caches, right?)

> You may have to pause the vcpus before starting the migration, or
> introduce a new KVM feature that would automatically pause a vcpu that
> is trying to disable its MMU while the migration is on. This would
> involve trapping all the virtual memory related system registers, with
> an obvious cost. But that cost would be limited to the time it takes to
> migrate the memory, so maybe that's acceptable.
> 
Is that even sufficient?

What if the following happened. (1) guest turns off MMU, (2) guest
writes some data directly to ram (3) qemu stops the vcpu (4) qemu reads
guest ram.  QEMU's view of guest ram is now incorrect (stale,
incoherent, ...).
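
The four steps can be replayed on a toy model to see why stopping the vcpu does not help. This is a deliberately crude sketch with one byte of RAM and one host cache line. `struct toy_mem` and the function names are made up for illustration, and real cache behaviour (evictions, speculation) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One byte of guest RAM plus the host's cached copy of it. */
struct toy_mem {
    uint8_t ram;          /* what is actually in DRAM */
    uint8_t cached;       /* host cache line contents */
    bool    cache_valid;  /* does the host hold a cached copy? */
};

/* Host (QEMU) read through a cacheable mapping: served from the cache
 * whenever a copy is present, otherwise filled from DRAM. */
static uint8_t host_read(struct toy_mem *m)
{
    if (!m->cache_valid) {
        m->cached = m->ram;
        m->cache_valid = true;
    }
    return m->cached;
}

/* Guest write with its MMU off: modelled as going straight to DRAM
 * without touching the host's cached copy. */
static void guest_write_uncached(struct toy_mem *m, uint8_t value)
{
    m->ram = value;
}

/* Replays steps (1) to (4); returns the value QEMU ends up reading. */
static uint8_t replay_scenario(struct toy_mem *m, uint8_t new_value)
{
    assert(m != NULL);
    (void)host_read(m);                 /* host already has the old value cached */
    guest_write_uncached(m, new_value); /* (1) + (2): MMU off, write to RAM */
    /* (3): stopping the vcpu changes nothing in either copy */
    return host_read(m);                /* (4): QEMU reads guest RAM */
}
```

With `ram` initially 1 and `new_value` 2, the model returns 1 while DRAM holds 2, which is the stale view described above.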

I'm also not really sure if pausing one VCPU because it turned off its
MMU will go very well when trying to migrate a large VM (wouldn't this
ask for all the other VCPUs beginning to complain that the stopped VCPU
appears to be dead?).  As a short-term 'fix' it's probably better to
refuse migration if you detect that a VCPU had begun turning off its
MMU.

On the larger scale of things, this appears to me to be another case of
us really needing some way to coherently access memory between QEMU and
the VM, but in the case of the VCPU turning off the MMU prior to
migration, we don't even know where it may have written data, and I'm
therefore not really sure what the 'proper' solution would be.

(cc'ing Ard who has thought about this problem before in the context
of UEFI and VGA.)

Thanks,
-Christoffer



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-29 Thread Marc Zyngier
On 29/01/18 10:04, Peter Maydell wrote:
> On 29 January 2018 at 09:53, Dr. David Alan Gilbert  
> wrote:
>> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert  
>>> wrote:
>>>> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>>>> I think the correct fix here is that your test code should turn
>>>>> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
>>>>> for Arm KVM guests (for the same reason that VGA device video memory
>>>>> doesn't work). If it's RAM your guest has to arrange to map it as
>>>>> Normal Cacheable, and then everything should work fine.
>>>>
>>>> Does this cause problems with migrating at just the wrong point during
>>>> a VM boot?
>>>
>>> It wouldn't surprise me if it did, but I don't think I've ever
>>> tried to provoke that problem...
>>
>> If you think it'll get the RAM contents wrong, it might be best to fail
>> the migration if you can detect the cache is disabled in the guest.
> 
> I guess QEMU could look at the value of the "MMU disabled/enabled" bit
> in the guest's system registers, and refuse migration if it's off...
> 
> (cc'd Marc, Christoffer to check that I don't have the wrong end
> of the stick about how thin the ice is in the period before the
> guest turns on its MMU...)

Once MMU and caches are on, we should be in a reasonable place for QEMU
to have a consistent view of the memory. The trick is to prevent the
vcpus from changing that. A guest could perfectly turn off its MMU at
any given time if it needs to (and it is actually required on some HW if
you want to mitigate headlining CVEs), and KVM won't know about that.

You may have to pause the vcpus before starting the migration, or
introduce a new KVM feature that would automatically pause a vcpu that
is trying to disable its MMU while the migration is on. This would
involve trapping all the virtual memory related system registers, with
an obvious cost. But that cost would be limited to the time it takes to
migrate the memory, so maybe that's acceptable.

Thoughts?

M.
-- 
Jazz is not dead. It just smells funny...



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-29 Thread Dr. David Alan Gilbert
* Peter Maydell (peter.mayd...@linaro.org) wrote:
> On 29 January 2018 at 09:53, Dr. David Alan Gilbert  
> wrote:
> > * Peter Maydell (peter.mayd...@linaro.org) wrote:
> >> On 26 January 2018 at 19:46, Dr. David Alan Gilbert  
> >> wrote:
> >> > * Peter Maydell (peter.mayd...@linaro.org) wrote:
> >> >> I think the correct fix here is that your test code should turn
> >> >> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
> >> >> for Arm KVM guests (for the same reason that VGA device video memory
> >> >> doesn't work). If it's RAM your guest has to arrange to map it as
> >> >> Normal Cacheable, and then everything should work fine.
> >> >
> >> > Does this cause problems with migrating at just the wrong point during
> >> > a VM boot?
> >>
> >> It wouldn't surprise me if it did, but I don't think I've ever
> >> tried to provoke that problem...
> >
> > If you think it'll get the RAM contents wrong, it might be best to fail
> > the migration if you can detect the cache is disabled in the guest.
> 
> I guess QEMU could look at the value of the "MMU disabled/enabled" bit
> in the guest's system registers, and refuse migration if it's off...
> 
> (cc'd Marc, Christoffer to check that I don't have the wrong end
> of the stick about how thin the ice is in the period before the
> guest turns on its MMU...)

OK; I mean it's not nice either way - but a failed migration is better
than one with corrupt RAM.

It's not pretty for cloud users; they do host-evacuations and the like
fully automatically without any knowledge of the state of a VM and hope
for migrations to work in any state.

Dave

> thanks
> -- PMM
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-29 Thread Peter Maydell
On 29 January 2018 at 09:53, Dr. David Alan Gilbert  wrote:
> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert  
>> wrote:
>> > * Peter Maydell (peter.mayd...@linaro.org) wrote:
>> >> I think the correct fix here is that your test code should turn
>> >> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
>> >> for Arm KVM guests (for the same reason that VGA device video memory
>> >> doesn't work). If it's RAM your guest has to arrange to map it as
>> >> Normal Cacheable, and then everything should work fine.
>> >
>> > Does this cause problems with migrating at just the wrong point during
>> > a VM boot?
>>
>> It wouldn't surprise me if it did, but I don't think I've ever
>> tried to provoke that problem...
>
> If you think it'll get the RAM contents wrong, it might be best to fail
> the migration if you can detect the cache is disabled in the guest.

I guess QEMU could look at the value of the "MMU disabled/enabled" bit
in the guest's system registers, and refuse migration if it's off...

(cc'd Marc, Christoffer to check that I don't have the wrong end
of the stick about how thin the ice is in the period before the
guest turns on its MMU...)
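
A minimal sketch of such a check, assuming QEMU has already read each vCPU's SCTLR_EL1 value by whatever means it normally uses. The bit positions (M is bit 0, C is bit 2) are architectural; the helper names `vcpu_ram_view_is_coherent()` and `migration_allowed()` are hypothetical and are not existing QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SCTLR_EL1 bits: M enables the stage 1 MMU, C enables data caching. */
#define SCTLR_EL1_M (UINT64_C(1) << 0)
#define SCTLR_EL1_C (UINT64_C(1) << 2)

/* Hypothetical helper: true if a vCPU with this SCTLR_EL1 value has both
 * its MMU and its data cache enabled. */
static bool vcpu_ram_view_is_coherent(uint64_t sctlr_el1)
{
    return (sctlr_el1 & SCTLR_EL1_M) != 0 && (sctlr_el1 & SCTLR_EL1_C) != 0;
}

/* Hypothetical policy: refuse migration unless every vCPU passes. */
static bool migration_allowed(const uint64_t *sctlr_el1, size_t nr_vcpus)
{
    assert(sctlr_el1 != NULL || nr_vcpus == 0);
    for (size_t i = 0; i < nr_vcpus; i++) {
        if (!vcpu_ram_view_is_coherent(sctlr_el1[i])) {
            return false;
        }
    }
    return true;
}
```

This only samples the registers at one point in time. It says nothing about a guest that turns its MMU off after the check has run.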

thanks
-- PMM



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-29 Thread Dr. David Alan Gilbert
* Peter Maydell (peter.mayd...@linaro.org) wrote:
> On 26 January 2018 at 19:46, Dr. David Alan Gilbert  
> wrote:
> > * Peter Maydell (peter.mayd...@linaro.org) wrote:
> >> I think the correct fix here is that your test code should turn
> >> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
> >> for Arm KVM guests (for the same reason that VGA device video memory
> >> doesn't work). If it's RAM your guest has to arrange to map it as
> >> Normal Cacheable, and then everything should work fine.
> >
> > Does this cause problems with migrating at just the wrong point during
> > a VM boot?
> 
> It wouldn't surprise me if it did, but I don't think I've ever
> tried to provoke that problem...

If you think it'll get the RAM contents wrong, it might be best to fail
the migration if you can detect the cache is disabled in the guest.

Dave

> thanks
> -- PMM
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-28 Thread Peter Maydell
On 26 January 2018 at 19:46, Dr. David Alan Gilbert  wrote:
> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>> I think the correct fix here is that your test code should turn
>> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
>> for Arm KVM guests (for the same reason that VGA device video memory
>> doesn't work). If it's RAM your guest has to arrange to map it as
>> Normal Cacheable, and then everything should work fine.
>
> Does this cause problems with migrating at just the wrong point during
> a VM boot?

It wouldn't surprise me if it did, but I don't think I've ever
tried to provoke that problem...

thanks
-- PMM



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-26 Thread Dr. David Alan Gilbert
* Peter Maydell (peter.mayd...@linaro.org) wrote:
> On 26 January 2018 at 15:47, Wei Huang  wrote:
> >
> >
> > On 01/25/2018 02:05 PM, Dr. David Alan Gilbert wrote:
> >> * Wei Huang (w...@redhat.com) wrote:
> >>> innerloop:
> >>> /* clean cache because el2 might still cache guest data under KVM */
> >>> dc  civac, x2
> >>
> >> Can you explain a bit more about that please;  if it's guest
> >> visible across migration, doesn't that suggest we need the cache clear
> >> in KVM or QEMU?
> >
> > I think this is ARM specific. This is caused by the inconsistency
> > between guest VM's data accesses and userspace's accesses (in
> > check_guest_ram) at the destination:
> >
> > 1) Because uncacheable (guest) + cacheable (host) ==> uncacheable. So
> > the data accesses from guest VM go directly into memory.
> > 2) QEMU user space will use the cacheable version. So it is possible
> > that data can come from cache, instead of RAM. The difference between
> > (1) and (2) obviously can cause check_guest_ram() to fail sometimes.
> 
> I think the correct fix here is that your test code should turn
> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
> for Arm KVM guests (for the same reason that VGA device video memory
> doesn't work). If it's RAM your guest has to arrange to map it as
> Normal Cacheable, and then everything should work fine.

Does this cause problems with migrating at just the wrong point during
a VM boot?

Dave

> thanks
> -- PMM
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-26 Thread Wei Huang


On 01/26/2018 10:39 AM, Peter Maydell wrote:
> On 26 January 2018 at 15:47, Wei Huang  wrote:
>>
>>
>> On 01/25/2018 02:05 PM, Dr. David Alan Gilbert wrote:
>>> * Wei Huang (w...@redhat.com) wrote:
>>>> innerloop:
>>>> /* clean cache because el2 might still cache guest data under KVM */
>>>> dc  civac, x2
>>>
>>> Can you explain a bit more about that please;  if it's guest
>>> visible across migration, doesn't that suggest we need the cache clear
>>> in KVM or QEMU?
>>
>> I think this is ARM specific. This is caused by the inconsistency
>> between guest VM's data accesses and userspace's accesses (in
>> check_guest_ram) at the destination:
>>
>> 1) Because uncacheable (guest) + cacheable (host) ==> uncacheable. So
>> the data accesses from guest VM go directly into memory.
>> 2) QEMU user space will use the cacheable version. So it is possible
>> that data can come from cache, instead of RAM. The difference between
>> (1) and (2) obviously can cause check_guest_ram() to fail sometimes.
> 
> I think the correct fix here is that your test code should turn
> its MMU on. Trying to treat guest RAM as uncacheable doesn't work

I did try MMU-on while debugging this problem. It was a migration unit
test written on top of kvm-unit-tests and I forced QEMU's
tests/migration-test.c to use this .flat file as -kernel parameter. I
didn't see any memory check failures using this approach. So I can
confirm your suggestion.

But turning MMU on means a much larger binary blob. I know you don't
like the existing migration-test.c approach. Building a more complicated
test case and embedding it in migration-test.c code will make the
situation worse. Because of this, I chose the current version: small and
manageable (even with one extra cache flush instruction).

> for Arm KVM guests (for the same reason that VGA device video memory
> doesn't work). If it's RAM your guest has to arrange to map it as
> Normal Cacheable, and then everything should work fine.
> 
> thanks
> -- PMM
> 



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-26 Thread Peter Maydell
On 26 January 2018 at 15:47, Wei Huang  wrote:
>
>
> On 01/25/2018 02:05 PM, Dr. David Alan Gilbert wrote:
>> * Wei Huang (w...@redhat.com) wrote:
>>> innerloop:
>>> /* clean cache because el2 might still cache guest data under KVM */
>>> dc  civac, x2
>>
>> Can you explain a bit more about that please;  if it's guest
>> visible across migration, doesn't that suggest we need the cache clear
>> in KVM or QEMU?
>
> I think this is ARM specific. This is caused by the inconsistency
> between guest VM's data accesses and userspace's accesses (in
> check_guest_ram) at the destination:
>
> 1) Because uncacheable (guest) + cacheable (host) ==> uncacheable. So
> the data accesses from guest VM go directly into memory.
> 2) QEMU user space will use the cacheable version. So it is possible
> that data can come from cache, instead of RAM. The difference between
> (1) and (2) obviously can cause check_guest_ram() to fail sometimes.

I think the correct fix here is that your test code should turn
its MMU on. Trying to treat guest RAM as uncacheable doesn't work
for Arm KVM guests (for the same reason that VGA device video memory
doesn't work). If it's RAM your guest has to arrange to map it as
Normal Cacheable, and then everything should work fine.

thanks
-- PMM



Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-26 Thread Wei Huang


On 01/25/2018 02:05 PM, Dr. David Alan Gilbert wrote:
> * Wei Huang (w...@redhat.com) wrote:
>> This patch adds the migration test support for aarch64. The test code,
>> which implements the same functionality as x86, is compiled into a binary
>> and booted as a kernel in qemu. Here are the design ideas:
>>
>>  * We choose this -kernel design because aarch64 QEMU doesn't provide a
>>    built-in fw like x86 does. So instead of relying on a boot loader, we
>>    use -kernel approach for aarch64.
>>  * The serial output is sent to PL011 directly.
>>  * The physical memory base for mach-virt machine is 0x40000000. We change
>>    the start_address and end_address for aarch64.
>>
>> RFC->V1:
>>  * aarch64 kernel blob is defined as an uint32_t array
>>  * The test code is re-written to address a data caching issue under KVM.
>>    Tests passed under both x86 and aarch64.
>>  * Re-use init_bootfile_x86() for both x86 and aarch64
>>  * Other minor fixes
>>
>> Note that the test code is as the following:
>>
>> .section .text
>>
>> .globl  start
>> 
>> start:
>> /* disable MMU to use phys mem address */
>> mrs x0, sctlr_el1
>> bic x0, x0, #(1<<0)
>> msr sctlr_el1, x0
>> isb
>>
>> /* output char 'A' to PL011 */
>> mov w4, #65
>> mov x5, #0x9000000
>> strb w4, [x5]
>>
>> /* w6 keeps a counter so we can limit the output speed */
>> mov w6, #0
>>
>> /* phys mem base addr = 0x40000000 */
>> mov x3, #(0x40000000 + 100 *1024*1024) /* traverse 1M-100M */
>> mov x2, #(0x40000000 + 1 * 1024*1024)
>>
>> /* clean up memory first */
>> mov w1, #0  
>> clean:
>> strb w1, [x2]
>> add x2, x2, #(4096)
>> cmp x2, x3
>> ble clean
>>
>> /* main body */
>> mainloop:
>> mov x2, #(0x40000000 + 1 * 1024*1024)
>> 
>> innerloop:
>> /* clean cache because el2 might still cache guest data under KVM */
>> dc  civac, x2
> 
> Can you explain a bit more about that please;  if it's guest
> visible across migration, doesn't that suggest we need the cache clear
> in KVM or QEMU?

I think this is ARM specific. This is caused by the inconsistency
between guest VM's data accesses and userspace's accesses (in
check_guest_ram) at the destination:

1) Because uncacheable (guest) + cacheable (host) ==> uncacheable. So
the data accesses from guest VM go directly into memory.
2) QEMU user space will use the cacheable version. So it is possible
that data can come from cache, instead of RAM. The difference between
(1) and (2) obviously can cause check_guest_ram() to fail sometimes.

x86's EPT has the capability to ignore guest-provided memory type. So it
is possible to have consistent data views between guest and QEMU user-space.
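
Point 1) can be written down as a tiny rule. This is a simplified sketch with only two memory types, using the same wording as above. The enum and the function `combine_stages()` are illustrative names, and the real architecture distinguishes more attribute types than this.

```c
#include <assert.h>

/* Simplified memory types, ordered from most to least restrictive. */
enum mem_type {
    MEM_UNCACHEABLE = 0,
    MEM_CACHEABLE   = 1,
};

/* Combine the guest's (stage 1) type with the host's (stage 2) type.
 * The more restrictive of the two wins, so an uncacheable guest access
 * stays uncacheable even if the host maps the page as cacheable. */
static enum mem_type combine_stages(enum mem_type guest, enum mem_type host)
{
    assert(guest <= MEM_CACHEABLE && host <= MEM_CACHEABLE);
    return guest < host ? guest : host;
}
```

QEMU's own accesses in 2) are not subject to this combination, which is where the two views of the same RAM diverge.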


> 
> Dave
> 
>> ldrb w1, [x2]
>> add w1, w1, #1
>> and w1, w1, #(0xff)
>> strb w1, [x2]
>>
>> add x2, x2, #(4096)
>> cmp x2, x3
>> blt innerloop
>>
>> add w6, w6, #1
>> and w6, w6, #(0xff)
>> cmp w6, #0
>> bne mainloop
>> 
>> /* output char 'B' to PL011 */
>> mov w4, #66
>> mov x5, #0x9000000
>> strb w4, [x5]
>>
>> bl  mainloop
>>
>> The code is compiled with the following commands:
>>  > gcc -c -o fill.o fill.s
>>  > gcc -O2 -o fill.elf -Wl,-T,/tmp/flat.lds,--build-id=none,-Ttext=40080000 \
>>    -nostdlib fill.o
>>  > objcopy -O binary fill.elf fill.flat
>>  > truncate -c -s 144 ./fill.flat
>>  > xxd -g 4 -c 24 -e fill.flat | awk '{print "0x"$2 ", " "0x"$3 ", " "0x"C$4 ", " "0x"C$5 ", " "0x"$6 ", " "0x"C$7 "," }'
>>
>> The linker file (borrowed from KVM unit test) is defined as:
>>
>> SECTIONS
>> {
>> .text : { *(.init) *(.text) *(.text.*) }
>> . = ALIGN(64K);
>> etext = .;
>> .data : {
>> *(.data)
>> }
>> . = ALIGN(16);
>> .rodata : { *(.rodata) }
>> . = ALIGN(16);
>> .bss : { *(.bss) }
>> . = ALIGN(64K);
>> edata = .;
>> . += 64K;
>> . = ALIGN(64K);
>> /*
>>  * stack depth is 16K for arm and PAGE_SIZE for arm64, see THREAD_SIZE
>>  * sp must be 16 byte aligned for arm64, and 8 byte aligned for arm
>>  * sp must always be strictly less than the true stacktop
>>  */
>> stackptr = . - 16;
>> stacktop = .;
>> }
>>
>> ENTRY(start)
>>
>> Signed-off-by: Wei Huang 
>> ---
>>  tests/Makefile.include |  1 +
>>  tests/migration-test.c | 37 +++--
>>  2 files changed, 32 insertions(+), 6 deletions(-)
>>
>> diff --git a/tests/Makefile.include b/tests/Makefile.include
>> index b4bcc872f2..2a520e53ab 100644
>> --- a/tests/Makefile.include
>> +++ b/tests/Makefile.include
>> @@ -357,6 +357,7 @@ check-qtest-arm-y += 

Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-25 Thread Dr. David Alan Gilbert
* Wei Huang (w...@redhat.com) wrote:
> This patch adds the migration test support for aarch64. The test code,
> which implements the same functionality as x86, is compiled into a binary
> and booted as a kernel in qemu. Here are the design ideas:
> 
>  * We choose this -kernel design because aarch64 QEMU doesn't provide a
>    built-in fw like x86 does. So instead of relying on a boot loader, we
>    use -kernel approach for aarch64.
>  * The serial output is sent to PL011 directly.
>  * The physical memory base for mach-virt machine is 0x40000000. We change
>    the start_address and end_address for aarch64.
> 
> RFC->V1:
>  * aarch64 kernel blob is defined as an uint32_t array
>  * The test code is re-written to address a data caching issue under KVM.
>    Tests passed under both x86 and aarch64.
>  * Re-use init_bootfile_x86() for both x86 and aarch64
>  * Other minor fixes
> 
> Note that the test code is as the following:
> 
> .section .text
> 
> .globl  start
> 
> start:
> /* disable MMU to use phys mem address */
> mrs x0, sctlr_el1
> bic x0, x0, #(1<<0)
> msr sctlr_el1, x0
> isb
> 
> /* output char 'A' to PL011 */
> mov w4, #65
> mov x5, #0x9000000
> strb w4, [x5]
> 
> /* w6 keeps a counter so we can limit the output speed */
> mov w6, #0
> 
> /* phys mem base addr = 0x40000000 */
> mov x3, #(0x40000000 + 100 *1024*1024) /* traverse 1M-100M */
> mov x2, #(0x40000000 + 1 * 1024*1024)
> 
> /* clean up memory first */
> mov w1, #0  
> clean:
> strb w1, [x2]
> add x2, x2, #(4096)
> cmp x2, x3
> ble clean
> 
> /* main body */
> mainloop:
> mov x2, #(0x40000000 + 1 * 1024*1024)
> 
> innerloop:
> /* clean cache because el2 might still cache guest data under KVM */
> dc  civac, x2

Can you explain a bit more about that please;  if it's guest
visible across migration, doesn't that suggest we need the cache clear
in KVM or QEMU?

Dave

> ldrb w1, [x2]
> add w1, w1, #1
> and w1, w1, #(0xff)
> strb w1, [x2]
> 
> add x2, x2, #(4096)
> cmp x2, x3
> blt innerloop
> 
> add w6, w6, #1
> and w6, w6, #(0xff)
> cmp w6, #0
> bne mainloop
> 
> /* output char 'B' to PL011 */
> mov w4, #66
> mov x5, #0x9000000
> strb w4, [x5]
> 
> bl  mainloop
> 
> The code is compiled with the following commands:
>  > gcc -c -o fill.o fill.s
>  > gcc -O2 -o fill.elf -Wl,-T,/tmp/flat.lds,--build-id=none,-Ttext=40080000 \
>    -nostdlib fill.o
>  > objcopy -O binary fill.elf fill.flat
>  > truncate -c -s 144 ./fill.flat
>  > xxd -g 4 -c 24 -e fill.flat | awk '{print "0x"$2 ", " "0x"$3 ", " "0x"C$4 ", " "0x"C$5 ", " "0x"$6 ", " "0x"C$7 "," }'
> 
> The linker file (borrowed from KVM unit test) is defined as:
> 
> SECTIONS
> {
> .text : { *(.init) *(.text) *(.text.*) }
> . = ALIGN(64K);
> etext = .;
> .data : {
> *(.data)
> }
> . = ALIGN(16);
> .rodata : { *(.rodata) }
> . = ALIGN(16);
> .bss : { *(.bss) }
> . = ALIGN(64K);
> edata = .;
> . += 64K;
> . = ALIGN(64K);
> /*
>  * stack depth is 16K for arm and PAGE_SIZE for arm64, see THREAD_SIZE
>  * sp must be 16 byte aligned for arm64, and 8 byte aligned for arm
>  * sp must always be strictly less than the true stacktop
>  */
> stackptr = . - 16;
> stacktop = .;
> }
> 
> ENTRY(start)
> 
> Signed-off-by: Wei Huang 
> ---
>  tests/Makefile.include |  1 +
>  tests/migration-test.c | 37 +++--
>  2 files changed, 32 insertions(+), 6 deletions(-)
> 
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index b4bcc872f2..2a520e53ab 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -357,6 +357,7 @@ check-qtest-arm-y += tests/test-arm-mptimer$(EXESUF)
>  gcov-files-arm-y += hw/timer/arm_mptimer.c
>  
>  check-qtest-aarch64-y = tests/numa-test$(EXESUF)
> +check-qtest-aarch64-y += tests/migration-test$(EXESUF)
>  
>  check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
>  
> diff --git a/tests/migration-test.c b/tests/migration-test.c
> index be598d3257..3237fe93b2 100644
> --- a/tests/migration-test.c
> +++ b/tests/migration-test.c
> @@ -22,8 +22,8 @@
>  
>  #define MIN_NVRAM_SIZE 8192 /* from spapr_nvram.c */
>  
> -const unsigned start_address = 1024 * 1024;
> -const unsigned end_address = 100 * 1024 * 1024;
> +unsigned start_address = 1024 * 1024;
> +unsigned end_address = 100 * 1024 * 1024;
>  bool got_stop;
>  
>  #if defined(__linux__)
> @@ -79,7 +79,7 @@ static const char *tmpfs;
>  /* A simple PC boot sector that modifies memory (1-100MB) quickly
>   * 

[Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64

2018-01-24 Thread Wei Huang
This patch adds the migration test support for aarch64. The test code,
which implements the same functionality as x86, is compiled into a binary
and booted as a kernel in qemu. Here are the design ideas:

 * We choose this -kernel design because aarch64 QEMU doesn't provide a
   built-in fw like x86 does. So instead of relying on a boot loader, we
   use -kernel approach for aarch64.
 * The serial output is sent to PL011 directly.
 * The physical memory base for mach-virt machine is 0x40000000. We change
   the start_address and end_address for aarch64.

RFC->V1:
 * aarch64 kernel blob is defined as an uint32_t array
 * The test code is re-written to address a data caching issue under KVM.
   Tests passed under both x86 and aarch64.
 * Re-use init_bootfile_x86() for both x86 and aarch64
 * Other minor fixes

Note that the test code is as the following:

.section .text

.globl  start

start:
/* disable MMU to use phys mem address */
mrs x0, sctlr_el1
bic x0, x0, #(1<<0)
msr sctlr_el1, x0
isb

/* output char 'A' to PL011 */
mov w4, #65
mov x5, #0x9000000
strb w4, [x5]

/* w6 keeps a counter so we can limit the output speed */
mov w6, #0

/* phys mem base addr = 0x40000000 */
mov x3, #(0x40000000 + 100 *1024*1024) /* traverse 1M-100M */
mov x2, #(0x40000000 + 1 * 1024*1024)

/* clean up memory first */
mov w1, #0  
clean:
strb w1, [x2]
add x2, x2, #(4096)
cmp x2, x3
ble clean

/* main body */
mainloop:
mov x2, #(0x40000000 + 1 * 1024*1024)

innerloop:
/* clean cache because el2 might still cache guest data under KVM */
dc  civac, x2
ldrb w1, [x2]
add w1, w1, #1
and w1, w1, #(0xff)
strb w1, [x2]

add x2, x2, #(4096)
cmp x2, x3
blt innerloop

add w6, w6, #1
and w6, w6, #(0xff)
cmp w6, #0
bne mainloop

/* output char 'B' to PL011 */
mov w4, #66
mov x5, #0x9000000
strb w4, [x5]

bl  mainloop
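
For readers less familiar with aarch64 assembly, the memory pattern this loop produces can be modelled in C. This is a sketch for illustration only: `fill_pages()` mirrors the ldrb/add/strb sequence on a plain buffer, and `ram_is_consistent()` is an assumed invariant derived from that pattern (all per-page bytes equal, except for at most one step down by 1 where the guest was stopped mid-pass). It is not the actual code of check_guest_ram() in tests/migration-test.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUEST_PAGE_SIZE 4096u

/* Increment the first byte of each page in [start, end), stopping after
 * max_pages pages. Pass SIZE_MAX for a full pass; a smaller value models
 * a guest that was stopped part-way through a pass. */
static void fill_pages(uint8_t *ram, size_t start, size_t end, size_t max_pages)
{
    assert(ram != NULL && start <= end);
    size_t done = 0;
    for (size_t off = start; off < end && done < max_pages;
         off += GUEST_PAGE_SIZE, done++) {
        ram[off] = (uint8_t)(ram[off] + 1);  /* wraps at 0xff like the asm */
    }
}

/* Assumed invariant: per-page bytes are all equal, apart from at most one
 * point where the value drops by 1 (mod 256). */
static bool ram_is_consistent(const uint8_t *ram, size_t start, size_t end)
{
    assert(ram != NULL && start < end);
    uint8_t prev = ram[start];
    bool seen_transition = false;
    for (size_t off = start + GUEST_PAGE_SIZE; off < end;
         off += GUEST_PAGE_SIZE) {
        uint8_t cur = ram[off];
        if (cur == prev) {
            continue;
        }
        if (seen_transition || cur != (uint8_t)(prev - 1)) {
            return false;
        }
        seen_transition = true;
        prev = cur;
    }
    return true;
}
```

A stale or lost write on any single page breaks the invariant, which is how a caching mismatch between guest and host would show up as a test failure.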

The code is compiled with the following commands:
 > gcc -c -o fill.o fill.s
 > gcc -O2 -o fill.elf -Wl,-T,/tmp/flat.lds,--build-id=none,-Ttext=40080000 \
   -nostdlib fill.o
 > objcopy -O binary fill.elf fill.flat
 > truncate -c -s 144 ./fill.flat
 > xxd -g 4 -c 24 -e fill.flat | awk '{print "0x"$2 ", " "0x"$3 ", " "0x"C$4 ", " "0x"C$5 ", " "0x"$6 ", " "0x"C$7 "," }'

The linker file (borrowed from KVM unit test) is defined as:

SECTIONS
{
.text : { *(.init) *(.text) *(.text.*) }
. = ALIGN(64K);
etext = .;
.data : {
*(.data)
}
. = ALIGN(16);
.rodata : { *(.rodata) }
. = ALIGN(16);
.bss : { *(.bss) }
. = ALIGN(64K);
edata = .;
. += 64K;
. = ALIGN(64K);
/*
 * stack depth is 16K for arm and PAGE_SIZE for arm64, see THREAD_SIZE
 * sp must be 16 byte aligned for arm64, and 8 byte aligned for arm
 * sp must always be strictly less than the true stacktop
 */
stackptr = . - 16;
stacktop = .;
}

ENTRY(start)

Signed-off-by: Wei Huang 
---
 tests/Makefile.include |  1 +
 tests/migration-test.c | 37 +++--
 2 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index b4bcc872f2..2a520e53ab 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -357,6 +357,7 @@ check-qtest-arm-y += tests/test-arm-mptimer$(EXESUF)
 gcov-files-arm-y += hw/timer/arm_mptimer.c
 
 check-qtest-aarch64-y = tests/numa-test$(EXESUF)
+check-qtest-aarch64-y += tests/migration-test$(EXESUF)
 
 check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
 
diff --git a/tests/migration-test.c b/tests/migration-test.c
index be598d3257..3237fe93b2 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -22,8 +22,8 @@
 
 #define MIN_NVRAM_SIZE 8192 /* from spapr_nvram.c */
 
-const unsigned start_address = 1024 * 1024;
-const unsigned end_address = 100 * 1024 * 1024;
+unsigned start_address = 1024 * 1024;
+unsigned end_address = 100 * 1024 * 1024;
 bool got_stop;
 
 #if defined(__linux__)
@@ -79,7 +79,7 @@ static const char *tmpfs;
 /* A simple PC boot sector that modifies memory (1-100MB) quickly
  * outputing a 'B' every so often if it's still running.
  */
-unsigned char bootsect[] = {
+unsigned char x86_bootsect[] = {
   0xfa, 0x0f, 0x01, 0x16, 0x74, 0x7c, 0x66, 0xb8, 0x01, 0x00, 0x00, 0x00,
   0x0f, 0x22, 0xc0, 0x66, 0xea, 0x20, 0x7c, 0x00, 0x00, 0x08, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xe4, 0x92, 0x0c, 0x02,
@@ -125,11 +125,20 @@ unsigned char bootsect[] = {
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x55, 0xaa
 };
 
-static void init_bootfile_x86(const char