Re: [PATCH v7 0/4] arm: dirty page logging support for ARMv7

2014-06-09 Thread Mario Smarduch
On 06/08/2014 03:45 AM, Christoffer Dall wrote:
 On Tue, Jun 03, 2014 at 04:19:23PM -0700, Mario Smarduch wrote:
 This patch adds support for dirty page logging, so far tested only on ARMv7.
 With dirty page logging, GICv2 vGIC, and arch timer save/restore support,
 live migration is supported.

 Dirty page logging support -
 - initially write protects VM RAM memory regions - 2nd stage page tables
 - add support to read the dirty page log and again write protect the dirty
   pages - second stage page tables - for the next pass.
 - second stage huge pages are dissolved into page tables to keep track of
   dirty pages at page granularity. Tracking at huge page granularity limits
   migration to an almost idle system. There are a couple of approaches to
   handling huge pages:
   1 - break up the huge page into a page table and write protect all ptes
   2 - clear the PMD entry, create a page table, install the faulted page entry,
   and write protect it.
 
 not sure I fully understand.  Is option 2 simply write-protecting all
 PMDs and splitting it at fault time?

No, that's option 1 above. Option 2 is the optimized solution you describe in the
patch 4 review - clear the PMD and let stage2_set_pte() allocate a page table and
install the pte; after that it's demand faulting on future accesses to that PMD
range.
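
Roughly, a minimal sketch of the option 2 dissolve step (hypothetical helper name;
the real fault path also handles the mmu_lock, the pte memory cache and the
user_mem_abort() plumbing, which are omitted here):

    /*
     * When a write fault hits a range still mapped by a stage-2 huge PMD
     * while dirty logging is enabled, drop the block mapping so a regular
     * page table can be installed for the faulted page.
     */
    static void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
    {
        if (!kvm_pmd_huge(*pmd))
            return;                         /* already a page table */

        pmd_clear(pmd);                     /* drop the 2MB block mapping */
        kvm_tlb_flush_vmid_ipa(kvm, addr);  /* invalidate the stale huge entry */
        /*
         * stage2_set_pte() then allocates a page table from the memcache,
         * plugs it into the cleared PMD and installs a single write-protected
         * pte for the faulted page; the rest of the 2MB range is repopulated
         * by demand faults.
         */
    }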
 

   This patch implements #2; in the future #1 may be implemented depending on
   more benchmark results.

   Option 1: may overcommit and do unnecessary work, but on heavy loads appears
   to converge faster during live migration.
   Option 2: only write protects pages that are accessed; migration time varies
   and takes longer than Option 1, but eventually catches up.

 - In the event migration is canceled, normal behavior is resumed and huge pages
   are rebuilt over time.
 - Another alternative is the use of reverse mappings, where pointers to sptes
   are maintained for each level of the 2nd stage tables (PTE, PMD, PUD), as in
   the x86 implementation. The primary reverse mapping benefit is for mmu
   notifier invalidations of large memory ranges. Reverse mappings also improve
   dirty page logging: instead of walking page tables, spte pointers are
   accessed directly via the reverse map array.
 - Reverse mappings will be considered for future support once the current
   implementation is hardened.
 
 Is the following a list of your future work?

I guess yes and no. With the exception of lmbench I've already run these tests,
and a couple of other folks have tested prior revisions. I'll run more
(overnight, burn-in tests) adding lmbench, but I'm hoping others will run tests
too, to give this more run time, different loads and so on.
 
   o validate current dirty page logging support
   o VMID TLB Flushing, migrating multiple guests
   o GIC/arch-timer migration
   o migration under various loads, primarily page reclaim, and validate current
     mmu-notifiers
   o Run benchmarks (lmbench for now), test the impact on performance, and
     optimize
   o Test virtio - since it writes into guest memory. Wait until PCI is
     supported on ARM.
 
 So you're not testing with virtio now?  Your command line below seems to
 suggest that in fact you are.  /me confused.

Yes, so I've seen no errors with the virtio-mmio transport and the
virtio-net-device/virtio-blk-device backends under moderate loads. But virtio
inbound is purely user space (QEMU in this case), so I can't say with certainty
that virtio is 100%. Some time back I found problems with virtio-mmio: when the
transport and backend are not fused together, none of the performance options
(UFO, TSO, partial checksum...) got applied, like they did for virtio-net-pci.
So to summarize, I need to see how virtio tracks dirty pages for virtio-mmio
and virtio-pci in QEMU. I have a fair idea where to look but have not done so
yet.


 
   o Currently on ARM, KVM doesn't appear to write into guest address space;
     need to mark those pages dirty too (???).
 
 not sure what you mean here, can you expand?

For a few architectures KVM writes into guest memory; one example is PV-EOI,
which writes into guest memory to disable/enable PV-EOI while injecting an
interrupt, based on the number of in-flight interrupts. There is other code
that does this too, but I'm not familiar with all the use cases. So if we do
that on ARM, the page(s) must be marked dirty.
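
For reference, the generic KVM pattern is that any hypervisor-side store into
guest memory marks the corresponding gfn dirty so the next dirty-log pass picks
it up. A minimal sketch (the helper name is illustrative; gfn_to_hva(),
kvm_is_error_hva() and mark_page_dirty() are existing generic KVM functions, and
kvm_write_guest() already does the equivalent internally via
kvm_write_guest_page()):

    /* Write a value into guest memory and make sure the dirty log sees it. */
    static int poke_guest_u32(struct kvm *kvm, gfn_t gfn, int offset, u32 val)
    {
        unsigned long hva = gfn_to_hva(kvm, gfn);

        if (kvm_is_error_hva(hva))
            return -EFAULT;
        if (copy_to_user((void __user *)(hva + offset), &val, sizeof(val)))
            return -EFAULT;

        mark_page_dirty(kvm, gfn);   /* so migration retransmits this page */
        return 0;
    }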

 
 - Move on to ARMv8, since the 2nd stage MMU is shared between both
   architectures. But in addition to the dirty page log, additional support for
   GIC, arch timers, and emulated devices is required. Also, working on an
   emulated platform masks a lot of potential bugs, but does help to get the
   majority of the code working.

 Test Environment:
 ---
 NOTE: Running on Fast Models will hardly ever fail and will mask bugs; in fact,
   initially light loads were succeeding without dirty page logging support.
 ---
 - Will put all components on github, including test setup diagram
 - In short 

Re: [PATCH v7 0/4] arm: dirty page logging support for ARMv7

2014-06-08 Thread Christoffer Dall
On Tue, Jun 03, 2014 at 04:19:23PM -0700, Mario Smarduch wrote:
 This patch adds support for dirty page logging, so far tested only on ARMv7.
 With dirty page logging, GICv2 vGIC, and arch timer save/restore support, live
 migration is supported.
 
 Dirty page logging support -
 - initially write protects VM RAM memory regions - 2nd stage page tables
 - add support to read the dirty page log and again write protect the dirty
   pages - second stage page tables - for the next pass.
 - second stage huge pages are dissolved into page tables to keep track of
   dirty pages at page granularity. Tracking at huge page granularity limits
   migration to an almost idle system. There are a couple of approaches to
   handling huge pages:
   1 - break up the huge page into a page table and write protect all ptes
   2 - clear the PMD entry, create a page table, install the faulted page entry,
   and write protect it.

not sure I fully understand.  Is option 2 simply write-protecting all
PMDs and splitting it at fault time?

 
   This patch implements #2; in the future #1 may be implemented depending on
   more benchmark results.

   Option 1: may overcommit and do unnecessary work, but on heavy loads appears
   to converge faster during live migration.
   Option 2: only write protects pages that are accessed; migration time varies
   and takes longer than Option 1, but eventually catches up.
 
 - In the event migration is canceled, normal behavior is resumed and huge pages
   are rebuilt over time.
 - Another alternative is the use of reverse mappings, where pointers to sptes
   are maintained for each level of the 2nd stage tables (PTE, PMD, PUD), as in
   the x86 implementation. The primary reverse mapping benefit is for mmu
   notifier invalidations of large memory ranges. Reverse mappings also improve
   dirty page logging: instead of walking page tables, spte pointers are
   accessed directly via the reverse map array.
 - Reverse mappings will be considered for future support once the current
   implementation is hardened.

Is the following a list of your future work?

   o validate current dirty page logging support
   o VMID TLB Flushing, migrating multiple guests
   o GIC/arch-timer migration
   o migration under various loads, primarily page reclaim, and validate current
     mmu-notifiers
   o Run benchmarks (lmbench for now), test the impact on performance, and
     optimize
   o Test virtio - since it writes into guest memory. Wait until PCI is
     supported on ARM.

So you're not testing with virtio now?  Your command line below seems to
suggest that in fact you are.  /me confused.

   o Currently on ARM, KVM doesn't appear to write into guest address space;
     need to mark those pages dirty too (???).

not sure what you mean here, can you expand?

 - Move on to ARMv8, since the 2nd stage MMU is shared between both
   architectures. But in addition to the dirty page log, additional support for
   GIC, arch timers, and emulated devices is required. Also, working on an
   emulated platform masks a lot of potential bugs, but does help to get the
   majority of the code working.
 
 Test Environment:
 ---
 NOTE: Running on Fast Models will hardly ever fail and will mask bugs; in fact,
   initially light loads were succeeding without dirty page logging support.
 ---
 - Will put all components on github, including test setup diagram
 - In short summary
   o Two ARM Exynos 5440 development platforms - 4-way 1.7 GHz, with 8GB RAM,
     256GB storage, 1 Gbps Ethernet, with swap enabled
   o NFS server running Ubuntu 13.04
     - both ARM boards mount a shared file system
     - the shared file system includes QEMU, the guest kernel, DTB, and
       multiple Ext3 root file systems
   o Component versions: qemu-1.7.5, vexpress-a15, host/guest kernel 3.15-rc1
   o Use QEMU Ctrl-a c and the 'migrate -d tcp:IP:port' command
     - Destination command syntax: -smp can be changed to 4; the machine model
       is outdated but has been tested on virt by others (need to upgrade)
   
   /mnt/migration/qemu-system-arm -enable-kvm -smp 2 -kernel \
   /mnt/migration/zImage -dtb /mnt/migration/guest-a15.dtb -m 1792 \
   -M vexpress-a15 -cpu cortex-a15 -nographic \
   -append "root=/dev/vda rw console=ttyAMA0 rootwait" \
   -drive if=none,file=/mnt/migration/guest1.root,id=vm1 \
   -device virtio-blk-device,drive=vm1 \
   -netdev type=tap,id=net0,ifname=tap0 \
   -device virtio-net-device,netdev=net0,mac=52:54:00:12:34:58 \
   -incoming tcp:0:4321
 
 - Source command syntax is the same, except without '-incoming'
 
   o Migration of multiple VMs (using tap0, tap1, ..., and guest0.root, .) has
     been tested as well.
   o On source run multiple copies of 'dirtyram.arm' - a simple program to dirty
     pages periodically.
     ./dirtyram.arm <total mmap size> <dirty page size> <sleep time>
     Example:
     ./dirtyram.arm 102580 812 30
 - dirty 

[PATCH v7 0/4] arm: dirty page logging support for ARMv7

2014-06-03 Thread Mario Smarduch
This patch adds support for dirty page logging, so far tested only on ARMv7.
With dirty page logging, GICv2 vGIC, and arch timer save/restore support, live
migration is supported.

Dirty page logging support -
- initially write protects VM RAM memory regions - 2nd stage page tables
- add support to read the dirty page log and again write protect the dirty
  pages - second stage page tables - for the next pass (a userspace sketch of
  this step follows this list).
- second stage huge pages are dissolved into page tables to keep track of
  dirty pages at page granularity. Tracking at huge page granularity limits
  migration to an almost idle system. There are a couple of approaches to
  handling huge pages:
  1 - break up the huge page into a page table and write protect all ptes
  2 - clear the PMD entry, create a page table, install the faulted page entry,
  and write protect it.

  This patch implements #2; in the future #1 may be implemented depending on
  more benchmark results.

  Option 1: may overcommit and do unnecessary work, but on heavy loads appears
  to converge faster during live migration.
  Option 2: only write protects pages that are accessed; migration time varies
  and takes longer than Option 1, but eventually catches up.

- In the event migration is canceled, normal behavior is resumed and huge pages
  are rebuilt over time.
- Another alternative is the use of reverse mappings, where pointers to sptes
  are maintained for each level of the 2nd stage tables (PTE, PMD, PUD), as in
  the x86 implementation. The primary reverse mapping benefit is for mmu
  notifier invalidations of large memory ranges. Reverse mappings also improve
  dirty page logging: instead of walking page tables, spte pointers are
  accessed directly via the reverse map array.
- Reverse mappings will be considered for future support once the current
  implementation is hardened.
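
For context, the userspace side of the "read dirty page log" step above is the
standard KVM_GET_DIRTY_LOG ioctl. A minimal sketch is below; the ioctl, the
KVM_MEM_LOG_DIRTY_PAGES flag, and struct kvm_dirty_log come from the KVM UAPI,
while the function and variable names are just illustrative:

    #include <linux/kvm.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ioctl.h>

    /*
     * Fetch (and implicitly clear) the dirty bitmap of one memory slot that
     * was registered with KVM_MEM_LOG_DIRTY_PAGES; one bit per guest page.
     * With this series, pages reported dirty are write protected again in
     * the 2nd stage tables so the next pass catches new writes.
     */
    static void *fetch_dirty_bitmap(int vm_fd, uint32_t slot, uint64_t npages)
    {
        struct kvm_dirty_log log;
        size_t len = ((npages + 63) / 64) * sizeof(uint64_t);
        void *bitmap = calloc(1, len);

        if (!bitmap)
            return NULL;

        memset(&log, 0, sizeof(log));
        log.slot = slot;
        log.dirty_bitmap = bitmap;

        if (ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log) < 0) {
            free(bitmap);
            return NULL;
        }
        return bitmap;          /* caller scans set bits and resends pages */
    }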
  o validate current dirty page logging support
  o VMID TLB Flushing, migrating multiple guests
  o GIC/arch-timer migration
  o migration under various loads, primarily page reclaim, and validate current
    mmu-notifiers
  o Run benchmarks (lmbench for now), test the impact on performance, and
    optimize
  o Test virtio - since it writes into guest memory. Wait until PCI is supported
    on ARM.
  o Currently on ARM, KVM doesn't appear to write into guest address space;
    need to mark those pages dirty too (???).
- Move on to ARMv8, since the 2nd stage MMU is shared between both
  architectures. But in addition to the dirty page log, additional support for
  GIC, arch timers, and emulated devices is required. Also, working on an
  emulated platform masks a lot of potential bugs, but does help to get the
  majority of the code working.

Test Environment:
---
NOTE: Running on Fast Models will hardly ever fail and will mask bugs; in fact,
  initially light loads were succeeding without dirty page logging support.
---
- Will put all components on github, including test setup diagram
- In short summary
  o Two ARM Exynos 5440 development platforms - 4-way 1.7 GHz, with 8GB RAM,
    256GB storage, 1 Gbps Ethernet, with swap enabled
  o NFS server running Ubuntu 13.04
    - both ARM boards mount a shared file system
    - the shared file system includes QEMU, the guest kernel, DTB, and multiple
      Ext3 root file systems
  o Component versions: qemu-1.7.5, vexpress-a15, host/guest kernel 3.15-rc1
  o Use QEMU Ctrl-a c and the 'migrate -d tcp:IP:port' command
    - Destination command syntax: -smp can be changed to 4; the machine model
      is outdated but has been tested on virt by others (need to upgrade)

/mnt/migration/qemu-system-arm -enable-kvm -smp 2 -kernel \
/mnt/migration/zImage -dtb /mnt/migration/guest-a15.dtb -m 1792 \
-M vexpress-a15 -cpu cortex-a15 -nographic \
-append "root=/dev/vda rw console=ttyAMA0 rootwait" \
-drive if=none,file=/mnt/migration/guest1.root,id=vm1 \
-device virtio-blk-device,drive=vm1 \
-netdev type=tap,id=net0,ifname=tap0 \
-device virtio-net-device,netdev=net0,mac=52:54:00:12:34:58 \
-incoming tcp:0:4321

- Source command syntax is the same, except without '-incoming'

  o Migration of multiple VMs (using tap0, tap1, ..., and guest0.root, .) has
    been tested as well.
  o On source run multiple copies of 'dirtyram.arm' - a simple program to dirty
    pages periodically (a sketch of such a program appears after this list).
    ./dirtyram.arm <total mmap size> <dirty page size> <sleep time>
    Example:
    ./dirtyram.arm 102580 812 30
    - dirty 102580 pages
    - 812 pages every 30ms with an incrementing counter
- run anywhere from one to as many copies as VM resources can support. If
  the dirty rate is too high migration will run indefinitely
- run date output loop, check date is picked up smoothly
- place guest/host into page reclaim/swap mode - by whatever means; in this
  case run multiple copies of 'dirtyram.arm' on the host
- issue migrate command(s) on source
- Top result
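
For reference, a minimal sketch of what such a dirty-page generator might look
like (an illustrative reconstruction from the description above, not the actual
dirtyram.arm source):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/mman.h>

    /* mmap <total> pages and keep rewriting <chunk> pages every <sleep ms>
     * milliseconds with an incrementing counter, so the pages stay dirty. */
    int main(int argc, char **argv)
    {
        if (argc != 4) {
            fprintf(stderr, "usage: %s <total pages> <pages per pass> <sleep ms>\n",
                    argv[0]);
            return 1;
        }

        long page = sysconf(_SC_PAGESIZE);
        long total = atol(argv[1]);
        long chunk = atol(argv[2]);
        long sleep_ms = atol(argv[3]);

        unsigned char *buf = mmap(NULL, total * page, PROT_READ | PROT_WRITE,
                                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        unsigned char counter = 0;
        long next = 0;
        for (;;) {
            for (long i = 0; i < chunk; i++) {
                buf[next * page] = counter;   /* one store per page dirties it */
                next = (next + 1) % total;
            }
            counter++;
            usleep(sleep_ms * 1000);
        }
    }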