On Fri, 9 Jun 2017 12:28:13 +1000 David Gibson <da...@gibson.dropbear.id.au> wrote:
> On Thu, Jun 08, 2017 at 03:42:32PM +0200, Greg Kurz wrote:
> > I've provided answers for all comments from the v3 review that I
> > deliberately don't address in v4.
>
> I've merged patches 1-4.  5 & 6 I'm still reviewing.
>

Cool.

FYI, I forgot to mention that I only tested with KVM. I'm now trying with
TCG and I hit various guest crashes on the destination (using your
ppc-for-2.10 branch WITHOUT my patches):

cpu 0x0: Vector: 700 (Program Check) at [c0000000787ebae0]
    pc: c0000000002803c4: __fput+0x284/0x310
    lr: c000000000280258: __fput+0x118/0x310
    sp: c0000000787ebd60
   msr: 8000000000029033
  current = 0xc00000007cbab640
  paca    = 0xc000000007b80000   softe: 0   irq_happened: 0x01
    pid   = 1812, comm = gawk
kernel BUG at ../include/linux/fs.h:2399!
enter ? for help
[c0000000787ebdb0] c0000000000d7d84 task_work_run+0xe4/0x160
[c0000000787ebe00] c000000000018054 do_notify_resume+0xb4/0xc0
[c0000000787ebe30] c00000000000a730 ret_from_except_lite+0x5c/0x60
--- Exception: c00 (System Call) at 00003fff9026dd90
SP (3fffcb37b790) is in userspace
0:mon>

or

cpu 0x0: Vector: 300 (Data Access) at [c00000007fff7490]
    pc: c0000000001ef768: free_pcppages_bulk+0x2b8/0x500
    lr: c0000000001ef524: free_pcppages_bulk+0x74/0x500
    sp: c00000007fff7710
   msr: 8000000000009033
   dar: c0000000807afc70
 dsisr: 40000000
  current = 0xc00000007c609190
  paca    = 0xc000000007b80000   softe: 0   irq_happened: 0x01
    pid   = 1631, comm = systemctl
enter ? for help
[c00000007fff77c0] c0000000001eff24 free_hot_cold_page+0x204/0x270
[c00000007fff7810] c0000000001f5848 __put_single_page+0x48/0x60
[c00000007fff7840] c00000000059ac50 skb_release_data+0xb0/0x180
[c00000007fff7880] c00000000059ae38 kfree_skb+0x58/0x130
[c00000007fff78c0] c00000000063f604 __udp4_lib_mcast_deliver+0x3d4/0x460
[c00000007fff7a50] c00000000063fb0c __udp4_lib_rcv+0x47c/0x770
[c00000007fff7b00] c0000000006023a8 ip_local_deliver_finish+0x148/0x310
[c00000007fff7b50] c0000000006026c4 ip_rcv_finish+0x154/0x420
[c00000007fff7bd0] c0000000005b1154 __netif_receive_skb_core+0x874/0xac0
[c00000007fff7cc0] c0000000005b30d4 netif_receive_skb+0x34/0xd0
[c00000007fff7d00] d000000000ef3c74 virtnet_poll+0x514/0x8a0 [virtio_net]
[c00000007fff7e10] c0000000005b3668 net_rx_action+0x1d8/0x310
[c00000007fff7ea0] c0000000000b0cc4 __do_softirq+0x154/0x330
[c00000007fff7f90] c0000000000251ac call_do_softirq+0x14/0x24
[c00000007fff3ef0] c000000000011be0 do_softirq+0xe0/0x110
[c00000007fff3f30] c0000000000b10e8 irq_exit+0xc8/0x110
[c00000007fff3f60] c0000000000117e8 __do_irq+0xb8/0x1c0
[c00000007fff3f90] c0000000000251d0 call_do_irq+0x14/0x24
[c00000007a94bac0] c000000000011990 do_IRQ+0xa0/0x120
[c00000007a94bb20] c00000000000a8b0 restore_check_irq_replay+0x2c/0x5c
--- Exception: 501 (Hardware Interrupt) at c000000000010f84 arch_local_irq_restore+0x74/0x90
[c00000007a94be10] 000000000000000c (unreliable)
[c00000007a94be30] c00000000000a704 ret_from_except_lite+0x30/0x60
--- Exception: 501 (Hardware Interrupt) at 00003fffa04a2c28
SP (3ffff7f1bf60) is in userspace
0:mon>

These don't seem to occur with QEMU master. I'll try to investigate.

> >
> > v4: - some patches from v3 got merged
> >     - added some more preparatory cleanup in xics (patches 1,2)
> >     - merge cpu_setup() handler into realize() (patch 4)
> >     - see individual changelog for patches 3 and 6
> >
> > v3: - preparatory cleanup in pnv (patch 1)
> >     - rework ICPState realization and vmstate registration (patches 2,3,4)
> >     - fix migration using dummy icp/server entries (patch 5)
> >
> > v2: - some patches from v1 are already merged in ppc-for-2.10
> >     - added a new fix to a potential memory leak (patch 1)
> >     - consolidate dt_id computation (patch 3)
> >     - see individual changelogs for patch 2 and 4
> >
> > I could successfully do the following on POWER8 host with full cores (SMT8):
> >
> > 1) start a pseries-2.9 machine with QEMU 2.9:
> >      -smp cores=1,threads=2,maxcpus=8
> >
> > 2) hotplug a core:
> >      device_add host-spapr-cpu-core,core-id=4
> >
> > 3) migrate to QEMU 2.10 configured with core-id 0,4
> >
> > 4) hotplug another core:
> >      device_add host-spapr-cpu-core,core-id=2
> >
> > 5) migrate back to QEMU 2.9 configured with core-id 0,4,2
> >
> > 6) hotplug the core in the last available slot:
> >      device_add host-spapr-cpu-core,core-id=6
> >
> > 7) migrate to QEMU 2.10 configured with core-id 0,4,2,6
> >
> > I could check that the guest is functional after each migration.
> >
>