On Tue, Jul 04, 2017 at 06:13:31PM -0300, Daniel Henrique Barboza wrote: > I just tested this patch set on top of current ppc-for-2.10 branch (which > contains > the patches from part V). It applied cleanly but required a couple of > trivial > fixes to build probably because it was made on top of an older code base.
Right, I fixed that up locally already, but haven't gotten around to reposting yet. You can look at the 'drcVI' branch on my github tree, if you're interested. > The trivial migration test worked fine. The libvirt scenario (attaching a > device on > target before migration, try to unplug after migration) isn't working as > expected > but we have a different result with this series. Instead of silently failing > to unplug > with error messages on dmesg, the hot unplug works on QEMU level: Thanks for testing. Just to clarify what you're saying here, you haven't spotted a regression with this series, but there is a case which was broken and is still broken with slightly different symptoms. Yes? > > (qemu) device_del core1 > (qemu) > (qemu) info cpus > * CPU #0: nip=0xc0000000000a3e0c thread_id=86162 > (qemu) info hotpluggable-cpus > Hotpluggable CPUs: > type: "host-spapr-cpu-core" > vcpus_count: "1" > CPUInstance Properties: > core-id: "3" > type: "host-spapr-cpu-core" > vcpus_count: "1" > CPUInstance Properties: > core-id: "2" > type: "host-spapr-cpu-core" > vcpus_count: "1" > CPUInstance Properties: > core-id: "1" > type: "host-spapr-cpu-core" > vcpus_count: "1" > qom_path: "/machine/unattached/device[0]" > CPUInstance Properties: > core-id: "0" > (qemu) > > > However, any operation on the guest afterwards (tried with lscpu and dmesg) > seems > to hung the guest. This is what I got when trying to do a dmesg after the > hot unplug: Ouch. That's bad. I'll have to look into it. I have rather a lot on my plate at the moment - if you get a chance to work out which of the patches in the series causes this behaviour, that could be handy. > danielhb@ubuntu1704:~$ > danielhb@ubuntu1704:~$ > danielhb@ubuntu1704:~$ dmesg | tail -n 5 > ^C > > [ 362.749693] INFO: task kworker/u8:1:30 blocked for more than 120 seconds. > [ 362.749819] Not tainted 4.10.0-26-generic #30-Ubuntu > [ 362.749892] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 362.750224] INFO: task kworker/0:3:1887 blocked for more than 120 > seconds. > [ 362.750320] Not tainted 4.10.0-26-generic #30-Ubuntu > [ 362.750394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 483.568842] INFO: task kworker/u8:1:30 blocked for more than 120 seconds. > [ 483.568997] Not tainted 4.10.0-26-generic #30-Ubuntu > [ 483.569067] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 483.569364] INFO: task kworker/0:3:1887 blocked for more than 120 > seconds. > (...) > > > I am not sure if it was intended for this scenario to work with this patch > set already. Figured > it can serve as a FYI for the next series. This might be due to a problem I've discussed previously with Laurent. libvirt uses device_add on the destination's monitor before sending the migration stream, rather than starting the destination with already-hotplugged devices as -device. To my surprise, there doesn't seem to be a system reset between that device_add phase and processing the incoming stream, which means DRCs for those devices might be in the wrong state. I might need to explicitly trigger a reset of DRC state at incoming migration time to handle this. > Given that hotplug/unplug without migration still works and on the > assumption that > we'll look more into this libvirt scenario in the next spins, +1. > > > Tested-by: Daniel Barboza <danie...@linux.vnet.ibm.com> > > > > On 06/21/2017 06:18 AM, David Gibson wrote: > > This sixth set of DRC cleanup patches (to be applied on top of the > > patches from part V) is a complete rework of DRC state management. We > > stop tracking some unnecessary things, and change the basic state > > representation to a simpler and more robust model. > > > > Most of the patches in this set "break" migration from earlier git > > snapshots, but not from any released qemu version. The previous > > migration stream format had multiple problems, so better to fix them > > now, before 2.10 is out. > > > > David Gibson (5): > > spapr: Remove 'awaiting_allocation' DRC flag > > spapr: Refactor spapr_drc_detach() > > spapr: Cleanups relating to DRC awaiting_release field > > spapr: Consolidate DRC state variables > > spapr: Remove sPAPRConfigureConnectorState sub-structure > > > > hw/ppc/spapr.c | 9 +- > > hw/ppc/spapr_drc.c | 321 > > +++++++++++++++++++++------------------------ > > hw/ppc/spapr_pci.c | 13 +- > > hw/ppc/trace-events | 3 +- > > include/hw/ppc/spapr_drc.h | 53 +++++--- > > 5 files changed, 187 insertions(+), 212 deletions(-) > > > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature