Re: [Qemu-devel] [Bug Report] vm paused after succeeding to migrate
> -邮件原件- > 发件人: Dr. David Alan Gilbert [mailto:dgilb...@redhat.com] > 发送时间: 2018年4月12日 20:37 > 收件人: linzhecheng; pbonz...@redhat.com > 抄送: qemu-devel@nongnu.org; wangxin (U) ; > Zhoujian (jay) ; quint...@redhat.com > 主题: Re: [Qemu-devel] [Bug Report] vm paused after succeeding to migrate > > * linzhecheng (linzhech...@huawei.com) wrote: > > Hi, all > > I encounterd a bug when I try to migrate a windows vm. > > > > Enviroment information: > > host A: cpu E5620(model WestmereEP without flag xsave) host B: cpu > > E5-2643(model SandyBridgeEP with xsave) > > > > The reproduce steps is : > > 1. Start a windows 2008 vm with -cpu host(which means host-passthrough). > > 2. Migrate the vm to host B when cr4.OSXSAVE=0 (successfully). > > 3. Vm runs on host B for a while so that cr4.OSXSAVE changes to 1. > > 4. Then migrate the vm to host A (successfully), but vm was paused, and > qemu printed log as followed: > > Remember that migrating using -cpu host across different CPU models is NOT > expected to work. > > > KVM: entry failed, hardware error 0x8021 > > > > If you're running a guest on an Intel machine without unrestricted > > mode support, the failure can be most likely due to the guest entering > > an invalid state for Intel VT. For example, the guest maybe running in > > big real mode which is not supported on less recent Intel processors. > > > > EAX=019b3bb0 EBX=01a3ae80 ECX=01a61ce8 EDX= > > ESI=01a62000 EDI= EBP= ESP=01718b20 > > EIP=0185d982 EFL=0286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES > > = 9300 CS =f000 9b00 > > SS = 9300 DS = > > 9300 FS = 9300 GS = > > 9300 > > LDT= 8200 > > TR = 8b00 > > GDT= > > IDT= > > CR0=6010 CR2= CR3= CR4= > > DR0= DR1= DR2= > > DR3= > > DR6=0ff0 DR7=0400 > > EFER= > > Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> > > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 00 00 00 00 00 00 > > > > I have found that problem happened when kvm_put_sregs returns err - > 22(called by kvm_arch_put_registers(qemu)). > > Because kvm_arch_vcpu_ioctl_set_sregs(kvm-mod) checked that > guest_cpuid_has no X86_FEATURE_XSAVE but cr4.OSXSAVE=1. > > So should we cancel migration when kvm_arch_put_registers returns error? > > It would seem good if we can make the migration fail there rather than hitting > that KVM error. > It looks like we need to do a bit of plumbing to convert the places that call > it to > return a bool rather than void. I think we should return a int value of run_on_cpu which callback run_on_cpu_func, but run_on_cpu_func is the prototype of many functions, Is it overkill? > > Dave > > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH 2/4] pc-bios/s390-ccw: fix loadparm initialization and int conversion
On 12.04.2018 23:04, Farhan Ali wrote: > > > On 04/12/2018 04:57 PM, Collin Walling wrote: >> On 04/12/2018 02:57 PM, Thomas Huth wrote: >>> On 10.04.2018 17:01, Collin Walling wrote: Rename the loadparm char array in main.c to loadparm_str and increase the size by one byte to account for a null termination when converting the loadparm string to an int via atoui. Also allow the boot menu to be enabled when loadparm is set to an empty string or a series of spaces. Signed-off-by: Collin WallingReported-by: Vasily Gorbik --- hw/s390x/ipl.c | 2 ++ pc-bios/s390-ccw/main.c | 14 +++--- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c index fdeaec3..23b5b54 100644 --- a/hw/s390x/ipl.c +++ b/hw/s390x/ipl.c @@ -352,6 +352,8 @@ int s390_ipl_set_loadparm(uint8_t *loadparm) loadparm[i] = ascii2ebcdic[(uint8_t) lp[i]]; } + memset(loadparm + i, 0x40, 8 - i); /* fill with EBCDIC spaces */ + g_free(lp); return 0; } diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c index 9d9f8cf..26f9adf 100644 --- a/pc-bios/s390-ccw/main.c +++ b/pc-bios/s390-ccw/main.c @@ -15,11 +15,11 @@ char stack[PAGE_SIZE * 8] __attribute__((__aligned__(PAGE_SIZE))); static SubChannelId blk_schid = { .one = 1 }; IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE))); -static char loadparm[8] = { 0, 0, 0, 0, 0, 0, 0, 0 }; +static char loadparm_str[9] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }; QemuIplParameters qipl; #define LOADPARM_PROMPT "PROMPT " -#define LOADPARM_EMPTY "" +#define LOADPARM_EMPTY " " >>> >>> Sorry for my ignorance, but why was the old string containing dots? >>> >>> Thomas >>> >> >> No need for apologies :) >> >> If -machine loadparm is *not* present on the command line, then the >> loadparm in the sclp >> will be a series of nulls. For whatever reason, that gets translated >> into a series of dots. >> > > It's because of the ebc2asc table we use for conversion, which results > in the dots when converting from ebcdic_to_ascii. Ah, great, thanks to both of you for the explanation, that was the part that I was missing. I was only looking at the tables in include/hw/s390x/ebcdic.h (since I thought that the dots were already created on the QEMU side), and did not expect that the pc-bios could create them. The patch now makes sense to me: Reviewed-by: Thomas Huth
Re: [Qemu-devel] [PATCH v2 1/3] pc-dimm: factor out MemoryDevice interface
On Thu, Apr 12, 2018 at 04:36:33PM +0200, David Hildenbrand wrote: > On the qmp level, we already have the concept of memory devices: > "query-memory-devices" > Right now, we only support NVDIMM and PCDIMM. > > We want to map other devices later into the address space of the guest. > Such device could e.g. be virtio devices. These devices will have a > guest memory range assigned but won't be exposed via e.g. ACPI. We want > to make them look like memory device, but not glued to pc-dimm. > > Especially, it will not always be possible to have TYPE_PC_DIMM as a parent > class (e.g. virtio devices). Let's use an interface instead. As a first > part, convert handling of > - qmp_pc_dimm_device_list > - get_plugged_memory_size > to our new model. plug/unplug stuff etc. will follow later. > > A memory device will have to provide the following functions: > - get_addr(): Necessary, as the property "addr" can e.g. not be used for > virtio devices (already defined). Is the assumption here that a memory device has a fixed RAM address (GPA) from when it is realized? It seems to me plausible that you could have a memory device that has a specific size, but has a BAR of some sort that's programmed by the guest as part of the "plug" process. > - get_plugged_size(): The amount this device offers to the guest as of > now. > - get_region_size(): Because this can later on be bigger than the > plugged size. > - fill_device_info(): Fill MemoryDeviceInfo, e.g. for qmp. > > Signed-off-by: David Hildenbrand> --- > hw/i386/acpi-build.c | 3 +- > hw/mem/Makefile.objs | 1 + > hw/mem/memory-device.c | 119 > +++ > hw/mem/pc-dimm.c | 119 > ++- > hw/ppc/spapr.c | 3 +- > hw/ppc/spapr_hcall.c | 1 + > include/hw/mem/memory-device.h | 44 ++ > include/hw/mem/pc-dimm.h | 2 - > numa.c | 3 +- > qmp.c| 4 +- > stubs/Makefile.objs | 2 +- > stubs/{qmp_pc_dimm.c => qmp_memory_device.c} | 4 +- > 12 files changed, 239 insertions(+), 66 deletions(-) > create mode 100644 hw/mem/memory-device.c > create mode 100644 include/hw/mem/memory-device.h > rename stubs/{qmp_pc_dimm.c => qmp_memory_device.c} (61%) > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > index 3cf2a1679c..ca3645d57b 100644 > --- a/hw/i386/acpi-build.c > +++ b/hw/i386/acpi-build.c > @@ -46,6 +46,7 @@ > #include "hw/acpi/vmgenid.h" > #include "sysemu/tpm_backend.h" > #include "hw/timer/mc146818rtc_regs.h" > +#include "hw/mem/memory-device.h" > #include "sysemu/numa.h" > > /* Supported chipsets: */ > @@ -2253,7 +2254,7 @@ build_tpm2(GArray *table_data, BIOSLinker *linker, > GArray *tcpalog) > static void build_srat_hotpluggable_memory(GArray *table_data, uint64_t base, > uint64_t len, int default_node) > { > -MemoryDeviceInfoList *info_list = qmp_pc_dimm_device_list(); > +MemoryDeviceInfoList *info_list = qmp_memory_device_list(); > MemoryDeviceInfoList *info; > MemoryDeviceInfo *mi; > PCDIMMDeviceInfo *di; > diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs > index f12f8b97a2..10be4df2a2 100644 > --- a/hw/mem/Makefile.objs > +++ b/hw/mem/Makefile.objs > @@ -1,2 +1,3 @@ > common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o > +common-obj-$(CONFIG_MEM_HOTPLUG) += memory-device.o > common-obj-$(CONFIG_NVDIMM) += nvdimm.o > diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c > new file mode 100644 > index 00..0e0d6b539a > --- /dev/null > +++ b/hw/mem/memory-device.c > @@ -0,0 +1,119 @@ > +/* > + * Memory Device Interface > + * > + * Copyright ProfitBricks GmbH 2012 > + * Copyright (C) 2014 Red Hat Inc > + * Copyright (c) 2018 Red Hat Inc > + * > + * This work is licensed under the terms of the GNU GPL, version 2 or later. > + * See the COPYING file in the top-level directory. > + */ > + > +#include "qemu/osdep.h" > +#include "hw/mem/memory-device.h" > +#include "hw/qdev.h" > +#include "qapi/error.h" > +#include "hw/boards.h" > +#include "qemu/range.h" > + > +static gint memory_device_addr_sort(gconstpointer a, gconstpointer b) > +{ > +MemoryDeviceState *md_a = MEMORY_DEVICE(a); > +MemoryDeviceState *md_b = MEMORY_DEVICE(b); > +MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(a); AFAICT from the code below, the two devices don't necessarily have to have the same class, so I think you need to get the class pointers for each of them separately. (Or if they do have to be the same by design, I think there should at least be an assert). > +const uint64_t addr_a = mdc->get_addr(md_a); > +const uint64_t addr_b =
[Qemu-devel] [PATCH] vhost: do not verify ring mappings when IOMMU is enabled
When IOMMU is enabled, we store virtqueue metadata as iova (though it may has _phys suffix) and access them through dma helpers. Any translation failures could be reported by IOMMU. In this case, trying to validate iova against gpa won't work and will cause a false error reporting. So this patch bypasses the ring verification if IOMMU is enabled which is similar to the behavior before 0ca1fd2d6878 that calls vhost_memory_map() which is a nop when IOMMU is enabled. Fixes: 0ca1fd2d6878 ("vhost: Simplify ring verification checks") Cc: Dr. David Alan GilbertCc: Igor Mammedov Cc: Michael S. Tsirkin Signed-off-by: Jason Wang --- hw/virtio/vhost.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index f51bf57..9d5850a 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -342,6 +342,10 @@ static int vhost_verify_ring_mappings(struct vhost_dev *dev, "used ring" }; +if (vhost_dev_has_iommu(dev)) { +return 0; +} + for (i = 0; i < dev->nvqs; ++i) { struct vhost_virtqueue *vq = dev->vqs + i; -- 2.7.4
[Qemu-devel] [Bug 1763536] [NEW] go build fails under qemu-ppc64le-static (qemu-user)
Public bug reported: I am using qemu-user (built static) in a docker container environment. When running multi-threaded go commands in the container (go build for example) the process may hang, report segfaults or other errors. I built qemu-ppc64le from the upstream git (master). I see the problem running on a multi core system with Intel i7 processors. # cat /proc/cpuinfo | grep "model name" model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz Steps to reproduce: 1) Build qemu-ppc64le as static and copy into docker build directory named it qemu-ppc64le-static. 2) Add hello.go to docker build dir. package main import "fmt" func main() { fmt.Println("hello world") } 3) Create the Dockerfile from below: FROM ppc64le/golang:1.10.1-alpine3. COPY qemu-ppc64le-static /usr/bin/ COPY hello.go /go 4) Build container $ docker build -t qemutest -f Dockerfile ./go 5) Run test $ docker run -it qemutest /go # /usr/bin/qemu-ppc64le-static --version qemu-ppc64le version 2.11.93 (v2.12.0-rc3-dirty) Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers /go # go version go version go1.10.1 linux/ppc64le /go # go build hello.go fatal error: fatal error: stopm holding locksunexpected signal during runtime execution panic during panic [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1003528c] runtime stack: runtime: unexpected return pc for syscall.Syscall6 called from 0xc42007f500 stack: frame={sp:0xc4203be840, fp:0xc4203be860} stack=[0x4000b7ecf0,0x4000b928f0) syscall.Syscall6(0x100744e8, 0x3d, 0xc42050c140, 0x20, 0x18, 0x10422b80, 0xc4203be968[signal , 0x10012d88SIGSEGV: segmentation violation, 0xc420594000 code=, 0x00x1 addr=0x0 pc=0x1003528c) ] runtime stack: /usr/local/go/src/syscall/asm_linux_ppc64x.s:61runtime.throw(0x10472d19, 0x13) + /usr/local/go/src/runtime/panic.go:0x6c616 +0x68 runtime.stopm() /usr/local/go/src/runtime/proc.go:1939goroutine +10x158 [runtime.exitsyscall0semacquire(0xc42007f500) /usr/local/go/src/runtime/proc.go:3129 +]: 0x130 runtime.mcall(0xc42007f500) /usr/local/go/src/runtime/asm_ppc64x.s:183 +0x58sync.runtime_Semacquire (0xc4201fab1c) /usr/local/go/src/runtime/sema.go:56 +0x38 Note the results may differ between attempts, hangs and other faults sometimes happen. If I run "go: single threaded I don't see the problem, for example: /go # GOMAXPROCS=1 go build -p 1 hello.go /go # ./hello hello world I see the same issue with arm64. I don't think this is a go issue, but don't have a real evidence to prove that. This problem looks similar to other problem I have seen reported against qemu running multi-threaded applications. ** Affects: qemu Importance: Undecided Status: New ** Summary changed: - go buld failes under qemu-ppc64le-static (qemu-user) + go build fails under qemu-ppc64le-static (qemu-user) -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1763536 Title: go build fails under qemu-ppc64le-static (qemu-user) Status in QEMU: New Bug description: I am using qemu-user (built static) in a docker container environment. When running multi-threaded go commands in the container (go build for example) the process may hang, report segfaults or other errors. I built qemu-ppc64le from the upstream git (master). I see the problem running on a multi core system with Intel i7 processors. # cat /proc/cpuinfo | grep "model name" model name: Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name: Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name: Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name: Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name: Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name: Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name: Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz model name: Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz Steps to reproduce: 1) Build qemu-ppc64le as static and copy into docker build directory named it qemu-ppc64le-static. 2) Add hello.go to docker build dir. package main import "fmt" func main() { fmt.Println("hello world") } 3) Create the Dockerfile from below: FROM ppc64le/golang:1.10.1-alpine3. COPY qemu-ppc64le-static /usr/bin/ COPY hello.go /go 4) Build container $ docker build -t qemutest -f Dockerfile ./go 5) Run test $ docker run -it qemutest /go #
Re: [Qemu-devel] [PATCH 2/4] pc-bios/s390-ccw: fix loadparm initialization and int conversion
On 04/12/2018 04:57 PM, Collin Walling wrote: On 04/12/2018 02:57 PM, Thomas Huth wrote: On 10.04.2018 17:01, Collin Walling wrote: Rename the loadparm char array in main.c to loadparm_str and increase the size by one byte to account for a null termination when converting the loadparm string to an int via atoui. Also allow the boot menu to be enabled when loadparm is set to an empty string or a series of spaces. Signed-off-by: Collin WallingReported-by: Vasily Gorbik --- hw/s390x/ipl.c | 2 ++ pc-bios/s390-ccw/main.c | 14 +++--- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c index fdeaec3..23b5b54 100644 --- a/hw/s390x/ipl.c +++ b/hw/s390x/ipl.c @@ -352,6 +352,8 @@ int s390_ipl_set_loadparm(uint8_t *loadparm) loadparm[i] = ascii2ebcdic[(uint8_t) lp[i]]; } +memset(loadparm + i, 0x40, 8 - i); /* fill with EBCDIC spaces */ + g_free(lp); return 0; } diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c index 9d9f8cf..26f9adf 100644 --- a/pc-bios/s390-ccw/main.c +++ b/pc-bios/s390-ccw/main.c @@ -15,11 +15,11 @@ char stack[PAGE_SIZE * 8] __attribute__((__aligned__(PAGE_SIZE))); static SubChannelId blk_schid = { .one = 1 }; IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE))); -static char loadparm[8] = { 0, 0, 0, 0, 0, 0, 0, 0 }; +static char loadparm_str[9] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }; QemuIplParameters qipl; #define LOADPARM_PROMPT "PROMPT " -#define LOADPARM_EMPTY "" +#define LOADPARM_EMPTY "" Sorry for my ignorance, but why was the old string containing dots? Thomas No need for apologies :) If -machine loadparm is *not* present on the command line, then the loadparm in the sclp will be a series of nulls. For whatever reason, that gets translated into a series of dots. It's because of the ebc2asc table we use for conversion, which results in the dots when converting from ebcdic_to_ascii. If -machine loadparm="" is present, then loadparm will in the sclp will be a series of spaces. > We want to enable the boot menu for both of these cases and, to make things easier, this patch replaces any nulls (dots) to spaces when setting loadparm so that we only have to handle one case instead of two within the bios.
Re: [Qemu-devel] [PATCH 2/4] pc-bios/s390-ccw: fix loadparm initialization and int conversion
On 04/12/2018 02:57 PM, Thomas Huth wrote: > On 10.04.2018 17:01, Collin Walling wrote: >> Rename the loadparm char array in main.c to loadparm_str and >> increase the size by one byte to account for a null termination >> when converting the loadparm string to an int via atoui. Also >> allow the boot menu to be enabled when loadparm is set to an >> empty string or a series of spaces. >> >> Signed-off-by: Collin Walling>> Reported-by: Vasily Gorbik >> --- >> hw/s390x/ipl.c | 2 ++ >> pc-bios/s390-ccw/main.c | 14 +++--- >> 2 files changed, 9 insertions(+), 7 deletions(-) >> >> diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c >> index fdeaec3..23b5b54 100644 >> --- a/hw/s390x/ipl.c >> +++ b/hw/s390x/ipl.c >> @@ -352,6 +352,8 @@ int s390_ipl_set_loadparm(uint8_t *loadparm) >> loadparm[i] = ascii2ebcdic[(uint8_t) lp[i]]; >> } >> >> +memset(loadparm + i, 0x40, 8 - i); /* fill with EBCDIC spaces */ >> + >> g_free(lp); >> return 0; >> } >> diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c >> index 9d9f8cf..26f9adf 100644 >> --- a/pc-bios/s390-ccw/main.c >> +++ b/pc-bios/s390-ccw/main.c >> @@ -15,11 +15,11 @@ >> char stack[PAGE_SIZE * 8] __attribute__((__aligned__(PAGE_SIZE))); >> static SubChannelId blk_schid = { .one = 1 }; >> IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE))); >> -static char loadparm[8] = { 0, 0, 0, 0, 0, 0, 0, 0 }; >> +static char loadparm_str[9] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }; >> QemuIplParameters qipl; >> >> #define LOADPARM_PROMPT "PROMPT " >> -#define LOADPARM_EMPTY "" >> +#define LOADPARM_EMPTY "" > > Sorry for my ignorance, but why was the old string containing dots? > > Thomas > No need for apologies :) If -machine loadparm is *not* present on the command line, then the loadparm in the sclp will be a series of nulls. For whatever reason, that gets translated into a series of dots. If -machine loadparm="" is present, then loadparm will in the sclp will be a series of spaces. We want to enable the boot menu for both of these cases and, to make things easier, this patch replaces any nulls (dots) to spaces when setting loadparm so that we only have to handle one case instead of two within the bios. -- Respectfully, - Collin Walling
Re: [Qemu-devel] [PULL, 08/36] qapi: Remove qobject_to_X() functions
On 04/12/2018 12:51 PM, Yuval Shaia wrote: > Hi Eric, > > On Mon, Mar 12, 2018 at 01:35:59PM -0500, Eric Blake wrote: >> From: Max Reitz>> >> They are no longer needed now. > > I'm doing some off-list development which use these functions. > Unfortunately i'm not subscribed (yet) to qemu-devel so obviously missed > this email. > > Is there any alternative to these functions? Yes, in the patches right before this one. > > My code looks something like this: > > QObject *oid; > unsigned long id; > > oid = qlist_pop(list); > id = qnum_get_uint(qobject_to_qnum(oid)); > id = qnum_get_uint(qobject_to(QNum, oid)); -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH RFC] s390x: refactor reset/reipl handling
On 12/04/2018 21:26, David Hildenbrand wrote: > Calling pause_all_vcpus()/resume_all_vcpus() from a VCPU thread might > not be the best idea. As pause_all_vcpus() temporarily drops the qemu > mutex, two parallel calls to pause_all_vcpus() can be active at a time, > resulting in a deadlock. (either by two VCPUs or by the main thread and a > VCPU) > > Let's handle it via the main loop instead, as suggested by Paolo. If we > would have two parallel reset requests by two different VCPUs at the > same time, the last one would win. > > We use the existing ipl device to handle it. The nice side effect is > that we can get rid of reipl_requested. > > This change implies that all reset handling now goes via the common > path, so "no-reboot" handling is now active for all kinds of reboots. > > Let's execute any CPU initialization code on the target CPU using > run_on_cpu. > > Signed-off-by: David Hildenbrand> --- > > Did only a quick check with TCG. I think this way it is way easier to > control what is getting reset. Happy that it went well (there's always some trepidation when suggesting a refactoring :)). For what I can understand, it looks sane. Paolo > hw/s390x/ipl.c | 44 --- > hw/s390x/ipl.h | 16 -- > hw/s390x/s390-virtio-ccw.c | 49 +- > include/hw/s390x/s390-virtio-ccw.h | 2 -- > target/s390x/cpu.h | 26 > target/s390x/diag.c| 61 > +++--- > target/s390x/internal.h| 6 > target/s390x/kvm.c | 2 +- > 8 files changed, 127 insertions(+), 79 deletions(-) > > diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c > index fb554ab156..f7589d20f1 100644 > --- a/hw/s390x/ipl.c > +++ b/hw/s390x/ipl.c > @@ -26,6 +26,7 @@ > #include "qemu/config-file.h" > #include "qemu/cutils.h" > #include "qemu/option.h" > +#include "exec/exec-all.h" > > #define KERN_IMAGE_START0x01UL > #define KERN_PARM_AREA 0x010480UL > @@ -484,11 +485,22 @@ IplParameterBlock *s390_ipl_get_iplb(void) > return >iplb; > } > > -void s390_reipl_request(void) > +void s390_ipl_reset_request(CPUState *cs, enum s390_reset reset_type) > { > S390IPLState *ipl = get_ipl_device(); > > -ipl->reipl_requested = true; > +if (reset_type == S390_RESET_EXTERNAL || reset_type == S390_RESET_REIPL) > { > +/* let's always use CPU 0 */ > +ipl->reset_cpu = NULL; > +} else { > +ipl->reset_cpu = cs; > +} > +ipl->reset_type = reset_type; > + > +if (reset_type != S390_RESET_REIPL) { > +goto no_reipl; > +} > + > if (ipl->iplb_valid && > !ipl->netboot && > ipl->iplb.pbt == S390_IPL_TYPE_CCW && > @@ -505,7 +517,32 @@ void s390_reipl_request(void) > ipl->iplb_valid = s390_gen_initial_iplb(ipl); > } > } > +no_reipl: > qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET); > +/* as this is triggered by a CPU, make sure to exit the loop */ > +if (tcg_enabled()) { > +cpu_loop_exit(cs); > +} > +} > + > +void s390_ipl_get_reset_request(CPUState **cs, enum s390_reset *reset_type) > +{ > +S390IPLState *ipl = get_ipl_device(); > + > +*cs = ipl->reset_cpu; > +if (!ipl->reset_cpu) { > +/* use CPU 0 as default */ > +*cs = qemu_get_cpu(0); > +} > +*reset_type = ipl->reset_type; > +} > + > +void s390_ipl_clear_reset_request(void) > +{ > +S390IPLState *ipl = get_ipl_device(); > + > +ipl->reset_type = S390_RESET_EXTERNAL; > +ipl->reset_cpu = NULL; > } > > static void s390_ipl_prepare_qipl(S390CPU *cpu) > @@ -552,11 +589,10 @@ static void s390_ipl_reset(DeviceState *dev) > { > S390IPLState *ipl = S390_IPL(dev); > > -if (!ipl->reipl_requested) { > +if (ipl->reset_type != S390_RESET_REIPL) { > ipl->iplb_valid = false; > memset(>iplb, 0, sizeof(IplParameterBlock)); > } > -ipl->reipl_requested = false; > } > > static void s390_ipl_class_init(ObjectClass *klass, void *data) > diff --git a/hw/s390x/ipl.h b/hw/s390x/ipl.h > index 0570d0ad75..102f1ea7af 100644 > --- a/hw/s390x/ipl.h > +++ b/hw/s390x/ipl.h > @@ -87,7 +87,17 @@ int s390_ipl_set_loadparm(uint8_t *loadparm); > void s390_ipl_update_diag308(IplParameterBlock *iplb); > void s390_ipl_prepare_cpu(S390CPU *cpu); > IplParameterBlock *s390_ipl_get_iplb(void); > -void s390_reipl_request(void); > + > +enum s390_reset { > +/* default is a reset not triggered by a CPU e.g. issued by QMP */ > +S390_RESET_EXTERNAL = 0, > +S390_RESET_REIPL, > +S390_RESET_MODIFIED_CLEAR, > +S390_RESET_LOAD_NORMAL, > +}; > +void s390_ipl_reset_request(CPUState *cs, enum s390_reset reset_type); > +void s390_ipl_get_reset_request(CPUState **cs, enum s390_reset *reset_type); > +void s390_ipl_clear_reset_request(void); >
Re: [Qemu-devel] [PATCH 07/19] block: Really pause block jobs on drain
On 12/04/2018 16:25, Kevin Wolf wrote: > This is already the order we have there. What is probably different from > what you envision is that after the parents have concluded, we still > check that they are still quiescent in every iteration. Yes, and that's the quadratic part. > What we could do easily is introducing a bool bs->quiesce_concluded or > something that bdrv_drain_poll() sets to true the first time that it > returns false. It also gets a shortcut so that it returns false > immediately if bs->quiesce_concluded is true. The field is reset in > bdrv_do_drained_end() when bs->quiesce_counter reaches 0. Or bs->quiescent, for the sake of bikeshedding. > That would be a rather simple optimisation that could be done as the > final patch in this series, and would ensure that we don't recurse back > to parents that are already quiescent. > > Would you be happy with this change? Yes, I'll leave the organization of the series to you of course. Paolo
Re: [Qemu-devel] Bad icount read when running qemu-system-ppc64 and mfspr atbu guest instruction
On 04/11/2018 04:37 PM, Darrell Leinwand wrote: > Ah I see, that makes sense and the atb instructions are marked as unused in the mainline which when I compared I recall now that we modifed them because the guest software we are running is trying to use those special registers. > > I added the wrappers into those function, is that correct? Yes. r~
Re: [Qemu-devel] [PATCH v3 00/22] More fully implement ARM PMUv3
On Apr 12 18:32, Peter Maydell wrote: > On 16 March 2018 at 20:30, Aaron Lindsaywrote: > > The ARM PMU implementation currently contains a basic cycle counter, but it > > is > > often useful to gather counts of other events and filter them based on > > execution mode. These patches flesh out the implementations of various PMU > > registers including PM[X]EVCNTR and PM[X]EVTYPER, add a struct definition to > > represent arbitrary counter types, implement mode filtering, and add > > instruction, cycle, and software increment events. > > Hi; sorry it's taken me a while to get to this (I've been focusing > on for-2.12 work). I've now reviewed most of the patches in this > set. My suggestion is that you fix the stuff I've commented on > and send out a v4, and then I can take some of the patches from > the start of that into target-arm.next, and I'll review the tail > end of the set then. Thanks for the review! I'll see if I can get a v4 out late this week or early next. I'll make a note of this when I send it out, but one thing to look out for in the next version is that I had to reorganize a few things to make the interrupt-on-overflow functionality work better. I think the changes are fairly minor and limited to the later patches, with the possible exception of using two variables per counter (because you have to know what the previous counter value was in addition to the current value to know if you've overflowed since your last check). -Aaron -- Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[Qemu-devel] [PATCH RFC] s390x: refactor reset/reipl handling
Calling pause_all_vcpus()/resume_all_vcpus() from a VCPU thread might not be the best idea. As pause_all_vcpus() temporarily drops the qemu mutex, two parallel calls to pause_all_vcpus() can be active at a time, resulting in a deadlock. (either by two VCPUs or by the main thread and a VCPU) Let's handle it via the main loop instead, as suggested by Paolo. If we would have two parallel reset requests by two different VCPUs at the same time, the last one would win. We use the existing ipl device to handle it. The nice side effect is that we can get rid of reipl_requested. This change implies that all reset handling now goes via the common path, so "no-reboot" handling is now active for all kinds of reboots. Let's execute any CPU initialization code on the target CPU using run_on_cpu. Signed-off-by: David Hildenbrand--- Did only a quick check with TCG. I think this way it is way easier to control what is getting reset. hw/s390x/ipl.c | 44 --- hw/s390x/ipl.h | 16 -- hw/s390x/s390-virtio-ccw.c | 49 +- include/hw/s390x/s390-virtio-ccw.h | 2 -- target/s390x/cpu.h | 26 target/s390x/diag.c| 61 +++--- target/s390x/internal.h| 6 target/s390x/kvm.c | 2 +- 8 files changed, 127 insertions(+), 79 deletions(-) diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c index fb554ab156..f7589d20f1 100644 --- a/hw/s390x/ipl.c +++ b/hw/s390x/ipl.c @@ -26,6 +26,7 @@ #include "qemu/config-file.h" #include "qemu/cutils.h" #include "qemu/option.h" +#include "exec/exec-all.h" #define KERN_IMAGE_START0x01UL #define KERN_PARM_AREA 0x010480UL @@ -484,11 +485,22 @@ IplParameterBlock *s390_ipl_get_iplb(void) return >iplb; } -void s390_reipl_request(void) +void s390_ipl_reset_request(CPUState *cs, enum s390_reset reset_type) { S390IPLState *ipl = get_ipl_device(); -ipl->reipl_requested = true; +if (reset_type == S390_RESET_EXTERNAL || reset_type == S390_RESET_REIPL) { +/* let's always use CPU 0 */ +ipl->reset_cpu = NULL; +} else { +ipl->reset_cpu = cs; +} +ipl->reset_type = reset_type; + +if (reset_type != S390_RESET_REIPL) { +goto no_reipl; +} + if (ipl->iplb_valid && !ipl->netboot && ipl->iplb.pbt == S390_IPL_TYPE_CCW && @@ -505,7 +517,32 @@ void s390_reipl_request(void) ipl->iplb_valid = s390_gen_initial_iplb(ipl); } } +no_reipl: qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET); +/* as this is triggered by a CPU, make sure to exit the loop */ +if (tcg_enabled()) { +cpu_loop_exit(cs); +} +} + +void s390_ipl_get_reset_request(CPUState **cs, enum s390_reset *reset_type) +{ +S390IPLState *ipl = get_ipl_device(); + +*cs = ipl->reset_cpu; +if (!ipl->reset_cpu) { +/* use CPU 0 as default */ +*cs = qemu_get_cpu(0); +} +*reset_type = ipl->reset_type; +} + +void s390_ipl_clear_reset_request(void) +{ +S390IPLState *ipl = get_ipl_device(); + +ipl->reset_type = S390_RESET_EXTERNAL; +ipl->reset_cpu = NULL; } static void s390_ipl_prepare_qipl(S390CPU *cpu) @@ -552,11 +589,10 @@ static void s390_ipl_reset(DeviceState *dev) { S390IPLState *ipl = S390_IPL(dev); -if (!ipl->reipl_requested) { +if (ipl->reset_type != S390_RESET_REIPL) { ipl->iplb_valid = false; memset(>iplb, 0, sizeof(IplParameterBlock)); } -ipl->reipl_requested = false; } static void s390_ipl_class_init(ObjectClass *klass, void *data) diff --git a/hw/s390x/ipl.h b/hw/s390x/ipl.h index 0570d0ad75..102f1ea7af 100644 --- a/hw/s390x/ipl.h +++ b/hw/s390x/ipl.h @@ -87,7 +87,17 @@ int s390_ipl_set_loadparm(uint8_t *loadparm); void s390_ipl_update_diag308(IplParameterBlock *iplb); void s390_ipl_prepare_cpu(S390CPU *cpu); IplParameterBlock *s390_ipl_get_iplb(void); -void s390_reipl_request(void); + +enum s390_reset { +/* default is a reset not triggered by a CPU e.g. issued by QMP */ +S390_RESET_EXTERNAL = 0, +S390_RESET_REIPL, +S390_RESET_MODIFIED_CLEAR, +S390_RESET_LOAD_NORMAL, +}; +void s390_ipl_reset_request(CPUState *cs, enum s390_reset reset_type); +void s390_ipl_get_reset_request(CPUState **cs, enum s390_reset *reset_type); +void s390_ipl_clear_reset_request(void); #define QIPL_ADDRESS 0xcc @@ -129,9 +139,11 @@ struct S390IPLState { bool enforce_bios; IplParameterBlock iplb; bool iplb_valid; -bool reipl_requested; bool netboot; QemuIplParameters qipl; +/* reset related properties don't have to be migrated or reset */ +enum s390_reset reset_type; +CPUState *reset_cpu; /*< public >*/ char *kernel; diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c index
Re: [Qemu-devel] [PATCH 4/5] migration: fix qemu carsh when RDMA live migration
* 858585 jemmy (jemmy858...@gmail.com) wrote: > On Thu, Apr 12, 2018 at 12:43 AM, Dr. David Alan Gilbert >wrote: > > * Lidong Chen (jemmy858...@gmail.com) wrote: > >> After postcopy, the destination qemu work in the dedicated > >> thread, so only invoke yield_until_fd_readable before postcopy > >> migration. > > > > The subject line needs to be more discriptive: > >migration: Stop rdma yielding during incoming postcopy > > > > I think. > > (Also please check the subject spellings) > > > >> Signed-off-by: Lidong Chen > >> --- > >> migration/rdma.c | 4 +++- > >> 1 file changed, 3 insertions(+), 1 deletion(-) > >> > >> diff --git a/migration/rdma.c b/migration/rdma.c > >> index 53773c7..81be482 100644 > >> --- a/migration/rdma.c > >> +++ b/migration/rdma.c > >> @@ -1489,11 +1489,13 @@ static int qemu_rdma_wait_comp_channel(RDMAContext > >> *rdma) > >> * Coroutine doesn't start until migration_fd_process_incoming() > >> * so don't yield unless we know we're running inside of a coroutine. > >> */ > >> -if (rdma->migration_started_on_destination) { > >> +if (rdma->migration_started_on_destination && > >> +migration_incoming_get_current()->state == > >> MIGRATION_STATUS_ACTIVE) { > > > > OK, that's a bit delicate; watch if it ever gets called in a failure > > case or similar - and also wathc out if we make more use of the status > > on the destination, but otherwise, and with a fix for the subject; > > How about use migration_incoming_get_current()->have_listen_thread? That's supposed to be pretty internal to the postcopy code, so I prefer the status check. Dave > if (rdma->migration_started_on_destination && > migration_incoming_get_current()->have_listen_thread == false) { > yield_until_fd_readable(rdma->comp_channel->fd); > } > > > > > > > Reviewed-by: Dr. David Alan Gilbert > > > >> yield_until_fd_readable(rdma->comp_channel->fd); > >> } else { > >> /* This is the source side, we're in a separate thread > >> * or destination prior to migration_fd_process_incoming() > >> + * after postcopy, the destination also in a seprate thread. > >> * we can't yield; so we have to poll the fd. > >> * But we need to be able to handle 'cancel' or an error > >> * without hanging forever. > >> -- > >> 1.8.3.1 > >> > > -- > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH 2/4] pc-bios/s390-ccw: fix loadparm initialization and int conversion
On 10.04.2018 17:01, Collin Walling wrote: > Rename the loadparm char array in main.c to loadparm_str and > increase the size by one byte to account for a null termination > when converting the loadparm string to an int via atoui. Also > allow the boot menu to be enabled when loadparm is set to an > empty string or a series of spaces. > > Signed-off-by: Collin Walling> Reported-by: Vasily Gorbik > --- > hw/s390x/ipl.c | 2 ++ > pc-bios/s390-ccw/main.c | 14 +++--- > 2 files changed, 9 insertions(+), 7 deletions(-) > > diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c > index fdeaec3..23b5b54 100644 > --- a/hw/s390x/ipl.c > +++ b/hw/s390x/ipl.c > @@ -352,6 +352,8 @@ int s390_ipl_set_loadparm(uint8_t *loadparm) > loadparm[i] = ascii2ebcdic[(uint8_t) lp[i]]; > } > > +memset(loadparm + i, 0x40, 8 - i); /* fill with EBCDIC spaces */ > + > g_free(lp); > return 0; > } > diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c > index 9d9f8cf..26f9adf 100644 > --- a/pc-bios/s390-ccw/main.c > +++ b/pc-bios/s390-ccw/main.c > @@ -15,11 +15,11 @@ > char stack[PAGE_SIZE * 8] __attribute__((__aligned__(PAGE_SIZE))); > static SubChannelId blk_schid = { .one = 1 }; > IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE))); > -static char loadparm[8] = { 0, 0, 0, 0, 0, 0, 0, 0 }; > +static char loadparm_str[9] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }; > QemuIplParameters qipl; > > #define LOADPARM_PROMPT "PROMPT " > -#define LOADPARM_EMPTY "" > +#define LOADPARM_EMPTY "" Sorry for my ignorance, but why was the old string containing dots? Thomas
Re: [Qemu-devel] [PATCH 5/5] migration: disable RDMA WRITR after postcopy started.
* 858585 jemmy (jemmy858...@gmail.com) wrote: > On Wed, Apr 11, 2018 at 11:56 PM, Dr. David Alan Gilbert >wrote: > > * Lidong Chen (jemmy858...@gmail.com) wrote: > >> RDMA write operations are performed with no notification to the destination > >> qemu, then the destination qemu can not wakeup. So disable RDMA WRITE after > >> postcopy started. > >> > >> Signed-off-by: Lidong Chen > > > > This patch needs to be near the beginning of the series; at the moment a > > bisect would lead you to the middle of the series which had return > > paths, but then would fail to work properly because it would try and use > > the RDMA code. > > I will fix this problem in next version. > > > > >> --- > >> migration/qemu-file.c | 3 ++- > >> migration/rdma.c | 12 > >> 2 files changed, 14 insertions(+), 1 deletion(-) > >> > >> diff --git a/migration/qemu-file.c b/migration/qemu-file.c > >> index 8acb574..a64ac3a 100644 > >> --- a/migration/qemu-file.c > >> +++ b/migration/qemu-file.c > >> @@ -260,7 +260,8 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t > >> block_offset, > >> int ret = f->hooks->save_page(f, f->opaque, block_offset, > >>offset, size, bytes_sent); > >> f->bytes_xfer += size; > >> -if (ret != RAM_SAVE_CONTROL_DELAYED) { > >> +if (ret != RAM_SAVE_CONTROL_DELAYED && > >> +ret != RAM_SAVE_CONTROL_NOT_SUPP) { > > > > What about f->bytes_xfer in this case? > > f->bytes_xfer should not update when RAM_SAVE_CONTROL_NOT_SUPP. > I will fix this problem in next version. > > > > > Is there anything we have to do at the switchover into postcopy to make > > sure that all pages have been received? > > ram_save_iterate invoke ram_control_after_iterate(f, RAM_CONTROL_ROUND), > so before next iteration which switchover into postcopy, all the pages > sent by previous > iteration have been received. OK, great. Dave > > > > Dave > > > >> if (bytes_sent && *bytes_sent > 0) { > >> qemu_update_position(f, *bytes_sent); > >> } else if (ret < 0) { > >> diff --git a/migration/rdma.c b/migration/rdma.c > >> index 81be482..8529ddd 100644 > >> --- a/migration/rdma.c > >> +++ b/migration/rdma.c > >> @@ -2964,6 +2964,10 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void > >> *opaque, > >> > >> CHECK_ERROR_STATE(); > >> > >> +if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) > >> { > >> +return RAM_SAVE_CONTROL_NOT_SUPP; > >> +} > >> + > >> qemu_fflush(f); > >> > >> if (size > 0) { > >> @@ -3528,6 +3532,10 @@ static int qemu_rdma_registration_start(QEMUFile > >> *f, void *opaque, > >> > >> CHECK_ERROR_STATE(); > >> > >> +if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) > >> { > >> +return 0; > >> +} > >> + > >> trace_qemu_rdma_registration_start(flags); > >> qemu_put_be64(f, RAM_SAVE_FLAG_HOOK); > >> qemu_fflush(f); > >> @@ -3550,6 +3558,10 @@ static int qemu_rdma_registration_stop(QEMUFile *f, > >> void *opaque, > >> > >> CHECK_ERROR_STATE(); > >> > >> +if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) > >> { > >> +return 0; > >> +} > >> + > >> qemu_fflush(f); > >> ret = qemu_rdma_drain_cq(f, rdma); > >> > >> -- > >> 1.8.3.1 > >> > > -- > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH] block/gluster: defend on legacy ftruncate api use
On Thu, Apr 12, 2018 at 11:21:42AM -0500, Eric Blake wrote: > n 04/12/2018 08:31 AM, Niels de Vos wrote: > > This change looks good to me, but a commit message would have been > > helpful. I suggest something like this: > > > > Gluster 4.0 changed the signature of glfs_ftruncate(). The function > > now has two additional arguments, namely prestat and poststat. These > > provide not benefit for QEMU, so ignoring them and passing NULL makes > > the gluster-block driver compile with the new Gluster version again. > > > > And maybe add this too: > > > > Glusters libgfapi uses symbol versioning and provides backwards > > compatible functions. Binaries compiled against previous versions of > > Gluster keep on functioning with the new library. > > > > >> @@ -3856,6 +3857,9 @@ if test "$glusterfs" != "no" ; then > >>glusterfs_fallocate="yes" > >>glusterfs_zerofill="yes" > >> fi > >> +if ! $pkg_config --atleast-version=7.4 glusterfs-api; then > >> + glusterfs_legacy_ftruncate="yes" > > Also, version-based tests are lousy. Feature-based tests (does a call > to glfs_ftruncate(0, 0) compile without error? then we are legacy) are > less brittle, especially when features can be backported across > versions. So this should be reworked to a compile check, rather than a > version query. In this case, the cange to glfs_ftruncate() will not be backported in the releases from the Gluster Community or any distributions that I am familiar with. I agree that a compile-test is safer and can understand that those are preferred over pkg-config version checks. Niels
Re: [Qemu-devel] [PATCH 1/4] pc-bios/s390-ccw: rename MAX_TABLE_ENTRIES to MAX_BOOT_ENTRIES
On 10.04.2018 17:01, Collin Walling wrote: > The MAX_TABLE_ENTRIES constant has a name that is too generic. As we > want to declare a limit for boot menu entries, let's rename it to a more > fitting MAX_BOOT_ENTRIES and set its value to 31 (30 boot entries and > 1 default entry). Also we move it from bootmap.h to s390-ccw.h to make > it available for menu.c in a later patch. > > Signed-off-by: Collin Walling> --- > pc-bios/s390-ccw/bootmap.c | 6 +++--- > pc-bios/s390-ccw/bootmap.h | 2 -- > pc-bios/s390-ccw/s390-ccw.h | 2 ++ > 3 files changed, 5 insertions(+), 5 deletions(-) Reviewed-by: Thomas Huth
Re: [Qemu-devel] [Qemu-arm] [PATCH for-2.13 3/4] arm: always start from first_cpu when registering loader cpu reset callback
On 12 April 2018 at 17:40, Igor Mammedovwrote: > if arm_load_kernel() were passed non first_cpu, QEMU would end up > with partially set do_cpu_reset() callback leaving some CPUs without it. > > Make sure that do_cpu_reset() is registered for all CPUs by enumerating > CPUs from first_cpu. > > Signed-off-by: Igor Mammedov > --- > hw/arm/boot.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/hw/arm/boot.c b/hw/arm/boot.c > index 2f464ca..2591fee 100644 > --- a/hw/arm/boot.c > +++ b/hw/arm/boot.c > @@ -1188,7 +1188,7 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info > *info) > * actually loading a kernel, the handler is also responsible for > * arranging that we start it correctly. > */ > -for (cs = CPU(cpu); cs; cs = CPU_NEXT(cs)) { > +for (cs = first_cpu; cs; cs = CPU_NEXT(cs)) { > qemu_register_reset(do_cpu_reset, ARM_CPU(cs)); > } > } Definitely a bug fix, so: Reviewed-by: Peter Maydell I think though that in at least some cases we'll still mishandle being passed anything other than first_cpu as the CPU pointer, because in do_cpu_reset() we do some checks for "do this if cs == first_cpu", on the assumption that first_cpu is the primary CPU that we're booting. We should instead I suppose be checking against the CPU pointer we're given as the arm_load_kernel() argument (which I think do_cpu_reset() can get at via info->load_kernel_notifier.cpu). We should probably analyse which boards actually pass anything other than first_cpu -- I suspect it will end up just being the xilinx board which has both A and R profile cores... thanks -- PMM
Re: [Qemu-devel] [PATCH for-2.13 1/4] arm: reuse arm_boot_address_space() in armv7m_load_kernel()
On 12 April 2018 at 17:40, Igor Mammedovwrote: > reduce code duplication by resusing arm_boot_address_space() > > Signed-off-by: Igor Mammedov > --- > include/hw/arm/arm.h | 2 ++ > hw/arm/armv7m.c | 10 +- > hw/arm/boot.c| 16 > 3 files changed, 11 insertions(+), 17 deletions(-) This minor duplication was deliberate, because hw/arm/armv7m.c is for M profile cores only, and hw/arm/boot.c is for A/R profile cores only. I don't think that saving a single if/else in armv7m.c really justifies blurring that dividing line. thanks -- PMM
Re: [Qemu-devel] [PATCH 04/14] timer: ds1338 clarify HOUR handling
On 24 March 2018 at 19:24, Michael Davidsaverwrote: > Simplify and comment the translation between > registers and struct tm. > > Signed-off-by: Michael Davidsaver > --- > hw/timer/ds1338.c | 35 +-- > 1 file changed, 17 insertions(+), 18 deletions(-) > > diff --git a/hw/timer/ds1338.c b/hw/timer/ds1338.c > index 72a4692d60..9bcda26e51 100644 > --- a/hw/timer/ds1338.c > +++ b/hw/timer/ds1338.c > @@ -88,23 +88,23 @@ static void capture_current_time(DS1338State *s) > * which will be actually read by the data transfer operation. > */ > struct tm now; > +bool mode12 = ARRAY_FIELD_EX32(s->nvram, HOUR, SET12); > qemu_get_timedate(, s->offset); > + > s->nvram[R_SEC] = to_bcd(now.tm_sec); > s->nvram[R_MIN] = to_bcd(now.tm_min); > -if (ARRAY_FIELD_EX32(s->nvram, HOUR, SET12)) { > -int tmp = now.tm_hour; > -if (tmp % 12 == 0) { > -tmp += 12; > -} > -s->nvram[R_HOUR] = 0; > +s->nvram[R_HOUR] = 0; > +if (mode12) { > +/* map 0-23 to 1-12 am/pm */ > ARRAY_FIELD_DP32(s->nvram, HOUR, SET12, 1); > -if (tmp <= 12) { > -ARRAY_FIELD_DP32(s->nvram, HOUR, AMPM, 0); > -ARRAY_FIELD_DP32(s->nvram, HOUR, HOUR12, to_bcd(tmp)); > -} else { > -ARRAY_FIELD_DP32(s->nvram, HOUR, AMPM, 1); > -ARRAY_FIELD_DP32(s->nvram, HOUR, HOUR12, to_bcd(tmp - 12)); Oh, you're rewriting all this code from the earlier patch anyway. In that case you should definitely just have the earlier patch match the logic that the existing code did, so that it's easier to review as being a no-change patch. > +ARRAY_FIELD_DP32(s->nvram, HOUR, AMPM, now.tm_hour >= 12u); > +now.tm_hour %= 12u; /* wrap 0-23 to 0-11 */ > +if (now.tm_hour == 0u) { > +/* midnight/noon stored as 12 */ > +now.tm_hour = 12u; > } > +ARRAY_FIELD_DP32(s->nvram, HOUR, HOUR12, to_bcd(now.tm_hour)); > + > } else { > ARRAY_FIELD_DP32(s->nvram, HOUR, HOUR24, to_bcd(now.tm_hour)); > } > @@ -182,14 +182,13 @@ static int ds1338_send(I2CSlave *i2c, uint8_t data) > break; > case R_HOUR: > if (FIELD_EX32(data, HOUR, SET12)) { > -int tmp = from_bcd(FIELD_EX32(data, HOUR, HOUR12)); > +/* 12 hour (1-12) */ > +/* read and wrap 1-12 -> 0-11 */ > +now.tm_hour = from_bcd(FIELD_EX32(data, HOUR, HOUR12)) % 12u; > if (FIELD_EX32(data, HOUR, AMPM)) { > -tmp += 12; > +now.tm_hour += 12; > } > -if (tmp % 12 == 0) { > -tmp -= 12; > -} > -now.tm_hour = tmp; > + > } else { > now.tm_hour = from_bcd(FIELD_EX32(data, HOUR, HOUR24)); > } > -- > 2.11.0 > thanks -- PMM
Re: [Qemu-devel] [PATCH 03/14] timer: ds1338 persist 12-hour mode selection
On 24 March 2018 at 19:24, Michael Davidsaverwrote: > Need to save HOUR[HOUR12] bit to keep > track of guest selection of 12-hour mode. > Write through current time registers to > achieve this. Will be overwritten > by the next read/latch. > > Signed-off-by: Michael Davidsaver > --- > hw/timer/ds1338.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/hw/timer/ds1338.c b/hw/timer/ds1338.c > index b5630e214a..72a4692d60 100644 > --- a/hw/timer/ds1338.c > +++ b/hw/timer/ds1338.c > @@ -224,10 +224,8 @@ static int ds1338_send(I2CSlave *i2c, uint8_t data) > value unchanged. */ > data = (data & ~CTRL_OSF) | (data & s->nvram[s->ptr] & CTRL_OSF); > > -s->nvram[s->ptr] = data; > -} else { > -s->nvram[s->ptr] = data; > } > +s->nvram[s->ptr] = data; > inc_regptr(s); > return 0; The code change here looks like a reasonable no-behaviour-change simplification of the code, but it doesn't match up with the description in the commit message ? thanks -- PMM
Re: [Qemu-devel] [PATCH 02/14] timer: ds1338 use registerfields.h
On 24 March 2018 at 19:24, Michael Davidsaverwrote: > Use names for registers and bits except > for R_CTRL which will be dealt with later, > and isn't modeled anyway. > > Signed-off-by: Michael Davidsaver > --- > hw/timer/ds1338.c | 84 > ++- > 1 file changed, 58 insertions(+), 26 deletions(-) > @@ -61,25 +89,29 @@ static void capture_current_time(DS1338State *s) > */ > struct tm now; > qemu_get_timedate(, s->offset); > -s->nvram[0] = to_bcd(now.tm_sec); > -s->nvram[1] = to_bcd(now.tm_min); > -if (s->nvram[2] & HOURS_12) { > +s->nvram[R_SEC] = to_bcd(now.tm_sec); > +s->nvram[R_MIN] = to_bcd(now.tm_min); > +if (ARRAY_FIELD_EX32(s->nvram, HOUR, SET12)) { > int tmp = now.tm_hour; > if (tmp % 12 == 0) { > tmp += 12; > } > +s->nvram[R_HOUR] = 0; > +ARRAY_FIELD_DP32(s->nvram, HOUR, SET12, 1); > if (tmp <= 12) { > -s->nvram[2] = HOURS_12 | to_bcd(tmp); > +ARRAY_FIELD_DP32(s->nvram, HOUR, AMPM, 0); This is zeroing a bit that's already zero. > +ARRAY_FIELD_DP32(s->nvram, HOUR, HOUR12, to_bcd(tmp)); > } else { > -s->nvram[2] = HOURS_12 | HOURS_PM | to_bcd(tmp - 12); > +ARRAY_FIELD_DP32(s->nvram, HOUR, AMPM, 1); > +ARRAY_FIELD_DP32(s->nvram, HOUR, HOUR12, to_bcd(tmp - 12)); > } > } else { > -s->nvram[2] = to_bcd(now.tm_hour); > +ARRAY_FIELD_DP32(s->nvram, HOUR, HOUR24, to_bcd(now.tm_hour)); This else clause used to definitely set all the bits in nvram[2], and now it doesn't (the s->nvram[R_HOUR] = 0 is only in the if {}). For cases like this where we're setting the whole thing, I think s->nvram[R_HOUR] = R_HOUR_SET12_MASK | to_bcd(tmp); s->nvram[R_HOUR] = R_HOUR_SET12_MASK | R_HOUR_AMPM_MASK | to_bcd(tmp - 12); s->nvram[R_HOUR] = R_HOUR_HOUR24_MASK | to_bcm(now.tm_hour); is clearer than individually setting each field, but you can do the "clear and then set individual fields" approach if you like. Otherwise Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [PULL, 08/36] qapi: Remove qobject_to_X() functions
Hi Eric, On Mon, Mar 12, 2018 at 01:35:59PM -0500, Eric Blake wrote: > From: Max Reitz> > They are no longer needed now. I'm doing some off-list development which use these functions. Unfortunately i'm not subscribed (yet) to qemu-devel so obviously missed this email. Is there any alternative to these functions? My code looks something like this: QObject *oid; unsigned long id; oid = qlist_pop(list); id = qnum_get_uint(qobject_to_qnum(oid)); Since objects in the list are QObject type. Yuval > > Signed-off-by: Max Reitz > Reviewed-by: Alberto Garcia > Message-Id: <20180224154033.29559-5-mre...@redhat.com> > Reviewed-by: Eric Blake > Signed-off-by: Eric Blake > --- > include/qapi/qmp/qbool.h | 1 - > include/qapi/qmp/qdict.h | 1 - > include/qapi/qmp/qlist.h | 1 - > include/qapi/qmp/qnum.h| 1 - > include/qapi/qmp/qstring.h | 1 - > qobject/qbool.c| 11 --- > qobject/qdict.c| 11 --- > qobject/qlist.c| 11 --- > qobject/qnum.c | 11 --- > qobject/qstring.c | 11 --- > 10 files changed, 60 deletions(-) > > diff --git a/include/qapi/qmp/qbool.h b/include/qapi/qmp/qbool.h > index 629c508d34f..b9a44a1bfe0 100644 > --- a/include/qapi/qmp/qbool.h > +++ b/include/qapi/qmp/qbool.h > @@ -23,7 +23,6 @@ struct QBool { > > QBool *qbool_from_bool(bool value); > bool qbool_get_bool(const QBool *qb); > -QBool *qobject_to_qbool(const QObject *obj); > bool qbool_is_equal(const QObject *x, const QObject *y); > void qbool_destroy_obj(QObject *obj); > > diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qdict.h > index 7c6d8445495..2cc3e906f7b 100644 > --- a/include/qapi/qmp/qdict.h > +++ b/include/qapi/qmp/qdict.h > @@ -39,7 +39,6 @@ void qdict_put_obj(QDict *qdict, const char *key, QObject > *value); > void qdict_del(QDict *qdict, const char *key); > int qdict_haskey(const QDict *qdict, const char *key); > QObject *qdict_get(const QDict *qdict, const char *key); > -QDict *qobject_to_qdict(const QObject *obj); > bool qdict_is_equal(const QObject *x, const QObject *y); > void qdict_iter(const QDict *qdict, > void (*iter)(const char *key, QObject *obj, void *opaque), > diff --git a/include/qapi/qmp/qlist.h b/include/qapi/qmp/qlist.h > index 5fd976a3981..5c673acb060 100644 > --- a/include/qapi/qmp/qlist.h > +++ b/include/qapi/qmp/qlist.h > @@ -53,7 +53,6 @@ QObject *qlist_pop(QList *qlist); > QObject *qlist_peek(QList *qlist); > int qlist_empty(const QList *qlist); > size_t qlist_size(const QList *qlist); > -QList *qobject_to_qlist(const QObject *obj); > bool qlist_is_equal(const QObject *x, const QObject *y); > void qlist_destroy_obj(QObject *obj); > > diff --git a/include/qapi/qmp/qnum.h b/include/qapi/qmp/qnum.h > index 15e3971c7f1..3e47475b2cb 100644 > --- a/include/qapi/qmp/qnum.h > +++ b/include/qapi/qmp/qnum.h > @@ -68,7 +68,6 @@ double qnum_get_double(QNum *qn); > > char *qnum_to_string(QNum *qn); > > -QNum *qobject_to_qnum(const QObject *obj); > bool qnum_is_equal(const QObject *x, const QObject *y); > void qnum_destroy_obj(QObject *obj); > > diff --git a/include/qapi/qmp/qstring.h b/include/qapi/qmp/qstring.h > index 98070ef3d6a..b72843fc1b0 100644 > --- a/include/qapi/qmp/qstring.h > +++ b/include/qapi/qmp/qstring.h > @@ -30,7 +30,6 @@ const char *qstring_get_str(const QString *qstring); > void qstring_append_int(QString *qstring, int64_t value); > void qstring_append(QString *qstring, const char *str); > void qstring_append_chr(QString *qstring, int c); > -QString *qobject_to_qstring(const QObject *obj); > bool qstring_is_equal(const QObject *x, const QObject *y); > void qstring_destroy_obj(QObject *obj); > > diff --git a/qobject/qbool.c b/qobject/qbool.c > index 5be6277cca8..b58249925c2 100644 > --- a/qobject/qbool.c > +++ b/qobject/qbool.c > @@ -39,17 +39,6 @@ bool qbool_get_bool(const QBool *qb) > return qb->value; > } > > -/** > - * qobject_to_qbool(): Convert a QObject into a QBool > - */ > -QBool *qobject_to_qbool(const QObject *obj) > -{ > -if (!obj || qobject_type(obj) != QTYPE_QBOOL) { > -return NULL; > -} > -return container_of(obj, QBool, base); > -} > - > /** > * qbool_is_equal(): Test whether the two QBools are equal > */ > diff --git a/qobject/qdict.c b/qobject/qdict.c > index 1e588123d00..45c8b53361f 100644 > --- a/qobject/qdict.c > +++ b/qobject/qdict.c > @@ -37,17 +37,6 @@ QDict *qdict_new(void) > return qdict; > } > > -/** > - * qobject_to_qdict(): Convert a QObject into a QDict > - */ > -QDict *qobject_to_qdict(const QObject *obj) > -{ > -if (!obj || qobject_type(obj) != QTYPE_QDICT) { > -return NULL; > -} > -return container_of(obj, QDict, base); > -} > - > /** > * tdb_hash(): based on the hash agorithm from gdbm, via tdb > * (from module-init-tools) > diff
Re: [Qemu-devel] [PATCH v1] s390x/kvm: cleanup calls to cpu_synchronize_state()
On 12.04.2018 11:35, David Hildenbrand wrote: > We have a call to cpu_synchronize_state() on every kvm_arch_handle_exit(). > > Let's remove the ones that are no longer needed. > > Remaining places (for s390x) are in > - target/s390x/sigp.c, on the target CPU > - target/s390x/cpu.c:s390_cpu_get_crash_info() > > While at it, use kvm_cpu_synchronize_state() instead of > cpu_synchronize_state() in KVM code. (suggested by Thomas Huth) > > Signed-off-by: David Hildenbrand> --- > hw/s390x/s390-pci-inst.c | 8 > target/s390x/kvm.c | 20 +--- > 2 files changed, 1 insertion(+), 27 deletions(-) Wow, that was really a mess, looking at which different "levels" the cpu_synchronize_state was done before. Good that you cleaned it up now. Reviewed-by: Thomas Huth
Re: [Qemu-devel] [PATCH v3 12/22] target/arm: Filter cycle counter based on PMCCFILTR_EL0
On Apr 12 18:15, Peter Maydell wrote: > On 16 March 2018 at 20:31, Aaron Lindsaywrote: > > The pmu_counter_filtered and pmu_op_start/finish functions are generic > > (as opposed to PMCCNTR-specific) to allow for the implementation of > > other events. > > > > Signed-off-by: Aaron Lindsay > > --- > > target/arm/cpu.c| 3 ++ > > target/arm/cpu.h| 37 +++ > > target/arm/helper.c | 87 > > ++--- > > 3 files changed, 116 insertions(+), 11 deletions(-) > > > > diff --git a/target/arm/cpu.c b/target/arm/cpu.c > > index a2cb21e..b0d032c 100644 > > --- a/target/arm/cpu.c > > +++ b/target/arm/cpu.c > > @@ -887,6 +887,9 @@ static void arm_cpu_realizefn(DeviceState *dev, Error > > **errp) > > if (!cpu->has_pmu) { > > unset_feature(env, ARM_FEATURE_PMU); > > cpu->id_aa64dfr0 &= ~0xf00; > > +} else { > > +arm_register_pre_el_change_hook(cpu, _pre_el_change, 0); > > +arm_register_el_change_hook(cpu, _post_el_change, 0); > > You probably don't want to do this if we're using KVM, as there > are some code paths where we call do_interrupt() when using KVM > which will trigger the hooks and likely do unexpected things. > (For instance kvm_arm_handle_debug() does this to set the guest > up to take debug exceptions.) Okay, I'll surround this with `if (!kvm_enabled())` for the next pass. > > > } > > > > if (!arm_feature(env, ARM_FEATURE_EL2)) { > > diff --git a/target/arm/cpu.h b/target/arm/cpu.h > > index b0ef727..9c3b5ef 100644 > > --- a/target/arm/cpu.h > > +++ b/target/arm/cpu.h > > @@ -458,6 +458,11 @@ typedef struct CPUARMState { > > * was reset. Otherwise it stores the counter value > > */ > > uint64_t c15_ccnt; > > +/* ccnt_cached_cycles is used to hold the last cycle count when > > + * c15_ccnt holds the guest-visible count instead of the delta > > during > > + * PMU operations which require this. > > + */ > > +uint64_t ccnt_cached_cycles; > > Can this ever hold valid state at a point when we need to do VM > migration, or is it purely temporary ? I believe that as of this version of the patch it is temporary and will not need to be migrated. However, I believe it's going to be necessary to have two variables to represent the state of each counter in order to implement interrupt on overflow. > > > uint64_t pmccfiltr_el0; /* Performance Monitor Filter Register */ > > uint64_t vpidr_el2; /* Virtualization Processor ID Register */ > > uint64_t vmpidr_el2; /* Virtualization Multiprocessor ID Register > > */ > > @@ -896,15 +901,35 @@ int cpu_arm_signal_handler(int host_signum, void > > *pinfo, > > void *puc); > > > > /** > > - * pmccntr_sync > > + * pmccntr_op_start/finish > > Shouldn't this doc change and prototype change have gone with the earlier > patch that changed the implementation? Hmmm, yes. Seems like I got pmccntr_op_start and pmu_op_start confused when staging this. > > * @env: CPUARMState > > * > > - * Synchronises the counter in the PMCCNTR. This must always be called > > twice, > > - * once before any action that might affect the timer and again afterwards. > > - * The function is used to swap the state of the register if required. > > - * This only happens when not in user mode (!CONFIG_USER_ONLY) > > + * Convert the counter in the PMCCNTR between its delta form (the typical > > mode > > + * when it's enabled) and the guest-visible value. These two calls must > > always > > + * surround any action which might affect the counter, and the return value > > + * from pmccntr_op_start must be supplied as the second argument to > > + * pmccntr_op_finish. > > + */ > > +uint64_t pmccntr_op_start(CPUARMState *env); > > +void pmccntr_op_finish(CPUARMState *env, uint64_t prev_cycles); > > + > > +/** > > + * pmu_op_start/finish > > + * @env: CPUARMState > > + * > > + * Convert all PMU counters between their delta form (the typical mode when > > + * they are enabled) and the guest-visible values. These two calls must > > + * surround any action which might affect the counters, and the return > > value > > + * from pmu_op_start must be supplied as the second argument to > > pmu_op_finish. > > + */ > > +uint64_t pmu_op_start(CPUARMState *env); > > +void pmu_op_finish(CPUARMState *env, uint64_t prev_cycles); > > + > > +/** > > + * Functions to register as EL change hooks for PMU mode filtering > > */ > > -void pmccntr_sync(CPUARMState *env); > > +void pmu_pre_el_change(ARMCPU *cpu, void *ignored); > > +void pmu_post_el_change(ARMCPU *cpu, void *ignored); > > > > /* SCTLR bit meanings. Several bits have been reused in newer > > * versions of the architecture; in that case we define constants > > diff --git a/target/arm/helper.c b/target/arm/helper.c > > index 0102357..95b09d6 100644 > > ---
Re: [Qemu-devel] [PATCH v3 00/22] More fully implement ARM PMUv3
On 16 March 2018 at 20:30, Aaron Lindsaywrote: > The ARM PMU implementation currently contains a basic cycle counter, but it is > often useful to gather counts of other events and filter them based on > execution mode. These patches flesh out the implementations of various PMU > registers including PM[X]EVCNTR and PM[X]EVTYPER, add a struct definition to > represent arbitrary counter types, implement mode filtering, and add > instruction, cycle, and software increment events. Hi; sorry it's taken me a while to get to this (I've been focusing on for-2.12 work). I've now reviewed most of the patches in this set. My suggestion is that you fix the stuff I've commented on and send out a v4, and then I can take some of the patches from the start of that into target-arm.next, and I'll review the tail end of the set then. thanks -- PMM
Re: [Qemu-devel] [PATCH v3 17/22] target/arm: Split arm_ccnt_enabled into generic pmu_counter_enabled
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > Signed-off-by: Aaron Lindsay > Reviewed-by: Peter Maydell > --- > target/arm/helper.c | 11 --- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/target/arm/helper.c b/target/arm/helper.c > index f5e800e..2073d56 100644 > --- a/target/arm/helper.c > +++ b/target/arm/helper.c > @@ -1009,17 +1009,22 @@ static CPAccessResult pmreg_access_ccntr(CPUARMState > *env, > return pmreg_access(env, ri, isread); > } > > -static inline bool arm_ccnt_enabled(CPUARMState *env) > +static inline bool pmu_counter_enabled(CPUARMState *env, uint8_t counter) > { > /* Does not check PMCCFILTR_EL0, which is handled by > pmu_counter_filtered */ > - > -if (!(env->cp15.c9_pmcr & PMCRE) || !(env->cp15.c9_pmcnten & (1 << 31))) > { > +if (!(env->cp15.c9_pmcr & PMCRE) || > +!(env->cp15.c9_pmcnten & (1 << counter))) { > return false; > } > > return true; > } > > +static inline bool arm_ccnt_enabled(CPUARMState *env) > +{ > +return pmu_counter_enabled(env, 31); > +} We only use arm_ccnt_enabled() in a couple of places, so I would just make those callsites use pmu_counter_enabled(env, 31) directly. thanks -- PMM
Re: [Qemu-devel] [PATCH v3 16/22] target/arm: Implement PMOVSSET
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > Adding an array for v7VE+ CP registers was necessary so that PMOVSSET > wasn't defined for all v7 processors. > > Signed-off-by: Aaron Lindsay > --- > target/arm/helper.c | 32 +++- > 1 file changed, 31 insertions(+), 1 deletion(-) > > diff --git a/target/arm/helper.c b/target/arm/helper.c > index d4f06e6..f5e800e 100644 > --- a/target/arm/helper.c > +++ b/target/arm/helper.c > @@ -1241,9 +1241,17 @@ static void pmcntenclr_write(CPUARMState *env, const > ARMCPRegInfo *ri, > static void pmovsr_write(CPUARMState *env, const ARMCPRegInfo *ri, > uint64_t value) > { > +value &= PMU_COUNTER_MASK(env); > env->cp15.c9_pmovsr &= ~value; > } > > +static void pmovsset_write(CPUARMState *env, const ARMCPRegInfo *ri, > + uint64_t value) > +{ > +value &= PMU_COUNTER_MASK(env); > +env->cp15.c9_pmovsr |= value; > +} > + > static void pmxevtyper_write(CPUARMState *env, const ARMCPRegInfo *ri, > uint64_t value) > { > @@ -1406,7 +1414,7 @@ static const ARMCPRegInfo v7_cp_reginfo[] = { >.fieldoffset = offsetof(CPUARMState, cp15.c9_pmcnten), >.writefn = pmcntenclr_write }, > { .name = "PMOVSR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 3, > - .access = PL0_RW, .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr), > + .access = PL0_RW, .fieldoffset = offsetoflow32(CPUARMState, > cp15.c9_pmovsr), >.accessfn = pmreg_access, >.writefn = pmovsr_write, >.raw_writefn = raw_write }, This change is half of a bug fix (the other half being to make the field in the CPU struct be uint64_t rather than uint32_t). That bug fix should be in a patch of its own. pmuserenr has the same bug (uint32_t state field accessed by a STATE_AA64 sysreg). thanks -- PMM
Re: [Qemu-devel] [PATCH v3 10/22] target/arm: Allow EL change hooks to do IO
On 12 April 2018 at 18:08, Aaron Lindsaywrote: > On Apr 12 17:53, Peter Maydell wrote: >> On 16 March 2018 at 20:31, Aaron Lindsay wrote: >> > During code generation, surround CPSR writes and exception returns which >> > call the EL change hooks with gen_io_start/end. The immediate need is >> > for the PMU to access the clock and icount during EL change to support >> > mode filtering. >> > >> > Signed-off-by: Aaron Lindsay >> > --- >> > target/arm/translate-a64.c | 2 ++ >> > target/arm/translate.c | 4 >> > 2 files changed, 6 insertions(+) >> > >> > diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c >> > index 31ff047..e1ae676 100644 >> > --- a/target/arm/translate-a64.c >> > +++ b/target/arm/translate-a64.c >> > @@ -1919,7 +1919,9 @@ static void disas_uncond_b_reg(DisasContext *s, >> > uint32_t insn) >> > unallocated_encoding(s); >> > return; >> > } >> > +gen_io_start(); >> > gen_helper_exception_return(cpu_env); >> > +gen_io_end(); >> >> You don't want to call gen_io_start() or gen_io_end() unless >> tb_cflags(s->base.tb) & CF_USE_ICOUNT) is true. >> >> (Ditto in the other cases below.) > > I assume there's nothing tricky about this and updating this as follows > is sufficient? Yes, that's sufficient. (The other thing that needs to happen for a gen_io_start/end is that the insn has to end the TB -- but in all these cases that's already true as they set s->base.is_jmp = DISAS_EXIT.) thanks -- PMM
Re: [Qemu-devel] [PATCH v3 09/22] target/arm: Add pre-EL change hooks
On 12 April 2018 at 18:01, Aaron Lindsaywrote: > On Apr 12 17:49, Peter Maydell wrote: >> Do we make the guarantee that if we call the pre-change hook >> then we will definitely subsequently call the post-change hook? >> (ie does the PMU hook rely on that?) > > Yes, the PMU relies on the pre- and post- hooks being called in pairs > since they drive the state machine of sorts that exists in the variables > holding the counter values/deltas. And unless I've really screwed up the > implementation, I believe the change hooks in my patchset make that > guarantee. Yes, I think I agree. We should document in the comment that this is a guarantee that we make. thanks -- PMM
Re: [Qemu-devel] [PATCH v3 15/22] target/arm: Add ARM_FEATURE_V7VE for v7 Virtualization Extensions
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > Signed-off-by: Aaron Lindsay > --- > target/arm/cpu.c | 3 +++ > target/arm/cpu.h | 1 + > 2 files changed, 4 insertions(+) > > diff --git a/target/arm/cpu.c b/target/arm/cpu.c > index b0d032c..e544f1d 100644 > --- a/target/arm/cpu.c > +++ b/target/arm/cpu.c > @@ -765,6 +765,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error > **errp) > /* Some features automatically imply others: */ > if (arm_feature(env, ARM_FEATURE_V8)) { > set_feature(env, ARM_FEATURE_V7); > +set_feature(env, ARM_FEATURE_V7VE); > set_feature(env, ARM_FEATURE_ARM_DIV); > set_feature(env, ARM_FEATURE_LPAE); > } > @@ -1481,6 +1482,7 @@ static void cortex_a7_initfn(Object *obj) > > cpu->dtb_compatible = "arm,cortex-a7"; > set_feature(>env, ARM_FEATURE_V7); > +set_feature(>env, ARM_FEATURE_V7VE); > set_feature(>env, ARM_FEATURE_VFP4); > set_feature(>env, ARM_FEATURE_NEON); > set_feature(>env, ARM_FEATURE_THUMB2EE); > @@ -1526,6 +1528,7 @@ static void cortex_a15_initfn(Object *obj) > > cpu->dtb_compatible = "arm,cortex-a15"; > set_feature(>env, ARM_FEATURE_V7); > +set_feature(>env, ARM_FEATURE_V7VE); > set_feature(>env, ARM_FEATURE_VFP4); > set_feature(>env, ARM_FEATURE_NEON); > set_feature(>env, ARM_FEATURE_THUMB2EE); > diff --git a/target/arm/cpu.h b/target/arm/cpu.h > index fb2f983..cc1e2fb 100644 > --- a/target/arm/cpu.h > +++ b/target/arm/cpu.h > @@ -1439,6 +1439,7 @@ enum arm_features { > ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling. */ > ARM_FEATURE_THUMB2EE, > ARM_FEATURE_V7MP,/* v7 Multiprocessing Extensions */ > +ARM_FEATURE_V7VE,/* v7 with Virtualization Extensions */ > ARM_FEATURE_V4T, > ARM_FEATURE_V5, > ARM_FEATURE_STRONGARM, What's the difference between this and ARM_FEATURE_EL2 ? thanks -- PMM
[Qemu-devel] [PATCH v3] RFC: target/arm: Send interrupts on PMU counter overflow
On Mar 16 16:30, Aaron Lindsay wrote: > I aim to eventually add raising interrupts on counter overflow, but that is > not > covered by this patchset. I think I have a reasonable grasp of the mechanics > of > *how* to raise them, but am curious if anyone has thoughts on how to determine > *when* to raise them - we don't want to call into PMU code every time an > instruction is executed to check if any instruction counters have overflowed, > etc. The main candidate I've seen for doing this so far would be to set up a > QEMUTimer, but I haven't fully explored it. Does that seem plausible? Any > other/better ideas? I'm planning to post a full v4 of this patchset soon, pending a few review fixes, but I figured I'd throw out an early version of a patch to add interrupts on overflow in case it obviously has major issues that will need to be addressed. This patch sets up a QEMUTimer to get a callback when we expect counters to next overflow and triggers an interrupt at that time. Signed-off-by: Aaron Lindsay--- target/arm/cpu.c| 11 + target/arm/cpu.h| 7 +++ target/arm/helper.c | 129 3 files changed, 138 insertions(+), 9 deletions(-) diff --git a/target/arm/cpu.c b/target/arm/cpu.c index df27188..9108c6b 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -740,6 +740,12 @@ static void arm_cpu_finalizefn(Object *obj) QLIST_REMOVE(hook, node); g_free(hook); } +#ifndef CONFIG_USER_ONLY +if (arm_feature(>env, ARM_FEATURE_PMU)) { +timer_deinit(cpu->pmu_timer); +timer_free(cpu->pmu_timer); +} +#endif } static void arm_cpu_realizefn(DeviceState *dev, Error **errp) @@ -907,6 +913,11 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp) arm_register_pre_el_change_hook(cpu, _pre_el_change, 0); arm_register_el_change_hook(cpu, _post_el_change, 0); + +#ifndef CONFIG_USER_ONLY +cpu->pmu_timer = timer_new(QEMU_CLOCK_VIRTUAL, 1, arm_pmu_timer_cb, +cpu); +#endif } else { cpu->pmceid0 = 0x; cpu->pmceid1 = 0x; diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 5e6bbd3..bc0867f 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -703,6 +703,8 @@ struct ARMCPU { /* Timers used by the generic (architected) timer */ QEMUTimer *gt_timer[NUM_GTIMERS]; +/* Timer used by the PMU */ +QEMUTimer *pmu_timer; /* GPIO outputs for generic timer */ qemu_irq gt_timer_outputs[NUM_GTIMERS]; /* GPIO output for GICv3 maintenance interrupt signal */ @@ -934,6 +936,11 @@ void pmu_op_start(CPUARMState *env); void pmu_op_finish(CPUARMState *env); /** + * Called when a PMU counter is due to overflow + */ +void arm_pmu_timer_cb(void *opaque); + +/** * Functions to register as EL change hooks for PMU mode filtering */ void pmu_pre_el_change(ARMCPU *cpu, void *ignored); diff --git a/target/arm/helper.c b/target/arm/helper.c index 2147678..abe24dc 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -905,6 +905,7 @@ static const ARMCPRegInfo v6_cp_reginfo[] = { /* Definitions for the PMU registers */ #define PMCRN_MASK 0xf800 #define PMCRN_SHIFT 11 +#define PMCRLC 0x40 #define PMCRD 0x8 #define PMCRC 0x4 #define PMCRP 0x2 @@ -919,6 +920,8 @@ static const ARMCPRegInfo v6_cp_reginfo[] = { #define PMXEVTYPER_MT 0x0200 #define PMXEVTYPER_EVTCOUNT 0x03ff +#define PMEVCNTR_OVERFLOW_MASK ((uint64_t)1 << 31) + #define PMCCFILTR 0xf800 #define PMCCFILTR_M PMXEVTYPER_M #define PMCCFILTR_EL0 (PMCCFILTR | PMCCFILTR_M) @@ -934,6 +937,11 @@ typedef struct pm_event { /* Retrieve the current count of the underlying event. The programmed * counters hold a difference from the return value from this function */ uint64_t (*get_count)(CPUARMState *); +/* Return how many nanoseconds it will take (at a minimum) for count events + * to occur. A negative value indicates the counter will never overflow, or + * that the counter has otherwise arranged for the overflow bit to be set + * and the PMU interrupt to be raised on overflow. */ +int64_t (*ns_per_count)(uint64_t); } pm_event; static bool event_always_supported(CPUARMState *env) @@ -950,6 +958,11 @@ static uint64_t swinc_get_count(CPUARMState *env) return 0; } +static int64_t swinc_ns_per(uint64_t ignored) +{ +return -1; +} + /* * Return the underlying cycle count for the PMU cycle counters. If we're in * usermode, simply return 0. @@ -965,6 +978,11 @@ static uint64_t cycles_get_count(CPUARMState *env) } #ifndef CONFIG_USER_ONLY +static int64_t cycles_ns_per(uint64_t cycles) +{ +return ARM_CPU_FREQ/NANOSECONDS_PER_SECOND; +} + static bool instructions_supported(CPUARMState *env) { return use_icount == 1 /* Precise instruction counting */; @@ -974,22 +992,30 @@ static uint64_t
Re: [Qemu-devel] [PATCH v3 12/22] target/arm: Filter cycle counter based on PMCCFILTR_EL0
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > The pmu_counter_filtered and pmu_op_start/finish functions are generic > (as opposed to PMCCNTR-specific) to allow for the implementation of > other events. > > Signed-off-by: Aaron Lindsay > --- > target/arm/cpu.c| 3 ++ > target/arm/cpu.h| 37 +++ > target/arm/helper.c | 87 > ++--- > 3 files changed, 116 insertions(+), 11 deletions(-) > > diff --git a/target/arm/cpu.c b/target/arm/cpu.c > index a2cb21e..b0d032c 100644 > --- a/target/arm/cpu.c > +++ b/target/arm/cpu.c > @@ -887,6 +887,9 @@ static void arm_cpu_realizefn(DeviceState *dev, Error > **errp) > if (!cpu->has_pmu) { > unset_feature(env, ARM_FEATURE_PMU); > cpu->id_aa64dfr0 &= ~0xf00; > +} else { > +arm_register_pre_el_change_hook(cpu, _pre_el_change, 0); > +arm_register_el_change_hook(cpu, _post_el_change, 0); You probably don't want to do this if we're using KVM, as there are some code paths where we call do_interrupt() when using KVM which will trigger the hooks and likely do unexpected things. (For instance kvm_arm_handle_debug() does this to set the guest up to take debug exceptions.) > } > > if (!arm_feature(env, ARM_FEATURE_EL2)) { > diff --git a/target/arm/cpu.h b/target/arm/cpu.h > index b0ef727..9c3b5ef 100644 > --- a/target/arm/cpu.h > +++ b/target/arm/cpu.h > @@ -458,6 +458,11 @@ typedef struct CPUARMState { > * was reset. Otherwise it stores the counter value > */ > uint64_t c15_ccnt; > +/* ccnt_cached_cycles is used to hold the last cycle count when > + * c15_ccnt holds the guest-visible count instead of the delta during > + * PMU operations which require this. > + */ > +uint64_t ccnt_cached_cycles; Can this ever hold valid state at a point when we need to do VM migration, or is it purely temporary ? > uint64_t pmccfiltr_el0; /* Performance Monitor Filter Register */ > uint64_t vpidr_el2; /* Virtualization Processor ID Register */ > uint64_t vmpidr_el2; /* Virtualization Multiprocessor ID Register */ > @@ -896,15 +901,35 @@ int cpu_arm_signal_handler(int host_signum, void *pinfo, > void *puc); > > /** > - * pmccntr_sync > + * pmccntr_op_start/finish Shouldn't this doc change and prototype change have gone with the earlier patch that changed the implementation? > * @env: CPUARMState > * > - * Synchronises the counter in the PMCCNTR. This must always be called twice, > - * once before any action that might affect the timer and again afterwards. > - * The function is used to swap the state of the register if required. > - * This only happens when not in user mode (!CONFIG_USER_ONLY) > + * Convert the counter in the PMCCNTR between its delta form (the typical > mode > + * when it's enabled) and the guest-visible value. These two calls must > always > + * surround any action which might affect the counter, and the return value > + * from pmccntr_op_start must be supplied as the second argument to > + * pmccntr_op_finish. > + */ > +uint64_t pmccntr_op_start(CPUARMState *env); > +void pmccntr_op_finish(CPUARMState *env, uint64_t prev_cycles); > + > +/** > + * pmu_op_start/finish > + * @env: CPUARMState > + * > + * Convert all PMU counters between their delta form (the typical mode when > + * they are enabled) and the guest-visible values. These two calls must > + * surround any action which might affect the counters, and the return value > + * from pmu_op_start must be supplied as the second argument to > pmu_op_finish. > + */ > +uint64_t pmu_op_start(CPUARMState *env); > +void pmu_op_finish(CPUARMState *env, uint64_t prev_cycles); > + > +/** > + * Functions to register as EL change hooks for PMU mode filtering > */ > -void pmccntr_sync(CPUARMState *env); > +void pmu_pre_el_change(ARMCPU *cpu, void *ignored); > +void pmu_post_el_change(ARMCPU *cpu, void *ignored); > > /* SCTLR bit meanings. Several bits have been reused in newer > * versions of the architecture; in that case we define constants > diff --git a/target/arm/helper.c b/target/arm/helper.c > index 0102357..95b09d6 100644 > --- a/target/arm/helper.c > +++ b/target/arm/helper.c > @@ -908,6 +908,15 @@ static const ARMCPRegInfo v6_cp_reginfo[] = { > #define PMCRC 0x4 > #define PMCRE 0x1 > > +#define PMXEVTYPER_P 0x8000 > +#define PMXEVTYPER_U 0x4000 > +#define PMXEVTYPER_NSK0x2000 > +#define PMXEVTYPER_NSU0x1000 > +#define PMXEVTYPER_NSH0x0800 > +#define PMXEVTYPER_M 0x0400 > +#define PMXEVTYPER_MT 0x0200 > +#define PMXEVTYPER_EVTCOUNT 0x03ff > + > #define PMU_NUM_COUNTERS(env) ((env->cp15.c9_pmcr & PMCRN_MASK) >> > PMCRN_SHIFT) > /* Bits allowed to be set/cleared for PMCNTEN* and PMINTEN* */ > #define
Re: [Qemu-devel] [PATCH v3 10/22] target/arm: Allow EL change hooks to do IO
On Apr 12 17:53, Peter Maydell wrote: > On 16 March 2018 at 20:31, Aaron Lindsaywrote: > > During code generation, surround CPSR writes and exception returns which > > call the EL change hooks with gen_io_start/end. The immediate need is > > for the PMU to access the clock and icount during EL change to support > > mode filtering. > > > > Signed-off-by: Aaron Lindsay > > --- > > target/arm/translate-a64.c | 2 ++ > > target/arm/translate.c | 4 > > 2 files changed, 6 insertions(+) > > > > diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c > > index 31ff047..e1ae676 100644 > > --- a/target/arm/translate-a64.c > > +++ b/target/arm/translate-a64.c > > @@ -1919,7 +1919,9 @@ static void disas_uncond_b_reg(DisasContext *s, > > uint32_t insn) > > unallocated_encoding(s); > > return; > > } > > +gen_io_start(); > > gen_helper_exception_return(cpu_env); > > +gen_io_end(); > > You don't want to call gen_io_start() or gen_io_end() unless > tb_cflags(s->base.tb) & CF_USE_ICOUNT) is true. > > (Ditto in the other cases below.) I assume there's nothing tricky about this and updating this as follows is sufficient? > > +if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) { > > +gen_io_start(); > > +} > > gen_helper_exception_return(cpu_env); > > +if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) { > > +gen_io_end(); > > +} -Aaron > > > /* Must exit loop to check un-masked IRQs */ > > s->base.is_jmp = DISAS_EXIT; > > return; > > diff --git a/target/arm/translate.c b/target/arm/translate.c > > index ba6ab7d..fd5871e 100644 > > --- a/target/arm/translate.c > > +++ b/target/arm/translate.c > > @@ -4536,7 +4536,9 @@ static void gen_rfe(DisasContext *s, TCGv_i32 pc, > > TCGv_i32 cpsr) > > * appropriately depending on the new Thumb bit, so it must > > * be called after storing the new PC. > > */ > > +gen_io_start(); > > gen_helper_cpsr_write_eret(cpu_env, cpsr); > > +gen_io_end(); > > tcg_temp_free_i32(cpsr); > > /* Must exit loop to check un-masked IRQs */ > > s->base.is_jmp = DISAS_EXIT; > > @@ -9828,7 +9830,9 @@ static void disas_arm_insn(DisasContext *s, unsigned > > int insn) > > if (exc_return) { > > /* Restore CPSR from SPSR. */ > > tmp = load_cpu_field(spsr); > > +gen_io_start(); > > gen_helper_cpsr_write_eret(cpu_env, tmp); > > +gen_io_end(); > > tcg_temp_free_i32(tmp); > > /* Must exit loop to check un-masked IRQs */ > > s->base.is_jmp = DISAS_EXIT; > > -- > > Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, > > Inc. > > Qualcomm Technologies, Inc. is a member of the > > Code Aurora Forum, a Linux Foundation Collaborative Project. > > > > thanks > -- PMM -- Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[Qemu-devel] [RFC] Intermediate block mirroring
Hello, I mentioned this some time ago, but I'd like to retake it now: I'm checking how to copy arbitrary nodes on a backing chain, so if I have e.g. [A] <- [B] <- [C] <- [D] I'd like to end up with [A] <- [E] <- [C] <- [D] where [E] is a copy of [B]. The most obvious use case is to move [B] to a different storage backend. At the moment we can already copy [B] into [E] (outside QEMU) and then do change-backing-file device=[D] image-node-name=[C] backing-file=[E] However this only changes the image on disk and the bs->backing_file string in memory. QEMU keeps using the [B] BlockDriverState, and we need to make it switch to [E]. One possible way to do this would be to modify blockdev-mirror so the source image can be located anywhere on a chain. Currently there's bs->backing_blocker preventing that, plus qmp_blockdev_mirror() refuses to take any non-root source node. Other than permission changes I don't think the algorithm itself needs any additional modification, although I suppose we'd like to change the backing file as part of the same job, and that would require API changes. One other way is to have a more generic replace-node command which would call bdrv_replace_node(), but I don't know if we want to expose that and I don't have any other use case for it at the moment. Opinions, comments? Berto
Re: [Qemu-devel] [PATCH v3 09/22] target/arm: Add pre-EL change hooks
On Apr 12 17:49, Peter Maydell wrote: > On 16 March 2018 at 20:31, Aaron Lindsaywrote: > > Because the design of the PMU requires that the counter values be > > converted between their delta and guest-visible forms for mode > > filtering, an additional hook which occurs before the EL is changed is > > necessary. > > > > Signed-off-by: Aaron Lindsay > > --- > > target/arm/cpu.c | 13 + > > target/arm/cpu.h | 12 > > target/arm/helper.c| 14 -- > > target/arm/internals.h | 7 +++ > > target/arm/op_helper.c | 8 > > 5 files changed, 44 insertions(+), 10 deletions(-) > > > > diff --git a/target/arm/cpu.c b/target/arm/cpu.c > > index 5f782bf..a2cb21e 100644 > > --- a/target/arm/cpu.c > > +++ b/target/arm/cpu.c > > @@ -55,6 +55,18 @@ static bool arm_cpu_has_work(CPUState *cs) > > | CPU_INTERRUPT_EXITTB); > > } > > > > +void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook, > > + void *opaque) > > +{ > > +ARMELChangeHook *entry; > > +entry = g_malloc0(sizeof (*entry)); > > g_new0(). > > > + > > +entry->hook = hook; > > +entry->opaque = opaque; > > + > > +QLIST_INSERT_HEAD(>pre_el_change_hooks, entry, node); > > +} > > + > > void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook, > > void *opaque) > > { > > @@ -747,6 +759,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error > > **errp) > > return; > > } > > > > +QLIST_INIT(>pre_el_change_hooks); > > QLIST_INIT(>el_change_hooks); > > > > /* Some features automatically imply others: */ > > diff --git a/target/arm/cpu.h b/target/arm/cpu.h > > index 3b45d3d..b0ef727 100644 > > --- a/target/arm/cpu.h > > +++ b/target/arm/cpu.h > > @@ -832,6 +832,7 @@ struct ARMCPU { > > */ > > bool cfgend; > > > > +QLIST_HEAD(, ARMELChangeHook) pre_el_change_hooks; > > QLIST_HEAD(, ARMELChangeHook) el_change_hooks; > > > > int32_t node_id; /* NUMA node this CPU belongs to */ > > @@ -2895,12 +2896,15 @@ static inline AddressSpace > > *arm_addressspace(CPUState *cs, MemTxAttrs attrs) > > #endif > > > > /** > > + * arm_register_pre_el_change_hook: > > * arm_register_el_change_hook: > > - * Register a hook function which will be called back whenever this > > - * CPU changes exception level or mode. The hook function will be > > - * passed a pointer to the ARMCPU and the opaque data pointer passed > > - * to this function when the hook was registered. > > + * Register a hook function which will be called back before or after this > > CPU > > + * changes exception level or mode. The hook function will be passed a > > pointer > > + * to the ARMCPU and the opaque data pointer passed to this function when > > the > > + * hook was registered. > > */ > > I would just have one doc comment for each function, rather than > trying to share. (Some day we may actually autogenerate HTML docs > from these comments...) > > Do we make the guarantee that if we call the pre-change hook > then we will definitely subsequently call the post-change hook? > (ie does the PMU hook rely on that?) Yes, the PMU relies on the pre- and post- hooks being called in pairs since they drive the state machine of sorts that exists in the variables holding the counter values/deltas. And unless I've really screwed up the implementation, I believe the change hooks in my patchset make that guarantee. -Aaron > > Otherwise looks OK. > > thanks > -- PMM -- Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [Qemu-devel] [PATCH v3 04/22] target/arm: Treat PMCCNTR as alias of PMCCNTR_EL0
On Apr 12 17:10, Peter Maydell wrote: > On 16 March 2018 at 20:31, Aaron Lindsaywrote: > > They share the same underlying state > > > > Signed-off-by: Aaron Lindsay > > --- > > target/arm/helper.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/target/arm/helper.c b/target/arm/helper.c > > index 5e48982..5634561 100644 > > --- a/target/arm/helper.c > > +++ b/target/arm/helper.c > > @@ -1318,7 +1318,7 @@ static const ARMCPRegInfo v7_cp_reginfo[] = { > >.fieldoffset = offsetof(CPUARMState, cp15.c9_pmselr), > >.writefn = pmselr_write, .raw_writefn = raw_write, }, > > { .name = "PMCCNTR", .cp = 15, .crn = 9, .crm = 13, .opc1 = 0, .opc2 = > > 0, > > - .access = PL0_RW, .resetvalue = 0, .type = ARM_CP_IO, > > + .access = PL0_RW, .resetvalue = 0, .type = ARM_CP_ALIAS | ARM_CP_IO, > >.readfn = pmccntr_read, .writefn = pmccntr_write32, > >.accessfn = pmreg_access_ccntr }, > > { .name = "PMCCNTR_EL0", .state = ARM_CP_STATE_AA64, > > -- > > Does this fix an observed bug (presumably migration related), > or is it just something you saw in code inspection ? > Worth noting in the commit message if the former. Purely code inspection. I forget now, but I think I found it after chasing down some other register issue in the vicinity and noticed this while digging around. -Aaron > Reviewed-by: Peter Maydell > > thanks > -- PMM -- Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [Qemu-devel] [PATCH v3 10/22] target/arm: Allow EL change hooks to do IO
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > During code generation, surround CPSR writes and exception returns which > call the EL change hooks with gen_io_start/end. The immediate need is > for the PMU to access the clock and icount during EL change to support > mode filtering. > > Signed-off-by: Aaron Lindsay > --- > target/arm/translate-a64.c | 2 ++ > target/arm/translate.c | 4 > 2 files changed, 6 insertions(+) > > diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c > index 31ff047..e1ae676 100644 > --- a/target/arm/translate-a64.c > +++ b/target/arm/translate-a64.c > @@ -1919,7 +1919,9 @@ static void disas_uncond_b_reg(DisasContext *s, > uint32_t insn) > unallocated_encoding(s); > return; > } > +gen_io_start(); > gen_helper_exception_return(cpu_env); > +gen_io_end(); You don't want to call gen_io_start() or gen_io_end() unless tb_cflags(s->base.tb) & CF_USE_ICOUNT) is true. (Ditto in the other cases below.) > /* Must exit loop to check un-masked IRQs */ > s->base.is_jmp = DISAS_EXIT; > return; > diff --git a/target/arm/translate.c b/target/arm/translate.c > index ba6ab7d..fd5871e 100644 > --- a/target/arm/translate.c > +++ b/target/arm/translate.c > @@ -4536,7 +4536,9 @@ static void gen_rfe(DisasContext *s, TCGv_i32 pc, > TCGv_i32 cpsr) > * appropriately depending on the new Thumb bit, so it must > * be called after storing the new PC. > */ > +gen_io_start(); > gen_helper_cpsr_write_eret(cpu_env, cpsr); > +gen_io_end(); > tcg_temp_free_i32(cpsr); > /* Must exit loop to check un-masked IRQs */ > s->base.is_jmp = DISAS_EXIT; > @@ -9828,7 +9830,9 @@ static void disas_arm_insn(DisasContext *s, unsigned > int insn) > if (exc_return) { > /* Restore CPSR from SPSR. */ > tmp = load_cpu_field(spsr); > +gen_io_start(); > gen_helper_cpsr_write_eret(cpu_env, tmp); > +gen_io_end(); > tcg_temp_free_i32(tmp); > /* Must exit loop to check un-masked IRQs */ > s->base.is_jmp = DISAS_EXIT; > -- > Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, > Inc. > Qualcomm Technologies, Inc. is a member of the > Code Aurora Forum, a Linux Foundation Collaborative Project. > thanks -- PMM
Re: [Qemu-devel] [PATCH v3 09/22] target/arm: Add pre-EL change hooks
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > Because the design of the PMU requires that the counter values be > converted between their delta and guest-visible forms for mode > filtering, an additional hook which occurs before the EL is changed is > necessary. > > Signed-off-by: Aaron Lindsay > --- > target/arm/cpu.c | 13 + > target/arm/cpu.h | 12 > target/arm/helper.c| 14 -- > target/arm/internals.h | 7 +++ > target/arm/op_helper.c | 8 > 5 files changed, 44 insertions(+), 10 deletions(-) > > diff --git a/target/arm/cpu.c b/target/arm/cpu.c > index 5f782bf..a2cb21e 100644 > --- a/target/arm/cpu.c > +++ b/target/arm/cpu.c > @@ -55,6 +55,18 @@ static bool arm_cpu_has_work(CPUState *cs) > | CPU_INTERRUPT_EXITTB); > } > > +void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook, > + void *opaque) > +{ > +ARMELChangeHook *entry; > +entry = g_malloc0(sizeof (*entry)); g_new0(). > + > +entry->hook = hook; > +entry->opaque = opaque; > + > +QLIST_INSERT_HEAD(>pre_el_change_hooks, entry, node); > +} > + > void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook, > void *opaque) > { > @@ -747,6 +759,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error > **errp) > return; > } > > +QLIST_INIT(>pre_el_change_hooks); > QLIST_INIT(>el_change_hooks); > > /* Some features automatically imply others: */ > diff --git a/target/arm/cpu.h b/target/arm/cpu.h > index 3b45d3d..b0ef727 100644 > --- a/target/arm/cpu.h > +++ b/target/arm/cpu.h > @@ -832,6 +832,7 @@ struct ARMCPU { > */ > bool cfgend; > > +QLIST_HEAD(, ARMELChangeHook) pre_el_change_hooks; > QLIST_HEAD(, ARMELChangeHook) el_change_hooks; > > int32_t node_id; /* NUMA node this CPU belongs to */ > @@ -2895,12 +2896,15 @@ static inline AddressSpace *arm_addressspace(CPUState > *cs, MemTxAttrs attrs) > #endif > > /** > + * arm_register_pre_el_change_hook: > * arm_register_el_change_hook: > - * Register a hook function which will be called back whenever this > - * CPU changes exception level or mode. The hook function will be > - * passed a pointer to the ARMCPU and the opaque data pointer passed > - * to this function when the hook was registered. > + * Register a hook function which will be called back before or after this > CPU > + * changes exception level or mode. The hook function will be passed a > pointer > + * to the ARMCPU and the opaque data pointer passed to this function when the > + * hook was registered. > */ I would just have one doc comment for each function, rather than trying to share. (Some day we may actually autogenerate HTML docs from these comments...) Do we make the guarantee that if we call the pre-change hook then we will definitely subsequently call the post-change hook? (ie does the PMU hook rely on that?) Otherwise looks OK. thanks -- PMM
Re: [Qemu-devel] [PATCH v3 11/22] target/arm: Fix bitmask for PMCCFILTR writes
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > It was shifted to the left one bit too few. > > Signed-off-by: Aaron Lindsay > --- > target/arm/helper.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/target/arm/helper.c b/target/arm/helper.c > index 50eaed7..0102357 100644 > --- a/target/arm/helper.c > +++ b/target/arm/helper.c > @@ -1123,7 +1123,7 @@ static void pmccfiltr_write(CPUARMState *env, const > ARMCPRegInfo *ri, > uint64_t value) > { > uint64_t saved_cycles = pmccntr_op_start(env); > -env->cp15.pmccfiltr_el0 = value & 0x7E00; > +env->cp15.pmccfiltr_el0 = value & 0xfc00; > pmccntr_op_finish(env, saved_cycles); > } > I wonder why we got that one wrong. Reviewed-by: Peter Maydell Strictly speaking, bit 26 (M) should be visible only in the AArch64 view of the register, not the AArch32 one, but that's a separate issue. thanks -- PMM
[Qemu-devel] [PATCH for-2.13 4/4] arm/boot: split load_dtb() from arm_load_kernel()
load_dtb() depends on arm_load_kernel() to figure out place in RAM where it should be loaded, but it's not required for arm_load_kernel() to work. Sometimes it's neccesary for devices added with -device/device_add to be enumerated in DTB as well, which's lead to [1] and surrounding commits to add 2 more machine_done notifiers with non obvious ordering to make dynamic sysbus devices initialization happen in the right order. However instead of moving whole arm_load_kernel() in to machine_done, it's sufficient to move only load_dtb() into virt_machine_done() notifier and remove ArmLoadKernelNotifier/ /PlatformBusFDTNotifierParams notifiers, which saves us ~90LOC and simplifies code flow quite a bit. Later would allow to consolidate DTB generation within one function for 'mach-virt' board and make it reentrant so it could generate updated DTB in device hotplug secenarios. While at it rename load_dtb() to arm_load_dtb() since it's public now. Add additional field skip_dtb_autoload to struct arm_boot_info to allow manual DTB load later in mach-virt and to avoid touching all other boards to explicitly call arm_load_dtb(). 1) (ac9d32e hw/arm/boot: arm_load_kernel implemented as a machine init done notifier) Signed-off-by: Igor Mammedov--- include/hw/arm/arm.h| 37 ++--- include/hw/arm/sysbus-fdt.h | 37 + hw/arm/boot.c | 67 +++-- hw/arm/sysbus-fdt.c | 64 --- hw/arm/virt.c | 64 --- 5 files changed, 86 insertions(+), 183 deletions(-) diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h index 188d18b..312e533 100644 --- a/include/hw/arm/arm.h +++ b/include/hw/arm/arm.h @@ -39,15 +39,6 @@ DeviceState *armv7m_init(MemoryRegion *system_memory, int mem_size, int num_irq, */ void armv7m_load_kernel(ARMCPU *cpu, const char *kernel_filename, int mem_size); -/* - * struct used as a parameter of the arm_load_kernel machine init - * done notifier - */ -typedef struct { -Notifier notifier; /* actual notifier */ -ARMCPU *cpu; /* handle to the first cpu object */ -} ArmLoadKernelNotifier; - /* arm_boot.c */ struct arm_boot_info { uint64_t ram_size; @@ -56,6 +47,9 @@ struct arm_boot_info { const char *initrd_filename; const char *dtb_filename; hwaddr loader_start; +hwaddr dtb_start; +hwaddr dtb_limit; +bool skip_dtb_autoload; /* multicore boards that use the default secondary core boot functions * need to put the address of the secondary boot code, the boot reg, * and the GIC address in the next 3 values, respectively. boards that @@ -95,7 +89,6 @@ struct arm_boot_info { */ void (*modify_dtb)(const struct arm_boot_info *info, void *fdt); /* machine init done notifier executing arm_load_dtb */ -ArmLoadKernelNotifier load_kernel_notifier; /* Used internally by arm_boot.c */ int is_linux; hwaddr initrd_start; @@ -145,6 +138,30 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info); AddressSpace *arm_boot_address_space(ARMCPU *cpu, bool secure_boot); +/** + * arm_load_dtb() - load a device tree binary image into memory + * @addr: the address to load the image at + * @binfo: struct describing the boot environment + * @addr_limit: upper limit of the available memory area at @addr + * @as: address space to load image to + * + * Load a device tree supplied by the machine or by the user with the + * '-dtb' command line option, and put it at offset @addr in target + * memory. + * + * If @addr_limit contains a meaningful value (i.e., it is strictly greater + * than @addr), the device tree is only loaded if its size does not exceed + * the limit. + * + * Returns: the size of the device tree image on success, + * 0 if the image size exceeds the limit, + * -1 on errors. + * + * Note: Must not be called unless have_dtb(binfo) is true. + */ +int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo, + hwaddr addr_limit, AddressSpace *as); + /* Write a secure board setup routine with a dummy handler for SMCs */ void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu, const struct arm_boot_info *info, diff --git a/include/hw/arm/sysbus-fdt.h b/include/hw/arm/sysbus-fdt.h index e15bb81..340c382 100644 --- a/include/hw/arm/sysbus-fdt.h +++ b/include/hw/arm/sysbus-fdt.h @@ -24,37 +24,14 @@ #ifndef HW_ARM_SYSBUS_FDT_H #define HW_ARM_SYSBUS_FDT_H -#include "hw/arm/arm.h" -#include "qemu-common.h" -#include "hw/sysbus.h" - -/* - * struct that contains dimensioning parameters of the platform bus - */ -typedef struct { -hwaddr platform_bus_base; /* start address of the bus */ -hwaddr platform_bus_size; /* size of the bus */ -int platform_bus_first_irq; /* first hwirq assigned
[Qemu-devel] [PATCH for-2.13 2/4] platform-bus-device: use device plug callback instead of machine_done notifier
platform-bus were using machine_done notifier to get and map (assign irq/mmio resources) dynamically added sysbus devices after all '-device' options had been processed. That however creates non obvious dependencies on ordering of machine_done notifiers and requires carefull line juggling to keep it working. For example see comment above create_platform_bus() and 'straitforward' arm_load_kernel() had to converted to machine_done notifier and that lead to yet another machine_done notifier to keep it working arm_register_platform_bus_fdt_creator(). Instead of hiding resource assignment in platform-bus-device to magically initialize sysbus devices, use device plug callback and assign resources explicitly at board level at the moment each -device option is being processed. That adds a bunch of machine declaration boiler plate to e500plat board, similar to ARM/x86 but gets rid of hidden machine_done notifier and would allow to remove the dependent notifiers in ARM code simplifying it and making code flow easier to follow. Signed-off-by: Igor Mammedov--- CC: ag...@suse.de CC: da...@gibson.dropbear.id.au CC: qemu-...@nongnu.org --- hw/ppc/e500.h | 3 +++ include/hw/arm/virt.h | 3 +++ include/hw/platform-bus.h | 4 ++-- hw/arm/sysbus-fdt.c | 3 --- hw/arm/virt.c | 36 hw/core/platform-bus.c| 29 --- hw/ppc/e500.c | 37 + hw/ppc/e500plat.c | 60 +-- 8 files changed, 129 insertions(+), 46 deletions(-) diff --git a/hw/ppc/e500.h b/hw/ppc/e500.h index 70ba1d8..d0f8ddd 100644 --- a/hw/ppc/e500.h +++ b/hw/ppc/e500.h @@ -2,6 +2,7 @@ #define PPCE500_H #include "hw/boards.h" +#include "hw/sysbus.h" typedef struct PPCE500Params { int pci_first_slot; @@ -26,6 +27,8 @@ typedef struct PPCE500Params { void ppce500_init(MachineState *machine, PPCE500Params *params); +void ppce500_plug_dynamic_sysbus_device(SysBusDevice *sbdev); + hwaddr booke206_page_size_to_tlb(uint64_t size); #endif diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h index ba0c1a4..5535760 100644 --- a/include/hw/arm/virt.h +++ b/include/hw/arm/virt.h @@ -86,11 +86,14 @@ typedef struct { bool no_pmu; bool claim_edge_triggered_timers; bool smbios_old_sys_ver; +HotplugHandler *(*get_hotplug_handler)(MachineState *machine, + DeviceState *dev); } VirtMachineClass; typedef struct { MachineState parent; Notifier machine_done; +DeviceState *platform_bus_dev; FWCfgState *fw_cfg; bool secure; bool highmem; diff --git a/include/hw/platform-bus.h b/include/hw/platform-bus.h index a00775c..19e20c5 100644 --- a/include/hw/platform-bus.h +++ b/include/hw/platform-bus.h @@ -37,8 +37,6 @@ typedef struct PlatformBusDevice PlatformBusDevice; struct PlatformBusDevice { /*< private >*/ SysBusDevice parent_obj; -Notifier notifier; -bool done_gathering; /*< public >*/ uint32_t mmio_size; @@ -54,4 +52,6 @@ int platform_bus_get_irqn(PlatformBusDevice *platform_bus, SysBusDevice *sbdev, hwaddr platform_bus_get_mmio_addr(PlatformBusDevice *pbus, SysBusDevice *sbdev, int n); +void platform_bus_link_device(PlatformBusDevice *pbus, SysBusDevice *sbdev); + #endif /* HW_PLATFORM_BUS_H */ diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c index d68e3dc..80ff70e 100644 --- a/hw/arm/sysbus-fdt.c +++ b/hw/arm/sysbus-fdt.c @@ -506,9 +506,6 @@ static void add_all_platform_bus_fdt_nodes(ARMPlatformBusFDTParams *fdt_params) dev = qdev_find_recursive(sysbus_get_default(), TYPE_PLATFORM_BUS_DEVICE); pbus = PLATFORM_BUS_DEVICE(dev); -/* We can only create dt nodes for dynamic devices when they're ready */ -assert(pbus->done_gathering); - PlatformBusFDTData data = { .fdt = fdt, .irq_start = irq_start, diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 94dcb12..2e10d8b 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -1087,6 +1087,7 @@ static void create_platform_bus(VirtMachineState *vms, qemu_irq *pic) qdev_prop_set_uint32(dev, "mmio_size", platform_bus_params.platform_bus_size); qdev_init_nofail(dev); +vms->platform_bus_dev = dev; s = SYS_BUS_DEVICE(dev); for (i = 0; i < platform_bus_params.platform_bus_num_irqs; i++) { @@ -1536,9 +1537,37 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms) return ms->possible_cpus; } +static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev, +DeviceState *dev, Error **errp) +{ +VirtMachineState *vms = VIRT_MACHINE(hotplug_dev); + +if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE)) { +if (vms->platform_bus_dev) { +
[Qemu-devel] [PATCH for-2.13 1/4] arm: reuse arm_boot_address_space() in armv7m_load_kernel()
reduce code duplication by resusing arm_boot_address_space() Signed-off-by: Igor Mammedov--- include/hw/arm/arm.h | 2 ++ hw/arm/armv7m.c | 10 +- hw/arm/boot.c| 16 3 files changed, 11 insertions(+), 17 deletions(-) diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h index ce769bd..188d18b 100644 --- a/include/hw/arm/arm.h +++ b/include/hw/arm/arm.h @@ -143,6 +143,8 @@ struct arm_boot_info { */ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info); +AddressSpace *arm_boot_address_space(ARMCPU *cpu, bool secure_boot); + /* Write a secure board setup routine with a dummy handler for SMCs */ void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu, const struct arm_boot_info *info, diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c index f123cc7..d372d9c 100644 --- a/hw/arm/armv7m.c +++ b/hw/arm/armv7m.c @@ -289,8 +289,6 @@ void armv7m_load_kernel(ARMCPU *cpu, const char *kernel_filename, int mem_size) uint64_t lowaddr; int big_endian; AddressSpace *as; -int asidx; -CPUState *cs = CPU(cpu); #ifdef TARGET_WORDS_BIGENDIAN big_endian = 1; @@ -303,13 +301,7 @@ void armv7m_load_kernel(ARMCPU *cpu, const char *kernel_filename, int mem_size) exit(1); } -if (arm_feature(>env, ARM_FEATURE_EL3)) { -asidx = ARMASIdx_S; -} else { -asidx = ARMASIdx_NS; -} -as = cpu_get_address_space(cs, asidx); - +as = arm_boot_address_space(cpu, true); if (kernel_filename) { image_size = load_elf_as(kernel_filename, NULL, NULL, , , NULL, big_endian, EM_ARM, 1, 0, as); diff --git a/hw/arm/boot.c b/hw/arm/boot.c index 26184bc..2f464ca 100644 --- a/hw/arm/boot.c +++ b/hw/arm/boot.c @@ -36,8 +36,7 @@ #define ARM64_TEXT_OFFSET_OFFSET8 #define ARM64_MAGIC_OFFSET 56 -static AddressSpace *arm_boot_address_space(ARMCPU *cpu, -const struct arm_boot_info *info) +AddressSpace *arm_boot_address_space(ARMCPU *cpu, bool secure_boot) { /* Return the address space to use for bootloader reads and writes. * We prefer the secure address space if the CPU has it and we're @@ -46,7 +45,7 @@ static AddressSpace *arm_boot_address_space(ARMCPU *cpu, int asidx; CPUState *cs = CPU(cpu); -if (arm_feature(>env, ARM_FEATURE_EL3) && info->secure_boot) { +if (arm_feature(>env, ARM_FEATURE_EL3) && secure_boot) { asidx = ARMASIdx_S; } else { asidx = ARMASIdx_NS; @@ -193,7 +192,7 @@ static void default_write_secondary(ARMCPU *cpu, const struct arm_boot_info *info) { uint32_t fixupcontext[FIXUP_MAX]; -AddressSpace *as = arm_boot_address_space(cpu, info); +AddressSpace *as = arm_boot_address_space(cpu, info->secure_boot); fixupcontext[FIXUP_GIC_CPU_IF] = info->gic_cpu_if_addr; fixupcontext[FIXUP_BOOTREG] = info->smp_bootreg_addr; @@ -211,7 +210,7 @@ void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu, const struct arm_boot_info *info, hwaddr mvbar_addr) { -AddressSpace *as = arm_boot_address_space(cpu, info); +AddressSpace *as = arm_boot_address_space(cpu, info->secure_boot); int n; uint32_t mvbar_blob[] = { /* mvbar_addr: secure monitor vectors @@ -262,7 +261,7 @@ void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu, static void default_reset_secondary(ARMCPU *cpu, const struct arm_boot_info *info) { -AddressSpace *as = arm_boot_address_space(cpu, info); +AddressSpace *as = arm_boot_address_space(cpu, info->secure_boot); CPUState *cs = CPU(cpu); address_space_stl_notdirty(as, info->smp_bootreg_addr, @@ -753,7 +752,8 @@ static void do_cpu_reset(void *opaque) } if (cs == first_cpu) { -AddressSpace *as = arm_boot_address_space(cpu, info); +AddressSpace *as = +arm_boot_address_space(cpu, info->secure_boot); cpu_set_pc(cs, info->loader_start); @@ -950,7 +950,7 @@ static void arm_load_kernel_notify(Notifier *notifier, void *data) ARMCPU *cpu = n->cpu; struct arm_boot_info *info = container_of(n, struct arm_boot_info, load_kernel_notifier); -AddressSpace *as = arm_boot_address_space(cpu, info); +AddressSpace *as = arm_boot_address_space(cpu, info->secure_boot); /* The board code is not supposed to set secure_board_setup unless * running its code in secure mode is actually possible, and KVM -- 2.7.4
[Qemu-devel] [PATCH for-2.13 3/4] arm: always start from first_cpu when registering loader cpu reset callback
if arm_load_kernel() were passed non first_cpu, QEMU would end up with partially set do_cpu_reset() callback leaving some CPUs without it. Make sure that do_cpu_reset() is registered for all CPUs by enumerating CPUs from first_cpu. Signed-off-by: Igor Mammedov--- hw/arm/boot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/arm/boot.c b/hw/arm/boot.c index 2f464ca..2591fee 100644 --- a/hw/arm/boot.c +++ b/hw/arm/boot.c @@ -1188,7 +1188,7 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info) * actually loading a kernel, the handler is also responsible for * arranging that we start it correctly. */ -for (cs = CPU(cpu); cs; cs = CPU_NEXT(cs)) { +for (cs = first_cpu; cs; cs = CPU_NEXT(cs)) { qemu_register_reset(do_cpu_reset, ARM_CPU(cs)); } } -- 2.7.4
[Qemu-devel] [PATCH for-2.13 0/4] arm: isolate and clean up dtb generation
While working on cpu hotplug for mach-virt, I've noticed that DTB is generated incrementally across whole machvirt_init(). While it's fine for machines with static DTB, it won't really work in case of cpu hotplug and followed up reset since machine will load old DTB that doesn't account for hotplugged CPUs. So I've set on a quest to consolidate DTB generation and make it reentrant so that on reset guest would see update DTB. It's preliminary series which makes possible to call load_dtb() later outside of arm_load_kernel() and in process of it drops several machine_done notifiers, that were introduced to make plaform-bus-devices work. Hopefully it makes code easier to follow. It replaces machine_done notifiers with device hotplug framework to allow for dynamic sysbus devices mapping at the moment they are created instead of waiting for machine_done time and trying to juggle with notifiers order to do initialization in propper order. Mostly 'make check' tested + manually with "ppce500" machine type and eTSEC device (eTSEC is still initialized with the same IRQs as before series) CC: qemu-...@nongnu.org CC: peter.mayd...@linaro.org CC: eric.au...@redhat.com Igor Mammedov (4): arm: reuse arm_boot_address_space() in armv7m_load_kernel() platform-bus-device: use device plug callback instead of machine_done notifier arm: always start from first_cpu when registering loader cpu reset callback arm/boot: split load_dtb() from arm_load_kernel() hw/ppc/e500.h | 3 ++ include/hw/arm/arm.h| 39 - include/hw/arm/sysbus-fdt.h | 37 include/hw/arm/virt.h | 3 ++ include/hw/platform-bus.h | 4 +- hw/arm/armv7m.c | 10 + hw/arm/boot.c | 85 +++-- hw/arm/sysbus-fdt.c | 67 +++-- hw/arm/virt.c | 100 +--- hw/core/platform-bus.c | 29 +++-- hw/ppc/e500.c | 37 +--- hw/ppc/e500plat.c | 60 +- 12 files changed, 227 insertions(+), 247 deletions(-) -- 2.7.4
Re: [Qemu-devel] [PATCH v3 08/22] target/arm: Support multiple EL change hooks
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > Signed-off-by: Aaron Lindsay > --- > target/arm/cpu.c | 15 ++- > target/arm/cpu.h | 23 --- > target/arm/internals.h | 7 --- > 3 files changed, 26 insertions(+), 19 deletions(-) > > diff --git a/target/arm/cpu.c b/target/arm/cpu.c > index 072cbbf..5f782bf 100644 > --- a/target/arm/cpu.c > +++ b/target/arm/cpu.c > @@ -55,13 +55,16 @@ static bool arm_cpu_has_work(CPUState *cs) > | CPU_INTERRUPT_EXITTB); > } > > -void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHook *hook, > +void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook, > void *opaque) > { > -/* We currently only support registering a single hook function */ > -assert(!cpu->el_change_hook); > -cpu->el_change_hook = hook; > -cpu->el_change_hook_opaque = opaque; > +ARMELChangeHook *entry; > +entry = g_malloc0(sizeof (*entry)); g_new0(ARMELChangeHook, 1) is nicer for initializing a thing of known size. > + > +entry->hook = hook; > +entry->opaque = opaque; > + > +QLIST_INSERT_HEAD(>el_change_hooks, entry, node); > } > > static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque) > @@ -744,6 +747,8 @@ static void arm_cpu_realizefn(DeviceState *dev, Error > **errp) > return; > } > > +QLIST_INIT(>el_change_hooks); > + If you put this in arm_cpu_initfn() you don't have to wonder whether anybody's calling arm_register_el_change_hook() after the CPU object has been created but before it is realized... > /* Some features automatically imply others: */ > if (arm_feature(env, ARM_FEATURE_V8)) { > set_feature(env, ARM_FEATURE_V7); > diff --git a/target/arm/cpu.h b/target/arm/cpu.h > index f17592b..3b45d3d 100644 > --- a/target/arm/cpu.h > +++ b/target/arm/cpu.h > @@ -633,11 +633,17 @@ typedef struct CPUARMState { > > /** > * ARMELChangeHook: > - * type of a function which can be registered via > arm_register_el_change_hook() > - * to get callbacks when the CPU changes its exception level or mode. > + * Support registering functions with ARMELChangeHookFn's signature via > + * arm_register_el_change_hook() to get callbacks when the CPU changes its > + * exception level or mode. > */ This is supposed to be the doc comment for the type, which I think was fine as it is, apart from the name change. If you want to also have a doc comment for the struct type that's fine but would be something different (and is less necessary, because the struct type is internal to the CPU whereas the function type is part of its API to the rest of the system). > -typedef void ARMELChangeHook(ARMCPU *cpu, void *opaque); > - > +typedef void ARMELChangeHookFn(ARMCPU *cpu, void *opaque); > +typedef struct ARMELChangeHook ARMELChangeHook; > +struct ARMELChangeHook { > +ARMELChangeHookFn *hook; > +void *opaque; > +QLIST_ENTRY(ARMELChangeHook) node; > +}; > > /* These values map onto the return values for > * QEMU_PSCI_0_2_FN_AFFINITY_INFO */ > @@ -826,8 +832,7 @@ struct ARMCPU { > */ > bool cfgend; > > -ARMELChangeHook *el_change_hook; > -void *el_change_hook_opaque; > +QLIST_HEAD(, ARMELChangeHook) el_change_hooks; > > int32_t node_id; /* NUMA node this CPU belongs to */ > > @@ -2895,12 +2900,8 @@ static inline AddressSpace *arm_addressspace(CPUState > *cs, MemTxAttrs attrs) > * CPU changes exception level or mode. The hook function will be > * passed a pointer to the ARMCPU and the opaque data pointer passed > * to this function when the hook was registered. > - * > - * Note that we currently only support registering a single hook function, > - * and will assert if this function is called twice. > - * This facility is intended for the use of the GICv3 emulation. > */ > -void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHook *hook, > +void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook, > void *opaque); > > /** > diff --git a/target/arm/internals.h b/target/arm/internals.h > index 47cc224..7df3eda 100644 > --- a/target/arm/internals.h > +++ b/target/arm/internals.h > @@ -727,11 +727,12 @@ void arm_cpu_do_transaction_failed(CPUState *cs, hwaddr > physaddr, > int mmu_idx, MemTxAttrs attrs, > MemTxResult response, uintptr_t retaddr); > > -/* Call the EL change hook if one has been registered */ > +/* Call any registered EL change hooks */ > static inline void arm_call_el_change_hook(ARMCPU *cpu) > { > -if (cpu->el_change_hook) { > -cpu->el_change_hook(cpu, cpu->el_change_hook_opaque); > +ARMELChangeHook *hook, *next; > +QLIST_FOREACH_SAFE(hook, >el_change_hooks, node, next) { > +hook->hook(cpu, hook->opaque); > } > } > thanks -- PMM
Re: [Qemu-devel] [PATCH v3 07/22] target/arm: Fetch GICv3 state directly from CPUARMState
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > This eliminates the need for fetching it from el_change_hook_opaque, and > allows for supporting multiple el_change_hooks without having to hack > something together to find the registered opaque belonging to GICv3. > > Signed-off-by: Aaron Lindsay > --- > hw/intc/arm_gicv3_cpuif.c | 10 ++ > target/arm/cpu.h | 10 -- > 2 files changed, 2 insertions(+), 18 deletions(-) I'm not wonderfully happy about this, because the original aim here was to try to keep the GICv3 device end reasonably decoupled from the CPU proper. On the other hand since d3a3e529626fb we've had this pointer because we ended up needing it in the KVM reset codepath, so we might as well use it here too. Maybe one day I'll come back to this and clean it up. Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [PATCH v3 06/22] target/arm: Mask PMU register writes based on PMCR_EL0.N
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > This is in preparation for enabling counters other than PMCCNTR > > Signed-off-by: Aaron Lindsay > --- > target/arm/helper.c | 24 +++- > 1 file changed, 15 insertions(+), 9 deletions(-) > > diff --git a/target/arm/helper.c b/target/arm/helper.c > index 6480b80..5d5c738 100644 > --- a/target/arm/helper.c > +++ b/target/arm/helper.c > @@ -52,11 +52,6 @@ typedef struct V8M_SAttributes { > static void v8m_security_lookup(CPUARMState *env, uint32_t address, > MMUAccessType access_type, ARMMMUIdx mmu_idx, > V8M_SAttributes *sattrs); > - > -/* Definitions for the PMCCNTR and PMCR registers */ > -#define PMCRD 0x8 > -#define PMCRC 0x4 > -#define PMCRE 0x1 > #endif > > static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg) > @@ -906,6 +901,17 @@ static const ARMCPRegInfo v6_cp_reginfo[] = { > REGINFO_SENTINEL > }; > > +/* Definitions for the PMU registers */ > +#define PMCRN_MASK 0xf800 > +#define PMCRN_SHIFT 11 > +#define PMCRD 0x8 > +#define PMCRC 0x4 > +#define PMCRE 0x1 > + > +#define PMU_NUM_COUNTERS(env) ((env->cp15.c9_pmcr & PMCRN_MASK) >> > PMCRN_SHIFT) > +/* Bits allowed to be set/cleared for PMCNTEN* and PMINTEN* */ > +#define PMU_COUNTER_MASK(env) ((1 << 31) | ((1 << PMU_NUM_COUNTERS(env)) - > 1)) These would be better as inline functions I think. Otherwise Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [RFC PATCH] migration: discard RAMBlocks of type ram_device
On Thu, 12 Apr 2018 15:59:24 + "Zhang, Yulei"wrote: > > -Original Message- > > From: Alex Williamson [mailto:alex.william...@redhat.com] > > Sent: Thursday, April 12, 2018 1:55 AM > > To: Cédric Le Goater > > Cc: qemu-devel@nongnu.org; Juan Quintela ; Dr . > > David Alan Gilbert ; David Gibson > > ; Zhang, Yulei ; Tian, > > Kevin ; joonas.lahti...@linux.intel.com; > > zhen...@linux.intel.com; kwankh...@nvidia.com; Wang, Zhi A > > > > Subject: Re: [Qemu-devel] [RFC PATCH] migration: discard RAMBlocks of type > > ram_device > > > > [cc +folks working on vfio-mdev migration] > > > > On Wed, 11 Apr 2018 19:20:14 +0200 > > Cédric Le Goater wrote: > > > > > Here is some context for this strange change request. > > > > > > On the POWER9 processor, the XIVE interrupt controller can control > > > interrupt sources using MMIO to trigger events, to EOI or to turn off > > > the sources. Priority management and interrupt acknowledgment is also > > > controlled by MMIO in the presenter subengine. > > > > > > These MMIO regions are exposed to guests in QEMU with a set of 'ram > > > device' memory mappings, similarly to VFIO, and the VMAs are populated > > > dynamically with the appropriate pages using a fault handler. > > > > > > But, these regions are an issue for migration. We need to discard the > > > associated RAMBlocks from the RAM state on the source VM and let the > > > destination VM rebuild the memory mappings on the new host in the > > > post_load() operation just before resuming the system. > > > > > > This is the goal of the following proposal. Does it make sense ? It > > > seems to be working enough to migrate a running guest but there might > > > be a better, more subtle, approach. > > > > Yulei, is this something you've run into with GVT-g migration? I don't see > > how we can read from or write to ram_device regions in a useful way during > > migration anyway, so the change initially looks correct to me. > > Thanks, > > > > Alex > > > > Didn't meet such issue before. I think the change will be fine if the vendor > driver > handle the reconstruction well on the target side. And I agree with Dave's > suggestion, > how about vendor driver report a flag about the mapped region to indicate it > can > be registered as migratable memory block or not. "migration" and "memory blocks" are not vfio concepts, you'd need to come up with flags that actually conveys the device level property of the region that you're trying to indicate. I don't see why we'd do this though, the application of such a flag seems too narrow and it tarnishes the concept that a vendor driver provides a region, through which *all* device state is saved and restored. Thanks, Alex
Re: [Qemu-devel] [PATCH] block/gluster: defend on legacy ftruncate api use
n 04/12/2018 08:31 AM, Niels de Vos wrote: > This change looks good to me, but a commit message would have been > helpful. I suggest something like this: > > Gluster 4.0 changed the signature of glfs_ftruncate(). The function > now has two additional arguments, namely prestat and poststat. These > provide not benefit for QEMU, so ignoring them and passing NULL makes > the gluster-block driver compile with the new Gluster version again. > > And maybe add this too: > > Glusters libgfapi uses symbol versioning and provides backwards > compatible functions. Binaries compiled against previous versions of > Gluster keep on functioning with the new library. > >> @@ -3856,6 +3857,9 @@ if test "$glusterfs" != "no" ; then >>glusterfs_fallocate="yes" >>glusterfs_zerofill="yes" >> fi >> +if ! $pkg_config --atleast-version=7.4 glusterfs-api; then >> + glusterfs_legacy_ftruncate="yes" Also, version-based tests are lousy. Feature-based tests (does a call to glfs_ftruncate(0, 0) compile without error? then we are legacy) are less brittle, especially when features can be backported across versions. So this should be reworked to a compile check, rather than a version query. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v3 05/22] target/arm: Reorganize PMCCNTR read, write, sync
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > pmccntr_read and pmccntr_write contained duplicate code that was already > being handled by pmccntr_sync. Split pmccntr_sync into pmccntr_op_start > and pmccntr_op_finish, passing the clock value between the two, to avoid > losing time between the two calls. > > Signed-off-by: Aaron Lindsay > --- > target/arm/helper.c | 101 > +--- > 1 file changed, 56 insertions(+), 45 deletions(-) > > diff --git a/target/arm/helper.c b/target/arm/helper.c > index 5634561..6480b80 100644 > --- a/target/arm/helper.c > +++ b/target/arm/helper.c > @@ -1000,28 +1000,58 @@ static inline bool arm_ccnt_enabled(CPUARMState *env) > > return true; > } > - > -void pmccntr_sync(CPUARMState *env) If you configure your git to use the 'histogram' diff algorithm ("git config --global diff.algorithm histogram", or edit ~/.gitconfig equivalently), does it make git format-patch make less of a mess of this commit ? > +/* > + * Ensure c15_ccnt is the guest-visible count so that operations such as > + * enabling/disabling the counter or filtering, modifying the count itself, > + * etc. can be done logically. This is essentially a no-op if the counter is > + * not enabled at the time of the call. > + * > + * The current cycle count is returned so that it can be passed into the > paired > + * pmccntr_op_finish() call which must follow each call to > pmccntr_op_start(). > + */ > +uint64_t pmccntr_op_start(CPUARMState *env) > { > -uint64_t temp_ticks; > - > -temp_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL), > +uint64_t cycles = 0; > +#ifndef CONFIG_USER_ONLY > +cycles = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL), >ARM_CPU_FREQ, NANOSECONDS_PER_SECOND); > +#endif Is this ifdef necessary? You have a do-nothing version of pmccntr_op_start() for CONFIG_USER_ONLY later on, so presumably this one is already inside a suitable ifndef. Otherwise Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [PATCH v3 04/22] target/arm: Treat PMCCNTR as alias of PMCCNTR_EL0
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > They share the same underlying state > > Signed-off-by: Aaron Lindsay > --- > target/arm/helper.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/target/arm/helper.c b/target/arm/helper.c > index 5e48982..5634561 100644 > --- a/target/arm/helper.c > +++ b/target/arm/helper.c > @@ -1318,7 +1318,7 @@ static const ARMCPRegInfo v7_cp_reginfo[] = { >.fieldoffset = offsetof(CPUARMState, cp15.c9_pmselr), >.writefn = pmselr_write, .raw_writefn = raw_write, }, > { .name = "PMCCNTR", .cp = 15, .crn = 9, .crm = 13, .opc1 = 0, .opc2 = 0, > - .access = PL0_RW, .resetvalue = 0, .type = ARM_CP_IO, > + .access = PL0_RW, .resetvalue = 0, .type = ARM_CP_ALIAS | ARM_CP_IO, >.readfn = pmccntr_read, .writefn = pmccntr_write32, >.accessfn = pmreg_access_ccntr }, > { .name = "PMCCNTR_EL0", .state = ARM_CP_STATE_AA64, > -- Does this fix an observed bug (presumably migration related), or is it just something you saw in code inspection ? Worth noting in the commit message if the former. Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [PATCH v3 02/22] target/arm: A15 PMCEID0 initialization style nit
On 16 March 2018 at 20:31, Aaron Lindsaywrote: > Signed-off-by: Aaron Lindsay > --- > target/arm/cpu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/target/arm/cpu.c b/target/arm/cpu.c > index 022d8c5..072cbbf 100644 > --- a/target/arm/cpu.c > +++ b/target/arm/cpu.c > @@ -1524,7 +1524,7 @@ static void cortex_a15_initfn(Object *obj) > cpu->id_pfr0 = 0x1131; > cpu->id_pfr1 = 0x00011011; > cpu->id_dfr0 = 0x02010555; > -cpu->pmceid0 = 0x000; > +cpu->pmceid0 = 0x; > cpu->pmceid1 = 0x; > cpu->id_afr0 = 0x; > cpu->id_mmfr0 = 0x10201105; Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [RFC PATCH] migration: discard RAMBlocks of type ram_device
> -Original Message- > From: Alex Williamson [mailto:alex.william...@redhat.com] > Sent: Thursday, April 12, 2018 1:55 AM > To: Cédric Le Goater> Cc: qemu-devel@nongnu.org; Juan Quintela ; Dr . > David Alan Gilbert ; David Gibson > ; Zhang, Yulei ; Tian, > Kevin ; joonas.lahti...@linux.intel.com; > zhen...@linux.intel.com; kwankh...@nvidia.com; Wang, Zhi A > > Subject: Re: [Qemu-devel] [RFC PATCH] migration: discard RAMBlocks of type > ram_device > > [cc +folks working on vfio-mdev migration] > > On Wed, 11 Apr 2018 19:20:14 +0200 > Cédric Le Goater wrote: > > > Here is some context for this strange change request. > > > > On the POWER9 processor, the XIVE interrupt controller can control > > interrupt sources using MMIO to trigger events, to EOI or to turn off > > the sources. Priority management and interrupt acknowledgment is also > > controlled by MMIO in the presenter subengine. > > > > These MMIO regions are exposed to guests in QEMU with a set of 'ram > > device' memory mappings, similarly to VFIO, and the VMAs are populated > > dynamically with the appropriate pages using a fault handler. > > > > But, these regions are an issue for migration. We need to discard the > > associated RAMBlocks from the RAM state on the source VM and let the > > destination VM rebuild the memory mappings on the new host in the > > post_load() operation just before resuming the system. > > > > This is the goal of the following proposal. Does it make sense ? It > > seems to be working enough to migrate a running guest but there might > > be a better, more subtle, approach. > > Yulei, is this something you've run into with GVT-g migration? I don't see > how we can read from or write to ram_device regions in a useful way during > migration anyway, so the change initially looks correct to me. > Thanks, > > Alex > Didn't meet such issue before. I think the change will be fine if the vendor driver handle the reconstruction well on the target side. And I agree with Dave's suggestion, how about vendor driver report a flag about the mapped region to indicate it can be registered as migratable memory block or not. Yulei > > Signed-off-by: Cédric Le Goater > > --- > > migration/ram.c | 42 -- > > 1 file changed, 40 insertions(+), 2 deletions(-) > > > > diff --git a/migration/ram.c b/migration/ram.c index > > 0e90efa09236..6404ccd046d8 100644 > > --- a/migration/ram.c > > +++ b/migration/ram.c > > @@ -780,6 +780,10 @@ unsigned long > migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb, > > unsigned long *bitmap = rb->bmap; > > unsigned long next; > > > > +if (memory_region_is_ram_device(rb->mr)) { > > +return size; > > +} > > + > > if (rs->ram_bulk_stage && start > 0) { > > next = start + 1; > > } else { > > @@ -826,6 +830,9 @@ uint64_t ram_pagesize_summary(void) > > uint64_t summary = 0; > > > > RAMBLOCK_FOREACH(block) { > > +if (memory_region_is_ram_device(block->mr)) { > > +continue; > > +} > > summary |= block->page_size; > > } > > > > @@ -850,6 +857,9 @@ static void migration_bitmap_sync(RAMState *rs) > > qemu_mutex_lock(>bitmap_mutex); > > rcu_read_lock(); > > RAMBLOCK_FOREACH(block) { > > +if (memory_region_is_ram_device(block->mr)) { > > +continue; > > +} > > migration_bitmap_sync_range(rs, block, 0, block->used_length); > > } > > rcu_read_unlock(); > > @@ -1499,6 +1509,10 @@ static int ram_save_host_page(RAMState *rs, > PageSearchStatus *pss, > > size_t pagesize_bits = > > qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS; > > > > +if (memory_region_is_ram_device(pss->block->mr)) { > > +return 0; > > +} > > + > > do { > > tmppages = ram_save_target_page(rs, pss, last_stage); > > if (tmppages < 0) { > > @@ -1588,6 +1602,9 @@ uint64_t ram_bytes_total(void) > > > > rcu_read_lock(); > > RAMBLOCK_FOREACH(block) { > > +if (memory_region_is_ram_device(block->mr)) { > > +continue; > > +} > > total += block->used_length; > > } > > rcu_read_unlock(); > > @@ -1643,6 +1660,9 @@ static void ram_save_cleanup(void *opaque) > > memory_global_dirty_log_stop(); > > > > QLIST_FOREACH_RCU(block, _list.blocks, next) { > > +if (memory_region_is_ram_device(block->mr)) { > > +continue; > > +} > > g_free(block->bmap); > > block->bmap = NULL; > > g_free(block->unsentmap); > > @@ -1710,6 +1730,9 @@ void > ram_postcopy_migrated_memory_release(MigrationState *ms) > > unsigned long range = block->used_length >> TARGET_PAGE_BITS; > > unsigned long run_start
Re: [Qemu-devel] [RFC PATCH V4 4/4] vifo: introduce new VFIO ioctl VFIO_IOMMU_GET_DIRTY_BITMAP
On 04/10/2018 01:03 AM, Yulei Zhang wrote: > New VFIO ioctl VFIO_IOMMU_GET_DIRTY_BITMAP is used to fetch the > bitmap of pinned memory in iommu container, we need copy those > memory to the target during the migration as they are dirtied by > mdev devices. [meta-comment] This patch, and the other 3 in the series, were sent with no 'In-Reply-To:' or 'References:' headers; as such, they were not threaded to the 0/4 cover letter ('Message-Id: <1523340124-8881-1-git-send-email-yulei.zh...@intel.com>'). In the future, you'll want to make sure that proper threading is preserved in the mail messages, as it makes it a lot easier when each message in the series is not a new top-level thread. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] Withdrawn: Tweak the NBD_OPT_SET_META_CONTEXT layout
On 03/31/2018 07:24 AM, Eric Blake wrote: > As mentioned in earlier email threads, I experimented with an > alternative layout to the structs sent across the wire during > NBD_OPT_{LIST,SET}_META_CONTEXT, on the grounds that having the > namespace and leaf-name combined into one string that requires further > parsing on each end is not as nice as having two separate fields. I'm > replying to this mail with two RFC patches, one for the NBD spec, and > one for the qemu implementation. > > If we like the idea, this has to go into qemu 2.12, so we have less than > a week. Or, we can ignore the RFC, keep qemu the way it is currently > coded, and when it is released, it will be time to promote the NBD > extension-blockstatus to current as there will be an existing > implementation to be interoperable with. As I got no comments, and as qemu is now frozen hard at 2.12-rc3, it is officially too late to change the format sent over the wire; qemu's implementation will be the reference implementation, and for nbd.git, I'll be promoting extension-blockstatus to the master branch once qemu 2.12 is officially released. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v3 6/7] s390x/kvm: handle AP instruction interception
On 04/11/2018 09:50 AM, Halil Pasic wrote: On 04/11/2018 03:20 PM, Tony Krowiak wrote: I may be all wrong, though... can we at least have a translation of ECA.28 and EECA.28 (the "ap is there" bit and the "ap instructions are interpreted" bit?) I think we have a misunderstanding here. I will wait for Tony. Maybe he can understand this better or explain in more accessible way. From what I get from your explanation, this approach sounds like a good way forward. But let's wait for Tony. I agree, this solves the problem with specifying installing AP facilities on the guest (-cpu xxx,ap=on) and not configuring a vfio-ap device (-device vfio-ap,sysfsdev=$path-to-mediated-device). To recap: The realize() function for the vfio-ap device opens the mediated matrix device file descriptor to set up the communication pathway with the vfio_ap kernel driver. When the fd is opened, the vfio_ap driver sets ECA.28 for the guest to instruct SIE to interpret AP instructions. The driver also configures the AP matrix for the guest in its SIE state description. Consequently, if a vfio-ap device is not configured for the guest, but AP facilities are installed, all AP instructions will be intercepted and routed to QEMU. If there are no interception handlers for the AP instructions, QEMU injects an operation exception into the guest. This results in initialization of the AP bus on the guest to terminate. This patch was intended to circumvent that problem. With Halil's suggestion, there is no need to provide these handlers. If ECA.28 is set for the guest by default when the AP facilities are installed, then the AP instructions will be interpreted and the AP bus will get initialized on the guest. Since there is no vfio-ap device to provide the AP matrix configuration for the guest, the AP bus will not detect any devices, but that's okay. AP instructions targeting at an APQN will execute successfully and set a response code in the status block returned from the instruction indicating the APQN is invalid but there will be no exception unless there is truly an exception condition caused by the execution of the instruction. That's exactly the idea, but it will only work out this way if the effective ECA.28 bit (i.e. EECA.28) is also set. Could you please comment the first paragraph quoted in this email? See my response to Connie's comment in Message ID <20180409113244.380a32c4.coh...@redhat.com>. My response is in Message ID <17b4ab9b-a8c7-37ec-bae4-c57b146ba...@linux.vnet.ibm.com>. Halil
Re: [Qemu-devel] [PATCH v3 6/7] s390x/kvm: handle AP instruction interception
On 04/09/2018 05:32 AM, Cornelia Huck wrote: On Fri, 6 Apr 2018 18:07:56 +0200 Halil Pasicwrote: On 04/05/2018 06:38 PM, Tony Krowiak wrote: On 04/03/2018 05:36 AM, Cornelia Huck wrote: On Mon, 2 Apr 2018 12:36:27 -0400 Tony Krowiak wrote: On 03/26/2018 05:03 AM, Pierre Morel wrote: On 26/03/2018 10:32, David Hildenbrand wrote: On 16.03.2018 00:24, Tony Krowiak wrote: +/* + * The Query Configuration Information (QCI) function (fc == 4) does not + * set a response code in reg 1, so check for that along with the + * AP feature. + */ +if ((fc != 4) && s390_has_feat(S390_FEAT_AP)) { +env->regs[1] = 0x1; + +return 0; +} This would imply an operation exception in case fc==4, which sounds very wrong. It depends but I think that the S390_FEAT_AP_QUERY_CONFIG_INFO must be tested to know what to answer. If the feature is there, QCI must be answered correctly. This is an interesting proposition which raises several issues that will need to be addressed. The only time the PQAP(QCI) instruction is intercepted is when: * A vfio-ap device is NOT defined for the guest because the vfio_ap device driver will set ECA.28 and the PQAP(QCI) instruction will be interpreted * STFLE.12 is set for the guest You say that the QCI must be answered correctly, but what is the correct response? If we return the matrix - i.e., APM, ADM and AQM - configured via the mediated matrix device's sysfs attributes files, then if any APQNs are defined in the matrix, we will have to handle *ALL* AP instructions by executing them on behalf of the guest. I suppose we could return an empty matrix in which case the AP bus will come up without any devices on the guest, but what is the expectation of an admin who deliberately configures the mediated matrix device? Should we forego handling interception of AP instructions and consider a guest configuration that turns on S390_FEAT_AP but does not define a vfio-ap device to be erroneous and terminate starting of the guest? Anybody have any thoughts? Hard to really give good advice without access to the documentation, but: - If we tell the guest that the feature is available, but it does not get any cards to use, returning an empty matrix makes the most sense to me. - I would not tie starting the guest to the presence of a vfio-ap device. Having the feature available in theory but without any devices actually being usable by the guest does not really sound wrong (can we hotplug this later?) For this phase of development, it is my opinion that introducing AP instruction interception handlers is superfluous for the following reasons: 1. Interception handling was introduced solely to ensure an operation exception would not be injected into the guest when CPU model feature for AP (i.e., ap=on) is specified but a VFIO AP device (i.e., -device vfio-ap,sysfsdev=$path) is not. We can kind of (i.e. modulo EECA.28) ensure this in a different fashion I think. How about proclaiming a 'has ap instructions, but nothing to see here' in the SIE interpreted flavor (ECA.28 set) the default way of having ap instructions under KVM. This should be easily accomplished with an all zero CRYCB and eca.28 set. The for the guest to actually get real work done with AP we would still require some sort of driver to either provide a non-zero matrix by altering the CRYCB or unsettling ECA.28 and doing the intercepted flavor. Please notice, the cpu facility ap would still keep it's semantic 'has ap instructions' (opposed to 'has ap instructions implemented in SIE interpreted flavor). And give us all the flexibility. Yet implementing what we want to have in absence of a driver would become much easier (under the assumption that ECA.28 equals EECA.28). How about this? Unfortunately, this is really hard to follow without the AR... let me summarize it to check whether I got the gist of it :) - If the "ap" cpu feature is specified, set a bit that indicates "hey, we basically have have AP support" and create the basics, but don't enable actual SIE handling. This means the guest gets exceptions from the SIE already and we don't need to emulate them. - Actually enable the missing pieces if a vfio device is created. This would enable processing by the SIE, and we would not need to do emulation, either (for most of it, IIRC). I may be all wrong, though... can we at least have a translation of ECA.28 and EECA.28 (the "ap is there" bit and the "ap instructions are interpreted" bit?) I am not sure what you are asking here, but I'll attempt to answer the question I think you are asking. The ap=on|off flag indicates that AP instructions are installed on the guest. This feature is enabled by the kernel only if AP instructions are installed on the host. Since there is no facilities bit to query, this is determined by attempting to execute an AP instruction using an
Re: [Qemu-devel] [PATCH v3] monitor: let cur_mon be per-thread
On 04/12/2018 01:11 AM, Peter Xu wrote: > In the future the monitor iothread may be accessing the cur_mon as > well (via monitor_qmp_dispatch_one()). Before we introduce a real > Out-Of-Band command, let's convert the cur_mon variable to be a > per-thread variable to make sure there won't be a race between threads. > > Note that thread variables are not initialized to a valid value when new > thread is created. However for our case we don't need to set it up, > since the cur_mon variable is only used in such a pattern: > > old_mon = cur_mon; > cur_mon = xxx; > (do something, read cur_mon if necessary in the stack) > cur_mon = old_mon; > > It plays a role as stack variable, so no need to be initialized at all. > We only need to make sure the variable won't be changed unexpectedly by > other threads. > > Signed-off-by: Peter Xu> --- Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 1/2 for-2.12?] qapi: Parameter gl of DisplayType now accept an enum
On 04/12/2018 09:11 AM, Elie Tournier wrote: > Hello, > > On Tue, Apr 10, 2018 at 03:33:35PM +0200, Gerd Hoffmann wrote: >>> # @off: Disable OpenGL (default). > > Just to be sure, I have to add @ in front of all parameter, right? Yes. >>> You may be in luck - there is no instance of 'window-close' in the >>> introspection output, which means 'DisplayType' exists only for ease of >>> command-line parsing and is not currently used by QMP, so tweaks here >>> are not affecting the command line. >> >> Yes, right now the struct is only used to store the parsed command line >> opts, so no effect on QMP. >> >> Plan for the future is to also parse command line options with generic >> qapi/json code instead of the home-grown parser, but that switch didn't >> happen yet. >> >>> That said, you can STILL name the enum value something smarter than 'on' >>> IF you make the change now, for 2.12, given that the QAPI type was only >>> introduced in 2.12 (you only have to worry about backwards compatibility >>> if 2.11 already parsed gl=on). >> >> gl=on is older, so that must continue to work. Making both "on" and >> "auto" work isn't a problem for our home-grown parser (aka >> parse_display() in vl.c). But having quirks like this makes the switch >> to generic parser code more difficuilt, so I'd prefer to avoid that ... > > Is it possible to upstream this change before 2.12 release? At this point, no, we've missed -rc3, and it's not a big enough bug to warrant needing -rc4 on its own. And as Gerd said, we already parsed 'gl=on' in 2.11 (the QAPI representation is new to 2.12, but only to simplify how the existing command line parser was coded), so that's different than if 2.12 were introducing brand-new content in a form that should be fixed before it is baked into existing uses. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] migration: discard RAMBlocks of type ram_device
On 04/12/2018 04:11 PM, Paolo Bonzini wrote: > On 12/04/2018 15:51, Peter Maydell wrote: >> Paolo may have an opinion what the API here should be, but >> at the MemoryRegion level we already have a mix of functions >> memory_region_init_foo_nomigrate() and memory_region_init_foo(), >> which at the moment just control whether we call >> vmstate_register_ram() or not. Is it possible to hook into that, >> so that we default to marking RAMBlocks as not-migrated, and >> when vmstate_register_ram() is called we set the flag to >> "migrated"? > > Yes, that sounds pretty. I have added the MIGRATABLE flag un/setting under qemu_ram_un/set_idstr() which are the only routines called by vmstate_un/register_ram(). Tell me if that's fine or if you'd rather have helpers to set MIGRATABLE from savem ? below is some code extract. Thanks, C. >> NB: this probably requires careful thought to make sure we >> don't accidentally break migration compat, but it seems like >> the cleaner approach. At the moment we do weird things if you >> migrate a RAM block that hasn't had its ID string set, IIRC, >> so just not migrating those RAM blocks seems like a better >> plan than mishandling them. It would also mean that the >> vmstate_register/unregister_ram() functions did what they >> claim to do based on the function name, I think. --- qemu-xive.git.orig/exec.c +++ qemu-xive.git/exec.c @@ -104,6 +104,9 @@ static MemoryRegion io_mem_unassigned; * (Set during postcopy) */ #define RAM_UF_ZEROPAGE (1 << 3) + +/* RAM can be migrated */ +#define RAM_MIGRATABLE (1 << 4) #endif #ifdef TARGET_PAGE_BITS_VARY @@ -1807,6 +1810,11 @@ void qemu_ram_set_uf_zeroable(RAMBlock * rb->flags |= RAM_UF_ZEROPAGE; } +bool qemu_ram_is_migratable(RAMBlock *rb) +{ +return rb->flags & RAM_MIGRATABLE; +} + /* Called with iothread lock held. */ void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev) { @@ -1823,6 +1831,7 @@ void qemu_ram_set_idstr(RAMBlock *new_bl } } pstrcat(new_block->idstr, sizeof(new_block->idstr), name); +new_block->flags |= RAM_MIGRATABLE; rcu_read_lock(); RAMBLOCK_FOREACH(block) { @@ -1845,6 +1854,7 @@ void qemu_ram_unset_idstr(RAMBlock *bloc */ if (block) { memset(block->idstr, 0, sizeof(block->idstr)); +block->flags &= ~RAM_MIGRATABLE; } }
[Qemu-devel] [PATCH v3 6/6] vhost-user: support registering external host notifiers
This patch introduces VHOST_USER_PROTOCOL_F_HOST_NOTIFIER. With this feature negotiated, vhost-user backend can register memory region based host notifiers. And it will allow the guest driver in the VM to notify the hardware accelerator at the vhost-user backend directly. Signed-off-by: Tiwei Bie--- docs/interop/vhost-user.txt| 33 +++ hw/virtio/vhost-user.c | 123 + include/hw/virtio/vhost-user.h | 8 +++ 3 files changed, 164 insertions(+) diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt index 534caab18a..9e57b36b20 100644 --- a/docs/interop/vhost-user.txt +++ b/docs/interop/vhost-user.txt @@ -132,6 +132,16 @@ Depending on the request type, payload can be: Payload: Size bytes array holding the contents of the virtio device's configuration space + * Vring area description + --- + | u64 | size | offset | + --- + + u64: a 64-bit integer contains vring index and flags + Size: a 64-bit size of this area + Offset: a 64-bit offset of this area from the start of the + supplied file descriptor + In QEMU the vhost-user message is implemented with the following struct: typedef struct VhostUserMsg { @@ -146,6 +156,7 @@ typedef struct VhostUserMsg { VhostUserLog log; struct vhost_iotlb_msg iotlb; VhostUserConfig config; +VhostUserVringArea area; }; } QEMU_PACKED VhostUserMsg; @@ -380,6 +391,7 @@ Protocol features #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7 #define VHOST_USER_PROTOCOL_F_PAGEFAULT 8 #define VHOST_USER_PROTOCOL_F_CONFIG 9 +#define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER 10 Master message types @@ -777,6 +789,27 @@ Slave message types the VHOST_USER_NEED_REPLY flag, master must respond with zero when operation is successfully completed, or non-zero otherwise. + * VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG + + Id: 3 + Equivalent ioctl: N/A + Slave payload: vring area description + Master payload: N/A + + Sets host notifier for a specified queue. The queue index is contained + in the u64 field of the vring area description. The host notifier is + described by the file descriptor (typically it's a VFIO device fd) which + is passed as ancillary data and the size (which is mmap size and should + be the same as host page size) and offset (which is mmap offset) carried + in the vring area description. QEMU can mmap the file descriptor based + on the size and offset to get a memory range. Registering a host notifier + means mapping this memory range to the VM as the specified queue's notify + MMIO region. Slave sends this request to tell QEMU to de-register the + existing notifier if any and register the new notifier if the request is + sent with a file descriptor. + This request should be sent only when VHOST_USER_PROTOCOL_F_HOST_NOTIFIER + protocol feature has been successfully negotiated. + VHOST_USER_PROTOCOL_F_REPLY_ACK: --- The original vhost-user specification only demands replies for certain diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 791e0a4763..1cd9c7276b 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -13,6 +13,7 @@ #include "hw/virtio/vhost.h" #include "hw/virtio/vhost-user.h" #include "hw/virtio/vhost-backend.h" +#include "hw/virtio/virtio.h" #include "hw/virtio/virtio-net.h" #include "chardev/char-fe.h" #include "sysemu/kvm.h" @@ -48,6 +49,7 @@ enum VhostUserProtocolFeature { VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7, VHOST_USER_PROTOCOL_F_PAGEFAULT = 8, VHOST_USER_PROTOCOL_F_CONFIG = 9, +VHOST_USER_PROTOCOL_F_HOST_NOTIFIER = 10, VHOST_USER_PROTOCOL_F_MAX }; @@ -92,6 +94,7 @@ typedef enum VhostUserSlaveRequest { VHOST_USER_SLAVE_NONE = 0, VHOST_USER_SLAVE_IOTLB_MSG = 1, VHOST_USER_SLAVE_CONFIG_CHANGE_MSG = 2, +VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG = 3, VHOST_USER_SLAVE_MAX } VhostUserSlaveRequest; @@ -136,6 +139,12 @@ static VhostUserConfig c __attribute__ ((unused)); + sizeof(c.size) \ + sizeof(c.flags)) +typedef struct VhostUserVringArea { +uint64_t u64; +uint64_t size; +uint64_t offset; +} VhostUserVringArea; + typedef struct { VhostUserRequest request; @@ -157,6 +166,7 @@ typedef union { struct vhost_iotlb_msg iotlb; VhostUserConfig config; VhostUserCryptoSession session; +VhostUserVringArea area; } VhostUserPayload; typedef struct VhostUserMsg { @@ -638,9 +648,37 @@ static int vhost_user_set_vring_num(struct vhost_dev *dev, return vhost_set_vring(dev, VHOST_USER_SET_VRING_NUM, ring); } +static void vhost_user_host_notifier_restore(struct vhost_dev *dev, +
Re: [Qemu-devel] [PATCH] migration: discard RAMBlocks of type ram_device
On 04/12/2018 06:47 AM, Peter Maydell wrote: >> /* >> + * Identifies RAM blocks which should be discarded from migration. For >> + * the moment, it only applies to blocks backed by a 'ram_device' >> + * memory region. >> + */ >> +static inline bool ram_block_is_migratable(RAMBlock *block) >> +{ >> +return !memory_region_is_ram_device(block->mr); >> +} >> + >> +/* Should be holding either ram_list.mutex, or the RCU lock. */ >> +#define RAMBLOCK_FOREACH_MIGRATABLE(block) \ >> +RAMBLOCK_FOREACH(block)\ >> +if (ram_block_is_migratable(block)) > > This will mishandle some uses, like: > > if (foo) > RAMBLOCK_FOREACH_MIGRATABLE(block) > stuff; > else > morestuff; > > as the if() inside the macro will capture the else clause. > (The lack of braces in the calling code would be against our > coding style, of course, so not very likely.) All existing callers of RAMBLOCK_FOREACH() already supply their own {} around stuff (same is true for QLIST_FOREACH_RCU(), which is what is being used under the hood). By the time you correct that to: if (foo) RAMBLOCK_FOREACH_MIGRATABLE(block) { stuff; } else morestuff; then even though the outer if violates coding standard, the correct usage of the macro with {} around stuff won't leak that there is a trailing if. But yeah, if you're worrying about code that omitted {}, then dealing with omitted {} in both places is something to think about. > > Eric, is there a 'standard' trick for this? I thought of > maybe > > #define RAMBLOCK_FOREACH_MIGRATABLE(block) \ > RAMBLOCK_FOREACH(block)\ > if (!ram_block_is_migratable(block)) {} else at which point, yes, this is a little bit safer for taking exactly one statement without risking that a later bare 'else' could match to the wrong unbraced 'if'. I'm not coming up with any better ideas for (abusing?) the syntax. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
[Qemu-devel] [PATCH v3 3/6] vhost-user: support receiving file descriptors in slave_read
Signed-off-by: Tiwei Bie--- hw/virtio/vhost-user.c | 41 - 1 file changed, 40 insertions(+), 1 deletion(-) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 91edd95453..9cea2c8c51 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -854,14 +854,44 @@ static void slave_read(void *opaque) VhostUserHeader hdr = { 0, }; VhostUserPayload payload = { 0, }; int size, ret = 0; +struct iovec iov; +struct msghdr msgh; +int fd = -1; +char control[CMSG_SPACE(sizeof(fd))]; +struct cmsghdr *cmsg; +size_t fdsize; + +memset(, 0, sizeof(msgh)); +msgh.msg_iov = +msgh.msg_iovlen = 1; +msgh.msg_control = control; +msgh.msg_controllen = sizeof(control); /* Read header */ -size = read(u->slave_fd, , VHOST_USER_HDR_SIZE); +iov.iov_base = +iov.iov_len = VHOST_USER_HDR_SIZE; + +size = recvmsg(u->slave_fd, , 0); if (size != VHOST_USER_HDR_SIZE) { error_report("Failed to read from slave."); goto err; } +if (msgh.msg_flags & MSG_CTRUNC) { +error_report("Truncated message."); +goto err; +} + +for (cmsg = CMSG_FIRSTHDR(); cmsg != NULL; + cmsg = CMSG_NXTHDR(, cmsg)) { +if (cmsg->cmsg_level == SOL_SOCKET && +cmsg->cmsg_type == SCM_RIGHTS) { +fdsize = cmsg->cmsg_len - CMSG_LEN(0); +memcpy(, CMSG_DATA(cmsg), fdsize); +break; +} +} + if (hdr.size > VHOST_USER_PAYLOAD_SIZE) { error_report("Failed to read msg header." " Size %d exceeds the maximum %zu.", hdr.size, @@ -885,9 +915,15 @@ static void slave_read(void *opaque) break; default: error_report("Received unexpected msg type."); +if (fd != -1) { +close(fd); +} ret = -EINVAL; } +/* Message handlers need to make sure that fd will be consumed. */ +fd = -1; + /* * REPLY_ACK feature handling. Other reply types has to be managed * directly in their request handlers. @@ -920,6 +956,9 @@ err: qemu_set_fd_handler(u->slave_fd, NULL, NULL, NULL); close(u->slave_fd); u->slave_fd = -1; +if (fd != -1) { +close(fd); +} return; } -- 2.11.0
[Qemu-devel] [PATCH v3 5/6] vhost: allow backends to filter memory sections
This patch introduces a vhost op for vhost backends to allow them to filter the memory sections that they can handle. Signed-off-by: Tiwei Bie--- hw/virtio/vhost-user.c| 11 +++ hw/virtio/vhost.c | 9 +++-- include/hw/virtio/vhost-backend.h | 4 3 files changed, 22 insertions(+), 2 deletions(-) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 9cea2c8c51..791e0a4763 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -1622,6 +1622,16 @@ vhost_user_crypto_close_session(struct vhost_dev *dev, uint64_t session_id) return 0; } +static bool vhost_user_mem_section_filter(struct vhost_dev *dev, + MemoryRegionSection *section) +{ +bool result; + +result = memory_region_get_fd(section->mr) >= 0; + +return result; +} + VhostUserState *vhost_user_init(void) { VhostUserState *user = g_new0(struct VhostUserState, 1); @@ -1663,4 +1673,5 @@ const VhostOps user_ops = { .vhost_set_config = vhost_user_set_config, .vhost_crypto_create_session = vhost_user_crypto_create_session, .vhost_crypto_close_session = vhost_user_crypto_close_session, +.vhost_backend_mem_section_filter = vhost_user_mem_section_filter, }; diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index f51bf573d5..a939a38d97 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -382,7 +382,7 @@ static int vhost_verify_ring_mappings(struct vhost_dev *dev, return r; } -static bool vhost_section(MemoryRegionSection *section) +static bool vhost_section(struct vhost_dev *dev, MemoryRegionSection *section) { bool result; bool log_dirty = memory_region_get_dirty_log_mask(section->mr) & @@ -395,6 +395,11 @@ static bool vhost_section(MemoryRegionSection *section) */ result &= !log_dirty; +if (result && dev->vhost_ops->vhost_backend_mem_section_filter) { +result &= +dev->vhost_ops->vhost_backend_mem_section_filter(dev, section); +} + trace_vhost_section(section->mr->name, result); return result; } @@ -628,7 +633,7 @@ static void vhost_region_addnop(MemoryListener *listener, struct vhost_dev *dev = container_of(listener, struct vhost_dev, memory_listener); -if (!vhost_section(section)) { +if (!vhost_section(dev, section)) { return; } vhost_region_add_section(dev, section); diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h index 5dac61f9ea..81283ec50f 100644 --- a/include/hw/virtio/vhost-backend.h +++ b/include/hw/virtio/vhost-backend.h @@ -101,6 +101,9 @@ typedef int (*vhost_crypto_create_session_op)(struct vhost_dev *dev, typedef int (*vhost_crypto_close_session_op)(struct vhost_dev *dev, uint64_t session_id); +typedef bool (*vhost_backend_mem_section_filter_op)(struct vhost_dev *dev, +MemoryRegionSection *section); + typedef struct VhostOps { VhostBackendType backend_type; vhost_backend_init vhost_backend_init; @@ -138,6 +141,7 @@ typedef struct VhostOps { vhost_set_config_op vhost_set_config; vhost_crypto_create_session_op vhost_crypto_create_session; vhost_crypto_close_session_op vhost_crypto_close_session; +vhost_backend_mem_section_filter_op vhost_backend_mem_section_filter; } VhostOps; extern const VhostOps user_ops; -- 2.11.0
[Qemu-devel] [PATCH v3 2/6] vhost-user: introduce shared vhost-user state
When multi queue is enabled e.g. for a virtio-net device, each queue pair will have a vhost_dev, and the only thing shared between vhost devs currently is the chardev. This patch introduces a vhost-user state structure which will be shared by all vhost devs of the same virtio device. Signed-off-by: Tiwei Bie--- backends/cryptodev-vhost-user.c | 20 ++- hw/block/vhost-user-blk.c | 22 +++- hw/scsi/vhost-user-scsi.c | 20 ++- hw/virtio/Makefile.objs | 2 +- hw/virtio/vhost-stub.c | 10 ++ hw/virtio/vhost-user.c | 31 +++- include/hw/virtio/vhost-user-blk.h | 2 ++ include/hw/virtio/vhost-user-scsi.h | 2 ++ include/hw/virtio/vhost-user.h | 20 +++ net/vhost-user.c| 40 ++--- 10 files changed, 149 insertions(+), 20 deletions(-) create mode 100644 include/hw/virtio/vhost-user.h diff --git a/backends/cryptodev-vhost-user.c b/backends/cryptodev-vhost-user.c index 862d4f2580..d52daccfcd 100644 --- a/backends/cryptodev-vhost-user.c +++ b/backends/cryptodev-vhost-user.c @@ -26,6 +26,7 @@ #include "qapi/error.h" #include "qapi/qmp/qerror.h" #include "qemu/error-report.h" +#include "hw/virtio/vhost-user.h" #include "standard-headers/linux/virtio_crypto.h" #include "sysemu/cryptodev-vhost.h" #include "chardev/char-fe.h" @@ -46,6 +47,7 @@ typedef struct CryptoDevBackendVhostUser { CryptoDevBackend parent_obj; +VhostUserState *vhost_user; CharBackend chr; char *chr_name; bool opened; @@ -102,7 +104,7 @@ cryptodev_vhost_user_start(int queues, continue; } -options.opaque = >chr; +options.opaque = s->vhost_user; options.backend_type = VHOST_BACKEND_TYPE_USER; options.cc = b->conf.peers.ccs[i]; s->vhost_crypto[i] = cryptodev_vhost_init(); @@ -185,6 +187,7 @@ static void cryptodev_vhost_user_init( size_t i; Error *local_err = NULL; Chardev *chr; +VhostUserState *user; CryptoDevBackendClient *cc; CryptoDevBackendVhostUser *s = CRYPTODEV_BACKEND_VHOST_USER(backend); @@ -215,6 +218,15 @@ static void cryptodev_vhost_user_init( } } +user = vhost_user_init(); +if (!user) { +error_setg(errp, "Failed to init vhost_user"); +return; +} + +user->chr = >chr; +s->vhost_user = user; + qemu_chr_fe_set_handlers(>chr, NULL, NULL, cryptodev_vhost_user_event, NULL, s, NULL, true); @@ -299,6 +311,12 @@ static void cryptodev_vhost_user_cleanup( backend->conf.peers.ccs[i] = NULL; } } + +if (s->vhost_user) { +vhost_user_cleanup(s->vhost_user); +g_free(s->vhost_user); +s->vhost_user = NULL; +} } static void cryptodev_vhost_user_set_chardev(Object *obj, diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index 262baca432..4021d71c31 100644 --- a/hw/block/vhost-user-blk.c +++ b/hw/block/vhost-user-blk.c @@ -229,6 +229,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp) { VirtIODevice *vdev = VIRTIO_DEVICE(dev); VHostUserBlk *s = VHOST_USER_BLK(vdev); +VhostUserState *user; int i, ret; if (!s->chardev.chr) { @@ -246,6 +247,15 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp) return; } +user = vhost_user_init(); +if (!user) { +error_setg(errp, "vhost-user-blk: failed to init vhost_user"); +return; +} + +user->chr = >chardev; +s->vhost_user = user; + virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK, sizeof(struct virtio_blk_config)); @@ -261,7 +271,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp) vhost_dev_set_config_notifier(>dev, _ops); -ret = vhost_dev_init(>dev, >chardev, VHOST_BACKEND_TYPE_USER, 0); +ret = vhost_dev_init(>dev, s->vhost_user, VHOST_BACKEND_TYPE_USER, 0); if (ret < 0) { error_setg(errp, "vhost-user-blk: vhost initialization failed: %s", strerror(-ret)); @@ -286,6 +296,10 @@ vhost_err: virtio_err: g_free(s->dev.vqs); virtio_cleanup(vdev); + +vhost_user_cleanup(user); +g_free(user); +s->vhost_user = NULL; } static void vhost_user_blk_device_unrealize(DeviceState *dev, Error **errp) @@ -297,6 +311,12 @@ static void vhost_user_blk_device_unrealize(DeviceState *dev, Error **errp) vhost_dev_cleanup(>dev); g_free(s->dev.vqs); virtio_cleanup(vdev); + +if (s->vhost_user) { +vhost_user_cleanup(s->vhost_user); +g_free(s->vhost_user); +s->vhost_user = NULL; +} } static void vhost_user_blk_instance_init(Object *obj) diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
[Qemu-devel] [PATCH v3 4/6] virtio: support setting memory region based host notifier
This patch introduces the support for setting memory region based host notifiers for virtio device. This is helpful when using a hardware accelerator for a virtio device, because hardware heavily depends on the notification, this will allow the guest driver in the VM to notify the hardware directly. Signed-off-by: Tiwei Bie--- hw/virtio/virtio-pci.c | 22 ++ hw/virtio/virtio.c | 13 + include/hw/virtio/virtio-bus.h | 2 ++ include/hw/virtio/virtio.h | 2 ++ 4 files changed, 39 insertions(+) diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index 1e8ab7bbc5..5eb0c323ca 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -1037,6 +1037,27 @@ assign_error: return r; } +static int virtio_pci_set_host_notifier_mr(DeviceState *d, int n, + MemoryRegion *mr, bool assign) +{ +VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d); +int offset; + +if (n >= VIRTIO_QUEUE_MAX || !virtio_pci_modern(proxy) || +virtio_pci_queue_mem_mult(proxy) != memory_region_size(mr)) { +return -1; +} + +if (assign) { +offset = virtio_pci_queue_mem_mult(proxy) * n; +memory_region_add_subregion_overlap(>notify.mr, offset, mr, 1); +} else { +memory_region_del_subregion(>notify.mr, mr); +} + +return 0; +} + static void virtio_pci_vmstate_change(DeviceState *d, bool running) { VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d); @@ -2652,6 +2673,7 @@ static void virtio_pci_bus_class_init(ObjectClass *klass, void *data) k->has_extra_state = virtio_pci_has_extra_state; k->query_guest_notifiers = virtio_pci_query_guest_notifiers; k->set_guest_notifiers = virtio_pci_set_guest_notifiers; +k->set_host_notifier_mr = virtio_pci_set_host_notifier_mr; k->vmstate_change = virtio_pci_vmstate_change; k->pre_plugged = virtio_pci_pre_plugged; k->device_plugged = virtio_pci_device_plugged; diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 006d3d1148..1debb0147b 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -2454,6 +2454,19 @@ EventNotifier *virtio_queue_get_host_notifier(VirtQueue *vq) return >host_notifier; } +int virtio_queue_set_host_notifier_mr(VirtIODevice *vdev, int n, + MemoryRegion *mr, bool assign) +{ +BusState *qbus = qdev_get_parent_bus(DEVICE(vdev)); +VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); + +if (k->set_host_notifier_mr) { +return k->set_host_notifier_mr(qbus->parent, n, mr, assign); +} + +return -1; +} + void virtio_device_set_child_bus_name(VirtIODevice *vdev, char *bus_name) { g_free(vdev->bus_name); diff --git a/include/hw/virtio/virtio-bus.h b/include/hw/virtio/virtio-bus.h index ced3d2d2b0..7fec9dc929 100644 --- a/include/hw/virtio/virtio-bus.h +++ b/include/hw/virtio/virtio-bus.h @@ -52,6 +52,8 @@ typedef struct VirtioBusClass { bool (*has_extra_state)(DeviceState *d); bool (*query_guest_notifiers)(DeviceState *d); int (*set_guest_notifiers)(DeviceState *d, int nvqs, bool assign); +int (*set_host_notifier_mr)(DeviceState *d, int n, +MemoryRegion *mr, bool assign); void (*vmstate_change)(DeviceState *d, bool running); /* * Expose the features the transport layer supports before diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h index 098bdaaea3..9c1fa07d6d 100644 --- a/include/hw/virtio/virtio.h +++ b/include/hw/virtio/virtio.h @@ -239,6 +239,8 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, int align); void virtio_queue_notify(VirtIODevice *vdev, int n); uint16_t virtio_queue_vector(VirtIODevice *vdev, int n); void virtio_queue_set_vector(VirtIODevice *vdev, int n, uint16_t vector); +int virtio_queue_set_host_notifier_mr(VirtIODevice *vdev, int n, + MemoryRegion *mr, bool assign); int virtio_set_status(VirtIODevice *vdev, uint8_t val); void virtio_reset(void *opaque); void virtio_update_irq(VirtIODevice *vdev); -- 2.11.0
[Qemu-devel] [PATCH v3 0/6] Extend vhost-user to support registering external host notifiers
The original subject is: Extend vhost-user to support VFIO based accelerators Update notes Now, this patch set just focuses on adding the support for registering memory region based host notifiers. With this support, guest driver in the VM will be able to notify the hardware device at the vhost backend directly. It's one of the most important things in vDPA -- the host notification offload. Because, normally, the hardware device heavily depends on the notifications. Without this support, there will be a lot of VM-Exit happen due to the notifications from guest driver (it will drop the VM performance) and a lot of CPU resources wasted to do the notification relay (it will make the hardware offload less attractive, because one important goal of hardware offload is to free the CPU resources). More backgrounds of this patch set can be found from the cover letter of the previous versions: RFC: http://lists.nongnu.org/archive/html/qemu-devel/2017-12/msg04844.html v1: http://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg06028.html v2: http://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg05009.html v2 -> v3: - A better implementation of the shared vhost-user state (MST); - Use bus callback to add/delete subregions for notification (MST); - Refine APIs' names which add/delete subregions for notification (MST); - Refine the doc of the new vhost-user types and messages (MST); - Separate host notification offload from the guest notification offload (MST); - Drop the guest notification offload support from this patch; - Add memory filter for vhost backend to filter the sections they can handle; v1 -> v2: - Add some explanations about why extend vhost-user in commit log (Paolo); - Bug fix in slave_read() according to Stefan's fix in DPDK; - Remove IOMMU feature check and related commit log; - Some minor refinements; - Rebase to the latest QEMU; RFC -> v1: - Add some details about how vDPA works in cover letter (Alexey) - Add some details about the OVS offload use-case in cover letter (Jason) - Move PCI specific stuffs out of vhost-user (Jason) - Handle the virtual IOMMU case (Jason) - Move VFIO group management code into vfio/common.c (Alex) - Various refinements; (approximately sorted by comment posting time) Tiwei Bie (6): vhost-user: add Net prefix to internal state structure vhost-user: introduce shared vhost-user state vhost-user: support receiving file descriptors in slave_read virtio: support setting memory region based host notifier vhost: allow backends to filter memory sections vhost-user: support registering external host notifiers backends/cryptodev-vhost-user.c | 20 +++- docs/interop/vhost-user.txt | 33 ++ hw/block/vhost-user-blk.c | 22 +++- hw/scsi/vhost-user-scsi.c | 20 +++- hw/virtio/Makefile.objs | 2 +- hw/virtio/vhost-stub.c | 10 ++ hw/virtio/vhost-user.c | 206 ++-- hw/virtio/vhost.c | 9 +- hw/virtio/virtio-pci.c | 22 hw/virtio/virtio.c | 13 +++ include/hw/virtio/vhost-backend.h | 4 + include/hw/virtio/vhost-user-blk.h | 2 + include/hw/virtio/vhost-user-scsi.h | 2 + include/hw/virtio/vhost-user.h | 28 + include/hw/virtio/virtio-bus.h | 2 + include/hw/virtio/virtio.h | 2 + net/vhost-user.c| 78 +- 17 files changed, 433 insertions(+), 42 deletions(-) create mode 100644 include/hw/virtio/vhost-user.h -- 2.11.0
[Qemu-devel] [PATCH v3 1/6] vhost-user: add Net prefix to internal state structure
We are going to introduce a shared vhost user state which will be named as 'VhostUserState'. So add 'Net' prefix to the existing internal state structure in the vhost-user netdev to avoid conflict. Signed-off-by: Tiwei Bie--- net/vhost-user.c | 38 +++--- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/net/vhost-user.c b/net/vhost-user.c index e0f16c895b..fa28aad12d 100644 --- a/net/vhost-user.c +++ b/net/vhost-user.c @@ -20,38 +20,38 @@ #include "qemu/option.h" #include "trace.h" -typedef struct VhostUserState { +typedef struct NetVhostUserState { NetClientState nc; CharBackend chr; /* only queue index 0 */ VHostNetState *vhost_net; guint watch; uint64_t acked_features; bool started; -} VhostUserState; +} NetVhostUserState; VHostNetState *vhost_user_get_vhost_net(NetClientState *nc) { -VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc); +NetVhostUserState *s = DO_UPCAST(NetVhostUserState, nc, nc); assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_USER); return s->vhost_net; } uint64_t vhost_user_get_acked_features(NetClientState *nc) { -VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc); +NetVhostUserState *s = DO_UPCAST(NetVhostUserState, nc, nc); assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_USER); return s->acked_features; } static void vhost_user_stop(int queues, NetClientState *ncs[]) { -VhostUserState *s; +NetVhostUserState *s; int i; for (i = 0; i < queues; i++) { assert(ncs[i]->info->type == NET_CLIENT_DRIVER_VHOST_USER); -s = DO_UPCAST(VhostUserState, nc, ncs[i]); +s = DO_UPCAST(NetVhostUserState, nc, ncs[i]); if (s->vhost_net) { /* save acked features */ @@ -68,7 +68,7 @@ static int vhost_user_start(int queues, NetClientState *ncs[], CharBackend *be) { VhostNetOptions options; struct vhost_net *net = NULL; -VhostUserState *s; +NetVhostUserState *s; int max_queues; int i; @@ -77,7 +77,7 @@ static int vhost_user_start(int queues, NetClientState *ncs[], CharBackend *be) for (i = 0; i < queues; i++) { assert(ncs[i]->info->type == NET_CLIENT_DRIVER_VHOST_USER); -s = DO_UPCAST(VhostUserState, nc, ncs[i]); +s = DO_UPCAST(NetVhostUserState, nc, ncs[i]); options.net_backend = ncs[i]; options.opaque = be; @@ -123,7 +123,7 @@ static ssize_t vhost_user_receive(NetClientState *nc, const uint8_t *buf, without GUEST_ANNOUNCE capability. */ if (size == 60) { -VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc); +NetVhostUserState *s = DO_UPCAST(NetVhostUserState, nc, nc); int r; static int display_rarp_failure = 1; char mac_addr[6]; @@ -146,7 +146,7 @@ static ssize_t vhost_user_receive(NetClientState *nc, const uint8_t *buf, static void vhost_user_cleanup(NetClientState *nc) { -VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc); +NetVhostUserState *s = DO_UPCAST(NetVhostUserState, nc, nc); if (s->vhost_net) { vhost_net_cleanup(s->vhost_net); @@ -180,7 +180,7 @@ static bool vhost_user_has_ufo(NetClientState *nc) static NetClientInfo net_vhost_user_info = { .type = NET_CLIENT_DRIVER_VHOST_USER, -.size = sizeof(VhostUserState), +.size = sizeof(NetVhostUserState), .receive = vhost_user_receive, .cleanup = vhost_user_cleanup, .has_vnet_hdr = vhost_user_has_vnet_hdr, @@ -190,7 +190,7 @@ static NetClientInfo net_vhost_user_info = { static gboolean net_vhost_user_watch(GIOChannel *chan, GIOCondition cond, void *opaque) { -VhostUserState *s = opaque; +NetVhostUserState *s = opaque; qemu_chr_fe_disconnect(>chr); @@ -203,7 +203,7 @@ static void chr_closed_bh(void *opaque) { const char *name = opaque; NetClientState *ncs[MAX_QUEUE_NUM]; -VhostUserState *s; +NetVhostUserState *s; Error *err = NULL; int queues; @@ -212,7 +212,7 @@ static void chr_closed_bh(void *opaque) MAX_QUEUE_NUM); assert(queues < MAX_QUEUE_NUM); -s = DO_UPCAST(VhostUserState, nc, ncs[0]); +s = DO_UPCAST(NetVhostUserState, nc, ncs[0]); qmp_set_link(name, false, ); vhost_user_stop(queues, ncs); @@ -229,7 +229,7 @@ static void net_vhost_user_event(void *opaque, int event) { const char *name = opaque; NetClientState *ncs[MAX_QUEUE_NUM]; -VhostUserState *s; +NetVhostUserState *s; Chardev *chr; Error *err = NULL; int queues; @@ -239,7 +239,7 @@ static void net_vhost_user_event(void *opaque, int event) MAX_QUEUE_NUM); assert(queues < MAX_QUEUE_NUM); -s = DO_UPCAST(VhostUserState, nc, ncs[0]); +s = DO_UPCAST(NetVhostUserState, nc, ncs[0]);
[Qemu-devel] [Bug 1761401] Re: ARM/Neon: vcvt rounding error
I updated my QEMU git tree such that it includes commit bd49e6027cbc207c, and the test now passes. Thanks for the prompt fix! ** Changed in: qemu Status: New => Fix Committed -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1761401 Title: ARM/Neon: vcvt rounding error Status in QEMU: Fix Committed Bug description: Hello, While using QEMU commit 47d3b60858d90ac8a0cc3a72af7f95c96781125a (March 28, 2018), I've noticed failures in one of the GCC ARM/Neon tests. The test passes on hardware, and with QEMU-2.11.0, so it looks like a recent regression. The test builds a vector of 4 float32 with "125.9" as value, then converts them to 4 uint32_t. The expected result is 125, but we get 126 instead. Maybe it's just a matter of default rounding mode? To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1761401/+subscriptions
Re: [Qemu-devel] [PATCH v3 35/38] console: make screendump asynchronous
* Marc-André Lureau (marcandre.lur...@redhat.com) wrote: > Make screendump asynchronous to provide correct screendumps. > > HMP doesn't have async support, so it has to remain synchronous and > potentially incorrect to avoid potential races. > > Fixes: > https://bugzilla.redhat.com/show_bug.cgi?id=1230527 > > Signed-off-by: Marc-André LureauHi, Looking at this and the previous pair of patches, I think we'd be better if we defined that async commands took a callback function pointer and data that they call. Here we've got 'graphic_hw_update_done' that explicitly walks qmp lists; but I think it's better just to pass graphic_hw_update() the completion data together with a callback function and it calls that when it's done. (Also isn't it a little bit of a race, calling graphic_hw_update() and *then* adding the entry to the list? Can't it end up calling the _done() before we've added it to the list). Also, do we need a list of outstanding screendumps, or should we just have a list of async commands. It wouldn't seem hard to use the async-QMP command from the HMP and then just provide a way to tell whether it had finished. Dave > --- > qapi/ui.json | 3 +- > include/ui/console.h | 3 ++ > hmp.c| 2 +- > ui/console.c | 99 > 4 files changed, 96 insertions(+), 11 deletions(-) > > diff --git a/qapi/ui.json b/qapi/ui.json > index 5d01ad4304..4d2c326fb9 100644 > --- a/qapi/ui.json > +++ b/qapi/ui.json > @@ -96,7 +96,8 @@ > # > ## > { 'command': 'screendump', > - 'data': {'filename': 'str', '*device': 'str', '*head': 'int'} } > + 'data': {'filename': 'str', '*device': 'str', '*head': 'int'}, > + 'async': true } > > ## > # == Spice > diff --git a/include/ui/console.h b/include/ui/console.h > index fc21621656..0c02190963 100644 > --- a/include/ui/console.h > +++ b/include/ui/console.h > @@ -74,6 +74,9 @@ struct MouseTransformInfo { > }; > > void hmp_mouse_set(Monitor *mon, const QDict *qdict); > +void hmp_screendump_sync(const char *filename, > + bool has_device, const char *device, > + bool has_head, int64_t head, Error **errp); > > /* keysym is a unicode code except for special keys (see QEMU_KEY_xxx > constants) */ > diff --git a/hmp.c b/hmp.c > index 679467d85a..da9008fe63 100644 > --- a/hmp.c > +++ b/hmp.c > @@ -2144,7 +2144,7 @@ void hmp_screendump(Monitor *mon, const QDict *qdict) > int64_t head = qdict_get_try_int(qdict, "head", 0); > Error *err = NULL; > > -qmp_screendump(filename, id != NULL, id, id != NULL, head, ); > +hmp_screendump_sync(filename, id != NULL, id, id != NULL, head, ); > hmp_handle_error(mon, ); > } > > diff --git a/ui/console.c b/ui/console.c > index 29234605a7..da51861191 100644 > --- a/ui/console.c > +++ b/ui/console.c > @@ -32,6 +32,7 @@ > #include "chardev/char-fe.h" > #include "trace.h" > #include "exec/memory.h" > +#include "monitor/monitor.h" > > #define DEFAULT_BACKSCROLL 512 > #define CONSOLE_CURSOR_PERIOD 500 > @@ -116,6 +117,12 @@ typedef enum { > TEXT_CONSOLE_FIXED_SIZE > } console_type_t; > > +struct qmp_screendump { > +gchar *filename; > +QmpReturn *ret; > +QLIST_ENTRY(qmp_screendump) link; > +}; > + > struct QemuConsole { > Object parent; > > @@ -165,6 +172,8 @@ struct QemuConsole { > QEMUFIFO out_fifo; > uint8_t out_fifo_buf[16]; > QEMUTimer *kbd_timer; > + > +QLIST_HEAD(, qmp_screendump) qmp_screendumps; > }; > > struct DisplayState { > @@ -190,6 +199,8 @@ static void dpy_refresh(DisplayState *s); > static DisplayState *get_alloc_displaystate(void); > static void text_console_update_cursor_timer(void); > static void text_console_update_cursor(void *opaque); > +static void ppm_save(const char *filename, DisplaySurface *ds, > + Error **errp); > > static void gui_update(void *opaque) > { > @@ -256,8 +267,42 @@ static void gui_setup_refresh(DisplayState *ds) > ds->have_text = have_text; > } > > +static void qmp_screendump_finish(QemuConsole *con, const char *filename, > + QmpReturn *ret) > +{ > +Error *err = NULL; > +DisplaySurface *surface; > +Monitor *prev_mon = cur_mon; > + > +if (qmp_return_is_cancelled(ret)) { > +return; > +} > + > +cur_mon = qmp_return_get_monitor(ret); > +surface = qemu_console_surface(con); > + > +/* FIXME: async save with coroutine? it would have to copy or lock > + * the surface. */ > +ppm_save(filename, surface, ); > + > +if (err) { > +qmp_return_error(ret, err); > +} else { > +qmp_screendump_return(ret); > +} > +cur_mon = prev_mon; > +} > + > void graphic_hw_update_done(QemuConsole *con) > { > +struct qmp_screendump *dump, *next; > + > +QLIST_FOREACH_SAFE(dump, >qmp_screendumps, link, next) { > +
[Qemu-devel] [PATCH v2 2/3] machine: make MemoryHotplugState accessible via the machine
Let's allow to query the MemoryHotplugState from the machine. This allows us to generically detect if a certain machine has support for memory devices, and to generically manage it (find free address range, plug/unplug a memory region). Signed-off-by: David Hildenbrand--- hw/i386/pc.c | 12 hw/ppc/spapr.c | 12 include/hw/boards.h | 16 include/hw/mem/pc-dimm.h | 12 +--- 4 files changed, 41 insertions(+), 11 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index d36bac8c89..fa8862af33 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -2337,6 +2337,17 @@ static void x86_nmi(NMIState *n, int cpu_index, Error **errp) } } +static MemoryHotplugState *pc_machine_get_memory_hotplug_state(MachineState *ms) +{ +PCMachineState *pcms = PC_MACHINE(ms); + +if (!pcms->hotplug_memory.base) { +/* hotplug not supported */ +return NULL; +} +return >hotplug_memory; +} + static void pc_machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); @@ -2376,6 +2387,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data) hc->unplug = pc_machine_device_unplug_cb; nc->nmi_monitor_handler = x86_nmi; mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE; +mc->get_memory_hotplug_state = pc_machine_get_memory_hotplug_state; object_class_property_add(oc, PC_MACHINE_MEMHP_REGION_SIZE, "int", pc_machine_get_hotplug_memory_region_size, NULL, diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index a7428f7da7..7ccdb705b3 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -3868,6 +3868,17 @@ PowerPCCPU *spapr_find_cpu(int vcpu_id) return NULL; } +static MemoryHotplugState *spapr_get_memory_hotplug_state(MachineState *ms) +{ +sPAPRMachineState *spapr = SPAPR_MACHINE(ms); + +if (!spapr->hotplug_memory.base) { +/* hotplug not supported */ +return NULL; +} +return >hotplug_memory; +} + static void spapr_machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); @@ -3927,6 +3938,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data) * in which LMBs are represented and hot-added */ mc->numa_mem_align_shift = 28; +mc->get_memory_hotplug_state = spapr_get_memory_hotplug_state; smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF; smc->default_caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_ON; diff --git a/include/hw/boards.h b/include/hw/boards.h index a609239112..447bdc7219 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -105,6 +105,17 @@ typedef struct { CPUArchId cpus[0]; } CPUArchIdList; +/** + * MemoryHotplugState: + * @base: address in guest physical address space where hotplug memory + * address space begins. + * @mr: hotplug memory address space container + */ +typedef struct MemoryHotplugState { +hwaddr base; +MemoryRegion mr; +} MemoryHotplugState; + /** * MachineClass: * @max_cpus: maximum number of CPUs supported. Default: 1 @@ -156,6 +167,10 @@ typedef struct { *should instead use "unimplemented-device" for all memory ranges where *the guest will attempt to probe for a device that QEMU doesn't *implement and a stub device is required. + * @get_memory_hotplug_state: + *If a machine support memory hotplug, the returned data ontains + *information about the portion of guest physical address space + *where memory devices can be mapped to (e.g. to hotplug a pc-dimm). */ struct MachineClass { /*< private >*/ @@ -212,6 +227,7 @@ struct MachineClass { unsigned cpu_index); const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine); int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx); +MemoryHotplugState *(*get_memory_hotplug_state)(MachineState *ms); }; /** diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h index e88073321f..8bda37adab 100644 --- a/include/hw/mem/pc-dimm.h +++ b/include/hw/mem/pc-dimm.h @@ -19,6 +19,7 @@ #include "exec/memory.h" #include "sysemu/hostmem.h" #include "hw/qdev.h" +#include "hw/boards.h" #define TYPE_PC_DIMM "pc-dimm" #define PC_DIMM(obj) \ @@ -75,17 +76,6 @@ typedef struct PCDIMMDeviceClass { MemoryRegion *(*get_vmstate_memory_region)(PCDIMMDevice *dimm); } PCDIMMDeviceClass; -/** - * MemoryHotplugState: - * @base: address in guest physical address space where hotplug memory - * address space begins. - * @mr: hotplug memory address space container - */ -typedef struct MemoryHotplugState { -hwaddr base; -MemoryRegion mr; -} MemoryHotplugState; - uint64_t pc_dimm_get_free_addr(uint64_t address_space_start, uint64_t address_space_size, uint64_t *hint, uint64_t align, uint64_t size, -- 2.14.3
[Qemu-devel] [PATCH v2 3/3] pc-dimm: factor out address space logic into MemoryDevice code
To be able to reuse MemoryDevice logic from other devices besides pc-dimm, factor the relevant stuff out into the MemoryDevice code. As we don't care about slots for memory devices that are not pc-dimm, don't factor that part out. Most of this patch just moves checks and logic around. While at it, make the code properly detect certain error conditions better (e.g. fragmented memory). Signed-off-by: David Hildenbrand--- hw/i386/pc.c | 12 +-- hw/mem/memory-device.c | 162 hw/mem/pc-dimm.c | 185 +++-- hw/ppc/spapr.c | 9 +- include/hw/mem/memory-device.h | 4 + include/hw/mem/pc-dimm.h | 14 +--- 6 files changed, 185 insertions(+), 201 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index fa8862af33..1c25546a0c 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1711,7 +1711,7 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev, goto out; } -pc_dimm_memory_plug(dev, >hotplug_memory, mr, align, _err); +pc_dimm_memory_plug(dev, align, _err); if (local_err) { goto out; } @@ -1761,17 +1761,9 @@ static void pc_dimm_unplug(HotplugHandler *hotplug_dev, DeviceState *dev, Error **errp) { PCMachineState *pcms = PC_MACHINE(hotplug_dev); -PCDIMMDevice *dimm = PC_DIMM(dev); -PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm); -MemoryRegion *mr; HotplugHandlerClass *hhc; Error *local_err = NULL; -mr = ddc->get_memory_region(dimm, _err); -if (local_err) { -goto out; -} - hhc = HOTPLUG_HANDLER_GET_CLASS(pcms->acpi_dev); hhc->unplug(HOTPLUG_HANDLER(pcms->acpi_dev), dev, _err); @@ -1779,7 +1771,7 @@ static void pc_dimm_unplug(HotplugHandler *hotplug_dev, goto out; } -pc_dimm_memory_unplug(dev, >hotplug_memory, mr); +pc_dimm_memory_unplug(dev); object_unparent(OBJECT(dev)); out: diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c index 0e0d6b539a..611c3252f0 100644 --- a/hw/mem/memory-device.c +++ b/hw/mem/memory-device.c @@ -15,6 +15,8 @@ #include "qapi/error.h" #include "hw/boards.h" #include "qemu/range.h" +#include "hw/virtio/vhost.h" +#include "sysemu/kvm.h" static gint memory_device_addr_sort(gconstpointer a, gconstpointer b) { @@ -105,6 +107,166 @@ uint64_t get_plugged_memory_size(void) return size; } +static int memory_device_used_region_size_internal(Object *obj, void *opaque) +{ +uint64_t *size = opaque; + +if (object_dynamic_cast(obj, TYPE_MEMORY_DEVICE)) { +DeviceState *dev = DEVICE(obj); +MemoryDeviceState *md = MEMORY_DEVICE(obj); +MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(obj); + +if (dev->realized) { +*size += mdc->get_region_size(md, _abort); +} +} + +object_child_foreach(obj, memory_device_used_region_size_internal, opaque); +return 0; +} + +static uint64_t memory_device_used_region_size(void) +{ +uint64_t size = 0; + +memory_device_used_region_size_internal(qdev_get_machine(), ); + +return size; +} + +uint64_t memory_device_get_free_addr(uint64_t *hint, uint64_t align, + uint64_t size, Error **errp) +{ +const uint64_t used_region_size = memory_device_used_region_size(); +uint64_t address_space_start, address_space_end; +MachineState *machine = MACHINE(qdev_get_machine()); +MachineClass *mc = MACHINE_GET_CLASS(machine); +MemoryHotplugState *hpms; +GSList *list = NULL, *item; +uint64_t new_addr = 0; + +if (!mc->get_memory_hotplug_state) { +error_setg(errp, "memory devices (e.g. for memory hotplug) are not " + "supported by the machine"); +return 0; +} + +hpms = mc->get_memory_hotplug_state(machine); +if (!hpms || !memory_region_size(>mr)) { +error_setg(errp, "memory devices (e.g. for memory hotplug) are not " + "enabled, please specify the maxmem option"); +return 0; +} +address_space_start = hpms->base; +address_space_end = address_space_start + memory_region_size(>mr); +g_assert(address_space_end >= address_space_start); + +if (used_region_size + size > machine->maxram_size - machine->ram_size) { +error_setg(errp, "not enough space, currently 0x%" PRIx64 + " in use of total hot pluggable 0x" RAM_ADDR_FMT, + used_region_size, machine->maxram_size - machine->ram_size); +return 0; +} + +if (hint && QEMU_ALIGN_UP(*hint, align) != *hint) { +error_setg(errp, "address must be aligned to 0x%" PRIx64 " bytes", + align); +return 0; +} + +if (QEMU_ALIGN_UP(size, align) != size) { +error_setg(errp, "backend memory size must be multiple of 0x%" + PRIx64, align); +
[Qemu-devel] [PATCH v2 0/3] pc-dimm: factor out MemoryDevice
Right now we can only map PCDIMM/NVDIMM into guest address space. In the future, we might want to do the same for virtio devices - e.g. virtio-pmem or virtio-mem. Especially, they should be able to live side by side to each other. E.g. the virto based memory devices regions will not be exposed via ACPI and friends. They will be detected just like other virtio devices and indicate the applicable memory region. This makes it possible to also use them on architectures without memory device detection support (e.g. s390x). Let's factor out the memory device code into a MemoryDevice interface. See the patch descriptions for details. v1 -> v2: - Fix compile issues on ppc (still untested O:-) ) David Hildenbrand (3): pc-dimm: factor out MemoryDevice interface machine: make MemoryHotplugState accessible via the machine pc-dimm: factor out address space logic into MemoryDevice code hw/i386/acpi-build.c | 3 +- hw/i386/pc.c | 24 ++- hw/mem/Makefile.objs | 1 + hw/mem/memory-device.c | 281 + hw/mem/pc-dimm.c | 304 +++ hw/ppc/spapr.c | 24 ++- hw/ppc/spapr_hcall.c | 1 + include/hw/boards.h | 16 ++ include/hw/mem/memory-device.h | 48 + include/hw/mem/pc-dimm.h | 26 +-- numa.c | 3 +- qmp.c| 4 +- stubs/Makefile.objs | 2 +- stubs/{qmp_pc_dimm.c => qmp_memory_device.c} | 4 +- 14 files changed, 464 insertions(+), 277 deletions(-) create mode 100644 hw/mem/memory-device.c create mode 100644 include/hw/mem/memory-device.h rename stubs/{qmp_pc_dimm.c => qmp_memory_device.c} (61%) -- 2.14.3
[Qemu-devel] [PATCH v2 1/3] pc-dimm: factor out MemoryDevice interface
On the qmp level, we already have the concept of memory devices: "query-memory-devices" Right now, we only support NVDIMM and PCDIMM. We want to map other devices later into the address space of the guest. Such device could e.g. be virtio devices. These devices will have a guest memory range assigned but won't be exposed via e.g. ACPI. We want to make them look like memory device, but not glued to pc-dimm. Especially, it will not always be possible to have TYPE_PC_DIMM as a parent class (e.g. virtio devices). Let's use an interface instead. As a first part, convert handling of - qmp_pc_dimm_device_list - get_plugged_memory_size to our new model. plug/unplug stuff etc. will follow later. A memory device will have to provide the following functions: - get_addr(): Necessary, as the property "addr" can e.g. not be used for virtio devices (already defined). - get_plugged_size(): The amount this device offers to the guest as of now. - get_region_size(): Because this can later on be bigger than the plugged size. - fill_device_info(): Fill MemoryDeviceInfo, e.g. for qmp. Signed-off-by: David Hildenbrand--- hw/i386/acpi-build.c | 3 +- hw/mem/Makefile.objs | 1 + hw/mem/memory-device.c | 119 +++ hw/mem/pc-dimm.c | 119 ++- hw/ppc/spapr.c | 3 +- hw/ppc/spapr_hcall.c | 1 + include/hw/mem/memory-device.h | 44 ++ include/hw/mem/pc-dimm.h | 2 - numa.c | 3 +- qmp.c| 4 +- stubs/Makefile.objs | 2 +- stubs/{qmp_pc_dimm.c => qmp_memory_device.c} | 4 +- 12 files changed, 239 insertions(+), 66 deletions(-) create mode 100644 hw/mem/memory-device.c create mode 100644 include/hw/mem/memory-device.h rename stubs/{qmp_pc_dimm.c => qmp_memory_device.c} (61%) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 3cf2a1679c..ca3645d57b 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -46,6 +46,7 @@ #include "hw/acpi/vmgenid.h" #include "sysemu/tpm_backend.h" #include "hw/timer/mc146818rtc_regs.h" +#include "hw/mem/memory-device.h" #include "sysemu/numa.h" /* Supported chipsets: */ @@ -2253,7 +2254,7 @@ build_tpm2(GArray *table_data, BIOSLinker *linker, GArray *tcpalog) static void build_srat_hotpluggable_memory(GArray *table_data, uint64_t base, uint64_t len, int default_node) { -MemoryDeviceInfoList *info_list = qmp_pc_dimm_device_list(); +MemoryDeviceInfoList *info_list = qmp_memory_device_list(); MemoryDeviceInfoList *info; MemoryDeviceInfo *mi; PCDIMMDeviceInfo *di; diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs index f12f8b97a2..10be4df2a2 100644 --- a/hw/mem/Makefile.objs +++ b/hw/mem/Makefile.objs @@ -1,2 +1,3 @@ common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o +common-obj-$(CONFIG_MEM_HOTPLUG) += memory-device.o common-obj-$(CONFIG_NVDIMM) += nvdimm.o diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c new file mode 100644 index 00..0e0d6b539a --- /dev/null +++ b/hw/mem/memory-device.c @@ -0,0 +1,119 @@ +/* + * Memory Device Interface + * + * Copyright ProfitBricks GmbH 2012 + * Copyright (C) 2014 Red Hat Inc + * Copyright (c) 2018 Red Hat Inc + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "hw/mem/memory-device.h" +#include "hw/qdev.h" +#include "qapi/error.h" +#include "hw/boards.h" +#include "qemu/range.h" + +static gint memory_device_addr_sort(gconstpointer a, gconstpointer b) +{ +MemoryDeviceState *md_a = MEMORY_DEVICE(a); +MemoryDeviceState *md_b = MEMORY_DEVICE(b); +MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(a); +const uint64_t addr_a = mdc->get_addr(md_a); +const uint64_t addr_b = mdc->get_addr(md_b); + +if (addr_a > addr_b) { +return 1; +} else if (addr_a < addr_b) { +return -1; +} +return 0; +} + +static int memory_device_built_list(Object *obj, void *opaque) +{ +GSList **list = opaque; + +if (object_dynamic_cast(obj, TYPE_MEMORY_DEVICE)) { +DeviceState *dev = DEVICE(obj); +if (dev->realized) { /* only realized memory devices matter */ +*list = g_slist_insert_sorted(*list, dev, memory_device_addr_sort); +} +} + +object_child_foreach(obj, memory_device_built_list, opaque); +return 0; +} + +MemoryDeviceInfoList *qmp_memory_device_list(void) +{ +GSList *devices = NULL, *item; +MemoryDeviceInfoList *list = NULL, *prev = NULL; + +object_child_foreach(qdev_get_machine(),
Re: [Qemu-devel] [PATCH v1 04/24] Makefile: Rename TARGET_DIRS to TARGET_LIST
On 04/10/2018 04:38 PM, Alex Bennée wrote: > From: Fam Zheng> > To be more accurate on its purpose and make code that looks for a certain > target out of this variable more readable. > > Signed-off-by: Fam Zheng Reviewed-by: Philippe Mathieu-Daudé Tested-by: Philippe Mathieu-Daudé > --- > Makefile | 20 ++-- > configure | 2 +- > scripts/create_config | 2 +- > tests/Makefile.include | 2 +- > 4 files changed, 13 insertions(+), 13 deletions(-) > > diff --git a/Makefile b/Makefile > index 727ef118f3..2c54cd8345 100644 > --- a/Makefile > +++ b/Makefile > @@ -62,8 +62,8 @@ seems to have been used for an in-tree build. You can fix > this by running \ > endif > endif > > -CONFIG_SOFTMMU := $(if $(filter %-softmmu,$(TARGET_DIRS)),y) > -CONFIG_USER_ONLY := $(if $(filter %-user,$(TARGET_DIRS)),y) > +CONFIG_SOFTMMU := $(if $(filter %-softmmu,$(TARGET_LIST)),y) > +CONFIG_USER_ONLY := $(if $(filter %-user,$(TARGET_LIST)),y) > CONFIG_XEN := $(CONFIG_XEN_BACKEND) > CONFIG_ALL=y > -include config-all-devices.mak > @@ -362,8 +362,8 @@ DOCS= > endif > > SUBDIR_MAKEFLAGS=$(if $(V),,--no-print-directory --quiet) > BUILD_DIR=$(BUILD_DIR) > -SUBDIR_DEVICES_MAK=$(patsubst %, %/config-devices.mak, $(TARGET_DIRS)) > -SUBDIR_DEVICES_MAK_DEP=$(patsubst %, %-config-devices.mak.d, $(TARGET_DIRS)) > +SUBDIR_DEVICES_MAK=$(patsubst %, %/config-devices.mak, $(TARGET_LIST)) > +SUBDIR_DEVICES_MAK_DEP=$(patsubst %, %-config-devices.mak.d, $(TARGET_LIST)) > > ifeq ($(SUBDIR_DEVICES_MAK),) > config-all-devices.mak: > @@ -466,7 +466,7 @@ config-host.h-timestamp: config-host.mak > qemu-options.def: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool > $(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > > $@,"GEN","$@") > > -SUBDIR_RULES=$(patsubst %,subdir-%, $(TARGET_DIRS)) > +SUBDIR_RULES=$(patsubst %,subdir-%, $(TARGET_LIST)) > SOFTMMU_SUBDIR_RULES=$(filter %-softmmu,$(SUBDIR_RULES)) > > $(SOFTMMU_SUBDIR_RULES): $(block-obj-y) > @@ -510,7 +510,7 @@ ROMSUBDIR_RULES=$(patsubst %,romsubdir-%, $(ROMS)) > romsubdir-%: > $(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C pc-bios/$* V="$(V)" > TARGET_DIR="$*/" CFLAGS="$(filter -O% -g%,$(CFLAGS))",) > > -ALL_SUBDIRS=$(TARGET_DIRS) $(patsubst %,pc-bios/%, $(ROMS)) > +ALL_SUBDIRS=$(TARGET_LIST) $(patsubst %,pc-bios/%, $(ROMS)) > > recurse-all: $(SUBDIR_RULES) $(ROMSUBDIR_RULES) > > @@ -763,7 +763,7 @@ distclean: clean > rm -f docs/interop/qemu-qmp-ref.pdf docs/interop/qemu-ga-ref.pdf > rm -f docs/interop/qemu-qmp-ref.html docs/interop/qemu-ga-ref.html > rm -f docs/qemu-block-drivers.7 > - for d in $(TARGET_DIRS); do \ > + for d in $(TARGET_LIST); do \ > rm -rf $$d || exit 1 ; \ > done > rm -Rf .sdk > @@ -864,7 +864,7 @@ endif > $(INSTALL_DATA) $(SRC_PATH)/pc-bios/keymaps/$$x > "$(DESTDIR)$(qemu_datadir)/keymaps"; \ > done > $(INSTALL_DATA) $(BUILD_DIR)/trace-events-all > "$(DESTDIR)$(qemu_datadir)/trace-events-all" > - for d in $(TARGET_DIRS); do \ > + for d in $(TARGET_LIST); do \ > $(MAKE) $(SUBDIR_MAKEFLAGS) TARGET_DIR=$$d/ -C $$d $@ || exit 1 ; \ > done > > @@ -1062,9 +1062,9 @@ endif > @echo ' ctags/TAGS - Generate tags file for editors' > @echo ' cscope - Generate cscope index' > @echo '' > - @$(if $(TARGET_DIRS), \ > + @$(if $(TARGET_LIST), \ > echo 'Architecture specific targets:'; \ > - $(foreach t, $(TARGET_DIRS), \ > + $(foreach t, $(TARGET_LIST), \ > printf " %-30s - Build for %s\\n" $(patsubst %,subdir-%,$(t)) > $(t);) \ > echo '') > @echo 'Cleaning targets:' > diff --git a/configure b/configure > index add87ff4d4..5a41c87cc3 100755 > --- a/configure > +++ b/configure > @@ -6094,7 +6094,7 @@ qemu_version=$(head $source_path/VERSION) > echo "VERSION=$qemu_version" >>$config_host_mak > echo "PKGVERSION=$pkgversion" >>$config_host_mak > echo "SRC_PATH=$source_path" >> $config_host_mak > -echo "TARGET_DIRS=$target_list" >> $config_host_mak > +echo "TARGET_LIST=$target_list" >> $config_host_mak > if [ "$docs" = "yes" ] ; then >echo "BUILD_DOCS=yes" >> $config_host_mak > fi > diff --git a/scripts/create_config b/scripts/create_config > index d727e5e36e..58948a67a4 100755 > --- a/scripts/create_config > +++ b/scripts/create_config > @@ -107,7 +107,7 @@ case $line in > target_name=${line#*=} > echo "#define TARGET_NAME \"$target_name\"" > ;; > - TARGET_DIRS=*) > + TARGET_LIST=*) > # do nothing > ;; > TARGET_*=y) # configuration > diff --git a/tests/Makefile.include b/tests/Makefile.include > index 3b9a5e31a2..3d2f0458ab 100644 > --- a/tests/Makefile.include > +++ b/tests/Makefile.include > @@ -854,7 +854,7 @@ endif > > # QTest rules > > -TARGETS=$(patsubst
[Qemu-devel] buffer sharing across VMs - xen-zcopy and hyper_dmabuf discussion
(changed subject and decoupling from udmabuf thread) On Wed, Apr 11, 2018 at 08:59:32AM +0300, Oleksandr Andrushchenko wrote: > On 04/10/2018 08:26 PM, Dongwon Kim wrote: > >On Tue, Apr 10, 2018 at 09:37:53AM +0300, Oleksandr Andrushchenko wrote: > >>On 04/06/2018 09:57 PM, Dongwon Kim wrote: > >>>On Fri, Apr 06, 2018 at 03:36:03PM +0300, Oleksandr Andrushchenko wrote: > On 04/06/2018 02:57 PM, Gerd Hoffmann wrote: > > Hi, > > > >>>I fail to see any common ground for xen-zcopy and udmabuf ... > >>Does the above mean you can assume that xen-zcopy and udmabuf > >>can co-exist as two different solutions? > >Well, udmabuf route isn't fully clear yet, but yes. > > > >See also gvt (intel vgpu), where the hypervisor interface is abstracted > >away into a separate kernel modules even though most of the actual vgpu > >emulation code is common. > Thank you for your input, I'm just trying to figure out > which of the three z-copy solutions intersect and how much > >>And what about hyper-dmabuf? > >>>xen z-copy solution is pretty similar fundamentally to hyper_dmabuf > >>>in terms of these core sharing feature: > >>> > >>>1. the sharing process - import prime/dmabuf from the producer -> extract > >>>underlying pages and get those shared -> return references for shared pages > >Another thing is danvet was kind of against to the idea of importing existing > >dmabuf/prime buffer and forward it to the other domain due to synchronization > >issues. He proposed to make hyper_dmabuf only work as an exporter so that it > >can have a full control over the buffer. I think we need to talk about this > >further as well. > Yes, I saw this. But this limits the use-cases so much. I agree. Our current approach is a lot more flexible. You can find very similar feedback in my reply to those review messages. However, I also understand Daniel's concern as well. I believe we need more dicussion regarding this matter. > For instance, running Android as a Guest (which uses ION to allocate > buffers) means that finally HW composer will import dma-buf into > the DRM driver. Then, in case of xen-front for example, it needs to be > shared with the backend (Host side). Of course, we can change user-space > to make xen-front allocate the buffers (make it exporter), but what we try > to avoid is to change user-space which in normal world would have remain > unchanged otherwise. > So, I do think we have to support this use-case and just have to understand > the complexity. > > > > >danvet, can you comment on this topic? > > > >>>2. the page sharing mechanism - it uses Xen-grant-table. > >>> > >>>And to give you a quick summary of differences as far as I understand > >>>between two implementations (please correct me if I am wrong, Oleksandr.) > >>> > >>>1. xen-zcopy is DRM specific - can import only DRM prime buffer > >>>while hyper_dmabuf can export any dmabuf regardless of originator > >>Well, this is true. And at the same time this is just a matter > >>of extending the API: xen-zcopy is a helper driver designed for > >>xen-front/back use-case, so this is why it only has DRM PRIME API > >>>2. xen-zcopy doesn't seem to have dma-buf synchronization between two VMs > >>>while (as danvet called it as remote dmabuf api sharing) hyper_dmabuf sends > >>>out synchronization message to the exporting VM for synchronization. > >>This is true. Again, this is because of the use-cases it covers. > >>But having synchronization for a generic solution seems to be a good idea. > >Yeah, understood xen-zcopy works ok with your use case. But I am just curious > >if it is ok not to have any inter-domain synchronization in this sharing > >model. > The synchronization is done with displif protocol [1] > >The buffer being shared is technically dma-buf and originator needs to be > >able > >to keep track of it. > As I am working in DRM terms the tracking is done by the DRM core > for me for free. (This might be one of the reasons Daniel sees DRM > based implementation fit very good from code-reuse POV). yeah but once you have a DRM object (whether it's dmabuf or not) on a remote domain, it is totally new object and out of sync (correct me if I am wrong) with original DRM prime, isn't it? How could these two different but based on same pages be synchronized? > > > >>>3. 1-level references - when using grant-table for sharing pages, there > >>>will > >>>be same # of refs (each 8 byte) > >>To be precise, grant ref is 4 bytes > >You are right. Thanks for correction.;) > > > >>>as # of shared pages, which is passed to > >>>the userspace to be shared with importing VM in case of xen-zcopy. > >>The reason for that is that xen-zcopy is a helper driver, e.g. > >>the grant references come from the display backend [1], which implements > >>Xen display protocol [2]. So, effectively the backend extracts references > >>from frontend's requests and passes those to xen-zcopy as an array > >>of refs. > >>> Compared > >>>to
Re: [Qemu-devel] [PATCH 07/19] block: Really pause block jobs on drain
Am 12.04.2018 um 15:42 hat Paolo Bonzini geschrieben: > On 12/04/2018 15:27, Kevin Wolf wrote: > > Not sure I follow. Let's look at an example. Say, we have a block job > > BlockBackend as the root (because that uses proper layering, unlike > > devices which use aio_disable_external()), connected to a qcow2 node > > over file. > > > > 1. The block job issues a request to the qcow2 node > > > > 2. In order to process that request, the qcow2 node internally issues > >one or more requests to the file node > > > > 3. Someone calls bdrv_drained_begin(qcow2_node) > > > > 4. We call BlockDriver.bdrv_drained_begin() for qcow2_node and > >file_node, and BdrvChildRole.drained_begin() for the block job. > > > >Now the block nodes don't create any new original requests any more > >(qcow2 and file don't do that anyway; qcow2 only creates requests to > >its children in the context of parent requests). The job potentially > >continues sending requests until it reaches the next pause point. The > >previously issued requests are still in flight. > > > >Is this step what you meant by X->drv->bdrv_drain(X)? I don't see why > >pending requests can only be in X's children. Why can't the request > >be pending in X itself, say waiting for a thread pool worker > >decrypting a buffer? > > No, that step is ->bdrv_co_drain_begin in BlockDriver. It's where the > "last" requests are sent to file_node after we know that qcow2_node > won't get any more requests. That's not really how .bdrv_co_drain_begin works. It is called first, potentially long before we know that the parent won't send new requests any more. All it does is preventing the node from creating new original requests, i.e. internal requests that are not triggered by a parent request. > >Also, note that after this series, the driver callbacks are called > >asynchronously, but I don't think it makes a big difference here. > > > > 5. The file_node request completes. file_node doesn't have any requests > >in flight any more, but in theory it could still get new requests > >from qcow2_node. Anyway, let's say this is the last request, so I > >think we'd call its requests concluded? > > No, if it can still get more requests they're not concluded. Okay. In that case, we can't really determine any order, because the whole subtree concludes when the root node concludes. (Ignoring any additional parents, but the effect is the same: Children always conclude at the same time as their last parent.) > That's why we need to first ensure qcow2_node is quiescent, and before > then we need to ensure that the BlockBackends are quiescent (in this > case meaning the job has reached its pause point). Only then we can > look at file_node. In this case we'll see that we have nothing to > do---file_node is already quiescent---and bdrv_drained_begin() can > return. bdrv_drained_begin(qcow2_node) actually never looks at file_node. Only bdrv_subtree_drained_begin(qcow2_node) would do so. But what you're saying is essentially that for a subtree drain, we want the following order in bdrv_drained_poll(): 1. Check if a parent still has pending requests. If so, don't bother checking the rest. 2. Check the node the drain request was for. If so, don't bother looking at the children. 3. Finally, if all parents and the node itself are drained, look at the children. This is already the order we have there. What is probably different from what you envision is that after the parents have concluded, we still check that they are still quiescent in every iteration. What we could do easily is introducing a bool bs->quiesce_concluded or something that bdrv_drain_poll() sets to true the first time that it returns false. It also gets a shortcut so that it returns false immediately if bs->quiesce_concluded is true. The field is reset in bdrv_do_drained_end() when bs->quiesce_counter reaches 0. That would be a rather simple optimisation that could be done as the final patch in this series, and would ensure that we don't recurse back to parents that are already quiescent. Would you be happy with this change? Kevin
Re: [Qemu-devel] [PATCH v2 1/2 for-2.12?] qapi: Parameter gl of DisplayType now accept an enum
Hello, On Tue, Apr 10, 2018 at 03:33:35PM +0200, Gerd Hoffmann wrote: > > # @off: Disable OpenGL (default). Just to be sure, I have to add @ in front of all parameter, right? > > > > > + # 'on'Use OpenGL, pick context type automatically. > > > + # Would better be named 'auto' but is called 'on' for backward > > > + # compatibility with bool type. > > > > See below... > > > DisplayOptions was added in 2.12. This is a backwards-incompatible > > change in QMP (you CANNOT change 'bool' to 'DisplayGLMode' across > > releases, because the on-the-wire representation differs; pre-patch it > > would be "gl":true, post-patch it is "gl":"on"). So if it affects QMP, > > and we want it, this patch either HAS to go in 2.12, or we have to have > > more finesse (perhaps by using an 'alternate' type in the QAPI) so that > > it remains backwards-compatible. > > > > /me goes and looks at introspection output... > > > > You may be in luck - there is no instance of 'window-close' in the > > introspection output, which means 'DisplayType' exists only for ease of > > command-line parsing and is not currently used by QMP, so tweaks here > > are not affecting the command line. > > Yes, right now the struct is only used to store the parsed command line > opts, so no effect on QMP. > > Plan for the future is to also parse command line options with generic > qapi/json code instead of the home-grown parser, but that switch didn't > happen yet. > > > That said, you can STILL name the enum value something smarter than 'on' > > IF you make the change now, for 2.12, given that the QAPI type was only > > introduced in 2.12 (you only have to worry about backwards compatibility > > if 2.11 already parsed gl=on). > > gl=on is older, so that must continue to work. Making both "on" and > "auto" work isn't a problem for our home-grown parser (aka > parse_display() in vl.c). But having quirks like this makes the switch > to generic parser code more difficuilt, so I'd prefer to avoid that ... Is it possible to upstream this change before 2.12 release? > > cheers, > Gerd >
Re: [Qemu-devel] [PATCH] migration: discard RAMBlocks of type ram_device
On 12/04/2018 15:51, Peter Maydell wrote: > Paolo may have an opinion what the API here should be, but > at the MemoryRegion level we already have a mix of functions > memory_region_init_foo_nomigrate() and memory_region_init_foo(), > which at the moment just control whether we call > vmstate_register_ram() or not. Is it possible to hook into that, > so that we default to marking RAMBlocks as not-migrated, and > when vmstate_register_ram() is called we set the flag to > "migrated"? Yes, that sounds pretty. Paolo > NB: this probably requires careful thought to make sure we > don't accidentally break migration compat, but it seems like > the cleaner approach. At the moment we do weird things if you > migrate a RAM block that hasn't had its ID string set, IIRC, > so just not migrating those RAM blocks seems like a better > plan than mishandling them. It would also mean that the > vmstate_register/unregister_ram() functions did what they > claim to do based on the function name, I think.
[Qemu-devel] [PATCH for-2.12] linux-user/signal.c: Put AArch64 frame record in the right place
AArch64 stack frames include a 'frame record' which holds a pointer to the next frame record in the chain and the LR on entry to the function. The procedure calling standard doesn't mandate where exactly this frame record is in the stack frame, but for signal frames the kernel puts it right at the top. We used to put it there too, but in commit 7f0f4208b3a96f22 we accidentally put the "enlarge to the 4K reserved space minimum" check after the "allow for the frame record" code, rather than before it, with the effect that the frame record would be inside the reserved space and immediately after the last used part of it. Move the frame record back out of the reserved space to where we used to put it. This bug shouldn't break any sensible guest code, but test programs that deliberately look at the internal details of the signal frame layout will not find what they are expecting to see. Fixes: 7f0f4208b3a96f22 Signed-off-by: Peter Maydell--- I'm marking this as for-2.12 on the basis that it puts our frame layout back to exactly what 2.11 had, and so seems safest. No sensible guest code should really care, though, so this is in the "only if we're doing an rc4" bucket; but I think that the softfloat fixes deserve an rc4 anyway. linux-user/signal.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/linux-user/signal.c b/linux-user/signal.c index 8d9e6e8410..e6dfe0adfd 100644 --- a/linux-user/signal.c +++ b/linux-user/signal.c @@ -1843,6 +1843,12 @@ static void target_setup_frame(int usig, struct target_sigaction *ka, layout.total_size += sizeof(struct target_aarch64_ctx); } +/* We must always provide at least the standard 4K reserved space, + * even if we don't use all of it (this is part of the ABI) + */ +layout.total_size = MAX(layout.total_size, +sizeof(struct target_rt_sigframe)); + /* Reserve space for the return code. On a real system this would * be within the VDSO. So, despite the name this is not a "real" * record within the frame. @@ -1850,12 +1856,6 @@ static void target_setup_frame(int usig, struct target_sigaction *ka, fr_ofs = layout.total_size; layout.total_size += sizeof(struct target_rt_frame_record); -/* We must always provide at least the standard 4K reserved space, - * even if we don't use all of it (this is part of the ABI) - */ -layout.total_size = MAX(layout.total_size, -sizeof(struct target_rt_sigframe)); - frame_addr = get_sigframe(ka, env, layout.total_size); trace_user_setup_frame(env, frame_addr); if (!lock_user_struct(VERIFY_WRITE, frame, frame_addr, 0)) { -- 2.16.2
[Qemu-devel] [PATCH qemu v3] device_tree: Increase FDT_MAX_SIZE to 1 MiB
It is not uncommon for a contemporary FDT to be larger than 64 KiB, leading to failures loading the device tree from sysfs: qemu-system-aarch64: qemu_fdt_setprop: Couldn't set ...: FDT_ERR_NOSPACE Hence increase the limit to 1 MiB, like on PPC. For reference, the largest arm64 DTB created from the Linux sources is ca. 75 KiB large (100 KiB when built with symbols/fixup support). Signed-off-by: Geert Uytterhoeven--- v3: - Update example size figures, v2: - Enlarge from 128 KiB to 1 MiB, as suggested by Peter Maydell. --- device_tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/device_tree.c b/device_tree.c index 19458b32bf81e55e..52c3358a55838d33 100644 --- a/device_tree.c +++ b/device_tree.c @@ -29,7 +29,7 @@ #include -#define FDT_MAX_SIZE 0x1 +#define FDT_MAX_SIZE 0x10 void *create_device_tree(int *sizep) { -- 2.7.4
Re: [Qemu-devel] [PATCH] migration: discard RAMBlocks of type ram_device
On 12 April 2018 at 14:41, Cédric Le Goaterwrote: > On 04/12/2018 02:08 PM, Peter Maydell wrote: >> On 12 April 2018 at 12:53, Dr. David Alan Gilbert >> wrote: >>> * Peter Maydell (peter.mayd...@linaro.org) wrote: David suggested on IRC that we would want a flag on the ramblock for "not migratable", because there are other uses for "don't migrate this" than just "is this a ram device". >>> >>> My original suggestion to your series was with a flag, but I'd forgotten >>> about that by the time I'd made the suggestion to Cédric. >>> In your case would just adding an extra term to the >>> ram_block_is_migratable function work, or do you really need a flag? >> >> I don't see how else you would identify the ram block that needs >> to be skipped. Also I think it's just better design to decouple >> the decision about "should we migrate this ram block" from the >> migration code itself, and push it up to the code layer that knows >> it's creating ram blocks that shouldn't be migrated. > > Do you mean adding a new RAMBlock flag RAM_NON_MIGRATABLE > in exec.c ? That would require to add an extra bool to the > following functions : > > memory_region_init_ram_ptr() > qemu_ram_alloc_from_ptr() > qemu_ram_alloc_internal() > > Would that be ok ? Paolo may have an opinion what the API here should be, but at the MemoryRegion level we already have a mix of functions memory_region_init_foo_nomigrate() and memory_region_init_foo(), which at the moment just control whether we call vmstate_register_ram() or not. Is it possible to hook into that, so that we default to marking RAMBlocks as not-migrated, and when vmstate_register_ram() is called we set the flag to "migrated"? NB: this probably requires careful thought to make sure we don't accidentally break migration compat, but it seems like the cleaner approach. At the moment we do weird things if you migrate a RAM block that hasn't had its ID string set, IIRC, so just not migrating those RAM blocks seems like a better plan than mishandling them. It would also mean that the vmstate_register/unregister_ram() functions did what they claim to do based on the function name, I think. thanks -- PMM
Re: [Qemu-devel] [PATCH 07/19] block: Really pause block jobs on drain
On 12/04/2018 15:27, Kevin Wolf wrote: > Not sure I follow. Let's look at an example. Say, we have a block job > BlockBackend as the root (because that uses proper layering, unlike > devices which use aio_disable_external()), connected to a qcow2 node > over file. > > 1. The block job issues a request to the qcow2 node > > 2. In order to process that request, the qcow2 node internally issues >one or more requests to the file node > > 3. Someone calls bdrv_drained_begin(qcow2_node) > > 4. We call BlockDriver.bdrv_drained_begin() for qcow2_node and >file_node, and BdrvChildRole.drained_begin() for the block job. > >Now the block nodes don't create any new original requests any more >(qcow2 and file don't do that anyway; qcow2 only creates requests to >its children in the context of parent requests). The job potentially >continues sending requests until it reaches the next pause point. The >previously issued requests are still in flight. > >Is this step what you meant by X->drv->bdrv_drain(X)? I don't see why >pending requests can only be in X's children. Why can't the request >be pending in X itself, say waiting for a thread pool worker >decrypting a buffer? No, that step is ->bdrv_co_drain_begin in BlockDriver. It's where the "last" requests are sent to file_node after we know that qcow2_node won't get any more requests. >Also, note that after this series, the driver callbacks are called >asynchronously, but I don't think it makes a big difference here. > > 5. The file_node request completes. file_node doesn't have any requests >in flight any more, but in theory it could still get new requests >from qcow2_node. Anyway, let's say this is the last request, so I >think we'd call its requests concluded? No, if it can still get more requests they're not concluded. That's why we need to first ensure qcow2_node is quiescent, and before then we need to ensure that the BlockBackends are quiescent (in this case meaning the job has reached its pause point). Only then we can look at file_node. In this case we'll see that we have nothing to do---file_node is already quiescent---and bdrv_drained_begin() can return. > 6. qcow2 still has a request in flight, but doesn't need to access >file_node for it. It finishes the work and therefore concludes its >requests as well. Note that qcow2_node (the parent) concludes after >file_node (the child). > > 7. We'll keep the example simple, so after completion of its request, >the job reaches a pause point without sending a new request. Again, >this happens after qcow2_node has concluded. > > 8. Only when neither file_node nor qcow2_node have a request in flight >and the job has reached a pause point, bdrv_drained_begin() can >return. > > So completing the last request and reaching an actually quiescent state > looks very much like a process in child-to-parent order to me?
Re: [Qemu-devel] [PATCH] migration: discard RAMBlocks of type ram_device
On 04/12/2018 02:08 PM, Peter Maydell wrote: > On 12 April 2018 at 12:53, Dr. David Alan Gilbertwrote: >> * Peter Maydell (peter.mayd...@linaro.org) wrote: >>> David suggested on IRC that we would want a flag on the ramblock >>> for "not migratable", because there are other uses for "don't >>> migrate this" than just "is this a ram device". >> >> My original suggestion to your series was with a flag, but I'd forgotten >> about that by the time I'd made the suggestion to Cédric. >> In your case would just adding an extra term to the >> ram_block_is_migratable function work, or do you really need a flag? > > I don't see how else you would identify the ram block that needs > to be skipped. Also I think it's just better design to decouple > the decision about "should we migrate this ram block" from the > migration code itself, and push it up to the code layer that knows > it's creating ram blocks that shouldn't be migrated. Do you mean adding a new RAMBlock flag RAM_NON_MIGRATABLE in exec.c ? That would require to add an extra bool to the following functions : memory_region_init_ram_ptr() qemu_ram_alloc_from_ptr() qemu_ram_alloc_internal() Would that be ok ? Thanks, C.
Re: [Qemu-devel] [Bug 1759264] Re: fpu/softfloat: round_to_int_and_pack refactor broke TriCore ftoi insns
Bastian Koppelmannwrites: > On 04/11/2018 01:01 PM, Alex Bennée wrote: >> Bastian Koppelmann writes: >> >>> On 04/10/2018 10:07 PM, Alex Bennée wrote: Yeah it looks like it was missed, the round_to_uint code does it. Do you have a test case I can verify? >>> >>> For the NaN input 0x the expected result for the flags is that >>> flag_invalid is raised. >>> >>> I can provide you with some TriCore asm, but it is a bit of pain to get >>> the gnu assembler to build, since the public version is a decade old. >> >> I'll trust you if you send me a static binary for this particular >> verification ;-) > > I set up a github repo with working binutils and the corresponding > testcase: > > https://github.com/bkoppelmann/tricore-fpu > > The one caveat is, that we cannot produce any binaries with the TriCore > ISA > 1.3. > > In this testcase the last ftoi instruction is supposed to raise the > invalid flag and return 0 since the input was NaN. We did that by only > checking for NaN if any flag was raised after ftoi, then do the NaN > check, and if positive, return 0. Well it builds and I get an fpu-test.elf but I'm a bit stuck on how to run it. What are the runes for launching the test? > > Cheers, > Bastian -- Alex Bennée
Re: [Qemu-devel] [PATCH] block/gluster: defend on legacy ftruncate api use
This change looks good to me, but a commit message would have been helpful. I suggest something like this: Gluster 4.0 changed the signature of glfs_ftruncate(). The function now has two additional arguments, namely prestat and poststat. These provide not benefit for QEMU, so ignoring them and passing NULL makes the gluster-block driver compile with the new Gluster version again. And maybe add this too: Glusters libgfapi uses symbol versioning and provides backwards compatible functions. Binaries compiled against previous versions of Gluster keep on functioning with the new library. Thanks! Reviewed-by: Niels de VosOn Thu, Apr 12, 2018 at 04:46:14PM +0530, Prasanna Kumar Kalever wrote: > Signed-off-by: Prasanna Kumar Kalever > --- > block/gluster.c | 15 +-- > configure | 8 > 2 files changed, 21 insertions(+), 2 deletions(-) > > diff --git a/block/gluster.c b/block/gluster.c > index 4adc1a875b..2474580ad6 100644 > --- a/block/gluster.c > +++ b/block/gluster.c > @@ -996,6 +996,7 @@ static int qemu_gluster_do_truncate(struct glfs_fd *fd, > int64_t offset, > PreallocMode prealloc, Error **errp) > { > int64_t current_length; > +int ret; > > current_length = glfs_lseek(fd, 0, SEEK_END); > if (current_length < 0) { > @@ -1023,7 +1024,12 @@ static int qemu_gluster_do_truncate(struct glfs_fd > *fd, int64_t offset, > #endif /* CONFIG_GLUSTERFS_FALLOCATE */ > #ifdef CONFIG_GLUSTERFS_ZEROFILL > case PREALLOC_MODE_FULL: > -if (glfs_ftruncate(fd, offset)) { > +#ifdef CONFIG_GLUSTERFS_LEGACY_FTRUNCATE > +ret = glfs_ftruncate(fd, offset); > +#else > +ret = glfs_ftruncate(fd, offset, NULL, NULL); > +#endif > +if (ret) { > error_setg_errno(errp, errno, "Could not resize file"); > return -errno; > } > @@ -1034,7 +1040,12 @@ static int qemu_gluster_do_truncate(struct glfs_fd > *fd, int64_t offset, > break; > #endif /* CONFIG_GLUSTERFS_ZEROFILL */ > case PREALLOC_MODE_OFF: > -if (glfs_ftruncate(fd, offset)) { > +#ifdef CONFIG_GLUSTERFS_LEGACY_FTRUNCATE > +ret = glfs_ftruncate(fd, offset); > +#else > +ret = glfs_ftruncate(fd, offset, NULL, NULL); > +#endif > +if (ret) { > error_setg_errno(errp, errno, "Could not resize file"); > return -errno; > } > diff --git a/configure b/configure > index 0a19b033bc..69827b0098 100755 > --- a/configure > +++ b/configure > @@ -429,6 +429,7 @@ glusterfs_xlator_opt="no" > glusterfs_discard="no" > glusterfs_fallocate="no" > glusterfs_zerofill="no" > +glusterfs_legacy_ftruncate="no" > gtk="" > gtkabi="" > gtk_gl="no" > @@ -3856,6 +3857,9 @@ if test "$glusterfs" != "no" ; then >glusterfs_fallocate="yes" >glusterfs_zerofill="yes" > fi > +if ! $pkg_config --atleast-version=7.4 glusterfs-api; then > + glusterfs_legacy_ftruncate="yes" > +fi >else > if test "$glusterfs" = "yes" ; then >feature_not_found "GlusterFS backend support" \ > @@ -6502,6 +6506,10 @@ if test "$glusterfs_zerofill" = "yes" ; then >echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak > fi > > +if test "$glusterfs_legacy_ftruncate" = "yes" ; then > + echo "CONFIG_GLUSTERFS_LEGACY_FTRUNCATE=y" >> $config_host_mak > +fi > + > if test "$libssh2" = "yes" ; then >echo "CONFIG_LIBSSH2=m" >> $config_host_mak >echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak > -- > 2.14.3 >
Re: [Qemu-devel] [PATCH 07/19] block: Really pause block jobs on drain
Am 12.04.2018 um 14:02 hat Paolo Bonzini geschrieben: > On 12/04/2018 13:53, Kevin Wolf wrote: > >> The problem I have is that there is a direction through which I/O flows > >> (parent-to-child), so why can't draining follow that natural direction. > >> Having to check for the parents' I/O, while draining the child, seems > >> wrong. Perhaps we can't help it, but I cannot understand the reason. > > I'm not sure what's there that could be not understood. You already > > confirmed that we need to drain the parents, too, when we drain a node. > > Drain really must propagate in the opposite direction of I/O, because > > part of its job is to quiesce the origin of any I/O to the node that > > should be drained. Opposite of I/O _is_ the natural direction for drain. > > Opposite of I/O is the natural direction for drain to propagate, yes. > > However, I/O direction is the natural direction for requests to stop. > After quiescing X and calling X->drv->bdrv_drain(X), there can be > pending requests only in X's children. So I don't understand why you > need to keep checking in_flight over the whole subgraph, when there are > roots that will conclude their request first, and then their children, > and so on so forth. Not sure I follow. Let's look at an example. Say, we have a block job BlockBackend as the root (because that uses proper layering, unlike devices which use aio_disable_external()), connected to a qcow2 node over file. 1. The block job issues a request to the qcow2 node 2. In order to process that request, the qcow2 node internally issues one or more requests to the file node 3. Someone calls bdrv_drained_begin(qcow2_node) 4. We call BlockDriver.bdrv_drained_begin() for qcow2_node and file_node, and BdrvChildRole.drained_begin() for the block job. Now the block nodes don't create any new original requests any more (qcow2 and file don't do that anyway; qcow2 only creates requests to its children in the context of parent requests). The job potentially continues sending requests until it reaches the next pause point. The previously issued requests are still in flight. Is this step what you meant by X->drv->bdrv_drain(X)? I don't see why pending requests can only be in X's children. Why can't the request be pending in X itself, say waiting for a thread pool worker decrypting a buffer? Also, note that after this series, the driver callbacks are called asynchronously, but I don't think it makes a big difference here. 5. The file_node request completes. file_node doesn't have any requests in flight any more, but in theory it could still get new requests from qcow2_node. Anyway, let's say this is the last request, so I think we'd call its requests concluded? 6. qcow2 still has a request in flight, but doesn't need to access file_node for it. It finishes the work and therefore concludes its requests as well. Note that qcow2_node (the parent) concludes after file_node (the child). 7. We'll keep the example simple, so after completion of its request, the job reaches a pause point without sending a new request. Again, this happens after qcow2_node has concluded. 8. Only when neither file_node nor qcow2_node have a request in flight and the job has reached a pause point, bdrv_drained_begin() can return. So completing the last request and reaching an actually quiescent state looks very much like a process in child-to-parent order to me? Kevin
Re: [Qemu-devel] [PATCH v1 2/2] fpu/softfloat: raise float_invalid for NaN in float_to_int
Peter Maydellwrites: > On 12 April 2018 at 12:58, Alex Bennée wrote: >> Fixes https://bugs.launchpad.net/qemu/+bug/1759264 >> >> Signed-off-by: Alex Bennée >> Cc: Bastian Koppelmann >> --- >> fpu/softfloat.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/fpu/softfloat.c b/fpu/softfloat.c >> index 9b99aa6ec8..ddc77c273c 100644 >> --- a/fpu/softfloat.c >> +++ b/fpu/softfloat.c >> @@ -1344,6 +1344,7 @@ static int64_t round_to_int_and_pack(FloatParts in, >> int rmode, >> case float_class_qnan: >> case float_class_dnan: >> case float_class_msnan: >> +s->float_exception_flags = orig_flags | float_flag_invalid; >> return max; >> case float_class_inf: >> return p.sign ? min : max; > > Don't we also need to raise the Invalid flag for float_class_inf ? > > In both cases, this is fixing a regression introduced in > commit ab52f973a50, I think. And I guess the round_to_uint_and_pack as well which doesn't handle the input inf case. > > thanks > -- PMM -- Alex Bennée
Re: [Qemu-devel] [PATCH v7 4/8] migration: Create socket-address parameter
* Juan Quintela (quint...@redhat.com) wrote: > It will be used to store the uri parameters. We want this only for > tcp, so we don't set it for other uris. We need it to know what port > is migration running. > > Signed-off-by: Juan Quintela> > -- > > This used to be uri parameter, but it has so many troubles to > reproduce that it don't just make sense. > > This used to be a port parameter. I was asked to move to > SocketAddress, done. > I also merged the setting of the migration tcp port in this one > because now I need to free the address, and this makes it easier. > This used to be x-socket-address with a single direction, now it is a > list of addresses. Is there a reason it's a parameter rather than just an entry in MigrationInfo? Dave > --- > hmp.c | 14 ++ > migration/migration.c | 25 + > migration/migration.h | 1 + > migration/socket.c| 11 +++ > qapi/migration.json | 13 +++-- > qapi/sockets.json | 13 + > 6 files changed, 75 insertions(+), 2 deletions(-) > > diff --git a/hmp.c b/hmp.c > index a25c7bd9a8..caf94345c9 100644 > --- a/hmp.c > +++ b/hmp.c > @@ -355,6 +355,20 @@ void hmp_info_migrate_parameters(Monitor *mon, const > QDict *qdict) > monitor_printf(mon, "%s: %" PRIu64 "\n", > MigrationParameter_str(MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE), > params->xbzrle_cache_size); > +if (params->has_socket_address) { > +SocketAddressList *addr; > + > +monitor_printf(mon, "%s: [\n", > +MigrationParameter_str(MIGRATION_PARAMETER_SOCKET_ADDRESS)); > + > +for (addr = params->socket_address; addr; addr = addr->next) { > +char *s = SocketAddress_to_str("", addr->value, > + false, false); > +monitor_printf(mon, "\t%s\n", s); > +} > + > +monitor_printf(mon, "]\n"); > +} > } > > qapi_free_MigrationParameters(params); > diff --git a/migration/migration.c b/migration/migration.c > index 58bd382730..71fa4c8176 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -31,6 +31,8 @@ > #include "migration/vmstate.h" > #include "block/block.h" > #include "qapi/error.h" > +#include "qapi/clone-visitor.h" > +#include "qapi/qapi-visit-sockets.h" > #include "qapi/qapi-commands-migration.h" > #include "qapi/qapi-events-migration.h" > #include "qapi/qmp/qerror.h" > @@ -277,6 +279,21 @@ int migrate_send_rp_req_pages(MigrationIncomingState > *mis, const char *rbname, > return migrate_send_rp_message(mis, msg_type, msglen, bufc); > } > > +void migrate_set_address(SocketAddress *address) > +{ > +MigrationState *s = migrate_get_current(); > +SocketAddressList *addrs; > + > +addrs = g_new0(SocketAddressList, 1); > +addrs->next = s->parameters.socket_address; > +s->parameters.socket_address = addrs; > + > +if (!s->parameters.has_socket_address) { > +s->parameters.has_socket_address = true; > +} > +addrs->value = QAPI_CLONE(SocketAddress, address); > +} > + > void qemu_start_incoming_migration(const char *uri, Error **errp) > { > const char *p; > @@ -564,6 +581,11 @@ MigrationParameters *qmp_query_migrate_parameters(Error > **errp) > params->x_multifd_page_count = s->parameters.x_multifd_page_count; > params->has_xbzrle_cache_size = true; > params->xbzrle_cache_size = s->parameters.xbzrle_cache_size; > +if (s->parameters.socket_address) { > +params->has_socket_address = true; > +params->socket_address = > +QAPI_CLONE(SocketAddressList, s->parameters.socket_address); > +} > > return params; > } > @@ -2571,6 +2593,9 @@ static void migration_instance_finalize(Object *obj) > qemu_mutex_destroy(>error_mutex); > g_free(params->tls_hostname); > g_free(params->tls_creds); > +if (params->socket_address) { > +qapi_free_SocketAddressList(params->socket_address); > +} > qemu_sem_destroy(>pause_sem); > error_free(ms->error); > } > diff --git a/migration/migration.h b/migration/migration.h > index 8d2f320c48..4774ee305f 100644 > --- a/migration/migration.h > +++ b/migration/migration.h > @@ -241,5 +241,6 @@ int migrate_send_rp_req_pages(MigrationIncomingState > *mis, const char* rbname, > > void dirty_bitmap_mig_before_vm_start(void); > void init_dirty_bitmap_incoming_migration(void); > +void migrate_set_address(SocketAddress *address); > > #endif > diff --git a/migration/socket.c b/migration/socket.c > index 122d8ccfbe..5195fd57c5 100644 > --- a/migration/socket.c > +++ b/migration/socket.c > @@ -15,6 +15,7 @@ > */ > > #include "qemu/osdep.h" > +#include "qemu/cutils.h" > > #include "qemu-common.h" > #include "qemu/error-report.h" > @@ -152,6 +153,7 @@ static void socket_start_incoming_migration(SocketAddress >
[Qemu-devel] [PATCH v2 5/5] qmp: add pmemload command
Adapted patch from Baojun Wang [1] with the following commit message: I found this could be useful to have qemu-softmmu as a cross debugger (launch with -s -S command line option), then if we can have a command to load guest physical memory, we can use cross gdb to do some target debug which gdb cannot do directly. pmemload is necessary to directly write physical memory which is not possible with gdb alone as it uses only logical addresses. The QAPI for pmemload uses "val" as parameter name for the physical address. This name is not very descriptive but is consistent with the existing pmemsave. Changing the parameter name of pmemsave is not possible without breaking the existing API. [1]: https://lists.gnu.org/archive/html/qemu-trivial/2014-04/msg00074.html Based-on-patch-by: Baojun WangSigned-off-by: Simon Ruderich --- cpus.c | 41 + hmp-commands.hx | 14 ++ hmp.c | 12 hmp.h | 1 + qapi/misc.json | 20 5 files changed, 88 insertions(+) diff --git a/cpus.c b/cpus.c index d256d8e9b4..c690dc205f 100644 --- a/cpus.c +++ b/cpus.c @@ -2314,6 +2314,47 @@ exit: qemu_close(fd); } +void qmp_pmemload(int64_t addr, int64_t size, int64_t offset, + const char *filename, Error **errp) +{ +int fd; +size_t l; +ssize_t r; +uint8_t buf[1024]; + +fd = qemu_open(filename, O_RDONLY | O_BINARY); +if (fd < 0) { +error_setg_file_open(errp, errno, filename); +return; +} +if (offset > 0) { +if (lseek(fd, offset, SEEK_SET) != offset) { +error_setg_errno(errp, errno, + "could not seek to offset %" PRIx64, offset); +goto exit; +} +} + +while (size != 0) { +l = sizeof(buf); +if (l > size) { +l = size; +} +r = read(fd, buf, l); +if (r <= 0) { +error_setg(errp, QERR_IO_ERROR); +goto exit; +} +l = r; /* in case of short read */ +cpu_physical_memory_write(addr, buf, l); +addr += l; +size -= l; +} + +exit: +qemu_close(fd); +} + void qmp_inject_nmi(Error **errp) { nmi_monitor_handle(monitor_get_cpu_index(), errp); diff --git a/hmp-commands.hx b/hmp-commands.hx index 35d862a5d2..a392d0e379 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -822,6 +822,20 @@ STEXI @item pmemsave @var{addr} @var{size} @var{file} @findex pmemsave save to disk physical memory dump starting at @var{addr} of size @var{size}. +ETEXI + +{ +.name = "pmemload", +.args_type = "val:l,size:i,offset:i,filename:s", +.params = "addr size offset file", +.help = "load from disk physical memory dump starting at 'addr' of size 'size' at file offset 'offset'", +.cmd= hmp_pmemload, +}, + +STEXI +@item pmemload @var{addr} @var{size} @var{offset} @var{file} +@findex pmemload +load from disk physical memory dump starting at @var{addr} of size @var{size} at file offset @var{offset}. ETEXI { diff --git a/hmp.c b/hmp.c index 1e392055f7..c9b37dbfbd 100644 --- a/hmp.c +++ b/hmp.c @@ -1090,6 +1090,18 @@ void hmp_pmemsave(Monitor *mon, const QDict *qdict) hmp_handle_error(mon, ); } +void hmp_pmemload(Monitor *mon, const QDict *qdict) +{ +uint64_t size = qdict_get_int(qdict, "size"); +uint64_t offset = qdict_get_int(qdict, "offset"); +const char *filename = qdict_get_str(qdict, "filename"); +uint64_t addr = qdict_get_int(qdict, "val"); +Error *err = NULL; + +qmp_pmemload(addr, size, offset, filename, ); +hmp_handle_error(mon, ); +} + void hmp_ringbuf_write(Monitor *mon, const QDict *qdict) { const char *chardev = qdict_get_str(qdict, "device"); diff --git a/hmp.h b/hmp.h index 4e2ec375b0..0a69e371ca 100644 --- a/hmp.h +++ b/hmp.h @@ -47,6 +47,7 @@ void hmp_system_powerdown(Monitor *mon, const QDict *qdict); void hmp_cpu(Monitor *mon, const QDict *qdict); void hmp_memsave(Monitor *mon, const QDict *qdict); void hmp_pmemsave(Monitor *mon, const QDict *qdict); +void hmp_pmemload(Monitor *mon, const QDict *qdict); void hmp_ringbuf_write(Monitor *mon, const QDict *qdict); void hmp_ringbuf_read(Monitor *mon, const QDict *qdict); void hmp_cont(Monitor *mon, const QDict *qdict); diff --git a/qapi/misc.json b/qapi/misc.json index 5636f4a149..2255d219fa 100644 --- a/qapi/misc.json +++ b/qapi/misc.json @@ -1185,6 +1185,26 @@ { 'command': 'pmemsave', 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} } +## +# @pmemload: +# +# Load a portion of guest physical memory from a file. +# +# @val: the physical address of the guest to start from +# +# @size: the size of memory region to load +# +# @offset: the offset in the file to start from +# +# @filename: the file to load the memory from as binary data +#
[Qemu-devel] [PATCH v2 1/5] cpus: correct coding style in qmp_memsave/qmp_pmemsave
Signed-off-by: Simon Ruderich--- cpus.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/cpus.c b/cpus.c index 38eba8bff3..c78f430532 100644 --- a/cpus.c +++ b/cpus.c @@ -2263,8 +2263,9 @@ void qmp_memsave(int64_t addr, int64_t size, const char *filename, while (size != 0) { l = sizeof(buf); -if (l > size) +if (l > size) { l = size; +} if (cpu_memory_rw_debug(cpu, addr, buf, l, 0) != 0) { error_setg(errp, "Invalid addr 0x%016" PRIx64 "/size %" PRId64 " specified", orig_addr, orig_size); @@ -2297,8 +2298,9 @@ void qmp_pmemsave(int64_t addr, int64_t size, const char *filename, while (size != 0) { l = sizeof(buf); -if (l > size) +if (l > size) { l = size; +} cpu_physical_memory_read(addr, buf, l); if (fwrite(buf, 1, l, f) != l) { error_setg(errp, QERR_IO_ERROR); -- 2.15.0
[Qemu-devel] [PATCH v2 4/5] hmp: don't truncate size in hmp_memsave/hmp_pmemsave
The called function takes an uint64_t as size parameter and qdict_get_int() returns an uint64_t. Don't truncate it needlessly to an uint32_t. Signed-off-by: Simon Ruderich--- hmp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hmp.c b/hmp.c index a25c7bd9a8..1e392055f7 100644 --- a/hmp.c +++ b/hmp.c @@ -1064,7 +1064,7 @@ void hmp_cpu(Monitor *mon, const QDict *qdict) void hmp_memsave(Monitor *mon, const QDict *qdict) { -uint32_t size = qdict_get_int(qdict, "size"); +uint64_t size = qdict_get_int(qdict, "size"); const char *filename = qdict_get_str(qdict, "filename"); uint64_t addr = qdict_get_int(qdict, "val"); Error *err = NULL; @@ -1081,7 +1081,7 @@ void hmp_memsave(Monitor *mon, const QDict *qdict) void hmp_pmemsave(Monitor *mon, const QDict *qdict) { -uint32_t size = qdict_get_int(qdict, "size"); +uint64_t size = qdict_get_int(qdict, "size"); const char *filename = qdict_get_str(qdict, "filename"); uint64_t addr = qdict_get_int(qdict, "val"); Error *err = NULL; -- 2.15.0
[Qemu-devel] [PATCH v2 2/5] cpus: convert qmp_memsave/qmp_pmemsave to use qemu_open
qemu_open() allow passing file descriptors to qemu which is used in restricted environments like libvirt where open() is prohibited. Suggested-by: Eric BlakeSigned-off-by: Simon Ruderich --- cpus.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/cpus.c b/cpus.c index c78f430532..292d5b94b1 100644 --- a/cpus.c +++ b/cpus.c @@ -2238,7 +2238,7 @@ CpuInfoFastList *qmp_query_cpus_fast(Error **errp) void qmp_memsave(int64_t addr, int64_t size, const char *filename, bool has_cpu, int64_t cpu_index, Error **errp) { -FILE *f; +int fd; uint32_t l; CPUState *cpu; uint8_t buf[1024]; @@ -2255,8 +2255,8 @@ void qmp_memsave(int64_t addr, int64_t size, const char *filename, return; } -f = fopen(filename, "wb"); -if (!f) { +fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0600); +if (fd < 0) { error_setg_file_open(errp, errno, filename); return; } @@ -2271,7 +2271,7 @@ void qmp_memsave(int64_t addr, int64_t size, const char *filename, " specified", orig_addr, orig_size); goto exit; } -if (fwrite(buf, 1, l, f) != l) { +if (qemu_write_full(fd, buf, l) != l) { error_setg(errp, QERR_IO_ERROR); goto exit; } @@ -2280,18 +2280,18 @@ void qmp_memsave(int64_t addr, int64_t size, const char *filename, } exit: -fclose(f); +qemu_close(fd); } void qmp_pmemsave(int64_t addr, int64_t size, const char *filename, Error **errp) { -FILE *f; +int fd; uint32_t l; uint8_t buf[1024]; -f = fopen(filename, "wb"); -if (!f) { +fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0600); +if (fd < 0) { error_setg_file_open(errp, errno, filename); return; } @@ -2302,7 +2302,7 @@ void qmp_pmemsave(int64_t addr, int64_t size, const char *filename, l = size; } cpu_physical_memory_read(addr, buf, l); -if (fwrite(buf, 1, l, f) != l) { +if (qemu_write_full(fd, buf, l) != l) { error_setg(errp, QERR_IO_ERROR); goto exit; } @@ -2311,7 +2311,7 @@ void qmp_pmemsave(int64_t addr, int64_t size, const char *filename, } exit: -fclose(f); +qemu_close(fd); } void qmp_inject_nmi(Error **errp) -- 2.15.0
Re: [Qemu-devel] [PATCH/RFC 3/5] hw/arm/virt: Allow dynamic sysbus devices again
Hi Eric, On Wed, Feb 14, 2018 at 11:37 AM, Auger Ericwrote: > On 09/02/18 16:17, Geert Uytterhoeven wrote: >> Allow the instantation of generic dynamic sysbus devices again, without >> the need to create a new device-specific vfio type. >> >> This is a partial revert of commit 6f2062b9758ebc64 ("hw/arm/virt: >> Allow only supported dynamic sysbus devices"). >> >> Not-Yet-Signed-off-by: Geert Uytterhoeven >> --- >> hw/arm/virt.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/hw/arm/virt.c b/hw/arm/virt.c >> index b334c82edae9fb1f..fa83784fc08ed076 100644 >> --- a/hw/arm/virt.c >> +++ b/hw/arm/virt.c >> @@ -1596,6 +1596,7 @@ static void virt_machine_class_init(ObjectClass *oc, >> void *data) >> mc->max_cpus = 255; >> machine_class_allow_dynamic_sysbus_dev(mc, TYPE_VFIO_CALXEDA_XGMAC); >> machine_class_allow_dynamic_sysbus_dev(mc, TYPE_VFIO_AMD_XGBE); >> +machine_class_allow_dynamic_sysbus_dev(mc, TYPE_SYS_BUS_DEVICE); > Couldn't you limit that to TYPE_VFIO_PLATFORM instead? Yep, that works. Thanks! Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
[Qemu-devel] [PATCH v2 3/5] cpus: use size_t in qmp_memsave/qmp_pmemsave
It's the natural type for object sizes and matches the return value of sizeof(buf). Signed-off-by: Simon Ruderich--- cpus.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/cpus.c b/cpus.c index 292d5b94b1..d256d8e9b4 100644 --- a/cpus.c +++ b/cpus.c @@ -2239,7 +2239,7 @@ void qmp_memsave(int64_t addr, int64_t size, const char *filename, bool has_cpu, int64_t cpu_index, Error **errp) { int fd; -uint32_t l; +size_t l; CPUState *cpu; uint8_t buf[1024]; int64_t orig_addr = addr, orig_size = size; @@ -2287,7 +2287,7 @@ void qmp_pmemsave(int64_t addr, int64_t size, const char *filename, Error **errp) { int fd; -uint32_t l; +size_t l; uint8_t buf[1024]; fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0600); -- 2.15.0
Re: [Qemu-devel] [PATCH] qmp: add pmemload command
On Wed, Apr 11, 2018 at 08:02:58AM -0500, Eric Blake wrote: > You could always add qemu_fopen/qemu_fclose to match the existing > qemu_open/qemu_close. But you do have a point that you can't call > qemu_close/fclose (because fclose would be left with a stale fd that > might spuriously close something that has been opened in another thread > during the race window), nor fclose/qemu_close. > > The FILE interface may sometimes be easier to work with, but it also > comes with buffering issues that you don't have to worry about when > using the fd interface. Yeah. I'm using plain qemu_open()/qemu_close() and read/write directly. > Good point - but yes, ideally it should always be possible to pass in an > fd instead of having qemu itself open a file, because there are > execution environments where qemu is intentionally prohibited from > directly calling open() for security reasons (where the management app, > such as libvirt, opens and then passes fds to qemu instead). I've converted qmp_memsave/qmp_pmemsave in extra patches to use qemu_open() (and some additional cleanup). > You are correct that we can't really change the existing interface; so > documenting in the commit message that your choice of names is for > consistency reasons may be sufficient. Will do. Revised patches which should implement all of your suggestions (and the style issue noticed by the bot) as replies to this mail. Regards Simon -- + privacy is necessary + using gnupg http://gnupg.org + public key id: 0x92FEFDB7E44C32F9
Re: [Qemu-devel] [Bug Report] vm paused after succeeding to migrate
* linzhecheng (linzhech...@huawei.com) wrote: > Hi, all > I encounterd a bug when I try to migrate a windows vm. > > Enviroment information: > host A: cpu E5620(model WestmereEP without flag xsave) > host B: cpu E5-2643(model SandyBridgeEP with xsave) > > The reproduce steps is : > 1. Start a windows 2008 vm with -cpu host(which means host-passthrough). > 2. Migrate the vm to host B when cr4.OSXSAVE=0 (successfully). > 3. Vm runs on host B for a while so that cr4.OSXSAVE changes to 1. > 4. Then migrate the vm to host A (successfully), but vm was paused, and qemu > printed log as followed: Remember that migrating using -cpu host across different CPU models is NOT expected to work. > KVM: entry failed, hardware error 0x8021 > > If you're running a guest on an Intel machine without unrestricted mode > support, the failure can be most likely due to the guest entering an invalid > state for Intel VT. For example, the guest maybe running in big real mode > which is not supported on less recent Intel processors. > > EAX=019b3bb0 EBX=01a3ae80 ECX=01a61ce8 EDX= > ESI=01a62000 EDI= EBP= ESP=01718b20 > EIP=0185d982 EFL=0286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0 > ES = 9300 > CS =f000 9b00 > SS = 9300 > DS = 9300 > FS = 9300 > GS = 9300 > LDT= 8200 > TR = 8b00 > GDT= > IDT= > CR0=6010 CR2= CR3= CR4= > DR0= DR1= DR2= > DR3= > DR6=0ff0 DR7=0400 > EFER= > Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 > > I have found that problem happened when kvm_put_sregs returns err -22(called > by kvm_arch_put_registers(qemu)). > Because kvm_arch_vcpu_ioctl_set_sregs(kvm-mod) checked that guest_cpuid_has > no X86_FEATURE_XSAVE but cr4.OSXSAVE=1. > So should we cancel migration when kvm_arch_put_registers returns error? It would seem good if we can make the migration fail there rather than hitting that KVM error. It looks like we need to do a bit of plumbing to convert the places that call it to return a bool rather than void. Dave -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH v1 2/2] fpu/softfloat: raise float_invalid for NaN in float_to_int
On 04/12/2018 02:26 PM, Peter Maydell wrote: > On 12 April 2018 at 12:58, Alex Bennéewrote: >> Fixes https://bugs.launchpad.net/qemu/+bug/1759264 >> >> Signed-off-by: Alex Bennée >> Cc: Bastian Koppelmann >> --- >> fpu/softfloat.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/fpu/softfloat.c b/fpu/softfloat.c >> index 9b99aa6ec8..ddc77c273c 100644 >> --- a/fpu/softfloat.c >> +++ b/fpu/softfloat.c >> @@ -1344,6 +1344,7 @@ static int64_t round_to_int_and_pack(FloatParts in, >> int rmode, >> case float_class_qnan: >> case float_class_dnan: >> case float_class_msnan: >> +s->float_exception_flags = orig_flags | float_flag_invalid; >> return max; >> case float_class_inf: >> return p.sign ? min : max; > > Don't we also need to raise the Invalid flag for float_class_inf ? > Yup, as mentioned in the bug report. Cheers, Bastian