Re: [RFC PATCH 0/3] hw/net/tulip: Fix LP#1874539
On 2020/4/24 下午11:27, Helge Deller wrote: * Philippe Mathieu-Daudé : Attempt to fix the launchpad bug filled by Helge: In a qemu-system-hppa system, qemu release v5.0.0-rc, the tulip nic driver is broken. The tulip nic is detected, even getting DHCP info does work. But when trying to download bigger files via network, the tulip driver gets stuck. Philippe Mathieu-Daudé (3): hw/net/tulip: Fix 'Descriptor Error' definition hw/net/tulip: Log descriptor overflows hw/net/tulip: Set descriptor error bit when lenght is incorrect hw/net/tulip.h | 2 +- hw/net/tulip.c | 32 2 files changed, 29 insertions(+), 5 deletions(-) Philippe, thanks for your efforts. Sadly your patch did not fixed the bug itself, but it had some nice cleanups which should be included at some point. Regarding the tulip hang reported by me, the patch below does fix the issue. [PATCH] Fix tulip rx hang Cc: Prasad J Pandit Fixes: 8ffb7265af ("check frame size and r/w data length") Buglink: https://bugs.launchpad.net/bugs/1874539 Signed-off-by: Helge Deller Commit 8ffb7265af ("check frame size and r/w data length") introduced checks to prevent accesses outside of the rx/tx buffers. But the new checks were plain wrong. rx_frame_len does count backwards, and the surrounding code ensures that rx_frame_len will not be bigger than rx_frame_size. Remove those checks again. diff --git a/hw/net/tulip.c b/hw/net/tulip.c index 1295f51d07..59d21defcc 100644 --- a/hw/net/tulip.c +++ b/hw/net/tulip.c @@ -171,9 +171,6 @@ static void tulip_copy_rx_bytes(TULIPState *s, struct tulip_descriptor *desc) len = s->rx_frame_len; } -if (s->rx_frame_len + len > sizeof(s->rx_frame)) { -return; -} pci_dma_write(>dev, desc->buf_addr1, s->rx_frame + (s->rx_frame_size - s->rx_frame_len), len); s->rx_frame_len -= len; @@ -186,9 +183,6 @@ static void tulip_copy_rx_bytes(TULIPState *s, struct tulip_descriptor *desc) len = s->rx_frame_len; } -if (s->rx_frame_len + len > sizeof(s->rx_frame)) { -return; -} pci_dma_write(>dev, desc->buf_addr2, s->rx_frame + (s->rx_frame_size - s->rx_frame_len), len); s->rx_frame_len -= len; Looks good to me. Would you please send a formal patch and cc Peter. Consider we are about to release 5.0, it's better for him to apply the patch directly. Thanks
Re: [PATCH v2] qemu-sockets: add abstract UNIX domain socket support
Eric , Markus, any comments ? 在 2020/4/23 下午6:51, xiaoqiang.zhao 写道: 在 2020/4/23 下午5:00, Daniel P. Berrangé 写道: Adding Eric & Markus for QAPI modelling questions On Thu, Apr 23, 2020 at 11:56:40AM +0800, xiaoqiang zhao wrote: unix_connect_saddr now support abstract address type By default qemu does not support abstract UNIX domain socket address. Add this ability to make qemu handy when abstract address is needed. Was that a specific app you're using with QEMU that needs this ? Thanks for your reply ! I once use qemu to connect a unix domain socket server (with abstract type address written by android java code) Abstract address is marked by prefixing the address name with a '@'. For full support of the abstract namespace we would ned to allow the "sun_path" to contain an arbitrary mix of NULs and non-NULs characters, and allow connect() @addrlen to be an arbitrary size. This patch only allows a single initial NUL, and reqiures @addrlen to be the full size of sun_path, padding with trailing NULs. This limitation is impossible to lift with QEMU's current approach to UNIX sockets, as it relies on passing around a NULL terminated string, so there's no way to have embedded NULs. Since there's no explicit length, we have to chooose between forcing the full sun_path size as @addrlen, or forcing the string length as the @addrlen value. IIUC, socat makes the latter decision by default, but has a flag to switch to the former. [man socat] unix-tightsocklen=[0|1] On socket operations, pass a socket address length that does not include the whole struct sockaddr_un record but (besides other compo‐ nents) only the relevant part of the filename or abstract string. Default is 1. [/man] This actually is supported for both abstract and non-abstract sockets, though IIUC this doesn't make a semantic difference for non-abstract sockets. The point is we have four possible combinations NON-ABSTRACT + FULL SIZE NON-ABSTRACT + MINIMAL SIZE (default) ABSTRACT + FULL SIZE ABSTRACT + MINIMAL SIZE (default) With your patch doing the latter, it means QEMU supports only two combinations NON+ABSTRACT + FULL SIZE ABSTRACT + MINIMAL SIZE and also can't use "@somerealpath" for a non-abstract socket, though admittedly this is unlikely. Socat uses a special option to request use of abstract sockets. eg ABSTRACT:somepath, and automatically adds the leading NUL, so there's no need for a special "@" character. This means that UNIX:@somepath still resolves to a filesystem path and not a abstract socket path. Finally, the patch as only added support for connect() not listen(). I think if QEMU wants to support abstract sockets we must do both, and also have unit tests added to tests/test-util-sockets.c Yes , I missed these parts. The question is whether we're ok with this simple approach in QEMU, or should do a full approach with more explicit modelling. Agree, more comments is welcome. ie should we change QAPI thus: { 'struct': 'UnixSocketAddress', 'data': { 'path': 'str', 'tight': 'bool', 'abstract': 'bool' } } where 'tight' is a flag indicating whether to set @addrlen to the minimal string length, or the maximum sun_path length. And 'abstract' indicates that we automagically add a leading NUL. This would *not* allow for NULs in the middle of path, but I'm not so bothered about that, since I can't see that being widely used. If we really did need that it could be added via a 'base64': 'bool' flag, to indicate that @path is base64 encoded and thus may contain NULs From a CLI POV, this could be mapped to QAPI thus * -chardev unix:somepath @path==somepath @tight==false @abstract==false * -chardev unix:somepath,tight @path==somepath @tight==true @abstract==false * -chardev unix-abstract:somepath @path==somepath @tight==false @abstract==true * -chardev unix-abstract:somepath,tight @path==somepath @tight==true @abstract==true Regards, Daniel
Re: [PATCH] hw: add compat machines for 5.1
On Fri, Apr 24, 2020 at 11:03:14AM +0200, Cornelia Huck wrote: > Add 5.1 machine types for arm/i440fx/q35/s390x/spapr. > > Signed-off-by: Cornelia Huck Acked-by: David Gibson > --- > > Still keeping the default cpu model version on x86 at v1. > > I'll queue this to my s390-next branch, as I'm planning to send a pull > req as soon as 5.0 is out; if someone else wants to queue this, we'll > figure it out :) > > --- > hw/arm/virt.c | 9 - > hw/core/machine.c | 3 +++ > hw/i386/pc.c | 3 +++ > hw/i386/pc_piix.c | 14 +- > hw/i386/pc_q35.c | 13 - > hw/ppc/spapr.c | 15 +-- > hw/s390x/s390-virtio-ccw.c | 14 +- > include/hw/boards.h| 3 +++ > include/hw/i386/pc.h | 3 +++ > 9 files changed, 71 insertions(+), 6 deletions(-) > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c > index 7dc96abf72cf..5e84c09402dd 100644 > --- a/hw/arm/virt.c > +++ b/hw/arm/virt.c > @@ -2309,15 +2309,22 @@ static void machvirt_machine_init(void) > } > type_init(machvirt_machine_init); > > +static void virt_machine_5_1_options(MachineClass *mc) > +{ > +} > +DEFINE_VIRT_MACHINE_AS_LATEST(5, 1) > + > static void virt_machine_5_0_options(MachineClass *mc) > { > static GlobalProperty compat[] = { > { TYPE_TPM_TIS_SYSBUS, "ppi", "false" }, > }; > > +virt_machine_5_1_options(mc); > +compat_props_add(mc->compat_props, hw_compat_5_0, hw_compat_5_0_len); > compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat)); > } > -DEFINE_VIRT_MACHINE_AS_LATEST(5, 0) > +DEFINE_VIRT_MACHINE(5, 0) > > static void virt_machine_4_2_options(MachineClass *mc) > { > diff --git a/hw/core/machine.c b/hw/core/machine.c > index c1a444cb7558..7a50dd518f4c 100644 > --- a/hw/core/machine.c > +++ b/hw/core/machine.c > @@ -28,6 +28,9 @@ > #include "hw/mem/nvdimm.h" > #include "migration/vmstate.h" > > +GlobalProperty hw_compat_5_0[] = {}; > +const size_t hw_compat_5_0_len = G_N_ELEMENTS(hw_compat_5_0); > + > GlobalProperty hw_compat_4_2[] = { > { "virtio-blk-device", "queue-size", "128"}, > { "virtio-scsi-device", "virtqueue_size", "128"}, > diff --git a/hw/i386/pc.c b/hw/i386/pc.c > index 5143c516531e..13e1d18cd758 100644 > --- a/hw/i386/pc.c > +++ b/hw/i386/pc.c > @@ -94,6 +94,9 @@ > #include "fw_cfg.h" > #include "trace.h" > > +GlobalProperty pc_compat_5_0[] = {}; > +const size_t pc_compat_5_0_len = G_N_ELEMENTS(pc_compat_5_0); > + > GlobalProperty pc_compat_4_2[] = { > { "mch", "smbase-smram", "off" }, > }; > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c > index 22dee0e76c62..921caa502b85 100644 > --- a/hw/i386/pc_piix.c > +++ b/hw/i386/pc_piix.c > @@ -419,7 +419,7 @@ static void pc_i440fx_machine_options(MachineClass *m) > machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE); > } > > -static void pc_i440fx_5_0_machine_options(MachineClass *m) > +static void pc_i440fx_5_1_machine_options(MachineClass *m) > { > PCMachineClass *pcmc = PC_MACHINE_CLASS(m); > pc_i440fx_machine_options(m); > @@ -428,6 +428,18 @@ static void pc_i440fx_5_0_machine_options(MachineClass > *m) > pcmc->default_cpu_version = 1; > } > > +DEFINE_I440FX_MACHINE(v5_1, "pc-i440fx-5.1", NULL, > + pc_i440fx_5_1_machine_options); > + > +static void pc_i440fx_5_0_machine_options(MachineClass *m) > +{ > +pc_i440fx_5_1_machine_options(m); > +m->alias = NULL; > +m->is_default = false; > +compat_props_add(m->compat_props, hw_compat_5_0, hw_compat_5_0_len); > +compat_props_add(m->compat_props, pc_compat_5_0, pc_compat_5_0_len); > +} > + > DEFINE_I440FX_MACHINE(v5_0, "pc-i440fx-5.0", NULL, >pc_i440fx_5_0_machine_options); > > diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c > index d37c425e2236..253688a9b8f3 100644 > --- a/hw/i386/pc_q35.c > +++ b/hw/i386/pc_q35.c > @@ -349,7 +349,7 @@ static void pc_q35_machine_options(MachineClass *m) > m->max_cpus = 288; > } > > -static void pc_q35_5_0_machine_options(MachineClass *m) > +static void pc_q35_5_1_machine_options(MachineClass *m) > { > PCMachineClass *pcmc = PC_MACHINE_CLASS(m); > pc_q35_machine_options(m); > @@ -357,6 +357,17 @@ static void pc_q35_5_0_machine_options(MachineClass *m) > pcmc->default_cpu_version = 1; > } > > +DEFINE_Q35_MACHINE(v5_1, "pc-q35-5.1", NULL, > + pc_q35_5_1_machine_options); > + > +static void pc_q35_5_0_machine_options(MachineClass *m) > +{ > +pc_q35_5_1_machine_options(m); > +m->alias = NULL; > +compat_props_add(m->compat_props, hw_compat_5_0, hw_compat_5_0_len); > +compat_props_add(m->compat_props, pc_compat_5_0, pc_compat_5_0_len); > +} > + > DEFINE_Q35_MACHINE(v5_0, "pc-q35-5.0", NULL, > pc_q35_5_0_machine_options); > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > index 9a2bd501aa6e..fd5bfd11a84c 100644 > ---
Re: [PATCH v5 0/4] introduction of migration_version attribute for VFIO live migration
On Sat, Apr 25, 2020 at 03:10:49AM +0800, Dr. David Alan Gilbert wrote: > * Yan Zhao (yan.y.z...@intel.com) wrote: > > On Tue, Apr 21, 2020 at 08:08:49PM +0800, Tian, Kevin wrote: > > > > From: Yan Zhao > > > > Sent: Tuesday, April 21, 2020 10:37 AM > > > > > > > > On Tue, Apr 21, 2020 at 06:56:00AM +0800, Alex Williamson wrote: > > > > > On Sun, 19 Apr 2020 21:24:57 -0400 > > > > > Yan Zhao wrote: > > > > > > > > > > > On Fri, Apr 17, 2020 at 07:24:57PM +0800, Cornelia Huck wrote: > > > > > > > On Fri, 17 Apr 2020 05:52:02 -0400 > > > > > > > Yan Zhao wrote: > > > > > > > > > > > > > > > On Fri, Apr 17, 2020 at 04:44:50PM +0800, Cornelia Huck wrote: > > > > > > > > > On Mon, 13 Apr 2020 01:52:01 -0400 > > > > > > > > > Yan Zhao wrote: > > > > > > > > > > > > > > > > > > > This patchset introduces a migration_version attribute > > > > > > > > > > under sysfs > > > > of VFIO > > > > > > > > > > Mediated devices. > > > > > > > > > > > > > > > > > > > > This migration_version attribute is used to check migration > > > > compatibility > > > > > > > > > > between two mdev devices. > > > > > > > > > > > > > > > > > > > > Currently, it has two locations: > > > > > > > > > > (1) under mdev_type node, > > > > > > > > > > which can be used even before device creation, but only > > > > > > > > > > for > > > > mdev > > > > > > > > > > devices of the same mdev type. > > > > > > > > > > (2) under mdev device node, > > > > > > > > > > which can only be used after the mdev devices are > > > > > > > > > > created, but > > > > the src > > > > > > > > > > and target mdev devices are not necessarily be of the > > > > > > > > > > same > > > > mdev type > > > > > > > > > > (The second location is newly added in v5, in order to keep > > > > consistent > > > > > > > > > > with the migration_version node for migratable pass-though > > > > devices) > > > > > > > > > > > > > > > > > > What is the relationship between those two attributes? > > > > > > > > > > > > > > > > > (1) is for mdev devices specifically, and (2) is provided to > > > > > > > > keep the > > > > same > > > > > > > > sysfs interface as with non-mdev cases. so (2) is for both mdev > > > > devices and > > > > > > > > non-mdev devices. > > > > > > > > > > > > > > > > in future, if we enable vfio-pci vendor ops, (i.e. a non-mdev > > > > > > > > device > > > > > > > > is binding to vfio-pci, but is able to register migration > > > > > > > > region and do > > > > > > > > migration transactions from a vendor provided affiliate driver), > > > > > > > > the vendor driver would export (2) directly, under device node. > > > > > > > > It is not able to provide (1) as there're no mdev devices > > > > > > > > involved. > > > > > > > > > > > > > > Ok, creating an alternate attribute for non-mdev devices makes > > > > > > > sense. > > > > > > > However, wouldn't that rather be a case (3)? The change here only > > > > > > > refers to mdev devices. > > > > > > > > > > > > > as you pointed below, (3) and (2) serve the same purpose. > > > > > > and I think a possible usage is to migrate between a non-mdev > > > > > > device and > > > > > > an mdev device. so I think it's better for them both to use (2) > > > > > > rather > > > > > > than creating (3). > > > > > > > > > > An mdev type is meant to define a software compatible interface, so in > > > > > the case of mdev->mdev migration, doesn't migrating to a different > > > > > type > > > > > fail the most basic of compatibility tests that we expect userspace to > > > > > perform? IOW, if two mdev types are migration compatible, it seems a > > > > > prerequisite to that is that they provide the same software interface, > > > > > which means they should be the same mdev type. > > > > > > > > > > In the hybrid cases of mdev->phys or phys->mdev, how does a > > > > management > > > > > tool begin to even guess what might be compatible? Are we expecting > > > > > libvirt to probe ever device with this attribute in the system? Is > > > > > there going to be a new class hierarchy created to enumerate all > > > > > possible migrate-able devices? > > > > > > > > > yes, management tool needs to guess and test migration compatible > > > > between two devices. But I think it's not the problem only for > > > > mdev->phys or phys->mdev. even for mdev->mdev, management tool needs > > > > to > > > > first assume that the two mdevs have the same type of parent devices > > > > (e.g.their pciids are equal). otherwise, it's still enumerating > > > > possibilities. > > > > > > > > on the other hand, for two mdevs, > > > > mdev1 from pdev1, its mdev_type is 1/2 of pdev1; > > > > mdev2 from pdev2, its mdev_type is 1/4 of pdev2; > > > > if pdev2 is exactly 2 times of pdev1, why not allow migration between > > > > mdev1 <-> mdev2. > > > > > > How could the manage tool figure out that 1/2 of pdev1 is equivalent > > > to 1/4 of pdev2? If we really want to allow such thing happen, the best > > > choice is to report the same
Re: [PATCH v4 1/3] memory: drop guest writes to read-only ram device regions
On Sat, Apr 25, 2020 at 06:55:33PM +0800, Paolo Bonzini wrote: > On 17/04/20 09:44, Yan Zhao wrote: > > for ram device regions, drop guest writes if the regions is read-only. > > > > Cc: Philippe Mathieu-Daudé > > Signed-off-by: Yan Zhao > > Signed-off-by: Xin Zeng > > --- > > memory.c | 7 +++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/memory.c b/memory.c > > index 601b749906..9576dd6807 100644 > > --- a/memory.c > > +++ b/memory.c > > @@ -34,6 +34,7 @@ > > #include "sysemu/accel.h" > > #include "hw/boards.h" > > #include "migration/vmstate.h" > > +#include "qemu/log.h" > > > > //#define DEBUG_UNASSIGNED > > > > @@ -1313,6 +1314,12 @@ static void memory_region_ram_device_write(void > > *opaque, hwaddr addr, > > MemoryRegion *mr = opaque; > > > > trace_memory_region_ram_device_write(get_cpu_index(), mr, addr, data, > > size); > > +if (mr->readonly) { > > +qemu_log_mask(LOG_GUEST_ERROR, > > + "Invalid write to read only ram device region 0x%" > > + HWADDR_PRIx" size %u\n", addr, size); > > +return; > > +} > > As mentioned in the review of v1, memory_region_ram_device_write should > be changed to a .write_with_attrs operation, so that it can return > MEMTX_ERROR. > > Otherwise this looks good. > hi Paolo, thanks for pointing it out again! I didn't get your meaning in v1. will update the patch! Thanks Yan >
Re: [PATCH] linux-user: Drop unnecessary check in signalfd4 syscall
On 25.04.20 12:03, Laurent Vivier wrote: > Le 25/04/2020 à 11:24, Helge Deller a écrit : >> On 25.04.20 10:39, Laurent Vivier wrote: >>> Le 24/04/2020 à 23:04, Helge Deller a écrit : The signalfd4() syscall takes optional O_NONBLOCK and O_CLOEXEC fcntl flags. If the user gave any other invalid flags, the host syscall will return correct error codes, so simply drop the extra check here. Signed-off-by: Helge Deller diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 05f03919ff..ebf0d38321 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -7176,9 +7176,6 @@ static abi_long do_signalfd4(int fd, abi_long mask, int flags) sigset_t host_mask; abi_long ret; -if (flags & ~(TARGET_O_NONBLOCK | TARGET_O_CLOEXEC)) { -return -TARGET_EINVAL; -} if (!lock_user_struct(VERIFY_READ, target_mask, mask, 1)) { return -TARGET_EFAULT; } >>> >>> Perhaps we want to trigger the TARGET_EINVAL before the TARGET_EFAULT if >>> we have both cases? >>> >>> But I've checked the kernel, and the kernel does a copy_from_user() >>> before checking the flags, but it returns EINVAL rather than EFAULT. >> >> That's not the full picture, since the kernel is not consistent here! >> In the compat-case (32bit userspace on 64bit kernel) it returns correctly >> EINVAL and EFAULT: >> if (sigsetsize != sizeof(compat_sigset_t)) >> return -EINVAL; >> if (get_compat_sigset(, user_mask)) >> return -EFAULT; >> while in the non-compat case it returns EINVAL only: >> if (sizemask != sizeof(sigset_t) || >> copy_from_user(, user_mask, sizeof(mask))) >> return -EINVAL; >> >> I think the kernel should be fixed here... >> >>> We can remove the flags checking but we should also change TARGET_EFAULT >>> by TARGET_EINVAL. >> >> According to the different behaviour of the kernel mentioned above >> you won't get it correct either way. > > If we refer to manpage, EFAULT is not one of possible errors. The manpage doesn't mention any error code for a bad mask address, and IMHO the manpage should reflect the real coding, not vice versa. I've just sent a RFC patch to the kernel mailing list. It's not a critical problem, but it would be nice to be at least consistent across the implementations. Either both should return always EINVAL, or both should return EFAULT. If it gets accepted I'll send a patch to correct the manpage accordingly afterwards, and then we can adjust qemu. Helge
[Bug 1875080] [NEW] USB host device data transfer with control endpoint
Public bug reported: QEMU emulator version 4.2.0 Host -> Arch Linux kernel version: 5.4.34-1-lts Guest -> Various Linux Distros I sent a control message with data through endpoint zero. On the other side message is received with all fields correct except data buffer. I've tested the data field inside guest with usbmon and data field was correct but after packet leaved qemu, data filed is lost. ** Affects: qemu Importance: Undecided Status: New ** Summary changed: - USB host device data transfer of control endpoint + USB host device data transfer with control endpoint -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1875080 Title: USB host device data transfer with control endpoint Status in QEMU: New Bug description: QEMU emulator version 4.2.0 Host -> Arch Linux kernel version: 5.4.34-1-lts Guest -> Various Linux Distros I sent a control message with data through endpoint zero. On the other side message is received with all fields correct except data buffer. I've tested the data field inside guest with usbmon and data field was correct but after packet leaved qemu, data filed is lost. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1875080/+subscriptions
Re: ARM SVE issues with non "standard" vector lengths
On 4/23/20 9:24 AM, Laurent Desnogues wrote: > Looking at the code I think sve_punpk_p is affected too. Got it. Same issue as zip1. r~
Re: ARM SVE issues with non "standard" vector lengths
On 4/25/20 11:49 AM, Richard Henderson wrote: > On 4/23/20 6:59 AM, Laurent Desnogues wrote: >> 2. sve_zip_p >> >> This generates extraneous data in the higher part of the result. >> >> I hit this when I got a wrong result on an instruction that ends up >> using sve_cntp which counts all bits set in each 64-bit chunk. There >> might be some other instructions beyond ZIP that generate extra data >> that would break sve_cntp. So perhaps it'd be easier to fix sve_cmtp >> (and hope that it's the only function that uses bits beyond vector >> length...). > > I don't see how sve_zip_p can set high bits. If vl is not a multiple of 512, > it writes in units of uint16_t. This cannot produce values outside range. Bah. I was looking at zip2 first. zip1 uses the uint64_t path. I see the problem now. r~
Re: ARM SVE issues with non "standard" vector lengths
On 4/23/20 6:59 AM, Laurent Desnogues wrote: > 2. sve_zip_p > > This generates extraneous data in the higher part of the result. > > I hit this when I got a wrong result on an instruction that ends up > using sve_cntp which counts all bits set in each 64-bit chunk. There > might be some other instructions beyond ZIP that generate extra data > that would break sve_cntp. So perhaps it'd be easier to fix sve_cmtp > (and hope that it's the only function that uses bits beyond vector > length...). I don't see how sve_zip_p can set high bits. If vl is not a multiple of 512, it writes in units of uint16_t. This cannot produce values outside range. Are you certain that the high bits were not set and left over by some previous buggy operation, since the uint16_t writes would not clear any extra bits? r~
target/mips: Enable Hardware page table walker and CMGCR features for P5600
Hi, I have discovered that MIPS hardware page table walker is not enabled for any CPU currently available. In this patch I have enable it (and also CMGCR feature) for P5600 which supports both but they are not enabled. This is my first patch to QEMU, I hope it is well formatted and correct. Signed-off-by: Andrea Oliveri diff --git a/target/mips/translate_init.inc.c b/target/mips/translate_init.inc.c index 6d145a905a..482cfe2123 100644 --- a/target/mips/translate_init.inc.c +++ b/target/mips/translate_init.inc.c @@ -366,7 +366,7 @@ const mips_def_t mips_defs[] = }, { /* FIXME: - * Config3: CMGCR, PW, VZ, CTXTC, CDMM, TL + * Config3: VZ, CTXTC, CDMM, TL * Config4: MMUExtDef * Config5: MRP * FIR(FCR0): Has2008 @@ -380,10 +380,11 @@ const mips_def_t mips_defs[] = (2 << CP0C1_DS) | (4 << CP0C1_DL) | (3 << CP0C1_DA) | (1 << CP0C1_PC) | (1 << CP0C1_FP), .CP0_Config2 = MIPS_CONFIG2, -.CP0_Config3 = MIPS_CONFIG3 | (1U << CP0C3_M) | (1 << CP0C3_MSAP) | +.CP0_Config3 = MIPS_CONFIG3 | (1U << CP0C3_M) | + (1 << CP0C3_CMGCR) | (1 << CP0C3_MSAP) | (1 << CP0C3_BP) | (1 << CP0C3_BI) | (1 << CP0C3_SC) | - (1 << CP0C3_ULRI) | (1 << CP0C3_RXI) | (1 << CP0C3_LPA) | - (1 << CP0C3_VInt), + (1 << CP0C3_PW) | (1 << CP0C3_ULRI) | (1 << CP0C3_RXI) | + (1 << CP0C3_LPA) | (1 << CP0C3_VInt), .CP0_Config4 = MIPS_CONFIG4 | (1U << CP0C4_M) | (2 << CP0C4_IE) | (0x1c << CP0C4_KScrExist), .CP0_Config4_rw_bitmask = 0,
The -icount option document are confusing
-icount [shift=N|auto][,rr=record|replay,rrfile=filename,rrsnapshot=snapshot ] When the virtual cpu is sleeping, the virtual time will advance at default speed unless sleep=on|off is specified. With sleep=on|off, the virtual time will jump to the next timer deadline instantly whenever the virtual cpu goes to sleep mode and will not advance if no timer is enabled. This behavior give deterministic execution times from the guest point of view. I wanna to know what's the meaning when sleep=on, and what's the meaning when sleep=off -- 此致 礼 罗勇刚 Yours sincerely, Yonggang Luo
Re: [PATCH] linux-user: Drop unnecessary check in dup3 syscall
On Fri, 24 Apr 2020 at 22:33, Eric Blake wrote: > I don't think this is quite correct. target_to_host_bitmask() silently > ignores unknown bits, and a user that was relying on bit 0x4000 to > cause an EINVAL will not fail with this change (unless bit 0x4000 > happens to be one of the bits translated by fcntl_flags_tbl). The > open() syscall is notorious for ignoring unknown bits rather than > failing with EINVAL, and it is has come back to haunt kernel developers; > newer syscalls like dup3() learned from the mistake, and we really do > want to catch unsupported bits up to make it easier for future kernels > to define meanings to those bits without them being silently swallowed > when run on older systems that did not know what those bits meant. The other reason linux-user sometimes has this sort of manual check of input values is that it can affect which errno value is returned if a call has multiple wrong things (eg a bad address to a pointer parameter and a bad flags value), and some test suites care about the difference. I'm not sure that's the case here, though. I didn't write out my reasoning back in 2017 when I made this page and don't remember it now, but my guess is that it's just that dup3 is only supposed to permit O_CLOEXEC, not any of the other flags that the fcntl_flags_tbl permits and translates. thanks -- PMM
[Bug 1875012] [NEW] KVM internal error. Suberror: 1 -- emulation failure
Public bug reported: Trying to boot a core20 amd64 image on an amd64 Eoan or Focal host via libvirt leads to: KVM internal error. Suberror: 1 emulation failure RAX= RBX=3bdcd5c0 RCX=3ff1d030 RDX=19a0 RSI=00ff RDI=3bd73ee0 RBP=3bd73e40 RSP=3ff1d1f8 R8 =3df52168 R9 = R10= R11=3bd44c40 R12=3bd76500 R13=3bd73e00 R14=00020002 R15=3df4b483 RIP=000b RFL=00210246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0030 00c09300 DPL=0 DS [-WA] CS =0038 00a09b00 DPL=0 CS64 [-RA] SS =0030 00c09300 DPL=0 DS [-WA] DS =0030 00c09300 DPL=0 DS [-WA] FS =0030 00c09300 DPL=0 DS [-WA] GS =0030 00c09300 DPL=0 DS [-WA] LDT= 8200 DPL=0 LDT TR = 8b00 DPL=0 TSS64-busy GDT= 3fbee698 0047 IDT= 3f2d8018 0fff CR0=80010033 CR2= CR3=3fc01000 CR4=0668 DR0= DR1= DR2= DR3= DR6=0ff0 DR7=0400 EFER=0d00 Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ** Affects: qemu (Ubuntu) Importance: Undecided Status: New ** Affects: qemu (Ubuntu Eoan) Importance: Undecided Status: New ** Affects: qemu (Ubuntu Focal) Importance: Undecided Status: New ** Also affects: qemu (Ubuntu) Importance: Undecided Status: New ** No longer affects: qemu ** Also affects: qemu (Ubuntu Focal) Importance: Undecided Status: New ** Also affects: qemu (Ubuntu Eoan) Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1875012 Title: KVM internal error. Suberror: 1 -- emulation failure Status in qemu package in Ubuntu: New Status in qemu source package in Eoan: New Status in qemu source package in Focal: New Bug description: Trying to boot a core20 amd64 image on an amd64 Eoan or Focal host via libvirt leads to: KVM internal error. Suberror: 1 emulation failure RAX= RBX=3bdcd5c0 RCX=3ff1d030 RDX=19a0 RSI=00ff RDI=3bd73ee0 RBP=3bd73e40 RSP=3ff1d1f8 R8 =3df52168 R9 = R10= R11=3bd44c40 R12=3bd76500 R13=3bd73e00 R14=00020002 R15=3df4b483 RIP=000b RFL=00210246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0030 00c09300 DPL=0 DS [-WA] CS =0038 00a09b00 DPL=0 CS64 [-RA] SS =0030 00c09300 DPL=0 DS [-WA] DS =0030 00c09300 DPL=0 DS [-WA] FS =0030 00c09300 DPL=0 DS [-WA] GS =0030 00c09300 DPL=0 DS [-WA] LDT= 8200 DPL=0 LDT TR = 8b00 DPL=0 TSS64-busy GDT= 3fbee698 0047 IDT= 3f2d8018 0fff CR0=80010033 CR2= CR3=3fc01000 CR4=0668 DR0= DR1= DR2= DR3= DR6=0ff0 DR7=0400 EFER=0d00 Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1875012/+subscriptions
Re: [PATCH v1 PATCH v1 1/1] tests: machine-none-test: Enable MicroBlaze testing
On 16/04/20 21:33, Edgar E. Iglesias wrote: > From: "Edgar E. Iglesias" > > Enable MicroBlaze testing. > > Signed-off-by: Edgar E. Iglesias > --- > tests/qtest/machine-none-test.c | 10 -- > 1 file changed, 4 insertions(+), 6 deletions(-) > > diff --git a/tests/qtest/machine-none-test.c b/tests/qtest/machine-none-test.c > index 8bb54a6360..209d86eb57 100644 > --- a/tests/qtest/machine-none-test.c > +++ b/tests/qtest/machine-none-test.c > @@ -33,8 +33,8 @@ static struct arch2cpu cpus_map[] = { > { "cris", "crisv32" }, > { "lm32", "lm32-full" }, > { "m68k", "m5206" }, > -/* FIXME: { "microblaze", "any" }, doesn't work with -M none -cpu any */ > -/* FIXME: { "microblazeel", "any" }, doesn't work with -M none -cpu any > */ > +{ "microblaze", "any" }, > +{ "microblazeel", "any" }, > { "mips", "4Kc" }, > { "mipsel", "I7200" }, > { "mips64", "20Kc" }, > @@ -79,10 +79,8 @@ static void test_machine_cpu_cli(void) > QTestState *qts; > > if (!cpu_model) { > -if (!(!strcmp(arch, "microblaze") || !strcmp(arch, "microblazeel"))) > { > -fprintf(stderr, "WARNING: cpu name for target '%s' isn't > defined," > -" add it to cpus_map\n", arch); > -} > +fprintf(stderr, "WARNING: cpu name for target '%s' isn't defined," > +" add it to cpus_map\n", arch); > return; /* TODO: die here to force all targets have a test */ > } > qts = qtest_initf("-machine none -cpu '%s'", cpu_model); > Queued, thanks. Paolo
Re: [PATCH] hw/i386/amd_iommu: Fix the reserved bits definition of IOMMU commands
On 18/04/20 06:28, Wei Huang wrote: > Many reserved bits of amd_iommu commands are defined incorrectly in QEMU. > Because of it, QEMU incorrectly injects lots of illegal commands into guest > VM's IOMMU event log. > > Signed-off-by: Wei Huang > --- > hw/i386/amd_iommu.c | 19 ++- > 1 file changed, 10 insertions(+), 9 deletions(-) > > diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c > index fd75cae02437..4346060e6223 100644 > --- a/hw/i386/amd_iommu.c > +++ b/hw/i386/amd_iommu.c > @@ -370,7 +370,7 @@ static void amdvi_completion_wait(AMDVIState *s, uint64_t > *cmd) > hwaddr addr = cpu_to_le64(extract64(cmd[0], 3, 49)) << 3; > uint64_t data = cpu_to_le64(cmd[1]); > > -if (extract64(cmd[0], 51, 8)) { > +if (extract64(cmd[0], 52, 8)) { > amdvi_log_illegalcom_error(s, extract64(cmd[0], 60, 4), > s->cmdbuf + s->cmdbuf_head); > } > @@ -395,7 +395,7 @@ static void amdvi_inval_devtab_entry(AMDVIState *s, > uint64_t *cmd) > uint16_t devid = cpu_to_le16((uint16_t)extract64(cmd[0], 0, 16)); > > /* This command should invalidate internal caches of which there isn't */ > -if (extract64(cmd[0], 15, 16) || cmd[1]) { > +if (extract64(cmd[0], 16, 44) || cmd[1]) { > amdvi_log_illegalcom_error(s, extract64(cmd[0], 60, 4), > s->cmdbuf + s->cmdbuf_head); > } > @@ -405,9 +405,9 @@ static void amdvi_inval_devtab_entry(AMDVIState *s, > uint64_t *cmd) > > static void amdvi_complete_ppr(AMDVIState *s, uint64_t *cmd) > { > -if (extract64(cmd[0], 15, 16) || extract64(cmd[0], 19, 8) || > +if (extract64(cmd[0], 16, 16) || extract64(cmd[0], 52, 8) || > extract64(cmd[1], 0, 2) || extract64(cmd[1], 3, 29) > -|| extract64(cmd[1], 47, 16)) { > +|| extract64(cmd[1], 48, 16)) { > amdvi_log_illegalcom_error(s, extract64(cmd[0], 60, 4), > s->cmdbuf + s->cmdbuf_head); > } > @@ -438,8 +438,8 @@ static void amdvi_inval_pages(AMDVIState *s, uint64_t > *cmd) > { > uint16_t domid = cpu_to_le16((uint16_t)extract64(cmd[0], 32, 16)); > > -if (extract64(cmd[0], 20, 12) || extract64(cmd[0], 16, 12) || > -extract64(cmd[0], 3, 10)) { > +if (extract64(cmd[0], 20, 12) || extract64(cmd[0], 48, 12) || > +extract64(cmd[1], 3, 9)) { > amdvi_log_illegalcom_error(s, extract64(cmd[0], 60, 4), > s->cmdbuf + s->cmdbuf_head); > } > @@ -451,7 +451,7 @@ static void amdvi_inval_pages(AMDVIState *s, uint64_t > *cmd) > > static void amdvi_prefetch_pages(AMDVIState *s, uint64_t *cmd) > { > -if (extract64(cmd[0], 16, 8) || extract64(cmd[0], 20, 8) || > +if (extract64(cmd[0], 16, 8) || extract64(cmd[0], 52, 8) || > extract64(cmd[1], 1, 1) || extract64(cmd[1], 3, 1) || > extract64(cmd[1], 5, 7)) { > amdvi_log_illegalcom_error(s, extract64(cmd[0], 60, 4), > @@ -463,7 +463,7 @@ static void amdvi_prefetch_pages(AMDVIState *s, uint64_t > *cmd) > > static void amdvi_inval_inttable(AMDVIState *s, uint64_t *cmd) > { > -if (extract64(cmd[0], 16, 16) || cmd[1]) { > +if (extract64(cmd[0], 16, 44) || cmd[1]) { > amdvi_log_illegalcom_error(s, extract64(cmd[0], 60, 4), > s->cmdbuf + s->cmdbuf_head); > return; > @@ -479,7 +479,8 @@ static void iommu_inval_iotlb(AMDVIState *s, uint64_t > *cmd) > { > > uint16_t devid = extract64(cmd[0], 0, 16); > -if (extract64(cmd[1], 1, 1) || extract64(cmd[1], 3, 9)) { > +if (extract64(cmd[1], 1, 1) || extract64(cmd[1], 3, 1) || > +extract64(cmd[1], 6, 6)) { > amdvi_log_illegalcom_error(s, extract64(cmd[0], 60, 4), > s->cmdbuf + s->cmdbuf_head); > return; > Queued, thanks. Paolo
Re: [PATCH v4 1/3] memory: drop guest writes to read-only ram device regions
On 17/04/20 09:44, Yan Zhao wrote: > for ram device regions, drop guest writes if the regions is read-only. > > Cc: Philippe Mathieu-Daudé > Signed-off-by: Yan Zhao > Signed-off-by: Xin Zeng > --- > memory.c | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/memory.c b/memory.c > index 601b749906..9576dd6807 100644 > --- a/memory.c > +++ b/memory.c > @@ -34,6 +34,7 @@ > #include "sysemu/accel.h" > #include "hw/boards.h" > #include "migration/vmstate.h" > +#include "qemu/log.h" > > //#define DEBUG_UNASSIGNED > > @@ -1313,6 +1314,12 @@ static void memory_region_ram_device_write(void > *opaque, hwaddr addr, > MemoryRegion *mr = opaque; > > trace_memory_region_ram_device_write(get_cpu_index(), mr, addr, data, > size); > +if (mr->readonly) { > +qemu_log_mask(LOG_GUEST_ERROR, > + "Invalid write to read only ram device region 0x%" > + HWADDR_PRIx" size %u\n", addr, size); > +return; > +} As mentioned in the review of v1, memory_region_ram_device_write should be changed to a .write_with_attrs operation, so that it can return MEMTX_ERROR. Otherwise this looks good. Paolo
Re: [PATCH v2] chardev/char-socket: Properly make qio connections non blocking
On 20/04/20 11:26, Daniel P. Berrangé wrote: > On Sun, Apr 19, 2020 at 03:21:40PM +0530, Sai Pavan Boddu wrote: >> In tcp_chr_sync_read function, there is a possibility of socket >> disconnection during blocking read, then tcp_chr_hup function would clean up >> the qio channel pointers(i.e ioc, sioc). >> >> Signed-off-by: Sai Pavan Boddu >> --- >> Changes for V2: >> Place the guard around 'qio_channel_set_blocking' call to check >> connection status >> This fix is simpler than v1 and explains better about the issue. >> >> chardev/char-socket.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/chardev/char-socket.c b/chardev/char-socket.c >> index 185fe38..e56b2f0 100644 >> --- a/chardev/char-socket.c >> +++ b/chardev/char-socket.c >> @@ -549,7 +549,9 @@ static int tcp_chr_sync_read(Chardev *chr, const uint8_t >> *buf, int len) >> >> qio_channel_set_blocking(s->ioc, true, NULL); >> size = tcp_chr_recv(chr, (void *) buf, len); >> -qio_channel_set_blocking(s->ioc, false, NULL); >> +if (s->state != TCP_CHARDEV_STATE_DISCONNECTED) { >> +qio_channel_set_blocking(s->ioc, false, NULL); >> +} >> if (size == 0) { >> /* connection closed */ >> tcp_chr_disconnect(chr); > > Reviewed-by: Daniel P. Berrangé > > > Regards, > Daniel > Queued, thanks. Paolo
CFP: KVM Forum 2020
KVM Forum 2020: Call For Participation October 28-30, 2020 - Convention Centre Dublin - Dublin, Ireland (All submissions must be received before June 15, 2020 at 23:59 PST) = KVM Forum is an annual event that presents a rare opportunity for developers and users to meet, discuss the state of Linux virtualization technology, and plan for the challenges ahead. This highly technical conference unites the developers who drive KVM development and the users who depend on KVM as part of their offerings, or to power their data centers and clouds. Sessions include updates on the state of the KVM virtualization stack, planning for the future, and many opportunities for attendees to collaborate. After more than ten years in the mainline kernel, KVM continues to be a critical part of the FOSS cloud infrastructure. Come join us in continuing to improve the KVM ecosystem. We understand that these are uncertain times and we are continuously monitoring the COVID-19/Novel Coronavirus situation. Our attendees' safety is of the utmost importance. If we feel it is not safe to meet in person, we will turn the event into a virtual experience. We hope to announce this at the same time as the speaker notification. Speakers will still be expected to attend if a physical event goes ahead, but we understand that exceptional circumstances might arise after speakers are accepted and we will do our best to accommodate such circumstances. Based on these factors, we encourage you to submit and reach out to us should you have any questions. The program committee may be contacted as a group via email: kvm-forum-2020...@redhat.com. SUGGESTED TOPICS * Scaling, latency optimizations, performance tuning * Hardening and security * New features * Testing KVM and the Linux Kernel: * Nested virtualization * Resource management (CPU, I/O, memory) and scheduling * VFIO: IOMMU, SR-IOV, virtual GPU, etc. * Networking: Open vSwitch, XDP, etc. * virtio and vhost * Architecture ports and new processor features QEMU: * Management interfaces: QOM and QMP * New devices, new boards, new architectures * New storage features * High availability, live migration and fault tolerance * Emulation and TCG * Firmware: ACPI, UEFI, coreboot, U-Boot, etc. Management & Infrastructure * Managing KVM: Libvirt, OpenStack, oVirt, KubeVirt, etc. * Storage: Ceph, Gluster, SPDK, etc. * Network Function Virtualization: DPDK, OPNFV, OVN, etc. * Provisioning SUBMITTING YOUR PROPOSAL Abstracts due: June 15, 2020 Please submit a short abstract (~150 words) describing your presentation proposal. Slots vary in length up to 45 minutes. Submit your proposal here: https://events.linuxfoundation.org/kvm-forum/program/cfp/ Please only use the categories "presentation" and "panel discussion" You will receive a notification whether or not your presentation proposal was accepted by August 17, 2020. Speakers will receive a complimentary pass for the event. In case your submission has multiple presenters, only the primary speaker for a proposal will receive a complimentary event pass. For panel discussions, all panelists will receive a complimentary event pass. TECHNICAL TALKS A good technical talk should not just report on what has happened over the last year; it should present a concrete problem and how it impacts the user and/or developer community. Whenever applicable, focus on work that needs to be done, difficulties that haven't yet been solved, and on decisions that other developers should be aware of. Summarizing recent developments is okay but it should not be more than a small portion of the overall talk. END-USER TALKS One of the big challenges as developers is to know what, where and how people actually use our software. We will reserve a few slots for end users talking about their deployment challenges and achievements. If you are using KVM in production you are encouraged submit a speaking proposal. Simply mark it as an end-user talk. As an end user, this is a unique opportunity to get your input to developers. PANEL DISCUSSIONS If you are proposing a panel discussion, please make sure that you list all of your potential panelists in your the abstract. We will request full biographies if a panel is accepted. HANDS-ON / BOF SESSIONS We will reserve some time for people to get together and discuss strategic decisions as well as other topics that are best solved within smaller groups. These sessions will be announced during the event. If you are interested in organizing such a session, please add it to the list at http://www.linux-kvm.org/page/KVM_Forum_2020_BOF Let people you think who might be interested know about your BOF, and encourage them to add their names to the wiki page as well. Please add your ideas to the list before KVM Forum starts. HOTEL / TRAVEL --
Re: [PATCH] linux-user: Drop unnecessary check in signalfd4 syscall
Le 25/04/2020 à 11:24, Helge Deller a écrit : > On 25.04.20 10:39, Laurent Vivier wrote: >> Le 24/04/2020 à 23:04, Helge Deller a écrit : >>> The signalfd4() syscall takes optional O_NONBLOCK and O_CLOEXEC fcntl >>> flags. If the user gave any other invalid flags, the host syscall will >>> return correct error codes, so simply drop the extra check here. >>> >>> Signed-off-by: Helge Deller >>> >>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c >>> index 05f03919ff..ebf0d38321 100644 >>> --- a/linux-user/syscall.c >>> +++ b/linux-user/syscall.c >>> @@ -7176,9 +7176,6 @@ static abi_long do_signalfd4(int fd, abi_long mask, >>> int flags) >>> sigset_t host_mask; >>> abi_long ret; >>> >>> -if (flags & ~(TARGET_O_NONBLOCK | TARGET_O_CLOEXEC)) { >>> -return -TARGET_EINVAL; >>> -} >>> if (!lock_user_struct(VERIFY_READ, target_mask, mask, 1)) { >>> return -TARGET_EFAULT; >>> } >>> >> >> Perhaps we want to trigger the TARGET_EINVAL before the TARGET_EFAULT if >> we have both cases? >> >> But I've checked the kernel, and the kernel does a copy_from_user() >> before checking the flags, but it returns EINVAL rather than EFAULT. > > That's not the full picture, since the kernel is not consistent here! > In the compat-case (32bit userspace on 64bit kernel) it returns correctly > EINVAL and EFAULT: > if (sigsetsize != sizeof(compat_sigset_t)) > return -EINVAL; > if (get_compat_sigset(, user_mask)) > return -EFAULT; > while in the non-compat case it returns EINVAL only: > if (sizemask != sizeof(sigset_t) || > copy_from_user(, user_mask, sizeof(mask))) > return -EINVAL; > > I think the kernel should be fixed here... > >> We can remove the flags checking but we should also change TARGET_EFAULT >> by TARGET_EINVAL. > > According to the different behaviour of the kernel mentioned above > you won't get it correct either way. If we refer to manpage, EFAULT is not one of possible errors. Thanks, Laurent
Re: [RFC PATCH v1 1/7] contrib/vhost-user-blk: add option to simulate disconnect on init
On Fri, Apr 24, 2020 at 12:17:54PM +0200, Marc-André Lureau wrote: > Hi > > On Fri, Apr 24, 2020 at 4:32 AM Raphael Norwitz > wrote: > > > > I’m not opposed to adding this kind of debugging functionality to the > > vhost-user-blk sample. It could be helpful to easily test these cases > > in the future. > > > > That said, I'm not sure how others will feel about adding these kind > > of debugging capabilities to libvhost-user. Marc-Andre, thoughts? > > Maybe we should only enable this code if LIBVHOST_USER_DEBUG is set? > > And to make logging silent by default, we shouldn't print them unless > VHOST_USER_DEBUG env is set? Yes, it is a good idea to move this code under LIBVHOST_USER_DEBUG. Agree. Will update it in version 2, but need more feedback on other patches first. > > > > > If we go this route I would prefer to add the debugging options to the > > vhost-user-blk sample in a separate patch. > > > > On Thu, Apr 23, 2020 at 09:39:32PM +0300, Dima Stepanov wrote: > > > > > > Add "--simulate-disconnect-stage" option for the testing purposes. > > > This option can be used to test the vhost-user reconnect functionality: > > > ./vhost-user-blk ... --simulate-disconnect-stage= > > > In this case the daemon will "crash" in the middle of the VHOST comands > > > communication. Case nums are as follows: > > > 1 - make assert in the handler of the SET_VRING_CALL command > > > 2 - make assert in the handler of the SET_VRING_NUM command > > > Main purpose is to test QEMU reconnect functionality. Such fail > > > injection should not lead to QEMU crash and should be handled > > > successfully. > > > Also update the "GOptionEntry entries" definition with the final NULL > > > item according to API. > > > > > > Signed-off-by: Dima Stepanov > > > --- > > > contrib/libvhost-user/libvhost-user.c | 30 > > > ++ > > > contrib/libvhost-user/libvhost-user.h | 13 + > > > contrib/vhost-user-blk/vhost-user-blk.c | 14 +- > > > 3 files changed, 56 insertions(+), 1 deletion(-) > > >
Re: [PATCH v4 1/2] target/arm: kvm: Handle DABT with no valid ISS
On 24/04/20 14:16, Dr. David Alan Gilbert wrote: >>> I was trying to work out whether we need to migrate this state, >>> and I'm not sure. Andrew, do you know? I think this comes down >>> to "at what points in QEMU's kvm run loop can migration kick in", >>> and specifically if we get a KVM_EXIT_ARM_NISV do we definitely >>> go round the loop and KVM_RUN again without ever checking >>> to see if we should do a migration ? >>> >> I'd prefer a migration expert confirm this, so I've CC'ed David and Juan, >> but afaict there's no way to break out of the KVM_RUN loop after a >> successful (ret=0) call to kvm_arch_handle_exit() until after the next >> KVM_RUN ioctl. This is because even if migration kicks the vcpus between >> kvm_arch_handle_exit() and the next run, the signal won't do anything >> other than prepare the vcpu for an immediate exit. As far as QEMU is concerned, this should be enough for Beata's patch to be safe. If the signal causes KVM to exit before KVM_EXIT_ARM_NISV, it's of course okay. If you get a KVM_EXIT_ARM_NISV, however, KVM_RUN will exit with return code 0 and kvm_cpu_exec will: - set env->ext_dabt_pending - go round the loop again - notice cpu->exit_request and schedule an immediate exit - call kvm_arch_put_registers - call KVM_RUN again, which will exit with -EINTR - exit the loop and allow migration to proceed However, I'm not sure that it's a good idea to +/* Clear instantly if the call was successful */ +env->ext_dabt_pending = 0; Rather, this should be done by the next kvm_arch_get_registers when it calls KVM_GET_VCPU_EVENTS. It's also possible to add an assertion in kvm_get_vcpu_events that it you always get zero, to justify that the field is not migrated. Thanks, Paolo
Re: [PATCH] linux-user: Drop unnecessary check in signalfd4 syscall
On 25.04.20 10:39, Laurent Vivier wrote: > Le 24/04/2020 à 23:04, Helge Deller a écrit : >> The signalfd4() syscall takes optional O_NONBLOCK and O_CLOEXEC fcntl >> flags. If the user gave any other invalid flags, the host syscall will >> return correct error codes, so simply drop the extra check here. >> >> Signed-off-by: Helge Deller >> >> diff --git a/linux-user/syscall.c b/linux-user/syscall.c >> index 05f03919ff..ebf0d38321 100644 >> --- a/linux-user/syscall.c >> +++ b/linux-user/syscall.c >> @@ -7176,9 +7176,6 @@ static abi_long do_signalfd4(int fd, abi_long mask, >> int flags) >> sigset_t host_mask; >> abi_long ret; >> >> -if (flags & ~(TARGET_O_NONBLOCK | TARGET_O_CLOEXEC)) { >> -return -TARGET_EINVAL; >> -} >> if (!lock_user_struct(VERIFY_READ, target_mask, mask, 1)) { >> return -TARGET_EFAULT; >> } >> > > Perhaps we want to trigger the TARGET_EINVAL before the TARGET_EFAULT if > we have both cases? > > But I've checked the kernel, and the kernel does a copy_from_user() > before checking the flags, but it returns EINVAL rather than EFAULT. That's not the full picture, since the kernel is not consistent here! In the compat-case (32bit userspace on 64bit kernel) it returns correctly EINVAL and EFAULT: if (sigsetsize != sizeof(compat_sigset_t)) return -EINVAL; if (get_compat_sigset(, user_mask)) return -EFAULT; while in the non-compat case it returns EINVAL only: if (sizemask != sizeof(sigset_t) || copy_from_user(, user_mask, sizeof(mask))) return -EINVAL; I think the kernel should be fixed here... > We can remove the flags checking but we should also change TARGET_EFAULT > by TARGET_EINVAL. According to the different behaviour of the kernel mentioned above you won't get it correct either way. Helge
Re: [PATCH v5 0/2] Replaced locks with lock guard macros
On 24/04/20 12:50, Stefan Hajnoczi wrote: > Paolo, hope you don't mind if I use the block-next branch to merge this > and Simran's patch that depends on it. Absolutely not, thanks. Paolo > Thanks, applied to my block-next tree: > https://github.com/stefanha/qemu/commits/block-next signature.asc Description: OpenPGP digital signature
Re: [RFC patch v1 2/3] qemu-file: add buffered mode
13.04.2020 14:12, Denis Plotnikov wrote: The patch adds ability to qemu-file to write the data asynchronously to improve the performance on writing. Before, only synchronous writing was supported. Enabling of the asyncronous mode is managed by new "enabled_buffered" callback. Hmm. I don't like resulting architecture very much: 1. Function naming is not clean: how can I understand that copy_buf is for buffered mode when add_to_iovec is +- same thing for non-buffered mode? Hmm actually, you just alter several significant functions of QEMUFile - open, close, put, flush. In old mode we do one thing, in a new mode - absolutely another. This looks like a driver. So may be we want to add QEMUFileDriver struct, to define these functions as callbacks, move old realizations to default driver, and add new functionality as a new driver, what do you think? 2. Terminology: you say you add buffered mode, but actually qemu file is already work in buffered mode, so it should be clarified somehow.. You also add asynchronisity, but old implementation has already qemu_put_buffer_async.. You use aio task pool, but don't say that it may be used only from coroutine. May be, we'd better call it "coroutine-mode" ? Also, am I correct that new mode can't be used for read? Should we document/assert it somehow? Or may be support reading? Will switch to snapshot benefit if we implement reading in a new mode? Signed-off-by: Denis Plotnikov --- include/qemu/typedefs.h | 1 + migration/qemu-file.c | 351 +--- migration/qemu-file.h | 9 ++ 3 files changed, 339 insertions(+), 22 deletions(-) diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h index 88dce54..9b388c8 100644 --- a/include/qemu/typedefs.h +++ b/include/qemu/typedefs.h @@ -98,6 +98,7 @@ typedef struct QEMUBH QEMUBH; typedef struct QemuConsole QemuConsole; typedef struct QEMUFile QEMUFile; typedef struct QEMUFileBuffer QEMUFileBuffer; +typedef struct QEMUFileAioTask QEMUFileAioTask; typedef struct QemuLockable QemuLockable; typedef struct QemuMutex QemuMutex; typedef struct QemuOpt QemuOpt; diff --git a/migration/qemu-file.c b/migration/qemu-file.c index 285c6ef..f42f949 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -29,19 +29,25 @@ #include "qemu-file.h" #include "trace.h" #include "qapi/error.h" +#include "block/aio_task.h" -#define IO_BUF_SIZE 32768 +#define IO_BUF_SIZE (1024 * 1024) #define MAX_IOV_SIZE MIN(IOV_MAX, 64) +#define IO_BUF_NUM 2 Interesting, how much is it better than if we set to 1, limiting the influence of the series to alignment of written chunks? +#define IO_BUF_ALIGNMENT 512 -QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, 512)); +QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, IO_BUF_ALIGNMENT)); +QEMU_BUILD_BUG_ON(IO_BUF_SIZE > INT_MAX); +QEMU_BUILD_BUG_ON(IO_BUF_NUM <= 0); struct QEMUFileBuffer { int buf_index; -int buf_size; /* 0 when writing */ +int buf_size; /* 0 when non-buffered writing */ uint8_t *buf; unsigned long *may_free; struct iovec *iov; unsigned int iovcnt; +QLIST_ENTRY(QEMUFileBuffer) link; }; struct QEMUFile { @@ -60,6 +66,22 @@ struct QEMUFile { bool shutdown; /* currently used buffer */ QEMUFileBuffer *current_buf; +/* + * with buffered_mode enabled all the data copied to 512 byte + * aligned buffer, including iov data. Then the buffer is passed + * to writev_buffer callback. + */ +bool buffered_mode; +/* for async buffer writing */ +AioTaskPool *pool; +/* the list of free buffers, currently used on is NOT there */ +QLIST_HEAD(, QEMUFileBuffer) free_buffers; +}; + +struct QEMUFileAioTask { +AioTask task; +QEMUFile *f; +QEMUFileBuffer *fb; }; /* @@ -115,10 +137,42 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops) f->opaque = opaque; f->ops = ops; -f->current_buf = g_new0(QEMUFileBuffer, 1); -f->current_buf->buf = g_malloc(IO_BUF_SIZE); -f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE); -f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE); +if (f->ops->enable_buffered) { +f->buffered_mode = f->ops->enable_buffered(f->opaque); +} + +if (f->buffered_mode && qemu_file_is_writable(f)) { So we actually go to buffered_mode if file is writable. Then, shouldn't we otherwise set buffered_mode to false otherwise? +int i; +/* + * in buffered_mode we don't use internal io vectors + * and may_free bitmap, because we copy the data to be + * written right away to the buffer + */ +f->pool = aio_task_pool_new(IO_BUF_NUM); + +/* allocate io buffers */ +for (i = 0; i < IO_BUF_NUM; i++) { +QEMUFileBuffer *fb = g_new0(QEMUFileBuffer, 1); + +fb->buf = qemu_memalign(IO_BUF_ALIGNMENT, IO_BUF_SIZE); +
Re: [PATCH] linux-user: Drop unnecessary check in signalfd4 syscall
Le 24/04/2020 à 23:04, Helge Deller a écrit : > The signalfd4() syscall takes optional O_NONBLOCK and O_CLOEXEC fcntl > flags. If the user gave any other invalid flags, the host syscall will > return correct error codes, so simply drop the extra check here. > > Signed-off-by: Helge Deller > > diff --git a/linux-user/syscall.c b/linux-user/syscall.c > index 05f03919ff..ebf0d38321 100644 > --- a/linux-user/syscall.c > +++ b/linux-user/syscall.c > @@ -7176,9 +7176,6 @@ static abi_long do_signalfd4(int fd, abi_long mask, int > flags) > sigset_t host_mask; > abi_long ret; > > -if (flags & ~(TARGET_O_NONBLOCK | TARGET_O_CLOEXEC)) { > -return -TARGET_EINVAL; > -} > if (!lock_user_struct(VERIFY_READ, target_mask, mask, 1)) { > return -TARGET_EFAULT; > } > Perhaps we want to trigger the TARGET_EINVAL before the TARGET_EFAULT if we have both cases? But I've checked the kernel, and the kernel does a copy_from_user() before checking the flags, but it returns EINVAL rather than EFAULT. We can remove the flags checking but we should also change TARGET_EFAULT by TARGET_EINVAL. Thanks, Laurent
Re: [PATCH] linux-user: Drop open-coded fcntl flags conversion in eventfd2 syscall
Le 24/04/2020 à 22:48, Helge Deller a écrit : > Drop the open-coded fcntl flags conversion in the eventfd2 syscall and > replace it with the built-in conversion with fcntl_flags_tbl. > > Signed-off-by: Helge Deller > > diff --git a/linux-user/syscall.c b/linux-user/syscall.c > index 05f03919ff..ebf0d38321 100644 > --- a/linux-user/syscall.c > +++ b/linux-user/syscall.c > @@ -11938,13 +11942,7 @@ static abi_long do_syscall1(void *cpu_env, int num, > abi_long arg1, > #if defined(TARGET_NR_eventfd2) > case TARGET_NR_eventfd2: > { > -int host_flags = arg2 & (~(TARGET_O_NONBLOCK | TARGET_O_CLOEXEC)); > -if (arg2 & TARGET_O_NONBLOCK) { > -host_flags |= O_NONBLOCK; > -} > -if (arg2 & TARGET_O_CLOEXEC) { > -host_flags |= O_CLOEXEC; > -} > +int host_flags = target_to_host_bitmask(arg2, fcntl_flags_tbl); > ret = get_errno(eventfd(arg1, host_flags)); > if (ret >= 0) { > fd_trans_register(ret, _eventfd_trans); > The problem here is eventfd2 doesn't take O_ flags but EFD_ flags. Most EFD_ flags are mapped to O_ flags, but one is not: include/linux/eventfd.h: /* * CAREFUL: Check include/uapi/asm-generic/fcntl.h when defining * new flags, since they might collide with O_* ones. We want * to re-use O_* flags that couldn't possibly have a meaning * from eventfd, in order to leave a free define-space for * shared O_* flags. */ #define EFD_SEMAPHORE (1 << 0) #define EFD_CLOEXEC O_CLOEXEC #define EFD_NONBLOCK O_NONBLOCK So I think it's better to convert them manually Perhaps we can defined TARGET_EFD_ flags to make this clearer, I don't know. Thanks, Laurent
Re: [PATCH] linux-user: Add support for /proc/cpuinfo on hppa platform
Le 24/04/2020 à 23:06, Helge Deller a écrit : > Provide an own /proc/cpuinfo file for the hppa (parisc) platform. > > Signed-off-by: Helge Deller > > diff --git a/linux-user/syscall.c b/linux-user/syscall.c > index 05f03919ff..ebf0d38321 100644 > --- a/linux-user/syscall.c > +++ b/linux-user/syscall.c > @@ -7438,6 +7435,18 @@ static int open_cpuinfo(void *cpu_env, int fd) > } > #endif > > +#if defined(TARGET_HPPA) > +static int open_cpuinfo(void *cpu_env, int fd) > +{ > +dprintf(fd, "cpu family\t: PA-RISC 1.1e\n"); > +dprintf(fd, "cpu\t\t: PA7300LC (PCX-L2)\n"); > +dprintf(fd, "capabilities\t: os32\n"); > +dprintf(fd, "model\t\t: 9000/778/B160L\n"); > +dprintf(fd, "model name\t: Merlin L2 160 QEMU (9000/778/B160L)\n"); > +return 0; > +} > +#endif > + > #if defined(TARGET_M68K) > static int open_hardware(void *cpu_env, int fd) > { > @@ -7462,7 +7471,7 @@ static int do_openat(void *cpu_env, int dirfd, const > char *pathname, int flags, > #if defined(HOST_WORDS_BIGENDIAN) != defined(TARGET_WORDS_BIGENDIAN) > { "/proc/net/route", open_net_route, is_proc }, > #endif > -#if defined(TARGET_SPARC) > +#if defined(TARGET_SPARC) || defined(TARGET_HPPA) > { "/proc/cpuinfo", open_cpuinfo, is_proc }, > #endif > #if defined(TARGET_M68K) > Reviewed-by: Laurent Vivier
Re: [PATCH] linux-user: return target error codes for socket() and prctl()
Le 25/04/2020 à 00:00, Helge Deller a écrit : > Return target error codes instead of host error codes. > > Signed-off-by: Helge Deller > > diff --git a/linux-user/syscall.c b/linux-user/syscall.c > index 05f03919ff..655a86fa45 100644 > --- a/linux-user/syscall.c > +++ b/linux-user/syscall.c > @@ -2987,7 +2987,7 @@ static abi_long do_socket(int domain, int type, int > protocol) > #endif > protocol == NETLINK_KOBJECT_UEVENT || > protocol == NETLINK_AUDIT)) { > -return -EPFNOSUPPORT; > +return -TARGET_EPFNOSUPPORT; > } > > if (domain == AF_PACKET || > @@ -5856,7 +5856,7 @@ static abi_long do_get_thread_area(CPUX86State *env, > abi_ulong ptr) > > abi_long do_arch_prctl(CPUX86State *env, int code, abi_ulong addr) > { > -return -ENOSYS; > +return -TARGET_ENOSYS; > } > #else > abi_long do_arch_prctl(CPUX86State *env, int code, abi_ulong addr) > Reviewed-by: Laurent Vivier
Re: [PATCH v4 24/30] qcow2: Clear the L2 bitmap when allocating a compressed cluster
24.04.2020 21:41, Alberto Garcia wrote: On Fri 24 Apr 2020 08:15:04 PM CEST, Vladimir Sementsov-Ogievskiy wrote: AFAIK, now compressed clusters can't be used in scenarios with guest, as qcow2 driver doesn't support rewriting them. You can write to those images just fine, it's just not efficient because you have to COW the compressed clusters. No, rewriting doesn't work: [root@kvm master]# ./qemu-img create -f qcow2 x 10M Formatting 'x', fmt=qcow2 size=10485760 cluster_size=65536 lazy_refcounts=off refcount_bits=16 [root@kvm master]# ./qemu-io -c 'write -c 0 64K' x wrote 65536/65536 bytes at offset 0 64 KiB, 1 ops; 00.23 sec (278.708 KiB/sec and 4.3548 ops/sec) [root@kvm master]# ./qemu-io -c 'write -c 0 64K' x write failed: Input/output error Or am I wrong? And we normally don't combine normal and compressed clusters together in one image. As soon as you start writing to an image with compressed clusters you'll have a combination of both. Ah, you mean, rewriting compressed clusters by normal.. So the use-case is just take compressed backup image and use it for the guest, instead of converting first.. It makes sense. But it's true that you don't have an image with compressed clusters if what you're looking for is performance. So I wouldn't add support for this if it complicates things too much. Berto -- Best regards, Vladimir