Re: [Qemu-devel] [Qemu-arm] [PATCH 1/2] hw/adc: Add basic Aspeed ADC model
On Wed, 2017-05-24 at 02:15 -0300, Philippe Mathieu-Daudé wrote: > Hi Andrew, > > On 05/22/2017 02:14 AM, Andrew Jeffery wrote: > > On Mon, 2017-05-22 at 03:15 +, Ryan Chen wrote: > > > In ASPEED SoC chip, all register access have following rule. > > > Most of controller write access is only support 32bit access. > > > Read is support 8bits/16bits/32bits. > > This makes sens thinking about how a DMA controller can take advantage > of the ADC. > > > > > Thanks for clearing that up Ryan. > > > > Phil: I'll rework the model so the reads are 16-bits. > > This shouldn't be necessary, QEMU is supposed to supports different > access size for different implemented size, so you can declare your > implementation as 32-bit and valid accesses from 8 to 32: > > static const MemoryRegionOps aspeed_adc_ops = { > .read = aspeed_adc_read, > .write = aspeed_adc_write, > .endianness = DEVICE_LITTLE_ENDIAN, > .valid.min_access_size = 1, > .valid.max_access_size = 4, > .valid.unaligned = false, > +.impl.min_access_size = 4, > +.impl.max_access_size = 4, > }; > > This way an I/O access from the CPU or a DMA could use 8/16-bit while > you keep a 32-bit implementation. The adjustment is done by > access_with_adjusted_size() from memory.c. > > Afaik there is, however, no distinction between read/write different > access size in current QEMU MemoryRegionOps model. Yep, I realised all of the above when I went to implement it. Thanks for pointing it out though! Andrew signature.asc Description: This is a digitally signed message part
Re: [Qemu-devel] [PATCH v2] qapi: Fix some QMP documentation regressions
Eric Blakewrites: > In the process of getting rid of docs/qmp-commands.txt, we managed > to regress on some of the text that changed after the point where > the move was first branched and when the move actually occurred. > For example, commit 3282eca for blockdev-snapshot re-added the > extra "options" layer which had been cleaned up in commit 0153d2f. > > This clears up all regressions identified over the range > 02b351d..bd6092e: > https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg05127.html > as well as a cleanup to x-blockdev-remove-medium to prefer > 'id' over 'device' (matching the cleanup for 'eject'). > > CC: qemu-triv...@nongnu.org > Signed-off-by: Eric Blake Reviewed-by: Markus Armbruster > --- > v2: add 'eject', 'x-blockdev-remove-medium' cleanups, thanks to > Markus' audit. > > Can go in either via trivial or via qapi tree I'll put it in my next pull unless -trivial picks it up first.
[Qemu-devel] [PATCH] Fix nmi injection failure when vcpu got blocked
From: ZhuangYanyingRecently I found NMI could not be injected to vm via libvirt API Reproduce the problem: 1 use guest of redhat 7.3 2 disable nmi_watchdog and trig spinlock deadlock inside the guest check the running vcpu thread, make sure not vcpu0 3 inject NMI into the guest via libvirt API "inject-nmi" Result: The NMI could not be injected into the guest. Reason: 1 It sets nmi_queued to 1 when calling ioctl KVM_NMI in qemu, and sets cpu->kvm_vcpu_dirty to true in do_inject_external_nmi() meanwhile. 2 It sets nmi_queued to 0 in process_nmi(), before entering guest, because cpu->kvm_vcpu_dirty is true. Normally, vcpu could call vcpu_enter_guest successfully to inject the NMI. However, in the problematic scenario, when the guest's threads hold spin_lock_irqsave for a long time, such as entering a while loop after spin_lock_irqsave(), other vcpus would enter into S state because of pvspinlock scheme, then KVM module will loop in vcpu_block rather than entry the guest. I think that it's not suitable to decide whether to stay in vcpu_block() or not just by checking nmi_queued, NMI should be injected immediately even at this situation. Solution: There're 2 ways to solve the problem: 1 call cpu_synchronize_state_not_set_dirty() rather than cpu_synchronize_state(), while injecting NMI, to avoid changing nmi_queued to 0. But other workqueues may affect cpu->kvm_vcpu_dirty, so it's not recommended. 2 add checking nmi_pending plus with nmi_queued in vm_vcpu_has_events() in KVM module. qemu_kvm_wait_io_event qemu_wait_io_event_common flush_queued_work do_inject_external_nmi cpu_synchronize_state kvm_cpu_synchronize_state do_kvm_cpu_synchronize_state cpu->kvm_vcpu_dirty = true;/* trigger process_nmi */ kvm_vcpu_ioctl(cpu, KVM_NMI) kvm_vcpu_ioctl_nmi kvm_inject_nmi atomic_inc(>arch.nmi_queued); nmi_queued = 1 /* nmi_queued set to 1, when qemu ioctl KVM_NMI */ kvm_make_request(KVM_REQ_NMI, vcpu); kvm_cpu_exec kvm_arch_put_registers(cpu, KVM_PUT_RUNTIME_STATE); kvm_arch_put_registers kvm_put_vcpu_events kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, ); kvm_vcpu_ioctl_x86_set_vcpu_events process_nmi(vcpu); vcpu->arch.nmi_pending += atomic_xchg(>arch.nmi_queued, 0); nmi_queued = 0 /* nmi_queued set to 0, vcpu thread always block */ nmi_pending = 1 kvm_make_request(KVM_REQ_EVENT, vcpu); kvm_vcpu_ioctl(cpu, KVM_RUN, 0); kvm_arch_vcpu_ioctl_run vcpu_run(vcpu); kvm_vcpu_running(vcpu) /* always false, could not call vcpu_enter_guest */ vcpu_block kvm_arch_vcpu_runnable kvm_vcpu_has_events if (atomic_read(>arch.nmi_queued)) /* nmi_queued is 0, vcpu thread always block*/ Signed-off-by: Zhuang Yanying --- arch/x86/kvm/x86.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 02363e3..96983dc 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8394,7 +8394,8 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu) if (vcpu->arch.pv.pv_unhalted) return true; - if (atomic_read(>arch.nmi_queued)) + if (vcpu->arch.nmi_pending || + atomic_read(>arch.nmi_queued)) return true; if (kvm_test_request(KVM_REQ_SMI, vcpu)) -- 1.8.3.1
Re: [Qemu-devel] [PATCH v2 09/13] vvfat: correctly create base short names for non-ASCII filenames
Hi Philippe, Le 24/05/2017 à 06:17, Philippe Mathieu-Daudé a écrit : Hi Hervé, On 05/22/2017 06:12 PM, Hervé Poussineau wrote: More specifically, create short name from filename and change blacklist of invalid chars to whitelist of valid chars. Windows 9x also now correctly see long file names of filenames containing a space, but Scandisk still complains about mismatch between SFN and LFN. Specification: "FAT: General overview of on-disk format" v1.03, pages 30-31 Signed-off-by: Hervé Poussineau--- block/vvfat.c | 105 ++ 1 file changed, 77 insertions(+), 28 deletions(-) diff --git a/block/vvfat.c b/block/vvfat.c index d34241da17..3cb65493cd 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -516,6 +516,81 @@ static void set_begin_of_direntry(direntry_t* direntry, uint32_t begin) direntry->begin_hi = cpu_to_le16((begin >> 16) & 0x); } +static uint8_t to_valid_short_char(gunichar c) +{ +c = g_unichar_toupper(c); +if ((c >= '0' && c <= '9') || +(c >= 'A' && c <= 'Z') || +strchr("$%'-_@~`!(){}^#&", c) != 0) { +return c; +} else { +return 0; +} +} + +static direntry_t *create_short_filename(BDRVVVFATState *s, + const char *filename) +{ +int i, j = 0; +direntry_t *entry = array_get_next(&(s->directory)); +const gchar *p, *last_dot = NULL; +gunichar c; +bool lossy_conversion = false; What is the purpose of this variable? Do you plan to use it later or it is just for debugging? I will use it later in patch 10/13, when generating numeric-tails of short names (~1, ~2...) +char tail[11]; This also looks like old debug code... Yes, good catch! I will remove it. Compiler didn't warn me. + +if (!entry) { +return NULL; +} +memset(entry->name, 0x20, sizeof(entry->name)); + +/* copy filename and search last dot */ +for (p = filename; ; p = g_utf8_next_char(p)) { +c = g_utf8_get_char(p); +if (c == '\0') { +break; +} else if (c == '.') { +if (j == 0) { +/* '.' at start of filename */ +lossy_conversion = true; +} else { +if (last_dot) { +lossy_conversion = true; +} +last_dot = p; +} +} else if (!last_dot) { +/* first part of the name; copy it */ +uint8_t v = to_valid_short_char(c); +if (j < 8 && v) { +entry->name[j++] = v; +} else { +lossy_conversion = true; +} +} +} + +/* copy extension (if any) */ +if (last_dot) { +j = 0; +for (p = g_utf8_next_char(last_dot); ; p = g_utf8_next_char(p)) { +c = g_utf8_get_char(p); +if (c == '\0') { +break; +} else { +/* extension; copy it */ +uint8_t v = to_valid_short_char(c); +if (j < 3 && v) { +entry->name[8 + (j++)] = v; +} else { +lossy_conversion = true; +} +} +} +} +(void)lossy_conversion; +return entry; See the (void)lossy_conversion, which was put on purpose. Those two lines will be replaced in patch 10/13. +} + /* fat functions */ static inline uint8_t fat_chksum(const direntry_t* entry) @@ -614,7 +689,7 @@ static inline void init_fat(BDRVVVFATState* s) static inline direntry_t* create_short_and_long_name(BDRVVVFATState* s, unsigned int directory_start, const char* filename, int is_dot) { -int i,j,long_index=s->directory.next; +int long_index = s->directory.next; direntry_t* entry = NULL; direntry_t* entry_long = NULL; @@ -626,33 +701,7 @@ static inline direntry_t* create_short_and_long_name(BDRVVVFATState* s, } entry_long=create_long_filename(s,filename); - -i = strlen(filename); -for(j = i - 1; j>0 && filename[j]!='.';j--); -if (j > 0) -i = (j > 8 ? 8 : j); -else if (i > 8) -i = 8; - -entry=array_get_next(&(s->directory)); -memset(entry->name, 0x20, sizeof(entry->name)); -memcpy(entry->name, filename, i); - -if (j > 0) { -for (i = 0; i < 3 && filename[j + 1 + i]; i++) { -entry->name[8 + i] = filename[j + 1 + i]; -} -} - -/* upcase & remove unwanted characters */ -for(i=10;i>=0;i--) { -if(i==10 || i==7) for(;i>0 && entry->name[i]==' ';i--); -if(entry->name[i]<=' ' || entry->name[i]>0x7f -|| strchr(".*?<>|\":/\\[];,+='",entry->name[i])) -entry->name[i]='_'; -else if(entry->name[i]>='a' && entry->name[i]<='z') -entry->name[i]+='A'-'a'; -} +entry = create_short_filename(s, filename); /* mangle duplicates */
Re: [Qemu-devel] [PATCH v5 1/4] net/rocker: Remove the dead error handling
Philippe Mathieu-Daudéwrites: > Hi Markus, > > On 05/23/2017 06:27 AM, Markus Armbruster wrote: > [...] >> There's one more cleanup opportunity: >> > [...] >>> if (pci_dma_read(dev, le64_to_cpu(info->desc.buf_addr), info->buf, >>> size)) { >>> return NULL; >>> } >> >> None of the pci_dma_read() calls outside rocker check the return value. >> Just as well, because it always returns 0. Please clean this up in a >> separate followup patch. > > It may be the correct way to do it but this sounds like we are missing > something somewhere... pci_dma_read() calls pci_dma_rw() which always > returns 0. Why not let it returns void? It is inlined and never used > by address. Else we should document why returning 0 is correct, and > what is the reason to not use a void prototype. > > pci_dma_rw() calls dma_memory_rw() which does return a boolean value, > false on success (MEMTX_OK) and true on error > (MEMTX_ERROR/DECODE_ERROR) PCI question. Michael, Marcel?
Re: [Qemu-devel] [PATCH v3 02/24] docker: add --include-files argument to 'build' command
On Wed, 05/24 13:21, Fam Zheng wrote: > > diff --git a/tests/docker/docker.py b/tests/docker/docker.py > > index 6ddc6e4c2a..68cca25f89 100755 > > --- a/tests/docker/docker.py > > +++ b/tests/docker/docker.py > > @@ -237,6 +237,11 @@ class BuildCommand(SubCommand): > > help="""Specify a binary that will be copied > > to the > > container together with all its dependent > > libraries""") > > +parser.add_argument("--extra-files", "-f", nargs='*', > > +help="""Specify files that will be sent to the > > +Docker daemon. The daemon will copy those > > files into > > +the built image. The ADD directive of the > > Dockerfile > > +specify where a file is placed into the > > image""") > > Or more precisely, "The daemon will copy these files into the docker build > directory, which can be copied to the built image with ADD directives in > Dockerfile"? Hmm, not copied by daemon, maybe "The files will be sent...". But you have the idea. Fam
Re: [Qemu-devel] [PATCH v3 00/24] docker/shippable: cross-build mipsel and powerpc targets
On Sun, 05/21 00:29, Philippe Mathieu-Daudé wrote: > This patchset add 2 more architectures to the cross-build farm. > > There is still some issue trying to cross-build mips64el-softmmu, it seems the > cross tools use the system outdated libfdt instead of the one checkouted in > the > dtc submodule. I disabled this target for now. > > The branch https://github.com/philmd/qemu/tree/docker_shippable_v3 can be > checked at Shippable: > https://app.shippable.com/github/philmd/qemu/status/dashboard > > Each arch builds in around ~9min. Using Shippable free open source projects > service, the 5 jobs take ~38-44min in total. > > v3: > - Addressed review feedbacks from Fam: > - Keep building images in various layers, but use > DEBIAN_FRONTEND=noninteractive > - Document '--extra-files', now it supports adding various files at once > - Checksum extra files to trigger a docker image rebuild if modified > - Use better regex to generate deb-src entries > - Reordered extra libs, to ease further add/remove diffs Looks good in general, thanks! I've had two small questions, apart from that it's ready to go. Fam
Re: [Qemu-devel] [PATCH v3 03/24] docker: rebuild image if 'extra files' checksum does not match
On Sun, 05/21 00:29, Philippe Mathieu-Daudé wrote: > Signed-off-by: Philippe Mathieu-Daudé> --- > tests/docker/docker.py | 15 --- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/tests/docker/docker.py b/tests/docker/docker.py > index eef92a54f1..0dd6a3304f 100755 > --- a/tests/docker/docker.py > +++ b/tests/docker/docker.py > @@ -38,6 +38,9 @@ def _text_checksum(text): > """Calculate a digest string unique to the text content""" > return hashlib.sha1(text).hexdigest() > > +def _file_checksum(filename): > +return _text_checksum(open(filename, 'rb').read()) > + > def _guess_docker_command(): > """ Guess a working docker command or raise exception if not found""" > commands = [["docker"], ["sudo", "-n", "docker"]] > @@ -154,7 +157,7 @@ class Docker(object): > return labels.get("com.qemu.dockerfile-checksum", "") > > def build_image(self, tag, docker_dir, dockerfile, > -quiet=True, user=False, argv=None): > +quiet=True, user=False, argv=None, extra_files_cksum=[]): > if argv == None: > argv = [] > > @@ -170,7 +173,7 @@ class Docker(object): > > tmp_df.write("\n") > tmp_df.write("LABEL com.qemu.dockerfile-checksum=%s" % > - _text_checksum(dockerfile)) > + _text_checksum(dockerfile + > "-".join(extra_files_cksum))) Since @dockerfile is the content of the the Dockerfile, how about concatenating with "\n": _text_checksum("\n".join([dockerfile] + extra_files_cksum)) ? > tmp_df.flush() > > self._do(["build", "-t", tag, "-f", tmp_df.name] + argv + \ > @@ -276,16 +279,22 @@ class BuildCommand(SubCommand): > return 1 > > # Include files used by ADD directives found within the > Dockerfile. > +cksum = [] > if args.include_executable: > +# FIXME: there is no checksum of this executable and the > linked > +# libraries, once the image built any change of this > executable > +# or any library won't trigger another build. > _copy_binary_with_libs(args.include_executable, docker_dir) > for filename in args.include_files or []: > _copy_with_mkdir(filename, docker_dir) > +cksum += [_file_checksum(filename)] > > argv += ["--build-arg=" + k.lower() + "=" + v > for k, v in os.environ.iteritems() > if k.lower() in FILTERED_ENV_NAMES] > dkr.build_image(tag, docker_dir, dockerfile, > -quiet=args.quiet, user=args.user, argv=argv) > +quiet=args.quiet, user=args.user, argv=argv, > +extra_files_cksum=cksum) > > rmtree(docker_dir) > > -- > 2.11.0 >
Re: [Qemu-devel] [PATCH v3 02/24] docker: add --include-files argument to 'build' command
On Sun, 05/21 00:29, Philippe Mathieu-Daudé wrote: > Signed-off-by: Philippe Mathieu-Daudé> --- > tests/docker/Makefile.include | 5 - > tests/docker/docker.py| 12 +--- > 2 files changed, 13 insertions(+), 4 deletions(-) > > diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include > index 03eda37bf4..fe1a9a53ff 100644 > --- a/tests/docker/Makefile.include > +++ b/tests/docker/Makefile.include > @@ -51,6 +51,7 @@ docker-image-%: $(DOCKER_FILES_DIR)/%.docker > $(SRC_PATH)/tests/docker/docker.py build qemu:$* $< \ > $(if $V,,--quiet) $(if $(NOCACHE),--no-cache) \ > $(if $(NOUSER),,--add-current-user) \ > + $(if $(EXTRA_FILES),--extra-files $(EXTRA_FILES))\ > $(if $(EXECUTABLE),--include-executable=$(EXECUTABLE)),\ > "BUILD","$*") > > @@ -107,6 +108,8 @@ docker: > @echo 'NOUSER Define to disable adding current user > to containers passwd.' > @echo 'NOCACHE=1Ignore cache when build images.' > @echo 'EXECUTABLE=Include executable in image.' > + @echo 'EXTRA_FILES=" [... ]"' > + @echo ' Include extra files in image.' > > # This rule if for directly running against an arbitrary docker target. > # It is called by the expanded docker targets (e.g. make > diff --git a/tests/docker/docker.py b/tests/docker/docker.py > index 6ddc6e4c2a..68cca25f89 100755 > --- a/tests/docker/docker.py > +++ b/tests/docker/docker.py > @@ -237,6 +237,11 @@ class BuildCommand(SubCommand): > help="""Specify a binary that will be copied to > the > container together with all its dependent > libraries""") > +parser.add_argument("--extra-files", "-f", nargs='*', > +help="""Specify files that will be sent to the > +Docker daemon. The daemon will copy those files > into > +the built image. The ADD directive of the > Dockerfile > +specify where a file is placed into the image""") Or more precisely, "The daemon will copy these files into the docker build directory, which can be copied to the built image with ADD directives in Dockerfile"? > parser.add_argument("--add-current-user", "-u", dest="user", > action="store_true", > help="Add the current user to image's passwd") > @@ -270,10 +275,11 @@ class BuildCommand(SubCommand): > print "%s exited with code %d" % (docker_pre, rc) > return 1 > > -# Do we include a extra binary? > +# Include files used by ADD directives found within the > Dockerfile. > if args.include_executable: > -_copy_binary_with_libs(args.include_executable, > - docker_dir) > +_copy_binary_with_libs(args.include_executable, docker_dir) > +for filename in args.extra_files or []: > +_copy_with_mkdir(filename, docker_dir) > > argv += ["--build-arg=" + k.lower() + "=" + v > for k, v in os.environ.iteritems() > -- > 2.11.0 >
Re: [Qemu-devel] [Qemu-arm] [PATCH 1/2] hw/adc: Add basic Aspeed ADC model
Hi Andrew, On 05/22/2017 02:14 AM, Andrew Jeffery wrote: On Mon, 2017-05-22 at 03:15 +, Ryan Chen wrote: In ASPEED SoC chip, all register access have following rule. Most of controller write access is only support 32bit access. Read is support 8bits/16bits/32bits. This makes sens thinking about how a DMA controller can take advantage of the ADC. Thanks for clearing that up Ryan. Phil: I'll rework the model so the reads are 16-bits. This shouldn't be necessary, QEMU is supposed to supports different access size for different implemented size, so you can declare your implementation as 32-bit and valid accesses from 8 to 32: static const MemoryRegionOps aspeed_adc_ops = { .read = aspeed_adc_read, .write = aspeed_adc_write, .endianness = DEVICE_LITTLE_ENDIAN, .valid.min_access_size = 1, .valid.max_access_size = 4, .valid.unaligned = false, +.impl.min_access_size = 4, +.impl.max_access_size = 4, }; This way an I/O access from the CPU or a DMA could use 8/16-bit while you keep a 32-bit implementation. The adjustment is done by access_with_adjusted_size() from memory.c. Afaik there is, however, no distinction between read/write different access size in current QEMU MemoryRegionOps model. Thanks, Andrew Best Regards, Ryan 信驊科技股份有限公司 ASPEED Technology Inc. 2F,No.15,Industry East Road 4.,Hsinchu Science Park, Hsinchu City 30077, Taiwan Tel: 886-3-578-9568 #857 Fax: 886-3-578-9586 * Email Confidentiality Notice DISCLAIMER: This message (and any attachments) may contain legally privileged and/or other confidential information. If you have received it in error, please notify the sender by reply e-mail and immediately delete the e-mail and any attachments without copying or disclosing the contents. Thank you. -Original Message- From: Andrew Jeffery [mailto:and...@aj.id.au] Sent: Monday, May 22, 2017 8:24 AM To: Philippe Mathieu-Daudé; qemu-...@nongnu.org Cc: Peter Maydell ; Ryan Chen ; Alistair Francis ; qemu-devel@nongnu.org; Cédric Le Goater ; Joel Stanley Subject: Re: [Qemu-arm] [PATCH 1/2] hw/adc: Add basic Aspeed ADC model Hi Phil, On Sat, 2017-05-20 at 00:21 -0300, Philippe Mathieu-Daudé wrote: Hi Andrew, On 05/19/2017 09:26 PM, Andrew Jeffery wrote: This model implements enough behaviour to do basic functionality tests such as device initialisation and read out of dummy sample values. The sample value generation strategy is similar to the STM ADC already in the tree. Signed-off-by: Andrew Jeffery --- hw/adc/Makefile.objs| 1 + hw/adc/aspeed_adc.c | 246 include/hw/adc/aspeed_adc.h | 33 ++ 3 files changed, 280 insertions(+) create mode 100644 hw/adc/aspeed_adc.c create mode 100644 include/hw/adc/aspeed_adc.h diff --git a/hw/adc/Makefile.objs b/hw/adc/Makefile.objs index 3f6dfdedaec7..2bf9362ba3c4 100644 --- a/hw/adc/Makefile.objs +++ b/hw/adc/Makefile.objs @@ -1 +1,2 @@ obj-$(CONFIG_STM32F2XX_ADC) += stm32f2xx_adc.o +obj-$(CONFIG_ASPEED_SOC) += aspeed_adc.o diff --git a/hw/adc/aspeed_adc.c b/hw/adc/aspeed_adc.c new file mode 100644 index ..d08f1684f7bc --- /dev/null +++ b/hw/adc/aspeed_adc.c @@ -0,0 +1,246 @@ +/* + * Aspeed ADC + * + * Andrew Jeffery + * + * Copyright 2017 IBM Corp. + * + * This code is licensed under the GPL version 2 or later. See + * the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "hw/adc/aspeed_adc.h" +#include "qapi/error.h" +#include "qemu/log.h" + +#define ASPEED_ADC_ENGINE_CTRL 0x00 #define +ASPEED_ADC_ENGINE_CH_EN_MASK 0x #define +ASPEED_ADC_ENGINE_CH_EN(x)((BIT(x)) << 16) #define +ASPEED_ADC_ENGINE_INIT BIT(8) #define +ASPEED_ADC_ENGINE_AUTO_COMPBIT(5) #define +ASPEED_ADC_ENGINE_COMP BIT(4) #define +ASPEED_ADC_ENGINE_MODE_MASK0x000e #define +ASPEED_ADC_ENGINE_MODE_OFF(0b000 << 1) #define +ASPEED_ADC_ENGINE_MODE_STANDBY(0b001 << 1) #define +ASPEED_ADC_ENGINE_MODE_NORMAL (0b111 << 1) #define +ASPEED_ADC_ENGINE_EN BIT(0) + +#define ASPEED_ADC_L_MASK ((1 << 10) - 1) #define +ASPEED_ADC_L(x) ((x) & ASPEED_ADC_L_MASK) #define +ASPEED_ADC_H(x) (((x) >> 16) & ASPEED_ADC_L_MASK) #define +ASPEED_ADC_LH_MASK (ASPEED_ADC_L_MASK << 16 | +ASPEED_ADC_L_MASK) + +static inline uint32_t update_channels(uint32_t current) { +const uint32_t next = (current + 7) & 0x3ff; + +return (next << 16) | next; +} + +static bool breaks_threshold(AspeedADCState *s, int ch_off) { +const uint32_t a = ASPEED_ADC_L(s->channels[ch_off]); +const uint32_t a_lower = ASPEED_ADC_L(s->bounds[2 * ch_off]); +
[Qemu-devel] [PATCH v3 3/3] spapr: Fix migration of Radix guests
Fix migration of radix guests by ensuring that we issue KVM_PPC_CONFIGURE_V3_MMU for radix case post migration. Reported-by: Nageswara R SastrySigned-off-by: Bharata B Rao --- hw/ppc/spapr.c | 12 1 file changed, 12 insertions(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index daf335c..ea14bed 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -1400,6 +1400,18 @@ static int spapr_post_load(void *opaque, int version_id) err = spapr_rtc_import_offset(>rtc, spapr->rtc_offset); } +if (spapr->patb_entry) { +PowerPCCPU *cpu = POWERPC_CPU(first_cpu); +bool radix = !!(spapr->patb_entry & PATBE1_GR); +bool gtse = !!(cpu->env.spr[SPR_LPCR] & LPCR_GTSE); + +err = kvmppc_configure_v3_mmu(cpu, radix, gtse, spapr->patb_entry); +if (err) { +error_report("Process table config unsupported by the host"); +return -EINVAL; +} +} + return err; } -- 2.7.4
[Qemu-devel] [PATCH v3 1/3] migration: Introduce unregister_savevm_live()
Introduce a new function unregister_savevm_live() to unregister the vmstate handlers registered via register_savevm_live(). register_savevm() allocates SaveVMHandlers while register_savevm_live() gets passed with SaveVMHandlers. During unregistration, we want to free SaveVMHandlers in the former case but not free in the latter case. Hence this new API is needed to differentiate this. This new API will be needed by PowerPC to unregister the HTAB savevm handlers. Signed-off-by: Bharata B RaoReviewed-by: David Gibson Cc: Juan Quintela Cc: Dr. David Alan Gilbert --- include/migration/vmstate.h | 1 + migration/savevm.c | 17 +++-- 2 files changed, 16 insertions(+), 2 deletions(-) diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h index 8489659..02a1bac 100644 --- a/include/migration/vmstate.h +++ b/include/migration/vmstate.h @@ -79,6 +79,7 @@ int register_savevm_live(DeviceState *dev, void *opaque); void unregister_savevm(DeviceState *dev, const char *idstr, void *opaque); +void unregister_savevm_live(DeviceState *dev, const char *idstr, void *opaque); typedef struct VMStateInfo VMStateInfo; typedef struct VMStateDescription VMStateDescription; diff --git a/migration/savevm.c b/migration/savevm.c index f5e8194..4ef6fdc 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -630,7 +630,8 @@ int register_savevm(DeviceState *dev, ops, opaque); } -void unregister_savevm(DeviceState *dev, const char *idstr, void *opaque) +static void unregister_savevm_common(DeviceState *dev, const char *idstr, + void *opaque, bool free_savevmhandlers) { SaveStateEntry *se, *new_se; char id[256] = ""; @@ -649,12 +650,24 @@ void unregister_savevm(DeviceState *dev, const char *idstr, void *opaque) if (strcmp(se->idstr, id) == 0 && se->opaque == opaque) { QTAILQ_REMOVE(_state.handlers, se, entry); g_free(se->compat); -g_free(se->ops); +if (free_savevmhandlers) { +g_free(se->ops); +} g_free(se); } } } +void unregister_savevm(DeviceState *dev, const char *idstr, void *opaque) +{ +unregister_savevm_common(dev, idstr, opaque, true); +} + +void unregister_savevm_live(DeviceState *dev, const char *idstr, void *opaque) +{ +unregister_savevm_common(dev, idstr, opaque, false); +} + int vmstate_register_with_alias_id(DeviceState *dev, int instance_id, const VMStateDescription *vmsd, void *opaque, int alias_id, -- 2.7.4
[Qemu-devel] [PATCH v3 0/3] ppc/spapr: Fix migration of radix guests
This patchset fixes the migration of sPAPR radix guests. Changes in v3: -- - Dropped the patch that made h_register_process_table flags global as it is not required with the changed logic. - The post load logic is now same for both TCG as well as KVM radix guests. Also ensured that radix and future v3 hash migration post load is treated similarly. - Ensure TCG Radix guest migration isn't broken. v2: https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg04491.html Bharata B Rao (3): migration: Introduce unregister_savevm_live() spapr: Unregister HPT savevm handlers for radix guests spapr: Fix migration of Radix guests hw/ppc/spapr.c | 111 +--- include/hw/ppc/spapr.h | 2 + include/migration/vmstate.h | 1 + migration/savevm.c | 17 ++- 4 files changed, 80 insertions(+), 51 deletions(-) -- 2.7.4
[Qemu-devel] [PATCH v3 2/3] spapr: Unregister HPT savevm handlers for radix guests
HPT gets created by default for TCG guests and later when the guest turns out to be a radix guest, the HPT is destroyed when guest does H_REGISTER_PROC_TBL hcall. Let HTAB savevm handlers registration and unregistration follow the same model so that we don't end up having unrequired HTAB savevm handlers for radix guests. This also ensures that HTAB savevm handlers seemlessly get destroyed and recreated like HTAB itself when hash guest reboots. HTAB savevm handlers registration/unregistration is now done from spapr_reallocate_hpt() which itself is called from one of the savevm_htab_handlers.htab_load(). To cater to this circular dependency spapr_reallocate_hpt() is made global. Signed-off-by: Bharata B RaoReviewed-by: David Gibson --- hw/ppc/spapr.c | 99 +- include/hw/ppc/spapr.h | 2 + 2 files changed, 52 insertions(+), 49 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 91f7434..daf335c 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -1233,53 +1233,7 @@ void spapr_free_hpt(sPAPRMachineState *spapr) spapr->htab = NULL; spapr->htab_shift = 0; close_htab_fd(spapr); -} - -static void spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift, - Error **errp) -{ -long rc; - -/* Clean up any HPT info from a previous boot */ -spapr_free_hpt(spapr); - -rc = kvmppc_reset_htab(shift); -if (rc < 0) { -/* kernel-side HPT needed, but couldn't allocate one */ -error_setg_errno(errp, errno, - "Failed to allocate KVM HPT of order %d (try smaller maxmem?)", - shift); -/* This is almost certainly fatal, but if the caller really - * wants to carry on with shift == 0, it's welcome to try */ -} else if (rc > 0) { -/* kernel-side HPT allocated */ -if (rc != shift) { -error_setg(errp, - "Requested order %d HPT, but kernel allocated order %ld (try smaller maxmem?)", - shift, rc); -} - -spapr->htab_shift = shift; -spapr->htab = NULL; -} else { -/* kernel-side HPT not needed, allocate in userspace instead */ -size_t size = 1ULL << shift; -int i; - -spapr->htab = qemu_memalign(size, size); -if (!spapr->htab) { -error_setg_errno(errp, errno, - "Could not allocate HPT of order %d", shift); -return; -} - -memset(spapr->htab, 0, size); -spapr->htab_shift = shift; - -for (i = 0; i < size / HASH_PTE_SIZE_64; i++) { -DIRTY_HPTE(HPTE(spapr->htab, i)); -} -} +unregister_savevm_live(NULL, "spapr/htab", spapr); } void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr) @@ -1879,6 +1833,55 @@ static SaveVMHandlers savevm_htab_handlers = { .load_state = htab_load, }; +void spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift, + Error **errp) +{ +long rc; + +/* Clean up any HPT info from a previous boot */ +spapr_free_hpt(spapr); + +rc = kvmppc_reset_htab(shift); +if (rc < 0) { +/* kernel-side HPT needed, but couldn't allocate one */ +error_setg_errno(errp, errno, + "Failed to allocate KVM HPT of order %d (try smaller maxmem?)", + shift); +/* This is almost certainly fatal, but if the caller really + * wants to carry on with shift == 0, it's welcome to try */ +} else if (rc > 0) { +/* kernel-side HPT allocated */ +if (rc != shift) { +error_setg(errp, + "Requested order %d HPT, but kernel allocated order %ld (try smaller maxmem?)", + shift, rc); +} + +spapr->htab_shift = shift; +spapr->htab = NULL; +} else { +/* kernel-side HPT not needed, allocate in userspace instead */ +size_t size = 1ULL << shift; +int i; + +spapr->htab = qemu_memalign(size, size); +if (!spapr->htab) { +error_setg_errno(errp, errno, + "Could not allocate HPT of order %d", shift); +return; +} + +memset(spapr->htab, 0, size); +spapr->htab_shift = shift; + +for (i = 0; i < size / HASH_PTE_SIZE_64; i++) { +DIRTY_HPTE(HPTE(spapr->htab, i)); +} +} +register_savevm_live(NULL, "spapr/htab", -1, 1, + _htab_handlers, spapr); +} + static void spapr_boot_set(void *opaque, const char *boot_device, Error **errp) { @@ -2341,8 +2344,6 @@ static void ppc_spapr_init(MachineState *machine) * interface, this is a legacy from the sPAPREnvironment structure * which predated MachineState but had a
Re: [Qemu-devel] [PATCH v3 15/24] docker: add powerpc build target
Hi Alex, On 05/22/2017 11:08 AM, Alex Bennée wrote: Philippe Mathieu-Daudéwrites: Signed-off-by: Philippe Mathieu-Daudé --- tests/docker/Makefile.include | 4 +-- .../docker/dockerfiles/debian-powerpc-cross.docker | 40 ++ 2 files changed, 42 insertions(+), 2 deletions(-) create mode 100644 tests/docker/dockerfiles/debian-powerpc-cross.docker diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include index 111b8090b2..9815976486 100644 --- a/tests/docker/Makefile.include +++ b/tests/docker/Makefile.include @@ -56,11 +56,13 @@ docker-image-%: $(DOCKER_FILES_DIR)/%.docker "BUILD","$*") docker-image-debian-mipsel-cross: EXTRA_FILES:=tests/docker/dockerfiles/debian-apt-fake.sh +docker-image-debian-powerpc-cross: EXTRA_FILES:=tests/docker/dockerfiles/debian-apt-fake.sh # Enforce dependancies for composite images docker-image-debian-armhf-cross: docker-image-debian docker-image-debian-arm64-cross: docker-image-debian docker-image-debian-mipsel-cross: docker-image-debian +docker-image-debian-powerpc-cross: docker-image-debian # Expand all the pre-requistes for each docker image and test combination $(foreach i,$(DOCKER_IMAGES), \ @@ -111,8 +113,6 @@ docker: @echo 'NOUSER Define to disable adding current user to containers passwd.' @echo 'NOCACHE=1Ignore cache when build images.' @echo 'EXECUTABLE=Include executable in image.' - @echo 'EXTRA_FILES=" [... ]"' - @echo ' Include extra files in image.' I'm fairly sure you didn't want to do this. Ups, good catch :) Rebase mistake. Are you Ok with those 2 lines back? (I think since it is pretty much a copy/paste of mipsel Dockerfile). I'll send fixed and your other r-b after Fam review, thanks! # This rule if for directly running against an arbitrary docker target. # It is called by the expanded docker targets (e.g. make diff --git a/tests/docker/dockerfiles/debian-powerpc-cross.docker b/tests/docker/dockerfiles/debian-powerpc-cross.docker new file mode 100644 index 00..fa2cc7a657 --- /dev/null +++ b/tests/docker/dockerfiles/debian-powerpc-cross.docker @@ -0,0 +1,40 @@ +# +# Docker powerpc cross-compiler target +# +# This docker target builds on the base debian image. +# +FROM qemu:debian +MAINTAINER Philippe Mathieu-Daudé + +# Add the foreign architecture we want and install dependencies +RUN dpkg --add-architecture powerpc +RUN apt-get update +RUN DEBIAN_FRONTEND=noninteractive eatmydata \ +apt-get install -y --no-install-recommends \ +crossbuild-essential-powerpc + +# to fix "following packages have unmet dependencies" ... +ADD debian-apt-fake.sh /usr/local/bin/apt-fake +RUN apt-get install -y --no-install-recommends \ +equivs \ +pkg-config +RUN apt-fake install \ +pkg-config:powerpc=0.28-1.1-fake && \ +ln -s pkg-config /usr/bin/powerpc-linux-gnu-pkg-config +ENV PKG_CONFIG_PATH /usr/lib/powerpc-linux-gnu/pkgconfig +# + +# Specify the cross prefix for this image (see tests/docker/common.rc) +ENV QEMU_CONFIGURE_OPTS --cross-prefix=powerpc-linux-gnu- + +RUN DEBIAN_FRONTEND=noninteractive eatmydata \ +apt-get build-dep -yy -a powerpc qemu +RUN DEBIAN_FRONTEND=noninteractive \ +apt-get install -y --no-install-recommends \ +glusterfs-common:powerpc \ +libbz2-dev:powerpc \ +liblzo2-dev:powerpc \ +libncursesw5-dev:powerpc \ +libnfs-dev:powerpc \ +librdmacm-dev:powerpc \ +libsnappy-dev:powerpc -- Alex Bennée
Re: [Qemu-devel] [PATCH v5 2/4] net/rocker: Plug memory leak in pci_rocker_init()
On 05/23/2017 01:04 AM, Mao Zhongyi wrote: pci_rocker_init() leaks a World when the name more than 9 chars, then return a negative value directly, doesn't make a correct cleanup. So add a new goto label to fix it. Signed-off-by: Mao ZhongyiReviewed-by: Markus Armbruster Reviewed-by: Philippe Mathieu-Daudé --- hw/net/rocker/rocker.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/hw/net/rocker/rocker.c b/hw/net/rocker/rocker.c index d01ba9d..6d44f37 100644 --- a/hw/net/rocker/rocker.c +++ b/hw/net/rocker/rocker.c @@ -1357,7 +1357,8 @@ static int pci_rocker_init(PCIDevice *dev) fprintf(stderr, "rocker: name too long; please shorten to at most %d chars\n", MAX_ROCKER_NAME_LEN); -return -EINVAL; +err = -EINVAL; +goto err_name_too_long; } if (memcmp(>fp_start_macaddr, , sizeof(zero)) == 0) { @@ -1416,6 +1417,7 @@ static int pci_rocker_init(PCIDevice *dev) return 0; +err_name_too_long: err_duplicate: rocker_msix_uninit(r); err_msix_init:
Re: [Qemu-devel] [PATCH v5 3/4] net/rocker: Convert to realize()
On 05/23/2017 01:04 AM, Mao Zhongyi wrote: The rocker device still implements the old PCIDeviceClass .init() instead of the new .realize(). All devices need to be converted to .realize(). .init() reports errors with fprintf() and return 0 on success, negative number on failure. Meanwhile, when -device rocker fails, it first report a specific error, then a generic one, like this: $ x86_64-softmmu/qemu-system-x86_64 -device rocker,name=qemu-rocker rocker: name too long; please shorten to at most 9 chars qemu-system-x86_64: -device rocker,name=qemu-rocker: Device initialization failed Now, convert it to .realize() that passes errors to its callers via its errp argument. Also avoid the superfluous second error message. After the patch, effect like this: $ x86_64-softmmu/qemu-system-x86_64 -device rocker,name=qemu-rocker qemu-system-x86_64: -device rocker,name=qemu-rocker: name too long; please shorten to at most 9 chars Signed-off-by: Mao ZhongyiReviewed-by: Markus Armbruster Reviewed-by: Philippe Mathieu-Daudé --- hw/net/rocker/rocker.c | 27 +++ 1 file changed, 11 insertions(+), 16 deletions(-) diff --git a/hw/net/rocker/rocker.c b/hw/net/rocker/rocker.c index 6d44f37..2764529 100644 --- a/hw/net/rocker/rocker.c +++ b/hw/net/rocker/rocker.c @@ -1238,20 +1238,18 @@ rollback: return err; } -static int rocker_msix_init(Rocker *r) +static int rocker_msix_init(Rocker *r, Error **errp) { PCIDevice *dev = PCI_DEVICE(r); int err; -Error *local_err = NULL; err = msix_init(dev, ROCKER_MSIX_VEC_COUNT(r->fp_ports), >msix_bar, ROCKER_PCI_MSIX_BAR_IDX, ROCKER_PCI_MSIX_TABLE_OFFSET, >msix_bar, ROCKER_PCI_MSIX_BAR_IDX, ROCKER_PCI_MSIX_PBA_OFFSET, -0, _err); +0, errp); if (err) { -error_report_err(local_err); return err; } @@ -1287,7 +1285,7 @@ static World *rocker_world_type_by_name(Rocker *r, const char *name) return NULL; } -static int pci_rocker_init(PCIDevice *dev) +static void pci_rocker_realize(PCIDevice *dev, Error **errp) { Rocker *r = to_rocker(dev); const MACAddr zero = { .a = { 0, 0, 0, 0, 0, 0 } }; @@ -1305,10 +1303,9 @@ static int pci_rocker_init(PCIDevice *dev) r->world_dflt = rocker_world_type_by_name(r, r->world_name); if (!r->world_dflt) { -fprintf(stderr, -"rocker: requested world \"%s\" does not exist\n", +error_setg(errp, +"invalid argument requested world %s does not exist", r->world_name); -err = -EINVAL; goto err_world_type_by_name; } @@ -1328,7 +1325,7 @@ static int pci_rocker_init(PCIDevice *dev) /* MSI-X init */ -err = rocker_msix_init(r); +err = rocker_msix_init(r, errp); if (err) { goto err_msix_init; } @@ -1340,7 +1337,7 @@ static int pci_rocker_init(PCIDevice *dev) } if (rocker_find(r->name)) { -err = -EEXIST; +error_setg(errp, "%s already exists", r->name); goto err_duplicate; } @@ -1354,10 +1351,9 @@ static int pci_rocker_init(PCIDevice *dev) #define ROCKER_IFNAMSIZ 16 #define MAX_ROCKER_NAME_LEN (ROCKER_IFNAMSIZ - 1 - 3 - 3) if (strlen(r->name) > MAX_ROCKER_NAME_LEN) { -fprintf(stderr, -"rocker: name too long; please shorten to at most %d chars\n", +error_setg(errp, +"name too long; please shorten to at most %d chars", MAX_ROCKER_NAME_LEN); -err = -EINVAL; goto err_name_too_long; } @@ -1415,7 +1411,7 @@ static int pci_rocker_init(PCIDevice *dev) QLIST_INSERT_HEAD(, r, next); -return 0; +return; err_name_too_long: err_duplicate: @@ -1429,7 +1425,6 @@ err_world_type_by_name: world_free(r->worlds[i]); } } -return err; } static void pci_rocker_uninit(PCIDevice *dev) @@ -1514,7 +1509,7 @@ static void rocker_class_init(ObjectClass *klass, void *data) DeviceClass *dc = DEVICE_CLASS(klass); PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); -k->init = pci_rocker_init; +k->realize = pci_rocker_realize; k->exit = pci_rocker_uninit; k->vendor_id = PCI_VENDOR_ID_REDHAT; k->device_id = PCI_DEVICE_ID_REDHAT_ROCKER;
Re: [Qemu-devel] [PATCH v5 4/4] net/rocker: Fix the unusual macro name
On 05/23/2017 01:04 AM, Mao Zhongyi wrote: Suggested-by: Markus ArmbrusterSigned-off-by: Mao Zhongyi Reviewed-by: Philippe Mathieu-Daudé --- hw/net/rocker/rocker.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/hw/net/rocker/rocker.c b/hw/net/rocker/rocker.c index 2764529..f8a32f7 100644 --- a/hw/net/rocker/rocker.c +++ b/hw/net/rocker/rocker.c @@ -69,10 +69,10 @@ struct rocker { QLIST_ENTRY(rocker) next; }; -#define ROCKER "rocker" +#define TYPE_ROCKER "rocker" -#define to_rocker(obj) \ -OBJECT_CHECK(Rocker, (obj), ROCKER) +#define ROCKER(obj) \ +OBJECT_CHECK(Rocker, (obj), TYPE_ROCKER) static QLIST_HEAD(, rocker) rockers; @@ -1287,7 +1287,7 @@ static World *rocker_world_type_by_name(Rocker *r, const char *name) static void pci_rocker_realize(PCIDevice *dev, Error **errp) { -Rocker *r = to_rocker(dev); +Rocker *r = ROCKER(dev); const MACAddr zero = { .a = { 0, 0, 0, 0, 0, 0 } }; const MACAddr dflt = { .a = { 0x52, 0x54, 0x00, 0x12, 0x35, 0x01 } }; static int sw_index; @@ -1333,7 +1333,7 @@ static void pci_rocker_realize(PCIDevice *dev, Error **errp) /* validate switch properties */ if (!r->name) { -r->name = g_strdup(ROCKER); +r->name = g_strdup(TYPE_ROCKER); } if (rocker_find(r->name)) { @@ -1429,7 +1429,7 @@ err_world_type_by_name: static void pci_rocker_uninit(PCIDevice *dev) { -Rocker *r = to_rocker(dev); +Rocker *r = ROCKER(dev); int i; QLIST_REMOVE(r, next); @@ -1462,7 +1462,7 @@ static void pci_rocker_uninit(PCIDevice *dev) static void rocker_reset(DeviceState *dev) { -Rocker *r = to_rocker(dev); +Rocker *r = ROCKER(dev); int i; for (i = 0; i < ROCKER_WORLD_TYPE_MAX; i++) { @@ -1500,7 +1500,7 @@ static Property rocker_properties[] = { }; static const VMStateDescription rocker_vmsd = { -.name = ROCKER, +.name = TYPE_ROCKER, .unmigratable = 1, }; @@ -1523,7 +1523,7 @@ static void rocker_class_init(ObjectClass *klass, void *data) } static const TypeInfo rocker_info = { -.name = ROCKER, +.name = TYPE_ROCKER, .parent= TYPE_PCI_DEVICE, .instance_size = sizeof(Rocker), .class_init= rocker_class_init,
Re: [Qemu-devel] [PATCH v5 1/4] net/rocker: Remove the dead error handling
Hi Markus, On 05/23/2017 06:27 AM, Markus Armbruster wrote: [...] There's one more cleanup opportunity: [...] if (pci_dma_read(dev, le64_to_cpu(info->desc.buf_addr), info->buf, size)) { return NULL; } None of the pci_dma_read() calls outside rocker check the return value. Just as well, because it always returns 0. Please clean this up in a separate followup patch. It may be the correct way to do it but this sounds like we are missing something somewhere... pci_dma_read() calls pci_dma_rw() which always returns 0. Why not let it returns void? It is inlined and never used by address. Else we should document why returning 0 is correct, and what is the reason to not use a void prototype. pci_dma_rw() calls dma_memory_rw() which does return a boolean value, false on success (MEMTX_OK) and true on error (MEMTX_ERROR/DECODE_ERROR) Regards, Phil.
Re: [Qemu-devel] [PATCH v2 13/13] vvfat: change OEM name to 'MSWIN4.1'
On 05/23/2017 04:41 PM, Hervé Poussineau wrote: Hi Philippe, Le 23/05/2017 à 06:23, Philippe Mathieu-Daudé a écrit : Hi Hervé, On 05/22/2017 06:12 PM, Hervé Poussineau wrote: According to specification: "'MSWIN4.1' is the recommanded setting, because it is the setting least likely to cause compatibility problems. If you want to put something else in here, that is your option, but the result may be that some FAT drivers might not recognize the volume." It also says "Typically this is some indication of what system formatted the volume." From https://technet.microsoft.com/en-us/library/cc976796.aspx: "Following the jump instruction is the 8-byte OEM ID, a string of characters that identifies the name and version number of the operating system that formatted the volume. To preserve compatibility with MS-DOS, Windows 2000 records "MSDOS5.0" in this field on FAT16 and FAT32 disks. On NTFS disks, Windows 2000 records "NTFS." You may also see the OEM ID "MSWIN4.0" on disks formatted by Windows 95 and "MSWIN4.1" on disks formatted by Windows 95 OSR2 and Windows 98. Windows 2000 does not use the OEM ID field in the boot sector except for verifying NTFS volumes." I'm curious which one is the most up-to-date, the specification v1.03 or a Windows 2000. Do you have an idea if there is more MSDOS/W2K vs W95/98 users? Hopefully W95 is a Long time no see to me. You think having "QEMU" OEM does more harm than good? That's complicated. Indeed, OEM ID should be the name of the OS/tool which formatted the partition. However, some FAT drivers take different paths according to OEM ID. See for example: https://jdebp.eu/FGA/volume-boot-block-oem-name-field.html "Linux (util-linux up to at least version 2.12) ... [first characters are QEMU] Treat the volume as unformatted" :facepalm: http://seasip.info./Misc/oemid.html So, in my opinion, it should be better to stick to some specification for the whole FAT format, so we have a reference document to know if there is a bug in the code or not. Once explained, that's fine :) Can you add those references in the code? Eventually add a #define VOLUME_BOOT_BLOCKS_OEM_NAME with a fast explanation and those urls. With that: Reviewed-by: Philippe Mathieu-DaudéFAT specification 1.03 is dated December 6, 2000, while Windows 2000 has been released December 15, 1999, so both are around the same date. FAT specification gives all details about all disk structures, so I think that's better to stick to what it says. So to answer your question, and knowing the tricks played with OEM ID, I think that's better to use "MSWIN4.1" than "QEMU". Regards, Hervé Specification: "FAT: General overview of on-disk format" v1.03, page 9 Signed-off-by: Hervé Poussineau --- block/vvfat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/vvfat.c b/block/vvfat.c index 53e8faa54c..1f7f46ecea 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -1024,7 +1024,7 @@ static int init_directories(BDRVVVFATState* s, bootsector->jump[0]=0xeb; bootsector->jump[1]=0x3e; bootsector->jump[2]=0x90; -memcpy(bootsector->name,"QEMU",8); +memcpy(bootsector->name, "MSWIN4.1", 8); bootsector->sector_size=cpu_to_le16(0x200); bootsector->sectors_per_cluster=s->sectors_per_cluster; bootsector->reserved_sectors=cpu_to_le16(1); Regards, Phil.
Re: [Qemu-devel] [PATCH v2 12/13] vvfat: handle KANJI lead byte 0xe5
Hi Hervé, You explained in the cover "fix problems detected by disk checking utilities in read-only mode", do you think it would be doable to have unit-tests for those corner cases? On 05/22/2017 06:12 PM, Hervé Poussineau wrote: Specification: "FAT: General overview of on-disk format" v1.03, page 23 Signed-off-by: Hervé Poussineau--- block/vvfat.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/block/vvfat.c b/block/vvfat.c index 5376659010..53e8faa54c 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -590,6 +590,10 @@ static direntry_t *create_short_filename(BDRVVVFATState *s, } } +if (entry->name[0] == 0xe5) { This deserves a comment, and a #define would also be welcomed. +entry->name[0] = 0x05; +} + /* numeric-tail generation */ for (j = 0; j < 8; j++) { if (entry->name[j] == ' ') { @@ -710,8 +714,6 @@ static inline void init_fat(BDRVVVFATState* s) } -/* TODO: in create_short_filename, 0xe5->0x05 is not yet handled! */ -/* TODO: in parse_short_filename, 0x05->0xe5 is not yet handled! */ static inline direntry_t* create_short_and_long_name(BDRVVVFATState* s, unsigned int directory_start, const char* filename, int is_dot) { @@ -1744,6 +1746,9 @@ static int parse_short_name(BDRVVVFATState* s, } else lfn->name[i + j + 1] = '\0'; +if (lfn->name[0] == 0x05) { +lfn->name[0] = 0xe5; +} lfn->len = strlen((char*)lfn->name); return 0;
Re: [Qemu-devel] [PATCH] net/rocker: Cleanup the useless return value check
On 05/24/2017 11:31 AM, Jason Wang wrote: On 2017年05月24日 10:57, Mao Zhongyi wrote: None of pci_dma_read()'s callers check the return value except rocker. There is no need to check it because it always return 0. So the check work is useless. Remove it entirely. Suggested-by: Markus ArmbrusterSigned-off-by: Mao Zhongyi --- [...] Applied, thanks. Thanks:)
Re: [Qemu-devel] [PATCH v2 09/13] vvfat: correctly create base short names for non-ASCII filenames
Hi Hervé, On 05/22/2017 06:12 PM, Hervé Poussineau wrote: More specifically, create short name from filename and change blacklist of invalid chars to whitelist of valid chars. Windows 9x also now correctly see long file names of filenames containing a space, but Scandisk still complains about mismatch between SFN and LFN. Specification: "FAT: General overview of on-disk format" v1.03, pages 30-31 Signed-off-by: Hervé Poussineau--- block/vvfat.c | 105 ++ 1 file changed, 77 insertions(+), 28 deletions(-) diff --git a/block/vvfat.c b/block/vvfat.c index d34241da17..3cb65493cd 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -516,6 +516,81 @@ static void set_begin_of_direntry(direntry_t* direntry, uint32_t begin) direntry->begin_hi = cpu_to_le16((begin >> 16) & 0x); } +static uint8_t to_valid_short_char(gunichar c) +{ +c = g_unichar_toupper(c); +if ((c >= '0' && c <= '9') || +(c >= 'A' && c <= 'Z') || +strchr("$%'-_@~`!(){}^#&", c) != 0) { +return c; +} else { +return 0; +} +} + +static direntry_t *create_short_filename(BDRVVVFATState *s, + const char *filename) +{ +int i, j = 0; +direntry_t *entry = array_get_next(&(s->directory)); +const gchar *p, *last_dot = NULL; +gunichar c; +bool lossy_conversion = false; What is the purpose of this variable? Do you plan to use it later or it is just for debugging? +char tail[11]; This also looks like old debug code... + +if (!entry) { +return NULL; +} +memset(entry->name, 0x20, sizeof(entry->name)); + +/* copy filename and search last dot */ +for (p = filename; ; p = g_utf8_next_char(p)) { +c = g_utf8_get_char(p); +if (c == '\0') { +break; +} else if (c == '.') { +if (j == 0) { +/* '.' at start of filename */ +lossy_conversion = true; +} else { +if (last_dot) { +lossy_conversion = true; +} +last_dot = p; +} +} else if (!last_dot) { +/* first part of the name; copy it */ +uint8_t v = to_valid_short_char(c); +if (j < 8 && v) { +entry->name[j++] = v; +} else { +lossy_conversion = true; +} +} +} + +/* copy extension (if any) */ +if (last_dot) { +j = 0; +for (p = g_utf8_next_char(last_dot); ; p = g_utf8_next_char(p)) { +c = g_utf8_get_char(p); +if (c == '\0') { +break; +} else { +/* extension; copy it */ +uint8_t v = to_valid_short_char(c); +if (j < 3 && v) { +entry->name[8 + (j++)] = v; +} else { +lossy_conversion = true; +} +} +} +} +(void)lossy_conversion; +return entry; +} + /* fat functions */ static inline uint8_t fat_chksum(const direntry_t* entry) @@ -614,7 +689,7 @@ static inline void init_fat(BDRVVVFATState* s) static inline direntry_t* create_short_and_long_name(BDRVVVFATState* s, unsigned int directory_start, const char* filename, int is_dot) { -int i,j,long_index=s->directory.next; +int long_index = s->directory.next; direntry_t* entry = NULL; direntry_t* entry_long = NULL; @@ -626,33 +701,7 @@ static inline direntry_t* create_short_and_long_name(BDRVVVFATState* s, } entry_long=create_long_filename(s,filename); - -i = strlen(filename); -for(j = i - 1; j>0 && filename[j]!='.';j--); -if (j > 0) -i = (j > 8 ? 8 : j); -else if (i > 8) -i = 8; - -entry=array_get_next(&(s->directory)); -memset(entry->name, 0x20, sizeof(entry->name)); -memcpy(entry->name, filename, i); - -if (j > 0) { -for (i = 0; i < 3 && filename[j + 1 + i]; i++) { -entry->name[8 + i] = filename[j + 1 + i]; -} -} - -/* upcase & remove unwanted characters */ -for(i=10;i>=0;i--) { -if(i==10 || i==7) for(;i>0 && entry->name[i]==' ';i--); -if(entry->name[i]<=' ' || entry->name[i]>0x7f -|| strchr(".*?<>|\":/\\[];,+='",entry->name[i])) -entry->name[i]='_'; -else if(entry->name[i]>='a' && entry->name[i]<='z') -entry->name[i]+='A'-'a'; -} +entry = create_short_filename(s, filename); /* mangle duplicates */ while(1) {
Re: [Qemu-devel] [PATCH v2 05/13] vvfat: introduce offset_to_bootsector, offset_to_fat and offset_to_root_dir
Hi Hervé, On 05/22/2017 06:11 PM, Hervé Poussineau wrote: - offset_to_bootsector is the number of sectors up to FAT bootsector - offset_to_fat is the number of sectors up to first File Allocation Table - offset_to_root_dir is the number of sectors up to root directory sector Eventually your commit description can end here, adding the 3 following lines below the "---" separator. Replace first_sectors_number - 1 by offset_to_bootsector. Replace first_sectors_number by offset_to_fat. Replace faked_sectors by offset_to_rootdir. This is "offset_to_root_dir" Signed-off-by: Hervé Poussineau--- ^-- separator block/vvfat.c | 70 --- 1 file changed, 43 insertions(+), 27 deletions(-) diff --git a/block/vvfat.c b/block/vvfat.c index 6a36d4f7fa..e694d82df4 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -320,22 +320,24 @@ static void print_mapping(const struct mapping_t* mapping); typedef struct BDRVVVFATState { CoMutex lock; BlockDriverState* bs; /* pointer to parent */ -unsigned int first_sectors_number; /* 1 for a single partition, 0x40 for a disk with partition table */ unsigned char first_sectors[0x40*0x200]; int fat_type; /* 16 or 32 */ array_t fat,directory,mapping; char volume_label[11]; +uint32_t offset_to_bootsector; /* 0 for floppy, 0x3f for disk */ + unsigned int cluster_size; unsigned int sectors_per_cluster; unsigned int sectors_per_fat; unsigned int sectors_of_root_directory; uint32_t last_cluster_of_root_directory; -unsigned int faked_sectors; /* how many sectors are faked before file data */ uint32_t sector_count; /* total number of sectors of the partition */ uint32_t cluster_count; /* total number of clusters of this partition */ uint32_t max_fat_value; +uint32_t offset_to_fat; +uint32_t offset_to_root_dir; int current_fd; mapping_t* current_mapping; @@ -394,15 +396,15 @@ static void init_mbr(BDRVVVFATState *s, int cyls, int heads, int secs) partition->attributes=0x80; /* bootable */ /* LBA is used when partition is outside the CHS geometry */ -lba = sector2CHS(>start_CHS, s->first_sectors_number - 1, +lba = sector2CHS(>start_CHS, s->offset_to_bootsector, cyls, heads, secs); lba |= sector2CHS(>end_CHS, s->bs->total_sectors - 1, cyls, heads, secs); /*LBA partitions are identified only by start/length_sector_long not by CHS*/ -partition->start_sector_long = cpu_to_le32(s->first_sectors_number - 1); +partition->start_sector_long = cpu_to_le32(s->offset_to_bootsector); partition->length_sector_long = cpu_to_le32(s->bs->total_sectors -- s->first_sectors_number + 1); +- s->offset_to_bootsector); /* FAT12/FAT16/FAT32 */ /* DOS uses different types when partition is LBA, @@ -823,12 +825,12 @@ static int read_directory(BDRVVVFATState* s, int mapping_index) static inline uint32_t sector2cluster(BDRVVVFATState* s,off_t sector_num) { -return (sector_num-s->faked_sectors)/s->sectors_per_cluster; +return (sector_num - s->offset_to_root_dir) / s->sectors_per_cluster; } static inline off_t cluster2sector(BDRVVVFATState* s, uint32_t cluster_num) { -return s->faked_sectors + s->sectors_per_cluster * cluster_num; +return s->offset_to_root_dir + s->sectors_per_cluster * cluster_num; } static int init_directories(BDRVVVFATState* s, @@ -855,6 +857,9 @@ static int init_directories(BDRVVVFATState* s, i = 1+s->sectors_per_cluster*0x200*8/s->fat_type; s->sectors_per_fat=(s->sector_count+i)/i; /* round up */ +s->offset_to_fat = s->offset_to_bootsector + 1; +s->offset_to_root_dir = s->offset_to_fat + s->sectors_per_fat * 2; + array_init(&(s->mapping),sizeof(mapping_t)); array_init(&(s->directory),sizeof(direntry_t)); @@ -868,7 +873,6 @@ static int init_directories(BDRVVVFATState* s, /* Now build FAT, and write back information into directory */ init_fat(s); -s->faked_sectors=s->first_sectors_number+s->sectors_per_fat*2; s->cluster_count=sector2cluster(s, s->sector_count); mapping = array_get_next(&(s->mapping)); @@ -946,7 +950,8 @@ static int init_directories(BDRVVVFATState* s, s->current_mapping = NULL; - bootsector=(bootsector_t*)(s->first_sectors+(s->first_sectors_number-1)*0x200); +bootsector = (bootsector_t *)(s->first_sectors + + s->offset_to_bootsector * 0x200); bootsector->jump[0]=0xeb; bootsector->jump[1]=0x3e; bootsector->jump[2]=0x90; @@ -957,16 +962,18 @@ static int init_directories(BDRVVVFATState* s, bootsector->number_of_fats=0x2; /* number of FATs */ bootsector->root_entries=cpu_to_le16(s->sectors_of_root_directory*0x10);
Re: [Qemu-devel] [PATCH] docker: Add flex and bison to centos6 image
On 05/23/2017 09:52 PM, Fam Zheng wrote: Currently there are warnings about flex and bison being missing when building in the centos6 image: make[1]: flex: Command not found BISON dtc-parser.tab.c make[1]: bison: Command not found Add them. I had the same issue with debian based images :/ Reported-by: Thomas HuthSigned-off-by: Fam Zheng Reviewed-by: Philippe Mathieu-Daudé --- tests/docker/dockerfiles/centos6.docker | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/docker/dockerfiles/centos6.docker b/tests/docker/dockerfiles/centos6.docker index 34e0d3b..17a4d24 100644 --- a/tests/docker/dockerfiles/centos6.docker +++ b/tests/docker/dockerfiles/centos6.docker @@ -1,7 +1,7 @@ FROM centos:6 RUN yum install -y epel-release ENV PACKAGES libfdt-devel ccache \ -tar git make gcc g++ \ +tar git make gcc g++ flex bison \ zlib-devel glib2-devel SDL-devel pixman-devel \ epel-release RUN yum install -y $PACKAGES
Re: [Qemu-devel] [PATCH V6 06/10] migration: add postcopy blocktime ctx into MigrationIncomingState
On Tue, May 23, 2017 at 02:31:07PM +0300, Alexey Perevalov wrote: > This patch adds request to kernel space for UFFD_FEATURE_THREAD_ID, > in case when this feature is provided by kernel. > > PostcopyBlocktimeContext is incapsulated inside postcopy-ram.c, > due to it's postcopy only feature. > Also it defines PostcopyBlocktimeContext's instance live time. > Information from PostcopyBlocktimeContext instance will be provided > much after postcopy migration end, instance of PostcopyBlocktimeContext > will live till QEMU exit, but part of it (vcpu_addr, > page_fault_vcpu_time) used only during calculation, will be released > when postcopy ended or failed. > > To enable postcopy blocktime calculation on destination, need to request > proper capabiltiy (Patch for documentation will be at the tail of the patch > set). > > As an example following command enable that capability, assume QEMU was > started with > -chardev socket,id=charmonitor,path=/var/lib/migrate-vm-monitor.sock > option to control it > > [root@host]#printf "{\"execute\" : \"qmp_capabilities\"}\r\n \ > {\"execute\": \"migrate-set-capabilities\" , \"arguments\": { > \"capabilities\": [ { \"capability\": \"postcopy-blocktime\", \"state\": > true } ] } }" | nc -U /var/lib/migrate-vm-monitor.sock > > Or just with HMP > (qemu) migrate_set_capability postcopy-blocktime on > > Signed-off-by: Alexey Perevalov> --- > include/migration/migration.h | 8 + > migration/postcopy-ram.c | 80 > +++ > 2 files changed, 88 insertions(+) > > diff --git a/include/migration/migration.h b/include/migration/migration.h > index 2951253..449cb07 100644 > --- a/include/migration/migration.h > +++ b/include/migration/migration.h > @@ -57,6 +57,8 @@ enum mig_rp_message_type { > > typedef QLIST_HEAD(, LoadStateEntry) LoadStateEntry_Head; > > +struct PostcopyBlocktimeContext; > + > /* State for the incoming migration */ > struct MigrationIncomingState { > QEMUFile *from_src_file; > @@ -97,6 +99,12 @@ struct MigrationIncomingState { > > /* See savevm.c */ > LoadStateEntry_Head loadvm_handlers; > + > +/* > + * PostcopyBlocktimeContext to keep information for postcopy > + * live migration, to calculate vCPU block time > + * */ > +struct PostcopyBlocktimeContext *blocktime_ctx; > }; > > MigrationIncomingState *migration_incoming_get_current(void); > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c > index 4f3f495..5435a40 100644 > --- a/migration/postcopy-ram.c > +++ b/migration/postcopy-ram.c > @@ -59,6 +59,73 @@ struct PostcopyDiscardState { > #include > #include > > +typedef struct PostcopyBlocktimeContext { > +/* time when page fault initiated per vCPU */ > +int64_t *page_fault_vcpu_time; > +/* page address per vCPU */ > +uint64_t *vcpu_addr; > +int64_t total_blocktime; > +/* blocktime per vCPU */ > +int64_t *vcpu_blocktime; > +/* point in time when last page fault was initiated */ > +int64_t last_begin; > +/* number of vCPU are suspended */ > +int smp_cpus_down; > + > +/* > + * Handler for exit event, necessary for > + * releasing whole blocktime_ctx > + */ > +Notifier exit_notifier; > +/* > + * Handler for postcopy event, necessary for > + * releasing unnecessary part of blocktime_ctx > + */ > +Notifier postcopy_notifier; > +} PostcopyBlocktimeContext; > + > +static void destroy_blocktime_context(struct PostcopyBlocktimeContext *ctx) > +{ > +g_free(ctx->page_fault_vcpu_time); > +g_free(ctx->vcpu_addr); > +g_free(ctx->vcpu_blocktime); > +g_free(ctx); > +} > + > +static void postcopy_migration_cb(Notifier *n, void *data) > +{ > +PostcopyBlocktimeContext *ctx = container_of(n, PostcopyBlocktimeContext, > + postcopy_notifier); > +MigrationState *s = data; > +if (migration_has_finished(s) || migration_has_failed(s)) { > +g_free(ctx->page_fault_vcpu_time); > +/* g_free is NULL robust */ > +ctx->page_fault_vcpu_time = NULL; > +g_free(ctx->vcpu_addr); > +ctx->vcpu_addr = NULL; > +} > +} > + > +static void migration_exit_cb(Notifier *n, void *data) > +{ > +PostcopyBlocktimeContext *ctx = container_of(n, PostcopyBlocktimeContext, > + exit_notifier); > +destroy_blocktime_context(ctx); > +} > + > +static struct PostcopyBlocktimeContext *blocktime_context_new(void) > +{ > +PostcopyBlocktimeContext *ctx = g_new0(PostcopyBlocktimeContext, 1); > +ctx->page_fault_vcpu_time = g_new0(int64_t, smp_cpus); > +ctx->vcpu_addr = g_new0(uint64_t, smp_cpus); > +ctx->vcpu_blocktime = g_new0(int64_t, smp_cpus); > + > +ctx->exit_notifier.notify = migration_exit_cb; > +ctx->postcopy_notifier.notify = postcopy_migration_cb; > +qemu_add_exit_notifier(>exit_notifier); > +
Re: [Qemu-devel] [PATCH] net/rocker: Cleanup the useless return value check
On 2017年05月24日 10:57, Mao Zhongyi wrote: None of pci_dma_read()'s callers check the return value except rocker. There is no need to check it because it always return 0. So the check work is useless. Remove it entirely. Suggested-by: Markus ArmbrusterSigned-off-by: Mao Zhongyi --- hw/net/rocker/rocker.c | 9 +++-- hw/net/rocker/rocker_desc.c | 4 +--- 2 files changed, 4 insertions(+), 9 deletions(-) diff --git a/hw/net/rocker/rocker.c b/hw/net/rocker/rocker.c index 6e70fdd..4f0f6d7 100644 --- a/hw/net/rocker/rocker.c +++ b/hw/net/rocker/rocker.c @@ -244,11 +244,9 @@ static int tx_consume(Rocker *r, DescInfo *info) goto err_no_mem; } -if (pci_dma_read(dev, frag_addr, iov[iovcnt].iov_base, - iov[iovcnt].iov_len)) { -err = -ROCKER_ENXIO; -goto err_bad_io; -} +pci_dma_read(dev, frag_addr, iov[iovcnt].iov_base, + iov[iovcnt].iov_len); + iovcnt++; } @@ -261,7 +259,6 @@ static int tx_consume(Rocker *r, DescInfo *info) err = fp_port_eg(r->fp_port[port], iov, iovcnt); err_too_many_frags: -err_bad_io: err_no_mem: err_bad_attr: for (i = 0; i < ROCKER_TX_FRAGS_MAX; i++) { diff --git a/hw/net/rocker/rocker_desc.c b/hw/net/rocker/rocker_desc.c index ac02797..6184c40 100644 --- a/hw/net/rocker/rocker_desc.c +++ b/hw/net/rocker/rocker_desc.c @@ -69,9 +69,7 @@ char *desc_get_buf(DescInfo *info, bool read_only) return NULL; } -if (pci_dma_read(dev, le64_to_cpu(info->desc.buf_addr), info->buf, size)) { -return NULL; -} +pci_dma_read(dev, le64_to_cpu(info->desc.buf_addr), info->buf, size); return info->buf; } Applied, thanks.
Re: [Qemu-devel] [virtio-dev] Re: [virtio-dev] Re: [PATCH v2 00/16] Vhost-pci for inter-VM communication
On 2017年05月23日 18:48, Wei Wang wrote: On 05/23/2017 02:32 PM, Jason Wang wrote: On 2017年05月23日 13:47, Wei Wang wrote: On 05/23/2017 10:08 AM, Jason Wang wrote: On 2017年05月22日 19:46, Wang, Wei W wrote: On Monday, May 22, 2017 10:28 AM, Jason Wang wrote: On 2017年05月19日 23:33, Stefan Hajnoczi wrote: On Fri, May 19, 2017 at 11:10:33AM +0800, Jason Wang wrote: On 2017年05月18日 11:03, Wei Wang wrote: On 05/17/2017 02:22 PM, Jason Wang wrote: On 2017年05月17日 14:16, Jason Wang wrote: On 2017年05月16日 15:12, Wei Wang wrote: Hi: Care to post the driver codes too? OK. It may take some time to clean up the driver code before post it out. You can first have a check of the draft at the repo here: https://github.com/wei-w-wang/vhost-pci-driver Best, Wei Interesting, looks like there's one copy on tx side. We used to have zerocopy support for tun for VM2VM traffic. Could you please try to compare it with your vhost-pci-net by: We can analyze from the whole data path - from VM1's network stack to send packets -> VM2's network stack to receive packets. The number of copies are actually the same for both. That's why I'm asking you to compare the performance. The only reason for vhost-pci is performance. You should prove it. There is another reason for vhost-pci besides maximum performance: vhost-pci makes it possible for end-users to run networking or storage appliances in compute clouds. Cloud providers do not allow end-users to run custom vhost-user processes on the host so you need vhost-pci. Stefan Then it has non NFV use cases and the question goes back to the performance comparing between vhost-pci and zerocopy vhost_net. If it does not perform better, it was less interesting at least in this case. Probably I can share what we got about vhost-pci and vhost-user: https://github.com/wei-w-wang/vhost-pci-discussion/blob/master/vhost_pci_vs_vhost_user.pdf Right now, I don’t have the environment to add the vhost_net test. Thanks, the number looks good. But I have some questions: - Is the number measured through your vhost-pci kernel driver code? Yes, the kernel driver code. Interesting, in the above link, "l2fwd" was used in vhost-pci testing. I want to know more about the test configuration: If l2fwd is the one that dpdk had, want to know how can you make it work for kernel driver. (Maybe packet socket I think?) If not, want to know how do you configure it (e.g through bridge or act_mirred or others). And in OVS dpdk, is dpdk l2fwd + pmd used in the testing? Oh, that l2fwd is a kernel module from OPNFV vsperf (http://artifacts.opnfv.org/vswitchperf/docs/userguide/quickstart.html) For both legacy and vhost-pci cases, they use the same l2fwd module. No bridge is used, the module already works at L2 to forward packets between two net devices. Thanks for the pointer. Just to confirm, I think virtio-net kernel driver is used in OVS-dpdk test? Another question is, can we manage to remove the copy in tx? If not, is it a limitation of virtio protocol? Thanks Best, Wei
Re: [Qemu-devel] [virtio-dev] Re: [PATCH RFC] virtio-net: enable configurable tx queue size
On 2017年05月23日 18:36, Wei Wang wrote: On 05/23/2017 02:24 PM, Jason Wang wrote: On 2017年05月23日 13:15, Wei Wang wrote: On 05/23/2017 10:04 AM, Jason Wang wrote: On 2017年05月22日 19:52, Wei Wang wrote: On 05/20/2017 04:42 AM, Michael S. Tsirkin wrote: On Fri, May 19, 2017 at 10:32:19AM +0800, Wei Wang wrote: This patch enables the virtio-net tx queue size to be configurable between 256 (the default queue size) and 1024 by the user. The queue size specified by the user should be power of 2. Setting the tx queue size to be 1024 requires the guest driver to support the VIRTIO_NET_F_MAX_CHAIN_SIZE feature. This should be a generic ring feature, not one specific to virtio net. OK. How about making two more changes below: 1) make the default tx queue size = 1024 (instead of 256). As has been pointed out, you need compat the default value too in this case. The driver gets the size info from the device, then would it cause any compatibility issue if we change the default ring size to 1024 in the vhost case? In other words, is there any software (i.e. any virtio-net driver) functions based on the assumption of 256 queue size? I don't know. But is it safe e.g we migrate from 1024 to an older qemu with 256 as its queue size? Yes, I think it is safe, because the default queue size is used when the device is being set up (e.g. feature negotiation). During migration (the device has already been running), the destination machine will load the device state based on the the queue size that is being used (i.e. vring.num). The default value is not used any more after the setup phase. I haven't checked all cases, but there's two obvious things: - After migration and after a reset, it will go back to 256 on dst. - ABI is changed, e.g -M pc-q35-2.10 returns 1024 on 2.11 For live migration, the queue size that is being used will also be transferred to the destination. We can reduce the size (to 256) if the MAX_CHAIN_SIZE feature is not supported by the guest. In this way, people who apply the QEMU patch can directly use the largest queue size(1024) without adding the booting command line. 2) The vhost backend does not use writev, so I think when the vhost backed is used, using 1024 queue size should not depend on the MAX_CHAIN_SIZE feature. But do we need to consider even larger queue size now? Need Michael's feedback on this. Meanwhile, I'll get the next version of code ready and check if larger queue size would cause any corner case. The problem is, do we really need a new config filed for this? Or just introduce a flag which means "I support up to 1024 sgs" is sufficient? For now, it also works without the new config field, max_chain_size, But I would prefer to keep the new config field, because: Without that, the driver will work on an assumed value, 1023. This is the fact, and it's too late to change legacy driver. If the future, QEMU needs to change it to 1022, then how can the new QEMU tell the old driver, which supports the MAX_CHAIN_SIZE feature but works with the old hardcode value 1023? Can config filed help in this case? The problem is similar to ANY_HEADER_SG, the only thing we can is to clarify the limitation for new drivers. Thanks So, I think using that config value would be good for future updates. Best, Wei - To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
Re: [Qemu-devel] new to qemu with a simple question
On Wed, 05/24 10:56, Wang Dong wrote: > Some C source code is generate from json file. > > I wonder why this? What is the benefits of this? This allows you to concentrate on the high level semantics of the QMP API, without worrying about C syntax and structural code here and there. In one word, more conciseness and little code duplication. If done in C, it would be much more lines of code and hardly readable. Fam
[Qemu-devel] [PATCH] net/rocker: Cleanup the useless return value check
None of pci_dma_read()'s callers check the return value except rocker. There is no need to check it because it always return 0. So the check work is useless. Remove it entirely. Suggested-by: Markus ArmbrusterSigned-off-by: Mao Zhongyi --- hw/net/rocker/rocker.c | 9 +++-- hw/net/rocker/rocker_desc.c | 4 +--- 2 files changed, 4 insertions(+), 9 deletions(-) diff --git a/hw/net/rocker/rocker.c b/hw/net/rocker/rocker.c index 6e70fdd..4f0f6d7 100644 --- a/hw/net/rocker/rocker.c +++ b/hw/net/rocker/rocker.c @@ -244,11 +244,9 @@ static int tx_consume(Rocker *r, DescInfo *info) goto err_no_mem; } -if (pci_dma_read(dev, frag_addr, iov[iovcnt].iov_base, - iov[iovcnt].iov_len)) { -err = -ROCKER_ENXIO; -goto err_bad_io; -} +pci_dma_read(dev, frag_addr, iov[iovcnt].iov_base, + iov[iovcnt].iov_len); + iovcnt++; } @@ -261,7 +259,6 @@ static int tx_consume(Rocker *r, DescInfo *info) err = fp_port_eg(r->fp_port[port], iov, iovcnt); err_too_many_frags: -err_bad_io: err_no_mem: err_bad_attr: for (i = 0; i < ROCKER_TX_FRAGS_MAX; i++) { diff --git a/hw/net/rocker/rocker_desc.c b/hw/net/rocker/rocker_desc.c index ac02797..6184c40 100644 --- a/hw/net/rocker/rocker_desc.c +++ b/hw/net/rocker/rocker_desc.c @@ -69,9 +69,7 @@ char *desc_get_buf(DescInfo *info, bool read_only) return NULL; } -if (pci_dma_read(dev, le64_to_cpu(info->desc.buf_addr), info->buf, size)) { -return NULL; -} +pci_dma_read(dev, le64_to_cpu(info->desc.buf_addr), info->buf, size); return info->buf; } -- 2.9.3
[Qemu-devel] [PATCH v3 16/16] tests: Add test case for BLK_PERM_AIO_CONTEXT_CHANGE
Signed-off-by: Fam Zheng--- tests/Makefile.include | 2 ++ tests/test-blk-perm.c | 59 ++ 2 files changed, 61 insertions(+) create mode 100644 tests/test-blk-perm.c diff --git a/tests/Makefile.include b/tests/Makefile.include index 7589383..5811296 100644 --- a/tests/Makefile.include +++ b/tests/Makefile.include @@ -56,6 +56,7 @@ gcov-files-test-thread-pool-y = thread-pool.c gcov-files-test-hbitmap-y = util/hbitmap.c check-unit-y += tests/test-hbitmap$(EXESUF) gcov-files-test-hbitmap-y = blockjob.c +check-unit-y += tests/test-blk-perm$(EXESUF) check-unit-y += tests/test-blockjob$(EXESUF) check-unit-y += tests/test-blockjob-txn$(EXESUF) check-unit-y += tests/test-x86-cpuid$(EXESUF) @@ -548,6 +549,7 @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(test-block-obj-y) tests/test-aio$(EXESUF): tests/test-aio.o $(test-block-obj-y) tests/test-aio-multithread$(EXESUF): tests/test-aio-multithread.o $(test-block-obj-y) tests/test-throttle$(EXESUF): tests/test-throttle.o $(test-block-obj-y) +tests/test-blk-perm$(EXESUF): tests/test-blk-perm.o $(test-block-obj-y) tests/test-blockjob$(EXESUF): tests/test-blockjob.o $(test-block-obj-y) $(test-util-obj-y) tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o $(test-block-obj-y) $(test-util-obj-y) tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(test-block-obj-y) diff --git a/tests/test-blk-perm.c b/tests/test-blk-perm.c new file mode 100644 index 000..1c7b5d2 --- /dev/null +++ b/tests/test-blk-perm.c @@ -0,0 +1,59 @@ +/* + * Block permission tests + * + * Copyright Red Hat, Inc. 2017 + * + * Authors: + * Fam Zheng + * + * This work is licensed under the terms of the GNU LGPL, version 2 or later. + * See the COPYING.LIB file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "sysemu/block-backend.h" + +static void test_aio_context_success(void) +{ +BlockBackend *blk1 = blk_new(BLK_PERM_AIO_CONTEXT_CHANGE, BLK_PERM_ALL); +BlockBackend *blk2 = blk_new(BLK_PERM_AIO_CONTEXT_CHANGE, BLK_PERM_ALL); +BlockDriverState *bs = bdrv_open("null-co://", NULL, NULL, 0, _abort); + +blk_insert_bs(blk1, bs, _abort); +blk_insert_bs(blk2, bs, _abort); + +blk_unref(blk1); +blk_unref(blk2); +bdrv_unref(bs); +} + +static void test_aio_context_failure(void) +{ +Error *local_err = NULL; +BlockBackend *blk1 = blk_new(BLK_PERM_AIO_CONTEXT_CHANGE, + BLK_PERM_ALL & ~BLK_PERM_AIO_CONTEXT_CHANGE); +BlockBackend *blk2 = blk_new(BLK_PERM_AIO_CONTEXT_CHANGE, BLK_PERM_ALL); +BlockDriverState *bs = bdrv_open("null-co://", NULL, NULL, 0, _abort); + +blk_insert_bs(blk1, bs, _abort); +blk_insert_bs(blk2, bs, _err); + +error_free_or_abort(_err); + +blk_unref(blk1); +blk_unref(blk2); +bdrv_unref(bs); +} + +int main(int argc, char **argv) +{ +bdrv_init(); +qemu_init_main_loop(_abort); +g_test_init(, , NULL); +g_test_add_func("/block/perm/aio-context/success", +test_aio_context_success); +g_test_add_func("/block/perm/aio-context/failure", +test_aio_context_failure); +return g_test_run(); +} -- 2.9.4
[Qemu-devel] [PATCH v3 14/16] nbd: Allow BLK_PERM_AIO_CONTEXT_CHANGE on BB
This is safe because of the aio context notifier we'll register on this node. So allow it. Signed-off-by: Fam Zheng--- nbd/server.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index 924a1fe..a8f58fb 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -900,8 +900,10 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, off_t size, if ((nbdflags & NBD_FLAG_READ_ONLY) == 0) { perm |= BLK_PERM_WRITE; } -blk = blk_new(perm, BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED | -BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD); +blk = blk_new(perm, + BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED | + BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD | + BLK_PERM_AIO_CONTEXT_CHANGE); ret = blk_insert_bs(blk, bs, errp); if (ret < 0) { goto fail; -- 2.9.4
[Qemu-devel] new to qemu with a simple question
Hi guys, I am new to qemu. But I need do some job in it right now. When I try read qmp code. I found a interesting part against it. Some C source code is generate from json file. I wonder why this? What is the benefits of this? Thanks in advance. -- Best regards. Wang Dong
[Qemu-devel] [PATCH v3 15/16] block: Add perm assertion on blk_set_aio_context
Now that all BB users comply with the BLK_PERM_AIO_CONTEXT_CHANGE rule, we can assert it. Signed-off-by: Fam Zheng--- block/block-backend.c | 4 1 file changed, 4 insertions(+) diff --git a/block/block-backend.c b/block/block-backend.c index dfe577d..51c62ea 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -1695,8 +1695,12 @@ static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb) void blk_set_aio_context(BlockBackend *blk, AioContext *new_context) { +uint64_t perm, shared_perm; BlockDriverState *bs = blk_bs(blk); +blk_get_perm(blk, , _perm); +assert(perm & BLK_PERM_AIO_CONTEXT_CHANGE); + blk->aio_context = new_context; if (bs) { if (blk->public.throttle_state) { -- 2.9.4
[Qemu-devel] [PATCH v3 10/16] mirror: Do initial aio context move of target via BB interface
While blockdev-backup tried to verify before moving target's aio context, the same is missing for blockdev-mirror. Now that we have the right interface, fix this as well. As a bonus, the aio context move is now conditional, which avoids some unnecessary operations in bdrv_set_aio_context. Signed-off-by: Fam Zheng--- block/mirror.c | 54 -- blockdev.c | 4 2 files changed, 32 insertions(+), 26 deletions(-) diff --git a/block/mirror.c b/block/mirror.c index a3337ee..4eccb8d 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1118,13 +1118,44 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs, bool auto_complete, const char *filter_node_name, Error **errp) { +AioContext *aio_context, *target_context; MirrorBlockJob *s; BlockDriverState *mirror_top_bs; +BlockBackend *target_blk; bool target_graph_mod; bool target_is_backing; Error *local_err = NULL; int ret; +/* No resize for the target either; while the mirror is still running, a + * consistent read isn't necessarily possible. We could possibly allow + * writes and graph modifications, though it would likely defeat the + * purpose of a mirror, so leave them blocked for now. + * + * In the case of active commit, things look a bit different, though, + * because the target is an already populated backing file in active use. + * We can allow anything except resize there.*/ +target_is_backing = bdrv_chain_contains(bs, target); +target_graph_mod = (backing_mode != MIRROR_LEAVE_BACKING_CHAIN); +target_blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE | + BLK_PERM_AIO_CONTEXT_CHANGE | + (target_graph_mod ? BLK_PERM_GRAPH_MOD : 0), + BLK_PERM_WRITE_UNCHANGED | + (target_is_backing ? BLK_PERM_CONSISTENT_READ | + BLK_PERM_WRITE | + BLK_PERM_GRAPH_MOD : 0)); +ret = blk_insert_bs(target_blk, target, errp); +if (ret < 0) { +blk_unref(target_blk); +return; +} +aio_context = bdrv_get_aio_context(bs); +target_context = bdrv_get_aio_context(target); +if (target_context != aio_context) { +aio_context_acquire(target_context); +blk_set_aio_context(target_blk, aio_context); +aio_context_release(target_context); +} if (granularity == 0) { granularity = bdrv_get_default_bitmap_granularity(target); } @@ -1179,28 +1210,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs, s->source = bs; s->mirror_top_bs = mirror_top_bs; -/* No resize for the target either; while the mirror is still running, a - * consistent read isn't necessarily possible. We could possibly allow - * writes and graph modifications, though it would likely defeat the - * purpose of a mirror, so leave them blocked for now. - * - * In the case of active commit, things look a bit different, though, - * because the target is an already populated backing file in active use. - * We can allow anything except resize there.*/ -target_is_backing = bdrv_chain_contains(bs, target); -target_graph_mod = (backing_mode != MIRROR_LEAVE_BACKING_CHAIN); -s->target = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE | -BLK_PERM_AIO_CONTEXT_CHANGE | -(target_graph_mod ? BLK_PERM_GRAPH_MOD : 0), -BLK_PERM_WRITE_UNCHANGED | -(target_is_backing ? BLK_PERM_CONSISTENT_READ | - BLK_PERM_WRITE | - BLK_PERM_GRAPH_MOD : 0)); -ret = blk_insert_bs(s->target, target, errp); -if (ret < 0) { -goto fail; -} - +s->target = target_blk; s->replaces = g_strdup(replaces); s->on_source_error = on_source_error; s->on_target_error = on_target_error; diff --git a/blockdev.c b/blockdev.c index e8f72a1..52d81c4 100644 --- a/blockdev.c +++ b/blockdev.c @@ -3545,8 +3545,6 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp) goto out; } -bdrv_set_aio_context(target_bs, aio_context); - blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs, arg->has_replaces, arg->replaces, arg->sync, backing_mode, arg->has_speed, arg->speed, @@ -3597,8 +3595,6 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id, aio_context = bdrv_get_aio_context(bs); aio_context_acquire(aio_context); -bdrv_set_aio_context(target_bs, aio_context); - blockdev_mirror_common(has_job_id ? job_id : NULL, bs, target_bs, has_replaces,
[Qemu-devel] [PATCH v3 11/16] virtio-scsi: Request BLK_PERM_AIO_CONTEXT_CHANGE for dataplane
blk_set_aio_context is audited by perm API, so follow the protocol and request for permission first. Signed-off-by: Fam Zheng--- hw/scsi/virtio-scsi.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 46a3e3f..074e235 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -794,6 +794,10 @@ static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev, return; } virtio_scsi_acquire(s); +if (!blk_request_perm(sd->conf.blk, BLK_PERM_AIO_CONTEXT_CHANGE, errp)) { +virtio_scsi_release(s); +return; +} blk_set_aio_context(sd->conf.blk, s->ctx); virtio_scsi_release(s); -- 2.9.4
[Qemu-devel] [PATCH v3 03/16] blockjob: Add BLK_PERM_AIO_CONTEXT_CHANGE shared perm on bs
Signed-off-by: Fam Zheng--- blockjob.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/blockjob.c b/blockjob.c index 6e48932..b20fb86 100644 --- a/blockjob.c +++ b/blockjob.c @@ -214,6 +214,9 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver, } } +/* The notifier we'll register on @blk takes care of following context + * change, so permit it. */ +shared_perm |= BLK_PERM_AIO_CONTEXT_CHANGE; blk = blk_new(perm, shared_perm); ret = blk_insert_bs(blk, bs, errp); if (ret < 0) { -- 2.9.4
[Qemu-devel] [PATCH v3 12/16] virtio-blk: Request BLK_PERM_AIO_CONTEXT_CHANGE for dataplane
blk_set_aio_context is audited by perm API, so follow the protocol and request for permission first. Previously the return code in error path was hardcoded as -EPERM, but returning 'r' is more meaningful here both for the new error and existing ones. Signed-off-by: Fam Zheng--- hw/block/dataplane/virtio-blk.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 5556f0e..8f2ff2df 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -168,6 +168,7 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev) unsigned i; unsigned nvqs = s->conf->num_queues; int r; +Error *local_err = NULL; if (vblk->dataplane_started || s->starting) { return 0; @@ -175,12 +176,18 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev) s->starting = true; +r = blk_request_perm(s->conf->conf.blk, BLK_PERM_AIO_CONTEXT_CHANGE, + _err); +if (r) { +error_report_err(local_err); +goto fail; +} /* Set up guest notifier (irq) */ r = k->set_guest_notifiers(qbus->parent, nvqs, true); if (r != 0) { fprintf(stderr, "virtio-blk failed to set guest notifier (%d), " "ensure -enable-kvm is set\n", r); -goto fail_guest_notifiers; +goto fail; } /* Set up virtqueue notify */ @@ -191,7 +198,7 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev) while (i--) { virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), i, false); } -goto fail_guest_notifiers; +goto fail; } } @@ -219,11 +226,11 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev) aio_context_release(s->ctx); return 0; - fail_guest_notifiers: +fail: vblk->dataplane_disabled = true; s->starting = false; vblk->dataplane_started = true; -return -ENOSYS; +return r; } /* Context: QEMU global mutex held */ -- 2.9.4
[Qemu-devel] [PATCH v3 09/16] commit: Allow aio context change on s->base
The block job has the aio context change notifier, we should allow it here as well. Signed-off-by: Fam Zheng--- block/commit.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/block/commit.c b/block/commit.c index e2ee0ff..bbf7768 100644 --- a/block/commit.c +++ b/block/commit.c @@ -391,7 +391,8 @@ void commit_start(const char *job_id, BlockDriverState *bs, | BLK_PERM_RESIZE, BLK_PERM_CONSISTENT_READ | BLK_PERM_GRAPH_MOD - | BLK_PERM_WRITE_UNCHANGED); + | BLK_PERM_WRITE_UNCHANGED + | BLK_PERM_AIO_CONTEXT_CHANGE); ret = blk_insert_bs(s->base, base, errp); if (ret < 0) { goto fail; -- 2.9.4
[Qemu-devel] [PATCH v3 02/16] block-backend: Add blk_request_perm
This function tries to request, if not granted yet, for the given permissions. Signed-off-by: Fam Zheng--- block/block-backend.c | 12 include/sysemu/block-backend.h | 1 + 2 files changed, 13 insertions(+) diff --git a/block/block-backend.c b/block/block-backend.c index f3a6008..5492f64 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -654,6 +654,18 @@ void blk_get_perm(BlockBackend *blk, uint64_t *perm, uint64_t *shared_perm) *shared_perm = blk->shared_perm; } +int blk_request_perm(BlockBackend *blk, uint64_t perm, Error **errp) +{ +uint64_t blk_perm, shared_perm; + +blk_get_perm(blk, _perm, _perm); +if ((blk_perm & perm) == perm) { +return 0; +} +blk_perm |= perm; +return blk_set_perm(blk, blk_perm, shared_perm, errp); +} + static int blk_do_attach_dev(BlockBackend *blk, void *dev) { if (blk->dev) { diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h index 840ad61..fc23a9e 100644 --- a/include/sysemu/block-backend.h +++ b/include/sysemu/block-backend.h @@ -116,6 +116,7 @@ bool bdrv_is_root_node(BlockDriverState *bs); int blk_set_perm(BlockBackend *blk, uint64_t perm, uint64_t shared_perm, Error **errp); void blk_get_perm(BlockBackend *blk, uint64_t *perm, uint64_t *shared_perm); +int blk_request_perm(BlockBackend *blk, uint64_t perm, Error **errp); void blk_set_allow_write_beyond_eof(BlockBackend *blk, bool allow); void blk_iostatus_enable(BlockBackend *blk); -- 2.9.4
[Qemu-devel] [PATCH v3 08/16] mirror: Request aio context change permission on target
What's done in the source's context change notifier is moving the target's context to follow the new one, so we request this permission here. Signed-off-by: Fam Zheng--- block/mirror.c | 1 + 1 file changed, 1 insertion(+) diff --git a/block/mirror.c b/block/mirror.c index 03e82eb..a3337ee 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1190,6 +1190,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs, target_is_backing = bdrv_chain_contains(bs, target); target_graph_mod = (backing_mode != MIRROR_LEAVE_BACKING_CHAIN); s->target = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE | +BLK_PERM_AIO_CONTEXT_CHANGE | (target_graph_mod ? BLK_PERM_GRAPH_MOD : 0), BLK_PERM_WRITE_UNCHANGED | (target_is_backing ? BLK_PERM_CONSISTENT_READ | -- 2.9.4
[Qemu-devel] [PATCH v3 07/16] backup: Do initial aio context move of target via BB interface
The old aio context check is hacky because when it was added we didn't have the permission system to enforce a general rule. It only checks if the target BDS has a BB, which is in fact insufficient because there may be other BBs in the graph that cannot handle the aio context change. Do this through blk_set_aio_context interface, in backup_job_create() which is a central place for both drive-backup and blockdev-backup, to take care of the compatibility check. Also the bdrv_set_aio_context in do_drive_backup could have been conditional, to save a recursion when possible. Signed-off-by: Fam Zheng--- block/backup.c | 9 + blockdev.c | 14 -- 2 files changed, 9 insertions(+), 14 deletions(-) diff --git a/block/backup.c b/block/backup.c index 546c5c5..9136f91 100644 --- a/block/backup.c +++ b/block/backup.c @@ -564,6 +564,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, BlockDriverInfo bdi; BackupBlockJob *job = NULL; int ret; +AioContext *aio_context, *target_context; assert(bs); assert(target); @@ -644,6 +645,14 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, goto error; } +aio_context = bdrv_get_aio_context(bs); +target_context = bdrv_get_aio_context(target); +if (target_context != aio_context) { +aio_context_acquire(target_context); +blk_set_aio_context(job->target, aio_context); +aio_context_release(target_context); +} + job->on_source_error = on_source_error; job->on_target_error = on_target_error; job->sync_mode = sync_mode; diff --git a/blockdev.c b/blockdev.c index c63f4e8..e8f72a1 100644 --- a/blockdev.c +++ b/blockdev.c @@ -3240,8 +3240,6 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn, goto out; } -bdrv_set_aio_context(target_bs, aio_context); - if (set_backing_hd) { bdrv_set_backing_hd(target_bs, source, _err); if (local_err) { @@ -3326,18 +3324,6 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn, if (!target_bs) { goto out; } - -if (bdrv_get_aio_context(target_bs) != aio_context) { -if (!bdrv_has_blk(target_bs)) { -/* The target BDS is not attached, we can safely move it to another - * AioContext. */ -bdrv_set_aio_context(target_bs, aio_context); -} else { -error_setg(errp, "Target is attached to a different thread from " - "source."); -goto out; -} -} job = backup_job_create(backup->job_id, bs, target_bs, backup->speed, backup->sync, NULL, backup->compress, backup->on_source_error, backup->on_target_error, -- 2.9.4
[Qemu-devel] [PATCH v3 06/16] backup: Request BLK_PERM_AIO_CONTEXT_CHANGE on target
What's done in the source's context change notifier is moving the target's context to follow the new one, so we request this permission here. Signed-off-by: Fam Zheng--- block/backup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/backup.c b/block/backup.c index a4fb288..546c5c5 100644 --- a/block/backup.c +++ b/block/backup.c @@ -636,7 +636,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, } /* The target must match the source in size, so no resize here either */ -job->target = blk_new(BLK_PERM_WRITE, +job->target = blk_new(BLK_PERM_WRITE | BLK_PERM_AIO_CONTEXT_CHANGE, BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD); ret = blk_insert_bs(job->target, target, errp); -- 2.9.4
[Qemu-devel] [PATCH v3 00/16] block: Protect AIO context change with perm API
v3: Move blk_set_aio_context to the front of mirror_start_job to avoid accessing target without acquiring its aio context. [Stefan] Use error_free_or_abort in test code. [Stefan] v2: Address Stefan's comments: - Clean up redundancy in bdrv_format_default_perms change. - Add a test case to check both success/failure cases. A failure case is not possible at user interface level because of other checks we have, so write a unit test in tests/test-blk-perm.c. Eject / change of scsi-cd on a virtio-scsi dataplane bus causes abort() because the new BDS doesn't get proper bdrv_set_aio_context(). Store the AioContext in BB and do it in blk_insert_bs. That is done by Vladimir's patch. Other patches are to make sure such a bdrv_set_aio_context() doesn't interfere with other BBs using other nodes from this graph. Fam Fam Zheng (15): block: Define BLK_PERM_AIO_CONTEXT_CHANGE block-backend: Add blk_request_perm blockjob: Add BLK_PERM_AIO_CONTEXT_CHANGE shared perm on bs blockjob: Allow aio context change on intermediate nodes block: Propagate BLK_PERM_AIO_CONTEXT_CHANGE down the graph backup: Request BLK_PERM_AIO_CONTEXT_CHANGE on target backup: Do initial aio context move of target via BB interface mirror: Request aio context change permission on target commit: Allow aio context change on s->base mirror: Do initial aio context move of target via BB interface virtio-scsi: Request BLK_PERM_AIO_CONTEXT_CHANGE for dataplane virtio-blk: Request BLK_PERM_AIO_CONTEXT_CHANGE for dataplane nbd: Allow BLK_PERM_AIO_CONTEXT_CHANGE on BB block: Add perm assertion on blk_set_aio_context tests: Add test case for BLK_PERM_AIO_CONTEXT_CHANGE Vladimir Sementsov-Ogievskiy (1): blk: fix aio context loss on media change block.c | 12 ++--- block/backup.c | 11 +++- block/block-backend.c | 22 +++ block/commit.c | 6 +++-- block/mirror.c | 56 +++--- block/stream.c | 3 ++- block/vvfat.c | 2 +- blockdev.c | 18 - blockjob.c | 3 +++ hw/block/dataplane/virtio-blk.c | 15 --- hw/scsi/virtio-scsi.c | 4 +++ include/block/block.h | 7 - include/sysemu/block-backend.h | 1 + nbd/server.c| 6 +++-- tests/Makefile.include | 2 ++ tests/test-blk-perm.c | 59 + 16 files changed, 171 insertions(+), 56 deletions(-) create mode 100644 tests/test-blk-perm.c -- 2.9.4
[Qemu-devel] [PATCH v3 13/16] blk: fix aio context loss on media change
From: Vladimir Sementsov-OgievskiyIf we have separate iothread for cdrom, we lose connection to it on qmp_blockdev_change_medium, as aio_context is on bds which is dropped and switched with new one. As an example result, after such media change we have crash on virtio_scsi_ctx_check: Assertion `blk_get_aio_context(d->conf.blk) == s->ctx' failed. Signed-off-by: Vladimir Sementsov-Ogievskiy Signed-off-by: Fam Zheng --- block/block-backend.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/block/block-backend.c b/block/block-backend.c index 5492f64..dfe577d 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -68,6 +68,7 @@ struct BlockBackend { NotifierList remove_bs_notifiers, insert_bs_notifiers; int quiesce_counter; +AioContext *aio_context; }; typedef struct BlockBackendAIOCB { @@ -618,6 +619,10 @@ int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, Error **errp) } bdrv_ref(bs); +if (blk->aio_context != NULL) { +bdrv_set_aio_context(bs, blk->aio_context); +} + notifier_list_notify(>insert_bs_notifiers, blk); if (blk->public.throttle_state) { throttle_timers_attach_aio_context( @@ -1692,6 +1697,7 @@ void blk_set_aio_context(BlockBackend *blk, AioContext *new_context) { BlockDriverState *bs = blk_bs(blk); +blk->aio_context = new_context; if (bs) { if (blk->public.throttle_state) { throttle_timers_detach_aio_context(>public.throttle_timers); -- 2.9.4
[Qemu-devel] [PATCH v3 04/16] blockjob: Allow aio context change on intermediate nodes
The intermediate nodes do work with aio context change, so allow that operations. Signed-off-by: Fam Zheng--- block/commit.c | 3 ++- block/mirror.c | 3 ++- block/stream.c | 3 ++- 3 files changed, 6 insertions(+), 3 deletions(-) diff --git a/block/commit.c b/block/commit.c index 76a0d98..e2ee0ff 100644 --- a/block/commit.c +++ b/block/commit.c @@ -365,7 +365,8 @@ void commit_start(const char *job_id, BlockDriverState *bs, * for its backing file). The other options would be a second filter * driver above s->base. */ ret = block_job_add_bdrv(>common, "intermediate node", iter, 0, - BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE, + BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE | + BLK_PERM_AIO_CONTEXT_CHANGE, errp); if (ret < 0) { goto fail; diff --git a/block/mirror.c b/block/mirror.c index e86f8f8..03e82eb 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1232,7 +1232,8 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs, * also blocked for its backing file). The other options would be a * second filter driver above s->base (== target). */ ret = block_job_add_bdrv(>common, "intermediate node", iter, 0, - BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE, + BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE | + BLK_PERM_AIO_CONTEXT_CHANGE, errp); if (ret < 0) { goto fail; diff --git a/block/stream.c b/block/stream.c index 0113710..2fab1f4 100644 --- a/block/stream.c +++ b/block/stream.c @@ -265,7 +265,8 @@ void stream_start(const char *job_id, BlockDriverState *bs, * and resizes. */ for (iter = backing_bs(bs); iter && iter != base; iter = backing_bs(iter)) { block_job_add_bdrv(>common, "intermediate node", iter, 0, - BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED, + BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED | + BLK_PERM_AIO_CONTEXT_CHANGE, _abort); } -- 2.9.4
[Qemu-devel] [PATCH v3 05/16] block: Propagate BLK_PERM_AIO_CONTEXT_CHANGE down the graph
bdrv_set_aio_context can take care of children recursively, so it is okay to pass down the perm. Signed-off-by: Fam Zheng--- block.c | 10 ++ block/vvfat.c | 2 +- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/block.c b/block.c index d98662f..1e5eae7 100644 --- a/block.c +++ b/block.c @@ -1772,7 +1772,8 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared, #define DEFAULT_PERM_PASSTHROUGH (BLK_PERM_CONSISTENT_READ \ | BLK_PERM_WRITE \ | BLK_PERM_WRITE_UNCHANGED \ - | BLK_PERM_RESIZE) + | BLK_PERM_RESIZE \ + | BLK_PERM_AIO_CONTEXT_CHANGE) #define DEFAULT_PERM_UNCHANGED (BLK_PERM_ALL & ~DEFAULT_PERM_PASSTHROUGH) void bdrv_filter_default_perms(BlockDriverState *bs, BdrvChild *c, @@ -1815,9 +1816,10 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c, perm |= BLK_PERM_CONSISTENT_READ; shared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE); } else { -/* We want consistent read from backing files if the parent needs it. +/* We want consistent read and aio context change from backing files if + * the parent needs it. * No other operations are performed on backing files. */ -perm &= BLK_PERM_CONSISTENT_READ; +perm &= BLK_PERM_CONSISTENT_READ | BLK_PERM_AIO_CONTEXT_CHANGE; /* If the parent can deal with changing data, we're okay with a * writable and resizable backing file. */ @@ -1829,7 +1831,7 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c, } shared |= BLK_PERM_CONSISTENT_READ | BLK_PERM_GRAPH_MOD | - BLK_PERM_WRITE_UNCHANGED; + BLK_PERM_WRITE_UNCHANGED | BLK_PERM_AIO_CONTEXT_CHANGE; } if (bs->open_flags & BDRV_O_INACTIVE) { diff --git a/block/vvfat.c b/block/vvfat.c index 426ca70..599a370 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -3091,7 +3091,7 @@ static void vvfat_child_perm(BlockDriverState *bs, BdrvChild *c, if (c == s->qcow) { /* This is a private node, nobody should try to attach to it */ *nperm = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE; -*nshared = BLK_PERM_WRITE_UNCHANGED; +*nshared = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_AIO_CONTEXT_CHANGE; } else { /* The backing file is there so 'commit' can use it. vvfat doesn't * access it in any way. */ -- 2.9.4
[Qemu-devel] [PATCH v3 01/16] block: Define BLK_PERM_AIO_CONTEXT_CHANGE
Signed-off-by: Fam Zheng--- block.c | 2 ++ include/block/block.h | 7 ++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/block.c b/block.c index 50ba264..d98662f 100644 --- a/block.c +++ b/block.c @@ -1649,6 +1649,8 @@ char *bdrv_perm_names(uint64_t perm) { BLK_PERM_WRITE_UNCHANGED, "write unchanged" }, { BLK_PERM_RESIZE, "resize" }, { BLK_PERM_GRAPH_MOD, "change children" }, +{ BLK_PERM_AIO_CONTEXT_CHANGE, +"aio context change" }, { 0, NULL } }; diff --git a/include/block/block.h b/include/block/block.h index 9b355e9..017d6c8 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -225,7 +225,12 @@ enum { */ BLK_PERM_GRAPH_MOD = 0x10, -BLK_PERM_ALL= 0x1f, +/** + * This permission is required to change the AioContext of this node. + */ +BLK_PERM_AIO_CONTEXT_CHANGE = 0x20, + +BLK_PERM_ALL= 0x3f, }; char *bdrv_perm_names(uint64_t perm); -- 2.9.4
Re: [Qemu-devel] [PATCH V6 04/10] migration: split ufd_version_check onto receive/request features part
On Tue, May 23, 2017 at 02:31:05PM +0300, Alexey Perevalov wrote: > This modification is necessary for userfault fd features which are > required to be requested from userspace. > UFFD_FEATURE_THREAD_ID is a one of such "on demand" feature, which will > be introduced in the next patch. > > QEMU have to use separate userfault file descriptor, due to > userfault context has internal state, and after first call of > ioctl UFFD_API it changes its state to UFFD_STATE_RUNNING (in case of > success), but kernel while handling ioctl UFFD_API expects > UFFD_STATE_WAIT_API. > So only one ioctl with UFFD_API is possible per ufd. > > Signed-off-by: Alexey PerevalovHi, Alexey, Mostly good to me, some nitpicks below. > --- > migration/postcopy-ram.c | 100 > ++- > 1 file changed, 91 insertions(+), 9 deletions(-) > > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c > index 3ed78bf..4f3f495 100644 > --- a/migration/postcopy-ram.c > +++ b/migration/postcopy-ram.c > @@ -59,32 +59,114 @@ struct PostcopyDiscardState { > #include > #include > > -static bool ufd_version_check(int ufd, MigrationIncomingState *mis) > + > +/** > + * receive_ufd_features: check userfault fd features, to request only > supported > + * features in the future. > + * > + * Returns: true on success > + * > + * __NR_userfaultfd - should be checked before I don't see this line necessary. After all we will detect the error no matter what... > + * @features: out parameter will contain uffdio_api.features provided by > kernel > + * in case of success > + */ > +static bool receive_ufd_features(uint64_t *features) > { > -struct uffdio_api api_struct; > -uint64_t ioctl_mask; > +struct uffdio_api api_struct = {0}; > +int ufd; > +bool ret = true; > + > +/* if we are here __NR_userfaultfd should exists */ > +ufd = syscall(__NR_userfaultfd, O_CLOEXEC); > +if (ufd == -1) { > +error_report("%s: syscall __NR_userfaultfd failed: %s", __func__, > + strerror(errno)); > +return false; > +} > > +/* ask features */ > api_struct.api = UFFD_API; > api_struct.features = 0; > if (ioctl(ufd, UFFDIO_API, _struct)) { > -error_report("%s: UFFDIO_API failed: %s", __func__ > +error_report("%s: UFFDIO_API failed: %s", __func__, > strerror(errno)); > +ret = false; > +goto release_ufd; > +} > + > +*features = api_struct.features; > + > +release_ufd: > +close(ufd); > +return ret; > +} > + > +/** > + * request_ufd_features: this function should be called only once on a newly > + * opened ufd, subsequent calls will lead to error. > + * > + * Returns: true on succes > + * > + * @ufd: fd obtained from userfaultfd syscall > + * @features: bit mask see UFFD_API_FEATURES > + */ > +static bool request_ufd_features(int ufd, uint64_t features) > +{ > +struct uffdio_api api_struct = {0}; > +uint64_t ioctl_mask; > + > +api_struct.api = UFFD_API; > +api_struct.features = features; > +if (ioctl(ufd, UFFDIO_API, _struct)) { > +error_report("%s failed: UFFDIO_API failed: %s", __func__, > +strerror(errno)); Maybe we can indent this line to follow this file's rule? error_report("%s failed: UFFDIO_API failed: %s", __func__, strerror(errno)); > return false; > } > > -ioctl_mask = (__u64)1 << _UFFDIO_REGISTER | > - (__u64)1 << _UFFDIO_UNREGISTER; > +ioctl_mask = 1 << _UFFDIO_REGISTER | > + 1 << _UFFDIO_UNREGISTER; Could I ask why we explicitly removed (__u64) here? Since I see the old one better. > if ((api_struct.ioctls & ioctl_mask) != ioctl_mask) { > error_report("Missing userfault features: %" PRIx64, > (uint64_t)(~api_struct.ioctls & ioctl_mask)); > return false; > } > > +return true; > +} > + > +static bool ufd_check_and_apply(int ufd, MigrationIncomingState *mis) > +{ > +uint64_t asked_features = 0; > +static uint64_t supported_features; > + > +/* > + * it's not possible to > + * request UFFD_API twice per one fd > + * userfault fd features is persistent > + */ > +if (!supported_features) { I would prefer not having this static variable. After all, this function call is rare, and the receive_ufd_features() is not that slow as well. > +if (!receive_ufd_features(_features)) { > +error_report("%s failed", __func__); > +return false; > +} > +} > + > +/* > + * request features, even if asked_features is 0, due to > + * kernel expects UFFD_API before UFFDIO_REGISTER, per > + * userfault file descriptor > + */ > +if (!request_ufd_features(ufd, asked_features)) { > +error_report("%s failed: features %" PRIu64, __func__, > +
Re: [Qemu-devel] [PATCH risu] ppc64: Fix patterns for rotate doubleword instructions
On Tue, May 23, 2017 at 11:47:30AM +0530, Nikunj A Dadhania wrote: > G 3writes: > > > On May 22, 2017, at 4:32 AM, qemu-devel-requ...@nongnu.org wrote: > > > > Hello I have also done some work risu. My patches add ppc32 support. > > Well my patches were made to work with Mac OS X but they are required > > to work with Linux. Do you think you could help port these patches to > > Linux? > > Ziviani did the ppc64 work, lets see if he can spare some time. > > The patches haven't come right on the mailing list, its tedious to pull > them. Please resend them with git send-mail. > Hey, sure I can help. I'll take a look on them. > > > > ppc.risu: > > https://patchwork.kernel.org/patch/9697489/ > > > > risu_ppc.c: > > https://patchwork.kernel.org/patch/9697491/ > > > > risu_reginfo_ppc.c: > > https://patchwork.kernel.org/patch/9697493/ > > > > risu_reginfo_ppc.h: > > https://patchwork.kernel.org/patch/9697495/ > > > > risugen_ppc.pm: > > https://patchwork.kernel.org/patch/9697497/ > > > > Add ppc support to configure: > > https://patchwork.kernel.org/patch/9697499/ > > > > Add verbose option: > > https://patchwork.kernel.org/patch/9697501/ > > > > Add end of test message: > > https://patchwork.kernel.org/patch/9697503/ > > > > Add more descriptive comment for mismatch or end of test condition: > > https://patchwork.kernel.org/patch/9697505/ > > Regards > Nikunj >
Re: [Qemu-devel] [Qemu-block] [PATCH v2 10/16] mirror: Do initial aio context move of target via BB interface
On Thu, 05/11 16:27, Stefan Hajnoczi wrote: > On Wed, Apr 19, 2017 at 05:43:50PM +0800, Fam Zheng wrote: > > diff --git a/blockdev.c b/blockdev.c > > index dfd1385..53f9874 100644 > > --- a/blockdev.c > > +++ b/blockdev.c > > @@ -3556,8 +3556,6 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp) > > goto out; > > } > > > > -bdrv_set_aio_context(target_bs, aio_context); > > - > > blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, > > target_bs, > > arg->has_replaces, arg->replaces, arg->sync, > > backing_mode, arg->has_speed, arg->speed, > > @@ -3608,8 +3606,6 @@ void qmp_blockdev_mirror(bool has_job_id, const char > > *job_id, > > aio_context = bdrv_get_aio_context(bs); > > aio_context_acquire(aio_context); > > > > -bdrv_set_aio_context(target_bs, aio_context); > > - > > blockdev_mirror_common(has_job_id ? job_id : NULL, bs, target_bs, > > has_replaces, replaces, sync, backing_mode, > > has_speed, speed, > > This patch post-pones bdrv_set_aio_context() until later on. We now > call bdrv_get_info() without holding target_bs's AioContext in > mirror_start_job(). > > Please acquire the AioContext around target_bs accesses until we move it > to our AioContext. Good catch! Will fix. Fam
Re: [Qemu-devel] [Qemu-block] [PATCH v2 06/16] backup: Request BLK_PERM_AIO_CONTEXT_CHANGE on target
On Thu, 05/11 15:41, Stefan Hajnoczi wrote: > On Wed, Apr 19, 2017 at 05:43:46PM +0800, Fam Zheng wrote: > > What's done in the source's context change notifier is moving the > > target's context to follow the new one, so we request this permission > > here. > > It's true that the backup block job must be able to set target's > AioContext, but does this change also allow other users to set target's > AioContext while the backup job is running? If yes, then we need to > handle that. If through job->target, yes, but I don't think there is any user of job->target. Otherwise, it's not allowed, because the second parameter of blk_new doesn't have BLK_PERM_AIO_CONTEXT_CHANGE. So it's okay. Fam
Re: [Qemu-devel] [Qemu-block] [PATCH v2 02/16] block-backend: Add blk_request_perm
On Thu, 05/11 15:32, Stefan Hajnoczi wrote: > On Wed, Apr 19, 2017 at 05:43:42PM +0800, Fam Zheng wrote: > > This function tries to request, if not granted yet, for the given > > permissions. > > > > Signed-off-by: Fam Zheng> > --- > > block/block-backend.c | 12 > > include/sysemu/block-backend.h | 1 + > > 2 files changed, 13 insertions(+) > > > > diff --git a/block/block-backend.c b/block/block-backend.c > > index 7405024..6bdd9ce 100644 > > --- a/block/block-backend.c > > +++ b/block/block-backend.c > > @@ -629,6 +629,18 @@ void blk_resume_after_migration(Error **errp) > > } > > } > > > > +int blk_request_perm(BlockBackend *blk, uint64_t perm, Error **errp) > > +{ > > +uint64_t blk_perm, shared_perm; > > + > > +blk_get_perm(blk, _perm, _perm); > > +if ((blk_perm & perm) == perm) { > > +return 0; > > +} > > +blk_perm |= perm; > > +return blk_set_perm(blk, blk_perm, shared_perm, errp); > > +} > > I'm slightly confused about why this function is needed. blk_set_perm() > doesn't do quite the right thing? Sorry for the late reply! I think blk_set_perm() always calls bs->drv->bdrv_check_perm() even when the required perm bits are already acquired. Fam
Re: [Qemu-devel] [Xen-devel] [RFC PATCH V2 1/2] xen-pt: bind/unbind interrupt remapping format MSI
On 2017年05月24日 01:06, Anthony PERARD wrote: > On Tue, May 23, 2017 at 08:16:25PM +0800, Lan Tianyu wrote: >> On 2017年05月19日 20:04, Jan Beulich wrote: >> On 19.05.17 at 13:16,wrote: On Thu, May 18, 2017 at 01:32:59AM -0400, Lan Tianyu wrote: > --- a/include/hw/i386/apic-msidef.h > +++ b/include/hw/i386/apic-msidef.h > @@ -26,6 +26,7 @@ > > #define MSI_ADDR_DEST_ID_SHIFT 12 > #define MSI_ADDR_DEST_IDX_SHIFT 4 > -#define MSI_ADDR_DEST_ID_MASK 0x000 > +#define MSI_ADDR_DEST_ID_MASK 0x000fff00 The value of MSI_ADDR_DEST_ID_MASK is changed here. I think the patch should be: +#define MSI_ADDR_DEST_ID_MASK 0x0000 >>> Judging from other sources, rather the other way around - the >>> mask needs to have further bits removed (should be 0x000ff000 >>> afaict). Xen sources confirm this, and while Linux has the value >>> you suggest, that contradicts >> Agree. Defining the mask as "0x000ff000" makes more sense. >> Just check Qemu source code. Only apic_send_msi() and msi_dest_id() use >> the mask >> to get dest apic id. They mask MSI address field with >> MSI_ADDR_DEST_ID_MASK and >> then right-shift 12bit. The low 12bit won't be used. >> >> Anthony, does this make sense? > Yes, it does. > The change to MSI_ADDR_DEST_ID_MASK should probably go in its own patch. > OK. Will update. -- Best regards Tianyu Lan
Re: [Qemu-devel] virtio crypto device implemenation
On Tue, May 23, 2017 at 04:08:25PM +, Zeng, Xin wrote: > Hi, Michael, >As you know, Lei Gong from Huawei and I are co-working on virtio crypto > device spec, he is focusing on symmetric algorithm part, I am focusing on > asymmetric part. Now I am planning the implementation for asymmetric part, > would you please give me your point regarding the questions below? >Current virtio crypto device implementation from Lei Gong: >The virtio crypto device implementation has been upstreamed to QEMU and it > has a qemu backend implementation for symmetric algorithm part, the front end > Linux device driver for symmetric part has been upstreamed to Linux kernel as > well. >My questions: >From my side, I planned to add the asymmetric part support in upstreamed > front end device driver, and I don't want to add the asymmetric algorithm > support to current virtio crypto device's qemu backend, instead, I would like > to implement and upstream a DPDK vhost-user based backend for asymmetric > algorithm, and accordingly Lei Gong will help to upstream a vhost user agent > for virtio crypto device in QEMU, is this approach acceptable? Is a qemu > backend a mandatory requirement for the virtio crypto device? Is there a > general policy for this? > > Thanks Parity on QEMU side is naturally preferable. I don't think we should require it at all times, but if there's no implementation outside vhost-user, and if the feature includes a non-trivial amount of code, how will it be tested? I don't think we want to require all testers to use dpdk. An implementation under tests using libvhost-user might be a solution. -- MST
[Qemu-devel] [PATCH] docker: Add flex and bison to centos6 image
Currently there are warnings about flex and bison being missing when building in the centos6 image: make[1]: flex: Command not found BISON dtc-parser.tab.c make[1]: bison: Command not found Add them. Reported-by: Thomas HuthSigned-off-by: Fam Zheng --- tests/docker/dockerfiles/centos6.docker | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/docker/dockerfiles/centos6.docker b/tests/docker/dockerfiles/centos6.docker index 34e0d3b..17a4d24 100644 --- a/tests/docker/dockerfiles/centos6.docker +++ b/tests/docker/dockerfiles/centos6.docker @@ -1,7 +1,7 @@ FROM centos:6 RUN yum install -y epel-release ENV PACKAGES libfdt-devel ccache \ -tar git make gcc g++ \ +tar git make gcc g++ flex bison \ zlib-devel glib2-devel SDL-devel pixman-devel \ epel-release RUN yum install -y $PACKAGES -- 2.9.4
[Qemu-devel] [Bug 1693050] [NEW] xhci HCIVERSION register read emulation incorrectly handled
Public bug reported: We had an illumos user trying to run illumos in QEMU 2.9.0 with the qemu-xhci device enabled. Note, that while this was discovered against QEMU 2.9.0, from my current read of the HEAD, it is still present. The illumos bug at https://www.illumos.org/issues/8173 has additional information on how the user invoked qemu. While investigating the problem we found that the illumos driver was reading a version of 0x when reading the HCIVERSION register from qemu. In the illumos driver we're performing a 16-bit read of the version register at offset 0x2. From looking around at other OSes, while Linux performs a 4 byte read at offset 0x0 and masks out the version, others that care about the version are doing a two byte read, though not all actually act on the version and some just discard the information. The user who hit this was able to enable tracing (note the tracing file is attached to the illumos bug linked previously) and we hit the unimplemented register read with offset 0x2 at http://git.qemu.org/?p=qemu.git;a=blob;f=hw/usb/hcd- xhci.c;hb=HEAD#l2960. The xhci register specifies today that its allowed for users to do 1-4 byte reads; however, that it implements only four byte reads in its implementation (http://git.qemu.org/?p=qemu.git;a=blob;f=hw/usb/hcd- xhci.c;hb=HEAD#l). Hence why when we read the HCIVERSION register at offset 0x2, it isn't handled in xhci_cap_read() which then returns zeros. >From digging into this, I think that we're coming into memory_region_dispatch_read() and then memory_region_dispatch_read1(). However, I don't see anything in either the general memory region logic or in the xhci_cap_read() function that would deal with adjusting the offset that we're reading at and then masking it off again. While the access_with_adjusted_size() attempts to deal with this, it never adjusts the hardware address that's passed in to be a multiple of the implementation defined offset that I can see. I suspect that the FIXME at line 582 (http://git.qemu.org/?p=qemu.git;a=blob;f=memory.c;hb=HEAD#l582) and the implementation in the xhci code is the crux of the problem. For the time being we're working around this in the illumos driver, but I wanted to point this out such that it might be helpful for other systems which are assuming that they can do the two byte read like on hardware. ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1693050 Title: xhci HCIVERSION register read emulation incorrectly handled Status in QEMU: New Bug description: We had an illumos user trying to run illumos in QEMU 2.9.0 with the qemu-xhci device enabled. Note, that while this was discovered against QEMU 2.9.0, from my current read of the HEAD, it is still present. The illumos bug at https://www.illumos.org/issues/8173 has additional information on how the user invoked qemu. While investigating the problem we found that the illumos driver was reading a version of 0x when reading the HCIVERSION register from qemu. In the illumos driver we're performing a 16-bit read of the version register at offset 0x2. From looking around at other OSes, while Linux performs a 4 byte read at offset 0x0 and masks out the version, others that care about the version are doing a two byte read, though not all actually act on the version and some just discard the information. The user who hit this was able to enable tracing (note the tracing file is attached to the illumos bug linked previously) and we hit the unimplemented register read with offset 0x2 at http://git.qemu.org/?p=qemu.git;a=blob;f=hw/usb/hcd- xhci.c;hb=HEAD#l2960. The xhci register specifies today that its allowed for users to do 1-4 byte reads; however, that it implements only four byte reads in its implementation (http://git.qemu.org/?p=qemu.git;a=blob;f=hw/usb/hcd- xhci.c;hb=HEAD#l). Hence why when we read the HCIVERSION register at offset 0x2, it isn't handled in xhci_cap_read() which then returns zeros. From digging into this, I think that we're coming into memory_region_dispatch_read() and then memory_region_dispatch_read1(). However, I don't see anything in either the general memory region logic or in the xhci_cap_read() function that would deal with adjusting the offset that we're reading at and then masking it off again. While the access_with_adjusted_size() attempts to deal with this, it never adjusts the hardware address that's passed in to be a multiple of the implementation defined offset that I can see. I suspect that the FIXME at line 582 (http://git.qemu.org/?p=qemu.git;a=blob;f=memory.c;hb=HEAD#l582) and the implementation in the xhci code is the crux of the problem. For the time being we're working around this in the illumos driver, but I wanted to point this out such that it might be
Re: [Qemu-devel] [PATCH 02/31] target/s390x: Implement EXECUTE via new TranslationBlock
On 05/23/2017 10:28 AM, Aurelien Jarno wrote: Something like this, as a delta patch. Unfortunately it doesn't work. So far I have no real idea what could be the root cause of the issue. I have just determined that up to the crash, only a very limited set of instructions are being executed. They are the 4 bytes long versions of MVC, CLC, XC, TR. Yeah, it appears XC is the culprit, though I have not yet determined exactly what's going wrong. That said, perhaps I'll delay this for later and just add some extra helpers for now. It does seem slightly wasteful to create a TB for at least these common cases. r~
Re: [Qemu-devel] [PATCH v2 4/4] 9pfs: local: metadata file for the VirtFS root
On 05/23/2017 07:13 PM, Eric Blake wrote:> We have to block VIRTFS_META_DIR at any depth in the hierarchy, but > can/should we change the blocking of VIRTFS_META_ROOT_FILE to only > happen at the root directory, rather than at all directories? On the > other hand, if you can simultaneously map /path/to/a for one mount > point, and /path/to/a/b for another, then the root file for B is visible > at a lower depth than the root file for A, and blocking the metafile > name everywhere means that the mount A can't influence the behavior on > the mount for B. If you take this kind of vulnerabilities into account, then you also have to consider a mix of mapped-file and mapped-attr mounts, or even worse a proxy with a mapped-file mount (which I think is currently vulnerable to this threat if the "proxy" path points above the "local,security_model=mapped-file" path, as the check is done in "local_" functions, which are I guess not used for proxy-type virtfs) I'm clearly not saying it's an invalid attack (there is no explicit documentation stating it's insecure to "nest" virtual mounts"), just saying it's a much larger attack surface than one internal to virtfs mapped-file only. Then the question of what is reasonably possible to forbid to the user arises, and that's not one I could answer. Cheers & HTH, Leo signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 0/4] 9pfs: local: fix metadata of mapped-file security mode
On 05/23/2017 04:32 PM, Greg Kurz wrote: > v2: - posted patch for CVE-2017-7493 separately > - other changes available in each patch changelog > > Leo, > > If you find time to test this series, I'll gladly add your Tested-by: to > it before merging. Just tested with a base of 2.9.0 with patches [1] [2] (from my distribution), [3] (required to apply cleanly) and this patchset. Things appear to work as expected, and .virtfs_metadata{,_root} appear to be neither readable nor writable by any user. That said, one thing still bothering me with the fix in [3] is that it still "leaks" the host's uid/gid to the guest when a corresponding file in .virtfs_metadata is not present (while I'd have expected it to appear as root:root in the guest), but that's a separate issue, and I guess retro-compatibility prevents any fixing it. Thanks for these patches! Leo [1] https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/virtualization/qemu/force-uid0-on-9p.patch [2] https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/virtualization/qemu/no-etc-install.patch [3] https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg03663.html signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 0/4] 9pfs: local: fix metadata of mapped-file security mode
On Tue, 05/23 17:04, Greg Kurz wrote: > > CC hw/9pfs/9p-xattr-user.o > > In file included from > > /var/tmp/patchew-tester-tmp-5niyvzwz/src/hw/9pfs/9p-local.c:18:0: > > /var/tmp/patchew-tester-tmp-5niyvzwz/src/hw/9pfs/9p-local.c: In function > > ‘local_set_mapped_file_attrat’: > > /var/tmp/patchew-tester-tmp-5niyvzwz/src/hw/9pfs/9p-util.h:19:5: error: > > ‘map_dirfd’ may be used uninitialized in this function > > [-Werror=maybe-uninitialized] > > close(fd); > > ^ > > /var/tmp/patchew-tester-tmp-5niyvzwz/src/hw/9pfs/9p-local.c:235:9: note: > > ‘map_dirfd’ was declared here > > int map_dirfd, map_fd; > > ^ > > cc1: all warnings being treated as errors > > This is a false positive AFAICT... what compiler is this ? It's gcc, as in the snipped package list above. Fam
Re: [Qemu-devel] [PATCH v2 3/4] 9pfs: local: simplify file opening
On Tue, 23 May 2017 10:51:26 -0500 Eric Blakewrote: > On 05/23/2017 09:32 AM, Greg Kurz wrote: > > The logic to open a path currently sits between local_open_nofollow() and > > the relative_openat_nofollow() helper, which has no other user. > > > > For the sake of clarity, this patch moves all the code of the helper into > > its unique caller. While here we also: > > - drop the code to skip leading "/" because the backend isn't supposed to > > pass anything but relative paths without consecutive slashes. The assert() > > is kept because we really don't want a buggy backend to pass an > > absolute > > odd spacing > > > path to openat(). > > - use strchrnul() to get a simpler code. This is ok since virtfs if for > > s/if/is/ > Yeah, I spotted these two nits just after posting the series, as usual :) I'll fix them before merging. > > linux+glibc hosts only. > > - don't dup() the initial directory and add an assert() to ensure we don't > > return the global mountfd to the caller. BTW, this would mean that the > > caller passed an empty path, which isn't supposed to happen either. > > > > Signed-off-by: Greg Kurz > > --- > > Reviewed-by: Eric Blake > pgpD7XR2CQwF1.pgp Description: OpenPGP digital signature
[Qemu-devel] [Bug 1318091] Re: Perfctr MSRs not available to Guest OS on AMD Phenom II
i don't understand this in detail, but since the last update of qemu i can't start my virtual win7 machine. i use gnome-boxes 3.24. qemu 2.8 works, 2.9 leads to this: Preformatted text(gnome-boxes:4301): Boxes-WARNING **: machine.vala:611: Failed to start win7: Unable to start domain: the CPU is incompatible with host CPU: Host CPU does not provide required features: monitor, rdtscp, svm i ask, because i also use an phenom 2 x4 and if this is the bug i don't need to opan a new one. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1318091 Title: Perfctr MSRs not available to Guest OS on AMD Phenom II Status in QEMU: New Bug description: The AMD Phenom(tm) II X4 965 Processor (family 16, model 4, stepping 3) has the 4 architecturally supported perf counters at MSRs. The selectors are c001000-c001003, and the counters are c001004-c001007. I've verified that the MSRs are there and working by manually setting the MSRs with msr-tools to count cycles. The processor does not support the extended perfctr or the perfctr_nb. These are in cpuid leaf 8000_0001. Qemu also sees that these cpuid flags are not set, when I try launching with -cpu host,perfctr_core,check. However, this flag is only for the extended perfctr MSRs, which also happen to map the original four counters at c0010200. When I run a Guest OS, that OS is unable to use the perf counter registers from c001000-7. rdmsr and wrmsr complete, but the results are always 0. By contrast, a wrmsr to one of the c0010200 registers causes a general protection fault (as it should, since those aren't supported). Kernel: 3.14.0-gentoo Qemu: 2.0.0 (gentoo) and also with 2.0.50 (commit 06b4f00d5) Qemu command: qemu-system-x86_64 -enable-kvm -cpu host -smp 8 -m 1024 -nographic -monitor /dev/pts/4 mnt/hdd.img cat /proc/cpuinfo: processor : 3 vendor_id : AuthenticAMD cpu family: 16 model : 4 model name: AMD Phenom(tm) II X4 965 Processor stepping : 3 cpu MHz : 800.000 cache size: 512 KB physical id : 0 siblings : 4 core id : 3 cpu cores : 4 apicid: 3 initial apicid: 3 fpu : yes fpu_exception : yes cpuid level : 5 wp: yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt hw_pstate npt lbrv svm_lock nrip_save bogomips : 6803.79 TLB size : 1024 4K pages clflush size : 64 cache_alignment : 64 address sizes : 48 bits physical, 48 bits virtual power management: ts ttp tm stc 100mhzsteps hwpstate thanks. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1318091/+subscriptions
Re: [Qemu-devel] [PATCH v2] tests/libqtest: Print error instead of aborting when env variable is missing
Quoting Thomas Huth (2017-05-23 02:25:30) > On 22.05.2017 17:58, no-re...@patchew.org wrote: > > This series failed automatic build test. Please find the testing commands > > and > > their output below. If you have docker installed, you can probably > > reproduce it > > locally. > [...] > >LEX convert-dtsv0-lexer.lex.c > >DEP /tmp/qemu-test/src/dtc/fdtdump.c > > make[1]: flex: Command not found > >BISON dtc-parser.tab.c > > make[1]: bison: Command not found > >DEP /tmp/qemu-test/src/dtc/srcpos.c > >LEX dtc-lexer.lex.c > > make[1]: flex: Command not found > >DEP /tmp/qemu-test/src/dtc/treesource.c > >DEP /tmp/qemu-test/src/dtc/livetree.c > >DEP /tmp/qemu-test/src/dtc/fstree.c > >DEP /tmp/qemu-test/src/dtc/flattree.c > >DEP /tmp/qemu-test/src/dtc/dtc.c > >DEP /tmp/qemu-test/src/dtc/data.c > >DEP /tmp/qemu-test/src/dtc/checks.c > > CHK version_gen.h > >BISON dtc-parser.tab.c > > make[1]: bison: Command not found > >LEX convert-dtsv0-lexer.lex.c > > UPD version_gen.h > > make[1]: flex: Command not found > >LEX dtc-lexer.lex.c > > make[1]: flex: Command not found > >DEP /tmp/qemu-test/src/dtc/util.c > >LEX convert-dtsv0-lexer.lex.c > >BISON dtc-parser.tab.c > > make[1]: flex: Command not found > > make[1]: bison: Command not found > >LEX dtc-lexer.lex.c > > make[1]: flex: Command not found > > Looks like flex and bison are missing in the docker image? Could someone > please add it? > > [...] > > CC replay/replay-internal.o > > /tmp/qemu-test/src/replay/replay-internal.c: In function ‘replay_put_array’: > > /tmp/qemu-test/src/replay/replay-internal.c:65: warning: ignoring return > > value of ‘fwrite’, declared with attribute warn_unused_result > > I guess that should be fixed? > > [...] > > CC slirp/tcp_input.o > > CC slirp/tcp_output.o > > CC slirp/tcp_subr.o > > CC slirp/tcp_timer.o > > /tmp/qemu-test/src/slirp/tcp_input.c: In function ‘tcp_input’: > > /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_p’ may be > > used uninitialized in this function > > /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_len’ may be > > used uninitialized in this function > > /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_tos’ may be > > used uninitialized in this function > > /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_id’ may be > > used uninitialized in this function > > /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_off’ may be > > used uninitialized in this function > > /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_ttl’ may be > > used uninitialized in this function > > /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_sum’ may be > > used uninitialized in this function > > /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_src.s_addr’ > > may be used uninitialized in this function > > /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_dst.s_addr’ > > may be used uninitialized in this function > > /tmp/qemu-test/src/slirp/tcp_input.c:220: warning: ‘save_ip6.ip_nh’ may be > > used uninitialized in this function > > I've never seen these warnings in tcp_input.c before ... and they also > look like false positives too me ... is that GCC in the docker image too > sensitive? > > [...] > > CC aarch64-softmmu/hw/timer/exynos4210_rtc.o > > /tmp/qemu-test/src/hw/i386/pc_piix.c: In function > > ‘igd_passthrough_isa_bridge_create’: > > /tmp/qemu-test/src/hw/i386/pc_piix.c:1067: warning: ‘pch_rev_id’ may be > > used uninitialized in this function > > dito > > > CC aarch64-softmmu/hw/usb/tusb6010.o > > /tmp/qemu-test/src/hw/i386/acpi-build.c: In function > > ‘build_append_pci_bus_devices’: > > /tmp/qemu-test/src/hw/i386/acpi-build.c:525: warning: ‘notify_method’ may > > be used uninitialized in this function > > dito > > [...] > > /tmp/qemu-test/src/tests/ide-test.c: In function ‘cdrom_pio_impl’: > > /tmp/qemu-test/src/tests/ide-test.c:803: warning: ignoring return value of > > ‘fwrite’, declared with attribute warn_unused_result > > /tmp/qemu-test/src/tests/ide-test.c: In function ‘test_cdrom_dma’: > > /tmp/qemu-test/src/tests/ide-test.c:899: warning: ignoring return value of > > ‘fwrite’, declared with attribute warn_unused_result > > Needs a patch, I guess? > > > GTESTER tests/test-vmstate > > Failed to load simple/primitive:b_1 > > Failed to load simple/primitive:i64_2 > > Failed to load simple/primitive:i32_1 > > Failed to load simple/primitive:i32_1 > > Failed to load test/with_tmp:a > > Failed to load test/tmp_child_parent:f > > Failed to load test/tmp_child:parent > > Failed to load test/with_tmp:tmp > > Failed to load test/tmp_child:diff > > Failed to load test/with_tmp:tmp > > Failed to load test/tmp_child:diff > > Failed to load test/with_tmp:tmp > >
Re: [Qemu-devel] [PATCH 1/7] qcow2: Remove unused Error in do_perform_cow()
On 05/23/2017 06:22 AM, Alberto Garcia wrote: > qcow2_encrypt_sectors() does not need an Error parameter, and we're > not checking its value anyway, so we can safely remove it. Misleading. You are NOT removing the Error parameter from qcow2_encrypt_sectors(), but rather are explicitly ignoring any errors by passing NULL. I'd update the commit message to something like: We are relying on the return value of qcow2_encrypt_sectors() to flag problems, but have no way to report that error to the end user. Since we are just throwing away the error, we can pass NULL instead for simpler code. A more robust solution would figure out how to pass the original error (rather than a new message related to our -EIO return) back to the caller, but that is more invasive. > > Signed-off-by: Alberto Garcia> --- > block/qcow2-cluster.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > With a better commit message, Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 02/31] target/s390x: Implement EXECUTE via new TranslationBlock
On 23.05.2017 17:54, Richard Henderson wrote: > On 05/23/2017 03:48 AM, Aurelien Jarno wrote: >> On 2017-05-22 20:02, Richard Henderson wrote: >>> Previously, helper_ex would construct the insn and then implement >>> the insn via direct calls other helpers. This was sufficient to >>> boot Linux but that is all. >>> >>> It is easy enough to go the whole nine yards by stashing state for >>> EXECUTE within the cpu, and then relying on a new TB to be created >>> that properly and completely interprets the insn. >>> >>> Signed-off-by: Richard Henderson>>> --- >>> target/s390x/cpu.h | 4 +- >>> target/s390x/helper.h | 2 +- >>> target/s390x/insn-data.def | 4 +- >>> target/s390x/machine.c | 19 +++ >>> target/s390x/mem_helper.c | 136 >>> +++-- >>> target/s390x/translate.c | 124 >>> + >>> 6 files changed, 133 insertions(+), 156 deletions(-) >> >> This looks good on the principle, and finally removes a big hack. That >> said it prevent my test system to boot. I haven't investigated why yet. > > Hmm. I've not got a complete environment -- merely booting a kernel up > to the point it fails to find a rootfs. Which did find several problems > with my first attempts at this, but wouldn't have exercised paging. > I'll try again to get a full install working... Something nice for a quick test is also: http://www.qemu-advent-calendar.org/2014/download/s390-moon-buggy.tar.xz Not sure whether it will trigger your EXECUTE problem, though. Thomas
Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/4] spapr: disable hotplugging without OS
On 23/05/2017 20:22, Daniel Henrique Barboza wrote: > > > This is the Libvirt use case that fails with this patch set applied in > QEMU master, Libvirt 3.4.0 compiled from > source: > > # ./virsh start dhb_ub1704_nfs 2 > # > # ./virsh setvcpus dhb_ub1704_nfs 2 --live > # > # ./virsh -c 'qemu:///system' migrate --live --domain dhb_ub1704_nfs > --desturi qemu+ssh://target_ip/system --timeout 60 --verbose > error: internal error: unable to execute QEMU command 'device_add': CPU > hotplug not supported without OS Good point. I think I should refine my series to allow hotplug if the machine is not started. Thanks, Laurent
Re: [Qemu-devel] [PATCH v2 13/13] vvfat: change OEM name to 'MSWIN4.1'
Hi Philippe, Le 23/05/2017 à 06:23, Philippe Mathieu-Daudé a écrit : Hi Hervé, On 05/22/2017 06:12 PM, Hervé Poussineau wrote: According to specification: "'MSWIN4.1' is the recommanded setting, because it is the setting least likely to cause compatibility problems. If you want to put something else in here, that is your option, but the result may be that some FAT drivers might not recognize the volume." It also says "Typically this is some indication of what system formatted the volume." From https://technet.microsoft.com/en-us/library/cc976796.aspx: "Following the jump instruction is the 8-byte OEM ID, a string of characters that identifies the name and version number of the operating system that formatted the volume. To preserve compatibility with MS-DOS, Windows 2000 records "MSDOS5.0" in this field on FAT16 and FAT32 disks. On NTFS disks, Windows 2000 records "NTFS." You may also see the OEM ID "MSWIN4.0" on disks formatted by Windows 95 and "MSWIN4.1" on disks formatted by Windows 95 OSR2 and Windows 98. Windows 2000 does not use the OEM ID field in the boot sector except for verifying NTFS volumes." I'm curious which one is the most up-to-date, the specification v1.03 or a Windows 2000. Do you have an idea if there is more MSDOS/W2K vs W95/98 users? Hopefully W95 is a Long time no see to me. You think having "QEMU" OEM does more harm than good? That's complicated. Indeed, OEM ID should be the name of the OS/tool which formatted the partition. However, some FAT drivers take different paths according to OEM ID. See for example: https://jdebp.eu/FGA/volume-boot-block-oem-name-field.html http://seasip.info./Misc/oemid.html So, in my opinion, it should be better to stick to some specification for the whole FAT format, so we have a reference document to know if there is a bug in the code or not. FAT specification 1.03 is dated December 6, 2000, while Windows 2000 has been released December 15, 1999, so both are around the same date. FAT specification gives all details about all disk structures, so I think that's better to stick to what it says. So to answer your question, and knowing the tricks played with OEM ID, I think that's better to use "MSWIN4.1" than "QEMU". Regards, Hervé Specification: "FAT: General overview of on-disk format" v1.03, page 9 Signed-off-by: Hervé Poussineau--- block/vvfat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/vvfat.c b/block/vvfat.c index 53e8faa54c..1f7f46ecea 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -1024,7 +1024,7 @@ static int init_directories(BDRVVVFATState* s, bootsector->jump[0]=0xeb; bootsector->jump[1]=0x3e; bootsector->jump[2]=0x90; -memcpy(bootsector->name,"QEMU",8); +memcpy(bootsector->name, "MSWIN4.1", 8); bootsector->sector_size=cpu_to_le16(0x200); bootsector->sectors_per_cluster=s->sectors_per_cluster; bootsector->reserved_sectors=cpu_to_le16(1); Regards, Phil.
Re: [Qemu-devel] [PATCH v2 4/4] 9pfs: local: metadata file for the VirtFS root
On Tue, 23 May 2017 12:13:17 -0500 Eric Blakewrote: > On 05/23/2017 09:32 AM, Greg Kurz wrote: > > When using the mapped-file security, credentials are stored in a metadata > > directory located in the parent directory. This is okay for all paths with > > the notable exception of the root path, since we don't want and probably > > can't create a metadata directory above the virtfs directory on the host. > > > > This patch introduces a dedicated metadata file, sitting in the virtfs root > > for this purpose. It relies on the fact that the "." name necessarily refers > > to the virtfs root. > > > > As for the metadata directory, we don't want the client to see this file. > > The current code only cares for readdir() but there are many other places > > to fix actually. The filtering logic is hence put in a separate function. > > > > @@ -478,7 +504,8 @@ static off_t local_telldir(FsContext *ctx, > > V9fsFidOpenState *fs) > > > > static bool local_is_mapped_file_metadata(FsContext *fs_ctx, const char > > *name) > > { > > -return !strcmp(name, VIRTFS_META_DIR); > > +return > > +!strcmp(name, VIRTFS_META_DIR) || !strcmp(name, > > VIRTFS_META_ROOT_FILE); > > We have to block VIRTFS_META_DIR at any depth in the hierarchy, but > can/should we change the blocking of VIRTFS_META_ROOT_FILE to only > happen at the root directory, rather than at all directories? On the > other hand, if you can simultaneously map /path/to/a for one mount > point, and /path/to/a/b for another, then the root file for B is visible > at a lower depth than the root file for A, and blocking the metafile > name everywhere means that the mount A can't influence the behavior on > the mount for B. > I must confess I hadn't this scenario in mind but it's safer from a security standpoint indeed. > Not tested, but looks sane, so: > Reviewed-by: Eric Blake > Thanks again for the review, Eric. Cheers, -- Greg pgpZnM41L6JdP.pgp Description: OpenPGP digital signature
Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/4] spapr: disable hotplugging without OS
On 05/23/2017 03:07 PM, Daniel Henrique Barboza wrote: On 05/23/2017 02:52 PM, Daniel Henrique Barboza wrote: Hi Laurent, This is an interesting patch series. I've been working in the last weeks in the DRC migration, mainly to solve the problem in which a hot CPU unplug will not succeed after a migration if the CPU was hotplugged in the source. The problem happened when migrating with virsh because Libvirt hotplugs the CPU in both source and target, and the DRC state of the hotplugged after the migration is inconsistent. This series solves the issue by preventing it from happening in the first place. Of course that migrating DRC states has other uses (pending unplug operations during a migration, for example) so both patch series can coexist. One possible issue I see with this series is that it breaks Libvirt migration entirely if a CPU/mem hotplug happens in the target. With your series applied the migration fails before start with: Sorry: if a migration happens in the *source*, before the migration. Hehe nothing like fixing a typo with another one ... This is the Libvirt use case that fails with this patch set applied in QEMU master, Libvirt 3.4.0 compiled from source: # ./virsh start dhb_ub1704_nfs 2 # # ./virsh setvcpus dhb_ub1704_nfs 2 --live # # ./virsh -c 'qemu:///system' migrate --live --domain dhb_ub1704_nfs --desturi qemu+ssh://target_ip/system --timeout 60 --verbose error: internal error: unable to execute QEMU command 'device_add': CPU hotplug not supported without OS This is the error msg that appears in Libvirt daemon: 2017-05-23 18:17:17.844+: 159678: error : qemuMonitorJSONCheckError:389 : internal error: unable to execute QEMU command 'device_add': CPU hotplug not supported without OS Daniel # ./virsh -c 'qemu:///system' migrate --live --domain dhb_ub1704_nfs --desturi qemu+ssh://target_ip/system --timeout 60 --verbose error: internal error: unable to execute QEMU command 'device_add': CPU hotplug not supported without OS Note that I say "possible issue" because, although I believe we do not want to break Libvirt if possible, I also believe that we need to think about what makes sense in QEMU first. Thanks, Daniel On 05/23/2017 08:18 AM, Laurent Vivier wrote: If the OS is not started, QEMU sends an event to the OS that is lost and cannot be recovered. An unplug is not able to restore QEMU in a coherent state. So, while the OS is not started, disable CPU and memory hotplug. We use option vector 6 to know if the OS is started This series moves error checking for memory hotplug in a pre_plug function, and introduces the option vector 6 management. It also revert previous fix which was not really fixing the hotplug problem when the OS is not running. Laurent Vivier (4): spapr: add pre_plug function for memory spapr: add option vector 6 spapr: disable hotplugging without OS Revert "spapr: fix memory hot-unplugging" hw/ppc/spapr.c | 103 hw/ppc/spapr_drc.c | 20 ++--- hw/ppc/spapr_hcall.c| 5 ++- hw/ppc/spapr_ovec.c | 8 include/hw/ppc/spapr.h | 2 + include/hw/ppc/spapr_drc.h | 1 - include/hw/ppc/spapr_ovec.h | 7 +++ 7 files changed, 109 insertions(+), 37 deletions(-)
Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/4] spapr: disable hotplugging without OS
On 05/23/2017 02:52 PM, Daniel Henrique Barboza wrote: Hi Laurent, This is an interesting patch series. I've been working in the last weeks in the DRC migration, mainly to solve the problem in which a hot CPU unplug will not succeed after a migration if the CPU was hotplugged in the source. The problem happened when migrating with virsh because Libvirt hotplugs the CPU in both source and target, and the DRC state of the hotplugged after the migration is inconsistent. This series solves the issue by preventing it from happening in the first place. Of course that migrating DRC states has other uses (pending unplug operations during a migration, for example) so both patch series can coexist. One possible issue I see with this series is that it breaks Libvirt migration entirely if a CPU/mem hotplug happens in the target. With your series applied the migration fails before start with: Sorry: if a migration happens in the *source*, before the migration. # ./virsh -c 'qemu:///system' migrate --live --domain dhb_ub1704_nfs --desturi qemu+ssh://target_ip/system --timeout 60 --verbose error: internal error: unable to execute QEMU command 'device_add': CPU hotplug not supported without OS Note that I say "possible issue" because, although I believe we do not want to break Libvirt if possible, I also believe that we need to think about what makes sense in QEMU first. Thanks, Daniel On 05/23/2017 08:18 AM, Laurent Vivier wrote: If the OS is not started, QEMU sends an event to the OS that is lost and cannot be recovered. An unplug is not able to restore QEMU in a coherent state. So, while the OS is not started, disable CPU and memory hotplug. We use option vector 6 to know if the OS is started This series moves error checking for memory hotplug in a pre_plug function, and introduces the option vector 6 management. It also revert previous fix which was not really fixing the hotplug problem when the OS is not running. Laurent Vivier (4): spapr: add pre_plug function for memory spapr: add option vector 6 spapr: disable hotplugging without OS Revert "spapr: fix memory hot-unplugging" hw/ppc/spapr.c | 103 hw/ppc/spapr_drc.c | 20 ++--- hw/ppc/spapr_hcall.c| 5 ++- hw/ppc/spapr_ovec.c | 8 include/hw/ppc/spapr.h | 2 + include/hw/ppc/spapr_drc.h | 1 - include/hw/ppc/spapr_ovec.h | 7 +++ 7 files changed, 109 insertions(+), 37 deletions(-)
Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/4] spapr: disable hotplugging without OS
Hi Laurent, This is an interesting patch series. I've been working in the last weeks in the DRC migration, mainly to solve the problem in which a hot CPU unplug will not succeed after a migration if the CPU was hotplugged in the source. The problem happened when migrating with virsh because Libvirt hotplugs the CPU in both source and target, and the DRC state of the hotplugged after the migration is inconsistent. This series solves the issue by preventing it from happening in the first place. Of course that migrating DRC states has other uses (pending unplug operations during a migration, for example) so both patch series can coexist. One possible issue I see with this series is that it breaks Libvirt migration entirely if a CPU/mem hotplug happens in the target. With your series applied the migration fails before start with: # ./virsh -c 'qemu:///system' migrate --live --domain dhb_ub1704_nfs --desturi qemu+ssh://target_ip/system --timeout 60 --verbose error: internal error: unable to execute QEMU command 'device_add': CPU hotplug not supported without OS Note that I say "possible issue" because, although I believe we do not want to break Libvirt if possible, I also believe that we need to think about what makes sense in QEMU first. Thanks, Daniel On 05/23/2017 08:18 AM, Laurent Vivier wrote: If the OS is not started, QEMU sends an event to the OS that is lost and cannot be recovered. An unplug is not able to restore QEMU in a coherent state. So, while the OS is not started, disable CPU and memory hotplug. We use option vector 6 to know if the OS is started This series moves error checking for memory hotplug in a pre_plug function, and introduces the option vector 6 management. It also revert previous fix which was not really fixing the hotplug problem when the OS is not running. Laurent Vivier (4): spapr: add pre_plug function for memory spapr: add option vector 6 spapr: disable hotplugging without OS Revert "spapr: fix memory hot-unplugging" hw/ppc/spapr.c | 103 hw/ppc/spapr_drc.c | 20 ++--- hw/ppc/spapr_hcall.c| 5 ++- hw/ppc/spapr_ovec.c | 8 include/hw/ppc/spapr.h | 2 + include/hw/ppc/spapr_drc.h | 1 - include/hw/ppc/spapr_ovec.h | 7 +++ 7 files changed, 109 insertions(+), 37 deletions(-)
Re: [Qemu-devel] [PATCH] block/gluster: glfs_lseek() workaround
On 05/23/2017 12:27 PM, Jeff Cody wrote: > On current released versions of glusterfs, glfs_lseek() will sometimes > return invalid values for SEEK_DATA or SEEK_HOLE. For SEEK_DATA and > SEEK_HOLE, the returned value should be >= the passed offset, or < 0 in > the case of error: > > However, occasionally glfs_lseek() for SEEK_HOLE/DATA will return a > value less than the passed offset, yet greater than zero. Ouch. > > Although this is being fixed in gluster, we still should work around it > in QEMU, given that multiple released versions of gluster behave this > way. Fair enough. > > This patch treats the return case of (offs < start) the same as if an > error value other than ENXIO is returned; we will assume we learned > nothing, and there are no holes in the file. Yes, holes are merely an optimization, so treating bugs by the pessimistic fallback of no holes rather than aborting is reasonable. > +++ b/block/gluster.c > @@ -1275,7 +1275,14 @@ static int find_allocation(BlockDriverState *bs, off_t > start, > if (offs < 0) { > return -errno; /* D3 or D4 */ > } > -assert(offs >= start); > + > +if (offs < start) { > +/* This is not a valid return by lseek(). We are safe to just return > + * -EIO in this case, and we'll treat it like D4. Unfortunately some > + * versions of libgfapi will return offs < start, so an assert here > + * will unneccesarily abort QEMU. */ s/unneccesarily/unnecessarily/ (twice) With spelling fix, Reviewed-by: Eric Blake-- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 29/31] target/s390x: Use atomic operations for COMPARE SWAP PURGE
On 2017-05-23 09:31, Richard Henderson wrote: > On 05/23/2017 05:28 AM, Aurelien Jarno wrote: > > On 2017-05-22 20:03, Richard Henderson wrote: > > > +/* flush global tlb */ > > > +void HELPER(purge)(CPUS390XState *env) > > > +{ > > > +S390CPU *cpu = s390_env_get_cpu(env); > > > + > > > +tlb_flush_all_cpus(CPU(cpu)); > > > > > From what I understand from the PoP, the instruction should not complete > > before the TLB has been purged on all CPUs. Therefore I guess > > tlb_flush_all_cpus_synced() should be used instead. > I don't read that from this: > > # (1) all specified entries have been cleared > # from the ALB and TLB of this CPU and > > # (2) all other > # CPUs in the configuration have completed any stor- > # age accesses, including the updating of the change > # and reference bits, by using the specified ALB and > # TLB entries. > > It talks about referenced bits being updated -- presumably before the tlb > entry is flushed. But it doesn't say "all specified ALB and TLB entries of > other CPUs in the configuration". > > But if you still disagree, it's certainly an easy change as you note. Well i have to say it's not very clear. My point is that given the way QEMU model things, if we want to ensure that all storage accesses using the specified TLB entries are completed, we currently can only just make sure that the all TLB entries have been flushed. > > > +tcg_gen_atomic_cmpxchg_i64(old, addr, o->in1, o->out2, > > > > Here the prep generator took the 32-bit version of in1. I guess the same > > should be done for out2. > > No, in1 is zero-extended for its use ... > > > > > > + get_mem_index(s), mop | MO_ALIGN); > > > +tcg_temp_free_i64(addr); > > > + > > > +/* Are the memory and expected values (un)equal? */ > > > +cc = tcg_temp_new_i64(); > > > +tcg_gen_setcond_i64(TCG_COND_NE, cc, o->in1, old); > > ... here. > > For out2 above, cmpxchg acts as any other store wrt MO_TEUL, in that it > ignores the unused upper bits. Indeed you are correct, I read it too fast. -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
[Qemu-devel] [PATCH v2] qapi: Fix some QMP documentation regressions
In the process of getting rid of docs/qmp-commands.txt, we managed to regress on some of the text that changed after the point where the move was first branched and when the move actually occurred. For example, commit 3282eca for blockdev-snapshot re-added the extra "options" layer which had been cleaned up in commit 0153d2f. This clears up all regressions identified over the range 02b351d..bd6092e: https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg05127.html as well as a cleanup to x-blockdev-remove-medium to prefer 'id' over 'device' (matching the cleanup for 'eject'). CC: qemu-triv...@nongnu.org Signed-off-by: Eric Blake--- v2: add 'eject', 'x-blockdev-remove-medium' cleanups, thanks to Markus' audit. Can go in either via trivial or via qapi tree qapi/block-core.json | 28 ++-- qapi/block.json | 2 +- 2 files changed, 15 insertions(+), 15 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index 309b1df..88a7471 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1206,11 +1206,11 @@ # Example: # # -> { "execute": "blockdev-add", -# "arguments": { "options": { "driver": "qcow2", -# "node-name": "node1534", -# "file": { "driver": "file", -#"filename": "hd1.qcow2" }, -# "backing": "" } } } +# "arguments": { "driver": "qcow2", +# "node-name": "node1534", +# "file": { "driver": "file", +# "filename": "hd1.qcow2" }, +# "backing": "" } } # # <- { "return": {} } # @@ -3214,7 +3214,7 @@ # <- { "return": {} } # # -> { "execute": "x-blockdev-remove-medium", -# "arguments": { "device": "ide0-1-0" } } +# "arguments": { "id": "ide0-1-0" } } # # <- { "return": {} } # @@ -3245,10 +3245,10 @@ # # -> { "execute": "blockdev-add", # "arguments": { -# "options": { "node-name": "node0", -# "driver": "raw", -# "file": { "driver": "file", -# "filename": "fedora.iso" } } } } +# "node-name": "node0", +# "driver": "raw", +# "file": { "driver": "file", +#"filename": "fedora.iso" } } } # <- { "return": {} } # # -> { "execute": "x-blockdev-insert-medium", @@ -3701,10 +3701,10 @@ # 1. Add a new node to a quorum # -> { "execute": "blockdev-add", # "arguments": { -# "options": { "driver": "raw", -# "node-name": "new_node", -#"file": { "driver": "file", -# "filename": "test.raw" } } } } +# "driver": "raw", +# "node-name": "new_node", +# "file": { "driver": "file", +#"filename": "test.raw" } } } # <- { "return": {} } # -> { "execute": "x-blockdev-change", # "arguments": { "parent": "disk1", diff --git a/qapi/block.json b/qapi/block.json index 6a2fdc7..414b61b 100644 --- a/qapi/block.json +++ b/qapi/block.json @@ -180,7 +180,7 @@ # # Example: # -# -> { "execute": "eject", "arguments": { "device": "ide1-0-1" } } +# -> { "execute": "eject", "arguments": { "id": "ide1-0-1" } } # <- { "return": {} } ## { 'command': 'eject', -- 2.9.4
Re: [Qemu-devel] [PATCH 1/1] Improve Cocoa modifier key handling
On Tue, May 23, 2017 at 8:52 AM Ian McKellarwrote: > On Tue, May 23, 2017 at 3:03 AM Gerd Hoffmann wrote: > >> >> I'm wondering whenever we should just use modifierFlags all the time. >> > > Probably. My initial patch tried to be minimally intrusive but I can try > reworking the NSEventTypeFlagsChanged handling to compare the new > modifierFlags to the last we've seen and synthesize the appropriate key > down/up events. > > After a little more experimentation I think that the approach in this patch is the right one. modifierFlags doesn't[1] indicate which instance of a modifier (ie: left or right) is being held. Except for the NSEventTypeFlagsChanged that's synthesized when a window takes focus the event's keyCode indicates which physical key is affected. That first event after focus has keyCode==0 which we can detect because 0 is the keyCode of the 'A' key which isn't a modifier. Ian [1] actually there are undocumented flags outside the NSEventModifierFlagDeviceIndependentFlagsMask mask that indicate left/right keys but I don't think we should use them.
Re: [Qemu-devel] [PATCH 02/31] target/s390x: Implement EXECUTE via new TranslationBlock
On 2017-05-23 08:54, Richard Henderson wrote: > On 05/23/2017 03:48 AM, Aurelien Jarno wrote: > > On 2017-05-22 20:02, Richard Henderson wrote: > > > Previously, helper_ex would construct the insn and then implement > > > the insn via direct calls other helpers. This was sufficient to > > > boot Linux but that is all. > > > > > > It is easy enough to go the whole nine yards by stashing state for > > > EXECUTE within the cpu, and then relying on a new TB to be created > > > that properly and completely interprets the insn. > > > > > > Signed-off-by: Richard Henderson> > > --- > > > target/s390x/cpu.h | 4 +- > > > target/s390x/helper.h | 2 +- > > > target/s390x/insn-data.def | 4 +- > > > target/s390x/machine.c | 19 +++ > > > target/s390x/mem_helper.c | 136 > > > +++-- > > > target/s390x/translate.c | 124 > > > + > > > 6 files changed, 133 insertions(+), 156 deletions(-) > > > > This looks good on the principle, and finally removes a big hack. That > > said it prevent my test system to boot. I haven't investigated why yet. > > Hmm. I've not got a complete environment -- merely booting a kernel up to > the point it fails to find a rootfs. Which did find several problems with > my first attempts at this, but wouldn't have exercised paging. I'll try > again to get a full install working... > > I wonder if I needed to adjust s390_cpu_handle_mmu_fault (and its myriad > subroutines) to handle setting ILEN correctly. > > There might be a simpler fix though. Currently I advance the PC and > remember the ilen of the EX(RL). Maybe better to *not* advance the PC so as > to have the original EX(RL) right there for ILEN_LATER and ILEN_LATER_INC to > operate on. > > Something like this, as a delta patch. Unfortunately it doesn't work. So far I have no real idea what could be the root cause of the issue. I have just determined that up to the crash, only a very limited set of instructions are being executed. They are the 4 bytes long versions of MVC, CLC, XC, TR. -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
[Qemu-devel] [PATCH] block/gluster: glfs_lseek() workaround
On current released versions of glusterfs, glfs_lseek() will sometimes return invalid values for SEEK_DATA or SEEK_HOLE. For SEEK_DATA and SEEK_HOLE, the returned value should be >= the passed offset, or < 0 in the case of error: LSEEK(2): off_t lseek(int fd, off_t offset, int whence); [...] SEEK_HOLE Adjust the file offset to the next hole in the file greater than or equal to offset. If offset points into the middle of a hole, then the file offset is set to offset. If there is no hole past offset, then the file offset is adjusted to the end of the file (i.e., there is an implicit hole at the end of any file). [...] RETURN VALUE Upon successful completion, lseek() returns the resulting offset location as measured in bytes from the beginning of the file. On error, the value (off_t) -1 is returned and errno is set to indicate the error However, occasionally glfs_lseek() for SEEK_HOLE/DATA will return a value less than the passed offset, yet greater than zero. For instance, here are example values observed from this call: offs = glfs_lseek(s->fd, start, SEEK_HOLE); if (offs < 0) { return -errno; /* D1 and (H3 or H4) */ } start == 7608336384 offs == 7607877632 This causes QEMU to abort on the assert test. When this value is returned, errno is also 0. This is a reported and known bug to glusterfs: https://bugzilla.redhat.com/show_bug.cgi?id=1425293 Although this is being fixed in gluster, we still should work around it in QEMU, given that multiple released versions of gluster behave this way. This patch treats the return case of (offs < start) the same as if an error value other than ENXIO is returned; we will assume we learned nothing, and there are no holes in the file. Signed-off-by: Jeff Cody--- block/gluster.c | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/block/gluster.c b/block/gluster.c index 7c76cd0..c147909e 100644 --- a/block/gluster.c +++ b/block/gluster.c @@ -1275,7 +1275,14 @@ static int find_allocation(BlockDriverState *bs, off_t start, if (offs < 0) { return -errno; /* D3 or D4 */ } -assert(offs >= start); + +if (offs < start) { +/* This is not a valid return by lseek(). We are safe to just return + * -EIO in this case, and we'll treat it like D4. Unfortunately some + * versions of libgfapi will return offs < start, so an assert here + * will unneccesarily abort QEMU. */ +return -EIO; +} if (offs > start) { /* D2: in hole, next data at offs */ @@ -1307,7 +1314,14 @@ static int find_allocation(BlockDriverState *bs, off_t start, if (offs < 0) { return -errno; /* D1 and (H3 or H4) */ } -assert(offs >= start); + +if (offs < start) { +/* This is not a valid return by lseek(). We are safe to just return + * -EIO in this case, and we'll treat it like H4. Unfortunately some + * versions of libgfapi will return offs < start, so an assert here + * will unneccesarily abort QEMU. */ +return -EIO; +} if (offs > start) { /* -- 2.9.3
Re: [Qemu-devel] [PATCH] qapi: Fix some blockdev-add documentation regressions
On 05/23/2017 09:32 AM, Eric Blake wrote: > On 05/23/2017 03:39 AM, Markus Armbruster wrote: >> Eric Blakewrites: >> >>> In the process of getting rid of docs/qmp-commands.txt, we >>> managed to regress on any text that changed after the point >>> where the move was first branched and when the move actually >>> occurred. For example, commit 3282eca for blockdev-snapshot >>> re-added the extra "options" layer which had been cleaned up >>> in commit 0153d2f. >>> > > I'll post a v2, since I have another pending doc patch that hasn't been > reviewed yet, and since your audit means I need to update my commit > message anyway. Actually, it just got queued as trivial: https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg05323.html although git handles things just fine if it goes in earlier through any other tree. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 30/31] target/s390x: Implement CSPG
On 2017-05-23 09:33, Richard Henderson wrote: > On 05/23/2017 04:12 AM, Aurelien Jarno wrote: > > On 2017-05-22 20:03, Richard Henderson wrote: > > > Signed-off-by: Richard Henderson> > > --- > > > target/s390x/insn-data.def | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def > > > index 4c91f30..8604847 100644 > > > --- a/target/s390x/insn-data.def > > > +++ b/target/s390x/insn-data.def > > > @@ -838,6 +838,7 @@ > > > #ifndef CONFIG_USER_ONLY > > > /* COMPARE AND SWAP AND PURGE */ > > > D(0xb250, CSP, RRE, Z, r1_32u, ra2, r1_P, 0, csp, 0, > > > MO_TEUL) > > > +D(0xb98a, CSPG,RRE, Z, r1_o, ra2, r1_P, 0, csp, 0, MO_TEQ) > > > > CSPG is part of the of the DAT-enhancement facility. I called it DAT_ENH > > in my local patches to match the name we have in the CPU features. > > The translator needs a large overhaul to bring it into line with the (more > recently added) facilities infrastructure. We don't currently enforce > anything. Agreed. That said it's actually useful to quickly check if all the instructions of a given facility have been implemented and then flip the corresponding facility bit. -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PATCH v2] tests/libqtest: Print error instead of aborting when env variable is missing
On 05/23/2017 03:25 AM, Thomas Huth wrote: > On 22.05.2017 17:58, no-re...@patchew.org wrote: >> This series failed automatic build test. Please find the testing commands and >> their output below. If you have docker installed, you can probably reproduce >> it >> locally. > [...] >> LEX convert-dtsv0-lexer.lex.c >> DEP /tmp/qemu-test/src/dtc/fdtdump.c >> make[1]: flex: Command not found >> BISON dtc-parser.tab.c >> make[1]: bison: Command not found >> DEP /tmp/qemu-test/src/dtc/srcpos.c >> LEX dtc-lexer.lex.c >> make[1]: flex: Command not found >> DEP /tmp/qemu-test/src/dtc/treesource.c >> DEP /tmp/qemu-test/src/dtc/livetree.c >> DEP /tmp/qemu-test/src/dtc/fstree.c >> DEP /tmp/qemu-test/src/dtc/flattree.c >> DEP /tmp/qemu-test/src/dtc/dtc.c >> DEP /tmp/qemu-test/src/dtc/data.c >> DEP /tmp/qemu-test/src/dtc/checks.c >> CHK version_gen.h >> BISON dtc-parser.tab.c >> make[1]: bison: Command not found >> LEX convert-dtsv0-lexer.lex.c >> UPD version_gen.h >> make[1]: flex: Command not found >> LEX dtc-lexer.lex.c >> make[1]: flex: Command not found >> DEP /tmp/qemu-test/src/dtc/util.c >> LEX convert-dtsv0-lexer.lex.c >> BISON dtc-parser.tab.c >> make[1]: flex: Command not found >> make[1]: bison: Command not found >> LEX dtc-lexer.lex.c >> make[1]: flex: Command not found > > Looks like flex and bison are missing in the docker image? Could someone > please add it? > > [...] >> CC replay/replay-internal.o >> /tmp/qemu-test/src/replay/replay-internal.c: In function ‘replay_put_array’: >> /tmp/qemu-test/src/replay/replay-internal.c:65: warning: ignoring return >> value of ‘fwrite’, declared with attribute warn_unused_result > > I guess that should be fixed? > > [...] >> CC slirp/tcp_input.o >> CC slirp/tcp_output.o >> CC slirp/tcp_subr.o >> CC slirp/tcp_timer.o >> /tmp/qemu-test/src/slirp/tcp_input.c: In function ‘tcp_input’: >> /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_p’ may be >> used uninitialized in this function >> /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_len’ may be >> used uninitialized in this function >> /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_tos’ may be >> used uninitialized in this function >> /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_id’ may be >> used uninitialized in this function >> /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_off’ may be >> used uninitialized in this function >> /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_ttl’ may be >> used uninitialized in this function >> /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_sum’ may be >> used uninitialized in this function >> /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_src.s_addr’ >> may be used uninitialized in this function >> /tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_dst.s_addr’ >> may be used uninitialized in this function >> /tmp/qemu-test/src/slirp/tcp_input.c:220: warning: ‘save_ip6.ip_nh’ may be >> used uninitialized in this function > > I've never seen these warnings in tcp_input.c before ... and they also > look like false positives too me ... is that GCC in the docker image too > sensitive? > > [...] >> CC aarch64-softmmu/hw/timer/exynos4210_rtc.o >> /tmp/qemu-test/src/hw/i386/pc_piix.c: In function >> ‘igd_passthrough_isa_bridge_create’: >> /tmp/qemu-test/src/hw/i386/pc_piix.c:1067: warning: ‘pch_rev_id’ may be used >> uninitialized in this function > > dito > >> CC aarch64-softmmu/hw/usb/tusb6010.o >> /tmp/qemu-test/src/hw/i386/acpi-build.c: In function >> ‘build_append_pci_bus_devices’: >> /tmp/qemu-test/src/hw/i386/acpi-build.c:525: warning: ‘notify_method’ may be >> used uninitialized in this function > > dito > > [...] >> /tmp/qemu-test/src/tests/ide-test.c: In function ‘cdrom_pio_impl’: >> /tmp/qemu-test/src/tests/ide-test.c:803: warning: ignoring return value of >> ‘fwrite’, declared with attribute warn_unused_result >> /tmp/qemu-test/src/tests/ide-test.c: In function ‘test_cdrom_dma’: >> /tmp/qemu-test/src/tests/ide-test.c:899: warning: ignoring return value of >> ‘fwrite’, declared with attribute warn_unused_result > > Needs a patch, I guess? > Huh, I've never seen these come up locally, I'll send a patch (if only to quiet patchew.) >> GTESTER tests/test-vmstate >> Failed to load simple/primitive:b_1 >> Failed to load simple/primitive:i64_2 >> Failed to load simple/primitive:i32_1 >> Failed to load simple/primitive:i32_1 >> Failed to load test/with_tmp:a >> Failed to load test/tmp_child_parent:f >> Failed to load test/tmp_child:parent >> Failed to load test/with_tmp:tmp >> Failed to load test/tmp_child:diff >> Failed to load test/with_tmp:tmp >> Failed to load test/tmp_child:diff >> Failed to load test/with_tmp:tmp > > That works for me when I
Re: [Qemu-devel] [PATCH v2 4/4] 9pfs: local: metadata file for the VirtFS root
On 05/23/2017 09:32 AM, Greg Kurz wrote: > When using the mapped-file security, credentials are stored in a metadata > directory located in the parent directory. This is okay for all paths with > the notable exception of the root path, since we don't want and probably > can't create a metadata directory above the virtfs directory on the host. > > This patch introduces a dedicated metadata file, sitting in the virtfs root > for this purpose. It relies on the fact that the "." name necessarily refers > to the virtfs root. > > As for the metadata directory, we don't want the client to see this file. > The current code only cares for readdir() but there are many other places > to fix actually. The filtering logic is hence put in a separate function. > > @@ -478,7 +504,8 @@ static off_t local_telldir(FsContext *ctx, > V9fsFidOpenState *fs) > > static bool local_is_mapped_file_metadata(FsContext *fs_ctx, const char > *name) > { > -return !strcmp(name, VIRTFS_META_DIR); > +return > +!strcmp(name, VIRTFS_META_DIR) || !strcmp(name, > VIRTFS_META_ROOT_FILE); We have to block VIRTFS_META_DIR at any depth in the hierarchy, but can/should we change the blocking of VIRTFS_META_ROOT_FILE to only happen at the root directory, rather than at all directories? On the other hand, if you can simultaneously map /path/to/a for one mount point, and /path/to/a/b for another, then the root file for B is visible at a lower depth than the root file for A, and blocking the metafile name everywhere means that the mount A can't influence the behavior on the mount for B. Not tested, but looks sane, so: Reviewed-by: Eric Blake-- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [Xen-devel] [RFC PATCH V2 1/2] xen-pt: bind/unbind interrupt remapping format MSI
On Tue, May 23, 2017 at 08:16:25PM +0800, Lan Tianyu wrote: > On 2017年05月19日 20:04, Jan Beulich wrote: > On 19.05.17 at 13:16,wrote: > >> On Thu, May 18, 2017 at 01:32:59AM -0400, Lan Tianyu wrote: > >>> --- a/include/hw/i386/apic-msidef.h > >>> +++ b/include/hw/i386/apic-msidef.h > >>> @@ -26,6 +26,7 @@ > >>> > >>> #define MSI_ADDR_DEST_ID_SHIFT 12 > >>> #define MSI_ADDR_DEST_IDX_SHIFT 4 > >>> -#define MSI_ADDR_DEST_ID_MASK 0x000 > >>> +#define MSI_ADDR_DEST_ID_MASK 0x000fff00 > >> The value of MSI_ADDR_DEST_ID_MASK is changed here. I think the patch > >> should be: > >> +#define MSI_ADDR_DEST_ID_MASK 0x0000 > > Judging from other sources, rather the other way around - the > > mask needs to have further bits removed (should be 0x000ff000 > > afaict). Xen sources confirm this, and while Linux has the value > > you suggest, that contradicts > Agree. Defining the mask as "0x000ff000" makes more sense. > Just check Qemu source code. Only apic_send_msi() and msi_dest_id() use > the mask > to get dest apic id. They mask MSI address field with > MSI_ADDR_DEST_ID_MASK and > then right-shift 12bit. The low 12bit won't be used. > > Anthony, does this make sense? Yes, it does. The change to MSI_ADDR_DEST_ID_MASK should probably go in its own patch. -- Anthony PERARD
Re: [Qemu-devel] [PATCH 1/4] block: Fix anonymous BBs in blk_root_inactivate()
Kevin Wolfwrote: > blk->name isn't an array, but a pointer that can be NULL. Checking for > an anonymous BB must involve a NULL check first, otherwise we get > crashes. > > Signed-off-by: Kevin Wolf Reviewed-by: Juan Quintela
Re: [Qemu-devel] [PATCH 30/31] target/s390x: Implement CSPG
On 05/23/2017 04:12 AM, Aurelien Jarno wrote: On 2017-05-22 20:03, Richard Henderson wrote: Signed-off-by: Richard Henderson--- target/s390x/insn-data.def | 1 + 1 file changed, 1 insertion(+) diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def index 4c91f30..8604847 100644 --- a/target/s390x/insn-data.def +++ b/target/s390x/insn-data.def @@ -838,6 +838,7 @@ #ifndef CONFIG_USER_ONLY /* COMPARE AND SWAP AND PURGE */ D(0xb250, CSP, RRE, Z, r1_32u, ra2, r1_P, 0, csp, 0, MO_TEUL) +D(0xb98a, CSPG,RRE, Z, r1_o, ra2, r1_P, 0, csp, 0, MO_TEQ) CSPG is part of the of the DAT-enhancement facility. I called it DAT_ENH in my local patches to match the name we have in the CPU features. The translator needs a large overhaul to bring it into line with the (more recently added) facilities infrastructure. We don't currently enforce anything. But you're also right that I shouldn't just ignore the issue and leave it marked incorrectly. r~
Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] spapr: add option vector 6
On Tue, 23 May 2017 13:18:10 +0200 Laurent Vivierwrote: > This allows to know when the OS is started and its type. > > Signed-off-by: Laurent Vivier > --- > hw/ppc/spapr.c | 36 > hw/ppc/spapr_hcall.c| 5 - > hw/ppc/spapr_ovec.c | 8 > include/hw/ppc/spapr.h | 2 ++ > include/hw/ppc/spapr_ovec.h | 7 +++ > 5 files changed, 57 insertions(+), 1 deletion(-) > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > index 0e8d8d1..eceb4cc 100644 > --- a/hw/ppc/spapr.c > +++ b/hw/ppc/spapr.c > @@ -1369,6 +1369,7 @@ static void ppc_spapr_reset(void) > first_ppc_cpu->env.nip = SPAPR_ENTRY_POINT; > > spapr->cas_reboot = false; > +spapr->os_name = OV6_NONE; > } > > static void spapr_create_nvram(sPAPRMachineState *spapr) > @@ -1524,10 +1525,41 @@ static const VMStateDescription > vmstate_spapr_patb_entry = { > }, > }; > > +static bool spapr_os_name_needed(void *opaque) > +{ > +sPAPRMachineState *spapr = opaque; > +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr); > +return smc->need_os_name; So this will have the subsection migrated unconditionally even if the value wasn't changed from the default yet ? Also, it looks weird to involve a machine compat flag here... if the concern is backwards migration then I guess you should check the compat flag in h_client_architecture_support() and not set @os_name for older machines. > +} > + > +static const VMStateDescription vmstate_spapr_os_name = { > +.name = "spapr_os_name", > +.version_id = 1, > +.minimum_version_id = 1, > +.needed = spapr_os_name_needed, > +.fields = (VMStateField[]) { > +VMSTATE_UINT8(os_name, sPAPRMachineState), > +VMSTATE_END_OF_LIST() > +}, > +}; > + > +static int spapr_pre_load(void *opaque) > +{ > +sPAPRMachineState *spapr = opaque; > + > +/* if the os_name is not migrated from the source, > + * we must allow hotplug, so set os_name to linux > + */ > +spapr->os_name = OV6_LINUX; But maybe the source was in SLOF and I guess you don't want @os_name to be set in this case... The correct way to restore older machines behavior is to set @os_name to OV6_LINUX according to the compat flag. > + > +return 0; > +} > + > static const VMStateDescription vmstate_spapr = { > .name = "spapr", > .version_id = 3, > .minimum_version_id = 1, > +.pre_load = spapr_pre_load, > .post_load = spapr_post_load, > .fields = (VMStateField[]) { > /* used to be @next_irq */ > @@ -1542,6 +1574,7 @@ static const VMStateDescription vmstate_spapr = { > .subsections = (const VMStateDescription*[]) { > _spapr_ov5_cas, > _spapr_patb_entry, > +_spapr_os_name, > NULL > } > }; > @@ -3216,6 +3249,7 @@ static void spapr_machine_class_init(ObjectClass *oc, > void *data) > * in which LMBs are represented and hot-added > */ > mc->numa_mem_align_shift = 28; > +smc->need_os_name = true; The name seems to indicate the machine requires this, which is obviously not the case... what about @parse_os_name instead ? > } > > static const TypeInfo spapr_machine_info = { > @@ -3293,9 +3327,11 @@ static void > spapr_machine_2_9_instance_options(MachineState *machine) > > static void spapr_machine_2_9_class_options(MachineClass *mc) > { > +sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc); > spapr_machine_2_10_class_options(mc); > SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_9); > mc->numa_auto_assign_ram = numa_legacy_auto_assign_ram; > +smc->need_os_name = false; > } > > DEFINE_SPAPR_MACHINE(2_9, "2.9", false); > diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c > index 0d608d6..5dbe3c7 100644 > --- a/hw/ppc/spapr_hcall.c > +++ b/hw/ppc/spapr_hcall.c > @@ -1058,7 +1058,8 @@ static target_ulong > h_client_architecture_support(PowerPCCPU *cpu, > uint32_t max_compat = cpu->max_compat; > uint32_t best_compat = 0; > int i; > -sPAPROptionVector *ov1_guest, *ov5_guest, *ov5_cas_old, *ov5_updates; > +sPAPROptionVector *ov1_guest, *ov5_guest, *ov5_cas_old, *ov5_updates, > + *ov6_guest; > bool guest_radix; > > /* > @@ -1112,6 +1113,7 @@ static target_ulong > h_client_architecture_support(PowerPCCPU *cpu, > > ov1_guest = spapr_ovec_parse_vector(ov_table, 1); > ov5_guest = spapr_ovec_parse_vector(ov_table, 5); > +ov6_guest = spapr_ovec_parse_vector(ov_table, 6); > if (spapr_ovec_test(ov5_guest, OV5_MMU_BOTH)) { > error_report("guest requested hash and radix MMU, which is > invalid."); > exit(EXIT_FAILURE); > @@ -1154,6 +1156,7 @@ static target_ulong > h_client_architecture_support(PowerPCCPU *cpu, > } > spapr->cas_legacy_guest_workaround = !spapr_ovec_test(ov1_guest, >
Re: [Qemu-devel] [PATCH 29/31] target/s390x: Use atomic operations for COMPARE SWAP PURGE
On 05/23/2017 05:28 AM, Aurelien Jarno wrote: On 2017-05-22 20:03, Richard Henderson wrote: +/* flush global tlb */ +void HELPER(purge)(CPUS390XState *env) +{ +S390CPU *cpu = s390_env_get_cpu(env); + +tlb_flush_all_cpus(CPU(cpu)); From what I understand from the PoP, the instruction should not complete before the TLB has been purged on all CPUs. Therefore I guess tlb_flush_all_cpus_synced() should be used instead. I don't read that from this: # (1) all specified entries have been cleared # from the ALB and TLB of this CPU and # (2) all other # CPUs in the configuration have completed any stor- # age accesses, including the updating of the change # and reference bits, by using the specified ALB and # TLB entries. It talks about referenced bits being updated -- presumably before the tlb entry is flushed. But it doesn't say "all specified ALB and TLB entries of other CPUs in the configuration". But if you still disagree, it's certainly an easy change as you note. +tcg_gen_atomic_cmpxchg_i64(old, addr, o->in1, o->out2, Here the prep generator took the 32-bit version of in1. I guess the same should be done for out2. No, in1 is zero-extended for its use ... + get_mem_index(s), mop | MO_ALIGN); +tcg_temp_free_i64(addr); + +/* Are the memory and expected values (un)equal? */ +cc = tcg_temp_new_i64(); +tcg_gen_setcond_i64(TCG_COND_NE, cc, o->in1, old); ... here. For out2 above, cmpxchg acts as any other store wrt MO_TEUL, in that it ignores the unused upper bits. r~
Re: [Qemu-devel] [PATCH 4/4] qemu-iotests: Block migration test
On 05/23/2017 11:18 AM, Kevin Wolf wrote: > Am 23.05.2017 um 17:46 hat Eric Blake geschrieben: >> On 05/23/2017 09:01 AM, Kevin Wolf wrote: >>> Signed-off-by: Kevin Wolf>>> --- >>> tests/qemu-iotests/183 | 121 >>> + >>> tests/qemu-iotests/183.out | 48 ++ >>> tests/qemu-iotests/group | 1 + >>> 3 files changed, 170 insertions(+) >>> create mode 100755 tests/qemu-iotests/183 >>> create mode 100644 tests/qemu-iotests/183.out >> >> Does this test gracefully skip when coupled with the efforts for a >> configure option to disable block migration? >> https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg04398.html > > Obviously it doesn't. How would we even check? Just grep whether the > magic error string turns up somewhere after doing 'migrate -b'? I think the easiest way (once Juan's series lands) would be to try this QMP on a standalone qemu execution prior to firing up the rest of the test: { "execute":"migrate-set-capabilities", "arguments":{ "capabilities": [ { "capability":"block", "state":true } ] } } If that command fails, block migration is not compiled in (or we're talking to an older qemu that lacks the capability altogether, but we don't have to worry about that in our iotests). Of course, if we do that, then we could use QMP 'migrate' with the capabilities rather than HMP -b to drive the same aspect of the test. > > I don't think we have other test cases that don't skip immediately, but > only after doing half of the test, but I think it might work anyway. That's an option too - grepping for the magic failure string and gracefully exiting if we were unable to start migration. I think we have done something like that recently to grep whether we have op-blocking support via fcntl OFD semantics, and exit early if it is an older kernel that has less-sane locks based on the error message. >> Should we also check the use of -i? > > Good point, though I don't even know how to use -i manually. :-) > > We can either have a follow-up patch extending this one or a completely > separate test case for it. I would have to try out -i before I could say > which makes more sense. An incremental patch to expand this test is fine. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 28/31] target/s390x: Use unwind data for helper_lra
On 05/23/2017 05:11 AM, Aurelien Jarno wrote: On 2017-05-22 20:03, Richard Henderson wrote: Note that exception_index is not live during a TB, so there is no point saving it around mmu_translate. What do you mean by "is not live"? Indeed cpu_loop_exit() is not called so the TB is not terminated immediately. That said the while loop in cpu_exec() will trigger the exception after the TB. Ah, yes. I'll undo that bit. r~
Re: [Qemu-devel] [PATCH 4/4] qemu-iotests: Block migration test
Am 23.05.2017 um 17:46 hat Eric Blake geschrieben: > On 05/23/2017 09:01 AM, Kevin Wolf wrote: > > Signed-off-by: Kevin Wolf> > --- > > tests/qemu-iotests/183 | 121 > > + > > tests/qemu-iotests/183.out | 48 ++ > > tests/qemu-iotests/group | 1 + > > 3 files changed, 170 insertions(+) > > create mode 100755 tests/qemu-iotests/183 > > create mode 100644 tests/qemu-iotests/183.out > > Does this test gracefully skip when coupled with the efforts for a > configure option to disable block migration? > https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg04398.html Obviously it doesn't. How would we even check? Just grep whether the magic error string turns up somewhere after doing 'migrate -b'? I don't think we have other test cases that don't skip immediately, but only after doing half of the test, but I think it might work anyway. > > +echo > > +echo === Do block migration to destination === > > +echo > > + > > +_send_qemu_cmd $src "migrate -b unix:${MIG_SOCKET}" "(qemu)" > > I don't see any earlier checks, so it looks like you'd choke here if -b > is not compiled in. > > Should we also check the use of -i? Good point, though I don't even know how to use -i manually. :-) We can either have a follow-up patch extending this one or a completely separate test case for it. I would have to try out -i before I could say which makes more sense. Kevin pgpIbuEY8Irv7.pgp Description: PGP signature
Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/4] spapr: add pre_plug function for memory
On 23/05/2017 17:28, Greg Kurz wrote: > On Tue, 23 May 2017 13:18:09 +0200 > Laurent Vivierwrote: > >> This allows to manage errors before the memory >> has started to be hotplugged. We already have >> the function for the CPU cores. >> >> Signed-off-by: Laurent Vivier >> --- >> hw/ppc/spapr.c | 45 ++--- >> 1 file changed, 30 insertions(+), 15 deletions(-) >> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c >> index 0980d73..0e8d8d1 100644 >> --- a/hw/ppc/spapr.c >> +++ b/hw/ppc/spapr.c >> @@ -2569,20 +2569,6 @@ static void spapr_memory_plug(HotplugHandler >> *hotplug_dev, DeviceState *dev, >> uint64_t align = memory_region_get_alignment(mr); >> uint64_t size = memory_region_size(mr); >> uint64_t addr; >> -char *mem_dev; >> - >> -if (size % SPAPR_MEMORY_BLOCK_SIZE) { >> -error_setg(_err, "Hotplugged memory size must be a multiple >> of " >> - "%lld MB", SPAPR_MEMORY_BLOCK_SIZE/M_BYTE); >> -goto out; >> -} >> - >> -mem_dev = object_property_get_str(OBJECT(dimm), PC_DIMM_MEMDEV_PROP, >> NULL); >> -if (mem_dev && !kvmppc_is_mem_backend_page_size_ok(mem_dev)) { >> -error_setg(_err, "Memory backend has bad page size. " >> - "Use 'memory-backend-file' with correct mem-path."); >> -goto out; >> -} >> >> pc_dimm_memory_plug(dev, >hotplug_memory, mr, align, _err); >> if (local_err) { >> @@ -2603,6 +2589,33 @@ out: >> error_propagate(errp, local_err); >> } >> >> +static void spapr_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState >> *dev, >> +Error **errp) > > Indentation nit ok > >> +{ >> +PCDIMMDevice *dimm = PC_DIMM(dev); >> +PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm); >> +MemoryRegion *mr = ddc->get_memory_region(dimm); >> +uint64_t size = memory_region_size(mr); >> +Error *local_err = NULL; >> +char *mem_dev; >> + >> +if (size % SPAPR_MEMORY_BLOCK_SIZE) { >> +error_setg(_err, "Hotplugged memory size must be a multiple >> of " >> + "%lld MB", SPAPR_MEMORY_BLOCK_SIZE / M_BYTE); >> +goto out; >> +} >> + >> +mem_dev = object_property_get_str(OBJECT(dimm), PC_DIMM_MEMDEV_PROP, >> NULL); >> +if (mem_dev && !kvmppc_is_mem_backend_page_size_ok(mem_dev)) { >> +error_setg(_err, "Memory backend has bad page size. " >> + "Use 'memory-backend-file' with correct mem-path."); >> +goto out; >> +} >> + >> +out: >> +error_propagate(errp, local_err); > > As recently discussed with Markus Armbruster, it isn't necessary to have a > local Error * if you don't do anything else with it but propagate it. Yes, you are right, it's a stupid cut'n'paste. Thanks, Laurent
[Qemu-devel] virtio crypto device implemenation
Hi, Michael, As you know, Lei Gong from Huawei and I are co-working on virtio crypto device spec, he is focusing on symmetric algorithm part, I am focusing on asymmetric part. Now I am planning the implementation for asymmetric part, would you please give me your point regarding the questions below? Current virtio crypto device implementation from Lei Gong: The virtio crypto device implementation has been upstreamed to QEMU and it has a qemu backend implementation for symmetric algorithm part, the front end Linux device driver for symmetric part has been upstreamed to Linux kernel as well. My questions: From my side, I planned to add the asymmetric part support in upstreamed front end device driver, and I don't want to add the asymmetric algorithm support to current virtio crypto device's qemu backend, instead, I would like to implement and upstream a DPDK vhost-user based backend for asymmetric algorithm, and accordingly Lei Gong will help to upstream a vhost user agent for virtio crypto device in QEMU, is this approach acceptable? Is a qemu backend a mandatory requirement for the virtio crypto device? Is there a general policy for this? Thanks
Re: [Qemu-devel] [PATCH 02/31] target/s390x: Implement EXECUTE via new TranslationBlock
On 2017-05-23 12:48, Aurelien Jarno wrote: > On 2017-05-22 20:02, Richard Henderson wrote: > > Previously, helper_ex would construct the insn and then implement > > the insn via direct calls other helpers. This was sufficient to > > boot Linux but that is all. > > > > It is easy enough to go the whole nine yards by stashing state for > > EXECUTE within the cpu, and then relying on a new TB to be created > > that properly and completely interprets the insn. > > > > Signed-off-by: Richard Henderson> > --- > > target/s390x/cpu.h | 4 +- > > target/s390x/helper.h | 2 +- > > target/s390x/insn-data.def | 4 +- > > target/s390x/machine.c | 19 +++ > > target/s390x/mem_helper.c | 136 > > +++-- > > target/s390x/translate.c | 124 + > > 6 files changed, 133 insertions(+), 156 deletions(-) > > This looks good on the principle, and finally removes a big hack. That > said it prevent my test system to boot. I haven't investigated why yet. This can aslo be reproduced using the kernel and initrd from the daily Debian installer: https://d-i.debian.org/daily-images/s390x/daily/generic/ I am personally using the following command line: qemu-system-s390x -M s390-ccw-virtio -m 512 -nographic -kernel kernel.debian -initrd initrd.debian Aurelien -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PATCH 02/31] target/s390x: Implement EXECUTE via new TranslationBlock
On 05/23/2017 03:48 AM, Aurelien Jarno wrote: On 2017-05-22 20:02, Richard Henderson wrote: Previously, helper_ex would construct the insn and then implement the insn via direct calls other helpers. This was sufficient to boot Linux but that is all. It is easy enough to go the whole nine yards by stashing state for EXECUTE within the cpu, and then relying on a new TB to be created that properly and completely interprets the insn. Signed-off-by: Richard Henderson--- target/s390x/cpu.h | 4 +- target/s390x/helper.h | 2 +- target/s390x/insn-data.def | 4 +- target/s390x/machine.c | 19 +++ target/s390x/mem_helper.c | 136 +++-- target/s390x/translate.c | 124 + 6 files changed, 133 insertions(+), 156 deletions(-) This looks good on the principle, and finally removes a big hack. That said it prevent my test system to boot. I haven't investigated why yet. Hmm. I've not got a complete environment -- merely booting a kernel up to the point it fails to find a rootfs. Which did find several problems with my first attempts at this, but wouldn't have exercised paging. I'll try again to get a full install working... I wonder if I needed to adjust s390_cpu_handle_mmu_fault (and its myriad subroutines) to handle setting ILEN correctly. There might be a simpler fix though. Currently I advance the PC and remember the ilen of the EX(RL). Maybe better to *not* advance the PC so as to have the original EX(RL) right there for ILEN_LATER and ILEN_LATER_INC to operate on. Something like this, as a delta patch. r~ diff --git a/target/s390x/translate.c b/target/s390x/translate.c index 67c85f0..5773f92 100644 --- a/target/s390x/translate.c +++ b/target/s390x/translate.c @@ -2206,8 +2206,10 @@ static ExitStatus op_ex(DisasContext *s, DisasOps *o) tcg_temp_free_i64(v1); } -/* End the TB; a new TB will be created for modified insn. */ -return EXIT_PC_STALE; +/* End the TB; a new TB will be created for modified insn. + Note that the modified insn runs with this same PC. */ +update_psw_addr(s); +return EXIT_PC_UPDATED; } static ExitStatus op_fieb(DisasContext *s, DisasOps *o) @@ -5189,14 +5191,10 @@ static const DisasInsn *extract_insn insn = s->ex_value & 0xull; ilen = s->ex_value & 0xff; op = insn >> 56; -s->ilen = ilen; -s->next_pc = s->pc; } else { insn = ld_code2(env, pc); op = (insn >> 8) & 0xff; ilen = get_ilen(op); -s->ilen = ilen; -s->next_pc = s->pc + ilen; switch (ilen) { case 2: @@ -5212,6 +5210,8 @@ static const DisasInsn *extract_insn g_assert_not_reached(); } } +s->next_pc = s->pc + ilen; +s->ilen = ilen; /* We can't actually determine the insn format until we've looked up the full insn opcode. Which we can't do without locating the @@ -5470,17 +5470,14 @@ void gen_intermediate_code /* If we reach a page boundary, are single stepping, or exhaust instruction count, stop generation. */ -if (status == NO_EXIT) { -if (unlikely(dc.ex_value)) { -/* The PC on entry is already advanced. */ -status = EXIT_PC_UPDATED; -} else if (dc.pc >= next_page_start - || tcg_op_buf_full() - || num_insns >= max_insns - || singlestep - || cs->singlestep_enabled) { -status = EXIT_PC_STALE; -} +if (status == NO_EXIT +&& (dc.pc >= next_page_start +|| tcg_op_buf_full() +|| num_insns >= max_insns +|| singlestep +|| cs->singlestep_enabled +|| dc.ex_value)) { +status = EXIT_PC_STALE; } } while (status == NO_EXIT);
Re: [Qemu-devel] [PATCH 1/1] Improve Cocoa modifier key handling
On Tue, May 23, 2017 at 3:03 AM Gerd Hoffmannwrote: > Sounds like this happens in case there is a modifier state change > without linked key event, such as state change while qemu did not have > the keyboard focus. Nice that macos sends notifications in that case. > Yeah, I guess it makes sense to send an event to update modifier state when keyboard focus returns. > > I'm wondering whenever we should just use modifierFlags all the time. > Probably. My initial patch tried to be minimally intrusive but I can try reworking the NSEventTypeFlagsChanged handling to compare the new modifierFlags to the last we've seen and synthesize the appropriate key down/up events. Ian
Re: [Qemu-devel] [PATCH v2 3/4] 9pfs: local: simplify file opening
On 05/23/2017 09:32 AM, Greg Kurz wrote: > The logic to open a path currently sits between local_open_nofollow() and > the relative_openat_nofollow() helper, which has no other user. > > For the sake of clarity, this patch moves all the code of the helper into > its unique caller. While here we also: > - drop the code to skip leading "/" because the backend isn't supposed to > pass anything but relative paths without consecutive slashes. The assert() > is kept because we really don't want a buggy backend to pass an absolute odd spacing > path to openat(). > - use strchrnul() to get a simpler code. This is ok since virtfs if for s/if/is/ > linux+glibc hosts only. > - don't dup() the initial directory and add an assert() to ensure we don't > return the global mountfd to the caller. BTW, this would mean that the > caller passed an empty path, which isn't supposed to happen either. > > Signed-off-by: Greg Kurz> --- Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 2/4] 9pfs: local: resolve special directories in paths
On 05/23/2017 09:32 AM, Greg Kurz wrote: > When using the mapped-file security mode, the creds of a path /foo/bar > are stored in the /foo/.virtfs_metadata/bar file. This is okay for all > paths unless they end with '.' or '..', because we cannot create the > corresponding file in the metadata directory. > > This patch ensures that '.' and '..' are resolved in all paths. > > The core code only passes path elements (no '/') to the backend, with > the notable exception of the '/' path, which refers to the virtfs root. > This patch preserves the current behavior of converting it to '.' so > that it can be passed to "*at()" syscalls ('/' would mean the host root). > > Signed-off-by: Greg Kurz> --- > +} else { > +char *tmp = g_path_get_dirname(dir_path->data); > +/* Symbolic links are resolved by the client. We can assume > + * that ".." relative to "foo/bar" is equivalent to "foo" > + */ Thanks for tweaking this since v1. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 0/2] docs: Minor updates around ide-drive
On 05/23/2017 10:56 AM, Markus Armbruster wrote: > Markus Armbrusterwrites: > >> John Snow writes: >> >>> On 05/09/2017 05:41 AM, Markus Armbruster wrote: Markus Armbruster (2): docs qemu-doc: Avoid ide-drive, it's deprecated docs/qdev-device-use.txt: update section Default Devices docs/bootindex.txt | 2 +- docs/qdev-device-use.txt | 13 +++-- qemu-options.hx | 2 +- 3 files changed, 9 insertions(+), 8 deletions(-) >>> >>> Shame on me for not noticing, but thank you. >>> >>> Acked-by: John Snow >> >> Thanks! >> >> The canonical path is through your tree, but I could stick it into a >> miscellaneous pull request if you like. > > Nevermind, qemu-trivial picked it up. > That was my assumption since it went to -trivial :) Glad my inaction helped sort it out.
Re: [Qemu-devel] [PATCH 4/4] qemu-iotests: Block migration test
On 05/23/2017 09:01 AM, Kevin Wolf wrote: > Signed-off-by: Kevin Wolf> --- > tests/qemu-iotests/183 | 121 > + > tests/qemu-iotests/183.out | 48 ++ > tests/qemu-iotests/group | 1 + > 3 files changed, 170 insertions(+) > create mode 100755 tests/qemu-iotests/183 > create mode 100644 tests/qemu-iotests/183.out > Does this test gracefully skip when coupled with the efforts for a configure option to disable block migration? https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg04398.html > +echo > +echo === Do block migration to destination === > +echo > + > +_send_qemu_cmd $src "migrate -b unix:${MIG_SOCKET}" "(qemu)" I don't see any earlier checks, so it looks like you'd choke here if -b is not compiled in. Should we also check the use of -i? -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature