Re: [Qemu-devel] Non-flat command line option argument syntax
Eric Blakewrites: > On 02/02/2017 01:42 PM, Markus Armbruster wrote: > >> >> === Structured values === >> >> The dotted key convention messes with KEY syntax to permit structured >> values. Works, but the more conventional way to support structured >> values is a syntax for structured values. >> >> An obvious one is to use { KEY=VALUE, ...} for objects, and [ VALUE, >> ... ] for arrays. Looks like this: >> >> -drive 'driver=quorum, >> child=[{ driver=file, filename=disk1.img }, >>{ driver=host_device, filename=/dev/sdb }, >>{ driver=nbd, host=localhost } ]' >> >> Again, lines broken and indented for legibility; you need to join them >> for actual use. >> >> There's a syntactic catch, though: a value of the form [ ... ] can >> either be an array or a string. Which one it is depends on the type of >> the key. To parse this syntax, you need to know the types, unlike JSON >> or traditional QemuOpts. Unless we outlaw strings starting with '{' or >> '[', which feels impractical. > > Another syntactic catch: from the shell, > > -drive driver=quorum,child=[...] > > is insufficiently quoted, and MIGHT glob to a completely different > argument (or even multiple arguments) depending on the (oddly-named) > contents of the current directory. Any use of [] HAS to consistently > recommend use with shell quotes. Using straight JSON already has to use > shell quotes (generally '' for the overall argument, and "" for key > names and string values within the JSON, although our parser as an > extension supports '' for key names and string values which pairs with > "" for the overall argument and allows the use of $var shell interpolation). I forgot to mention interactions with shell meta-characters and quoting. Glad you brought it up. >> === Comparison === >> >> In my opinion, dotted keys are weird and ugly, but at least they don't >> add to the quoting mess. Structured values look better, except when >> they do add to the quoting mess. >> >> I'm having a hard time deciding which one I like less :) > > Both are a bit awkward. I think dotted keys require more typing but > less shell quoting than structured values. And with either approach, it > would STILL be nice if we taught QemuOpts to strip whitespace after > delimiting commas - the only requirement is that no key value can start > with space, which QAPI enforces, and QOM is unlikely to break, although > the benefits of stripping whitespace are only apparent when you remember > to use shell quoting over the entire argument (which partially defeats > the purpose of trying to come up with a syntax that needs less shell > quoting). Syntax that requires shell quoting even in simple cases is not nice. Syntax that requires it in complicated cases could be tolerable. If you're the kind of person to type option arguments exceeding 200 characters into an interactive shell, you should be the kind of person to know when to quote. Programs generating shell input should probably quote everything except the specific parts they *want* the shell to expand. >> Opinions? Other ideas? > > I don't think command line length is a problem; most command lines are > generated. I'm torn on whether a simplified structured values is nicer > than full-blown JSON; your argument about having the same JSON work on > both the command line and through QMP resulting in less work for > management apps is interesting. And reusing an existing syntax instead > of inventing yet another one always has the benefit of less code to > maintain. So even though it's harder to type by hand, I'm somewhat > leaning towards full JSON (where a leading '{' says to parse using JSON > until the closing '}'), rather than any other structured value > representation. Thanks!
Re: [Qemu-devel] Non-flat command line option argument syntax
"Dr. David Alan Gilbert"writes: > * Markus Armbruster (arm...@redhat.com) wrote: >> = Introduction = >> > > > >> = Structured option argument syntax = >> >> == JSON == >> >> The obvious way to provide the expressiveness of JSON on the command >> line is JSON. Easy enough[2]. However, besides not being compatible, >> it's rather heavy on syntax, at least for simple cases. Compare: >> >> -machine q35,accel=kvm >> -machine '{ "type": "q35", "accel": "kvm"}' >> >> It compares a bit more favourably in cases that use our non-flat hacks. >> Here's a flat list as KEY=VALUE,... with repeated keys, and as JSON: >> >> -semihosting-config enable,arg=eins,arg=zwei,arg=drei >> -semihosting-config '{ "enable": true, "arg": [ "eins", "zwei", "drei" ] >> }' >> >> Arbitrary nesting with dotted key convention: >> >> -drive driver=qcow2,file.driver=gluster, >>file.volume=testvol,file.path=/path/a.qcow2,file.debug=9, >>file.server.0.type=tcp, >>file.server.0.host=1.2.3.4, >>file.server.0.port=24007, >>file.server.1.type=unix, >>file.server.1.socket=/var/run/glusterd.socket >> -drive '{ "driver": "qcow2", >> "file": { >> "driver": "gluster", "volume": "testvol", >> "path": "/path/a.qcow2", "debug": 9, >> "server": [ { "type": "tcp", >> "host": "1.2.3.4", "port": "24007"}, >> { "type": "unix", >> "socket": "/var/run/glusterd.socket" } ] } }' > > So while I generally hate JSON, the -drive dotted key syntax makes > me mad when it gets like this; have a look > at the block replication and quorum setups especially, that can end up > with (from docs/COLO-FT.txt): > > -drive > if=virtio,id=primary-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\ > children.0.file.filename=1.raw,\ > children.0.driver=raw -S > >that's just way too many .'s to ever properly understand. > (I'm sure it used to be more complex). Here's an idea to cut down on the dottery that drives you mad (and me too): if KEY starts with '.', combine it with a prefix of the previous one so that the result has the same number of name components. Your example becomes -drive if=virtio,id=primary-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\ children.0.file.filename=1.raw,.driver=raw -S My example -drive driver=qcow2,file.driver=gluster, file.volume=testvol,file.path=/path/a.qcow2,file.debug=9, file.server.0.type=tcp, file.server.0.host=1.2.3.4, file.server.0.port=24007, file.server.1.type=unix, file.server.1.socket=/var/run/glusterd.socket becomes -drive driver=qcow2, file.driver=gluster, .volume=testvol, .path=/path/a.qcow2, .debug=9, file.server.0.type=tcp, .host=1.2.3.4, .port=24007, file.server.1.type=unix, .socket=/var/run/glusterd.socket Mind, I'm not at all sure this is a *good* idea. I suspect it's more magic than it's worth. >> Lines broken and indented for legibility; you need to join them for >> actual use. > > Why? What's a \n between friends for JSON? You're right, the JSON works as is. Only the KEY=VALUE example doesn't. >> Once you do, both variants are basically illegible. This >> is simply something that belongs into a config file rather than the >> command line. In a config file, JSON would be a better choice. >> >> There's also the -drive file=json:... syntax. It's a bad fit for >> QemuOpts, because QemuOpts and JSON fight for the comma. I'd show you >> if I could get it to work. >> >> We obviously can't replace QemuOpts with JSON. But accepting JSON in >> addition to QemuOpts is a debatable feature: it lets management >> applications reuse the code to build QMP arguments for option arguments. >> >> Since structured option arguments are always dictionaries, a JSON option >> argument always starts with '{'. If no QemuOpts argument can ever start >> with '{', accepting either QemuOpts or a JSON object is unambiguous. >> For a more detailed discussion of the following argument, see [3]. >> >> A QemuOpts argument normally starts with KEY. We need to outlaw KEYs >> starting with '{'. QAPI outlaws such names, see docs/qapi-code-gen.txt. >> QOM doesn't, but no such keys exist as far as I know. >> >> QemuOpts permit abbreviating KEY=VALUE to just VALUE for one specific >> KEY (the "implied" key). We need to limit this to KEYs whose VALUE >> can't start with '{'. Most implied keys can't have such values. >> Troublemakers include qemu-img's use of implied "file" keys. You'd have >> to say "file={my-tastelessly-named-file}" instead of just >>
Re: [Qemu-devel] [PATCH v5 03/18] vfio: allow to notify unmap for very large region
On Tue, Jan 31, 2017 at 02:35:00PM +1100, David Gibson wrote: > On Tue, Jan 24, 2017 at 09:32:07AM -0700, Alex Williamson wrote: > > On Tue, 24 Jan 2017 18:25:56 +0800 > > Peter Xuwrote: > > > > > Linux vfio driver supports to do VFIO_IOMMU_UNMAP_DMA for a very big > > > region. This can be leveraged by QEMU IOMMU implementation to cleanup > > > existing page mappings for an entire iova address space (by notifying > > > with an IOTLB with extremely huge addr_mask). However current > > > vfio_iommu_map_notify() does not allow that. It make sure that all the > > > translated address in IOTLB is falling into RAM range. > > > > > > The check makes sense, but it should only be a sensible checker for > > > mapping operations, and mean little for unmap operations. > > > > > > This patch moves this check into map logic only, so that we'll get > > > faster unmap handling (no need to translate again), and also we can then > > > better support unmapping a very big region when it covers non-ram ranges > > > or even not-existing ranges. > > > > > > Signed-off-by: Peter Xu > > > --- > > > hw/vfio/common.c | 7 +++ > > > 1 file changed, 3 insertions(+), 4 deletions(-) > > > > > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c > > > index ce55dff..4d90844 100644 > > > --- a/hw/vfio/common.c > > > +++ b/hw/vfio/common.c > > > @@ -354,11 +354,10 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, > > > IOMMUTLBEntry *iotlb) > > > return; > > > } > > > > > > -if (!vfio_get_vaddr(iotlb, , _only)) { > > > -return; > > > -} > > > - > > > if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) { > > > +if (!vfio_get_vaddr(iotlb, , _only)) { > > > +return; > > > +} > > > > > > David, is SPAPR going to freak out if it sees unmaps to ranges that > > extend beyond individual mappings, or perhaps include no mappings? > > This effectively allows unmapping the entire address space of the iommu > > in one pass, without validating or translating the backing. > > Extending beyond an individual mapping will be fine. However, if the > unmap extends beyond the extent of IOVAs mapped by a single TCE table, > then the unmap will fail (with ENXIO or EINVAL depending on whether > there's a problem with origin or only size). > > With holidays I've lost the context of this thread, so I can't easily > find the whole patch series this relates to. From your brief > description above it sounds likes it should be ok - a range covering > just the IOVA space (as long as that's actually correct for spapr tce) > should be ok. Thanks for the confirmation! Then I'll move ahead for the next spin. Thanks, -- peterx
[Qemu-devel] [PATCH 1/3] kvm/ioapic: dump real object instead of a fake one
When we do "info ioapic" for kvm ioapic, we were building up a temporary ioapic object. Let's fetch the real one and update correspond to the real object as well. This fixes printing uninitialized version field in ioapic_print_redtbl(). Reported-by: Peter MaydellSuggested-by: Paolo Bonzini Signed-off-by: Peter Xu --- hw/i386/kvm/ioapic.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/hw/i386/kvm/ioapic.c b/hw/i386/kvm/ioapic.c index 8eb2c7a..716d59a 100644 --- a/hw/i386/kvm/ioapic.c +++ b/hw/i386/kvm/ioapic.c @@ -114,11 +114,11 @@ static void kvm_ioapic_put(IOAPICCommonState *s) void kvm_ioapic_dump_state(Monitor *mon, const QDict *qdict) { -IOAPICCommonState s; +IOAPICCommonState *s = IOAPIC_COMMON(object_resolve_path("ioapic", NULL)); -kvm_ioapic_get(); - -ioapic_print_redtbl(mon, ); +assert(s); +kvm_ioapic_get(s); +ioapic_print_redtbl(mon, s); } static void kvm_ioapic_reset(DeviceState *dev) -- 2.7.4
[Qemu-devel] [PATCH 3/3] kvm/ioapic: correct kvm ioapic version
Signed-off-by: Peter Xu--- hw/i386/kvm/ioapic.c | 5 + 1 file changed, 5 insertions(+) diff --git a/hw/i386/kvm/ioapic.c b/hw/i386/kvm/ioapic.c index 716d59a..98ca480 100644 --- a/hw/i386/kvm/ioapic.c +++ b/hw/i386/kvm/ioapic.c @@ -143,6 +143,11 @@ static void kvm_ioapic_realize(DeviceState *dev, Error **errp) IOAPICCommonState *s = IOAPIC_COMMON(dev); memory_region_init_reservation(>io_memory, NULL, "kvm-ioapic", 0x1000); +/* + * KVM ioapic only supports 0x11 now. This will only be used when + * we want to dump ioapic version. + */ +s->version = 0x11; qdev_init_gpio_in(dev, kvm_ioapic_set_irq, IOAPIC_NUM_PINS); } -- 2.7.4
[Qemu-devel] [PATCH 2/3] ioapic: fix error report value of def version
It should be 0x20, rather than 0x11. Signed-off-by: Peter Xu--- hw/intc/ioapic.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c index 9047b89..37c4386 100644 --- a/hw/intc/ioapic.c +++ b/hw/intc/ioapic.c @@ -408,13 +408,15 @@ static void ioapic_machine_done_notify(Notifier *notifier, void *data) #endif } +#define IOAPIC_VER_DEF 0x20 + static void ioapic_realize(DeviceState *dev, Error **errp) { IOAPICCommonState *s = IOAPIC_COMMON(dev); if (s->version != 0x11 && s->version != 0x20) { error_report("IOAPIC only supports version 0x11 or 0x20 " - "(default: 0x11)."); + "(default: 0x%x).", IOAPIC_VER_DEF); exit(1); } @@ -429,7 +431,7 @@ static void ioapic_realize(DeviceState *dev, Error **errp) } static Property ioapic_properties[] = { -DEFINE_PROP_UINT8("version", IOAPICCommonState, version, 0x20), +DEFINE_PROP_UINT8("version", IOAPICCommonState, version, IOAPIC_VER_DEF), DEFINE_PROP_END_OF_LIST(), }; -- 2.7.4
[Qemu-devel] [PATCH 0/3] kvm/ioapic: some tiny tweaks
This series fixes the issue pointed out by PMM in thread: https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg06323.html Along with other two patches to make the version more accurate. Thanks, Peter Xu (3): kvm/ioapic: dump real object instead of a fake one ioapic: fix error report value of def version kvm/ioapic: correct kvm ioapic version hw/i386/kvm/ioapic.c | 13 + hw/intc/ioapic.c | 6 -- 2 files changed, 13 insertions(+), 6 deletions(-) -- 2.7.4
Re: [Qemu-devel] [PULL 23/35] x86: ioapic: dump version for "info ioapic"
On Mon, Jan 30, 2017 at 02:33:32PM -0500, Paolo Bonzini wrote: > > > On 30/01/2017 09:07, Peter Maydell wrote: > > On 20 January 2017 at 13:31, Paolo Bonziniwrote: > >> From: Peter Xu > >> > >> Signed-off-by: Peter Xu > >> Message-Id: <1483952153-7221-3-git-send-email-pet...@redhat.com> > >> Signed-off-by: Paolo Bonzini > >> --- > >> hw/intc/ioapic_common.c | 3 ++- > >> 1 file changed, 2 insertions(+), 1 deletion(-) > >> > >> diff --git a/hw/intc/ioapic_common.c b/hw/intc/ioapic_common.c > >> index 1b7ec5e..97c4f9c 100644 > >> --- a/hw/intc/ioapic_common.c > >> +++ b/hw/intc/ioapic_common.c > >> @@ -58,7 +58,8 @@ void ioapic_print_redtbl(Monitor *mon, IOAPICCommonState > >> *s) > >> uint32_t remote_irr = 0; > >> int i; > >> > >> -monitor_printf(mon, "ioapic id=0x%02x sel=0x%02x", s->id, > >> s->ioregsel); > >> +monitor_printf(mon, "ioapic ver=0x%x id=0x%02x sel=0x%02x", > >> + s->version, s->id, s->ioregsel); > >> if (s->ioregsel) { > >> monitor_printf(mon, " (redir[%u])\n", > >> (s->ioregsel - IOAPIC_REG_REDTBL_BASE) >> 1); > > > > Coverity points out (CID 1369422) that this is a use of a possibly > > uninitialized field. In kvm_ioapic_dump_state() we do: > > > > IOAPICCommonState s; > > kvm_ioapic_get(); > > ioapic_print_redtbl(mon, ); > > > > and kvm_ioapic_get() doesn't initialize s->version, so when we > > come to print it in ioapic_print_redtbl() it's uninitialized. > > > > The easy fix is to initialize version to something. The > > underlying problem here I think is that we're manufacturing > > a fake IOAPICCommonState rather than finding the one that > > corresponds to the actual IOAPIC device in the system... > > Right, we can probably use object_resolve_path to get one. I'll use this and post a fix for it. Thanks! -- peterx
Re: [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo > Bonzini > On 31/01/2017 01:05, Pavel Dovgalyuk wrote: > > Hi, Paolo! > > > > Thanks for refactoring. > > I tested these patches with icount record/replay on i386 machine. > > It works, but the following changes should be applied. > > I also removed call to replay_has_interrupt, because now it is not needed > > here. > > It seems, that this call is an artifact of an older record/replay revision. > > > > diff --git a/cpu-exec.c b/cpu-exec.c > > index 3838eb8..5cef8bc 100644 > > --- a/cpu-exec.c > > +++ b/cpu-exec.c > > @@ -519,7 +519,8 @@ static inline bool cpu_handle_interrupt(CPUState *cpu, > > } > > > > /* Finally, check if we need to exit to the main loop. */ > > -if (unlikely(atomic_read(>exit_request) || > > replay_has_interrupt())) { > > +if (unlikely(atomic_read(>exit_request) > > +|| (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == > > 0))) { > > atomic_set(>exit_request, 0); > > cpu->exception_index = EXCP_INTERRUPT; > > return true; > > So is this needed to avoid exceptions in tb_find? Please add a comment > about this This code comes from my last patch, that was not applied. Here is the comment: It adds check to break cpu loop when icount expires without setting the TB_EXIT_ICOUNT_EXPIRED flag. It happens when there is no available translated blocks and all instructions were executed. In icount replay mode unnecessary tb_find will be called (which may cause an exception) and execution will be non-deterministic. > and check if you can also replace: > > atomic_set(>exit_request, 1); > > in cpu_loop_exec_tb with > > cpu->icount_decr.u16.low = 0; > > ? > This line is not needed at all, because the following code decrements icount automatically. if (insns_left > 0) { cpu_exec_nocache(cpu, insns_left, tb, false); } Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH 3/3] spapr: make cpu core unplug follow expected hotunplug call flow
On Thu, Feb 02, 2017 at 04:02:35PM +0100, Igor Mammedov wrote: > spapr_core_unplug() were essentially spapr_core_unplug_request() > handler that requested CPU removal and registered callback > which did actual cpu core removali but it was called from > spapr_machine_device_unplug() which is intended for actual object > removal. Commit (cf632463 spapr: Memory hot-unplug support) > sort of fixed it introducing spapr_machine_device_unplug_request() > and calling spapr_core_unplug() but it hasn't renamed callback and > by mistake calls it from spapr_machine_device_unplug(). > > However spapr_machine_device_unplug() isn't ever called for > cpu core since spapr_core_release() doesn't follow expected > hotunplug call flow which is: > 1: device_del() -> > hotplug_handler_unplug_request() -> > set destroy_cb() > 2: destroy_cb() -> > hotplug_handler_unplug() -> > object_unparent // actual device removal > > Fix it by renaming spapr_core_unplug() to spapr_core_unplug_request() > which is called from spapr_machine_device_unplug_request() and > making spapr_core_release() call hotplug_handler_unplug() which > will call spapr_machine_device_unplug() -> spapr_core_unplug() > to remove cpu core. > > Signed-off-by: Igor Mammedov> --- > hw/ppc/spapr.c | 18 ++ Reveiwed-by: Bharata B Rao Regards, Bharata.
Re: [Qemu-devel] [PATCH 1/3] spapr: cpu core: separate child threads destruction from machine state operations
On Thu, Feb 02, 2017 at 04:02:33PM +0100, Igor Mammedov wrote: > Signed-off-by: Igor Mammedov> --- > hw/ppc/spapr_cpu_core.c | 15 ++- > 1 file changed, 10 insertions(+), 5 deletions(-) > I would mention explicitly that we are adding core unrealizefn and doing threads' destruction from it now. Otherwise, Reviewed-by: Bharata B Rao Regards, Bharata.
[Qemu-devel] [PATCH] xhci: fix event queue IRQ handling
The qemu xhci emulation doesn't handle the ERDP_EHB flag correctly. When the host adapter queues a new event the ERDP_EHB flag is set. The flag is cleared (via w1c) by the guest when it updates the ERDP (event ring dequeue pointer) register to notify the host adapter which events it has fetched. An IRQ must raised in case the ERDP_EHB flag flips from clear to set. If the flag is set already (which implies there are events queued up which are not yet processed by the guest) xhci must *not* raise a IRQ. Qemu got that wrong and raised an IRQ on every event, thereby generating suspious interrupts in case we've queued events faster than the guest processed them. This patch fixes that. With that change in place we also have to check ERDP updates, to see whenever the guest has fetched all queued events. In case there are still pending events set ERDP_EHB and raise an IRQ again, to make sure the events don't linger unseen forever. The linux kernel driver and the microsoft windows driver (shipped with win8+) can deal with the suspious interrupts without problems. The renesas windows driver (v2.1.39) which can be used on older windows versions is quite upset though. It does suspious ERDP updates now and then (not every time, seems we must hit a race window for this to happen), which in turn makes the qemu xhci emulation think the event ring is full. Things go south from here ... tl;dr: This is the "fix xhci on win7" patch. Cc: m.cerv...@computer.org Cc: 1373...@bugs.launchpad.net Signed-off-by: Gerd Hoffmann--- hw/usb/hcd-xhci.c | 13 + 1 file changed, 13 insertions(+) diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c index df75907..4f05747 100644 --- a/hw/usb/hcd-xhci.c +++ b/hw/usb/hcd-xhci.c @@ -789,11 +789,15 @@ static void xhci_msix_update(XHCIState *xhci, int v) static void xhci_intr_raise(XHCIState *xhci, int v) { PCIDevice *pci_dev = PCI_DEVICE(xhci); +bool pending = (xhci->intr[v].erdp_low & ERDP_EHB); xhci->intr[v].erdp_low |= ERDP_EHB; xhci->intr[v].iman |= IMAN_IP; xhci->usbsts |= USBSTS_EINT; +if (pending) { +return; +} if (!(xhci->intr[v].iman & IMAN_IE)) { return; } @@ -3352,6 +3356,15 @@ static void xhci_runtime_write(void *ptr, hwaddr reg, intr->erdp_low &= ~ERDP_EHB; } intr->erdp_low = (val & ~ERDP_EHB) | (intr->erdp_low & ERDP_EHB); +if (val & ERDP_EHB) { +dma_addr_t erdp = xhci_addr64(intr->erdp_low, intr->erdp_high); +unsigned int dp_idx = (erdp - intr->er_start) / TRB_SIZE; +if (erdp >= intr->er_start && +erdp < (intr->er_start + TRB_SIZE * intr->er_size) && +dp_idx != intr->er_ep_idx) { +xhci_intr_raise(xhci, v); +} +} break; case 0x1c: /* ERDP high */ intr->erdp_high = val; -- 1.8.3.1
Re: [Qemu-devel] [PATCH 3/3] colo-compare: use notifier to notify inconsistent packets comparing
On 2017年01月24日 22:05, zhanghailiang wrote: It's a good idea to use notifier to notify COLO frame of inconsistent packets comparing. Signed-off-by: Zhang ChenSigned-off-by: zhanghailiang --- net/colo-compare.c | 24 ++-- net/colo-compare.h | 2 ++ 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/net/colo-compare.c b/net/colo-compare.c index 2ad577b..39c394d 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -30,6 +30,7 @@ #include "qapi-visit.h" #include "net/colo.h" #include "net/colo-compare.h" +#include "migration/migration.h" #define TYPE_COLO_COMPARE "colo-compare" #define COLO_COMPARE(obj) \ @@ -38,6 +39,9 @@ static QTAILQ_HEAD(, CompareState) net_compares = QTAILQ_HEAD_INITIALIZER(net_compares); +static NotifierList colo_compare_notifiers = +NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers); + #define COMPARE_READ_LEN_MAX NET_BUFSIZE #define MAX_QUEUE_SIZE 1024 @@ -378,6 +382,22 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time) } } +static void colo_compare_inconsistent_notify(void) +{ +notifier_list_notify(_compare_notifiers, +migrate_get_current()); +} + +void colo_compare_register_notifier(Notifier *notify) +{ +notifier_list_add(_compare_notifiers, notify); +} + +void colo_compare_unregister_notifier(Notifier *notify) +{ +notifier_remove(notify); +} + static void colo_old_packet_check_one_conn(void *opaque, void *user_data) { @@ -392,7 +412,7 @@ static void colo_old_packet_check_one_conn(void *opaque, qemu_mutex_unlock(>conn_lock); if (result) { /* do checkpoint will flush old packet */ -/* TODO: colo_notify_checkpoint();*/ +colo_compare_inconsistent_notify(); } } @@ -466,7 +486,7 @@ static void colo_compare_connection(void *opaque, void *user_data) */ trace_colo_compare_main("packet different"); g_queue_push_tail(>primary_list, pkt); -/* TODO: colo_notify_checkpoint();*/ +colo_compare_inconsistent_notify(); break; } } I don't see any users for colo_compare_register_notifier/colo_compare_unregister_notifier, is any patch missed in this series? And is it safe to do notification in colo thread? Thanks diff --git a/net/colo-compare.h b/net/colo-compare.h index 44f9014..769f55a 100644 --- a/net/colo-compare.h +++ b/net/colo-compare.h @@ -16,5 +16,7 @@ #define QEMU_COLO_COMPARE_H void colo_compare_do_checkpoint(void); +void colo_compare_register_notifier(Notifier *notify); +void colo_compare_unregister_notifier(Notifier *notify); #endif /* QEMU_COLO_COMPARE_H */
Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
On 2017年01月24日 22:05, zhanghailiang wrote: The original 'timer_check_lock' mutex lock of struct CompareState is used to protect the 'conn_list' queue and its child queues which are 'primary_list' and 'secondary_list', which is a little abused and confusing To make it clearer, we rename 'timer_check_lock' to 'conn_list_lock' which is used to protect 'conn_list' queue, use another 'conn_lock' to protect 'primary_list' and 'secondary_list'. Besides, fix some missing places which need these mutex lock. Signed-off-by: zhanghailiangInstead of sticking to such kind of mutex, I think it's time to make colo timer run in colo thread (there's a TODO in the code). Thought? Thanks
Re: [Qemu-devel] [PATCH] colo-compare: sort TCP packet queue by sequence number
On 2017年01月24日 16:53, Zhang Chen wrote: Improve efficiency of TCP packet comparison. Signed-off-by: Zhang ChenSigned-off-by: Li Zhijian --- net/colo-compare.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/net/colo-compare.c b/net/colo-compare.c index 9bfc736..5a4f335 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -101,6 +101,15 @@ static int compare_chr_send(CharBackend *out, const uint8_t *buf, uint32_t size); +static gint seq_sorter(Packet *a, Packet *b, gpointer data) +{ +struct tcphdr *atcp, *btcp; + +atcp = (struct tcphdr *)(a->transport_header); +btcp = (struct tcphdr *)(b->transport_header); +return ntohl(atcp->th_seq) - ntohl(btcp->th_seq); +} + /* * Return 0 on success, if return -1 means the pkt * is unsupported(arp and ipv6) and will be sent later @@ -137,6 +146,11 @@ static int packet_enqueue(CompareState *s, int mode) if (g_queue_get_length(>primary_list) <= MAX_QUEUE_SIZE) { g_queue_push_tail(>primary_list, pkt); +if (conn->ip_proto == IPPROTO_TCP) { +g_queue_sort(>primary_list, + (GCompareDataFunc)seq_sorter, + NULL); +} } else { error_report("colo compare primary queue size too big," "drop packet"); @@ -145,6 +159,11 @@ static int packet_enqueue(CompareState *s, int mode) if (g_queue_get_length(>secondary_list) <= MAX_QUEUE_SIZE) { g_queue_push_tail(>secondary_list, pkt); +if (conn->ip_proto == IPPROTO_TCP) { +g_queue_sort(>secondary_list, + (GCompareDataFunc)seq_sorter, + NULL); +} } else { error_report("colo compare secondary queue size too big," "drop packet"); Applied. Thanks
Re: [Qemu-devel] [RFC PATCH v2 0/2] softfloat: Add round-to-odd rounding mode
Peter, On Fri, Jan 27, 2017 at 01:33:31PM +0530, Bharata B Rao wrote: > > Changes in v2: > - > - Do odd or even for the right precision bit in 64bit rounding. (Peter > Maydell) > - Handle the overflow case correctly in 64bit rounding. (Peter Maydell) Does this version of round-to-odd implementation look correct now ? > - Add a patch to handle underflow case correctly in 64bit rounding. Does this fix for underflow case make sense ? Regards, Bharata.
Re: [Qemu-devel] [PATCH] net: e1000e: fix dead code in e1000e_write_packet_to_guest
On 2017年01月26日 18:10, Paolo Bonzini wrote: Because is_first is declared inside a loop, it is always true. The store is dead, and so is the "else" branch of "if (is_first)". is_last is okay though. Reported by Coverity. Signed-off-by: Paolo Bonzini--- hw/net/e1000e_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c index 2b11499..c99e2fb 100644 --- a/hw/net/e1000e_core.c +++ b/hw/net/e1000e_core.c @@ -1507,6 +1507,7 @@ e1000e_write_packet_to_guest(E1000ECore *core, struct NetRxPkt *pkt, const E1000E_RingInfo *rxi; size_t ps_hdr_len = 0; bool do_ps = e1000e_do_ps(core, pkt, _hdr_len); +bool is_first = true; rxi = rxr->i; @@ -1514,7 +1515,6 @@ e1000e_write_packet_to_guest(E1000ECore *core, struct NetRxPkt *pkt, hwaddr ba[MAX_PS_BUFFERS]; e1000e_ba_state bastate = { { 0 } }; bool is_last = false; -bool is_first = true; desc_size = total_size - desc_offset; Applied, thanks.
[Qemu-devel] [RESEND PATCH v2] docs: add document to explain the usage of vNVDIMM
Signed-off-by: Haozhong ZhangReviewed-by: Xiao Guangrong Reviewed-by: Stefan Hajnoczi --- Changes since v1: * explicitly state the block window mode is not supported (Stefan Hajnoczi) * typo fix: label_size ==> label-size (David Alan Gilbert) --- docs/nvdimm.txt | 124 1 file changed, 124 insertions(+) create mode 100644 docs/nvdimm.txt diff --git a/docs/nvdimm.txt b/docs/nvdimm.txt new file mode 100644 index 000..2d9f8c0 --- /dev/null +++ b/docs/nvdimm.txt @@ -0,0 +1,124 @@ +QEMU Virtual NVDIMM +=== + +This document explains the usage of virtual NVDIMM (vNVDIMM) feature +which is available since QEMU v2.6.0. + +The current QEMU only implements the persistent memory mode of vNVDIMM +device and not the block window mode. + +Basic Usage +--- + +The storage of a vNVDIMM device in QEMU is provided by the memory +backend (i.e. memory-backend-file and memory-backend-ram). A simple +way to create a vNVDIMM device at startup time is done via the +following command line options: + + -machine pc,nvdimm + -m $RAM_SIZE,slots=$N,maxmem=$MAX_SIZE + -object memory-backend-file,id=mem1,share=on,mem-path=$PATH,size=$NVDIMM_SIZE + -device nvdimm,id=nvdimm1,memdev=mem1 + +Where, + + - the "nvdimm" machine option enables vNVDIMM feature. + + - "slots=$N" should be equal to or larger than the total amount of + normal RAM devices and vNVDIMM devices, e.g. $N should be >= 2 here. + + - "maxmem=$MAX_SIZE" should be equal to or larger than the total size + of normal RAM devices and vNVDIMM devices, e.g. $MAX_SIZE should be + >= $RAM_SIZE + $NVDIMM_SIZE here. + + - "object memory-backend-file,id=mem1,share=on,mem-path=$PATH,size=$NVDIMM_SIZE" + creates a backend storage of size $NVDIMM_SIZE on a file $PATH. All + accesses to the virtual NVDIMM device go to the file $PATH. + + "share=on/off" controls the visibility of guest writes. If + "share=on", then guest writes will be applied to the backend + file. If another guest uses the same backend file with option + "share=on", then above writes will be visible to it as well. If + "share=off", then guest writes won't be applied to the backend + file and thus will be invisible to other guests. + + - "device nvdimm,id=nvdimm1,memdev=mem1" creates a virtual NVDIMM + device whose storage is provided by above memory backend device. + +Multiple vNVDIMM devices can be created if multiple pairs of "-object" +and "-device" are provided. + +For above command line options, if the guest OS has the proper NVDIMM +driver, it should be able to detect a NVDIMM device which is in the +persistent memory mode and whose size is $NVDIMM_SIZE. + +Note: + +1. Prior to QEMU v2.8.0, if memory-backend-file is used and the actual + backend file size is not equal to the size given by "size" option, + QEMU will truncate the backend file by ftruncate(2), which will + corrupt the existing data in the backend file, especially for the + shrink case. + + QEMU v2.8.0 and later check the backend file size and the "size" + option. If they do not match, QEMU will report errors and abort in + order to avoid the data corruption. + +2. QEMU v2.6.0 only puts a basic alignment requirement on the "size" + option of memory-backend-file, e.g. 4KB alignment on x86. However, + QEMU v.2.7.0 puts an additional alignment requirement, which may + require a larger value than the basic one, e.g. 2MB on x86. This + change breaks the usage of memory-backend-file that only satisfies + the basic alignment. + + QEMU v2.8.0 and later remove the additional alignment on non-s390x + architectures, so the broken memory-backend-file can work again. + +Label +- + +QEMU v2.7.0 and later implement the label support for vNVDIMM devices. +To enable label on vNVDIMM devices, users can simply add +"label-size=$SZ" option to "-device nvdimm", e.g. + + -device nvdimm,id=nvdimm1,memdev=mem1,label-size=128K + +Note: + +1. The minimal label size is 128KB. + +2. QEMU v2.7.0 and later store labels at the end of backend storage. + If a memory backend file, which was previously used as the backend + of a vNVDIMM device without labels, is now used for a vNVDIMM + device with label, the data in the label area at the end of file + will be inaccessible to the guest. If any useful data (e.g. the + meta-data of the file system) was stored there, the latter usage + may result guest data corruption (e.g. breakage of guest file + system). + +Hotplug +--- + +QEMU v2.8.0 and later implement the hotplug support for vNVDIMM +devices. Similarly to the RAM hotplug, the vNVDIMM hotplug is +accomplished by two monitor commands "object_add" and "device_add". + +For example, the following commands add another 4GB vNVDIMM device to +the guest: + + (qemu) object_add memory-backend-file,id=mem2,share=on,mem-path=new_nvdimm.img,size=4G + (qemu)
Re: [Qemu-devel] new developer
On Wed, 02/01 17:41, Shubham Kumar wrote: > Sir > > I'm a third year Indian engineering student looking for experience in open > source development. I read about QEMU and I want to be a contributor for this > organisation.Can you guide me where exactly to start since I'm new to this ?? > Thanks in advance. You can take a look at our Google Summer of Code 2017 ideas and see if there is a project you're interested in: http://wiki.qemu.org/Google_Summer_of_Code_2017 Fam
Re: [Qemu-devel] [PATCH] hw/pxb-pcie: fix PCI Express hotplug support
On Fri, Feb 03, 2017 at 12:44:17AM +0200, Marcel Apfelbaum wrote: > On 02/03/2017 12:34 AM, Michael S. Tsirkin wrote: > > On Fri, Feb 03, 2017 at 12:05:52AM +0200, Marcel Apfelbaum wrote: > > > Add the missing osc method for pxb-pcie devices > > > > > > Signed-off-by: Marcel Apfelbaum> > > --- > > > > > > Note: The patch is based on the fact that Q35's osc method is quite > > > generic. > > > > > > Thanks, > > > Marcel > > > > > > hw/i386/acpi-build.c | 5 + > > > 1 file changed, 5 insertions(+) > > > > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > > > index 1c928ab..555aab3 100644 > > > --- a/hw/i386/acpi-build.c > > > +++ b/hw/i386/acpi-build.c > > > @@ -1964,6 +1964,11 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > > > aml_append(dev, aml_name_decl("_UID", aml_int(bus_num))); > > > aml_append(dev, aml_name_decl("_HID", > > > aml_eisaid("PNP0A03"))); > > > aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num))); > > > +if (pci_bus_is_express(bus)) { > > > +aml_append(dev, aml_name_decl("SUPP", aml_int(0))); > > > +aml_append(dev, aml_name_decl("CTRL", aml_int(0))); > > > +aml_append(dev, build_q35_osc_method()); > > > +} > > > > Hi Michael, > Thanks for the review. > > > I think we want to move this stuff into build_q35_osc_method: > > "SUPP" seems to be 0, CTRL should be a variable passed in. > > > > Sure > > > > > > if (numa_node != NUMA_NODE_UNASSIGNED) { > > > aml_append(dev, aml_name_decl("_PXM", > > > aml_int(numa_node))); > > > > > > In fact build_q35_osc_method needs to be cleaned up. > > It stores result in "CTRL" which should be documented. > > > > "SUPP" seems to be unused. > > > > A bunch of comments missing. > > > > More importantly, its an unserialized method but it creates a bunch of > > fields so attempts to run two of these in parallel will crash. > > I don't see how this code will run in parallel. (the code that calls the _OSC > method) If the method is not serialized, it should be callable in parallel. I'm guessing OSPMs do not do it for _OSC. > > I see the same in ACPI spec which probably means it should be > > revisited with a critical eye. > > > > I am in PTO next week, then I'll revisit the build_q35_osc_method and (try > to) clean it up. > > Thanks, > Marcel > > > > > > -- > > > 2.5.5
Re: [Qemu-devel] [PULL 000/107] ppc-for-2.9 queue 20170202
On Wed, Feb 01, 2017 at 11:41:40PM -0800, no-re...@patchew.org wrote: > Hi, > > Your series seems to have some coding style problems. See output below for > more information: > > Type: series > Subject: [Qemu-devel] [PULL 000/107] ppc-for-2.9 queue 20170202 > Message-id: 20170202051445.5735-1-da...@gibson.dropbear.id.au > > === TEST SCRIPT BEGIN === > #!/bin/bash > > BASE=base > n=1 > total=$(git log --oneline $BASE.. | wc -l) > failed=0 > > # Useful git options > git config --local diff.renamelimit 0 > git config --local diff.renames True > > commits="$(git log --format=%H --reverse $BASE..)" > for c in $commits; do > echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." > if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; > then > failed=1 > echo > fi > n=$((n+1)) > done > > exit $failed > === TEST SCRIPT END === > > Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 > From https://github.com/patchew-project/qemu > * [new tag] > patchew/20170202051445.5735-1-da...@gibson.dropbear.id.au -> > patchew/20170202051445.5735-1-da...@gibson.dropbear.id.au > Switched to a new branch 'test' [snip] > Checking PATCH 46/107: target-ppc: Add xxextractuw instruction... > ERROR: Macros with complex values should be enclosed in parenthesis > #110: FILE: target/ppc/translate/vsx-ops.inc.c:52: > +#define GEN_XX2FORM_EXT(name, opc2, opc3, fl2) \ > +GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0, opc3, 0x0010, PPC_NONE, > fl2), \ > +GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 0x0010, PPC_NONE, fl2) > > total: 1 errors, 0 warnings, 92 lines checked > > Your patch has style problems, please review. If any of these errors > are false positives report them to the maintainer, see > CHECKPATCH in MAINTAINERS. This one is a standard false positive due to checkpatch being confused by the ugly macros in this file. [snip] > Checking PATCH 93/107: target/ppc: Add pcr_supported to POWER9 cpu class > definition... > ERROR: spaces required around that '-' (ctx:VxV) > #22: FILE: target/ppc/cpu.h:2293: > +PCR_COMPAT_3_00 = 1ull << (63-59), > ^ > > total: 1 errors, 0 warnings, 15 lines checked > > Your patch has style problems, please review. If any of these errors > are false positives report them to the maintainer, see > CHECKPATCH in MAINTAINERS. This one's matching the surrounding code which is already not quite in style, best to fix those together sometime later. [snip] > Checking PATCH 100/107: target-ppc: Add xvtstdc[sp, dp] instructions... > ERROR: Macros with complex values should be enclosed in parenthesis > #126: FILE: target/ppc/translate/vsx-ops.inc.c:137: > +#define GEN_XX2FORM_DCMX(name, opc2, opc3, fl2) \ > +GEN_XX3FORM(name, opc2, opc3 | 0, fl2), \ > +GEN_XX3FORM(name, opc2, opc3 | 1, fl2) > > total: 1 errors, 0 warnings, 96 lines checked > > Your patch has style problems, please review. If any of these errors > are false positives report them to the maintainer, see > CHECKPATCH in MAINTAINERS. This one's the false positive again. [snip] > Checking PATCH 103/107: tcg/POWER9: NOOP the cp_abort instruction... > ERROR: do not use C99 // comments > #28: FILE: target/ppc/translate.c:6025: > +// Do Nothing > > total: 1 errors, 0 warnings, 17 lines checked > > Your patch has style problems, please review. If any of these errors > are false positives report them to the maintainer, see > CHECKPATCH in MAINTAINERS. But this one was me being sloppy. Do we care enough to re-do the pullreq? -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH] hw/pxb-pcie: fix PCI Express hotplug support
On 02/03/2017 12:34 AM, Michael S. Tsirkin wrote: On Fri, Feb 03, 2017 at 12:05:52AM +0200, Marcel Apfelbaum wrote: Add the missing osc method for pxb-pcie devices Signed-off-by: Marcel Apfelbaum--- Note: The patch is based on the fact that Q35's osc method is quite generic. Thanks, Marcel hw/i386/acpi-build.c | 5 + 1 file changed, 5 insertions(+) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 1c928ab..555aab3 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1964,6 +1964,11 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, aml_append(dev, aml_name_decl("_UID", aml_int(bus_num))); aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03"))); aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num))); +if (pci_bus_is_express(bus)) { +aml_append(dev, aml_name_decl("SUPP", aml_int(0))); +aml_append(dev, aml_name_decl("CTRL", aml_int(0))); +aml_append(dev, build_q35_osc_method()); +} Hi Michael, Thanks for the review. I think we want to move this stuff into build_q35_osc_method: "SUPP" seems to be 0, CTRL should be a variable passed in. Sure if (numa_node != NUMA_NODE_UNASSIGNED) { aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node))); In fact build_q35_osc_method needs to be cleaned up. It stores result in "CTRL" which should be documented. "SUPP" seems to be unused. A bunch of comments missing. More importantly, its an unserialized method but it creates a bunch of fields so attempts to run two of these in parallel will crash. I don't see how this code will run in parallel. (the code that calls the _OSC method) I see the same in ACPI spec which probably means it should be revisited with a critical eye. I am in PTO next week, then I'll revisit the build_q35_osc_method and (try to) clean it up. Thanks, Marcel -- 2.5.5
Re: [Qemu-devel] [PATCH] hw/pxb-pcie: fix PCI Express hotplug support
On Fri, Feb 03, 2017 at 12:05:52AM +0200, Marcel Apfelbaum wrote: > Add the missing osc method for pxb-pcie devices > > Signed-off-by: Marcel Apfelbaum> --- > > Note: The patch is based on the fact that Q35's osc method is quite generic. > > Thanks, > Marcel > > hw/i386/acpi-build.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > index 1c928ab..555aab3 100644 > --- a/hw/i386/acpi-build.c > +++ b/hw/i386/acpi-build.c > @@ -1964,6 +1964,11 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > aml_append(dev, aml_name_decl("_UID", aml_int(bus_num))); > aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03"))); > aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num))); > +if (pci_bus_is_express(bus)) { > +aml_append(dev, aml_name_decl("SUPP", aml_int(0))); > +aml_append(dev, aml_name_decl("CTRL", aml_int(0))); > +aml_append(dev, build_q35_osc_method()); > +} I think we want to move this stuff into build_q35_osc_method: "SUPP" seems to be 0, CTRL should be a variable passed in. > if (numa_node != NUMA_NODE_UNASSIGNED) { > aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node))); In fact build_q35_osc_method needs to be cleaned up. It stores result in "CTRL" which should be documented. "SUPP" seems to be unused. A bunch of comments missing. More importantly, its an unserialized method but it creates a bunch of fields so attempts to run two of these in parallel will crash. I see the same in ACPI spec which probably means it should be revisited with a critical eye. > -- > 2.5.5
Re: [Qemu-devel] [PATCH v2] q35: Improve sample configuration files
On Do, 2017-02-02 at 16:20 +0100, Andrea Bolognani wrote: > On Thu, 2017-02-02 at 16:42 +0200, Marcel Apfelbaum wrote: > [...] > > > +[device "ich9-pcie-port-1"] > > > > I would use the new generic root port. > > This sample configuration (q35-emulated.cfg) is supposed to > match physical hardware as closely as possible, so we should > stick with ioh3420s. Agree. > [...] > > > +[device "pci.1"] > > > + driver = "ioh3420" > > > > Same here, maybe we can use the new generic port. > > These sample configuration files (q35-virtio-*.cfg) should > probably use the generic ports instead, yes. It's just that > they were not merged yet when I started working on this :) Agree too. > > I am not sure about having the DMI-PCI bridge "by default". > > Users can understand is actually a good idea to have it by default > > while we don't really want them to use legacy PCI devices on Q35; > > and even if so, they should use them as Integrated Endpoints. > > They don't have hotplug for the DMI-PCI bridge anyway. > > Not sure about this one. It doesn't show up on my laptop, > so it's not like every single q35-based physical machine has > it. Is your laptop really q35+ich9 or something newer? > I'd be okay with dropping it, but I leave the decision > to Gerd. Please leave it in for qemu-emulated.cfg, as this is supposed to serve as documentation how real q35 hardware looks like. From the virtio variants it can be dropped. > > I personally don't use them because every time I try, I > > find something with no config support > > Is that so? Can you please test these new ones and see > whether they work for you? I think mst means that not all qemu command line switches are using QemuOpts, and the ones which don't can't be used with -readconfig. cheers, Gerd
Re: [Qemu-devel] [PATCH] tci: Remove invalid assertions
On Thu, Feb 02, 2017 at 09:10:26PM +0100, Stefan Weil wrote: > Am 02.02.2017 um 21:00 schrieb Eric Blake: > > On 02/02/2017 01:56 PM, Stefan Weil wrote: > > > tb_jmp_insn_offset and tb_jmp_reset_offset are pointers > > > and cannot be used with ARRAY_SIZE. > > > > > > Signed-off-by: Stefan Weil> > > --- > > > tcg/tci/tcg-target.inc.c | 2 -- > > > 1 file changed, 2 deletions(-) > > > > mst posted an alternative patch: > > https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00551.html > > > Yes, I noticed that, too. It's not obvious that this new > assertion will be correct, and none of the other targets > has that kind of assertion. Only two targets use an > assertion which detects NULL pointers, but NULL pointers > will result in an abort anyway. > > Do you think that there are reasons why TCI should use > the assertion suggested by Michael? > > Stefan You know what this code does and I don't, not really. I just did a monkey patch guessing at what was intended (value is used as an array index, so we do a bounds check). I sent the patch before I saw yours simply to fix the build in a way that's as unintrusive as possible: args[0] seemed to come from guest so I thought it might be prudent to do a bounds check. So feel free to ignore mine. Here's an ack for yours Acked-by: Michael S. Tsirkin -- MST
Re: [Qemu-devel] [PATCH v3] q35: Improve sample configuration files
On Do, 2017-02-02 at 16:35 +0100, Andrea Bolognani wrote: > On Thu, 2017-02-02 at 15:55 +0100, Gerd Hoffmann wrote: > > > +++ b/docs/q35-emulated.cfg > > > > > > +# 00:01.0 VGA compatible controller > > > > > > +[device "video"] > > > + driver = "VGA" > > > + bus = "pcie.0" > > > + addr = "02.0" > > > > Here is a address mismatch ;) > > Oh, wait, I realize where the mismatch comes from now: > on real hardware (my laptop) the video card is plugged > into 00:02.0, but the default one you get when you don't > pass -nodefault to QEMU is plugged into 00:01.0! And that depends on the chipset. On q35 it is 01.0, whereas on pc it is 02.0 because 01.0 is occupied by the piix southbridge. > I think we should use 00:02.0, do you agree? I would > also use the same slot in q35-virtio-graphical.cfg. Matter of taste really. I have some guests where the vga is at 04.0, works fine too. But being consistent across all configurations makes sense. cheers, Gerd
[Qemu-devel] [PATCH] hw/pxb-pcie: fix PCI Express hotplug support
Add the missing osc method for pxb-pcie devices Signed-off-by: Marcel Apfelbaum--- Note: The patch is based on the fact that Q35's osc method is quite generic. Thanks, Marcel hw/i386/acpi-build.c | 5 + 1 file changed, 5 insertions(+) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 1c928ab..555aab3 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1964,6 +1964,11 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, aml_append(dev, aml_name_decl("_UID", aml_int(bus_num))); aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03"))); aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num))); +if (pci_bus_is_express(bus)) { +aml_append(dev, aml_name_decl("SUPP", aml_int(0))); +aml_append(dev, aml_name_decl("CTRL", aml_int(0))); +aml_append(dev, build_q35_osc_method()); +} if (numa_node != NUMA_NODE_UNASSIGNED) { aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node))); -- 2.5.5
Re: [Qemu-devel] [PATCH] qga: ignore EBUSY when freezing a filesystem
On 31/01/2017 07:36, Peter Lieven wrote: > the current implementation fails if we try to freeze an > already frozen filesystem. This can happen if a filesystem > is mounted more than once (e.g. with a bind mount). > > Suggested-by: Christian Theune> Cc: qemu-sta...@nongnu.org > Signed-off-by: Peter Lieven What happens when you thaw? Paolo
[Qemu-devel] [Bug 1661386] [NEW] Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed
Public bug reported: Hello, I see the following when try to run qemu from master as the following: # ./x86_64-softmmu/qemu-system-x86_64 --version QEMU emulator version 2.8.50 (v2.8.0-1006-g4e9f524) Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers # ./x86_64-softmmu/qemu-system-x86_64 -machine accel=kvm -nodefaults -no-reboot -nographic -cpu host -vga none -kernel .build.kernel.kvm -initrd .build.initrd.kvm -append 'panic=1 no-kvmclock console=ttyS0 loglevel=7' -m 1024 -serial stdio qemu-system-x86_64: /home/matwey/lab/qemu/target/i386/kvm.c:1849: kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. First broken commit has been bisected: commit 48e1a45c3166d659f781171a47dabf4a187ed7a5 Author: Paolo BonziniDate: Wed Mar 30 22:55:29 2016 +0200 target-i386: assert that KVM_GET/SET_MSRS can set all requested MSRs This would have caught the bug in the previous patch. Signed-off-by: Paolo Bonzini My cpuinfo is the following: processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 44 model name : Intel(R) Xeon(R) CPU X5675 @ 3.07GHz stepping: 2 microcode : 0x14 cpu MHz : 3066.775 cache size : 12288 KB physical id : 0 siblings: 2 core id : 0 cpu cores : 2 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq vmx ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm ida arat epb dtherm tpr_shadow vnmi ept vpid bugs: bogomips: 6133.55 clflush size: 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1661386 Title: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed Status in QEMU: New Bug description: Hello, I see the following when try to run qemu from master as the following: # ./x86_64-softmmu/qemu-system-x86_64 --version QEMU emulator version 2.8.50 (v2.8.0-1006-g4e9f524) Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers # ./x86_64-softmmu/qemu-system-x86_64 -machine accel=kvm -nodefaults -no-reboot -nographic -cpu host -vga none -kernel .build.kernel.kvm -initrd .build.initrd.kvm -append 'panic=1 no-kvmclock console=ttyS0 loglevel=7' -m 1024 -serial stdio qemu-system-x86_64: /home/matwey/lab/qemu/target/i386/kvm.c:1849: kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. First broken commit has been bisected: commit 48e1a45c3166d659f781171a47dabf4a187ed7a5 Author: Paolo Bonzini Date: Wed Mar 30 22:55:29 2016 +0200 target-i386: assert that KVM_GET/SET_MSRS can set all requested MSRs This would have caught the bug in the previous patch. Signed-off-by: Paolo Bonzini My cpuinfo is the following: processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 44 model name : Intel(R) Xeon(R) CPU X5675 @ 3.07GHz stepping: 2 microcode : 0x14 cpu MHz : 3066.775 cache size : 12288 KB physical id : 0 siblings: 2 core id : 0 cpu cores : 2 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq vmx ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm ida arat epb dtherm tpr_shadow vnmi ept vpid bugs: bogomips: 6133.55 clflush size: 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1661386/+subscriptions
Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support
On 2/2/17, 12:57 PM, "Ketan Nilangekar"wrote: On 2/2/17, 2:15 AM, "Daniel P. Berrange" wrote: On Thu, Feb 02, 2017 at 10:07:28AM +, Stefan Hajnoczi wrote: > On Wed, Feb 01, 2017 at 11:59:53PM +, Ketan Nilangekar wrote: > > Patch for secure implementation in libqnio is available for review here: > > > > https://github.com/VeritasHyperScale/libqnio/pull/12 > > > > libqnio client initialization now has an option to use X.509 certificates to authenticate itself to the vxhs server. > > Also each client IO request now includes an instance id that is used by the vxhs server to authorize the request. > > A test client has also been added. > > Libqnio.so so is renamed to libvxhs.so. We will rename the repository once the latest patches are merged. > > QEMU patch to use the new secure interface will follow shortly. > > I have left comments on specific lines of code on GitHub. > > The server should do something based on the client X.509 certificate. > Is the code actually verifying certificates on the client side? > > Right now the code is just going through the motions of SSL but not > protecting against man-in-the-middle attacks. > > I noticed that the code uses OpenSSL. QEMU uses GnuTLS instead of > OpenSSL. In practice it's hard to avoid duplication of SSL libraries: > GlusterFS and Ceph use OpenSSL and NSS. That means QEMU KVM may drag in > GnuTLS, OpenSSL, and NSS! But from a QEMU point of view it would be > nicest to use GnuTLS to keep extra library dependencies minimal. These points are all reasons why libqnio should not do anything TLS related at all. It should delegate all actual I/O to QEMU, so that we can use our existing QIO logic for TLS that is tried & tested, as well as integrating with existing QEMU infrastructure. ie, ability to use object_add QMP command to register TLS certificates with QEMU, instead of hardcoding their location on disk in libqnio [Ketan] Does the QIO interface allow for readv/writev over network for unsecure sockets? [Ketan] I checked the qio implementation and it seems that there is a pseudo implementation of readv/writev which iterates over the individual iovecs to make send/recv syscalls. This can’t be too good for performance. Are you suggesting we use the qio interface for secure communication only and leave the unsecure communication to libqnio? Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|
[Qemu-devel] regarding codebase
Sir I'm having difficulty in understanding the codebase of QEMU .There seem to have hundreds of code files and header files in the directory and I'm unable to figure out the functionality of each file.Is there any way to understand the function of each file ?? Regards Shubham Kumar
Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support
On 2/2/17, 2:15 AM, "Daniel P. Berrange"wrote: On Thu, Feb 02, 2017 at 10:07:28AM +, Stefan Hajnoczi wrote: > On Wed, Feb 01, 2017 at 11:59:53PM +, Ketan Nilangekar wrote: > > Patch for secure implementation in libqnio is available for review here: > > > > https://github.com/VeritasHyperScale/libqnio/pull/12 > > > > libqnio client initialization now has an option to use X.509 certificates to authenticate itself to the vxhs server. > > Also each client IO request now includes an instance id that is used by the vxhs server to authorize the request. > > A test client has also been added. > > Libqnio.so so is renamed to libvxhs.so. We will rename the repository once the latest patches are merged. > > QEMU patch to use the new secure interface will follow shortly. > > I have left comments on specific lines of code on GitHub. > > The server should do something based on the client X.509 certificate. > Is the code actually verifying certificates on the client side? > > Right now the code is just going through the motions of SSL but not > protecting against man-in-the-middle attacks. > > I noticed that the code uses OpenSSL. QEMU uses GnuTLS instead of > OpenSSL. In practice it's hard to avoid duplication of SSL libraries: > GlusterFS and Ceph use OpenSSL and NSS. That means QEMU KVM may drag in > GnuTLS, OpenSSL, and NSS! But from a QEMU point of view it would be > nicest to use GnuTLS to keep extra library dependencies minimal. These points are all reasons why libqnio should not do anything TLS related at all. It should delegate all actual I/O to QEMU, so that we can use our existing QIO logic for TLS that is tried & tested, as well as integrating with existing QEMU infrastructure. ie, ability to use object_add QMP command to register TLS certificates with QEMU, instead of hardcoding their location on disk in libqnio [Ketan] Does the QIO interface allow for readv/writev over network for unsecure sockets? Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o-http://search.cpan.org/~danberr/ :|
Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support
On 2/2/17, 2:07 AM, "Stefan Hajnoczi"wrote: On Wed, Feb 01, 2017 at 11:59:53PM +, Ketan Nilangekar wrote: > Patch for secure implementation in libqnio is available for review here: > > https://github.com/VeritasHyperScale/libqnio/pull/12 > > libqnio client initialization now has an option to use X.509 certificates to authenticate itself to the vxhs server. > Also each client IO request now includes an instance id that is used by the vxhs server to authorize the request. > A test client has also been added. > Libqnio.so so is renamed to libvxhs.so. We will rename the repository once the latest patches are merged. > QEMU patch to use the new secure interface will follow shortly. I have left comments on specific lines of code on GitHub. [Ketan] I have replied to your comments and will be uploading a patch shortly to cover some fixes. The server should do something based on the client X.509 certificate. Is the code actually verifying certificates on the client side? Right now the code is just going through the motions of SSL but not protecting against man-in-the-middle attacks. [Ketan] The VxHS server implementation will be verifying the client certificates. The libqnio server implementation is just a sample to verify the client functionality. VxHS does not use the libqnio server implementation. I noticed that the code uses OpenSSL. QEMU uses GnuTLS instead of OpenSSL. In practice it's hard to avoid duplication of SSL libraries: GlusterFS and Ceph use OpenSSL and NSS. That means QEMU KVM may drag in GnuTLS, OpenSSL, and NSS! But from a QEMU point of view it would be nicest to use GnuTLS to keep extra library dependencies minimal. [Ketan] If GlusterFS is already using OpenSSL then libqnio is not introducing anything new here. From a licensing perspective there seems to be a way of adding a “GPL linking exception” in this case.
Re: [Qemu-devel] [PATCH v2] q35: Improve sample configuration files
On 02/02/2017 05:20 PM, Andrea Bolognani wrote: On Thu, 2017-02-02 at 16:42 +0200, Marcel Apfelbaum wrote: [...] +[device "ich9-pcie-port-1"] I would use the new generic root port. This sample configuration (q35-emulated.cfg) is supposed to match physical hardware as closely as possible, so we should stick with ioh3420s. Good point. [...] +[device "pci.1"] + driver = "ioh3420" Same here, maybe we can use the new generic port. These sample configuration files (q35-virtio-*.cfg) should probably use the generic ports instead, yes. It's just that they were not merged yet when I started working on this :) Indeed, they've been merged only today. So good timing :) [...] +[device "ich9-pci-bridge"] + driver = "i82801b11-bridge" + bus = "pcie.0" + addr = "1e.0" I am not sure about having the DMI-PCI bridge "by default". Users can understand is actually a good idea to have it by default while we don't really want them to use legacy PCI devices on Q35; and even if so, they should use them as Integrated Endpoints. They don't have hotplug for the DMI-PCI bridge anyway. Not sure about this one. It doesn't show up on my laptop, Do you have a Q35 laptop ? Good for you, I thought Q35 machines are lost in time. I have a really old PC with Q35 chipset and I thought I am "special". so it's not like every single q35-based physical machine has That was my point. it. I'd be okay with dropping it, but I leave the decision to Gerd. Sure [...] Thanks for taking your time to update the configuration files! Marcel I personally don't use them because every time I try, I find something with no config support Is that so? Can you please test these new ones and see whether they work for you? Well, I am sure they work :), I'll give them a try (I'll beon PTO next week), but their purpose is to be a template for more complex configurations. Once I start to tweak them for my needs I hit a dead end. I don't remember anything specific though. Thanks, Marcel -- Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] Non-flat command line option argument syntax
* Markus Armbruster (arm...@redhat.com) wrote: > = Introduction = > > = Structured option argument syntax = > > == JSON == > > The obvious way to provide the expressiveness of JSON on the command > line is JSON. Easy enough[2]. However, besides not being compatible, > it's rather heavy on syntax, at least for simple cases. Compare: > > -machine q35,accel=kvm > -machine '{ "type": "q35", "accel": "kvm"}' > > It compares a bit more favourably in cases that use our non-flat hacks. > Here's a flat list as KEY=VALUE,... with repeated keys, and as JSON: > > -semihosting-config enable,arg=eins,arg=zwei,arg=drei > -semihosting-config '{ "enable": true, "arg": [ "eins", "zwei", "drei" ] > }' > > Arbitrary nesting with dotted key convention: > > -drive driver=qcow2,file.driver=gluster, >file.volume=testvol,file.path=/path/a.qcow2,file.debug=9, >file.server.0.type=tcp, >file.server.0.host=1.2.3.4, >file.server.0.port=24007, >file.server.1.type=unix, >file.server.1.socket=/var/run/glusterd.socket > -drive '{ "driver": "qcow2", > "file": { > "driver": "gluster", "volume": "testvol", > "path": "/path/a.qcow2", "debug": 9, > "server": [ { "type": "tcp", > "host": "1.2.3.4", "port": "24007"}, > { "type": "unix", > "socket": "/var/run/glusterd.socket" } ] } }' So while I generally hate JSON, the -drive dotted key syntax makes me mad when it gets like this; have a look at the block replication and quorum setups especially, that can end up with (from docs/COLO-FT.txt): -drive if=virtio,id=primary-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\ children.0.file.filename=1.raw,\ children.0.driver=raw -S that's just way too many .'s to ever properly understand. (I'm sure it used to be more complex). > Lines broken and indented for legibility; you need to join them for > actual use. Why? What's a \n between friends for JSON? > Once you do, both variants are basically illegible. This > is simply something that belongs into a config file rather than the > command line. In a config file, JSON would be a better choice. > > There's also the -drive file=json:... syntax. It's a bad fit for > QemuOpts, because QemuOpts and JSON fight for the comma. I'd show you > if I could get it to work. > > We obviously can't replace QemuOpts with JSON. But accepting JSON in > addition to QemuOpts is a debatable feature: it lets management > applications reuse the code to build QMP arguments for option arguments. > > Since structured option arguments are always dictionaries, a JSON option > argument always starts with '{'. If no QemuOpts argument can ever start > with '{', accepting either QemuOpts or a JSON object is unambiguous. > For a more detailed discussion of the following argument, see [3]. > > A QemuOpts argument normally starts with KEY. We need to outlaw KEYs > starting with '{'. QAPI outlaws such names, see docs/qapi-code-gen.txt. > QOM doesn't, but no such keys exist as far as I know. > > QemuOpts permit abbreviating KEY=VALUE to just VALUE for one specific > KEY (the "implied" key). We need to limit this to KEYs whose VALUE > can't start with '{'. Most implied keys can't have such values. > Troublemakers include qemu-img's use of implied "file" keys. You'd have > to say "file={my-tastelessly-named-file}" instead of just > "{my-tastelessly-named-file}". What worries me a bit is building shell scripts which include ['s and {'s tends to be painful. > === Structured values === > > The dotted key convention messes with KEY syntax to permit structured > values. Works, but the more conventional way to support structured > values is a syntax for structured values. > > An obvious one is to use { KEY=VALUE, ...} for objects, and [ VALUE, > ... ] for arrays. Looks like this: > > -drive 'driver=quorum, > child=[{ driver=file, filename=disk1.img }, >{ driver=host_device, filename=/dev/sdb }, >{ driver=nbd, host=localhost } ]' > > Again, lines broken and indented for legibility; you need to join them > for actual use. > > There's a syntactic catch, though: a value of the form [ ... ] can > either be an array or a string. Which one it is depends on the type of > the key. To parse this syntax, you need to know the types, unlike JSON > or traditional QemuOpts. Unless we outlaw strings starting with '{' or > '[', which feels impractical. I don't understand why [ could imply a string. > But wait, there's another syntactic catch: in traditional QemuOpts, a > value ends at the next unescaped ',' or '\0'. Inside an object, it now > also ends at the next unescaped '}', and inside an array, at the next > unescaped ']'. Or perhaps at the next space (the example
Re: [Qemu-devel] [PATCH] tci: Remove invalid assertions
On 02/02/2017 02:10 PM, Stefan Weil wrote: > Am 02.02.2017 um 21:00 schrieb Eric Blake: >> On 02/02/2017 01:56 PM, Stefan Weil wrote: >>> tb_jmp_insn_offset and tb_jmp_reset_offset are pointers >>> and cannot be used with ARRAY_SIZE. >>> >> mst posted an alternative patch: >> https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00551.html > > > Yes, I noticed that, too. It's not obvious that this new > assertion will be correct, and none of the other targets > has that kind of assertion. Only two targets use an > assertion which detects NULL pointers, but NULL pointers > will result in an abort anyway. > > Do you think that there are reasons why TCI should use > the assertion suggested by Michael? I don't have any strong opinions on which patch is better, so much as just pointing out that alternative proposals exist so that we have good threading in making the decision on which one to go with. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] Non-flat command line option argument syntax
On 02/02/2017 01:42 PM, Markus Armbruster wrote: > > === Structured values === > > The dotted key convention messes with KEY syntax to permit structured > values. Works, but the more conventional way to support structured > values is a syntax for structured values. > > An obvious one is to use { KEY=VALUE, ...} for objects, and [ VALUE, > ... ] for arrays. Looks like this: > > -drive 'driver=quorum, > child=[{ driver=file, filename=disk1.img }, >{ driver=host_device, filename=/dev/sdb }, >{ driver=nbd, host=localhost } ]' > > Again, lines broken and indented for legibility; you need to join them > for actual use. > > There's a syntactic catch, though: a value of the form [ ... ] can > either be an array or a string. Which one it is depends on the type of > the key. To parse this syntax, you need to know the types, unlike JSON > or traditional QemuOpts. Unless we outlaw strings starting with '{' or > '[', which feels impractical. Another syntactic catch: from the shell, -drive driver=quorum,child=[...] is insufficiently quoted, and MIGHT glob to a completely different argument (or even multiple arguments) depending on the (oddly-named) contents of the current directory. Any use of [] HAS to consistently recommend use with shell quotes. Using straight JSON already has to use shell quotes (generally '' for the overall argument, and "" for key names and string values within the JSON, although our parser as an extension supports '' for key names and string values which pairs with "" for the overall argument and allows the use of $var shell interpolation). > > === Comparison === > > In my opinion, dotted keys are weird and ugly, but at least they don't > add to the quoting mess. Structured values look better, except when > they do add to the quoting mess. > > I'm having a hard time deciding which one I like less :) Both are a bit awkward. I think dotted keys require more typing but less shell quoting than structured values. And with either approach, it would STILL be nice if we taught QemuOpts to strip whitespace after delimiting commas - the only requirement is that no key value can start with space, which QAPI enforces, and QOM is unlikely to break, although the benefits of stripping whitespace are only apparent when you remember to use shell quoting over the entire argument (which partially defeats the purpose of trying to come up with a syntax that needs less shell quoting). > > Opinions? Other ideas? I don't think command line length is a problem; most command lines are generated. I'm torn on whether a simplified structured values is nicer than full-blown JSON; your argument about having the same JSON work on both the command line and through QMP resulting in less work for management apps is interesting. And reusing an existing syntax instead of inventing yet another one always has the benefit of less code to maintain. So even though it's harder to type by hand, I'm somewhat leaning towards full JSON (where a leading '{' says to parse using JSON until the closing '}'), rather than any other structured value representation. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] regarding QEMU cloning
On 2 February 2017 at 19:45, Shubham Kumarwrote: > I'm unable to clone QEMU using this command on my terminal > " git clone git://git.qemu-project.org/qemu.git ". > The error I'm receiving is >" fatal: unable to access > 'https://git.qemu-project.org/qemu.git/': Received HTTP code 503 from proxy > after CONNECT". > > I have tried many solutions given on internet but it doesn't work out. The error message talks about a proxy, which suggests that the problem is with the HTTP proxy between you and us (ie probably your company or institution's). In particular it looks like your proxy attempts to silently convert git:// protocol into https://, which is not necessarily going to work. You could try using HTTPS directly: git clone https://git.qemu-project.org/git/qemu.git or HTTP: git clone http://git.qemu-project.org/git/qemu.git (note that the url is not quite the same here!) ...but those seem to give 443 timeouts at the moment. I've asked the qemu server sysadmin if there's something that needs fixing. In the meantime, if you can do a git clone that uses real git:// protocol and doesn't go via your local proxy then that ought to work. Or you can use the github mirror: git clone https://github.com/qemu/qemu.git thanks -- PMM
[Qemu-devel] [PATCH 3/9] armv7m: Rewrite NVIC to not use any GIC code
From: Michael DavidsaverDespite some superficial similarities of register layout, the M-profile NVIC is really very different from the A-profile GIC. Our current attempt to reuse the GIC code means that we have significant bugs in our NVIC. Implement the NVIC as an entirely separate device, to give us somewhere we can get the behaviour correct. This initial commit does not attempt to implement exception priority escalation, since the GIC-based code didn't either. It does fix a few bugs in passing: * ICSR.RETTOBASE polarity was wrong and didn't account for internal exceptions * ICSR.VECTPENDING was 16 too high if the pending exception was for an external interrupt * UsageFault, BusFault and MemFault were not disabled on reset as they are supposed to be Signed-off-by: Michael Davidsaver [PMM: reworked, various bugs and stylistic cleanups] Signed-off-by: Peter Maydell --- hw/intc/armv7m_nvic.c | 721 -- hw/intc/trace-events | 15 ++ 2 files changed, 592 insertions(+), 144 deletions(-) diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c index ce22001..e319077 100644 --- a/hw/intc/armv7m_nvic.c +++ b/hw/intc/armv7m_nvic.c @@ -17,48 +17,79 @@ #include "hw/sysbus.h" #include "qemu/timer.h" #include "hw/arm/arm.h" +#include "target/arm/cpu.h" #include "exec/address-spaces.h" -#include "gic_internal.h" #include "qemu/log.h" +#include "trace.h" + +/* IRQ number counting: + * + * the num-irq property counts the number of external IRQ lines + * + * NVICState::num_irq counts the total number of exceptions + * (external IRQs, the 15 internal exceptions including reset, + * and one for the unused exception number 0). + * + * NVIC_MAX_IRQ is the highest permitted number of external IRQ lines. + * + * NVIC_MAX_VECTORS is the highest permitted number of exceptions. + * + * Iterating through all exceptions should typically be done with + * for (i = 1; i < s->num_irq; i++) to avoid the unused slot 0. + * + * The external qemu_irq lines are the NVIC's external IRQ lines, + * so line 0 is exception 16. + */ +#define NVIC_FIRST_IRQ 16 +#define NVIC_MAX_VECTORS 512 +#define NVIC_MAX_IRQ (NVIC_MAX_VECTORS - NVIC_FIRST_IRQ) + +/* Effective running priority of the CPU when no exception is active + * (higher than the highest possible priority value) + */ +#define NVIC_NOEXC_PRIO 0x100 + +typedef struct VecInfo { +int16_t prio; /* priorities can range from -3 to 255 */ +uint8_t enabled; +uint8_t pending; +uint8_t active; +uint8_t level; /* exceptions <=15 never set level */ +} VecInfo; typedef struct NVICState { -GICState gic; +/*< private >*/ +SysBusDevice parent_obj; +/*< public >*/ + ARMCPU *cpu; +VecInfo vectors[NVIC_MAX_VECTORS]; uint32_t prigroup; +/* vectpending and exception_prio are both cached state that can + * be recalculated from the vectors[] array and the prigroup field. + */ +unsigned int vectpending; /* highest prio pending enabled exception */ +int exception_prio; /* group prio of the highest prio active exception */ + struct { uint32_t control; uint32_t reload; int64_t tick; QEMUTimer *timer; } systick; + MemoryRegion sysregmem; -MemoryRegion gic_iomem_alias; MemoryRegion container; + uint32_t num_irq; +qemu_irq excpout; qemu_irq sysresetreq; } NVICState; #define TYPE_NVIC "armv7m_nvic" -/** - * NVICClass: - * @parent_reset: the parent class' reset handler. - * - * A model of the v7M NVIC and System Controller - */ -typedef struct NVICClass { -/*< private >*/ -ARMGICClass parent_class; -/*< public >*/ -DeviceRealize parent_realize; -void (*parent_reset)(DeviceState *dev); -} NVICClass; - -#define NVIC_CLASS(klass) \ -OBJECT_CLASS_CHECK(NVICClass, (klass), TYPE_NVIC) -#define NVIC_GET_CLASS(obj) \ -OBJECT_GET_CLASS(NVICClass, (obj), TYPE_NVIC) + #define NVIC(obj) \ OBJECT_CHECK(NVICState, (obj), TYPE_NVIC) @@ -125,47 +156,274 @@ static void systick_reset(NVICState *s) timer_del(s->systick.timer); } -/* The external routines use the hardware vector numbering, ie. the first - IRQ is #16. The internal GIC routines use #32 as the first IRQ. */ +static int nvic_pending_prio(NVICState *s) +{ +/* return the priority of the current pending interrupt, + * or NVIC_NOEXC_PRIO if no interrupt is pending + */ +return s->vectpending ? s->vectors[s->vectpending].prio : NVIC_NOEXC_PRIO; +} + +/* Return the value of the ISCR RETTOBASE bit: + * 1 if there is exactly one active exception + * 0 if there is more than one active exception + * UNKNOWN if there are no active exceptions (we choose 0) + */ +static bool nvic_rettobase(NVICState *s) +{ +int irq, nhand = 0; + +for (irq = ARMV7M_EXCP_RESET; irq < s->num_irq; irq++) { +if (s->vectors[irq].active)
[Qemu-devel] [PATCH 1/9] armv7m: Rename nvic_state to NVICState
Rename the nvic_state struct to NVICState, to match our naming conventions. Signed-off-by: Peter Maydell--- hw/intc/armv7m_nvic.c | 44 ++-- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c index fe5c303..09975f3 100644 --- a/hw/intc/armv7m_nvic.c +++ b/hw/intc/armv7m_nvic.c @@ -21,7 +21,7 @@ #include "gic_internal.h" #include "qemu/log.h" -typedef struct { +typedef struct NVICState { GICState gic; ARMCPU *cpu; struct { @@ -35,7 +35,7 @@ typedef struct { MemoryRegion container; uint32_t num_irq; qemu_irq sysresetreq; -} nvic_state; +} NVICState; #define TYPE_NVIC "armv7m_nvic" /** @@ -57,7 +57,7 @@ typedef struct NVICClass { #define NVIC_GET_CLASS(obj) \ OBJECT_GET_CLASS(NVICClass, (obj), TYPE_NVIC) #define NVIC(obj) \ -OBJECT_CHECK(nvic_state, (obj), TYPE_NVIC) +OBJECT_CHECK(NVICState, (obj), TYPE_NVIC) static const uint8_t nvic_id[] = { 0x00, 0xb0, 0x1b, 0x00, 0x0d, 0xe0, 0x05, 0xb1 @@ -74,7 +74,7 @@ static const uint8_t nvic_id[] = { int system_clock_scale; /* Conversion factor from qemu timer to SysTick frequencies. */ -static inline int64_t systick_scale(nvic_state *s) +static inline int64_t systick_scale(NVICState *s) { if (s->systick.control & SYSTICK_CLKSOURCE) return system_clock_scale; @@ -82,7 +82,7 @@ static inline int64_t systick_scale(nvic_state *s) return 1000; } -static void systick_reload(nvic_state *s, int reset) +static void systick_reload(NVICState *s, int reset) { /* The Cortex-M3 Devices Generic User Guide says that "When the * ENABLE bit is set to 1, the counter loads the RELOAD value from the @@ -101,7 +101,7 @@ static void systick_reload(nvic_state *s, int reset) static void systick_timer_tick(void * opaque) { -nvic_state *s = (nvic_state *)opaque; +NVICState *s = (NVICState *)opaque; s->systick.control |= SYSTICK_COUNTFLAG; if (s->systick.control & SYSTICK_TICKINT) { /* Trigger the interrupt. */ @@ -114,7 +114,7 @@ static void systick_timer_tick(void * opaque) } } -static void systick_reset(nvic_state *s) +static void systick_reset(NVICState *s) { s->systick.control = 0; s->systick.reload = 0; @@ -126,7 +126,7 @@ static void systick_reset(nvic_state *s) IRQ is #16. The internal GIC routines use #32 as the first IRQ. */ void armv7m_nvic_set_pending(void *opaque, int irq) { -nvic_state *s = (nvic_state *)opaque; +NVICState *s = (NVICState *)opaque; if (irq >= 16) irq += 16; gic_set_pending_private(>gic, 0, irq); @@ -135,7 +135,7 @@ void armv7m_nvic_set_pending(void *opaque, int irq) /* Make pending IRQ active. */ int armv7m_nvic_acknowledge_irq(void *opaque) { -nvic_state *s = (nvic_state *)opaque; +NVICState *s = (NVICState *)opaque; uint32_t irq; irq = gic_acknowledge_irq(>gic, 0, MEMTXATTRS_UNSPECIFIED); @@ -148,13 +148,13 @@ int armv7m_nvic_acknowledge_irq(void *opaque) void armv7m_nvic_complete_irq(void *opaque, int irq) { -nvic_state *s = (nvic_state *)opaque; +NVICState *s = (NVICState *)opaque; if (irq >= 16) irq += 16; gic_complete_irq(>gic, 0, irq, MEMTXATTRS_UNSPECIFIED); } -static uint32_t nvic_readl(nvic_state *s, uint32_t offset) +static uint32_t nvic_readl(NVICState *s, uint32_t offset) { ARMCPU *cpu = s->cpu; uint32_t val; @@ -294,7 +294,7 @@ static uint32_t nvic_readl(nvic_state *s, uint32_t offset) } } -static void nvic_writel(nvic_state *s, uint32_t offset, uint32_t value) +static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value) { ARMCPU *cpu = s->cpu; uint32_t oldval; @@ -425,7 +425,7 @@ static void nvic_writel(nvic_state *s, uint32_t offset, uint32_t value) static uint64_t nvic_sysreg_read(void *opaque, hwaddr addr, unsigned size) { -nvic_state *s = (nvic_state *)opaque; +NVICState *s = (NVICState *)opaque; uint32_t offset = addr; int i; uint32_t val; @@ -454,7 +454,7 @@ static uint64_t nvic_sysreg_read(void *opaque, hwaddr addr, static void nvic_sysreg_write(void *opaque, hwaddr addr, uint64_t value, unsigned size) { -nvic_state *s = (nvic_state *)opaque; +NVICState *s = (NVICState *)opaque; uint32_t offset = addr; int i; @@ -486,17 +486,17 @@ static const VMStateDescription vmstate_nvic = { .version_id = 1, .minimum_version_id = 1, .fields = (VMStateField[]) { -VMSTATE_UINT32(systick.control, nvic_state), -VMSTATE_UINT32(systick.reload, nvic_state), -VMSTATE_INT64(systick.tick, nvic_state), -VMSTATE_TIMER_PTR(systick.timer, nvic_state), +VMSTATE_UINT32(systick.control, NVICState), +VMSTATE_UINT32(systick.reload, NVICState), +VMSTATE_INT64(systick.tick,
[Qemu-devel] [PATCH 2/9] armv7m: Implement reading and writing of PRIGROUP
Add a state field for the v7M PRIGROUP register and implent reading and writing it. The current NVIC doesn't honour the values written, but the new version will. Signed-off-by: Peter Maydell--- hw/intc/armv7m_nvic.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c index 09975f3..ce22001 100644 --- a/hw/intc/armv7m_nvic.c +++ b/hw/intc/armv7m_nvic.c @@ -24,6 +24,9 @@ typedef struct NVICState { GICState gic; ARMCPU *cpu; + +uint32_t prigroup; + struct { uint32_t control; uint32_t reload; @@ -223,7 +226,7 @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset) case 0xd08: /* Vector Table Offset. */ return cpu->env.v7m.vecbase; case 0xd0c: /* Application Interrupt/Reset Control. */ -return 0xfa05; +return 0xfa05 | (s->prigroup << 8); case 0xd10: /* System Control. */ /* TODO: Implement SLEEPONEXIT. */ return 0; @@ -362,9 +365,7 @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value) if (value & 1) { qemu_log_mask(LOG_UNIMP, "AIRCR system reset unimplemented\n"); } -if (value & 0x700) { -qemu_log_mask(LOG_UNIMP, "PRIGROUP unimplemented\n"); -} +s->prigroup = extract32(value, 8, 3); } break; case 0xd10: /* System Control. */ @@ -483,13 +484,14 @@ static const MemoryRegionOps nvic_sysreg_ops = { static const VMStateDescription vmstate_nvic = { .name = "armv7m_nvic", -.version_id = 1, -.minimum_version_id = 1, +.version_id = 2, +.minimum_version_id = 2, .fields = (VMStateField[]) { VMSTATE_UINT32(systick.control, NVICState), VMSTATE_UINT32(systick.reload, NVICState), VMSTATE_INT64(systick.tick, NVICState), VMSTATE_TIMER_PTR(systick.timer, NVICState), +VMSTATE_UINT32(prigroup, NVICState), VMSTATE_END_OF_LIST() } }; -- 2.7.4
Re: [Qemu-devel] [PATCH] tci: Remove invalid assertions
Am 02.02.2017 um 21:00 schrieb Eric Blake: On 02/02/2017 01:56 PM, Stefan Weil wrote: tb_jmp_insn_offset and tb_jmp_reset_offset are pointers and cannot be used with ARRAY_SIZE. Signed-off-by: Stefan Weil--- tcg/tci/tcg-target.inc.c | 2 -- 1 file changed, 2 deletions(-) mst posted an alternative patch: https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00551.html Yes, I noticed that, too. It's not obvious that this new assertion will be correct, and none of the other targets has that kind of assertion. Only two targets use an assertion which detects NULL pointers, but NULL pointers will result in an abort anyway. Do you think that there are reasons why TCI should use the assertion suggested by Michael? Stefan
[Qemu-devel] [PATCH 4/9] armv7m: Fix condition check for taking exceptions
The M profile condition for when we can take a pending exception or interrupt is not the same as that for A/R profile. The code originally copied from the A/R profile version of the cpu_exec_interrupt function only worked by chance for the very simple case of exceptions being masked by PRIMASK. Replace it with a call to a function in the NVIC code that correctly compares the priority of the pending exception against the current execution priority of the CPU. [Michael Davidsaver's patchset had a patch to do something similar but the implementation ended up being a rewrite.] Signed-off-by: Peter Maydell--- target/arm/cpu.h | 8 hw/intc/armv7m_nvic.c | 7 +++ target/arm/cpu.c | 16 3 files changed, 23 insertions(+), 8 deletions(-) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 39bff86..ac20a56 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -1335,6 +1335,14 @@ uint32_t arm_phys_excp_target_el(CPUState *cs, uint32_t excp_idx, uint32_t cur_el, bool secure); /* Interface between CPU and Interrupt controller. */ +#ifndef CONFIG_USER_ONLY +bool armv7m_nvic_can_take_pending_exception(void *opaque); +#else +static inline bool armv7m_nvic_can_take_pending_exception(void *opaque) +{ +return true; +} +#endif void armv7m_nvic_set_pending(void *opaque, int irq); int armv7m_nvic_acknowledge_irq(void *opaque); void armv7m_nvic_complete_irq(void *opaque, int irq); diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c index e319077..3d77cbf 100644 --- a/hw/intc/armv7m_nvic.c +++ b/hw/intc/armv7m_nvic.c @@ -268,6 +268,13 @@ static inline int nvic_exec_prio(NVICState *s) return MIN(running, s->exception_prio); } +bool armv7m_nvic_can_take_pending_exception(void *opaque) +{ +NVICState *s = opaque; + +return nvic_exec_prio(s) > nvic_pending_prio(s); +} + /* caller must call nvic_irq_update() after this */ static void set_prio(NVICState *s, unsigned irq, uint8_t prio) { diff --git a/target/arm/cpu.c b/target/arm/cpu.c index e9f10f7..7713d88 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -338,13 +338,6 @@ static bool arm_v7m_cpu_exec_interrupt(CPUState *cs, int interrupt_request) CPUARMState *env = >env; bool ret = false; - -if (interrupt_request & CPU_INTERRUPT_FIQ -&& !(env->daif & PSTATE_F)) { -cs->exception_index = EXCP_FIQ; -cc->do_interrupt(cs); -ret = true; -} /* ARMv7-M interrupt return works by loading a magic value * into the PC. On real hardware the load causes the * return to occur. The qemu implementation performs the @@ -354,9 +347,16 @@ static bool arm_v7m_cpu_exec_interrupt(CPUState *cs, int interrupt_request) * the stack if an interrupt occurred at the wrong time. * We avoid this by disabling interrupts when * pc contains a magic address. + * + * ARMv7-M interrupt masking works differently than -A or -R. + * There is no FIQ/IRQ distinction. Instead of I and F bits + * masking FIQ and IRQ interrupts, an exception is taken only + * if it is higher priority than the current execution priority + * (which depends on state like BASEPRI, FAULTMASK and the + * currently active exception). */ if (interrupt_request & CPU_INTERRUPT_HARD -&& !(env->daif & PSTATE_I) +&& (armv7m_nvic_can_take_pending_exception(env->nvic)) && (env->regs[15] < 0xfff0)) { cs->exception_index = EXCP_IRQ; cc->do_interrupt(cs); -- 2.7.4
Re: [Qemu-devel] Non-flat command line option argument syntax
On 02/02/2017 01:42 PM, Markus Armbruster wrote: quick comment before I reply to the overall message... > Arbitrary nesting with dotted key convention: > > -drive driver=qcow2,file.driver=gluster, >file.volume=testvol,file.path=/path/a.qcow2,file.debug=9, >file.server.0.type=tcp, >file.server.0.host=1.2.3.4, >file.server.0.port=24007, >file.server.1.type=unix, >file.server.1.socket=/var/run/glusterd.socket > > Lines broken and indented for legibility; you need to join them for > actual use. Once you do, both variants are basically illegible. It should be relatively simple to tweak QemuOpts to support: -drive 'driver=qcow2, file.driver=gluster, file' where all whitespace after comma is ignored, with a nice effect of much more legibility in long command lines. I might just submit a patch for that, regardless of what else we do to get rid of QemuOpts. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
[Qemu-devel] [PATCH 5/9] arm: gic: Remove references to NVIC
From: Michael DavidsaverNow that the NVIC is its own separate implementation, we can clean up the GIC code by removing REV_NVIC and conditionals which use it. Signed-off-by: Michael Davidsaver Signed-off-by: Peter Maydell --- hw/intc/gic_internal.h | 7 ++- hw/intc/arm_gic.c| 31 +-- hw/intc/arm_gic_common.c | 23 --- 3 files changed, 15 insertions(+), 46 deletions(-) diff --git a/hw/intc/gic_internal.h b/hw/intc/gic_internal.h index 3f31174..7fe87b1 100644 --- a/hw/intc/gic_internal.h +++ b/hw/intc/gic_internal.h @@ -25,9 +25,7 @@ #define ALL_CPU_MASK ((unsigned)(((1 << GIC_NCPU) - 1))) -/* The NVIC has 16 internal vectors. However these are not exposed - through the normal GIC interface. */ -#define GIC_BASE_IRQ ((s->revision == REV_NVIC) ? 32 : 0) +#define GIC_BASE_IRQ 0 #define GIC_SET_ENABLED(irq, cm) s->irq_state[irq].enabled |= (cm) #define GIC_CLEAR_ENABLED(irq, cm) s->irq_state[irq].enabled &= ~(cm) @@ -75,7 +73,6 @@ /* The special cases for the revision property: */ #define REV_11MPCORE 0 -#define REV_NVIC 0x void gic_set_pending_private(GICState *s, int cpu, int irq); uint32_t gic_acknowledge_irq(GICState *s, int cpu, MemTxAttrs attrs); @@ -87,7 +84,7 @@ void gic_set_priority(GICState *s, int cpu, int irq, uint8_t val, static inline bool gic_test_pending(GICState *s, int irq, int cm) { -if (s->revision == REV_NVIC || s->revision == REV_11MPCORE) { +if (s->revision == REV_11MPCORE) { return s->irq_state[irq].pending & cm; } else { /* Edge-triggered interrupts are marked pending on a rising edge, but diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c index 521aac3..8e5a9d8 100644 --- a/hw/intc/arm_gic.c +++ b/hw/intc/arm_gic.c @@ -156,17 +156,6 @@ static void gic_set_irq_11mpcore(GICState *s, int irq, int level, } } -static void gic_set_irq_nvic(GICState *s, int irq, int level, - int cm, int target) -{ -if (level) { -GIC_SET_LEVEL(irq, cm); -GIC_SET_PENDING(irq, target); -} else { -GIC_CLEAR_LEVEL(irq, cm); -} -} - static void gic_set_irq_generic(GICState *s, int irq, int level, int cm, int target) { @@ -214,8 +203,6 @@ static void gic_set_irq(void *opaque, int irq, int level) if (s->revision == REV_11MPCORE) { gic_set_irq_11mpcore(s, irq, level, cm, target); -} else if (s->revision == REV_NVIC) { -gic_set_irq_nvic(s, irq, level, cm, target); } else { gic_set_irq_generic(s, irq, level, cm, target); } @@ -367,7 +354,7 @@ uint32_t gic_acknowledge_irq(GICState *s, int cpu, MemTxAttrs attrs) return 1023; } -if (s->revision == REV_11MPCORE || s->revision == REV_NVIC) { +if (s->revision == REV_11MPCORE) { /* Clear pending flags for both level and edge triggered interrupts. * Level triggered IRQs will be reasserted once they become inactive. */ @@ -589,11 +576,6 @@ void gic_complete_irq(GICState *s, int cpu, int irq, MemTxAttrs attrs) DPRINTF("Set %d pending mask %x\n", irq, cm); GIC_SET_PENDING(irq, cm); } -} else if (s->revision == REV_NVIC) { -if (GIC_TEST_LEVEL(irq, cm)) { -DPRINTF("Set nvic %d pending mask %x\n", irq, cm); -GIC_SET_PENDING(irq, cm); -} } group = gic_has_groups(s) && GIC_TEST_GROUP(irq, cm); @@ -768,7 +750,7 @@ static uint32_t gic_dist_readb(void *opaque, hwaddr offset, MemTxAttrs attrs) } else if (offset < 0xf10) { goto bad_reg; } else if (offset < 0xf30) { -if (s->revision == REV_11MPCORE || s->revision == REV_NVIC) { +if (s->revision == REV_11MPCORE) { goto bad_reg; } @@ -802,9 +784,6 @@ static uint32_t gic_dist_readb(void *opaque, hwaddr offset, MemTxAttrs attrs) case 2: res = gic_id_gicv2[(offset - 0xfd0) >> 2]; break; -case REV_NVIC: -/* Shouldn't be able to get here */ -abort(); default: res = 0; } @@ -1028,7 +1007,7 @@ static void gic_dist_writeb(void *opaque, hwaddr offset, continue; /* Ignore Non-secure access of Group0 IRQ */ } -if (s->revision == REV_11MPCORE || s->revision == REV_NVIC) { +if (s->revision == REV_11MPCORE) { if (value & (1 << (i * 2))) { GIC_SET_MODEL(irq + i); } else { @@ -1046,7 +1025,7 @@ static void gic_dist_writeb(void *opaque, hwaddr offset, goto bad_reg; } else if (offset < 0xf20) { /* GICD_CPENDSGIRn */ -if (s->revision == REV_11MPCORE || s->revision == REV_NVIC) { +if (s->revision == REV_11MPCORE) {
[Qemu-devel] [PATCH 6/9] armv7m: Escalate exceptions to HardFault if necessary
From: Michael DavidsaverThe v7M exception architecture requires that if a synchronous exception cannot be taken immediately (because it is disabled or at too low a priority) then it should be escalated to HardFault (and the HardFault exception is then taken). Implement this escalation logic. Signed-off-by: Michael Davidsaver [PMM: extracted from another patch] Signed-off-by: Peter Maydell --- hw/intc/armv7m_nvic.c | 53 +++ target/arm/helper.c | 2 -- 2 files changed, 53 insertions(+), 2 deletions(-) diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c index 3d77cbf..2eaac3d 100644 --- a/hw/intc/armv7m_nvic.c +++ b/hw/intc/armv7m_nvic.c @@ -334,6 +334,59 @@ void armv7m_nvic_set_pending(void *opaque, int irq) vec = >vectors[irq]; trace_nvic_set_pending(irq, vec->enabled, vec->prio); + + +if (irq >= ARMV7M_EXCP_HARD && irq < ARMV7M_EXCP_PENDSV) { +/* If a synchronous exception is pending then it may be + * escalated to HardFault if: + * * it is equal or lower priority to current execution + * * it is disabled + * (ie we need to take it immediately but we can't do so). + * Asynchronous exceptions (and interrupts) simply remain pending. + * + * For QEMU, we don't have any imprecise (asynchronous) faults, + * so we can assume that PREFETCH_ABORT and DATA_ABORT are always + * synchronous. + * Debug exceptions are awkward because only Debug exceptions + * resulting from the BKPT instruction should be escalated, + * but we don't currently implement any Debug exceptions other + * than those that result from BKPT, so we treat all debug exceptions + * as needing escalation. + * + * This all means we can identify whether to escalate based only on + * the exception number and don't (yet) need the caller to explicitly + * tell us whether this exception is synchronous or not. + */ +int running = nvic_exec_prio(s); +bool escalate = false; + +if (vec->prio >= running) { +trace_nvic_escalate_prio(irq, vec->prio, running); +escalate = true; +} else if (!vec->enabled) { +trace_nvic_escalate_disabled(irq); +escalate = true; +} + +if (escalate) { +if (running < 0) { +/* We want to escalate to HardFault but we can't take a + * synchronous HardFault at this point either. This is a + * Lockup condition due to a guest bug. We don't model + * Lockup, so report via cpu_abort() instead. + */ +cpu_abort(>cpu->parent_obj, + "Lockup: can't escalate %d to HardFault " + "(current priority %d)\n", irq, running); +} + +/* We can do the escalation, so we take HardFault instead */ +irq = ARMV7M_EXCP_HARD; +vec = >vectors[irq]; +s->cpu->env.v7m.hfsr |= R_V7M_HFSR_FORCED_MASK; +} +} + if (!vec->pending) { vec->pending = 1; nvic_irq_update(s); diff --git a/target/arm/helper.c b/target/arm/helper.c index c23df1b..6c86eac 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -6067,8 +6067,6 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs) /* For exceptions we just mark as pending on the NVIC, and let that handle it. */ -/* TODO: Need to escalate if the current priority is higher than the - one we're raising. */ switch (cs->exception_index) { case EXCP_UDEF: armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE); -- 2.7.4
[Qemu-devel] [PATCH 9/9] armv7m: VECTCLRACTIVE and VECTRESET are UNPREDICTABLE
From: Michael DavidsaverThe VECTCLRACTIVE and VECTRESET bits in the AIRCR are both documented as UNPREDICTABLE if you write a 1 to them when the processor is not halted in Debug state (ie stopped and under the control of an external JTAG debugger). Since we don't implement Debug state or emulated JTAG these bits are always UNPREDICTABLE for us. Instead of logging them as unimplemented we can simply log writes as guest errors and ignore them. Signed-off-by: Michael Davidsaver [PMM: change extracted from another patch; commit message constructed from scratch] Signed-off-by: Peter Maydell --- hw/intc/armv7m_nvic.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c index 7b61fe6..18c0e60 100644 --- a/hw/intc/armv7m_nvic.c +++ b/hw/intc/armv7m_nvic.c @@ -698,10 +698,14 @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value) qemu_irq_pulse(s->sysresetreq); } if (value & 2) { -qemu_log_mask(LOG_UNIMP, "VECTCLRACTIVE unimplemented\n"); +qemu_log_mask(LOG_GUEST_ERROR, + "Setting VECTCLRACTIVE when not in DEBUG mode " + "is UNPREDICTABLE\n"); } if (value & 1) { -qemu_log_mask(LOG_UNIMP, "AIRCR system reset unimplemented\n"); +qemu_log_mask(LOG_GUEST_ERROR, + "Setting VECTRESET when not in DEBUG mode " + "is UNPREDICTABLE\n"); } s->prigroup = extract32(value, 8, 3); nvic_irq_update(s); -- 2.7.4
[Qemu-devel] [PATCH 8/9] armv7m: Simpler and faster exception start
From: Michael DavidsaverAll the places in armv7m_cpu_do_interrupt() which pend an exception in the NVIC are doing so for synchronous exceptions. We know that we will always take some exception in this case, so we can just acknowledge it immediately, rather than returning and then immediately being called again because the NVIC has raised its outbound IRQ line. Signed-off-by: Michael Davidsaver [PMM: tweaked commit message; added DEBUG to the set of exceptions we handle immediately, since it is synchronous when it results from the BKPT instruction] Signed-off-by: Peter Maydell --- target/arm/helper.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/target/arm/helper.c b/target/arm/helper.c index 78bf9ab..8bdd99c 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -6071,22 +6071,22 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs) case EXCP_UDEF: armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE); env->v7m.cfsr |= R_V7M_CFSR_UNDEFINSTR_MASK; -return; +break; case EXCP_NOCP: armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE); env->v7m.cfsr |= R_V7M_CFSR_NOCP_MASK; -return; +break; case EXCP_SWI: /* The PC already points to the next instruction. */ armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SVC); -return; +break; case EXCP_PREFETCH_ABORT: case EXCP_DATA_ABORT: /* TODO: if we implemented the MPU registers, this is where we * should set the MMFAR, etc from exception.fsr and exception.vaddress. */ armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_MEM); -return; +break; case EXCP_BKPT: if (semihosting_enabled()) { int nr; @@ -6101,9 +6101,8 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs) } } armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_DEBUG); -return; +break; case EXCP_IRQ: -armv7m_nvic_acknowledge_irq(env->nvic); break; case EXCP_EXCEPTION_EXIT: do_v7m_exception_exit(env); @@ -6113,6 +6112,10 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs) return; /* Never happens. Keep compiler happy. */ } +armv7m_nvic_acknowledge_irq(env->nvic); + +qemu_log_mask(CPU_LOG_INT, "... as %d\n", env->v7m.exception); + /* Align stack pointer if the guest wants that */ if ((env->regs[13] & 4) && (env->v7m.ccr & R_V7M_CCR_STKALIGN_MASK)) { env->regs[13] -= 4; -- 2.7.4
[Qemu-devel] [PATCH 0/9] Rewrite NVIC to not depend on the GIC
This patchset is the revamp of the NVIC code from Michael Davidsaver's patchset of a year ago. Despite some superficial similarities of register layout, the M-profile NVIC is really very different from the A-profile GIC. Our current attempt to reuse the GIC code means that we have significant bugs in our NVIC. The series pulls the NVIC apart from the GIC code (fixing a few accidental bugs in the process), and then once it has a place to stand, implements a few minor cleanups, a key bugfix (getting priority calculations and masking right) and a missing feature (escalation to HardFault). I've tried to separate things out into their own patches where I could, but the core 'rewrite NVIC' patch itself is still 592 insertions(+), 144 deletions(-), and there's not much to be done about that since we don't want to break functionality in the process. This patch series doesn't include the "check exception return consistency" changes that Michael's original patch set had, because I need to do more work on those and there's no need to make this series any bigger. I want to pull SysTick out into its own device object, so there are some foundations for that here (mostly that we leave the container memory object in place even though it only has one thing inside it now). For testing, I have used the Stellaris image I have to hand: http://people.linaro.org/~peter.maydell/stellaris.tgz and also a set of bare-metal test programs also written by Michael. You can find my slightly tweaked and cleand up version of those here (a README explains how to run them): https://git.linaro.org/people/peter.maydell/m-profile-tests.git Further testing welcome. thanks -- PMM Michael Davidsaver (5): armv7m: Rewrite NVIC to not use any GIC code arm: gic: Remove references to NVIC armv7m: Escalate exceptions to HardFault if necessary armv7m: Simpler and faster exception start armv7m: VECTCLRACTIVE and VECTRESET are UNPREDICTABLE Peter Maydell (4): armv7m: Rename nvic_state to NVICState armv7m: Implement reading and writing of PRIGROUP armv7m: Fix condition check for taking exceptions armv7m: Remove unused armv7m_nvic_acknowledge_irq() return value hw/intc/gic_internal.h | 7 +- target/arm/cpu.h | 10 +- hw/intc/arm_gic.c| 31 +- hw/intc/arm_gic_common.c | 23 +- hw/intc/armv7m_nvic.c| 843 +-- target/arm/cpu.c | 16 +- target/arm/helper.c | 17 +- hw/intc/trace-events | 15 + 8 files changed, 726 insertions(+), 236 deletions(-) -- 2.7.4
[Qemu-devel] [PATCH 7/9] armv7m: Remove unused armv7m_nvic_acknowledge_irq() return value
Having armv7m_nvic_acknowledge_irq() return the new value of env->v7m.exception and its one caller assign the return value back to env->v7m.exception is pointless. Just make the return type void instead. Signed-off-by: Peter Maydell--- target/arm/cpu.h | 2 +- hw/intc/armv7m_nvic.c | 4 +--- target/arm/helper.c | 2 +- 3 files changed, 3 insertions(+), 5 deletions(-) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index ac20a56..36cccfc 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -1344,7 +1344,7 @@ static inline bool armv7m_nvic_can_take_pending_exception(void *opaque) } #endif void armv7m_nvic_set_pending(void *opaque, int irq); -int armv7m_nvic_acknowledge_irq(void *opaque); +void armv7m_nvic_acknowledge_irq(void *opaque); void armv7m_nvic_complete_irq(void *opaque, int irq); /* Interface for defining coprocessor registers. diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c index 2eaac3d..7b61fe6 100644 --- a/hw/intc/armv7m_nvic.c +++ b/hw/intc/armv7m_nvic.c @@ -394,7 +394,7 @@ void armv7m_nvic_set_pending(void *opaque, int irq) } /* Make pending IRQ active. */ -int armv7m_nvic_acknowledge_irq(void *opaque) +void armv7m_nvic_acknowledge_irq(void *opaque) { NVICState *s = (NVICState *)opaque; CPUARMState *env = >cpu->env; @@ -421,8 +421,6 @@ int armv7m_nvic_acknowledge_irq(void *opaque) env->v7m.exception = s->vectpending; nvic_irq_update(s); - -return env->v7m.exception; } void armv7m_nvic_complete_irq(void *opaque, int irq) diff --git a/target/arm/helper.c b/target/arm/helper.c index 6c86eac..78bf9ab 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -6103,7 +6103,7 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs) armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_DEBUG); return; case EXCP_IRQ: -env->v7m.exception = armv7m_nvic_acknowledge_irq(env->nvic); +armv7m_nvic_acknowledge_irq(env->nvic); break; case EXCP_EXCEPTION_EXIT: do_v7m_exception_exit(env); -- 2.7.4
Re: [Qemu-devel] [PATCH] tci: Remove invalid assertions
On 02/02/2017 01:56 PM, Stefan Weil wrote: > tb_jmp_insn_offset and tb_jmp_reset_offset are pointers > and cannot be used with ARRAY_SIZE. > > Signed-off-by: Stefan Weil> --- > tcg/tci/tcg-target.inc.c | 2 -- > 1 file changed, 2 deletions(-) mst posted an alternative patch: https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00551.html -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
[Qemu-devel] [PATCH] tci: Remove invalid assertions
tb_jmp_insn_offset and tb_jmp_reset_offset are pointers and cannot be used with ARRAY_SIZE. Signed-off-by: Stefan Weil--- tcg/tci/tcg-target.inc.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/tcg/tci/tcg-target.inc.c b/tcg/tci/tcg-target.inc.c index 26ee9b1664..b6a15569f8 100644 --- a/tcg/tci/tcg-target.inc.c +++ b/tcg/tci/tcg-target.inc.c @@ -566,7 +566,6 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, case INDEX_op_goto_tb: if (s->tb_jmp_insn_offset) { /* Direct jump method. */ -tcg_debug_assert(args[0] < ARRAY_SIZE(s->tb_jmp_insn_offset)); /* Align for atomic patching and thread safety */ s->code_ptr = QEMU_ALIGN_PTR_UP(s->code_ptr, 4); s->tb_jmp_insn_offset[args[0]] = tcg_current_code_size(s); @@ -575,7 +574,6 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, /* Indirect jump method. */ TODO(); } -tcg_debug_assert(args[0] < ARRAY_SIZE(s->tb_jmp_reset_offset)); s->tb_jmp_reset_offset[args[0]] = tcg_current_code_size(s); break; case INDEX_op_br: -- 2.11.0
[Qemu-devel] regarding QEMU cloning
sir I'm unable to clone QEMU using this command on my terminal " git clone git://git.qemu-project.org/qemu.git ". The error I'm receiving is " fatal: unable to access 'https://git.qemu-project.org/qemu.git/': Received HTTP code 503 from proxy after CONNECT". I have tried many solutions given on internet but it doesn't work out. Regards Shubham Kumar
[Qemu-devel] Non-flat command line option argument syntax
= Introduction = If you're familiar with prior discussion of non-flat arguments for -object and -blockdev, you can probably skip ahead to "= Structured option argument syntax =". Structured option arguments use KEY=VALUE,... syntax. Goes back many years (at least to commit 7c9d8e0, Nov 2005). Since 2009, the proper way to do this is QemuOpts. QemuOpts can be used in two ways. You can either declare accepted keys and their types, or accept arbitrary keys with string values. The types supported with declared keys are char *, bool, uint64_t. Since none of them is structured, an option argument is basically a flat dictionary. QMP was created a few months after QemuOpts, and later on rebased onto QAPI. Its simple types are char * (JSON string), bool (JSON false, true), {uint,int}{8,16,32,64}_t and double (JSON number). Structured types are dictionary (JSON object) and list (JSON array). A QMP command takes a dictionary as argument. It need not be flat. Fine print: the implementation currently can't read uint64_t values above INT64_MAX from the wire, but that's fixable. We expose a few things both as QMP command and as command line option: netdev_add and -netdev, device_add and -device, chardev-add and -chardev, ... When the QMP command's argument dictionary happens to be flat, translating it to QemuOpts is easy enough, if you're willing to map all integers to uint64_t, and don't need floating-point. However, many argument dictionaries aren't flat, and numeric types other than uint64_t exist for a reason. We need command line option arguments that are just as expressive as QMP command arguments. Moreover, having to translate the argument type from QAPI to QemuOpts is dumb. We actually have a way to use a QAPI type for an option argument without translating it: the options visitor. But it supports basically just the intersection of QemuOpts and QMP. Too limited. We've hacked around the "flatness" of QemuOpts in various ways over the years. We abuse implementation details to express flat lists as repeated keys (e.g. -semihosting-config, -spice, options visitor). We bolted on value syntax for lists of integers in places (options visitor, string visitor). We bolted on value syntax for arbitrary nesting in another place (-drive file=json:...). We bolted on key syntax for arbitrary nesting in yet another place (block layer's dotted key convention). Most recently, Dan even created a fully functional bridge between QemuOpts and QMP types based on the dotted key convention (patches [1], not committed). In my opinion, we need to stop hacking around QemuOpts design limitations, and start replacing it. Naturally, any replacement needs to remain sufficiently backward compatible. This memo is about the replacement's option argument syntax. = Brief recap of dotted key convention = We'll discuss use of dotted key convention later, so let me explain it briefly for the readers who don't know it already. The dotted key convention interprets the KEY part as a sequence of names separated by dots. If a name looks like an integer *handwave*, it's an array index, else it's an object member name. The first name must be an object member name, because the option argument is an object, not an array. Restriction: can't express member names that look like an integer. Example: port=5901 sets the option argument member "port" to the value 5901. Example: foo.0.bar=bla updates the option argument member "foo", whose value is an array. The array's 0-th element is an object with a member "bar". That member's value is set to "bla". The various KEYs need to be consistent in their use of array vs. object. For instance, foo.0.bar=bla,foo.eek.bar=blubb isn't, because it uses the value of member "foo" both as array and as object. = Structured option argument syntax = == JSON == The obvious way to provide the expressiveness of JSON on the command line is JSON. Easy enough[2]. However, besides not being compatible, it's rather heavy on syntax, at least for simple cases. Compare: -machine q35,accel=kvm -machine '{ "type": "q35", "accel": "kvm"}' It compares a bit more favourably in cases that use our non-flat hacks. Here's a flat list as KEY=VALUE,... with repeated keys, and as JSON: -semihosting-config enable,arg=eins,arg=zwei,arg=drei -semihosting-config '{ "enable": true, "arg": [ "eins", "zwei", "drei" ] }' Arbitrary nesting with dotted key convention: -drive driver=qcow2,file.driver=gluster, file.volume=testvol,file.path=/path/a.qcow2,file.debug=9, file.server.0.type=tcp, file.server.0.host=1.2.3.4, file.server.0.port=24007, file.server.1.type=unix, file.server.1.socket=/var/run/glusterd.socket -drive '{ "driver": "qcow2", "file": { "driver": "gluster", "volume": "testvol", "path": "/path/a.qcow2", "debug": 9, "server": [ { "type": "tcp",
Re: [Qemu-devel] KVM/QEMU on Raspberry Pi 3
Hi, On 02/02/2017 19.48, Ard Biesheuvel wrote: $ git grep -C5 -ni 0x1DE7EC7EDBADC0DE arch/arm64/kvm/sys_regs.h-105-static inline void reset_unknown(struct kvm_vcpu *vcpu, arch/arm64/kvm/sys_regs.h-106- const struct sys_reg_desc *r) arch/arm64/kvm/sys_regs.h-107-{ arch/arm64/kvm/sys_regs.h-108- BUG_ON(!r->reg); arch/arm64/kvm/sys_regs.h-109- BUG_ON(r->reg >= NR_SYS_REGS); arch/arm64/kvm/sys_regs.h:110: vcpu_sys_reg(vcpu, r->reg) = 0x1de7ec7edbadc0deULL; arch/arm64/kvm/sys_regs.h-111-} arch/arm64/kvm/sys_regs.h-112- In other words (or rather, in words), KVM is triggering this exception in the guest deliberately, which I suspect has something to do with the lack of a GIC? Are you using these patches Peter mentions? No, I'm not using Peter's patches. It's mainline Fedora 24 and rawhide QEMU. - Pekka
[Qemu-devel] [PATCH] usb: ccid: check ccid apdu length
From: Prasad J PanditCCID device emulator uses Application Protocol Data Units(APDU) to exchange command and responses to and from the host. The length in these units couldn't be greater than 65536. Add check to ensure the same. It'd also avoid potential integer overflow in emulated_apdu_from_guest. Reported-by: Li Qiang Signed-off-by: Prasad J Pandit --- hw/usb/dev-smartcard-reader.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/usb/dev-smartcard-reader.c b/hw/usb/dev-smartcard-reader.c index 89e11b6..1325ea1 100644 --- a/hw/usb/dev-smartcard-reader.c +++ b/hw/usb/dev-smartcard-reader.c @@ -967,7 +967,7 @@ static void ccid_on_apdu_from_guest(USBCCIDState *s, CCID_XferBlock *recv) DPRINTF(s, 1, "%s: seq %d, len %d\n", __func__, recv->hdr.bSeq, len); ccid_add_pending_answer(s, (CCID_Header *)recv); -if (s->card) { +if (s->card && len <= BULK_OUT_DATA_SIZE) { ccid_card_apdu_from_guest(s->card, recv->abData, len); } else { DPRINTF(s, D_WARN, "warning: discarded apdu\n"); -- 2.9.3
Re: [Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt
Just noticed the message above mistakenly sat in my outbox for nearly 2 months. Just flushed it, so do not be surprised by its original date. Sorry for the noise, Emilio
[Qemu-devel] tci build failure (was Re: [PULL v5 00/22] virtio, vhost, pci: fixes, features)
On Thu, Feb 02, 2017 at 04:25:34PM +, Peter Maydell wrote: > On 2 February 2017 at 13:56, Peter Maydellwrote: > > On 31 January 2017 at 20:18, Michael S. Tsirkin wrote: > >> virtio, vhost, pci: fixes, features > >> > >> generic pci root port support > >> disable shpc by default > >> safer version of ARRAY_SIZE and QEMU_BUILD_BUG_ON > >> fixes and cleanups all over the place > >> > >> Signed-off-by: Michael S. Tsirkin > > > Applied, thanks. > > ...travis builds now fail for the --enable-tcg-interpreter config: > https://travis-ci.org/qemu/qemu/jobs/197648661 > > In file included from /home/travis/build/qemu/qemu/tcg/tcg.c:255:0: > /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c: In function > ‘tcg_out_op’: > /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c:569:117: error: > negative width in bit-field ‘’ > /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c:569:255: error: > negative width in bit-field ‘’ > In file included from /home/travis/build/qemu/qemu/tcg/tcg.c:255:0: > /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c:578:115: error: > negative width in bit-field ‘’ > /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c:578:255: error: > negative width in bit-field ‘’ > > These look to be because we were trying to use ARRAY_SIZE() > on a non-array, which was previously undetected. The use is > only in an assert() so fairly harmless. > > Would somebody who cares about TCI like to provide a fix? > > thanks > -- PMM I think the following should do it. Completely untested. --> tcg/tci: fix ARRAY_SIZE misuse tb_jmp_insn_offset and tb_jmp_reset_offset are pointers, not arrays, so using ARRAY_SIZE on them will not do the right thing. They point to arrays within TranslationBlock so check the size of these instead. Signed-off-by: Michael S. Tsirkin -- diff --git a/tcg/tci/tcg-target.inc.c b/tcg/tci/tcg-target.inc.c index 26ee9b1..a2ba654 100644 --- a/tcg/tci/tcg-target.inc.c +++ b/tcg/tci/tcg-target.inc.c @@ -556,6 +556,7 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, const int *const_args) { uint8_t *old_code_ptr = s->code_ptr; +TranslationBlock *tb; tcg_out_op_t(s, opc); @@ -566,7 +567,7 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, case INDEX_op_goto_tb: if (s->tb_jmp_insn_offset) { /* Direct jump method. */ -tcg_debug_assert(args[0] < ARRAY_SIZE(s->tb_jmp_insn_offset)); +tcg_debug_assert(args[0] < ARRAY_SIZE(tb->jmp_insn_offset)); /* Align for atomic patching and thread safety */ s->code_ptr = QEMU_ALIGN_PTR_UP(s->code_ptr, 4); s->tb_jmp_insn_offset[args[0]] = tcg_current_code_size(s); @@ -575,7 +576,7 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, /* Indirect jump method. */ TODO(); } -tcg_debug_assert(args[0] < ARRAY_SIZE(s->tb_jmp_reset_offset)); +tcg_debug_assert(args[0] < ARRAY_SIZE(tb->jmp_reset_offset)); s->tb_jmp_reset_offset[args[0]] = tcg_current_code_size(s); break; case INDEX_op_br:
Re: [Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt
On Tue, Nov 29, 2016 at 12:46:59 +0100, Paolo Bonzini wrote: > A QemuLockCnt comprises a counter and a mutex, with primitives > to increment and decrement the counter, and to take and release the > mutex. It can be used to do lock-free visits to a data structure > whenever mutexes would be too heavy-weight and the critical section > is too long for RCU. > > This could be implemented simply by protecting the counter with the > mutex, but QemuLockCnt is harder to misuse and more efficient. > > Signed-off-by: Paolo Bonzini(snip) > +++ b/docs/lockcnt.txt > @@ -0,0 +1,343 @@ > +DOCUMENTATION FOR LOCKED COUNTERS (aka QemuLockCnt) > +=== (snip) > +bool qemu_lockcnt_dec_if_lock(QemuLockCnt *lockcnt); > + > +If the count is 1, decrement the count to zero, lock > +the mutex and return true. Otherwise, return false. > + (snip) > +++ b/util/lockcnt.c (snip) > +void qemu_lockcnt_init(QemuLockCnt *lockcnt) > +{ > +qemu_mutex_init(>mutex); > +lockcnt->count = 0; Minor nit: a release barrier here wouldn't harm. > +} > + > +void qemu_lockcnt_destroy(QemuLockCnt *lockcnt) > +{ > +qemu_mutex_destroy(>mutex); > +} > + > +void qemu_lockcnt_inc(QemuLockCnt *lockcnt) > +{ > +int old; > +for (;;) { > +old = atomic_read(>count); > +if (old == 0) { > +qemu_lockcnt_lock(lockcnt); > +qemu_lockcnt_inc_and_unlock(lockcnt); > +return; > +} else { > +if (atomic_cmpxchg(>count, old, old + 1) == old) { > +return; > +} > +} > +} > +} > + > +void qemu_lockcnt_dec(QemuLockCnt *lockcnt) > +{ > +atomic_dec(>count); > +} > + > +/* Decrement a counter, and return locked if it is decremented to zero. > + * It is impossible for the counter to become nonzero while the mutex > + * is taken. > + */ > +bool qemu_lockcnt_dec_and_lock(QemuLockCnt *lockcnt) > +{ > +int val = atomic_read(>count); > +while (val > 1) { > +int old = atomic_cmpxchg(>count, val, val - 1); > +if (old != val) { > +val = old; > +continue; > +} > + > +return false; > +} Minor nit: while (val > 1) { int old = cmpxchg(); if (old == val) { return false; } val = old; } seems to me a little easier to read. > +qemu_lockcnt_lock(lockcnt); > +if (atomic_fetch_dec(>count) == 1) { > +return true; > +} > + > +qemu_lockcnt_unlock(lockcnt); > +return false; > +} > + > +/* Decrement a counter and return locked if it is decremented to zero. > + * Otherwise do nothing. This comment doesn't match the code below nor the description in the .txt file (quoted above) [we might not decrement the counter at all!] > + * > + * It is impossible for the counter to become nonzero while the mutex > + * is taken. > + */ > +bool qemu_lockcnt_dec_if_lock(QemuLockCnt *lockcnt) > +{ > +int val = atomic_mb_read(>count); > +if (val > 1) { > +return false; > +} > + > +qemu_lockcnt_lock(lockcnt); > +if (atomic_fetch_dec(>count) == 1) { > +return true; > +} > + > +qemu_lockcnt_inc_and_unlock(lockcnt); The choice of dec+(maybe)inc over cmpxchg seems a little odd to me. Emilio
Re: [Qemu-devel] [PULL v5 00/22] virtio, vhost, pci: fixes, features
Am 02.02.2017 um 17:25 schrieb Peter Maydell: On 2 February 2017 at 13:56, Peter Maydellwrote: On 31 January 2017 at 20:18, Michael S. Tsirkin wrote: virtio, vhost, pci: fixes, features generic pci root port support disable shpc by default safer version of ARRAY_SIZE and QEMU_BUILD_BUG_ON fixes and cleanups all over the place Signed-off-by: Michael S. Tsirkin Applied, thanks. ...travis builds now fail for the --enable-tcg-interpreter config: https://travis-ci.org/qemu/qemu/jobs/197648661 In file included from /home/travis/build/qemu/qemu/tcg/tcg.c:255:0: /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c: In function ‘tcg_out_op’: /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c:569:117: error: negative width in bit-field ‘’ /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c:569:255: error: negative width in bit-field ‘’ In file included from /home/travis/build/qemu/qemu/tcg/tcg.c:255:0: /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c:578:115: error: negative width in bit-field ‘’ /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c:578:255: error: negative width in bit-field ‘’ These look to be because we were trying to use ARRAY_SIZE() on a non-array, which was previously undetected. The use is only in an assert() so fairly harmless. Would somebody who cares about TCI like to provide a fix? thanks -- PMM Other architectures either no longer use an assertion or use tcg_debug_assert(s->tb_jmp_insn_offset != NULL), see tcg/aarch64/tcg-target.inc.c and tcg/ppc/tcg-target.inc.c. As the majority thinks that there is no longer a need for an assertion here, I think that is the best solution for TCI, too. I'll send a patch. Regards Stefan
Re: [Qemu-devel] [PULL 00/10] Tracing patches
On 1 February 2017 at 13:44, Stefan Hajnocziwrote: > The following changes since commit a0def594286d9110a6035e02eef558cf3cf5d847: > > Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into > staging (2017-01-30 10:23:20 +) > > are available in the git repository at: > > git://github.com/stefanha/qemu.git tags/tracing-pull-request > > for you to fetch changes up to 7f4076c1bb16d0d6f81a085ecc9c9d0b9da74c7d: > > trace: clean up trace-events files (2017-01-31 17:12:15 +) > > Applied, thanks. -- PMM
[Qemu-devel] [PULL 2/5] xen-platform: add support for unplugging NVMe disks...
From: Paul Durrant...not just IDE and SCSI. This patch allows the Xen tool-stack to fully support of NVMe as an emulated disk type. See [1] for the relevant tool-stack patch discussion. [1] https://lists.xen.org/archives/html/xen-devel/2017-01/msg01225.html Signed-off-by: Paul Durrant Reviewed-by: Stefano Stabellini --- hw/i386/xen/xen_platform.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c index f50915f..7d41ebb 100644 --- a/hw/i386/xen/xen_platform.c +++ b/hw/i386/xen/xen_platform.c @@ -120,6 +120,7 @@ static void unplug_disks(PCIBus *b, PCIDevice *d, void *o) break; case PCI_CLASS_STORAGE_SCSI: +case PCI_CLASS_STORAGE_EXPRESS: object_unparent(OBJECT(d)); break; -- 1.9.1
[Qemu-devel] [PULL 4/5] MAINTAINERS: Update xen-devel mailing list address
From: Anthony PERARDSigned-off-by: Anthony PERARD Acked-by: Stefano Stabellini --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index a428cb2..baea7c7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -323,7 +323,7 @@ Guest CPU Cores (Xen): X86 M: Stefano Stabellini M: Anthony Perard -L: xen-de...@lists.xensource.com +L: xen-de...@lists.xenproject.org S: Supported F: xen-* F: */xen* -- 1.9.1
[Qemu-devel] [PULL 1/5] xen-platform: re-structure unplug_disks
From: Paul DurrantThe current code is poorly structured and potentially leads to multiple config space reads when one is sufficient. Also the UNPLUG_ALL_IDE_DISKS flag is mis-named since it also results in SCSI disks being unplugged. This patch renames the flag and re-structures the code to be more efficient, and readable. Signed-off-by: Paul Durrant Reviewed-by: Stefano Stabellini --- hw/i386/xen/xen_platform.c | 25 - 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c index 2e1e543..f50915f 100644 --- a/hw/i386/xen/xen_platform.c +++ b/hw/i386/xen/xen_platform.c @@ -88,7 +88,7 @@ static void log_writeb(PCIXenPlatformState *s, char val) } /* Xen Platform, Fixed IOPort */ -#define UNPLUG_ALL_IDE_DISKS 1 +#define UNPLUG_ALL_DISKS 1 #define UNPLUG_ALL_NICS 2 #define UNPLUG_AUX_IDE_DISKS 4 @@ -110,14 +110,21 @@ static void pci_unplug_nics(PCIBus *bus) static void unplug_disks(PCIBus *b, PCIDevice *d, void *o) { /* We have to ignore passthrough devices */ -if (pci_get_word(d->config + PCI_CLASS_DEVICE) == -PCI_CLASS_STORAGE_IDE -&& strcmp(d->name, "xen-pci-passthrough") != 0) { +if (!strcmp(d->name, "xen-pci-passthrough")) { +return; +} + +switch (pci_get_word(d->config + PCI_CLASS_DEVICE)) { +case PCI_CLASS_STORAGE_IDE: pci_piix3_xen_ide_unplug(DEVICE(d)); -} else if (pci_get_word(d->config + PCI_CLASS_DEVICE) == -PCI_CLASS_STORAGE_SCSI -&& strcmp(d->name, "xen-pci-passthrough") != 0) { +break; + +case PCI_CLASS_STORAGE_SCSI: object_unparent(OBJECT(d)); +break; + +default: +break; } } @@ -134,9 +141,9 @@ static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t v case 0: { PCIDevice *pci_dev = PCI_DEVICE(s); /* Unplug devices. Value is a bitmask of which devices to - unplug, with bit 0 the IDE devices, bit 1 the network + unplug, with bit 0 the disk devices, bit 1 the network devices, and bit 2 the non-primary-master IDE devices. */ -if (val & UNPLUG_ALL_IDE_DISKS) { +if (val & UNPLUG_ALL_DISKS) { DPRINTF("unplug disks\n"); pci_unplug_disks(pci_dev->bus); } -- 1.9.1
[Qemu-devel] [PULL 5/5] xen: use qdev_unplug() instead of g_free() in xen_pv_find_xendev()
From: Juergen GrossThe error exits of xen_pv_find_xendev() free the new xen-device via g_free() which is wrong. As the xen-device has been initialized as qdev it must be removed via qdev_unplug(). This bug has been introduced with commit 3a6c9172ac5951e6dac2b3f6 ("xen: create qdev for each backend device"). Reported-by: Roger Pau Monné Tested-by: Roger Pau Monné Signed-off-by: Juergen Gross Reviewed-by: Stefano Stabellini --- hw/xen/xen_backend.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c index d119004..6c21c37 100644 --- a/hw/xen/xen_backend.c +++ b/hw/xen/xen_backend.c @@ -124,10 +124,11 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev, /* init new xendev */ xendev = g_malloc0(ops->size); object_initialize(>qdev, ops->size, TYPE_XENBACKEND); -qdev_set_parent_bus(>qdev, xen_sysbus); -qdev_set_id(>qdev, g_strdup_printf("xen-%s-%d", type, dev)); -qdev_init_nofail(>qdev); -object_unref(OBJECT(>qdev)); +OBJECT(xendev)->free = g_free; +qdev_set_parent_bus(DEVICE(xendev), xen_sysbus); +qdev_set_id(DEVICE(xendev), g_strdup_printf("xen-%s-%d", type, dev)); +qdev_init_nofail(DEVICE(xendev)); +object_unref(OBJECT(xendev)); xendev->type = type; xendev->dom = dom; @@ -145,7 +146,7 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev, xendev->evtchndev = xenevtchn_open(NULL, 0); if (xendev->evtchndev == NULL) { xen_pv_printf(NULL, 0, "can't open evtchn device\n"); -g_free(xendev); +qdev_unplug(DEVICE(xendev), NULL); return NULL; } fcntl(xenevtchn_fd(xendev->evtchndev), F_SETFD, FD_CLOEXEC); @@ -155,7 +156,7 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev, if (xendev->gnttabdev == NULL) { xen_pv_printf(NULL, 0, "can't open gnttab device\n"); xenevtchn_close(xendev->evtchndev); -g_free(xendev); +qdev_unplug(DEVICE(xendev), NULL); return NULL; } } else { -- 1.9.1
[Qemu-devel] [PULL 3/5] xen-platform: add missing disk unplug option
From: Paul DurrantThe Xen HVM unplug protocol [1] specifies a mechanism to allow guests to request unplug of 'aux' disks (which is stated to mean all IDE disks, except the primary master). This patch adds support for that unplug request. NOTE: The semantics of what happens if unplug of all disks and 'aux' disks is simultaneously requests is not clear. The patch makes that assumption that an 'all' request overrides an 'aux' request. [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/misc/hvm-emulated-unplug.markdown Signed-off-by: Paul Durrant Reviewed-by: Stefano Stabellini Cc: Stefano Stabellini Cc: Anthony Perard Cc: Paolo Bonzini Cc: Richard Henderson Cc: Eduardo Habkost Cc: "Michael S. Tsirkin" Cc: John Snow --- hw/i386/xen/xen_platform.c | 27 +++ hw/ide/piix.c | 4 ++-- include/hw/ide.h | 2 +- 3 files changed, 18 insertions(+), 15 deletions(-) diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c index 7d41ebb..6010f35 100644 --- a/hw/i386/xen/xen_platform.c +++ b/hw/i386/xen/xen_platform.c @@ -107,8 +107,12 @@ static void pci_unplug_nics(PCIBus *bus) pci_for_each_device(bus, 0, unplug_nic, NULL); } -static void unplug_disks(PCIBus *b, PCIDevice *d, void *o) +static void unplug_disks(PCIBus *b, PCIDevice *d, void *opaque) { +uint32_t flags = *(uint32_t *)opaque; +bool aux = (flags & UNPLUG_AUX_IDE_DISKS) && +!(flags & UNPLUG_ALL_DISKS); + /* We have to ignore passthrough devices */ if (!strcmp(d->name, "xen-pci-passthrough")) { return; @@ -116,12 +120,14 @@ static void unplug_disks(PCIBus *b, PCIDevice *d, void *o) switch (pci_get_word(d->config + PCI_CLASS_DEVICE)) { case PCI_CLASS_STORAGE_IDE: -pci_piix3_xen_ide_unplug(DEVICE(d)); +pci_piix3_xen_ide_unplug(DEVICE(d), aux); break; case PCI_CLASS_STORAGE_SCSI: case PCI_CLASS_STORAGE_EXPRESS: -object_unparent(OBJECT(d)); +if (!aux) { +object_unparent(OBJECT(d)); +} break; default: @@ -129,9 +135,9 @@ static void unplug_disks(PCIBus *b, PCIDevice *d, void *o) } } -static void pci_unplug_disks(PCIBus *bus) +static void pci_unplug_disks(PCIBus *bus, uint32_t flags) { -pci_for_each_device(bus, 0, unplug_disks, NULL); +pci_for_each_device(bus, 0, unplug_disks, ); } static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t val) @@ -144,17 +150,14 @@ static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t v /* Unplug devices. Value is a bitmask of which devices to unplug, with bit 0 the disk devices, bit 1 the network devices, and bit 2 the non-primary-master IDE devices. */ -if (val & UNPLUG_ALL_DISKS) { +if (val & (UNPLUG_ALL_DISKS | UNPLUG_AUX_IDE_DISKS)) { DPRINTF("unplug disks\n"); -pci_unplug_disks(pci_dev->bus); +pci_unplug_disks(pci_dev->bus, val); } if (val & UNPLUG_ALL_NICS) { DPRINTF("unplug nics\n"); pci_unplug_nics(pci_dev->bus); } -if (val & UNPLUG_AUX_IDE_DISKS) { -DPRINTF("unplug auxiliary disks not supported\n"); -} break; } case 2: @@ -335,14 +338,14 @@ static void xen_platform_ioport_writeb(void *opaque, hwaddr addr, * If VMDP was to control both disk and LAN it would use 4. * If it controlled just disk or just LAN, it would use 8 below. */ -pci_unplug_disks(pci_dev->bus); +pci_unplug_disks(pci_dev->bus, UNPLUG_ALL_DISKS); pci_unplug_nics(pci_dev->bus); } break; case 8: switch (val) { case 1: -pci_unplug_disks(pci_dev->bus); +pci_unplug_disks(pci_dev->bus, UNPLUG_ALL_DISKS); break; case 2: pci_unplug_nics(pci_dev->bus); diff --git a/hw/ide/piix.c b/hw/ide/piix.c index d5777fd..7e2d767 100644 --- a/hw/ide/piix.c +++ b/hw/ide/piix.c @@ -165,7 +165,7 @@ static void pci_piix_ide_realize(PCIDevice *dev, Error **errp) pci_piix_init_ports(d); } -int pci_piix3_xen_ide_unplug(DeviceState *dev) +int pci_piix3_xen_ide_unplug(DeviceState *dev, bool aux) { PCIIDEState *pci_ide; DriveInfo *di; @@ -174,7 +174,7 @@ int pci_piix3_xen_ide_unplug(DeviceState *dev) pci_ide = PCI_IDE(dev); -for (i = 0; i < 4; i++) { +for (i = aux ? 1 : 0; i < 4; i++) { di = drive_get_by_index(IF_IDE, i); if (di != NULL && !di->media_cd) { BlockBackend *blk = blk_by_legacy_dinfo(di); diff --git a/include/hw/ide.h
[Qemu-devel] [PULL 0/5] xen-20170202
The following changes since commit 3aca12f841fcd6f3a7477076dad0d564360500de: Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170127' into staging (2017-01-27 16:59:17 +) are available in the git repository at: git://xenbits.xen.org/people/sstabellini/qemu-dm.git tags/xen-20170202 for you to fetch changes up to e9dcbc86d614018923e26e31319b0a54c9e5abac: xen: use qdev_unplug() instead of g_free() in xen_pv_find_xendev() (2017-02-02 10:23:53 -0800) Xen 2017/02/02 Anthony PERARD (1): MAINTAINERS: Update xen-devel mailing list address Juergen Gross (1): xen: use qdev_unplug() instead of g_free() in xen_pv_find_xendev() Paul Durrant (3): xen-platform: re-structure unplug_disks xen-platform: add support for unplugging NVMe disks... xen-platform: add missing disk unplug option MAINTAINERS| 2 +- hw/i386/xen/xen_platform.c | 51 -- hw/ide/piix.c | 4 ++-- hw/xen/xen_backend.c | 13 ++-- include/hw/ide.h | 2 +- 5 files changed, 42 insertions(+), 30 deletions(-)
Re: [Qemu-devel] [PATCH v2] xen: use qdev_unplug() instead of g_free() in xen_pv_find_xendev()
On Thu, 2 Feb 2017, Juergen Gross wrote: > On 01/02/17 21:20, Peter Maydell wrote: > > On 1 February 2017 at 19:37, Stefano Stabellini> > wrote: > >> Hi Peter, > >> > >> do you think this is acceptable? > > > > The set of operations here is basically what I suggested > > in review of v1, so I think it is the right thing. > > OTOH this is a bit of an odd corner of the QOM model > > so it might be worth doing some testing to make sure > > the reference counts are doing what you (I) expect and > > that the object does get correctly freed both in the > > error-handling path here and when the device is > > unplugged via xen_pv_del_xendev(). > > I've used my_g_free() printing a log message when called instead of > g_free() in a test. I could verify it has been called when the > device was unplugged. This test covered xen_pv_del_xendev() and > an error handling path. Thanks Juergen for testing. I'll commit shortly.
[Qemu-devel] [patch 1/2] kvm: sync linux headers
Sync linux headers. Signed-off-by: Marcelo Tosatti--- linux-headers/asm-x86/kvm.h |5 + linux-headers/asm-x86/kvm_para.h | 13 - linux-headers/linux/kvm.h|8 ++-- linux-headers/linux/kvm_para.h |7 +++ 4 files changed, 30 insertions(+), 3 deletions(-) Index: qemu/linux-headers/asm-x86/kvm.h === --- qemu.orig/linux-headers/asm-x86/kvm.h 2016-12-29 15:45:22.415325241 -0200 +++ qemu/linux-headers/asm-x86/kvm.h2017-01-31 09:47:45.740645314 -0200 @@ -357,4 +357,9 @@ #define KVM_X86_QUIRK_LINT0_REENABLED (1 << 0) #define KVM_X86_QUIRK_CD_NW_CLEARED(1 << 1) +struct kvm_vcpu_allow_freq { + __u16 enable; + __u16 pad[7]; +}; + #endif /* _ASM_X86_KVM_H */ Index: qemu/linux-headers/asm-x86/kvm_para.h === --- qemu.orig/linux-headers/asm-x86/kvm_para.h 2016-12-29 15:44:51.281263648 -0200 +++ qemu/linux-headers/asm-x86/kvm_para.h 2017-01-31 09:47:45.740645314 -0200 @@ -45,7 +45,18 @@ __u64 steal; __u32 version; __u32 flags; - __u32 pad[12]; + __u8 preempted; + __u8 u8_pad[3]; + __u32 pad[11]; +}; + +#define KVM_CLOCK_PAIRING_WALLCLOCK 0 +struct kvm_clock_pairing { + __s64 sec; + __s64 nsec; + __u64 tsc; + __u32 flags; + __u32 pad[9]; }; #define KVM_STEAL_ALIGNMENT_BITS 5 Index: qemu/linux-headers/linux/kvm.h === --- qemu.orig/linux-headers/linux/kvm.h 2016-12-29 15:45:49.572379128 -0200 +++ qemu/linux-headers/linux/kvm.h 2017-01-31 09:47:45.740645314 -0200 @@ -651,6 +651,9 @@ }; /* for KVM_PPC_GET_PVINFO */ + +#define KVM_PPC_PVINFO_FLAGS_EV_IDLE (1<<0) + struct kvm_ppc_pvinfo { /* out */ __u32 flags; @@ -682,8 +685,6 @@ struct kvm_ppc_one_seg_page_size sps[KVM_PPC_PAGE_SIZES_MAX_SZ]; }; -#define KVM_PPC_PVINFO_FLAGS_EV_IDLE (1<<0) - #define KVMIO 0xAE /* machine type bits, to be used as argument to KVM_CREATE_VM */ @@ -870,6 +871,7 @@ #define KVM_CAP_S390_USER_INSTR0 130 #define KVM_CAP_MSI_DEVID 131 #define KVM_CAP_PPC_HTM 132 +#define KVM_CAP_ALLOW_FREQ_HC 133 #ifdef KVM_CAP_IRQ_ROUTING @@ -1280,6 +1282,8 @@ #define KVM_S390_GET_IRQ_STATE _IOW(KVMIO, 0xb6, struct kvm_s390_irq_state) /* Available with KVM_CAP_X86_SMM */ #define KVM_SMI _IO(KVMIO, 0xb7) +#define KVM_SET_VCPU_ALLOW_FREQ_HC _IO(KVMIO, 0xb8) +#define KVM_GET_VCPU_ALLOW_FREQ_HC _IO(KVMIO, 0xb9) #define KVM_DEV_ASSIGN_ENABLE_IOMMU(1 << 0) #define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1) Index: qemu/linux-headers/linux/kvm_para.h === --- qemu.orig/linux-headers/linux/kvm_para.h2016-12-29 15:45:22.416325243 -0200 +++ qemu/linux-headers/linux/kvm_para.h 2017-01-31 09:47:45.741645316 -0200 @@ -14,6 +14,7 @@ #define KVM_EFAULT EFAULT #define KVM_E2BIG E2BIG #define KVM_EPERM EPERM +#define KVM_EOPNOTSUPP 95 #define KVM_HC_VAPIC_POLL_IRQ 1 #define KVM_HC_MMU_OP 2 @@ -23,6 +24,12 @@ #define KVM_HC_MIPS_GET_CLOCK_FREQ 6 #define KVM_HC_MIPS_EXIT_VM7 #define KVM_HC_MIPS_CONSOLE_OUTPUT 8 +#define KVM_HC_CLOCK_PAIRING 9 +#define KVM_HC_FREQ_UP 10 +#define KVM_HC_FREQ_DOWN 11 +#define KVM_HC_FREQ_MAX12 +#define KVM_HC_FREQ_MIN13 + /* * hypercalls use architecture specific
[Qemu-devel] [patch 2/2] kvm: introduce cpu flag to enable cpu frequency changes via hypercall
Guests with DPDK, whose vcpus run isolated on physical CPUs, want to control the frequency of such physical CPUs. Introduce a allow-freq-hc CPU flag to enable such hypercalls. Signed-off-by: Marcelo Tosatti--- target/i386/cpu.c |1 + target/i386/cpu.h |6 ++ target/i386/kvm.c | 29 + target/i386/kvm_i386.h |1 + 4 files changed, 37 insertions(+) Index: qemu/target/i386/cpu.c === --- qemu.orig/target/i386/cpu.c 2017-01-31 09:18:52.944948296 -0200 +++ qemu/target/i386/cpu.c 2017-01-31 09:47:58.050671578 -0200 @@ -3659,6 +3659,7 @@ DEFINE_PROP_BOOL("lmce", X86CPU, enable_lmce, false), DEFINE_PROP_BOOL("l3-cache", X86CPU, enable_l3_cache, true), DEFINE_PROP_BOOL("vmware-cpuid-freq", X86CPU, vmware_cpuid_freq, true), +DEFINE_PROP_BOOL("allow-freq-hc", X86CPU, allow_freq_hc, false), DEFINE_PROP_END_OF_LIST() }; Index: qemu/target/i386/cpu.h === --- qemu.orig/target/i386/cpu.h 2017-01-31 09:18:52.944948296 -0200 +++ qemu/target/i386/cpu.h 2017-01-31 09:47:58.051671580 -0200 @@ -1243,6 +1243,12 @@ */ bool enable_l3_cache; +/* Direct frequency hypercalls from guest userspace can be + * enabled/disabled via cpu option 'allow_freq_hc=on/off'. + * It is disabled by default. + */ +bool allow_freq_hc; + /* Compatibility bits for old machine types: */ bool enable_cpuid_0xb; Index: qemu/target/i386/kvm.c === --- qemu.orig/target/i386/kvm.c 2017-01-31 09:18:52.944948296 -0200 +++ qemu/target/i386/kvm.c 2017-01-31 10:05:01.546855265 -0200 @@ -163,6 +163,11 @@ has_x2apic_api); } +bool kvm_has_allow_freq_hc(void) +{ +return kvm_check_extension(kvm_state, KVM_CAP_ALLOW_FREQ_HC); +} + static int kvm_get_tsc(CPUState *cs) { X86CPU *cpu = X86_CPU(cs); @@ -692,6 +697,15 @@ return 0; } +static int kvm_set_vcpu_allow_freq(X86CPU *cpu) +{ +struct kvm_vcpu_allow_freq fr; + +fr.enable = 1; + +return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_ALLOW_FREQ_HC, ); +} + static Error *invtsc_mig_blocker; #define KVM_MAX_CPUID_ENTRIES 100 @@ -1040,6 +1054,21 @@ has_msr_tsc_aux = false; } +if (cpu->allow_freq_hc) { +int ret; + +if (!kvm_has_allow_freq_hc()) { +error_report("kvm: allow freq hypercall not supported"); +return -ENOTSUP; +} + +ret = kvm_set_vcpu_allow_freq(cpu); +if (ret) { +error_report("kvm: kvm_set_allow_freq failure, ret=%d", ret); +return ret; +} +} + return 0; fail: Index: qemu/target/i386/kvm_i386.h === --- qemu.orig/target/i386/kvm_i386.h2017-01-31 09:18:52.944948296 -0200 +++ qemu/target/i386/kvm_i386.h 2017-01-31 09:47:58.052671582 -0200 @@ -21,6 +21,7 @@ void kvm_synchronize_all_tsc(void); void kvm_arch_reset_vcpu(X86CPU *cs); void kvm_arch_do_init_vcpu(X86CPU *cs); +bool kvm_has_allow_freq_hc(void); int kvm_device_pci_assign(KVMState *s, PCIHostDeviceAddress *dev_addr, uint32_t flags, uint32_t *dev_id);
[Qemu-devel] [patch 0/2] qemu support for kvm cpu freq hypercalls
Add a new cpu flag to allow vCPUs to perform frequency change hypercalls from userspace. See the kernel patchseries for more details.
Re: [Qemu-devel] KVM/QEMU on Raspberry Pi 3
On 02/02/17 18:48, Ard Biesheuvel wrote: >>> ESR 0x0200 FAR 0x1DE7EC7EDBADC0DE >>> > > > $ git grep -C5 -ni 0x1DE7EC7EDBADC0DE > arch/arm64/kvm/sys_regs.h-105-static inline void reset_unknown(struct > kvm_vcpu *vcpu, > arch/arm64/kvm/sys_regs.h-106- const struct > sys_reg_desc *r) > arch/arm64/kvm/sys_regs.h-107-{ > arch/arm64/kvm/sys_regs.h-108- BUG_ON(!r->reg); > arch/arm64/kvm/sys_regs.h-109- BUG_ON(r->reg >= NR_SYS_REGS); > arch/arm64/kvm/sys_regs.h:110: vcpu_sys_reg(vcpu, r->reg) = > 0x1de7ec7edbadc0deULL; > arch/arm64/kvm/sys_regs.h-111-} > arch/arm64/kvm/sys_regs.h-112- > > In other words (or rather, in words), KVM is triggering this exception > in the guest deliberately, which I suspect has something to do with > the lack of a GIC? Are you using these patches Peter mentions? "detected bad code". Mind = blown.
Re: [Qemu-devel] KVM/QEMU on Raspberry Pi 3
On 2 February 2017 at 15:50, Laszlo Ersekwrote: > Adding Ard, just in case... > > I have one (half-)comment re: GICv3: > > On 02/02/17 15:44, Pekka Enberg wrote: >> Hi, >> >> Has anyone been able to successfully run QEMU/KVM under Raspberry Pi 3? >> >> I have installed 64-bit Fedora 24 by Gerd Hoffmann on the hardware: >> >> https://www.kraxel.org/blog/2016/04/fedora-on-raspberry-pi-updates/ >> >> and built a VM image using virt-builder: >> >> virt-builder --root-password password:root --arch aarch64 fedora-24 >> >> I also built the latest UEFI for QEMU from sources: >> >> https://wiki.linaro.org/LEG/UEFIforQEMU >> >> and updated to QEMU 2.8.0 from rawhide: >> >> [root@fedora-rpi2 ~]# qemu-system-aarch64 -version >> QEMU emulator version 2.8.0(qemu-2.8.0-1.fc26) >> Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers >> >> The VM image should be fine because I’m able to boot to it under CPU >> emulation: >> >> qemu-system-aarch64 \ >> -nographic \ >> -M virt \ >> -cpu cortex-a57 \ >> -smp 1 \ >> -m 512 \ >> -bios QEMU_EFI.fd \ >> -device virtio-blk-device,drive=image -drive >> if=none,id=image,file=fedora-24.img \ >> -netdev bridge,id=hn0,br=virbr0 -device >> virtio-net-pci,netdev=hn0,romfile= \ >> -device virtio-rng-pci >> [..] >> I also tried to enable GIC v3 by adding the “-machine gic-version=3” >> command one option but the UEFI firmware doesn’t like that: >> >> Found GIC v3 (re)distributor @ 0x800 (0x80A) >> >> >> Synchronous Exception at 0x5BD5B820 >> PC 0x5BD5B820 (0x5BD58000+0x3820) [ 0] ArmGicDxe.dll >> PC 0x5BD5BC38 (0x5BD58000+0x3C38) [ 0] ArmGicDxe.dll >> PC 0x5BD593B0 (0x5BD58000+0x13B0) [ 0] ArmGicDxe.dll >> PC 0x5BD590A0 (0x5BD58000+0x10A0) [ 0] ArmGicDxe.dll >> PC 0x5EF1ADF4 (0x5EF14000+0x6DF4) [ 1] DxeCore.dll >> PC 0x5EF32B0C (0x5EF14000+0x0001EB0C) [ 1] DxeCore.dll >> PC 0x5EF165E4 (0x5EF14000+0x25E4) [ 1] DxeCore.dll >> PC 0x5EF15828 (0x5EF14000+0x1828) [ 1] DxeCore.dll >> PC 0x5EF15024 (0x5EF14000+0x1024) [ 1] DxeCore.dll >> >> [ 0] >> /home/penberg/raspberrypi/uefi/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/ArmPkg/Drivers/ArmGic/ArmGicDxe/DEBUG/ArmGicDxe.dll >> >> [ 1] >> /home/penberg/raspberrypi/uefi/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll >> >> >> X0 0x0036 X1 0x0004 X2 0x0036 >> X3 0x >> X4 0x0001 X5 0x X6 0x0A01191513061C12 >> X7 0x121C06131519010A >> X8 0x041ECB83 X9 0x0007 X10 0x58B6 >> X11 0x0004 >> X12 0x0001 X13 0x0008 X14 0x >> X15 0x >> X16 0x5EF13DF0 X17 0x X18 0x >> X19 0x4007C268 >> X20 0x X21 0x X22 0x >> X23 0x >> X24 0x X25 0x X26 0x >> X27 0x >> X28 0x FP 0x5EF13D20 LR 0x5BD5BC38 >> >> V0 0xAFAFAFAFAFAFAFAF AFAFAFAFAFAFAFAF V1 0x >> >> V2 0x V3 0x >> >> V4 0x V5 0x >> >> V6 0x V7 0x >> >> V8 0x V9 0x >> >> V10 0x V11 0x >> >> V12 0x V13 0x >> >> V14 0x V15 0x >> >> V16 0x V17 0x >> >> V18 0x V19 0x >> >> V20 0x V21 0x >> >> V22 0x V23 0x >> >> V24 0x V25 0x >> >> V26 0x V27 0x >> >> V28 0x V29 0x >> >> V30 0x V31 0x >> >> >> SP 0x5EF13D20 ELR 0x5BD5B820 SPSR 0x8205 FPSR >> 0x >> ESR 0x0200 FAR 0x1DE7EC7EDBADC0DE >> $ git grep -C5 -ni 0x1DE7EC7EDBADC0DE arch/arm64/kvm/sys_regs.h-105-static inline void reset_unknown(struct
Re: [Qemu-devel] [RFC][PATCH] qemu-img: make convert async
Hi, Your series failed automatic build test. Please find the testing commands and their output below. If you have docker installed, you can probably reproduce it locally. Type: series Subject: [Qemu-devel] [RFC][PATCH] qemu-img: make convert async Message-id: 1486051604-32310-1-git-send-email...@kamp.de === TEST SCRIPT BEGIN === #!/bin/bash set -e git submodule update --init dtc # Let docker tests dump environment info export SHOW_ENV=1 export J=16 make docker-test-quick@centos6 make docker-test-mingw@fedora make docker-test-build@min-glib === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 Switched to a new branch 'test' 3a167eb qemu-img: make convert async === OUTPUT BEGIN === Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc' Cloning into 'dtc'... Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf' BUILD centos6 make[1]: Entering directory `/var/tmp/patchew-tester-tmp-oelmqzb0/src' ARCHIVE qemu.tgz ARCHIVE dtc.tgz COPYRUNNER RUN test-quick in qemu:centos6 Packages installed: SDL-devel-1.2.14-7.el6_7.1.x86_64 ccache-3.1.6-2.el6.x86_64 epel-release-6-8.noarch gcc-4.4.7-17.el6.x86_64 git-1.7.1-4.el6_7.1.x86_64 glib2-devel-2.28.8-5.el6.x86_64 libfdt-devel-1.4.0-1.el6.x86_64 make-3.81-23.el6.x86_64 package g++ is not installed pixman-devel-0.32.8-1.el6.x86_64 tar-1.23-15.el6_8.x86_64 zlib-devel-1.2.3-29.el6.x86_64 Environment variables: PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel glib2-devel SDL-devel pixman-devel epel-release HOSTNAME=66ea1b2e6c36 TERM=xterm MAKEFLAGS= -j16 HISTSIZE=1000 J=16 USER=root CCACHE_DIR=/var/tmp/ccache EXTRA_CONFIGURE_OPTS= V= SHOW_ENV=1 MAIL=/var/spool/mail/root PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin PWD=/ LANG=en_US.UTF-8 TARGET_LIST= HISTCONTROL=ignoredups SHLVL=1 HOME=/root TEST_DIR=/tmp/qemu-test LOGNAME=root LESSOPEN=||/usr/bin/lesspipe.sh %s FEATURES= dtc DEBUG= G_BROKEN_FILENAMES=1 CCACHE_HASHDIR= _=/usr/bin/env Configure options: --enable-werror --target-list=x86_64-softmmu,aarch64-softmmu --prefix=/var/tmp/qemu-build/install No C++ compiler available; disabling C++ specific optional code Install prefix/var/tmp/qemu-build/install BIOS directory/var/tmp/qemu-build/install/share/qemu binary directory /var/tmp/qemu-build/install/bin library directory /var/tmp/qemu-build/install/lib module directory /var/tmp/qemu-build/install/lib/qemu libexec directory /var/tmp/qemu-build/install/libexec include directory /var/tmp/qemu-build/install/include config directory /var/tmp/qemu-build/install/etc local state directory /var/tmp/qemu-build/install/var Manual directory /var/tmp/qemu-build/install/share/man ELF interp prefix /usr/gnemul/qemu-%M Source path /tmp/qemu-test/src C compilercc Host C compiler cc C++ compiler Objective-C compiler cc ARFLAGS rv CFLAGS-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g QEMU_CFLAGS -I/usr/include/pixman-1-pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wendif-labels -Wmissing-include-dirs -Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wold-style-declaration -Wold-style-definition -Wtype-limits -fstack-protector-all LDFLAGS -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g make make install install pythonpython -B smbd /usr/sbin/smbd module supportno host CPU x86_64 host big endian no target list x86_64-softmmu aarch64-softmmu tcg debug enabled no gprof enabled no sparse enabledno strip binariesyes profiler no static build no pixmansystem SDL support yes (1.2.14) GTK support no GTK GL supportno VTE support no TLS priority NORMAL GNUTLS supportno GNUTLS rndno libgcrypt no libgcrypt kdf no nettleno nettle kdfno libtasn1 no curses supportno virgl support no curl support no mingw32 support no Audio drivers oss Block whitelist (rw) Block whitelist (ro) VirtFS supportno VNC support yes VNC SASL support no VNC JPEG support no VNC PNG support no xen support no brlapi supportno bluez supportno Documentation no PIE yes vde support no netmap supportno Linux AIO support no ATTR/XATTR support yes Install blobs yes KVM support yes HAX support no RDMA support no TCG interpreter no fdt support yes preadv supportyes fdatasync yes madvise yes posix_madvise yes libcap-ng support no vhost-net support yes
Re: [Qemu-devel] [PATCH v2] qemu-nbd: Implement socket activation.
On Thu, Feb 02, 2017 at 05:16:25PM +, Richard W.M. Jones wrote: > Socket activation (sometimes known as systemd socket activation) > allows an Internet superserver to pass a pre-opened listening socket > to the process, instead of having qemu-nbd open a socket itself. This > is done via the LISTEN_FDS and LISTEN_PID environment variables, and a > standard file descriptor range. > > This change partially implements socket activation for qemu-nbd. If > the environment variables are set correctly, then socket activation > will happen automatically, otherwise everything works as before. The > limitation is that LISTEN_FDS must be 1. > > Signed-off-by: Richard W.M. Jones. > --- > qemu-nbd.c | 172 > + > 1 file changed, 163 insertions(+), 9 deletions(-) Reviewed-by: Daniel P. BerrangeRegards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o-http://search.cpan.org/~danberr/ :|
[Qemu-devel] [PATCH v2] qemu-nbd: Implement socket activation.
Socket activation (sometimes known as systemd socket activation) allows an Internet superserver to pass a pre-opened listening socket to the process, instead of having qemu-nbd open a socket itself. This is done via the LISTEN_FDS and LISTEN_PID environment variables, and a standard file descriptor range. This change partially implements socket activation for qemu-nbd. If the environment variables are set correctly, then socket activation will happen automatically, otherwise everything works as before. The limitation is that LISTEN_FDS must be 1. Signed-off-by: Richard W.M. Jones. --- qemu-nbd.c | 172 + 1 file changed, 163 insertions(+), 9 deletions(-) diff --git a/qemu-nbd.c b/qemu-nbd.c index c734f62..b3088d0 100644 --- a/qemu-nbd.c +++ b/qemu-nbd.c @@ -463,6 +463,135 @@ static QCryptoTLSCreds *nbd_get_tls_creds(const char *id, Error **errp) return creds; } +static void setup_address_and_port(const char **address, const char **port) +{ +if (*address == NULL) { +*address = "0.0.0.0"; +} + +if (*port == NULL) { +*port = g_strdup_printf("%d", NBD_DEFAULT_PORT);; +} +} + +#define FIRST_SOCKET_ACTIVATION_FD 3 /* defined by systemd ABI */ + +#ifndef _WIN32 +/* + * Check if socket activation was requested via use of the + * LISTEN_FDS and LISTEN_PID environment variables. + * + * Returns 0 if no socket activation, or the number of FDs. + */ +static unsigned int check_socket_activation(void) +{ +const char *s; +unsigned long pid; +unsigned long nr_fds; +unsigned int i; +int fd; +int err; + +s = getenv("LISTEN_PID"); +if (s == NULL) { +return 0; +} +err = qemu_strtoul(s, NULL, 10, ); +if (err) { +if (verbose) { +fprintf(stderr, "malformed %s environment variable (ignored)\n", +"LISTEN_PID"); +} +return 0; +} +if (pid != getpid()) { +if (verbose) { +fprintf(stderr, "%s was not for us (ignored)\n", +"LISTEN_PID"); +} +return 0; +} + +s = getenv("LISTEN_FDS"); +if (s == NULL) { +return 0; +} +err = qemu_strtoul(s, NULL, 10, _fds); +if (err) { +if (verbose) { +fprintf(stderr, "malformed %s environment variable (ignored)\n", +"LISTEN_FDS"); +} +return 0; +} +assert(nr_fds <= UINT_MAX); + +/* A limitation of current qemu-nbd is that it can only listen on + * a single socket. When that limitation is lifted, we can change + * this function to allow LISTEN_FDS > 1, and remove the assertion + * in the main function below. + */ +if (nr_fds > 1) { +error_report("qemu-nbd does not support socket activation with %s > 1", + "LISTEN_FDS"); +exit(EXIT_FAILURE); +} + +/* So these are not passed to any child processes we might start. */ +unsetenv("LISTEN_FDS"); +unsetenv("LISTEN_PID"); + +/* So the file descriptors don't leak into child processes. */ +for (i = 0; i < nr_fds; ++i) { +fd = FIRST_SOCKET_ACTIVATION_FD + i; +if (fcntl(fd, F_SETFD, FD_CLOEXEC) == -1) { +/* If we cannot set FD_CLOEXEC then it probably means the file + * descriptor is invalid, so socket activation has gone wrong + * and we should exit. + */ +error_report("Socket activation failed: " + "invalid file descriptor fd = %d: %m", + fd); +exit(EXIT_FAILURE); +} +} + +return (unsigned int) nr_fds; +} + +#else /* !_WIN32 */ +static unsigned int check_socket_activation(void) +{ +return 0; +} +#endif + +/* + * Check socket parameters compatibility when socket activation is used. + */ +static const char *socket_activation_validate_opts(const char *device, + const char *sockpath, + const char *address, + const char *port) +{ +if (device != NULL) { +return "NBD device can't be set when using socket activation"; +} + +if (sockpath != NULL) { +return "Unix socket can't be set when using socket activation"; +} + +if (address != NULL) { +return "The interface can't be set when using socket activation"; +} + +if (port != NULL) { +return "TCP port number can't be set when using socket activation"; +} + +return NULL; +} int main(int argc, char **argv) { @@ -471,7 +600,7 @@ int main(int argc, char **argv) off_t dev_offset = 0; uint16_t nbdflags = 0; bool disconnect = false; -const char *bindto = "0.0.0.0"; +const char *bindto = NULL; const char *port = NULL; char *sockpath = NULL; char *device = NULL; @@ -533,6
[Qemu-devel] [PATCH v2] qemu-nbd: Implement socket activation.
v2: - A few small fixed identified by Dan Berrange. The original cover letter is below. Rich. Socket activation (sometimes known as systemd socket activation) allows an Internet superserver to pass a pre-opened listening socket to the process, instead of having qemu-nbd open a socket itself. This is done via the LISTEN_FDS and LISTEN_PID environment variables, and a standard file descriptor range. This patch partially implements socket activation. The limitation of this implementation is that qemu-nbd can only listen on a single file descriptor, and so if LISTEN_FDS > 1 (eg. for listening on multiple interfaces or ports) socket activation will fail. However for the simple case of listening on a single port, and either all interfaces with IPv4+IPv6, or just a loopback interface, the current implementation works fine. Fixing this properly would require considerable changes throughout qemu, since qemu's currently handling of getaddrinfo is plainly wrong. To use qemu-nbd from systemd, you create /etc/systemd/system/nbd.socket: [Unit] Description=QEMU Network Block Device server [Socket] ListenStream=10809 [Install] WantedBy=sockets.target and /etc/systemd/system/nbd.service: [Service] ExecStart=/usr/sbin/qemu-nbd -v -t /path/to/file and enable the socket service (only): systemctl enable nbd.socket systemctl start nbd.socket and then connecting to port 10809 will start qemu-nbd and service the file, with systemd opening the listening socket. In the ExecStart line, the qemu-nbd -v option is only needed if you want enhanced debugging. The -t option is required unless you want to fiddle with systemd settings for rate-limiting. If you try to use the -p and similar options with socket activation then qemu-nbd will give an error. (I wasn't sure where to document this -- there is no obvious documentation for qemu-nbd beyond the simple list of command line arguments) This is based on the implementations in libvirt (src/util/virutil.c:virGetListenFDs) and nbdkit (src/main.c:get_socket_activation), and also on Denis Plotnikov's implementation of --server-sock-fd (https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg07781.html).
[Qemu-devel] VirtIO issues after 83d768b5640946b7da55ce8335509df297e2c7cd
We recently upgraded to qemu 2.8.0, and noticed a bunch of issues with various non-linux operating systems. I was able to bisect this down to 83d768b5640946b7da55ce8335509df297e2c7cd being the commit that breaks things (0687c37c5eeef8580b31cc6e1202d874833ae38a which is immediately prior still works ok). The issue here seems to be that guests stop receiving packets, and this only happens with vhost_net enabled. I've been consistently able to reproduce this with by booting the OpenBSD 5.8 ISO ( http://mirrors.evowise.com/pub/OpenBSD/5.8/amd64/install58.iso ), and pressing 'A' at the first prompt. It'll attempt to do a DHCP lease, and will succeed prior to 83d7. After that commit, it'll continue to transit DHCP requests, but never actually receive the reply. So far, we've seen this same problem manifest on the following operating systems: * OpenBSD 5.8 * pfSense 2.2.6 (seems to be FreeBSD 10.1 underneath) * Windows XP * "9front" (fork of Plan 9) More modern operating systems did work, including * Various Linux distributions (CentOS 6+, Ubuntu 12.04+) * Modern windows distributions (Server 2012) * OpenBSD 6 I was pointed to this thread - https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg04532.html , and if I apply that patch network connectivity begins working normally again. (I am running under x86-64 btw)
Re: [Qemu-devel] [RFC][PATCH] qemu-img: make convert async
Hi, Your series seems to have some coding style problems. See output below for more information: Type: series Subject: [Qemu-devel] [RFC][PATCH] qemu-img: make convert async Message-id: 1486051604-32310-1-git-send-email...@kamp.de === TEST SCRIPT BEGIN === #!/bin/bash BASE=base n=1 total=$(git log --oneline $BASE.. | wc -l) failed=0 # Useful git options git config --local diff.renamelimit 0 git config --local diff.renames True commits="$(git log --format=%H --reverse $BASE..)" for c in $commits; do echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then failed=1 echo fi n=$((n+1)) done exit $failed === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 Switched to a new branch 'test' 3a167eb qemu-img: make convert async === OUTPUT BEGIN === Checking PATCH 1/1: qemu-img: make convert async... ERROR: suspect code indent for conditional statements (4, 9) #118: FILE: qemu-img.c:1580: +for (i = 0; i < CONVERT_MAX_REQS; i++) { + if (s->reqs[i].sector_num == s->wr_offs && s->reqs[i].rd_completed) { ERROR: line over 90 characters #147: FILE: qemu-img.c:1609: +printf("convert_fill_read_queue req #%d @%p sector_num %ld nb_sectors %d status %d\n", i, req, elt->sector_num, elt->nb_sectors, elt->status); WARNING: line over 80 characters #149: FILE: qemu-img.c:1611: +if (elt->status == BLK_DATA || (!s->min_sparse && elt->status == BLK_ZERO)) WARNING: line over 80 characters #152: FILE: qemu-img.c:1614: +qemu_progress_print(100.0 * s->allocated_done / s->allocated_sectors, ERROR: do not use assignment in if condition #161: FILE: qemu-img.c:1623: +if ((elt = QSIMPLEQ_FIRST(>queue))) { ERROR: line over 90 characters #189: FILE: qemu-img.c:1650: +printf("convert_aio_read enter req %p sector_num %ld nb_sectors %d offs %ld ret %d\n", req, sector_num, nb_sectors, req->sector_offs, ret); ERROR: line over 90 characters #245: FILE: qemu-img.c:1695: +printf("convert_aio_write enter req %p sector_num %ld nb_sectors %d offs %ld ret %d\n", req, sector_num, nb_sectors, req->sector_offs, ret); ERROR: do not use C99 // comments #268: FILE: qemu-img.c:1722: +assert(0); //XXX TODO ERROR: do not use C99 // comments #311: FILE: qemu-img.c:1766: +assert(0); //XXX TODO WARNING: line over 80 characters #358: FILE: qemu-img.c:1823: +s->reqs[i].buf = blk_blockalign(s->target, s->buf_sectors * BDRV_SECTOR_SIZE); ERROR: line over 90 characters #378: FILE: qemu-img.c:1843: +printf("convert_iteration_sectors %ld n %d status %d\n", sector_num, n, s->status); ERROR: suspect code indent for conditional statements (4, 9) #452: FILE: qemu-img.c:1879: +for (i = 0; i < CONVERT_MAX_REQS; i++) { + qemu_iovec_destroy(>reqs[i].qiov); total: 9 errors, 3 warnings, 397 lines checked Your patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. === OUTPUT END === Test command exited with code: 1 --- Email generated automatically by Patchew [http://patchew.org/]. Please send your feedback to patchew-de...@freelists.org
Re: [Qemu-devel] [PATCH 2/2] postcopy: Recover block devices on early failure
"Dr. David Alan Gilbert (git)"wrote: > From: "Dr. David Alan Gilbert" > > An early postcopy failure can be recovered from as long as we know > we haven't sent the command to run the destination. > We have to undo the bdrv_inactivate_all by calling > bdrv_invalidate_cache_all > > Note that I'm not using ms->block_inactive because once we've > sent the postcopy package we dont want anything else to try > and recover the block storage on the source; the destination > might have started writing to it. > > Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Juan Quintela
Re: [Qemu-devel] [PATCH 1/2] Postcopy: Reset state to avoid cleanup assert
"Dr. David Alan Gilbert (git)"wrote: > From: "Dr. David Alan Gilbert" > > On a destination host with no userfault support an incoming > postcopy would cause the state to enter ADVISE before > it realised there was no support, and because it was in ADVISE > state it would perform a cleanup at the end. Since there > was no support the cleanup function should be unreachable, > but ends up being called and asserting. > > Reset the state when we realise we have no support, thus the > cleanup doesn't happen. > > Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Juan Quintela
Re: [Qemu-devel] KVM/QEMU on Raspberry Pi 3
On 2 February 2017 at 15:52, Ard Biesheuvelwrote: > How is this supposed to work? RPI3 does not have a GIC at all, but a > proprietary interrupt controller that is not supported by KVM The idea is that (with the relevant not-yet-in-master patchsets) you can fall back to "use a userspace interrupt controller model". The performance will probably not be great. thanks -- PMM
Re: [Qemu-devel] [PULL v5 00/22] virtio, vhost, pci: fixes, features
On 2 February 2017 at 13:56, Peter Maydellwrote: > On 31 January 2017 at 20:18, Michael S. Tsirkin wrote: >> virtio, vhost, pci: fixes, features >> >> generic pci root port support >> disable shpc by default >> safer version of ARRAY_SIZE and QEMU_BUILD_BUG_ON >> fixes and cleanups all over the place >> >> Signed-off-by: Michael S. Tsirkin > Applied, thanks. ...travis builds now fail for the --enable-tcg-interpreter config: https://travis-ci.org/qemu/qemu/jobs/197648661 In file included from /home/travis/build/qemu/qemu/tcg/tcg.c:255:0: /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c: In function ‘tcg_out_op’: /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c:569:117: error: negative width in bit-field ‘’ /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c:569:255: error: negative width in bit-field ‘’ In file included from /home/travis/build/qemu/qemu/tcg/tcg.c:255:0: /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c:578:115: error: negative width in bit-field ‘’ /home/travis/build/qemu/qemu/tcg/tci/tcg-target.inc.c:578:255: error: negative width in bit-field ‘’ These look to be because we were trying to use ARRAY_SIZE() on a non-array, which was previously undetected. The use is only in an assert() so fairly harmless. Would somebody who cares about TCI like to provide a fix? thanks -- PMM
Re: [Qemu-devel] [PATCH RFC] acpi: add reset register to fadt
On Tue, Jan 31, 2017 at 04:04:58PM +, Phil Dennis-Jordan wrote: > > On Tue, 31 Jan 2017 at 16:41, Igor Mammedovwrote: > > On Tue, 31 Jan 2017 16:58:22 +0200 > "Michael S. Tsirkin" wrote: > > > On Tue, Jan 31, 2017 at 03:31:46PM +0100, Phil Dennis-Jordan wrote: > > > On 18 January 2017 at 18:19, Igor Mammedov > wrote: > > > > On Wed, 18 Jan 2017 18:30:59 +0200 > > > > "Michael S. Tsirkin" wrote: > > > > > > > >> On Wed, Jan 18, 2017 at 12:45:54PM +0100, Phil Dennis-Jordan wrote: > > > > [...] > > > > > > > >> > I suspect more might be involved in enabling ACPI 2.0, and it > should probably be an option so as to avoid regressions. I don't know what > the best approach would be for this, so comments welcome. Should adding > the > reset register to the FADT also be configurable? > > > >> > > > >> I would say an option to make FADT use rev 3 format would be a good > > > >> idea. > > > >> > > > >> I'd make it the default if XP survives. > > > > if XP and legacy linux survive, > > > > I'd skip adding option as probably there won't be any users, > > > > in unlikely case such user surfaces we always can add option later > > > > but we can't do it other way around (i.e. take it away). > > > > > > I have now finally solved the mystery of why my FADT patch has been > > > going so disastrously wrong - I've now got working code, but I'd > > > appreciate some guidance on the best way to structure a patch to > > > minimise further back-and forth. > > > > + lersek > > > > > The culprit turned out to be OVMF, > > > specifically 2 bugs/shortcomings: > > > > > > 1. It completely gives up on parsing Qemu's ACPI tables if more than > > > one "add pointer" linker command points to the same table. In this > > > case, if you add a command for both the DSDT and X_DSDT fields of the > > > FADT, it aborts completely and uses fallback tables. (The following > > > InstallAcpiTable call fails if called twice with the same table type.) > > > https://github.com/tianocore/edk2/blob/master/OvmfPkg/AcpiPlatformDxe/ > QemuFwCfgAcpi.c#L518 > > > > > > 2. After applying all the linker commands, it goes and rewrites part > > > of the FADT anyway. Specifically, it rewrites the DSDT and X_DSDT > > > fields - and it always sets one of them to 0. Which one depends on > > > whether the DSDT is above the 4G barrier: > > > https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Universal/ > Acpi/AcpiTableDxe/AcpiTableProtocol.c#L650 > > > > > > Both of these are easily fixed, and I will submit a corresponding > patch > to EDK2. > > > > > > With that fixed, the rest of the FADT provided by Qemu is accepted by > > > OVMF and the operating systems. On the Qemu side, it does mean we'll > > > need to still retain the ACPI 1.0 tables for backwards compatibility. > > > > > > Q1: How should the option be structured and named? Should the FADT > > > revision be selectable via a sub-option on -machine? Or as a > > > standalone option? Something else? > WinXP is main reason why we are using 1.0 revisions, > but if bumping revision doesn't affect it or later versions > and linux kernel (ancient/contemporary) boots fine, > I wouldn't bother with yet another option. > > > > XP is fine. My concern is that setups with OVMF+Windows (10 confirmed, but > probably 7/8 too) will suddenly bluescreen on boot because unpatched OVMF > delivers a non-compliant FADT. So for that, there might be a work around if you reorder the tables in the file. > > > > > Q2: To avoid any more confusion, I'd appreciate > > > confirmation/clarification on the X_ and non-X FADT fields in the case > > > where 32-bit pointers suffice. > > > > > > Q2a: DSDT/X_DSDT: Both variants appear to be de-facto required. > spec doesn't say that X_DSDT is optional and Windows requires it if field > is present. > > > > > > > Q2b: FIRMWARE_CTRL/X_FIRMWARE_CTRL: leave X_FIRMWARE_CTRL zero. > > > > > > Q2c: X_PM1a_EVT_BLK, X_PM1a_CNT_BLK, X_PM_TMR_BLK: These all state > > > "This is a required field" for both variants. > > > > > > Q2d: GPE0_BLK/X_GPE0_BLK: Both variants state "if this register block > > > is not supported, this field contains zero." - I understand this to > > > mean that when the register block IS supported and 32-bit, both > > > variants must be filled. > > > > > > In other words, only X_FIRMWARE_CTRL stays zero in Qemu's x86 case. > > > > > > > > > I'll come up with a revised patch in the next few days. > > > > > > Thanks for your help and patience so far, everyone! > >
Re: [Qemu-devel] [PULL 0/3] s390x fixes
On 1 February 2017 at 09:04, Christian Borntraegerwrote: > Peter, > > The following changes since commit a0def594286d9110a6035e02eef558cf3cf5d847: > > Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into > staging (2017-01-30 10:23:20 +) > > are available in the git repository at: > > git://github.com/borntraeger/qemu.git tags/s390x-20170201 > > for you to fetch changes up to d8923bc75479cd3fdcc72b7647f4877f91950b01: > > target/s390x: use "qemu" cpu model in user mode (2017-02-01 09:15:17 +0100) > > > s390x fixes > > - build error with old gcc versions > - race between cmma reset and rom/loader resets > - linux-user vs. cpu model > > > Christian Borntraeger (1): > s390x/kvm: fix small race reboot vs. cmma > > David Hildenbrand (1): > target/s390x: use "qemu" cpu model in user mode > > Paolo Bonzini (1): > s390-pci: fix compilation on older GCC versions Applied, thanks. -- PMM
[Qemu-devel] [RFC][PATCH] qemu-img: make convert async
this is something I have been thinking about for almost 2 years now. we heavily have the following two use cases when using qemu-img convert. a) reading from NFS and writing to iSCSI for deploying templates b) reading from iSCSI and writing to NFS for backups In both processes we use libiscsi and libnfs so we have no kernel pagecache. As qemu-img convert is implemented with sync operations that means we read one buffer and then write it. No parallelism and each sync request takes as long as it takes until it is completed. What I put together is an approach to use aio routines for the conversion process to have ideally read and write happening in parallel. The code is far from clean or complete, but I would appreaciate comments and thoughts from you. So far I have the following runtimes when reading an uncompressed QCOW2 from NFS and writing it to iSCSI (raw): qemu-img (master) nfs -> iscsi 33 secs nfs -> ram 19 secs ram -> iscsi 14 secs qemu-img-async nfs -> iscsi 23 secs nfs -> ram 17 secs ram -> iscsi 14 secs Its visible that on master the runtimes add up as expected. The async branch is faster, but not as fast as I would have expected. I would expect the runtime to be as slow as the slowest of the two involved transfers. Thank you, Peter Signed-off-by: Peter Lieven--- qemu-img.c | 271 + 1 file changed, 199 insertions(+), 72 deletions(-) diff --git a/qemu-img.c b/qemu-img.c index 5df66fe..d06f968 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -1446,6 +1446,29 @@ enum ImgConvertBlockStatus { BLK_BACKING_FILE, }; +#define CONVERT_MAX_REQS 4 + +typedef struct ImgConvertState ImgConvertState; + +typedef struct ImgConvertReq { +ImgConvertState *s; +int64_t sector_num; +int64_t sector_offs; +int64_t next_sector; +int nb_sectors; +QEMUIOVector qiov; +uint8_t *buf; +enum ImgConvertBlockStatus status; +bool rd_completed; +} ImgConvertReq; + +typedef struct ImgConvertQueueElt { +int64_t sector_num; +enum ImgConvertBlockStatus status; +int nb_sectors; +QSIMPLEQ_ENTRY(ImgConvertQueueElt) next; +} ImgConvertQueueElt; + typedef struct ImgConvertState { BlockBackend **src; int64_t *src_sectors; @@ -1453,6 +1476,8 @@ typedef struct ImgConvertState { int64_t src_cur_offset; int64_t total_sectors; int64_t allocated_sectors; +int64_t allocated_done; +int64_t wr_offs; enum ImgConvertBlockStatus status; int64_t sector_next_status; BlockBackend *target; @@ -1462,11 +1487,15 @@ typedef struct ImgConvertState { int min_sparse; size_t cluster_sectors; size_t buf_sectors; +ImgConvertReq reqs[CONVERT_MAX_REQS]; +int ret; +QSIMPLEQ_HEAD(, ImgConvertQueueElt) queue; } ImgConvertState; static void convert_select_part(ImgConvertState *s, int64_t sector_num) { -assert(sector_num >= s->src_cur_offset); +s->src_cur_offset = 0; +s->src_cur = 0; while (sector_num - s->src_cur_offset >= s->src_sectors[s->src_cur]) { s->src_cur_offset += s->src_sectors[s->src_cur]; s->src_cur++; @@ -1542,16 +1571,89 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num) return n; } -static int convert_read(ImgConvertState *s, int64_t sector_num, int nb_sectors, -uint8_t *buf) +static void convert_aio_read(void *opaque, int ret); +static void convert_aio_write(void *opaque, int ret); + +static void convert_fill_write_queue(ImgConvertState *s) { +int i; +for (i = 0; i < CONVERT_MAX_REQS; i++) { + if (s->reqs[i].sector_num == s->wr_offs && s->reqs[i].rd_completed) { +s->reqs[i].rd_completed = false; +s->reqs[i].sector_offs = 0; +convert_aio_write(>reqs[i], 0); +break; + } +} +} + +static void convert_fill_read_queue(ImgConvertState *s) +{ + ImgConvertQueueElt *elt; + + while ((elt = QSIMPLEQ_FIRST(>queue)) && !s->ret) { +ImgConvertReq *req = NULL; +int i; +for (i = 0; i < CONVERT_MAX_REQS; i++) { +if (s->reqs[i].sector_num == -1) { +req = >reqs[i]; +break; +} +} +if (!req) { +return; +} + +QSIMPLEQ_REMOVE_HEAD(>queue, next); + +printf("convert_fill_read_queue req #%d @%p sector_num %ld nb_sectors %d status %d\n", i, req, elt->sector_num, elt->nb_sectors, elt->status); + +if (elt->status == BLK_DATA || (!s->min_sparse && elt->status == BLK_ZERO)) +{ +s->allocated_done += elt->nb_sectors; +qemu_progress_print(100.0 * s->allocated_done / s->allocated_sectors, +0); +} +req->sector_num = elt->sector_num; +req->nb_sectors = elt->nb_sectors; +req->status = elt->status; +req->sector_offs = 0;
Re: [Qemu-devel] [PULL 0/1] Block patches
On 1 February 2017 at 13:24, Stefan Hajnocziwrote: > The following changes since commit a0def594286d9110a6035e02eef558cf3cf5d847: > > Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into > staging (2017-01-30 10:23:20 +) > > are available in the git repository at: > > git://github.com/stefanha/qemu.git block-pull-request > > for you to fetch changes up to 81a9f2cb9336a7e9f50b0729b4c81d287e0015e9: > > iothread: enable AioContext polling by default (2017-01-31 17:09:34 +) git claims not to be able to find "block-pull-request", and https://github.com/stefanha/qemu/tags says it was pushed 7 days ago with a different commit hash to the one quoted here. Forgot to tag? Forgot to push? thanks -- PMM
Re: [Qemu-devel] [qcow2] how to avoid qemu doing lseek(SEEK_DATA/SEEK_HOLE)?
2017-02-02 16:23:53 +0100, Laszlo Ersek: [...] > You didn't mention what qcow2 features you use -- vmstate, snapshots, > backing files (chains of them), compression? > > Since commit 2928abce6d1d only modifies "block/qcow2.c", you could > switch / convert the images to "raw". "raw" still benefits from sparse > files (which ZFS-on-Linux apparently supports). Sparse files (i.e., the > disk space savings) are the most important feature to me at least. [...] Thanks for the feedback. Sorry for not mentioning it in the first place, but I do need the vmstate and snapshots (even non-linear snapshots which means even ZFS zvol snapshots as done by Proxmox VE are not an option either, neither is vmdk) I hadn't tested before now, but what I observe with raw devices and discard=on,detect-zeroes=unmap (and the virtio-scsi interface), is that upon those "synced writes of zeroes" into allocated data, qemu does some [pid 10535] fallocate(14, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 136314880, 4096) = 0 into the disk image. (and no lseek(SEEK_DATA/SEEK_HOLE)) which I don't see when using qcow2 images. If the qcow2 interface was updated to do the same (punch holes regardless instead of checking if the data is allocated beforehand), that would also solve my problem (anything that avoid those lseek()s being called). Another thing I've not mentioned clearly is the versions of qemu I have been testing with: 2.7, 2.7.1 (those two on Proxmox VE 4.4 (based on Debian jessie)) and 2.8.0 (the latter for verification on a Debian unstable system, not with zfs). -- Stephane
Re: [Qemu-devel] [PATCH v2 1/1] qemu-char: socket backend: disconnect on write error
On 02/02/2017 04:18 PM, Denis V. Lunev wrote: > From: Anton Nefedov> > Socket backend read handler should normally perform a disconnect, however > the read handler may not get a chance to run if the frontend is not ready > (qemu_chr_be_can_write() == 0). > > This means that in virtio-serial frontend case if > - the host has disconnected (giving EPIPE on socket write) > - and the guest has disconnected (-> frontend not ready -> backend >will not read) > - and there is still data (frontend->backend) to flush (has to be a really >tricky timing but nevertheless, we have observed the case in production) > > This results in virtio-serial trying to flush this data continiously forming > a busy loop. > > Solution: react on write error in the socket write handler. > errno is not reliable after qio_channel_writev_full(), so we may not get > the exact EPIPE, so disconnect on any error but QIO_CHANNEL_ERR_BLOCK which > io_channel_send_full() converts to errno EAGAIN. > We must not disconnect right away though, there still may be data to read > (see 4bf1cb0). > > Signed-off-by: Anton Nefedov > Signed-off-by: Denis V. Lunev > CC: Paolo Bonzini > CC: Daniel P. Berrange > CC: Marc-André Lureau > --- > Changes from v1: > - we do not rely on EPIPE anynore. Socket should be closed on all errors > except EAGAIN > > chardev/char-socket.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c > index 4068dc5..e2137aa 100644 > --- a/chardev/char-socket.c > +++ b/chardev/char-socket.c > @@ -97,6 +97,9 @@ static gboolean tcp_chr_accept(QIOChannel *chan, > GIOCondition cond, > void *opaque); > > +static int tcp_chr_read_poll(void *opaque); > +static void tcp_chr_disconnect(CharDriverState *chr); > + > /* Called with chr_write_lock held. */ > static int tcp_chr_write(Chardev *chr, const uint8_t *buf, int len) > { > @@ -114,6 +117,13 @@ static int tcp_chr_write(Chardev *chr, const uint8_t > *buf, int len) > s->write_msgfds_num = 0; > } > > +if (ret < 0 && errno != EAGAIN) { > +if (tcp_chr_read_poll(chr) <= 0) { > +tcp_chr_disconnect(chr); > +return len; > +} /* else let the read handler finish it properly */ > +} > + > return ret; > } else { > /* XXX: indicate an error ? */ pls disregard. I am very sorry. This patch was made against all version and will not compile. Den
[Qemu-devel] [PATCH 2/2] postcopy: Recover block devices on early failure
From: "Dr. David Alan Gilbert"An early postcopy failure can be recovered from as long as we know we haven't sent the command to run the destination. We have to undo the bdrv_inactivate_all by calling bdrv_invalidate_cache_all Note that I'm not using ms->block_inactive because once we've sent the postcopy package we dont want anything else to try and recover the block storage on the source; the destination might have started writing to it. Signed-off-by: Dr. David Alan Gilbert --- migration/migration.c | 25 + 1 file changed, 25 insertions(+) diff --git a/migration/migration.c b/migration/migration.c index 2766d2f..283677c 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1605,6 +1605,7 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running) QIOChannelBuffer *bioc; QEMUFile *fb; int64_t time_at_stop = qemu_clock_get_ms(QEMU_CLOCK_REALTIME); +bool restart_block = false; migrate_set_state(>state, MIGRATION_STATUS_ACTIVE, MIGRATION_STATUS_POSTCOPY_ACTIVE); @@ -1624,6 +1625,7 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running) if (ret < 0) { goto fail; } +restart_block = true; /* * Cause any non-postcopiable, but iterative devices to @@ -1680,6 +1682,18 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running) /* <><> end of stuff going into the package */ +/* Last point of recovery; as soon as we send the package the destination + * can open devices and potentially start running. + * Lets just check again we've not got any errors. + */ +ret = qemu_file_get_error(ms->to_dst_file); +if (ret) { +error_report("postcopy_start: Migration stream errored (pre package)"); +goto fail_closefb; +} + +restart_block = false; + /* Now send that blob */ if (qemu_savevm_send_packaged(ms->to_dst_file, bioc->data, bioc->usage)) { goto fail_closefb; @@ -1717,6 +1731,17 @@ fail_closefb: fail: migrate_set_state(>state, MIGRATION_STATUS_POSTCOPY_ACTIVE, MIGRATION_STATUS_FAILED); +if (restart_block) { +/* A failure happened early enough that we know the destination hasn't + * accessed block devices, so we're safe to recover. + */ +Error *local_err = NULL; + +bdrv_invalidate_cache_all(_err); +if (local_err) { +error_report_err(local_err); +} +} qemu_mutex_unlock_iothread(); return -1; } -- 2.9.3
[Qemu-devel] [PATCH 0/2] Postcopy fixes
From: "Dr. David Alan Gilbert"Hi, These are a couple of postcopy fixes for failure paths; The first is a pretty harmless assert, it happens on the destination when it discovers that it's got an incoming postcopy but it can't support it - it was going to exit anyway. The second is more useful and fixes an assert on the source in the small window where a postcopy that's starting up fails and tries to recover the source. Dave Dr. David Alan Gilbert (2): Postcopy: Reset state to avoid cleanup assert postcopy: Recover block devices on early failure migration/migration.c | 25 + migration/savevm.c| 1 + 2 files changed, 26 insertions(+) -- 2.9.3
Re: [Qemu-devel] [PATCH v3] q35: Improve sample configuration files
On Thu, 2017-02-02 at 15:55 +0100, Gerd Hoffmann wrote: > > +++ b/docs/q35-emulated.cfg > > > > +# 00:01.0 VGA compatible controller > > > > +[device "video"] > > + driver = "VGA" > > + bus = "pcie.0" > > + addr = "02.0" > > Here is a address mismatch ;) > > Otherwise looks all fine to me. Good catch! I'll fix it and wait for input on the questions raised by Marcel before sending out a v4 :) -- Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] KVM/QEMU on Raspberry Pi 3
On 2 February 2017 at 15:50, Laszlo Ersekwrote: > Adding Ard, just in case... > > I have one (half-)comment re: GICv3: > How is this supposed to work? RPI3 does not have a GIC at all, but a proprietary interrupt controller that is not supported by KVM > On 02/02/17 15:44, Pekka Enberg wrote: >> Hi, >> >> Has anyone been able to successfully run QEMU/KVM under Raspberry Pi 3? >> >> I have installed 64-bit Fedora 24 by Gerd Hoffmann on the hardware: >> >> https://www.kraxel.org/blog/2016/04/fedora-on-raspberry-pi-updates/ >> >> and built a VM image using virt-builder: >> >> virt-builder --root-password password:root --arch aarch64 fedora-24 >> >> I also built the latest UEFI for QEMU from sources: >> >> https://wiki.linaro.org/LEG/UEFIforQEMU >> >> and updated to QEMU 2.8.0 from rawhide: >> >> [root@fedora-rpi2 ~]# qemu-system-aarch64 -version >> QEMU emulator version 2.8.0(qemu-2.8.0-1.fc26) >> Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers >> >> The VM image should be fine because I’m able to boot to it under CPU >> emulation: >> >> qemu-system-aarch64 \ >> -nographic \ >> -M virt \ >> -cpu cortex-a57 \ >> -smp 1 \ >> -m 512 \ >> -bios QEMU_EFI.fd \ >> -device virtio-blk-device,drive=image -drive >> if=none,id=image,file=fedora-24.img \ >> -netdev bridge,id=hn0,br=virbr0 -device >> virtio-net-pci,netdev=hn0,romfile= \ >> -device virtio-rng-pci >> >> However, when I enable KVM, keyboard stops working (interrupt delivery >> issue?) and Fedora boot process hangs at random places before reaching >> login: >> >> qemu-system-aarch64 \ >> -nographic \ >> -M virt \ >> -cpu host \ >> -enable-kvm \ >> -smp 1 \ >> -m 512 \ >> -bios QEMU_EFI.fd \ >> -device virtio-blk-device,drive=image -drive >> if=none,id=image,file=$IMAGE \ >> -netdev bridge,id=hn0,br=virbr0 -device >> virtio-net-pci,netdev=hn0,romfile= \ >> -device virtio-rng-pci >> >> EFI stub: Booting Linux Kernel... >> ConvertPages: Incompatible memory types >> EFI stub: Using DTB from configuration table >> EFI stub: Exiting boot services and installing virtual address map... >> [0.00] Booting Linux on physical CPU 0x0 >> [0.00] Linux version 4.9.5-100.fc24.aarch64 >> (mockbu...@aarch64-06a.arm.fedoraproject.org) (gcc version 6.3.1 >> 20161221 (Red Hat 6.3.1-1) (GCC) ) #1 SMP Tue Jan 24 21:12:07 UTC 2017 >> [0.00] Boot CPU: AArch64 Processor [410fd034] >> [0.00] debug: ignoring loglevel setting. >> [0.00] efi: Getting EFI parameters from FDT: >> [0.00] efi: EFI v2.60 by EDK II >> [0.00] efi: SMBIOS 3.0=0x5871 ACPI 2.0=0x589b >> MEMATTR=0x59c03218 >> [0.00] cma: Failed to reserve 512 MiB >> [0.00] NUMA: No NUMA configuration found >> [0.00] NUMA: Faking a node at [mem >> 0x-0x5fff] >> [0.00] NUMA: Adding memblock [0x4000 - 0x585b] on node 0 >> [0.00] NUMA: Adding memblock [0x585c - 0x5861] on node 0 >> [0.00] NUMA: Adding memblock [0x5862 - 0x586f] on node 0 >> [0.00] NUMA: Adding memblock [0x5870 - 0x58b6] on node 0 >> [0.00] NUMA: Adding memblock [0x58b7 - 0x5be3] on node 0 >> [0.00] NUMA: Adding memblock [0x5be4 - 0x5bec] on node 0 >> [0.00] NUMA: Adding memblock [0x5bed - 0x5bed] on node 0 >> [0.00] NUMA: Adding memblock [0x5bee - 0x5bff] on node 0 >> [0.00] NUMA: Adding memblock [0x5c00 - 0x5fff] on node 0 >> [0.00] NUMA: Initmem setup node 0 [mem 0x4000-0x5fff] >> [0.00] NUMA: NODE_DATA [mem 0x5ff62680-0x5ff6] >> [0.00] Zone ranges: >> [0.00] DMA [mem 0x4000-0x5fff] >> [0.00] Normal empty >> [0.00] Movable zone start for each node >> [0.00] Early memory node ranges >> [0.00] node 0: [mem 0x4000-0x585b] >> [0.00] node 0: [mem 0x585c-0x5861] >> [0.00] node 0: [mem 0x5862-0x586f] >> [0.00] node 0: [mem 0x5870-0x58b6] >> [0.00] node 0: [mem 0x58b7-0x5be3] >> [0.00] node 0: [mem 0x5be4-0x5bec] >> [0.00] node 0: [mem 0x5bed-0x5bed] >> [0.00] node 0: [mem 0x5bee-0x5bff] >> [0.00] node 0: [mem 0x5c00-0x5fff] >> [0.00] Initmem setup node 0 [mem >> 0x4000-0x5fff] >> [0.00] On node 0 totalpages: 8192 >> [0.00] DMA zone: 8 pages used for memmap >> [0.00] DMA zone: 0 pages reserved >> [0.00] DMA zone: 8192 pages, LIFO batch:0 >> [
[Qemu-devel] [PATCH 1/2] Postcopy: Reset state to avoid cleanup assert
From: "Dr. David Alan Gilbert"On a destination host with no userfault support an incoming postcopy would cause the state to enter ADVISE before it realised there was no support, and because it was in ADVISE state it would perform a cleanup at the end. Since there was no support the cleanup function should be unreachable, but ends up being called and asserting. Reset the state when we realise we have no support, thus the cleanup doesn't happen. Signed-off-by: Dr. David Alan Gilbert --- migration/savevm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/migration/savevm.c b/migration/savevm.c index e8e5ff5..de86db0 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1355,6 +1355,7 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis) } if (!postcopy_ram_supported_by_host()) { +postcopy_state_set(POSTCOPY_INCOMING_NONE); return -1; } -- 2.9.3
Re: [Qemu-devel] KVM/QEMU on Raspberry Pi 3
Adding Ard, just in case... I have one (half-)comment re: GICv3: On 02/02/17 15:44, Pekka Enberg wrote: > Hi, > > Has anyone been able to successfully run QEMU/KVM under Raspberry Pi 3? > > I have installed 64-bit Fedora 24 by Gerd Hoffmann on the hardware: > > https://www.kraxel.org/blog/2016/04/fedora-on-raspberry-pi-updates/ > > and built a VM image using virt-builder: > > virt-builder --root-password password:root --arch aarch64 fedora-24 > > I also built the latest UEFI for QEMU from sources: > > https://wiki.linaro.org/LEG/UEFIforQEMU > > and updated to QEMU 2.8.0 from rawhide: > > [root@fedora-rpi2 ~]# qemu-system-aarch64 -version > QEMU emulator version 2.8.0(qemu-2.8.0-1.fc26) > Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers > > The VM image should be fine because I’m able to boot to it under CPU > emulation: > > qemu-system-aarch64 \ > -nographic \ > -M virt \ > -cpu cortex-a57 \ > -smp 1 \ > -m 512 \ > -bios QEMU_EFI.fd \ > -device virtio-blk-device,drive=image -drive > if=none,id=image,file=fedora-24.img \ > -netdev bridge,id=hn0,br=virbr0 -device > virtio-net-pci,netdev=hn0,romfile= \ > -device virtio-rng-pci > > However, when I enable KVM, keyboard stops working (interrupt delivery > issue?) and Fedora boot process hangs at random places before reaching > login: > > qemu-system-aarch64 \ > -nographic \ > -M virt \ > -cpu host \ > -enable-kvm \ > -smp 1 \ > -m 512 \ > -bios QEMU_EFI.fd \ > -device virtio-blk-device,drive=image -drive > if=none,id=image,file=$IMAGE \ > -netdev bridge,id=hn0,br=virbr0 -device > virtio-net-pci,netdev=hn0,romfile= \ > -device virtio-rng-pci > > EFI stub: Booting Linux Kernel... > ConvertPages: Incompatible memory types > EFI stub: Using DTB from configuration table > EFI stub: Exiting boot services and installing virtual address map... > [0.00] Booting Linux on physical CPU 0x0 > [0.00] Linux version 4.9.5-100.fc24.aarch64 > (mockbu...@aarch64-06a.arm.fedoraproject.org) (gcc version 6.3.1 > 20161221 (Red Hat 6.3.1-1) (GCC) ) #1 SMP Tue Jan 24 21:12:07 UTC 2017 > [0.00] Boot CPU: AArch64 Processor [410fd034] > [0.00] debug: ignoring loglevel setting. > [0.00] efi: Getting EFI parameters from FDT: > [0.00] efi: EFI v2.60 by EDK II > [0.00] efi: SMBIOS 3.0=0x5871 ACPI 2.0=0x589b > MEMATTR=0x59c03218 > [0.00] cma: Failed to reserve 512 MiB > [0.00] NUMA: No NUMA configuration found > [0.00] NUMA: Faking a node at [mem > 0x-0x5fff] > [0.00] NUMA: Adding memblock [0x4000 - 0x585b] on node 0 > [0.00] NUMA: Adding memblock [0x585c - 0x5861] on node 0 > [0.00] NUMA: Adding memblock [0x5862 - 0x586f] on node 0 > [0.00] NUMA: Adding memblock [0x5870 - 0x58b6] on node 0 > [0.00] NUMA: Adding memblock [0x58b7 - 0x5be3] on node 0 > [0.00] NUMA: Adding memblock [0x5be4 - 0x5bec] on node 0 > [0.00] NUMA: Adding memblock [0x5bed - 0x5bed] on node 0 > [0.00] NUMA: Adding memblock [0x5bee - 0x5bff] on node 0 > [0.00] NUMA: Adding memblock [0x5c00 - 0x5fff] on node 0 > [0.00] NUMA: Initmem setup node 0 [mem 0x4000-0x5fff] > [0.00] NUMA: NODE_DATA [mem 0x5ff62680-0x5ff6] > [0.00] Zone ranges: > [0.00] DMA [mem 0x4000-0x5fff] > [0.00] Normal empty > [0.00] Movable zone start for each node > [0.00] Early memory node ranges > [0.00] node 0: [mem 0x4000-0x585b] > [0.00] node 0: [mem 0x585c-0x5861] > [0.00] node 0: [mem 0x5862-0x586f] > [0.00] node 0: [mem 0x5870-0x58b6] > [0.00] node 0: [mem 0x58b7-0x5be3] > [0.00] node 0: [mem 0x5be4-0x5bec] > [0.00] node 0: [mem 0x5bed-0x5bed] > [0.00] node 0: [mem 0x5bee-0x5bff] > [0.00] node 0: [mem 0x5c00-0x5fff] > [0.00] Initmem setup node 0 [mem > 0x4000-0x5fff] > [0.00] On node 0 totalpages: 8192 > [0.00] DMA zone: 8 pages used for memmap > [0.00] DMA zone: 0 pages reserved > [0.00] DMA zone: 8192 pages, LIFO batch:0 > [0.00] psci: probing for conduit method from DT. > [0.00] psci: PSCIv0.2 detected in firmware. > [0.00] psci: Using standard PSCI v0.2 function IDs > [0.00] psci: Trusted OS migration not required > [0.00] percpu: Embedded 3 pages/cpu @80001ff0 s114840 > r8192
[Qemu-devel] [PULL 0/4] cirrus: multiple bugfixes, including CVE-2017-2615 fix.
Hi, Here is the vga patch queue, finally bringing the cirrus fixes. Pull v2 fixes the review and message id lines in patch #4. No code changes. please pull, Gerd The following changes since commit a0def594286d9110a6035e02eef558cf3cf5d847: Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2017-01-30 10:23:20 +) are available in the git repository at: git://git.kraxel.org/qemu tags/pull-vga-20170202-2 for you to fetch changes up to 62d4c6bd5263bb8413a06c80144fc678df6dfb64: cirrus: fix oob access issue (CVE-2017-2615) (2017-02-02 15:58:23 +0100) cirrus: multiple bugfixes, including CVE-2017-2615 fix. Gerd Hoffmann (1): cirrus: fix blit address mask handling Li Qiang (1): cirrus: fix oob access issue (CVE-2017-2615) Wolfgang Bumiller (2): cirrus: handle negative pitch in cirrus_invalidate_region() cirrus: allow zero source pitch in pattern fill rops hw/display/cirrus_vga.c | 64 ++--- 1 file changed, 39 insertions(+), 25 deletions(-)
Re: [Qemu-devel] [PULL 095/107] spapr: clock should count only if vm is running
On 02/02/17 14:20, Laurent Vivier wrote: > I think adding the the PPCTimebase field and the VMSTATE_PPC_TIMEBASE_V > macro to the PMac machines should fix your issue. > > Do you have a test case I can try? > > Laurent Hi Laurent, Yes I'd say that is required, although I still think you need to migrate the decrementer value as per the comments on https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg00546.html. Here's the reproducer from an off-list email I sent last year: 1) Download https://www.ilande.co.uk/tmp/darwin_empty.qcow2.xz and decompress the image (it's a pre-partitioned empty Apple Partition Map disk) 2) Download https://opensource.apple.com/static/iso/darwinppc-602.cdr.gz image, gunzip it and rename with .iso extension 3) Start QEMU using the attached "empty" disk like this: ./qemu-system-ppc -hda darwin_empty.qcow2 -cdrom darwinppc-602.iso -boot d 4) Start the installer in the guest and you'll see lots of files with ASCII progress bars as the various files are copied to disk Then to see the problem with the progress bar, repeat the following: 5) Pause the VM 6) Issue "savevm foo" in the monitor 7) Exit QEMU 8) Start QEMU again as below: ./qemu-system-ppc -hda darwin_empty.qcow2 -cdrom darwinppc-602.iso -boot d -loadvm foo If you do this enough times (maybe 10 or so?) you'll see the progress bars stop working correctly and get out of sync, i.e. it will freeze for long periods of time and then "jump" to catch-up but not all the way. With my above patch applied to include the decrementer in the migration, the bug was no longer visible in my tests. HTH, Mark.
Re: [Qemu-devel] [PATCH v3] q35: Improve sample configuration files
On Thu, 2017-02-02 at 15:55 +0100, Gerd Hoffmann wrote: > > +++ b/docs/q35-emulated.cfg > > > > +# 00:01.0 VGA compatible controller > > > > +[device "video"] > > + driver = "VGA" > > + bus = "pcie.0" > > + addr = "02.0" > > Here is a address mismatch ;) Oh, wait, I realize where the mismatch comes from now: on real hardware (my laptop) the video card is plugged into 00:02.0, but the default one you get when you don't pass -nodefault to QEMU is plugged into 00:01.0! I think we should use 00:02.0, do you agree? I would also use the same slot in q35-virtio-graphical.cfg. -- Andrea Bolognani / Red Hat / Virtualization
[Qemu-devel] [PATCH 3/3] spapr: make cpu core unplug follow expected hotunplug call flow
spapr_core_unplug() were essentially spapr_core_unplug_request() handler that requested CPU removal and registered callback which did actual cpu core removali but it was called from spapr_machine_device_unplug() which is intended for actual object removal. Commit (cf632463 spapr: Memory hot-unplug support) sort of fixed it introducing spapr_machine_device_unplug_request() and calling spapr_core_unplug() but it hasn't renamed callback and by mistake calls it from spapr_machine_device_unplug(). However spapr_machine_device_unplug() isn't ever called for cpu core since spapr_core_release() doesn't follow expected hotunplug call flow which is: 1: device_del() -> hotplug_handler_unplug_request() -> set destroy_cb() 2: destroy_cb() -> hotplug_handler_unplug() -> object_unparent // actual device removal Fix it by renaming spapr_core_unplug() to spapr_core_unplug_request() which is called from spapr_machine_device_unplug_request() and making spapr_core_release() call hotplug_handler_unplug() which will call spapr_machine_device_unplug() -> spapr_core_unplug() to remove cpu core. Signed-off-by: Igor Mammedov--- hw/ppc/spapr.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 8c2efd8..37cb338 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -2488,7 +2488,8 @@ void *spapr_populate_hotplug_cpu_dt(CPUState *cs, int *fdt_offset, return fdt; } -static void spapr_core_release(DeviceState *dev, void *opaque) +static void spapr_core_unplug(HotplugHandler *hotplug_dev, DeviceState *dev, + Error **errp) { sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine()); CPUCore *cc = CPU_CORE(dev); @@ -2497,8 +2498,17 @@ static void spapr_core_release(DeviceState *dev, void *opaque) object_unparent(OBJECT(dev)); } -static void spapr_core_unplug(HotplugHandler *hotplug_dev, DeviceState *dev, - Error **errp) +static void spapr_core_release(DeviceState *dev, void *opaque) +{ +HotplugHandler *hotplug_ctrl; + +hotplug_ctrl = qdev_get_hotplug_handler(dev); +hotplug_handler_unplug(hotplug_ctrl, dev, _abort); +} + +static +void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev, + Error **errp) { CPUCore *cc = CPU_CORE(dev); int smt = kvmppc_smt_threads(); @@ -2719,7 +2729,7 @@ static void spapr_machine_device_unplug_request(HotplugHandler *hotplug_dev, error_setg(errp, "CPU hot unplug not supported on this machine"); return; } -spapr_core_unplug(hotplug_dev, dev, errp); +spapr_core_unplug_request(hotplug_dev, dev, errp); } } -- 2.7.4
Re: [Qemu-devel] [PATCH v2 1/1] mirror: do not increase offset during initial zero_or_discard phase
On 02/02/2017 08:25 AM, Denis V. Lunev wrote: > From: Anton Nefedov> > If explicit zeroing out before mirroring is required for the target image, > it moves the block job offset counter to EOF, then offset and len counters > count the image size twice. There is no harm but stats are confusing, > specifically the progress of the operation is always reported as 99% by > management tools. > > The patch skips offset increase for the first "technical" pass over the > image. This should not cause any further harm. > > Signed-off-by: Anton Nefedov > Signed-off-by: Denis V. Lunev > CC: Jeff Cody > CC: Kevin Wolf > CC: Max Reitz > CC: Eric Blake > --- > +bool initial_zeroing_ongoing; Long name. With a bit of bikeshedding, I might have used 'init_pass' for a shorter name (particularly if some later patch introduces another aspect of initialization that is not zeroing but is worth ignoring with respects to progress reporting). > } MirrorBlockJob; > > typedef struct MirrorOp { > @@ -117,9 +118,10 @@ static void mirror_iteration_done(MirrorOp *op, int ret) > if (s->cow_bitmap) { > bitmap_set(s->cow_bitmap, chunk_num, nb_chunks); > } > -s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE; > +if (!s->initial_zeroing_ongoing) { > +s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE; > +} > } > - > qemu_iovec_destroy(>qiov); Why are you deleting the blank line? Other than naming, the patch looks reasonable. If you spin a v3 with only the name changed, you can add: Reviewed-by: Eric Blake -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature