[Qemu-devel] [Bug 1178101] [NEW] Could not enable gtk UI on build for Windows target
Public bug reported:

$ ${QEMU_SRC_DIR}/configure --prefix=${BIN_ROOT} --cross-prefix=${HOST_TRIPLET}- --extra-cflags=-I${BIN_ROOT}/include --extra-ldflags=-L${BIN_ROOT}/lib --enable-gtk --disable-xen
ERROR: User requested feature gtk
configure was not able to find it

$ cat config.log
# QEMU configure log Thu May 9 13:50:40 CST 2013
# Configured with: '/home/cauchy/vcs/git/qemu/configure' '--prefix=/home/cauchy/w32' '--cross-prefix=i686-w64-mingw32-' '--extra-cflags=-I/home/cauchy/w32/include' '--extra-ldflags=-L/home/cauchy/w32/lib' '--enable-gtk' '--disable-xen'
# i686-w64-mingw32-gcc -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -I/home/cauchy/w32/include -c -o /tmp/qemu-conf--18025-.o /tmp/qemu-conf--18025-.c
/tmp/qemu-conf--18025-.c:2:2: error: #error __linux__ not defined
 #error __linux__ not defined
  ^
i686-w64-mingw32-gcc -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -I/home/cauchy/w32/include -c -o /tmp/qemu-conf--18025-.o /tmp/qemu-conf--18025-.c
i686-w64-mingw32-gcc -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -I/home/cauchy/w32/include -c -o /tmp/qemu-conf--18025-.o /tmp/qemu-conf--18025-.c
i686-w64-mingw32-gcc -D__USE_MINGW_ANSI_STDIO=1 -DWIN32_LEAN_AND_MEAN -DWINVER=0x501 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -I/home/cauchy/w32/include -o /tmp/qemu-conf--18025-.exe /tmp/qemu-conf--18025-.c -g -L/home/cauchy/w32/lib -liberty
i686-w64-mingw32-gcc -m32 -D__USE_MINGW_ANSI_STDIO=1 -DWIN32_LEAN_AND_MEAN -DWINVER=0x501 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -I/home/cauchy/w32/include -c -o /tmp/qemu-conf--18025-.o /tmp/qemu-conf--18025-.c
i686-w64-mingw32-gcc -m32 -D__USE_MINGW_ANSI_STDIO=1 -DWIN32_LEAN_AND_MEAN -DWINVER=0x501 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -I/home/cauchy/w32/include -Werror -Winitializer-overrides -o /tmp/qemu-conf--18025-.exe /tmp/qemu-conf--18025-.c -m32 -g -L/home/cauchy/w32/lib
i686-w64-mingw32-gcc: error: unrecognized command line option ‘-Winitializer-overrides’
i686-w64-mingw32-gcc -m32 -D__USE_MINGW_ANSI_STDIO=1 -DWIN32_LEAN_AND_MEAN -DWINVER=0x501 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -I/home/cauchy/w32/include -Werror -Wendif-labels -o /tmp/qemu-conf--18025-.exe /tmp/qemu-conf--18025-.c -m32 -g -L/home/cauchy/w32/lib
i686-w64-mingw32-gcc -m32 -D__USE_MINGW_ANSI_STDIO=1 -DWIN32_LEAN_AND_MEAN -DWINVER=0x501 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -I/home/cauchy/w32/include -Wendif-labels -Werror -Wmissing-include-dirs -o /tmp/qemu-conf--18025-.exe /tmp/qemu-conf--18025-.c -m32 -g -L/home/cauchy/w32/lib
i686-w64-mingw32-gcc -m32 -D__USE_MINGW_ANSI_STDIO=1 -DWIN32_LEAN_AND_MEAN -DWINVER=0x501 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -I/home/cauchy/w32/include -Wendif-labels -Wmissing-include-dirs -Werror -Wempty-body -o /tmp/qemu-conf--18025-.exe /tmp/qemu-conf--18025-.c -m32 -g -L/home/cauchy/w32/lib
i686-w64-mingw32-gcc -m32 -D__USE_MINGW_ANSI_STDIO=1 -DWIN32_LEAN_AND_MEAN -DWINVER=0x501 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -I/home/cauchy/w32/include -Wendif-labels -Wmissing-include-dirs -Wempty-body -Werror -Wnested-externs -o /tmp/qemu-conf--18025-.exe /tmp/qemu-conf--18025-.c -m32 -g -L/home/cauchy/w32/lib
i686-w64-mingw32-gcc -m32 -D__USE_MINGW_ANSI_STDIO=1 -DWIN32_LEAN_AND_MEAN -DWINVER=0x501 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -I/home/cauchy/w32/include -Wendif-labels -Wmissing-include-dirs -Wempty-body -Wnested-externs -Werror -Wformat-security -o /tmp/qemu-conf--18025-.exe /tmp/qemu-conf--18025-.c -m32 -g -L/home/cauchy/w32/lib
i686-w64-mingw32-gcc -m32 -D__USE_MINGW_ANSI_STDIO=1 -DWIN32_LEAN_AND_MEAN -DWINVER=0x501
Re: [Qemu-devel] [libvirt]virtio serial device problem
On 2013-05-08 23:53, fred.kon...@greensocs.com wrote: On 05/07/2013 07:50 PM, Paolo Bonzini wrote: On 07/05/2013 09:20, Li Zhang wrote: Hi all, Hi, When we use the latest version of QEMU to build ovirt, we get this error reported from libvirt. What QEMU commit is this? b3e6d591b05538056d665572f3e3bbfb3cbb70e7 This commit is from 05/29, no? There were issues with that. But it should be fixed. Do you still have the command-line issue with the latest git? See commit 80270a19685dd20eda017b0360c743b3e3ed6f57 Hi Fred, This patch is to change the bus so that it can be compatible with the old version, right? But I saw that the current name is still different from the old version. The current name is: virtio-serial-bus0.0 The old version is: virtio-serial0.0 Is it possible to change it back to the old name? Thanks. :) --Li Thanks, Fred It might have been fixed already. Hm. From what I see, it is all correct from the qemu side; the problem is in libvirt, which does not know about virtio-pci-bus yet. Paolo qemu-system-ppc64: -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm: Bus 'virtio-serial0.0' is full qemu-system-ppc64: -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm: Bus 'virtio-serial0.0' not found Libvirt helps create the QEMU command line and puts the virtserialport device on bus virtio-serial0.0. For the latest version of QEMU, the bus type has changed.
(qemu) info qtree bus: main-system-bus type System dev: spapr-pci-host-bridge, id index = 0 buid = 0x8002000 liobn = 0x8000 mem_win_addr = 0x100a000 mem_win_size = 0x2000 io_win_addr = 0x1008000 io_win_size = 0x1 msi_win_addr = 0x1009000 irq 0 bus: pci type PCI dev: virtio-serial-pci, id virtio-serial0 ioeventfd = on vectors = 2 class = 0x780 indirect_desc = on event_idx = on max_ports = 31 addr = 03.0 romfile = null rombar = 1 multifunction = off command_serr_enable = on class Class 0780, addr 00:03.0, pci id 1af4:1003 (sub 1af4:0003) bar 0: i/o at 0x [0x1e] bar 1: mem at 0x [0xffe] bus: virtio-serial0.0 type virtio-pci-bus dev: virtio-serial-device, id max_ports = 31 bus: virtio-serial-bus.0 type virtio-serial-bus dev: virtserialport, id channel1 chardev = charchannel1 nr = 2 name = org.qemu.guest_agent.0 port 2, guest off, host off, throttle off dev: virtserialport, id channel0 chardev = charchannel0 nr = 1 name = com.redhat.rhevm.vdsm port 1, guest off, host off, throttle off But we tried to replace virtio-serial0.0 with virtio-serial-bus.0, SLOF crashes. It still doesn't work at all. Does anyone know how to use virtserialport in QEMU command line? If configuration is changed in QEMU, libvirt also needs to change it accordingly. Thanks. :) --Li -- Alexey Kardashevskiy IBM OzLabs, LTC Team e-mail: a...@au1.ibm.com notes: Alexey Kardashevskiy/Australia/IBM
[Qemu-devel] [Bug 1178107] [NEW] qemu-system-*.exe -cpu ? (or -M ?) exit silently
Public bug reported: For example, 'qemu-system-arm -cpu ?' on a Linux host gives me the available CPU list: Available CPUs: arm1026 arm1136 arm1136-r2 ... But on a Windows host, I got nothing: C:\opt\qemu-1.5.0-rc0-win64> qemu-system-arm -cpu ? C:\opt\qemu-1.5.0-rc0-win64> echo %ERRORLEVEL% 0 ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1178107 Title: qemu-system-*.exe -cpu ? (or -M ?) exit silently Status in QEMU: New Bug description: For example, 'qemu-system-arm -cpu ?' on a Linux host gives me the available CPU list: Available CPUs: arm1026 arm1136 arm1136-r2 ... But on a Windows host, I got nothing: C:\opt\qemu-1.5.0-rc0-win64> qemu-system-arm -cpu ? C:\opt\qemu-1.5.0-rc0-win64> echo %ERRORLEVEL% 0 To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1178107/+subscriptions
Re: [Qemu-devel] [Qemu-trivial] [PATCH 2/2] remove needless semicolon
08.05.2013 17:25, Anthony Liguori wrote: Michael Tokarev m...@tls.msk.ru writes: 08.05.2013 13:46, Trival wrote: Signed-off-by: Trival triv...@linux.vnet.ibm.com Something went wrong in sending this. This is not a valid SoB. So, do we not accept it? Should I revert it in the trivial-patches-next ? Thanks, /mjt
Re: [Qemu-devel] [PATCH V18 5/6] add-cow file format core code.
On Thu, Apr 18, 2013 at 6:03 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Wed, Apr 10, 2013 at 04:11:52PM +0800, Dong Xu Wang wrote:

+header.cluster_bits = ffs(cluster_size) - 1;
+if (header.cluster_bits < MIN_CLUSTER_BITS ||
+    header.cluster_bits > MAX_CLUSTER_BITS ||
+    (1 << header.cluster_bits) != cluster_size) {
+    error_report(
+        "Cluster size must be a power of two between %d and %dk",
+        1 << MIN_CLUSTER_BITS, 1 << (MAX_CLUSTER_BITS - 10));
+    return -EINVAL;
+}
+
+ header.header_size = MAX(cluster_size, DEFAULT_HEADER_SIZE);

Indentation.

+if (backing_filename) {
+    header.backing_offset = sizeof(header);
+    header.backing_size = strlen(backing_filename);
+
+    if (!backing_fmt) {
+        backing_bs = bdrv_new("image");
+        ret = bdrv_open(backing_bs, backing_filename, NULL,
+                        BDRV_O_RDWR | BDRV_O_CACHE_WB, NULL);
+        if (ret < 0) {
+            return ret;

backing_bs is leaked.

+ret = bdrv_file_open(&bs, filename, NULL, BDRV_O_RDWR);
+if (ret < 0) {
+    return ret;
+}
+snprintf(header.backing_fmt, sizeof(header.backing_fmt), "%s",
+         backing_fmt ? backing_fmt : "");
+snprintf(header.image_fmt, sizeof(header.image_fmt), "%s",
+         image_format ? image_format : "raw");

snprintf() doesn't have the semantics in the add-cow specification:

44 - 59: backing file format
    Format of backing file. It will be filled with 0 if backing file name offset is 0. If backing file name offset is non-empty, it must be non-empty. It is coded in free-form ASCII, and is not NUL-terminated. Zero padded on the right.
60 - 75: image file format
    Format of image file. It must be non-empty. It is coded in free-form ASCII, and is not NUL-terminated. Zero padded on the right.

strncpy() does the zero padding and doesn't NUL-terminate if the max buffer size is used.

+if ((s->header.compat_features & ACOW_F_ALL_ALLOCATED) == 0) {
+    snprintf(bs->backing_format, sizeof(bs->backing_format),
+             "%s", s->header.backing_fmt);

s->header.backing_fmt is not NUL-terminated, so using snprintf() is inappropriate (could it read beyond the end of .backing_fmt?).

+}
+
+if (s->header.cluster_bits < MIN_CLUSTER_BITS ||
+    s->header.cluster_bits > MAX_CLUSTER_BITS) {
+    ret = -EINVAL;
+    goto fail;
+}
+
+s->cluster_size = 1 << s->header.cluster_bits;
+if (s->header.header_size != MAX(s->cluster_size, DEFAULT_HEADER_SIZE)) {
+    char buf[64];
+    snprintf(buf, sizeof(buf), "Header size: %d",

%u or PRIu32 since header_size is uint32_t. This avoids compiler or code scanner warnings.

+s->image_hd = bdrv_new("");
+ret = bdrv_open(s->image_hd, image_filename, NULL, flags,
+                bdrv_find_format(s->header.image_fmt));

Cannot use image_fmt as a string since it is not NUL-terminated.

+static coroutine_fn int add_cow_co_writev(BlockDriverState *bs,
+                                          int64_t sector_num,
+                                          int remaining_sectors,
+                                          QEMUIOVector *qiov)
+{
+    BDRVAddCowState *s = bs->opaque;
+    int ret = 0, i;
+    QEMUIOVector hd_qiov;
+    uint8_t *table;
+    uint64_t offset;
+    int mask = s->cluster_sectors - 1;
+    int cluster_mask = s->cluster_size - 1;
+
+    qemu_co_mutex_lock(&s->lock);
+    qemu_iovec_init(&hd_qiov, qiov->niov);
+    ret = bdrv_co_writev(s->image_hd, sector_num,
+                         remaining_sectors, qiov);

All writes are serialized. This means write performance will be very poor for multi-threaded workloads. qcow2 tracks allocating writes and allows them to execute at the same time if they do not overlap clusters.

+
+    if (ret < 0) {
+        goto fail;
+    }
+    if ((s->header.compat_features & ACOW_F_ALL_ALLOCATED) == 0) {
+        /* Copy content of unmodified sectors */
+        if (!is_cluster_head(sector_num, s->cluster_sectors) &&
+            !is_allocated(bs, sector_num)) {
+            ret = copy_sectors(bs, sector_num & ~mask, sector_num);
+            if (ret < 0) {
+                goto fail;
+            }
+        }
+
+        if (!is_cluster_tail(sector_num + remaining_sectors - 1,
+                             s->cluster_sectors) &&
+            !is_allocated(bs, sector_num + remaining_sectors - 1)) {
+            ret = copy_sectors(bs, sector_num + remaining_sectors,
+                               ((sector_num + remaining_sectors) | mask) + 1);
+            if (ret < 0) {
+                goto fail;
+            }
+        }

This trashes the cluster when remaining_sectors = 0, sector_num = cluster_sectors, and sector cluster_sectors - 1 is
Re: [Qemu-devel] [PATCH v5 0/5] KVM flash memory support
On 09/05/2013 00:44, Jordan Justen wrote: git://github.com/jljusten/qemu.git kvm-flash-v5 Utilize KVM_CAP_READONLY_MEM to support PC system flash emulation with KVM. v5: * Remove patch to pflash_cfi01 which enabled readonly mode * Adjust kvm code to use KVM READONLY support for ranges that either have the readonly flag set, or for devices with readable set. v4: * With a machine type of isapc, don't mark the BIOS as read-only. isapc + seabios will not boot if the BIOS is read-only. This matches the current behavior of isapc with KVM, which is the only mode under which isapc currently works. v3: * Squash patches 2 & 3 based on Xiao's feedback that what I was calling a 'workaround' in patch 3 was actually what is required by the KVM READONLY memory support. v2: * Remove rom_only from PC_COMPAT_1_4 * Only enable flash when a pflash drive is created. Jordan Justen (5): isapc: Fix non-KVM qemu boot (read/write memory for isapc BIOS) kvm: add kvm_readonly_mem_enabled kvm: support using KVM_MEM_READONLY flag for regions pc_sysfw: allow flash (-pflash) memory to be used with KVM pc_sysfw: change rom_only default to 0 hw/block/pc_sysfw.c | 64 -- hw/i386/pc_piix.c|5 include/hw/i386/pc.h |4 include/sysemu/kvm.h | 10 kvm-all.c| 44 +++--- kvm-stub.c |1 + 6 files changed, 92 insertions(+), 36 deletions(-) Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Re: [Qemu-devel] [libvirt]virtio serial device problem
On 05/09/2013 04:07 PM, Li Zhang wrote: On 2013-05-08 23:53, fred.kon...@greensocs.com wrote: On 05/07/2013 07:50 PM, Paolo Bonzini wrote: On 07/05/2013 09:20, Li Zhang wrote: Hi all, Hi, When we use the latest version of QEMU to build ovirt, we get this error reported from libvirt. What QEMU commit is this? b3e6d591b05538056d665572f3e3bbfb3cbb70e7 This commit is from 05/29, no? There were issues with that. But it should be fixed. Do you still have the command-line issue with the latest git? See commit 80270a19685dd20eda017b0360c743b3e3ed6f57 Hi Fred, This patch is to change the bus so that it can be compatible with the old version, right? But I saw that the current name is still different from the old version. The current name is: virtio-serial-bus0.0 The old version is: virtio-serial0.0 Is it possible to change it back to the old name? This is what the most recent qemu produces: bus: pci type PCI dev: virtio-serial-pci, id virtio-serial0 ioeventfd = on vectors = 2 class = 0x780 indirect_desc = on event_idx = on max_ports = 31 addr = 03.0 romfile = null rombar = 1 multifunction = off command_serr_enable = on class Class 0780, addr 00:03.0, pci id 1af4:1003 (sub 1af4:0003) bar 0: i/o at 0x [0x1e] bar 1: mem at 0x [0xffe] bus: virtio-bus type virtio-pci-bus dev: virtio-serial-device, id max_ports = 31 bus: virtio-serial0.0 type virtio-serial-bus dev: virtserialport, id channel0 chardev = charchannel0 nr = 1 name = com.redhat.rhevm.vdsm port 1, guest off, host off, throttle off The device layout is new, the bus name is old - virtio-serial0.0, everything should be ok now. Thanks. :) --Li Thanks, Fred It might have been fixed already. Hm. From what I see, it is all correct from the qemu side; the problem is in libvirt, which does not know about virtio-pci-bus yet.
Paolo qemu-system-ppc64: -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm: Bus 'virtio-serial0.0' is full qemu-system-ppc64: -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm: Bus 'virtio-serial0.0' not found Libvirt helps create QEMU command line and put virtserialport device to bus virtio-serial0.0. For latest version of QEMU, the bus type is changed. (qemu) info qtree bus: main-system-bus type System dev: spapr-pci-host-bridge, id index = 0 buid = 0x8002000 liobn = 0x8000 mem_win_addr = 0x100a000 mem_win_size = 0x2000 io_win_addr = 0x1008000 io_win_size = 0x1 msi_win_addr = 0x1009000 irq 0 bus: pci type PCI dev: virtio-serial-pci, id virtio-serial0 ioeventfd = on vectors = 2 class = 0x780 indirect_desc = on event_idx = on max_ports = 31 addr = 03.0 romfile = null rombar = 1 multifunction = off command_serr_enable = on class Class 0780, addr 00:03.0, pci id 1af4:1003 (sub 1af4:0003) bar 0: i/o at 0x [0x1e] bar 1: mem at 0x [0xffe] bus: virtio-serial0.0 type virtio-pci-bus dev: virtio-serial-device, id max_ports = 31 bus: virtio-serial-bus.0 type virtio-serial-bus dev: virtserialport, id channel1 chardev = charchannel1 nr = 2 name = org.qemu.guest_agent.0 port 2, guest off, host off, throttle off dev: virtserialport, id channel0 chardev = charchannel0 nr = 1 name = com.redhat.rhevm.vdsm port 1, guest off, host off, throttle off But we tried to replace virtio-serial0.0 with virtio-serial-bus.0, SLOF crashes. It still doesn't work at all. Does anyone know how to use virtserialport in QEMU command line? If configuration is changed in QEMU, libvirt also needs to change it accordingly. Thanks. :) --Li -- Alexey Kardashevskiy IBM OzLabs, LTC Team e-mail: a...@au1.ibm.com notes: Alexey Kardashevskiy/Australia/IBM -- Alexey Kardashevskiy IBM OzLabs, LTC Team e-mail: a...@au1.ibm.com notes: Alexey Kardashevskiy/Australia/IBM
Re: [Qemu-devel] [PATCH] curl: fix curl read
On Fri, May 03, 2013 at 04:00:09PM +0800, Fam Zheng wrote:

@@ -391,7 +427,12 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_SIZE,
.help = "Readahead size",
},
-{ /* end of list */ }
+{
+.name = "ssl_no_cert",
+.type = QEMU_OPT_BOOL,
+.help = "SSL certificate check",
+},

This new option should be in a separate patch.

+if (!strncmp(s->url, "http://", strlen("http://")) && !s->accept_range) {
+strncpy(state->errmsg, "Server not supporting range.", CURL_ERROR_SIZE);
+goto out;
+}

This check is unrelated to the API change and should be in a separate patch.

 s->multi = curl_multi_init();
-curl_multi_setopt( s->multi, CURLMOPT_SOCKETDATA, s);
-curl_multi_setopt( s->multi, CURLMOPT_SOCKETFUNCTION, curl_sock_cb );
-curl_multi_do(s);
+if (!s->multi) {
+goto out_noclean;
+}
+curl_multi_setopt(s->multi, CURLMOPT_SOCKETDATA, s);
+curl_multi_setopt(s->multi, CURLMOPT_SOCKETFUNCTION, curl_sock_cb);
+curl_multi_setopt(s->multi, CURLMOPT_TIMERDATA, s);
+curl_multi_setopt(s->multi, CURLMOPT_TIMERFUNCTION, curl_multi_timer_cb);
+curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);

The timeout should be added in a separate patch.

+cache = curl_find_cache(s, aio_base, aio_bytes);
+if (cache) {
+curl_complete_io(s, acb, cache);
+return;
 }

What is the point of the cache? Can you split it into a separate patch?

+/* Try to release some cache */
+while (0 s->cache_quota = 0) {

while 0?
Re: [Qemu-devel] [ANNOUNCE] QEMU 1.5.0-rc1 is now available
On Thu, May 9, 2013 at 6:29 AM, Anthony Liguori aligu...@us.ibm.com wrote: Hi, On behalf of the QEMU Team, I'd like to announce the availability of the second release candidate for the QEMU 1.5 release. This release is meant for testing purposes and should not be used in a production environment. http://wiki.qemu.org/download/qemu-1.5.0-rc1.tar.bz2 You can help improve the quality of the QEMU 1.5 release by testing this release and reporting bugs on Launchpad: https://bugs.launchpad.net/qemu/ The release plan for the 1.5 release is available at: http://wiki.qemu.org/Planning/1.5 Please add entries to the ChangeLog for the 1.5 release below: http://wiki.qemu.org/ChangeLog/Next The following changes have been made since 1.5.0-rc0: - virtio: properly validate address before accessing config (Jason Wang) - virtio-pci: fix level interrupts (Michael S. Tsirkin) - PPC: Fix rldcl (Alexander Graf) - PPC: Depend behavior of cmp instructions only on instruction encoding (Alexander Graf) - target-mips: fix incorrect behaviour for INSV (Petar Jovanovic) - target-mips: add missing check_dspr2 for multiply instructions (Petar Jovanovic) - qemu-iotests: fix 017 018 for vmdk (Fam Zheng) - qemu-iotests: exclude vmdk and qcow from 043 (Fam Zheng) - qemu-iotests: exclude vmdk for test 042 (Fam Zheng) - qtest/ide-test: Test short and long PRDTs (Kevin Wolf) - qtest/ide-test: Add simple DMA read/write test case (Kevin Wolf) - qtest: Add IDE test case (Kevin Wolf) - libqos/pci: Enable bus mastering (Kevin Wolf) - ide: Reset BMIDEA bit when the bus master is stopped (Kevin Wolf) - de_DE.po: Add missing leading spaces (Kevin Wolf) - ahci: Don't allow creating slave drives (Kevin Wolf) Regards, Anthony Liguori For convenience, I have uploaded qemu-build-dependency-r1.zip, qemu-1.5.0-rc1-win32.7z and qemu-1.5.0-rc1-win64.7z: https://code.google.com/p/i18n-zh/downloads/list Regards, Dongsheng
Re: [Qemu-devel] [Bug 1176366] [NEW] TCPIP not working on qemu 1.4.50 (master)
On Sat, May 04, 2013 at 04:13:19PM -, TC1988 wrote: Whenever I try to enable TCP/IP in the guest OS (in this case NT 3.1), it crashes the whole emulator. It still crashes with either the ne2000 ISA, ne2000 PCI, or PCnet card; a screenshot is attached below. Please use git-bisect(1) to identify the commit that broke networking. http://git-scm.com/book/en/Git-Tools-Debugging-with-Git#Binary-Search https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html Stefan
Re: [Qemu-devel] [PATCH for-1.5] virtio-pci: bugfix
On Mon, May 06, 2013 at 06:00:27PM +0300, Michael S. Tsirkin wrote: mask notifiers are never called without msix, so devices with backend masking like vhost don't work. Call mask notifiers explicitly at startup/cleanup to make it work. Signed-off-by: Michael S. Tsirkin m...@redhat.com Tested-by: Alexander Graf ag...@suse.de --- hw/virtio/virtio-pci.c | 4 1 file changed, 4 insertions(+) Please choose a descriptive commit message, not just bugfix. Stefan
Re: [Qemu-devel] [PATCH] curl: fix curl read
On Thu, 05/09 08:41, Stefan Hajnoczi wrote: On Fri, May 03, 2013 at 04:00:09PM +0800, Fam Zheng wrote:

@@ -391,7 +427,12 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_SIZE,
.help = "Readahead size",
},
-{ /* end of list */ }
+{
+.name = "ssl_no_cert",
+.type = QEMU_OPT_BOOL,
+.help = "SSL certificate check",
+},

This new option should be in a separate patch.

OK. I'll try to split them into patches.

+if (!strncmp(s->url, "http://", strlen("http://")) && !s->accept_range) {
+strncpy(state->errmsg, "Server not supporting range.", CURL_ERROR_SIZE);
+goto out;
+}

This check is unrelated to the API change and should be in a separate patch.

 s->multi = curl_multi_init();
-curl_multi_setopt( s->multi, CURLMOPT_SOCKETDATA, s);
-curl_multi_setopt( s->multi, CURLMOPT_SOCKETFUNCTION, curl_sock_cb );
-curl_multi_do(s);
+if (!s->multi) {
+goto out_noclean;
+}
+curl_multi_setopt(s->multi, CURLMOPT_SOCKETDATA, s);
+curl_multi_setopt(s->multi, CURLMOPT_SOCKETFUNCTION, curl_sock_cb);
+curl_multi_setopt(s->multi, CURLMOPT_TIMERDATA, s);
+curl_multi_setopt(s->multi, CURLMOPT_TIMERFUNCTION, curl_multi_timer_cb);
+curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);

The timeout should be added in a separate patch.

+cache = curl_find_cache(s, aio_base, aio_bytes);
+if (cache) {
+curl_complete_io(s, acb, cache);
+return;
 }

What is the point of the cache? Can you split it into a separate patch?

The cache is for prefetch. Data is fetched in 256k chunks using libcurl and stored in the cache to serve future I/O requests, reducing overall HTTP request overhead.

-- Fam
Re: [Qemu-devel] [Qemu-trivial] [PATCH 2/2] remove needless semicolon
On Thu, May 09, 2013 at 10:23:09AM +0400, Michael Tokarev wrote: 08.05.2013 17:25, Anthony Liguori wrote: Michael Tokarev m...@tls.msk.ru writes: 08.05.2013 13:46, Trival wrote: Signed-off-by: Trival triv...@linux.vnet.ibm.com I think wdongxu probably wants HIS name and email go here. wdongxu, can you resend your patches with: Signed-off-by: your real name wdon...@linux.vnet.ibm.com ? Something went wrong in sending this. This is not a valid SoB. So, do we not accept it? Should I revert it in the trivial-patches-next ? Thanks, /mjt
Re: [Qemu-devel] [libvirt]virtio serial device problem
On 2013-05-09 14:31, Alexey Kardashevskiy wrote: On 05/09/2013 04:07 PM, Li Zhang wrote: On 2013-05-08 23:53, fred.kon...@greensocs.com wrote: On 05/07/2013 07:50 PM, Paolo Bonzini wrote: On 07/05/2013 09:20, Li Zhang wrote: Hi all, Hi, When we use the latest version of QEMU to build ovirt, we get this error reported from libvirt. What QEMU commit is this? b3e6d591b05538056d665572f3e3bbfb3cbb70e7 This commit is from 05/29, no? There were issues with that. But it should be fixed. Do you still have the command-line issue with the latest git? See commit 80270a19685dd20eda017b0360c743b3e3ed6f57 Hi Fred, This patch is to change the bus so that it can be compatible with the old version, right? But I saw that the current name is still different from the old version. The current name is: virtio-serial-bus0.0 The old version is: virtio-serial0.0 Is it possible to change it back to the old name? This is what the most recent qemu produces: bus: pci type PCI dev: virtio-serial-pci, id virtio-serial0 ioeventfd = on vectors = 2 class = 0x780 indirect_desc = on event_idx = on max_ports = 31 addr = 03.0 romfile = null rombar = 1 multifunction = off command_serr_enable = on class Class 0780, addr 00:03.0, pci id 1af4:1003 (sub 1af4:0003) bar 0: i/o at 0x [0x1e] bar 1: mem at 0x [0xffe] bus: virtio-bus type virtio-pci-bus dev: virtio-serial-device, id max_ports = 31 bus: virtio-serial0.0 type virtio-serial-bus dev: virtserialport, id channel0 chardev = charchannel0 nr = 1 name = com.redhat.rhevm.vdsm port 1, guest off, host off, throttle off The device layout is new, the bus name is old - virtio-serial0.0, everything should be ok now. Alexey, thanks. It seems that both of them can be recognized. :) Thanks. :) --Li Thanks, Fred It might have been fixed already. Hm. From what I see, it is all correct from the qemu side; the problem is in libvirt, which does not know about virtio-pci-bus yet.
Paolo qemu-system-ppc64: -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm: Bus 'virtio-serial0.0' is full qemu-system-ppc64: -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm: Bus 'virtio-serial0.0' not found Libvirt helps create QEMU command line and put virtserialport device to bus virtio-serial0.0. For latest version of QEMU, the bus type is changed. (qemu) info qtree bus: main-system-bus type System dev: spapr-pci-host-bridge, id index = 0 buid = 0x8002000 liobn = 0x8000 mem_win_addr = 0x100a000 mem_win_size = 0x2000 io_win_addr = 0x1008000 io_win_size = 0x1 msi_win_addr = 0x1009000 irq 0 bus: pci type PCI dev: virtio-serial-pci, id virtio-serial0 ioeventfd = on vectors = 2 class = 0x780 indirect_desc = on event_idx = on max_ports = 31 addr = 03.0 romfile = null rombar = 1 multifunction = off command_serr_enable = on class Class 0780, addr 00:03.0, pci id 1af4:1003 (sub 1af4:0003) bar 0: i/o at 0x [0x1e] bar 1: mem at 0x [0xffe] bus: virtio-serial0.0 type virtio-pci-bus dev: virtio-serial-device, id max_ports = 31 bus: virtio-serial-bus.0 type virtio-serial-bus dev: virtserialport, id channel1 chardev = charchannel1 nr = 2 name = org.qemu.guest_agent.0 port 2, guest off, host off, throttle off dev: virtserialport, id channel0 chardev = charchannel0 nr = 1 name = com.redhat.rhevm.vdsm port 1, guest off, host off, throttle off But we tried to replace virtio-serial0.0 with virtio-serial-bus.0, SLOF crashes. It still doesn't work at all. Does anyone know how to use virtserialport in QEMU command line? If configuration is changed in QEMU, libvirt also needs to change it accordingly. Thanks. :) --Li -- Alexey Kardashevskiy IBM OzLabs, LTC Team e-mail: a...@au1.ibm.com notes: Alexey Kardashevskiy/Australia/IBM
Re: [Qemu-devel] [Qemu-trivial] [PATCH 2/2] remove needless semicolon
On 9 May 2013 08:08, Hu Tao hu...@cn.fujitsu.com wrote: On Thu, May 09, 2013 at 10:23:09AM +0400, Michael Tokarev wrote: 08.05.2013 17:25, Anthony Liguori wrote: Michael Tokarev m...@tls.msk.ru writes: 08.05.2013 13:46, Trival wrote: Signed-off-by: Trival triv...@linux.vnet.ibm.com I think wdongxu probably wants HIS name and email go here. wdongxu, can you resend your patches with: Signed-off-by: your real name wdon...@linux.vnet.ibm.com ? You should make sure you fix the From: line as well as the Signed-off-by:, please. thanks -- PMM
Re: [Qemu-devel] Few Qs related to using qemu-nbd for exporting snapshot
On Wed, May 08, 2013 at 10:30:29AM +0530, Deepak C Shetty wrote: I am looking at using qemu-nbd to export an existing snapshot to a Backup virtual appliance (VA) and had the following Qs in that context...

1) Exporting an image using a unix socket (-k option) vs using --connect=/dev/nbd0: which one is better / preferred? The idea being that once exported, the disk will be hotplugged into the Backup VA, which will then use it as a src for the backup operation. --connect=/dev/nbd0 requires that the host kernel ships the nbd.ko driver. RHEL doesn't support nbd.ko; other distros may or may not. Why are you using qemu-nbd at all if the backup appliance is on the same host? Would it be possible to attach the image file directly to the virtual appliance?

2) If I use the --connect option, it requires the nbd kernel module to be loaded, and I also get a nice --disconnect way of terminating the nbd export. Using the -k option, qemu-nbd doesn't need the nbd kernel module, but the cmdline blocks and sending SIGTERM is the only way to terminate the nbd export... Is there a better way to terminate qemu-nbd when using the -k option? If not, I assume SIGTERM would gracefully shut down qemu-nbd? SIGTERM stops qemu-nbd properly.

3) I was not able to get qemu-nbd running without the -t (persistent) option. Ideally, running without -t would be suited, since there is only one client/connection per nbd export and the nbd server would nicely shut itself down... but I get the below error. Server: $ qemu-nbd -k /tmp/nbds1 ./debian_lenny_i386_standard.qcow2 Client: $ qemu-system-x86_64 -m 1024 -drive file=nbd:unix:/tmp/nbds1 connect(unix:/tmp/nbds1): No such file or directory qemu-system-x86_64: -drive file=nbd:unix:/tmp/nbds1: could not open disk image nbd:unix:/tmp/nbds1: No such file or directory nbds1 is indeed there... srwxrwxr-x. 1 dpkshetty dpkshetty 0 May 3 14:24 /tmp/nbds1 and then the server returns to the shell.. echo $? is 0 Try -drive file=nbd:...,format=raw to prevent format probing.
Recent QEMU versions avoid reopening the image file here. Even this may not help if QEMU probes the disk geometry by opening and inspecting the MBR. Kevin can explain the current state of probing better than me. Stefan
Re: [Qemu-devel] qemu-img problem when create a file larger than fs's size
On Wed, May 08, 2013 at 01:18:17PM +0800, yuxh wrote: I have to consult you about a qemu-img problem. Is it reasonable to create a file which is larger than the available size of the fs with the qemu-img cmd? When I use qemu-img to create a file which is larger than the available size of the fs, the creation completes successfully. However when I use this file in a guest as the guest's disk, and write beyond the size the host file can provide, the guest is paused by qemu-kvm or libvirt, seemingly in an infinite loop where the guest just can't be used unless I detach the disk from the guest or destroy the guest. You can change the ENOSPC policy with -drive werror=,rerror=. See the QEMU man page. The default behavior is to pause the guest so the host admin can free up or add space. Then the guest can be continued - this will retry the I/O. But you can also tell QEMU to pass the error through to the guest using the -drive werror=,rerror= options. Stefan
Re: [Qemu-devel] [Qemu-trivial] [PATCH 2/2] remove needless semicolon
On 2013/5/9 15:08, Hu Tao wrote: On Thu, May 09, 2013 at 10:23:09AM +0400, Michael Tokarev wrote: 08.05.2013 17:25, Anthony Liguori wrote: Michael Tokarev m...@tls.msk.ru writes: 08.05.2013 13:46, Trival wrote: Signed-off-by: Trival triv...@linux.vnet.ibm.com I think wdongxu probably wants HIS name and email go here. wdongxu, can you resend your patches with: Sorry for the inconvenience, will re-send the patches. Signed-off-by: your real name wdon...@linux.vnet.ibm.com ? Something went wrong in sending this. This is not a valid SoB. So, do we not accept it? Should I revert it in the trivial-patches-next ? Thanks, /mjt
[Qemu-devel] [PATCH 1/2] clean unnecessary code: don't check g_strdup arg for NULL
Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com --- util/uri.c | 2 +- vl.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/util/uri.c b/util/uri.c index 4238729..e348c17 100644 --- a/util/uri.c +++ b/util/uri.c @@ -2162,7 +2162,7 @@ query_params_append (struct QueryParams *ps, } ps->p[ps->n].name = g_strdup(name); -ps->p[ps->n].value = value ? g_strdup(value) : NULL; +ps->p[ps->n].value = g_strdup(value); ps->p[ps->n].ignore = 0; ps->n++; diff --git a/vl.c b/vl.c index 6e6225f..be0a93c 100644 --- a/vl.c +++ b/vl.c @@ -1215,7 +1215,7 @@ void add_boot_device_path(int32_t bootindex, DeviceState *dev, node = g_malloc0(sizeof(FWBootEntry)); node->bootindex = bootindex; -node->suffix = suffix ? g_strdup(suffix) : NULL; +node->suffix = g_strdup(suffix); node->dev = dev; QTAILQ_FOREACH(i, &fw_boot_order, link) { -- 1.7.11.7
[Qemu-devel] [PATCH 2/2] remove double semicolons
Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com --- block/nbd.c | 2 +- fsdev/virtfs-proxy-helper.c | 4 ++-- hw/9pfs/virtio-9p-local.c | 2 +- hw/i386/pc_q35.c| 2 +- hw/intc/imx_avic.c | 2 +- hw/usb/host-linux.c | 4 ++-- qga/channel-win32.c | 2 +- 7 files changed, 9 insertions(+), 9 deletions(-) diff --git a/block/nbd.c b/block/nbd.c index fab114b..30e3b78 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -609,7 +609,7 @@ static int nbd_co_discard(BlockDriverState *bs, int64_t sector_num, return 0; } request.type = NBD_CMD_TRIM; -request.from = sector_num * 512;; +request.from = sector_num * 512; request.len = nb_sectors * 512; nbd_coroutine_start(s, request); diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c index 36f6616..713a7b2 100644 --- a/fsdev/virtfs-proxy-helper.c +++ b/fsdev/virtfs-proxy-helper.c @@ -248,7 +248,7 @@ static int send_fd(int sockfd, int fd) static int send_status(int sockfd, struct iovec *iovec, int status) { ProxyHeader header; -int retval, msg_size;; +int retval, msg_size; if (status < 0) { header.type = T_ERROR; @@ -381,7 +381,7 @@ static int send_response(int sock, struct iovec *iovec, int size) proxy_marshal(iovec, 0, "dd", header.type, header.size); retval = socket_write(sock, iovec->iov_base, header.size + PROXY_HDR_SZ); if (retval < 0) { -return retval;; +return retval; } return 0; } diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/virtio-9p-local.c index be898ec..6ece6f7 100644 --- a/hw/9pfs/virtio-9p-local.c +++ b/hw/9pfs/virtio-9p-local.c @@ -878,7 +878,7 @@ static int local_remove(FsContext *ctx, const char *path) * Now remove the name from parent directory * .virtfs_metadata directory */ -err = remove(local_mapped_attr_path(ctx, path, buffer));; +err = remove(local_mapped_attr_path(ctx, path, buffer)); if (err < 0 && errno != ENOENT) { /* * We didn't had the .virtfs_metadata file.
May be file created diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 4160e2b..6825380 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -128,7 +128,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args) q35_host->mch.ram_memory = ram_memory; q35_host->mch.pci_address_space = pci_memory; q35_host->mch.system_memory = get_system_memory(); -q35_host->mch.address_space_io = get_system_io();; +q35_host->mch.address_space_io = get_system_io(); q35_host->mch.below_4g_mem_size = below_4g_mem_size; q35_host->mch.above_4g_mem_size = above_4g_mem_size; /* pci */ diff --git a/hw/intc/imx_avic.c b/hw/intc/imx_avic.c index 4e280b6..ff45dcd 100644 --- a/hw/intc/imx_avic.c +++ b/hw/intc/imx_avic.c @@ -370,7 +370,7 @@ static void imx_avic_reset(DeviceState *dev) static int imx_avic_init(SysBusDevice *dev) { -IMXAVICState *s = FROM_SYSBUS(IMXAVICState, dev);; +IMXAVICState *s = FROM_SYSBUS(IMXAVICState, dev); memory_region_init_io(&s->iomem, &imx_avic_ops, s, "imx_avic", 0x1000); sysbus_init_mmio(dev, &s->iomem); diff --git a/hw/usb/host-linux.c b/hw/usb/host-linux.c index 8994668..ca09a89 100644 --- a/hw/usb/host-linux.c +++ b/hw/usb/host-linux.c @@ -651,7 +651,7 @@ static void usb_host_handle_reset(USBDevice *dev) trace_usb_host_reset(s->bus_num, s->addr); -usb_host_do_reset(s);; +usb_host_do_reset(s); usb_host_claim_interfaces(s, 0); usb_linux_update_endp_table(s); @@ -1429,7 +1429,7 @@ static void usb_host_exit_notifier(struct Notifier *n, void *data) usb_host_release_port(s); if (s->fd != -1) { -usb_host_do_reset(s);; +usb_host_do_reset(s); } } diff --git a/qga/channel-win32.c b/qga/channel-win32.c index 7ed98d7..8a303f3 100644 --- a/qga/channel-win32.c +++ b/qga/channel-win32.c @@ -268,7 +268,7 @@ static GIOStatus ga_channel_write(GAChannel *c, const char *buf, size_t size, GIOStatus ga_channel_write_all(GAChannel *c, const char *buf, size_t size) { -GIOStatus status = G_IO_STATUS_NORMAL;; +GIOStatus status = G_IO_STATUS_NORMAL; size_t count; while (size) { -- 1.7.11.7
Re: [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb()
[Rehashing a relatively old thread. It is available, in particular, at http://thread.gmane.org/gmane.comp.emulators.qemu/196629] 22.02.2013 22:09, Peter Maydell wrote: This patch series gets rid of cpu_unlink_tb(), which is irredeemably racy, since it modifies the TB graph with no locking from other threads, signal handlers, etc etc. (The signal handler case is why you can't just fix this with more locks.) Instead we take the much simpler approach of setting a flag for the CPU when we want it to stop executing TBs, and generate code to check the flag at the start of every TB. The raciness is easiest to provoke with multithreaded linux-user guests but it is I think also a risk in system emulation mode. This fixes the crashes seen in LP:668799; however there are another class of crashes described in LP:1098729 which stem from the fact that in linux-user with a multithreaded guest all threads will use and modify the same global TCG data structures (including the generated code buffer) without any kind of locking. This means that multithreaded guest binaries are still in the unsupported category. Now that Debian Wheezy has been released with qemu 1.2, users have started to file bug reports about multithreaded apps crashing. So, while I understand these are unsupported as per above, I think it is better to fix as much as we can, and allow some more use cases, than to document that it does not work. (And yes, I also understand it's better to fix that before a release, but oh well). So I tried to backport the series to the 1.2 branch, to see how it goes. But there were many changes since 1.2, so it ended up requiring quite some changes -- not exactly huge and complicated, but I need some help or an additional pair (or two) of eyes to ensure the sanity of the resulting code. The backported patchset is smaller than the original.
Andreas Färber (1): cpu: Introduce ENV_OFFSET macros This change isn't needed in 1.2, because all CPU state is in a single place there; it hasn't been split between env and cpu states. Peter Maydell (5): tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses Just a minor context difference in the header, no questions. cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC This needed a back merge of env+cpu states back to cpu. Maybe we should somehow revisit the whole concept of the two, because it's sorta fun: at some point all functions have been converted to accept `cpu' instead of `env', but in many places the first thing a function does is to get the `env' pointer out of `cpu'. The backport of this change reverts this piggy-backing and uses `env' everywhere consistently again. Handle CPU interrupts by inline checking of a flag The main patch in the series. The same simplification from `cpu' back to `env' as above. I had to add the `tcg_exit_req' field into the CPU_COMMON macro in cpu-defs.h, instead of adding it to the CPUState struct in include/qom/cpu.h. Plus the corresponding changes in gen-icount.h -- that's where ENV_OFFSET was used, but due to the same env to cpu split, it is not needed anymore. My main question is actually about this place; I don't exactly understand how the code generation works, so am not sure if I backported it correctly. Plus minor code shuffling - for one, bits in translate-all.c were in exec.c before, so I had to modify it there. translate-all.c: Remove cpu_unlink_tb() That's easy, but again the bits being removed are in exec.c in 1.2, not in translate-all.c (so the resulting backported patch is now misnamed). Technically this patch isn't needed for a backport, since the function(s) are really unused, but gcc generates a warning about an unused static function and the compilation fails due to -Werror.
gen-icount.h: Rename gen_icount_start/end to gen_tb_start/end And I didn't try to backport this one, since this is just a mechanical s/icount/tb/g without any code changes. Now, the resulting thing compiles (ha!), but I'm not really sure how to test it. I ran a few random apps using qemu-i386 and qemu-arm, and they appear to work. If all goes well, I think this series needs to be included in whatever -stable qemu series are in use. I'll post the 4 resulting patches in reply to this message. Thank you! /mjt
[Qemu-devel] [PATCH 2/4] cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC
From: Peter Maydell peter.mayd...@linaro.org If tcg_qemu_tb_exec() returns a value whose low bits don't indicate a link to an indexed next TB, this means that the TB execution never started (eg because the instruction counter hit zero). In this case the guest PC has to be reset to the address of the start of the TB. Refactor the cpu-exec code to make all tcg_qemu_tb_exec() calls pass through a wrapper function which does this restoration if necessary. Note that the apparent change in cpu_exec_nocache() from calling cpu_pc_from_tb() with the old TB to calling it with the TB returned by do_tcg_qemu_tb_exec() is safe, because in the nocache case we can guarantee that the TB we try to execute is not linked to any others, so the only possible returned TB is the one we started at. That is, we should arguably previously have included in cpu_exec_nocache() an assert((next_tb & ~TB_EXIT_MASK) == tb), since the API requires restore from next_tb but we were using tb. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Reviewed-by: Richard Henderson r...@twiddle.net Signed-off-by: Blue Swirl blauwir...@gmail.com (cherry picked from commit 77211379d73ea0c89c0b5bb6eee74b17cb06f9a8) Conflicts: cpu-exec.c Signed-off-by: Michael Tokarev m...@tls.msk.ru --- cpu-exec.c | 28 +--- 1 file changed, 17 insertions(+), 11 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 18da91e..edef121 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -51,12 +51,26 @@ void cpu_resume_from_signal(CPUArchState *env, void *puc) } #endif +/* Execute a TB, and fix up the CPU state afterwards if necessary */ +static inline tcg_target_ulong cpu_tb_exec(CPUArchState *env, uint8_t *tb_ptr) +{ +tcg_target_ulong next_tb = tcg_qemu_tb_exec(env, tb_ptr); +if ((next_tb & TB_EXIT_MASK) > TB_EXIT_IDX1) { +/* We didn't start executing this TB (eg because the instruction + * counter hit zero); we must restore the guest PC to the address + * of the start of the TB.
+ */ +TranslationBlock *tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK); +cpu_pc_from_tb(env, tb); +} +return next_tb; +} + /* Execute the code without caching the generated code. An interpreter could be used if available. */ static void cpu_exec_nocache(CPUArchState *env, int max_cycles, TranslationBlock *orig_tb) { -tcg_target_ulong next_tb; TranslationBlock *tb; /* Should never happen. @@ -68,14 +82,8 @@ static void cpu_exec_nocache(CPUArchState *env, int max_cycles, max_cycles); env->current_tb = tb; /* execute the generated code */ -next_tb = tcg_qemu_tb_exec(env, tb->tc_ptr); +cpu_tb_exec(env, tb->tc_ptr); env->current_tb = NULL; - -if ((next_tb & TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) { -/* Restore PC. This may happen if async event occurs before - the TB starts executing. */ -cpu_pc_from_tb(env, tb); -} tb_phys_invalidate(tb, -1); tb_free(tb); } @@ -569,13 +577,11 @@ int cpu_exec(CPUArchState *env) if (likely(!env->exit_request)) { tc_ptr = tb->tc_ptr; /* execute the generated code */ -next_tb = tcg_qemu_tb_exec(env, tc_ptr); +next_tb = cpu_tb_exec(env, tc_ptr); if ((next_tb & TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) { /* Instruction counter expired. */ int insns_left; tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK); -/* Restore PC. */ -cpu_pc_from_tb(env, tb); insns_left = env->icount_decr.u32; if (env->icount_extra && insns_left >= 0) { /* Refill decrementer and continue execution. */ -- 1.7.10.4
Re: [Qemu-devel] [PATCH 1/2] clean unnecessary code: don't check g_strdup arg for NULL
09.05.2013 11:53, Dong Xu Wang wrote: Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com Thank you Dong, (re)applied both patches to the trivial patch queue. /mjt
[Qemu-devel] [PATCH 3/4] Handle CPU interrupts by inline checking of a flag
From: Peter Maydell peter.mayd...@linaro.org Fix some of the nasty TCG race conditions and crashes by implementing cpu_exit() as setting a flag which is checked at the start of each TB. This avoids crashes if a thread or signal handler calls cpu_exit() while the execution thread is itself modifying the TB graph (which may happen in system emulation mode as well as in linux-user mode with a multithreaded guest binary). This fixes the crashes seen in LP:668799; however there are another class of crashes described in LP:1098729 which stem from the fact that in linux-user with a multithreaded guest all threads will use and modify the same global TCG data structures (including the generated code buffer) without any kind of locking. This means that multithreaded guest binaries are still in the unsupported category. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Reviewed-by: Richard Henderson r...@twiddle.net Signed-off-by: Blue Swirl blauwir...@gmail.com (cherry picked from commit 378df4b23753a11be650af7664ca76bc75cb9f01) Conflicts: cpu-exec.c exec.c include/qom/cpu.h translate-all.c (original code was in exec.c, changed there) Signed-off-by: Michael Tokarev m...@tls.msk.ru --- cpu-defs.h |1 + cpu-exec.c | 25 - exec.c |6 +++--- gen-icount.h | 11 +++ tcg/tcg.h|5 + 5 files changed, 44 insertions(+), 4 deletions(-) diff --git a/cpu-defs.h b/cpu-defs.h index f49e950..75c8e40 100644 --- a/cpu-defs.h +++ b/cpu-defs.h @@ -173,6 +173,7 @@ typedef struct CPUWatchpoint { uint32_t halted; /* Nonzero if the CPU is in suspend state */ \ uint32_t interrupt_request; \ volatile sig_atomic_t exit_request; \ +volatile sig_atomic_t tcg_exit_req; \ CPU_COMMON_TLB \ struct TranslationBlock *tb_jmp_cache[TB_JMP_CACHE_SIZE]; \ /* buffer for temporaries in the code generator */ \ diff --git a/cpu-exec.c b/cpu-exec.c index edef121..3049e46 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -63,6 +63,12 @@ static inline tcg_target_ulong cpu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
TranslationBlock *tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK); cpu_pc_from_tb(env, tb); } +if ((next_tb & TB_EXIT_MASK) == TB_EXIT_REQUESTED) { +/* We were asked to stop executing TBs (probably a pending + * interrupt. We've now stopped, so clear the flag. + */ +env->tcg_exit_req = 0; +} return next_tb; } @@ -578,7 +584,20 @@ int cpu_exec(CPUArchState *env) tc_ptr = tb->tc_ptr; /* execute the generated code */ next_tb = cpu_tb_exec(env, tc_ptr); -if ((next_tb & TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) { +switch (next_tb & TB_EXIT_MASK) { +case TB_EXIT_REQUESTED: +/* Something asked us to stop executing + * chained TBs; just continue round the main + * loop. Whatever requested the exit will also + * have set something else (eg exit_request or + * interrupt_request) which we will handle + * next time around the loop. + */ +tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK); +next_tb = 0; +break; +case TB_EXIT_ICOUNT_EXPIRED: +{ /* Instruction counter expired. */ int insns_left; tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK); @@ -602,6 +621,10 @@ int cpu_exec(CPUArchState *env) next_tb = 0; cpu_loop_exit(env); } +break; +} +default: +break; } } env->current_tb = NULL; diff --git a/exec.c b/exec.c index 0a67f07..54a37b5 100644 --- a/exec.c +++ b/exec.c @@ -1756,7 +1756,7 @@ static void tcg_handle_interrupt(CPUArchState *env, int mask) cpu_abort(env, "Raised interrupt while not in I/O function"); } } else { -cpu_unlink_tb(env); +env->tcg_exit_req = 1; } } @@ -1767,7 +1767,7 @@ CPUInterruptHandler cpu_interrupt_handler = tcg_handle_interrupt; void cpu_interrupt(CPUArchState *env, int mask) { env->interrupt_request |= mask; -cpu_unlink_tb(env); +env->tcg_exit_req = 1; } #endif /* CONFIG_USER_ONLY */ @@ -1779,7 +1779,7 @@ void cpu_reset_interrupt(CPUArchState *env, int mask) void cpu_exit(CPUArchState
[Qemu-devel] [PATCH 1/4] tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses
From: Peter Maydell peter.mayd...@linaro.org Document tcg_qemu_tb_exec(). In particular, its return value is a combination of a pointer to the next translation block and some extra information in the low two bits. Provide some #defines for the values passed in these bits to improve code clarity. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Reviewed-by: Richard Henderson r...@twiddle.net Signed-off-by: Blue Swirl blauwir...@gmail.com (cherry picked from commit 0980011b4f66482d2733ab2dd0f2f61747772c6b) Conflicts: tcg/tcg.h Signed-off-by: Michael Tokarev m...@tls.msk.ru --- cpu-exec.c |9 + gen-icount.h |2 +- tcg/tcg.h| 44 +++- 3 files changed, 49 insertions(+), 6 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 6db32cd..18da91e 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -71,7 +71,7 @@ static void cpu_exec_nocache(CPUArchState *env, int max_cycles, next_tb = tcg_qemu_tb_exec(env, tb->tc_ptr); env->current_tb = NULL; -if ((next_tb & 3) == 2) { +if ((next_tb & TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) { /* Restore PC. This may happen if async event occurs before the TB starts executing. */ cpu_pc_from_tb(env, tb); @@ -555,7 +555,8 @@ int cpu_exec(CPUArchState *env) spans two pages, we cannot safely do a direct jump. */ if (next_tb != 0 && tb->page_addr[1] == -1) { -tb_add_jump((TranslationBlock *)(next_tb & ~3), next_tb & 3, tb); +tb_add_jump((TranslationBlock *)(next_tb & ~TB_EXIT_MASK), +next_tb & TB_EXIT_MASK, tb); } spin_unlock(&tb_lock); @@ -569,10 +570,10 @@ int cpu_exec(CPUArchState *env) tc_ptr = tb->tc_ptr; /* execute the generated code */ next_tb = tcg_qemu_tb_exec(env, tc_ptr); -if ((next_tb & 3) == 2) { +if ((next_tb & TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) { /* Instruction counter expired. */ int insns_left; -tb = (TranslationBlock *)(next_tb & ~3); +tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK); /* Restore PC.
*/ cpu_pc_from_tb(env, tb); insns_left = env->icount_decr.u32; diff --git a/gen-icount.h b/gen-icount.h index 430cb44..3d5c131 100644 --- a/gen-icount.h +++ b/gen-icount.h @@ -29,7 +29,7 @@ static void gen_icount_end(TranslationBlock *tb, int num_insns) if (use_icount) { *icount_arg = num_insns; gen_set_label(icount_label); -tcg_gen_exit_tb((tcg_target_long)tb + 2); +tcg_gen_exit_tb((tcg_target_long)tb + TB_EXIT_ICOUNT_EXPIRED); } } diff --git a/tcg/tcg.h b/tcg/tcg.h index a83bddd..af06eb5 100644 --- a/tcg/tcg.h +++ b/tcg/tcg.h @@ -581,7 +581,49 @@ TCGv_i64 tcg_const_local_i64(int64_t val); extern uint8_t code_gen_prologue[]; -/* TCG targets may use a different definition of tcg_qemu_tb_exec. */ +/** + * tcg_qemu_tb_exec: + * @env: CPUArchState * for the CPU + * @tb_ptr: address of generated code for the TB to execute + * + * Start executing code from a given translation block. + * Where translation blocks have been linked, execution + * may proceed from the given TB into successive ones. + * Control eventually returns only when some action is needed + * from the top-level loop: either control must pass to a TB + * which has not yet been directly linked, or an asynchronous + * event such as an interrupt needs handling. + * + * The return value is a pointer to the next TB to execute + * (if known; otherwise zero). This pointer is assumed to be + * 4-aligned, and the bottom two bits are used to return further + * information: + * 0, 1: the link between this TB and the next is via the specified + *TB index (0 or 1). That is, we left the TB via (the equivalent + *of) "goto_tb <index>". The main loop uses this to determine + *how to link the TB just executed to the next. + * 2:we are using instruction counting code generation, and we + *did not start executing this TB because the instruction counter + *would hit zero midway through it.
In this case the next-TB pointer + *returned is the TB we were about to execute, and the caller must + *arrange to execute the remaining count of instructions. + * + * If the bottom two bits indicate an exit-via-index then the CPU + * state is correctly synchronised and ready for execution of the next + * TB (and in particular the guest PC is the address to execute next). + * Otherwise, we gave up on execution of this TB before it started, and + * the caller must fix up the CPU state by calling cpu_pc_from_tb() + * with the next-TB pointer we return. + * + * Note that TCG targets may
[Qemu-devel] [PATCH 4/4] translate-all.c: Remove cpu_unlink_tb()
From: Peter Maydell peter.mayd...@linaro.org The (unsafe) function cpu_unlink_tb() is now unused, so we can simply remove it and any code that was only used by it. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Reviewed-by: Richard Henderson r...@twiddle.net Signed-off-by: Blue Swirl blauwir...@gmail.com (cherry picked from commit 3a808cc407744c30daa7470b5f191cde1fbc1aae) Conflicts: translate-all.c (original code was in exec.c, so the patch title is now misleading) Signed-off-by: Michael Tokarev m...@tls.msk.ru --- exec.c | 67 1 file changed, 67 deletions(-) diff --git a/exec.c b/exec.c index 54a37b5..fb72d74 100644 --- a/exec.c +++ b/exec.c @@ -1421,53 +1421,6 @@ TranslationBlock *tb_find_pc(uintptr_t tc_ptr) return tbs[m_max]; } -static void tb_reset_jump_recursive(TranslationBlock *tb); - -static inline void tb_reset_jump_recursive2(TranslationBlock *tb, int n) -{ -TranslationBlock *tb1, *tb_next, **ptb; -unsigned int n1; - -tb1 = tb->jmp_next[n]; -if (tb1 != NULL) { -/* find head of list */ -for(;;) { -n1 = (uintptr_t)tb1 & 3; -tb1 = (TranslationBlock *)((uintptr_t)tb1 & ~3); -if (n1 == 2) -break; -tb1 = tb1->jmp_next[n1]; -} -/* we are now sure now that tb jumps to tb1 */ -tb_next = tb1; - -/* remove tb from the jmp_first list */ -ptb = &tb_next->jmp_first; -for(;;) { -tb1 = *ptb; -n1 = (uintptr_t)tb1 & 3; -tb1 = (TranslationBlock *)((uintptr_t)tb1 & ~3); -if (n1 == n && tb1 == tb) -break; -ptb = &tb1->jmp_next[n1]; -} -*ptb = tb->jmp_next[n]; -tb->jmp_next[n] = NULL; - -/* suppress the jump to next tb in generated code */ -tb_reset_jump(tb, n); - -/* suppress jumps in the tb on which we could have jumped */ -tb_reset_jump_recursive(tb_next); -} -} - -static void tb_reset_jump_recursive(TranslationBlock *tb) -{ -tb_reset_jump_recursive2(tb, 0); -tb_reset_jump_recursive2(tb, 1); -} - #if defined(TARGET_HAS_ICE) #if defined(CONFIG_USER_ONLY) static void breakpoint_invalidate(CPUArchState *env, target_ulong pc) @@ -1711,26 +1664,6 @@ void cpu_set_log_filename(const char
*filename) cpu_set_log(loglevel); } -static void cpu_unlink_tb(CPUArchState *env) -{ -/* FIXME: TB unchaining isn't SMP safe. For now just ignore the - problem and hope the cpu will stop of its own accord. For userspace - emulation this often isn't actually as bad as it sounds. Often - signals are used primarily to interrupt blocking syscalls. */ -TranslationBlock *tb; -static spinlock_t interrupt_lock = SPIN_LOCK_UNLOCKED; - -spin_lock(&interrupt_lock); -tb = env->current_tb; -/* if the cpu is currently executing code, we must unlink it and - all the potentially executing TB */ -if (tb) { -env->current_tb = NULL; -tb_reset_jump_recursive(tb); -} -spin_unlock(&interrupt_lock); -} - #ifndef CONFIG_USER_ONLY /* mask must never be zero, except for A20 change call */ static void tcg_handle_interrupt(CPUArchState *env, int mask) -- 1.7.10.4
Re: [Qemu-devel] qemu-img problem when create a file larger than fs's size
On 05/09/2013 03:44 PM, Stefan Hajnoczi wrote: On Wed, May 08, 2013 at 01:18:17PM +0800, yuxh wrote: I have to consult you about a qemu-img problem. Is it reasonable to create a file which is larger than the available size of the fs with the qemu-img cmd? When I use qemu-img to create a file which is larger than the available size of the fs, the creation completes successfully. However when I use this file in a guest as the guest's disk, and write beyond the size the host file can provide, the guest is paused by qemu-kvm or libvirt, seemingly in an infinite loop where the guest just can't be used unless I detach the disk from the guest or destroy the guest. You can change the ENOSPC policy with -drive werror=,rerror=. See the QEMU man page. The default behavior is to pause the guest so the host admin can free up or add space. Then the guest can be continued - this will retry the I/O. But you can also tell QEMU to pass the error through to the guest using the -drive werror=,rerror= options. Stefan Oh, I understand now. Thank you so much for your explanation. Yu
Re: [Qemu-devel] [PATCH 1/2] Vring: vring's listener's priority should higher than kvm
On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com Host threads which handle vring should have higher MemoryListener priority than kvm. For the current code, take the following scenario: kvm_region_add() runs earlier, before vhost_region_add(); then in the guest, vring's desc[i] can refer to addressX in the new region known by the guest. But vhost does not know this new region yet, and the vring handler will fail. Is there a concrete scenario where this happens? I can think of situations like the ioeventfd being readable before vhost/hostmem is populated. But I don't see how that's related to the priority of kvm_region_add(). Stefan
Re: [Qemu-devel] [PATCH 2/2] mem: highlight the listener's priority as enum
On Thu, May 09, 2013 at 08:40:22AM +0800, Liu Ping Fan wrote: diff --git a/include/exec/memory.h b/include/exec/memory.h index 9e88320..77e0d40 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -192,6 +192,16 @@ struct MemoryRegionSection { typedef struct MemoryListener MemoryListener; +/* The list of priority, ex, vhost should have higher priority (less num) than + * kvm, ie PRI_VRING < PRI_HYPV s/PRI_HYPV/PRI_HYPERV/ + */ +typedef enum { +PRI_DEFAULT = 0, +PRI_CORE = 1, +PRI_VRING = 9, +PRI_HYPERV = 10, HYPERV == hypervisor? Easy to confuse with Microsoft Hyper-V. Please use PRI_ACCEL or PRI_HYPERVISOR.
Re: [Qemu-devel] [Qemu-trivial] [PATCH] target-mips: add missing check_dspr2 for multiply instructions
08.05.2013 18:09, Petar Jovanovic wrote: From: Petar Jovanovic petar.jovano...@imgtec.com The emulator needs to check in hflags if the DSP unit has been turned off before it generates code for MUL_PH, MUL_S_PH, MULQ_S_W, and MULQ_RS_W. Signed-off-by: Petar Jovanovic petar.jovano...@imgtec.com --- target-mips/translate.c |1 + 1 file changed, 1 insertion(+) diff --git a/target-mips/translate.c b/target-mips/translate.c index b7f8203..0a53203 100644 --- a/target-mips/translate.c +++ b/target-mips/translate.c @@ -13400,6 +13400,7 @@ static void gen_mipsdsp_multiply(DisasContext *ctx, uint32_t op1, uint32_t op2, /* OPC_MULT_G_2E, OPC_ADDUH_QB_DSP, OPC_MUL_PH_DSP have * the same mask and op1. */ case OPC_MULT_G_2E: +check_dspr2(ctx); switch (op2) { case OPC_MUL_PH: gen_helper_mul_ph(cpu_gpr[ret], v1_t, v2_t, cpu_env); FWIW. While the patch has been applied by Aurelien, I've a small comment about the trivialness of patches, having in mind this example. This appears to be a trivial patch by the look of it, that is, it is just a one-liner adding a simple instruction. But nevertheless, it is NOT suitable for qemu-trivial, because it needs to be verified by the subsystem maintainer who understands all the possible implications. The patch changes core logic which may have unexpected consequences in unexpected places; it may lead to wrong code generation and bugs, and other fun side effects. In that way it is not trivial, and really needs to be sent to subsystem maintainers (and to qemu-devel@), not to the trivial patch queue. Of not-so-trivial stuff I can consider, for example, fixes in command line processing which do not break backward compatibility (if it isn't an outright bug like a misspelt option name) -- but even there, it might be a good idea to discuss it on qemu-devel anyway. In other words, if you know some subsystem and to you some thing looks really trivial, it isn't necessarily the same for someone who does not have the same knowledge about that subsystem.. ;) Thank you!
/mjt
Re: [Qemu-devel] [PATCH 1/2] Vring: vring's listener's priority should higher than kvm
On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com Host threads which handle vring should have higher MemoryListener priority than kvm. For the current code, take the following scenario: kvm_region_add() runs earlier, before vhost_region_add(); then in the guest, vring's desc[i] can refer to addressX in the new region known by the guest. But vhost does not know this new region yet, and the vring handler will fail. Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com Is this seen in testing, or are you describing a theoretical scenario? Please make this clear in the commit log. --- hw/virtio/dataplane/hostmem.c |2 +- hw/virtio/vhost.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c index 37292ff..67cbce1 100644 --- a/hw/virtio/dataplane/hostmem.c +++ b/hw/virtio/dataplane/hostmem.c @@ -158,7 +158,7 @@ void hostmem_init(HostMem *hostmem) .eventfd_del = hostmem_listener_eventfd_dummy, .coalesced_mmio_add = hostmem_listener_coalesced_mmio_dummy, .coalesced_mmio_del = hostmem_listener_coalesced_mmio_dummy, -.priority = 10, +.priority = 9, }; memory_listener_register(&hostmem->listener, &address_space_memory); diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index fbabf99..91c313b 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -856,7 +856,7 @@ int vhost_dev_init(struct vhost_dev *hdev, int devfd, const char *devpath, .log_global_stop = vhost_log_global_stop, .eventfd_add = vhost_eventfd_add, .eventfd_del = vhost_eventfd_del, -.priority = 10 +.priority = 9 }; hdev->mem = g_malloc0(offsetof(struct vhost_memory, regions)); hdev->n_mem_sections = 0; -- 1.7.4.4
Re: [Qemu-devel] [PATCH 1/2] Vring: the vring listener's priority should be higher than kvm's
On Thu, May 9, 2013 at 4:44 PM, Michael S. Tsirkin m...@redhat.com wrote: On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com Host threads which handle the vring should have a higher MemoryListener priority than kvm. In the current code, take the following scenario: kvm_region_add() runs earlier than vhost_region_add(); then, in the guest, the vring's desc[i] can refer to addressX in the new region known to the guest. But vhost does not know this new region yet, and the vring handler will fail. Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com Is this seen in testing, or are you describing a theoretical scenario? Please make this clear in the commit log. A theoretical scenario. I think the vcpu thread and vhost are async, so we need this method to sync them. --- hw/virtio/dataplane/hostmem.c |2 +- hw/virtio/vhost.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c index 37292ff..67cbce1 100644 --- a/hw/virtio/dataplane/hostmem.c +++ b/hw/virtio/dataplane/hostmem.c @@ -158,7 +158,7 @@ void hostmem_init(HostMem *hostmem) .eventfd_del = hostmem_listener_eventfd_dummy, .coalesced_mmio_add = hostmem_listener_coalesced_mmio_dummy, .coalesced_mmio_del = hostmem_listener_coalesced_mmio_dummy, -.priority = 10, +.priority = 9, }; memory_listener_register(&hostmem->listener, &address_space_memory); diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index fbabf99..91c313b 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -856,7 +856,7 @@ int vhost_dev_init(struct vhost_dev *hdev, int devfd, const char *devpath, .log_global_stop = vhost_log_global_stop, .eventfd_add = vhost_eventfd_add, .eventfd_del = vhost_eventfd_del, -.priority = 10 +.priority = 9 }; hdev->mem = g_malloc0(offsetof(struct vhost_memory, regions)); hdev->n_mem_sections = 0; -- 1.7.4.4
Re: [Qemu-devel] [PATCH 1/2] Vring: the vring listener's priority should be higher than kvm's
On Thu, May 9, 2013 at 4:30 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com Host threads which handle the vring should have a higher MemoryListener priority than kvm. In the current code, take the following scenario: kvm_region_add() runs earlier than vhost_region_add(); then, in the guest, the vring's desc[i] can refer to addressX in the new region known to the guest. But vhost does not know this new region yet, and the vring handler will fail. Is there a concrete scenario where this happens? I can think of situations like the ioeventfd being readable before vhost/hostmem is populated. But I don't see how that's related to the priority of kvm_region_add(). For kvm, i.e. in the guest, vring_desc.addr can point to a chunk of data in the newly added memory and kick vhost. vhost has not added this new region, so its local lookup table cannot translate this new address, and the vring handler will fail. If vhost's priority is higher than kvm's, then it will know about this new address earlier than kvm does. Stefan
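For readers following the priority argument in this thread: QEMU keeps its MemoryListeners sorted by priority, and region_add callbacks fire in ascending-priority order, so lowering vhost/hostmem from 10 to 9 makes them learn about a new region before kvm does. The toy model below illustrates just that ordering rule; it is a sketch, not QEMU's actual listener code (the struct layout and function names are invented for the example).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of MemoryListener ordering: listeners are kept sorted by
 * ascending priority, and region_add-style callbacks run front to back,
 * so a lower priority value means "notified earlier". */
typedef struct Listener {
    const char *name;
    int priority;              /* lower value: region_add runs earlier */
    struct Listener *next;
} Listener;

static Listener *listeners;

static void listener_register(Listener *l)
{
    /* Insert keeping the list sorted by ascending priority. */
    Listener **p = &listeners;
    while (*p && (*p)->priority <= l->priority) {
        p = &(*p)->next;
    }
    l->next = *p;
    *p = l;
}

/* Name of the listener whose region_add callback would fire first. */
static const char *first_region_add(void)
{
    return listeners ? listeners->name : NULL;
}
```

With vhost at priority 9 and kvm at 10, first_region_add() reports vhost, which is exactly the property the patch relies on.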
Re: [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb()
On 9 May 2013 09:09, Michael Tokarev m...@tls.msk.ru wrote: Now, the resulting thing compiles (ha!), but I'm not really sure how to test it. I ran a few random apps using qemu-i386 and qemu-arm, it appears to work. You need to test TCG system emulation too, and in particular something with multiple guest CPU cores. -- PMM
Re: [Qemu-devel] [PATCH 2/2] mem: highlight the listener's priority as enum
On Thu, May 9, 2013 at 4:31 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Thu, May 09, 2013 at 08:40:22AM +0800, Liu Ping Fan wrote: diff --git a/include/exec/memory.h b/include/exec/memory.h index 9e88320..77e0d40 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -192,6 +192,16 @@ struct MemoryRegionSection { typedef struct MemoryListener MemoryListener; +/* The list of priorities, ex, vhost should have a higher priority (lower num) than + * kvm, ie PRI_VRING < PRI_HYPV s/PRI_HYPV/PRI_HYPERV/ Will fix. + */ +typedef enum { +PRI_DEFAULT = 0, +PRI_CORE = 1, +PRI_VRING = 9, +PRI_HYPERV = 10, HYPERV == hypervisor? Easy to confuse with Microsoft Hyper-V. Please use PRI_ACCEL or PRI_HYPERVISOR. Ok. Regards, Pingfan
Re: [Qemu-devel] [PATCH 2/2] mem: highlight the listener's priority as enum
On 9 May 2013 01:40, Liu Ping Fan qemul...@gmail.com wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com It will make the priority prominent when a new listener is added. All the priority values are kept unchanged, except for vhost and hostmem (changes introduced by the previous patch). Any chance of something somewhere which explains why these various things care about the order and thus need these particular priorities? (ie what are the dependencies between the listeners?) thanks -- PMM
Re: [Qemu-devel] [PATCH 2/2] mem: highlight the listener's priority as enum
On Thu, May 9, 2013 at 5:21 PM, Peter Maydell peter.mayd...@linaro.org wrote: On 9 May 2013 01:40, Liu Ping Fan qemul...@gmail.com wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com It will make the priority prominent when a new listener is added. All the priority values are kept unchanged, except for vhost and hostmem (changes introduced by the previous patch). Any chance of something somewhere which explains why these various things care about the order and thus need these particular priorities? (ie what are the dependencies between the listeners?) Sorry, I have not dug so deeply, so I leave the original values as they are. As far as I know, the most prominent pair is kvm and the core. If a region is removed from the core earlier than from kvm, then cpu_physical_memory_rw() can use an address that is still valid in kvm but invalid for the radix tree, and cause an error. thanks -- PMM
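Putting the thread's feedback together, the enum with Stefan's rename applied would read roughly as follows. The numeric values are the ones from the patch; the PRI_HYPERVISOR spelling is only the reviewer's suggestion, not a merged name.

```c
#include <assert.h>

/* Proposed named listener priorities (values taken from the patch;
 * PRI_HYPERVISOR is the reviewer-suggested rename of PRI_HYPERV). */
typedef enum {
    PRI_DEFAULT    = 0,
    PRI_CORE       = 1,
    PRI_VRING      = 9,   /* vhost/hostmem: must see regions before kvm */
    PRI_HYPERVISOR = 10,  /* kvm, xen */
} ListenerPriority;
```

The point of the enum is simply that the ordering constraints (PRI_VRING < PRI_HYPERVISOR, PRI_DEFAULT < PRI_CORE) become visible at the definition site instead of being scattered as magic numbers.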
[Qemu-devel] [PATCH] qxl: Call spice_qxl_driver_unload from qxl_enter_vga_mode
From: Hans de Goede hdego...@redhat.com With a SPICE_DISPLAY_CAP_MONITORS_CONFIG capable client, the client needs to know what part of the primary to use for each monitor. If the guest driver does not support this, the server sends messages to the client for a single monitor spanning the entire primary. As soon as the guest calls spice_qxl_monitors_config_async once, the server sets the red_worker driver_has_monitors_config flag and stops doing this. This is a problem when the driver gets unloaded, for example after a reboot or when switching to a text vc with usermode mode-setting under Linux. To reproduce this, start a multi-mon capable Linux guest which uses usermode mode-setting and then, once X has started, switch to a text vc. Note how the client window not only does not resize; if you try to resize it manually you always keep black borders, since the aspect is wrong. This patch calls a new spice-server method called spice_qxl_driver_unload which clears the driver_has_monitors_config flag inside the server, thereby fixing this. Signed-off-by: Hans de Goede hdego...@redhat.com Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/display/qxl.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/hw/display/qxl.c b/hw/display/qxl.c index 2d49e9a..c475cb1 100644 --- a/hw/display/qxl.c +++ b/hw/display/qxl.c @@ -1077,6 +1077,9 @@ static void qxl_enter_vga_mode(PCIQXLDevice *d) return; } trace_qxl_enter_vga_mode(d->id); +#if SPICE_SERVER_VERSION >= 0x000c03 /* release 0.12.3 */ +spice_qxl_driver_unload(&d->ssd.qxl); +#endif qemu_spice_create_host_primary(&d->ssd); d->mode = QXL_MODE_VGA; vga_dirty_log_start(&d->vga); -- 1.7.9.7
[Qemu-devel] [PULL for-1.5] spice: bugfix queue
Hi, Tiny spice patch queue, featuring a single qxl bugfix. please pull, Gerd The following changes since commit 47ec15cdd44877e553ed0bd0a16aea8a295dad62: Update version for 1.5.0-rc1 (2013-05-08 15:54:47 -0500) are available in the git repository at: git://anongit.freedesktop.org/spice/qemu spice.v70 for you to fetch changes up to 0a2b5e3a7899b40d05d7c6c1c41eb4e64dd2ed4b: qxl: Call spice_qxl_driver_unload from qxl_enter_vga_mode (2013-05-09 11:46:53 +0200) Hans de Goede (1): qxl: Call spice_qxl_driver_unload from qxl_enter_vga_mode hw/display/qxl.c |3 +++ 1 file changed, 3 insertions(+)
Re: [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb()
09.05.2013 13:01, Peter Maydell wrote: On 9 May 2013 09:09, Michael Tokarev m...@tls.msk.ru wrote: Now, the resulting thing compiles (ha!), but I'm not really sure how to test it. I ran a few random apps using qemu-i386 and qemu-arm, and it appears to work. You need to test TCG system emulation too, and in particular something with multiple guest CPU cores. As suggested on IRC, I ran a proposed omp testcase (in LP:668799) in an armel chroot created by the same patched qemu-arm-static, with 6 and 60 threads, and tried running a few java binaries I've found -- that works well, from creating the system to building and running testcases. I also tried the same with qemu-system-i386 (without kvm), installing a system using the patched qemu and running multi-threaded apps in it. At least I don't see any obvious new breakages due to my backport, and I can definitely go further than with unpatched qemu. So this is at least promising, so far... ;) Thanks, /mjt
Re: [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb()
On 09.05.2013 10:09, Michael Tokarev wrote: 22.02.2013 22:09, Peter Maydell wrote: cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC This needed a back merge of env+cpu states back to cpu. Maybe we should somehow revisit the whole concept of the two, because it's sorta fun: at some point all functions have been converted to accept `cpu' instead of `env', but in many places the first thing a function does is to get the `env' pointer out of `cpu'. The concept is really easy: There is so much CPU code around that for many years no one dared to touch it, ;) so changes need to be done incrementally - not only to identify any fallout! If one function is converted to no longer rely on env, then rather likely some caller of it still needs to convert from env -> cpu. Once that caller is converted too, the conversion keeps moving outwards until the only remaining env functions are TCG-related. env access from a specific *CPU type is cheap, thus only talking about common code here. You will find more background on the big QOM CPUState part X series. CPU_COMMON is definitely not the way for the future, it should be okay for backporting though in this case. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
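Andreas's remark that env access from a specific CPU type is cheap comes down to the two states living in one allocation, so converting between them is pure pointer arithmetic. A schematic sketch of that relationship (field names and layout are invented for illustration, not QEMU's exact structures at any given commit):

```c
#include <assert.h>
#include <stddef.h>

/* Schematic env <-> cpu relationship: the target-specific CPUArchState
 * is embedded in the common CPUState, so either pointer can be derived
 * from the other without any lookup or indirection. */
typedef struct CPUArchState { int dummy_reg; } CPUArchState;
typedef struct CPUState {
    int cpu_index;
    CPUArchState env;          /* target-specific part */
} CPUState;

static inline CPUState *env_to_cpu(CPUArchState *env)
{
    /* container_of-style conversion: subtract the member offset. */
    return (CPUState *)((char *)env - offsetof(CPUState, env));
}

static inline CPUArchState *cpu_to_env(CPUState *cpu)
{
    return &cpu->env;
}
```

This is why incremental conversion is viable: a function converted to take `cpu` can still hand its unconverted callees an `env` for free, and vice versa.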
[Qemu-devel] virsh edit failed to take effect on KVM
Hi all, we use the command virsh edit to modify the VM configuration information online on the KVM platform (libvirt-1.0.0 and qemu-1.4), but it does not take effect after a reboot. However, it works fine on the Xen platform. For example, a VM is running with the following configuration information: ... <os> <type arch='x86_64'>hvm</type> <boot dev='hd'/> <bootmenu enable='yes'/> </os> ... use the command virsh edit to modify it: ... <os> <type arch='x86_64'>hvm</type> <boot dev='cdrom'/> <bootmenu enable='yes'/> </os> ... With this change, the VM is expected to start from cdrom when we execute the command virsh reboot. But in fact the modification does not take effect; the VM still starts from hd. Well, it does take effect if I use the commands virsh shutdown and virsh start instead of virsh reboot. We are wondering if there are any other ways to make the online modification take effect. What is the next step going on with the command virsh edit on the KVM platform? Any ideas? Thanks! -Gonglei
Re: [Qemu-devel] [PATCH 6/8] qapi: add native list coverage for visitor serialization tests
On 05/09/13 01:33, Michael Roth wrote: +case PTYPE_NUMBER: { +numberList *ptr; +char *double1, *double2; +if (cur_head) { +ptr = cur_head; +cur_head = ptr->next; +} else { +cur_head = ptr = pl_copy.value.numbers; +} +/* we serialize with %f for our reference visitors, so rather than + * fuzzy * floating math to test equality, just compare the + * formatted values + */ I think this comment block has been copied from elsewhere in this file, indented more deeply and re-filled. There's an asterisk in the comment body now. +double1 = g_malloc0(calc_float_string_storage(pt->value.number)); +double2 = g_malloc0(calc_float_string_storage(ptr->value)); +g_assert_cmpstr(double1, ==, double2); Are you comparing empty strings? Space is allocated and zeroed, but I can't see where the values are actually formatted. (Same holds for the original instance of this code, test_primitives().) +g_free(double1); +g_free(double2); +break; +} Thanks, Laszlo
Re: [Qemu-devel] virsh edit failed to take effect on KVM
On 05/09/13 13:42, Gonglei (Arei) wrote: Hi all, we use the command virsh edit to modify the VM configuration information online on the KVM platform (libvirt-1.0.0 and qemu-1.4), but it does not take effect after a reboot. However, it works fine on the Xen platform. For example, a VM is running with the following configuration information: ... <os> <type arch='x86_64'>hvm</type> <boot dev='hd'/> <bootmenu enable='yes'/> </os> ... use the command virsh edit to modify it: ... <os> <type arch='x86_64'>hvm</type> <boot dev='cdrom'/> <bootmenu enable='yes'/> </os> ... With this change, the VM is expected to start from cdrom when we execute the command virsh reboot. But in fact the modification does not take effect; the VM still starts from hd. Well, it does take effect if I use the commands virsh shutdown and virsh start instead of virsh reboot. We are wondering if there are any other ways to make the online modification take effect. Under Xen, xm reboot (or, more likely, xl reboot recently), which is what I expect virsh reboot to translate to, always creates a brand new domain. Therefore domain config changes take effect. Under KVM the same VM instance continues to run, it just goes through an emulated reset. Laszlo
Re: [Qemu-devel] [PATCH for 1.5] tcg/optimize: fix setcond2 optimization
On 2013-05-08 15:42, Aurelien Jarno wrote: When setcond2 is rewritten into setcond, the state of the destination temp should be reset, so that a copy of the previous value is not used instead of the result. Reported-by: Michael Tokarev m...@tls.msk.ru Cc: Richard Henderson r...@twiddle.net Signed-off-by: Aurelien Jarno aurel...@aurel32.net Reviewed-by: Richard Henderson r...@twiddle.net r~
Re: [Qemu-devel] [PATCH 0/8] qapi: add support for lists of native types
On 05/09/13 01:33, Michael Roth wrote: These patches apply on top of qemu.git master, and can also be obtained from: git://github.com/mdroth/qemu.git qapi-native-lists Sending this now since a number of series have popped up in the past that wanted this, and Amos has some pending patches (query-mac-tables) that rely on this as well. These patches add support for specifying lists of native qapi types (int/bool/str/number) like so: { 'type': 'foo', 'data': { 'bar': ['int'] }} for a 'bar' field that is a list of type 'int', { 'type': 'foo2', 'data': { 'bar2': ['str'] }} for a 'bar2' field that is a list of type 'str', and so on. This uses linked list types for the native C representations, just as we do for complex schema-defined types. In the future we may add schema annotations of some sort to specify a more natural/efficient array type for the C representations, but this should serve the majority of uses-cases for now. Makefile |6 +- qapi-schema-test.json |8 ++ scripts/qapi-types.py | 44 ++- scripts/qapi-visit.py | 36 - scripts/qapi.py| 21 +++ tests/test-qmp-input-visitor.c | 181 + tests/test-qmp-output-visitor.c| 172 tests/test-visitor-serialization.c | 256 +--- 8 files changed, 692 insertions(+), 32 deletions(-) Two notes: - the remark I made for 6/8 (comparing empty strings), - for 7/8 and 8/8: the format specification %3.4f is not very useful. 3 is the field width, 4 is the precision, and the latter means for %f the number of digits printed after the radix character. It's not useful to specify a smaller field width (which covers the entire output string) than precision here. Other than these nothing pokes me in the eye. Laszlo
Re: [Qemu-devel] [PATCH 6/8] qapi: add native list coverage for visitor serialization tests
On Thu, May 09, 2013 at 02:31:03PM +0200, Laszlo Ersek wrote: On 05/09/13 01:33, Michael Roth wrote: +case PTYPE_NUMBER: { +numberList *ptr; +char *double1, *double2; +if (cur_head) { +ptr = cur_head; +cur_head = ptr->next; +} else { +cur_head = ptr = pl_copy.value.numbers; +} +/* we serialize with %f for our reference visitors, so rather than + * fuzzy * floating math to test equality, just compare the + * formatted values + */ I think this comment block has been copied from elsewhere in this file, indented more deeply and re-filled. There's an asterisk in the comment body now. Yes :) Will fix it on the next pass. +double1 = g_malloc0(calc_float_string_storage(pt->value.number)); +double2 = g_malloc0(calc_float_string_storage(ptr->value)); +g_assert_cmpstr(double1, ==, double2); Are you comparing empty strings? Space is allocated and zeroed, but I can't see where the values are actually formatted. Well, that's embarrassing... (Same holds for the original instance of this code, test_primitives().) but at least I can blame it on the fact that I copied it from here (we'll just ignore who the original author was :) Will add a patch on the next pass to fix the original instance beforehand. Thanks for the catch! +g_free(double1); +g_free(double2); +break; +} Thanks, Laszlo
Re: [Qemu-devel] [PATCH 0/8] qapi: add support for lists of native types
On Thu, May 09, 2013 at 03:31:08PM +0200, Laszlo Ersek wrote: On 05/09/13 01:33, Michael Roth wrote: These patches apply on top of qemu.git master, and can also be obtained from: git://github.com/mdroth/qemu.git qapi-native-lists Sending this now since a number of series have popped up in the past that wanted this, and Amos has some pending patches (query-mac-tables) that rely on this as well. These patches add support for specifying lists of native qapi types (int/bool/str/number) like so: { 'type': 'foo', 'data': { 'bar': ['int'] }} for a 'bar' field that is a list of type 'int', { 'type': 'foo2', 'data': { 'bar2': ['str'] }} for a 'bar2' field that is a list of type 'str', and so on. This uses linked list types for the native C representations, just as we do for complex schema-defined types. In the future we may add schema annotations of some sort to specify a more natural/efficient array type for the C representations, but this should serve the majority of uses-cases for now. Makefile |6 +- qapi-schema-test.json |8 ++ scripts/qapi-types.py | 44 ++- scripts/qapi-visit.py | 36 - scripts/qapi.py| 21 +++ tests/test-qmp-input-visitor.c | 181 + tests/test-qmp-output-visitor.c| 172 tests/test-visitor-serialization.c | 256 +--- 8 files changed, 692 insertions(+), 32 deletions(-) Two notes: - the remark I made for 6/8 (comparing empty strings), - for 7/8 and 8/8: the format specification %3.4f is not very useful. 3 is the field width, 4 is the precision, and the latter means for %f the number of digits printed after the radix character. It's not useful to specify a smaller field width (which covers the entire output string) than precision here. Yup, that's a mistake. The field width isn't useful at all for what I was intending actually (to constrain the whole portion of float so that the fractional portion wouldn't get truncated by snprintf and make the test less likely to catch re-encoding issues). 
I think just using %.4f should suffice since we have plenty of extra buffer space for the values we're working with (max being 32 / 3) so I'll do that for the next pass. Other than these nothing pokes me in the eye. Laszlo
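The agreed-upon fix, restated in isolation: actually format both doubles before comparing, so that the string comparison really is the fuzzy equality the comment promises. This is a sketch of the idea with a fixed %.4f specifier and an assumed 64-byte buffer, not the test suite's literal code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Compare two doubles via their "%.4f" renderings, so values that agree
 * to four decimal places compare equal.  The missing step in the
 * original code was precisely the two snprintf() calls below: the
 * buffers were allocated and zeroed but never written to, so the
 * assertion compared two empty strings. */
static int doubles_equal_formatted(double a, double b)
{
    char buf_a[64], buf_b[64];
    snprintf(buf_a, sizeof(buf_a), "%.4f", a);
    snprintf(buf_b, sizeof(buf_b), "%.4f", b);
    return strcmp(buf_a, buf_b) == 0;
}
```

Note the deliberate fuzziness: 1.00001 and 1.00002 both render as "1.0000" and therefore compare equal, which is the intended tolerance for round-tripped values.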
Re: [Qemu-devel] virsh edit failed to take effect on KVM
On 05/09/2013 05:42 AM, Gonglei (Arei) wrote: Hi all, we use the command virsh edit to modify the VM configuration information online on KVM Platform(libvirt-1.0.0 and qemu-1.4), but it does not take effect after reboot. However, it works fine on Xen Platform. This sort of question belongs best on libvirt-us...@redhat.com (libvirt-list-request is not the right list address, and your question isn't quite relevant to qemu-devel). With the changing, the VM is expected to start from cdrom, when execute the command virsh reboot. 'virsh reboot' triggers a software reboot (reusing the same existing qemu process); but 'virsh edit' only affects what will take place on a hard boot (creation of a new qemu process). But the fact is that the modify does not take effect, the VM is still start from hd. Well, it will take effect if I use command virsh shutdown and virsh start instesad of virsh reboot. Correct, because that sequence spawns a new qemu process. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
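In practice, then, the sequence that makes an edited configuration take effect is the one Gonglei already found (the domain name vm1 is a placeholder):

```shell
virsh edit vm1        # e.g. change <boot dev='hd'/> to <boot dev='cdrom'/>
virsh shutdown vm1    # ends the current qemu process
virsh start vm1       # spawns a new qemu process, re-reading the XML
```

virsh reboot, by contrast, resets the existing qemu process, which never re-reads the XML.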
Re: [Qemu-devel] [Xen-devel] [RFC PATCH] vga: Start supporting resolution not multiple of 16 correctly.
On Fri, Mar 15, 2013 at 06:14:30PM +, Frediano Ziglio wrote: Modern notebooks support the 1366x768 resolution. This resolution width is not a multiple of 16, causing some problems. Qemu VGA emulation requires the width resolution to be a multiple of 8. The VNC implementation requires the width resolution to be a multiple of 16. This patch removes these limits. It was tested with a Windows machine with standard vga and 1366x768 as the resolution. I had to update vgabios as the version in qemu (pc-bios/vgabios-stdvga.bin) is quite old. I also had to add some patches on top of VGABIOS 0.7a to add some new resolutions. I have some doubts about this patch - are other UIs (sdl, cocoa, qxl) happy if the resolution is not a multiple of 16? - the scanline is computed exactly without any alignment (so 1366 at 8 bit is 1366 bytes) while getting vesa information from a laptop it seems to use some kind of alignment (it became 0x580 which is 1408 bytes). Perhaps I should change both VGABIOS and Qemu to apply this alignment? Any plans to re-send the patch with Stefano's comments addressed?
Thanks, -- Pasi Signed-off-by: Frediano Ziglio frediano.zig...@citrix.com --- hw/vga.c |2 +- ui/vnc.c | 27 +-- 2 files changed, 14 insertions(+), 15 deletions(-) diff --git a/hw/vga.c b/hw/vga.c index 1caf23d..d229f06 100644 --- a/hw/vga.c +++ b/hw/vga.c @@ -651,7 +651,7 @@ void vbe_ioport_write_data(void *opaque, uint32_t addr, uint32_t val) } break; case VBE_DISPI_INDEX_XRES: -if ((val = VBE_DISPI_MAX_XRES) ((val 7) == 0)) { +if ((val = VBE_DISPI_MAX_XRES) ((val 1) == 0)) { s-vbe_regs[s-vbe_index] = val; } break; diff --git a/ui/vnc.c b/ui/vnc.c index ff4e2ae..328d14d 100644 --- a/ui/vnc.c +++ b/ui/vnc.c @@ -907,26 +907,27 @@ static int vnc_update_client(VncState *vs, int has_dirty) for (y = 0; y height; y++) { int x; int last_x = -1; -for (x = 0; x width / 16; x++) { -if (test_and_clear_bit(x, vs-dirty[y])) { +for (x = 0; x width; x += 16) { +if (test_and_clear_bit(x/16, vs-dirty[y])) { if (last_x == -1) { last_x = x; } } else { if (last_x != -1) { -int h = find_and_clear_dirty_height(vs, y, last_x, x, +int h = find_and_clear_dirty_height(vs, y, last_x/16, x/16, height); -n += vnc_job_add_rect(job, last_x * 16, y, - (x - last_x) * 16, h); +n += vnc_job_add_rect(job, last_x, y, + (x - last_x), h); } last_x = -1; } } if (last_x != -1) { -int h = find_and_clear_dirty_height(vs, y, last_x, x, height); -n += vnc_job_add_rect(job, last_x * 16, y, - (x - last_x) * 16, h); +int h = find_and_clear_dirty_height(vs, y, last_x/16, x/16, height); +if (x width) x = width; +n += vnc_job_add_rect(job, last_x, y, + (x - last_x), h); } } @@ -1771,7 +1772,7 @@ static void framebuffer_update_request(VncState *vs, int incremental, int w, int h) { int i; -const size_t width = ds_get_width(vs-ds) / 16; +const size_t width = (ds_get_width(vs-ds)+15) / 16; if (y_position ds_get_height(vs-ds)) y_position = ds_get_height(vs-ds); @@ -2595,10 +2596,6 @@ static int vnc_refresh_server_surface(VncDisplay *vd) * Check and copy modified bits from guest to server surface. 
* Update server dirty map. */ -cmp_bytes = 64; -if (cmp_bytes vnc_server_fb_stride(vd)) { -cmp_bytes = vnc_server_fb_stride(vd); -} if (vd-guest.format != VNC_SERVER_FB_FORMAT) { int width = pixman_image_get_width(vd-server); tmpbuf = qemu_pixman_linebuf_create(VNC_SERVER_FB_FORMAT, width); @@ -2619,8 +2616,10 @@ static int vnc_refresh_server_surface(VncDisplay *vd) } server_ptr = server_row; -for (x = 0; x + 15 width; +cmp_bytes = 64; +for (x = 0; x width; x += 16, guest_ptr += cmp_bytes, server_ptr += cmp_bytes) { +if (width - x 16) cmp_bytes = 4 * (width - x); if (!test_and_clear_bit((x / 16), vd-guest.dirty[y])) continue; if (memcmp(server_ptr, guest_ptr, cmp_bytes) == 0) -- 1.7.10.4
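The arithmetic in the final hunk of the vnc.c change is the crux of the non-multiple-of-16 support: the server compares the guest and server surfaces in 16-pixel tiles, and the last tile of an odd-width line must be clamped to the remaining pixels. The helper below restates that arithmetic in isolation; the 4-bytes-per-pixel figure matches the patch's VNC server surface format, but the helper itself is illustrative, not QEMU code.

```c
#include <assert.h>

/* Byte count to memcmp() for the 16-pixel tile starting at pixel x of a
 * line that is `width` pixels wide, at 4 bytes per pixel.  A full tile
 * is 64 bytes; the last tile of a width that is not a multiple of 16 is
 * clamped to the pixels that remain. */
static int tile_cmp_bytes(int width, int x)
{
    int cmp_bytes = 64;              /* 16 pixels * 4 bytes */
    if (width - x < 16) {
        cmp_bytes = 4 * (width - x);
    }
    return cmp_bytes;
}
```

For the 1366-wide mode under discussion, the 86th tile starts at pixel 1360 and covers only the 6 remaining pixels (24 bytes) instead of a full 64-byte tile.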
Re: [Qemu-devel] [PATCH] qga: set umask 0077 when daemonizing (CVE-2013-2007)
On 5/7/2013 at 05:47 AM, Anthony Liguori aligu...@us.ibm.com wrote: From: Laszlo Ersek ler...@redhat.com The qemu guest agent creates a bunch of files with insecure permissions when started in daemon mode. For example: -rw-rw-rw- 1 root root /var/log/qemu-ga.log -rw-rw-rw- 1 root root /var/run/qga.state -rw-rw-rw- 1 root root /var/log/qga-fsfreeze-hook.log In addition, at least all files created with the guest-file-open QMP command, and all files created with shell output redirection (or otherwise) by utilities invoked by the fsfreeze hook script are affected. For now mask all file mode bits for group and others in become_daemon(). Temporarily, for compatibility reasons, stick with the 0666 file-mode in case of files newly created by the guest-file-open QMP call. Do so without changing the umask temporarily. Signed-off-by: Laszlo Ersek ler...@redhat.com Signed-off-by: Anthony Liguori aligu...@us.ibm.com --- qga/commands-posix.c | 123 +-- qga/main.c | 2 +- 2 files changed, 120 insertions(+), 5 deletions(-) diff --git a/qga/commands-posix.c b/qga/commands-posix.c index 3b5c536..04c6951 100644 --- a/qga/commands-posix.c +++ b/qga/commands-posix.c @@ -18,6 +18,9 @@ #include unistd.h #include errno.h #include fcntl.h +#include stdio.h +#include string.h +#include sys/stat.h #include inttypes.h #include qga/guest-agent-core.h #include qga-qmp-commands.h @@ -237,9 +240,122 @@ static GuestFileHandle *guest_file_handle_find(int64_t id, Error **err) return NULL; } +typedef const char * const ccpc; + +/* http://pubs.opengroup.org/onlinepubs/9699919799/functions/fopen.html */ +static const struct { +ccpc *forms; +int oflag_base; +} guest_file_open_modes[] = { +{ (ccpc[]){ r, rb, NULL }, O_RDONLY }, +{ (ccpc[]){ w, wb, NULL }, O_WRONLY | O_CREAT | O_TRUNC }, +{ (ccpc[]){ a, ab, NULL }, O_WRONLY | O_CREAT | O_APPEND }, +{ (ccpc[]){ r+, rb+, r+b, NULL }, O_RDWR }, +{ (ccpc[]){ w+, wb+, w+b, NULL }, O_RDWR | O_CREAT | O_TRUNC }, +{ (ccpc[]){ a+, ab+, a+b, NULL }, O_RDWR | O_CREAT 
| O_APPEND } +}; + +static int +find_open_flag(const char *mode_str, Error **err) +{ +unsigned mode; + +for (mode = 0; mode ARRAY_SIZE(guest_file_open_modes); ++mode) { +ccpc *form; + +form = guest_file_open_modes[mode].forms; +while (*form != NULL strcmp(*form, mode_str) != 0) { +++form; +} +if (*form != NULL) { +break; +} +} + +if (mode == ARRAY_SIZE(guest_file_open_modes)) { +error_setg(err, invalid file open mode '%s', mode_str); +return -1; +} +return guest_file_open_modes[mode].oflag_base | O_NOCTTY | O_NONBLOCK; +} + +#define DEFAULT_NEW_FILE_MODE (S_IRUSR | S_IWUSR | \ + S_IRGRP | S_IWGRP | \ + S_IROTH | S_IWOTH) + +static FILE * +safe_open_or_create(const char *path, const char *mode, Error **err) +{ +Error *local_err = NULL; +int oflag; + +oflag = find_open_flag(mode, local_err); +if (local_err == NULL) { +int fd; + +/* If the caller wants / allows creation of a new file, we implement it + * with a two step process: open() + (open() / fchmod()). + * + * First we insist on creating the file exclusively as a new file. If + * that succeeds, we're free to set any file-mode bits on it. (The + * motivation is that we want to set those file-mode bits independently + * of the current umask.) + * + * If the exclusive creation fails because the file already exists + * (EEXIST is not possible for any other reason), we just attempt to + * open the file, but in this case we won't be allowed to change the + * file-mode bits on the preexistent file. + * + * The pathname should never disappear between the two open()s in + * practice. If it happens, then someone very likely tried to race us. + * In this case just go ahead and report the ENOENT from the second + * open() to the caller. + * + * If the caller wants to open a preexistent file, then the first + * open() is decisive and its third argument is ignored, and the second + * open() and the fchmod() are never called. + */ +fd = open(path, oflag | ((oflag O_CREAT) ? 
O_EXCL : 0), 0); +if (fd == -1 errno == EEXIST) { +oflag = ~(unsigned)O_CREAT; +fd = open(path, oflag); +} + +if (fd == -1) { +error_setg_errno(local_err, errno, failed to open file '%s' + (mode: '%s'), path, mode); +
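Stripped of the error-reporting plumbing, the two-step create-or-open at the heart of the quoted patch looks like this. The function name and signature here are simplified stand-ins for the patch's safe_open_or_create(), which also reports failures through an Error **; the O_EXCL trick itself is exactly the one the patch comment describes.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Try to create the file exclusively so a brand-new file's mode can be
 * set regardless of the umask; if it already exists (EEXIST), fall back
 * to opening it without O_CREAT, leaving its mode bits alone. */
static int open_or_create(const char *path, int oflag, mode_t new_mode)
{
    int fd = open(path, oflag | ((oflag & O_CREAT) ? O_EXCL : 0), 0);
    if (fd == -1 && errno == EEXIST) {
        oflag &= ~(unsigned)O_CREAT;
        fd = open(path, oflag);
    } else if (fd != -1 && (oflag & O_CREAT)) {
        fchmod(fd, new_mode);   /* independent of the process umask */
    }
    return fd;
}
```

The exclusive-create step is what closes the CVE: a file that already existed keeps its owner's mode bits, while a genuinely new file gets exactly the mode the agent chooses, never one widened by an over-permissive umask.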
Re: [Qemu-devel] [RFC PATCH 0/8] MemoryRegion and FlatView refcounting, replace hostmem with memory_region_find
On 09/05/2013 02:53, liu ping fan wrote: On Wed, May 8, 2013 at 11:44 PM, Paolo Bonzini pbonz...@redhat.com wrote: On 08/05/2013 08:20, liu ping fan wrote: On Mon, May 6, 2013 at 10:25 PM, Paolo Bonzini pbonz...@redhat.com wrote: Hi, this is an alternative approach to refactoring of dataplane's HostMem code. Here, I take Ping Fan's idea of RCU-style updating of the region list and apply it to the AddressSpace's FlatView. With this In fact, I am worried about the priority of the MemoryListener; if it is true, then we should drop the RCU-style idea. You mean in hostmem, or in general as in this patch? Note that this patch releases the old FlatView at the end of all MemoryListener operations. Both in hostmem and this patch; they both broke the original design of the MemoryListener, see the notes for priority in the code. I think both hostmem and this patch are fine. The hypervisor is never involved, all accesses go through the old FlatView and regions cannot disappear thanks to ref/unref. In fact, we need _more_ RCU-style updates, not less. For BQL-less dispatch, address space mapping/translation can race against the MemoryListeners in exec.c. To fix this, phys_sections and AddressSpaceDispatch need to be reference counted and RCU-ified as well. Paolo I have sent out 2 patches to highlight this issue, and have CC'd you and Stefanha. Regards, Pingfan Paolo Also if it is true, there is already a bug with the hostmem listener. It should use region_del, not region_nop, to reconstruct the local view. But just let me think about it more deeply. Regards, Pingfan change, dataplane can simply use memory_region_find instead of hostmem. This is a somewhat larger change, but I prefer it for two reasons. 1) it splits the task of adding BQL-less memory dispatch in two parts, tackling memory_region_find first (which is simpler because locking is left to the caller). 2) HostMem duplicates a lot of the FlatView logic, and adding the RCU-style update in FlatView benefits everyone.
The missing ingredients here are: 1) remember and unreference the MemoryRegions that are used in a vring entry. In order to implement this, it is probably simpler to change vring.c to use virtio.c's VirtQueueElement data structure. We want something like that anyway in order to support migration. 2) add an owner field to MemoryRegion, and set it for all MemoryRegions for hot-unpluggable devices. In this series, ref/unref are stubs. For simplicity I based the patches on my IOMMU rebase. I placed the tree at git://github.com/bonzini/qemu.git, branch iommu. Paolo Paolo Bonzini (8): memory: add ref/unref calls exec: check MRU in qemu_ram_addr_from_host memory: return MemoryRegion from qemu_ram_addr_from_host memory: ref/unref memory across address_space_map/unmap memory: access FlatView from a local variable memory: use a new FlatView pointer on every topology update memory: add reference counting to FlatView dataplane: replace hostmem with memory_region_find exec.c| 63 +--- hw/core/loader.c |1 + hw/display/exynos4210_fimd.c |6 + hw/display/framebuffer.c | 10 +- hw/i386/kvm/ioapic.c |2 + hw/i386/kvmvapic.c|1 + hw/misc/vfio.c|2 + hw/virtio/dataplane/Makefile.objs |2 +- hw/virtio/dataplane/hostmem.c | 176 - hw/virtio/dataplane/vring.c | 56 +-- hw/virtio/vhost.c |2 + hw/virtio/virtio-balloon.c|1 + hw/xen/xen_pt.c |4 + include/exec/cpu-common.h |2 +- include/exec/memory.h |9 ++ include/hw/virtio/dataplane/hostmem.h | 57 --- include/hw/virtio/dataplane/vring.h |3 +- kvm-all.c |2 + memory.c | 142 +- target-arm/kvm.c |2 + target-i386/kvm.c |4 +- target-sparc/mmu_helper.c |1 + xen-all.c |2 + 23 files changed, 253 insertions(+), 297 deletions(-) delete mode 100644 hw/virtio/dataplane/hostmem.c delete mode 100644 include/hw/virtio/dataplane/hostmem.h
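The RCU-style FlatView handling the series describes can be reduced, for intuition, to a reference-counted pointer swap. The sketch below is single-threaded and omits the atomics and grace periods real RCU needs; it only shows why an in-flight reader keeps the old view alive across a topology update, which is the property both hostmem and this series rely on.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified, single-threaded model of RCU-style view replacement:
 * readers pin the current view with a reference, writers publish a
 * brand-new view and drop the previous one, and an old view is only
 * freed once its last reader has put its reference. */
typedef struct FlatView {
    int ref;
} FlatView;

static FlatView *current_view;

static FlatView *view_get(void)            /* reader side */
{
    current_view->ref++;
    return current_view;
}

static void view_put(FlatView *v)
{
    if (--v->ref == 0) {
        free(v);
    }
}

static void view_publish(FlatView *new_view)   /* writer side */
{
    FlatView *old = current_view;
    new_view->ref = 1;                 /* the global pointer's reference */
    current_view = new_view;
    if (old) {
        view_put(old);                 /* old view dies with its last reader */
    }
}
```

A reader that grabbed the view before a topology update keeps dispatching against a consistent (if momentarily stale) snapshot instead of crashing on a half-updated one; the real code must additionally make the pointer swap and refcounts atomic.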
Re: [Qemu-devel] [PATCH 9/9] Make monitor command 'dump-guest-memory' dump in kdump-compressed format
On 05/08/2013 11:27 PM, Zhang Xiaohe wrote: On 05/09/2013 01:16, Eric Blake wrote: On 05/08/2013 02:50 AM, qiaonuohan wrote: Thanks for your suggestion. I will fix it like: { 'enum': 'DumpCompressionFormat', 'data': [ 'zlib', 'lzo', 'snappy' ] } Since zlib is treated as the default compression format, 'uncompressed' won't be an option. No, I was serious that you need to provide 'uncompressed' as an explicit enum value. It is very annoying to toggle between four states (three compression formats and a fourth state of no compression) when the fourth is available only by omitting a parameter. The default MUST be 'uncompressed' for backwards-compatibility, not 'zlib'. We'd like to make sure that we understand you precisely. The definition is like below: { 'enum': 'DumpGuestMemoryFormat', 'data': [ 'uncompressed', 'zlib', 'lzo', 'snappy' ] } { 'command': 'dump-guest-memory', 'data': { 'paging': 'bool', 'protocol': 'str', '*begin': 'int', '*length': 'int', '*format': 'DumpCompressionFormat' } } Closer - make sure you use the same type name in both places (the enum name 'DumpGuestMemoryFormat' is slightly nicer than the command's use of 'DumpCompressionFormat'). 'format' is optional: 1. when 'format' is not specified, the vmcore will be in ELF format. 2. when 'format' is specified and its value is 'uncompressed', the vmcore will be in ELF format as well. 3. when 'format' is specified and its value is 'zlib', 'lzo' or 'snappy', the vmcore will be in kdump-compressed format. Correct. Or you could use the name 'elf' instead of 'uncompressed', if that makes more sense - as long as the enum calls out the names you are supporting. If this is what you suggest, then I don't think it is necessary to add 'uncompressed'. The backwards-compatibility is assured by case 1, in which the interface is exactly the same as before. So why do we add another parameter to do the same thing again? Because it is nicer to apps to allow them to explicitly specify the default.
Making 'format' optional allows older apps to still work, but for newer apps, it is easier to ALWAYS supply the format argument than it is to make the generation of the format argument conditional on whether the default behavior is desired. Trust me - I'm reviewing this as a potential user of your interface. When I ask for a fourth enum value, it's because I want to use it. When you keep coming back and complaining that you don't want to provide the fourth enum value because it is redundant, I feel like you aren't paying attention to your user base. I also think that implementation-wise, it will be easier to write your code if you have an enum supplying all four possibilities, since it is easier to write code that defaults a missing argument to a known enum value, and then have the rest of your code deal with enum values, than it is to write code that has to check everywhere whether the argument is missing (for one value) vs. one of three other enum values. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
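Putting the review comments above together, the schema would end up looking something like the following. This is only a sketch of the shape under discussion - the same type name used in both places, with an explicit value for the uncompressed/ELF default - and the actually merged definition may differ:

```
{ 'enum': 'DumpGuestMemoryFormat',
  'data': [ 'elf', 'zlib', 'lzo', 'snappy' ] }

{ 'command': 'dump-guest-memory',
  'data': { 'paging': 'bool', 'protocol': 'str',
            '*begin': 'int', '*length': 'int',
            '*format': 'DumpGuestMemoryFormat' } }
```

With this shape, omitting '*format' and passing 'elf' behave identically, which preserves backwards compatibility while letting new apps always supply the argument.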
[Qemu-devel] [RFC v2] virtio-balloon: automatic ballooning support
Automatic ballooning consists of dynamically adjusting the guest's balloon according to memory pressure in the host and in the guest. This commit implements the host side of automatic ballooning, which basically consists of:

1. Registering with the memory.pressure_level API (from the Linux memory controller cgroup) for the MEDIUM pressure event. This is a new feature starting with Linux kernel 3.10. For more information, please check Documentation/cgroups/memory.txt in the Linux kernel sources.

2. On MEDIUM pressure event reception, QEMU asks the guest kernel to inflate the balloon by 16MB.

3. This is only done if the guest negotiates VIRTIO_BALLOON_F_AUTO_BALLOON, which means the guest kernel's virtio-balloon driver also supports automatic ballooning.

Automatic deflate is performed by the guest. Here are some numbers. The test case is to run 35 VMs (1G of RAM each) in parallel doing a kernel build. The host has 32GB of RAM and 16GB of swap. SWAP IN and SWAP OUT correspond to the number of pages swapped in and swapped out, respectively.

Auto-ballooning disabled:

RUN  TIME(s)  SWAP IN  SWAP OUT
1    634      930980   1588522
2    610      627422   1362174
3    649      1079847  1616367
4    543      953289   1635379
5    642      913237   1514000

Auto-ballooning enabled:

RUN  TIME(s)  SWAP IN  SWAP OUT
1    629      901      12537
2    624      981      18506
3    626      573      9085
4    631      2250     42534
5    627      1610     20808

FIXMEs/TODOs:

- Should we have a lower limit for guest memory? Otherwise it can reach 0 if too many events are received
- Or maybe we should rate-limit events?
- It seems that events are being lost when too many of them are sent at the same time on a busy host - Allow this to be dynamically enabled by mngt Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- o You can find my test script here: http://repo.or.cz/w/qemu/qmp-unstable.git/blob/refs/heads/balloon/auto-ballooning/memcg/rfc:/scripts/autob-test o You can find the guest driver counterpart code at: http://repo.or.cz/w/linux-2.6/luiz-linux-2.6.git/shortlog/refs/heads/virtio-balloon/auto-deflate/rfc o To play with automatic ballooning, do the following: 1. You'll need 3.9+ for the host kernel 2. Get the guest kernel bits from: git://repo.or.cz/linux-2.6/luiz-linux-2.6.git virtio-balloon/auto-deflate/rfc 3. Apply this patch to QEMU 4. Enable the balloon device in qemu with: -device virtio-balloon-pci,auto-balloon=true 5. Generate memory pressure in the host, or put QEMU in a memcg cgroup with limited memory. Watch the VM memory going down 6. Generate pressure in the guest to see it going up again (say, a kernel build with -j16) hw/virtio/virtio-balloon.c | 162 + hw/virtio/virtio-pci.c | 5 ++ hw/virtio/virtio-pci.h | 1 + include/hw/virtio/virtio-balloon.h | 15 4 files changed, 183 insertions(+) diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c index d669756..4b23360 100644 --- a/hw/virtio/virtio-balloon.c +++ b/hw/virtio/virtio-balloon.c @@ -31,6 +31,12 @@ #include hw/virtio/virtio-bus.h +void virtio_balloon_set_conf(DeviceState *dev, VirtIOBalloonConf *bconf) +{ +VirtIOBalloon *s = VIRTIO_BALLOON(dev); +memcpy((s-bconf), bconf, sizeof(struct VirtIOBalloonConf)); +} + static void balloon_page(void *addr, int deflate) { #if defined(__linux__) @@ -279,9 +285,21 @@ static void virtio_balloon_set_config(VirtIODevice *vdev, } } +static bool auto_balloon_enabled(const VirtIOBalloon *s) +{ +return s-bconf.auto_balloon; +} + static uint32_t virtio_balloon_get_features(VirtIODevice *vdev, uint32_t f) { +VirtIOBalloon *s = VIRTIO_BALLOON(vdev); + f |= (1 
<< VIRTIO_BALLOON_F_STATS_VQ);
+
+if (auto_balloon_enabled(s)) {
+f |= (1 << VIRTIO_BALLOON_F_AUTO_BALLOON);
+}
+
 return f;
 }
@@ -336,6 +354,141 @@ static int virtio_balloon_load(QEMUFile *f, void *opaque, int version_id)
 return 0;
 }
+static int open_sysfile(const char *path, const char *file, mode_t mode)
+{
+char *p;
+int fd;
+
+p = g_strjoin("/", path, file, NULL);
+fd = qemu_open(p, mode);
+if (fd < 0) {
+error_report("balloon: can't open '%s': %s", p, strerror(errno));
+}
+
+g_free(p);
+return fd;
+}
+
+static int write_fd(int fd, const char *fmt, ...)
+{
+va_list ap;
+char *str;
+int ret;
+
+va_start(ap, fmt);
+str = g_strdup_vprintf(fmt, ap);
+va_end(ap);
+
+do {
+ret = write(fd, str, strlen(str));
+} while (ret < 0 && errno == EINTR);
+
+if (ret < 0) {
+error_report("balloon: write failed: %s", strerror(errno));
+}
+
+g_free(str);
+return ret;
+}
+
+static bool guest_supports_auto_balloon(const VirtIOBalloon *s)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(s);
+return vdev->guest_features & (1
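The memcg registration in step 1 of the commit message works by writing a control string into the cgroup's cgroup.event_control file. A minimal sketch of building that string follows; the descriptor values are placeholders - a real implementation first obtains them from eventfd(2) and from opening the cgroup's memory.pressure_level file - so this shows only the wire format, not the full registration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sketch of the memcg (cgroup v1) pressure_level registration: the
 * line written into cgroup.event_control has the form
 *     "<eventfd> <memory.pressure_level fd> <level>"
 * where <level> is "low", "medium", or "critical".  The fd numbers
 * used in the usage example are illustrative only. */
static int format_event_control(char *buf, size_t len,
                                int event_fd, int pressure_fd,
                                const char *level)
{
    return snprintf(buf, len, "%d %d %s", event_fd, pressure_fd, level);
}
```

After writing this line, each MEDIUM pressure notification arrives as an 8-byte counter read from the eventfd, which is where QEMU would then ask the guest to inflate the balloon.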
Re: [Qemu-devel] [PATCH 1/2] Vring: vring's listener's priority should higher than kvm
On 09/05/2013 10:54, liu ping fan wrote: On Thu, May 9, 2013 at 4:44 PM, Michael S. Tsirkin m...@redhat.com wrote: On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com Host threads which handle the vring should have higher MemoryListener priority than kvm. With the current code, take the following scenario: kvm_region_add() runs before vhost_region_add(); then, in the guest, vring's desc[i] can refer to addressX in the new region known by the guest. But vhost does not know this new region yet, and the vring handler will fail. Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com Is this seen in testing, or are you describing a theoretical scenario? Please make this clear in the commit log. A theoretical scenario. I think the vcpu thread and vhost are async, so we need this method to sync. But why should this matter for hostmem? It doesn't communicate in any way with the hypervisor. Paolo
Re: [Qemu-devel] [PATCH V18 5/6] add-cow file format core code.
On Thu, May 09, 2013 at 02:24:20PM +0800, Dong Xu Wang wrote: On Thu, Apr 18, 2013 at 6:03 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Wed, Apr 10, 2013 at 04:11:52PM +0800, Dong Xu Wang wrote:

+static coroutine_fn int add_cow_co_flush(BlockDriverState *bs)
+{
+BDRVAddCowState *s = bs->opaque;
+int ret;
+
+qemu_co_mutex_lock(&s->lock);
+if (s->bitmap_cache) {
+ret = block_cache_flush(bs, s->bitmap_cache);
+if (ret < 0) {
+return ret;
+}
+}
+ret = bdrv_flush(s->image_hd);

This is the wrong way around. We must flush image_hd first so that valid data is on disk. Then we can flush bitmap_cache to mark the clusters allocated. Beyond explicit flushes you also need to make sure that image_hd is flushed *before* bitmap_cache tables are written out (e.g. cache eviction when the cache becomes full). It seems this code is missing. Hi Stefan, how can I make sure the image_hd flush operation has completed before I write the bitmap_cache tables? Are there any functions I can use? You could extend the block cache like this: block_cache_add_dependency_func(BlockCache *c, int (*dependency_fn)(BlockCache *c, void *opaque), void *opaque); Then add-cow would register a custom dependency function that flushes image_hd. A hackier way of achieving the same thing is to instantiate an image_hd_cache and set a dependency from the bitmap_cache on the image_hd_cache. The image_hd_cache actually has no cache entries, but it uses image_hd and will flush it. Stefan
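The dependency hook Stefan proposes can be sketched as follows: the cache runs a registered callback before writing any table out, and refuses to write if the callback (for add-cow, a flush of image_hd) fails. All names here are hypothetical - this is the proposed extension, not existing QEMU code - and `demo_flush` is a stub standing in for the real image_hd flush.

```c
#include <assert.h>
#include <stddef.h>

typedef struct BlockCache BlockCache;
typedef int (*DependencyFunc)(BlockCache *c, void *opaque);

struct BlockCache {
    DependencyFunc dependency_fn;  /* runs before tables hit the disk */
    void *dep_opaque;
    int tables_written;            /* stand-in for real table writeback */
};

/* Register a callback that must succeed before any table writeback. */
static void block_cache_add_dependency_func(BlockCache *c,
                                            DependencyFunc fn, void *opaque)
{
    c->dependency_fn = fn;
    c->dep_opaque = opaque;
}

/* Write one cache entry out, honoring the dependency first.  This is
 * the ordering rule from the review: data (image_hd) must be durable
 * before the bitmap table that marks it allocated is written. */
static int block_cache_entry_flush(BlockCache *c)
{
    if (c->dependency_fn) {
        int ret = c->dependency_fn(c, c->dep_opaque);
        if (ret < 0) {
            return ret;            /* data not durable: don't write table */
        }
    }
    c->tables_written++;
    return 0;
}

/* Stub dependency standing in for bdrv_flush(image_hd). */
static int demo_flush(BlockCache *c, void *opaque)
{
    (void)c;
    (*(int *)opaque)++;
    return 0;
}
```

The key property is that the ordering holds on every writeback path - explicit flush and cache eviction alike - because the hook lives inside the cache rather than in the caller.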
Re: [Qemu-devel] [PATCH] curl: fix curl read
On Thu, May 09, 2013 at 02:56:50PM +0800, Fam Zheng wrote: On Thu, 05/09 08:41, Stefan Hajnoczi wrote: On Fri, May 03, 2013 at 04:00:09PM +0800, Fam Zheng wrote:

+cache = curl_find_cache(s, aio_base, aio_bytes);
+if (cache) {
+curl_complete_io(s, acb, cache);
+return;
 }

What is the point of the cache? Can you split it into a separate patch? The cache is for prefetch. Data is fetched in 256k chunks using libcurl and stored in the cache to fill future I/O requests, reducing overall HTTP request overhead. That sounds like a performance optimization which is not part of fixing broken block/curl.c :). But why bother doing this - the guest kernel uses read-ahead anyway to fetch more data than the userspace read(2) requested? This cache amplifies the read-ahead and potentially wastes memory (e.g. in a random-read workload). Let the guest decide how much read-ahead is appropriate. Stefan
Re: [Qemu-devel] [PATCH 1/2] Vring: vring's listener's priority should higher than kvm
On Thu, May 09, 2013 at 05:00:20PM +0800, liu ping fan wrote: On Thu, May 9, 2013 at 4:30 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Thu, May 09, 2013 at 08:40:21AM +0800, Liu Ping Fan wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com Host threads which handle the vring should have higher MemoryListener priority than kvm. With the current code, take the following scenario: kvm_region_add() runs before vhost_region_add(); then, in the guest, vring's desc[i] can refer to addressX in the new region known by the guest. But vhost does not know this new region yet, and the vring handler will fail. Is there a concrete scenario where this happens? I can think of situations like the ioeventfd being readable before vhost/hostmem is populated. But I don't see how that's related to the priority of kvm_region_add(). For kvm, i.e., in the guest, vring_desc.addr can point to a chunk of data in the newly added memory, and kick vhost. vhost has not added this new region, so its local lookup table cannot translate this new address, and the vring handler will fail. If vhost's priority is higher than kvm's, then it will know this new address earlier than kvm. Isn't the real solution to ensure that the memory API is up-to-date before we notify the guest of memory hotplug? I still don't see a kvm vs vhost race. I see a guest vs vhost race which priority doesn't fix. Stefan
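The priority mechanism being argued about can be modeled in a few lines: listeners sit in a list sorted by priority, and region_add callbacks fire in list order, so a listener sorted earlier sees a new region first. This is a toy standalone model with made-up names (`MemoryListenerSim` and friends), not QEMU's memory.c; in QEMU, "add" callbacks run in ascending priority order and "del" callbacks in descending order.

```c
#include <assert.h>
#include <stddef.h>

typedef struct MemoryListenerSim {
    int priority;
    void (*region_add)(struct MemoryListenerSim *l);
    struct MemoryListenerSim *next;
} MemoryListenerSim;

static MemoryListenerSim *listeners;
static int call_order[4], ncalls;

/* Insert keeping the list sorted by ascending priority. */
static void listener_register(MemoryListenerSim *l)
{
    MemoryListenerSim **p = &listeners;
    while (*p && (*p)->priority <= l->priority) {
        p = &(*p)->next;
    }
    l->next = *p;
    *p = l;
}

/* Notify all listeners of a new region, in ascending priority order. */
static void region_add_all(void)
{
    for (MemoryListenerSim *l = listeners; l; l = l->next) {
        l->region_add(l);
    }
}

/* Callback that records the order in which listeners were notified. */
static void record_add(MemoryListenerSim *l)
{
    call_order[ncalls++] = l->priority;
}
```

In this model, giving the vring/vhost listener a numerically lower priority than kvm's makes it learn about a hotplugged region first - which is exactly the ordering the patch relies on, and exactly what Stefan argues does not close the guest-vs-vhost race.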
[Qemu-devel] [PATCH for-1.4.y 0/4] Drop the irredeemably racy cpu_unlink_tb()
This is a backport of the patch series by Peter Maydell, as found at http://thread.gmane.org/gmane.comp.emulators.qemu/196629 , to the 1.4.y stable tree, very similar to what has been done for the 1.1 tree. The difference is in context and because some code was moved between exec.c and translate-all.c, but basically it is still the same series.

Peter Maydell (4):
  tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses
  cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC
  Handle CPU interrupts by inline checking of a flag
  translate-all.c: Remove cpu_unlink_tb()

 cpu-exec.c                | 58 ++-
 exec.c                    |  2 +-
 include/exec/cpu-defs.h   |  1 +
 include/exec/gen-icount.h | 13 +++-
 tcg/tcg.h                 | 49 +-
 translate-all.c           | 73 ++---
 6 files changed, 108 insertions(+), 88 deletions(-)

-- 
1.7.10.4
[Qemu-devel] [PATCH for-1.4.y 2/4] cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC
From: Peter Maydell peter.mayd...@linaro.org If tcg_qemu_tb_exec() returns a value whose low bits don't indicate a link to an indexed next TB, this means that the TB execution never started (eg because the instruction counter hit zero). In this case the guest PC has to be reset to the address of the start of the TB. Refactor the cpu-exec code to make all tcg_qemu_tb_exec() calls pass through a wrapper function which does this restoration if necessary. Note that the apparent change in cpu_exec_nocache() from calling cpu_pc_from_tb() with the old TB to calling it with the TB returned by do_tcg_qemu_tb_exec() is safe, because in the nocache case we can guarantee that the TB we try to execute is not linked to any others, so the only possible returned TB is the one we started at. That is, we should arguably previously have included in cpu_exec_nocache() an assert(next_tb ~TB_EXIT_MASK) == tb), since the API requires restore from next_tb but we were using tb. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Reviewed-by: Richard Henderson r...@twiddle.net Signed-off-by: Blue Swirl blauwir...@gmail.com (cherry picked from commit 77211379d73ea0c89c0b5bb6eee74b17cb06f9a8) Conflicts: cpu-exec.c Signed-off-by: Michael Tokarev m...@tls.msk.ru --- cpu-exec.c | 28 +--- 1 file changed, 17 insertions(+), 11 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 797e11a..4ffae22 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -51,12 +51,26 @@ void cpu_resume_from_signal(CPUArchState *env, void *puc) } #endif +/* Execute a TB, and fix up the CPU state afterwards if necessary */ +static inline tcg_target_ulong cpu_tb_exec(CPUArchState *env, uint8_t *tb_ptr) +{ +tcg_target_ulong next_tb = tcg_qemu_tb_exec(env, tb_ptr); +if ((next_tb TB_EXIT_MASK) TB_EXIT_IDX1) { +/* We didn't start executing this TB (eg because the instruction + * counter hit zero); we must restore the guest PC to the address + * of the start of the TB. 
+ */ +TranslationBlock *tb = (TranslationBlock *)(next_tb ~TB_EXIT_MASK); +cpu_pc_from_tb(env, tb); +} +return next_tb; +} + /* Execute the code without caching the generated code. An interpreter could be used if available. */ static void cpu_exec_nocache(CPUArchState *env, int max_cycles, TranslationBlock *orig_tb) { -tcg_target_ulong next_tb; TranslationBlock *tb; /* Should never happen. @@ -68,14 +82,8 @@ static void cpu_exec_nocache(CPUArchState *env, int max_cycles, max_cycles); env-current_tb = tb; /* execute the generated code */ -next_tb = tcg_qemu_tb_exec(env, tb-tc_ptr); +cpu_tb_exec(env, tb-tc_ptr); env-current_tb = NULL; - -if ((next_tb TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) { -/* Restore PC. This may happen if async event occurs before - the TB starts executing. */ -cpu_pc_from_tb(env, tb); -} tb_phys_invalidate(tb, -1); tb_free(tb); } @@ -597,13 +605,11 @@ int cpu_exec(CPUArchState *env) if (likely(!env-exit_request)) { tc_ptr = tb-tc_ptr; /* execute the generated code */ -next_tb = tcg_qemu_tb_exec(env, tc_ptr); +next_tb = cpu_tb_exec(env, tc_ptr); if ((next_tb TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) { /* Instruction counter expired. */ int insns_left; tb = (TranslationBlock *)(next_tb ~TB_EXIT_MASK); -/* Restore PC. */ -cpu_pc_from_tb(env, tb); insns_left = env-icount_decr.u32; if (env-icount_extra insns_left = 0) { /* Refill decrementer and continue execution. */ -- 1.7.10.4
[Qemu-devel] [PATCH for-1.4.y 1/4] tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses
From: Peter Maydell peter.mayd...@linaro.org Document tcg_qemu_tb_exec(). In particular, its return value is a combination of a pointer to the next translation block and some extra information in the low two bits. Provide some #defines for the values passed in these bits to improve code clarity. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Reviewed-by: Richard Henderson r...@twiddle.net Signed-off-by: Blue Swirl blauwir...@gmail.com (cherry picked from commit 0980011b4f66482d2733ab2dd0f2f61747772c6b) Conflicts: tcg/tcg.h Signed-off-by: Michael Tokarev m...@tls.msk.ru --- cpu-exec.c|9 + include/exec/gen-icount.h |2 +- tcg/tcg.h | 44 +++- 3 files changed, 49 insertions(+), 6 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 19ebb4a..797e11a 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -71,7 +71,7 @@ static void cpu_exec_nocache(CPUArchState *env, int max_cycles, next_tb = tcg_qemu_tb_exec(env, tb-tc_ptr); env-current_tb = NULL; -if ((next_tb 3) == 2) { +if ((next_tb TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) { /* Restore PC. This may happen if async event occurs before the TB starts executing. */ cpu_pc_from_tb(env, tb); @@ -583,7 +583,8 @@ int cpu_exec(CPUArchState *env) spans two pages, we cannot safely do a direct jump. */ if (next_tb != 0 tb-page_addr[1] == -1) { -tb_add_jump((TranslationBlock *)(next_tb ~3), next_tb 3, tb); +tb_add_jump((TranslationBlock *)(next_tb ~TB_EXIT_MASK), +next_tb TB_EXIT_MASK, tb); } spin_unlock(tb_lock); @@ -597,10 +598,10 @@ int cpu_exec(CPUArchState *env) tc_ptr = tb-tc_ptr; /* execute the generated code */ next_tb = tcg_qemu_tb_exec(env, tc_ptr); -if ((next_tb 3) == 2) { +if ((next_tb TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) { /* Instruction counter expired. */ int insns_left; -tb = (TranslationBlock *)(next_tb ~3); +tb = (TranslationBlock *)(next_tb ~TB_EXIT_MASK); /* Restore PC. 
*/ cpu_pc_from_tb(env, tb); insns_left = env-icount_decr.u32; diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h index 8043b3b..c858a73 100644 --- a/include/exec/gen-icount.h +++ b/include/exec/gen-icount.h @@ -32,7 +32,7 @@ static void gen_icount_end(TranslationBlock *tb, int num_insns) if (use_icount) { *icount_arg = num_insns; gen_set_label(icount_label); -tcg_gen_exit_tb((tcg_target_long)tb + 2); +tcg_gen_exit_tb((tcg_target_long)tb + TB_EXIT_ICOUNT_EXPIRED); } } diff --git a/tcg/tcg.h b/tcg/tcg.h index a427972..10eb3f4 100644 --- a/tcg/tcg.h +++ b/tcg/tcg.h @@ -660,7 +660,49 @@ TCGv_i64 tcg_const_local_i64(int64_t val); extern uint8_t *code_gen_prologue; -/* TCG targets may use a different definition of tcg_qemu_tb_exec. */ +/** + * tcg_qemu_tb_exec: + * @env: CPUArchState * for the CPU + * @tb_ptr: address of generated code for the TB to execute + * + * Start executing code from a given translation block. + * Where translation blocks have been linked, execution + * may proceed from the given TB into successive ones. + * Control eventually returns only when some action is needed + * from the top-level loop: either control must pass to a TB + * which has not yet been directly linked, or an asynchronous + * event such as an interrupt needs handling. + * + * The return value is a pointer to the next TB to execute + * (if known; otherwise zero). This pointer is assumed to be + * 4-aligned, and the bottom two bits are used to return further + * information: + * 0, 1: the link between this TB and the next is via the specified + *TB index (0 or 1). That is, we left the TB via (the equivalent + *of) goto_tb index. The main loop uses this to determine + *how to link the TB just executed to the next. + * 2:we are using instruction counting code generation, and we + *did not start executing this TB because the instruction counter + *would hit zero midway through it. 
In this case the next-TB pointer + *returned is the TB we were about to execute, and the caller must + *arrange to execute the remaining count of instructions. + * + * If the bottom two bits indicate an exit-via-index then the CPU + * state is correctly synchronised and ready for execution of the next + * TB (and in particular the guest PC is the address to execute next). + * Otherwise, we gave up on execution of this TB before it started, and + * the caller must fix up the CPU state by calling
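The encoding documented above is easy to exercise in isolation. The sketch below mirrors the constants this patch introduces (patch 3/4 of the series later adds a TB_EXIT_REQUESTED value as well) and shows how the main loop separates the TB pointer from the exit reason, relying on TBs being at least 4-aligned:

```c
#include <assert.h>
#include <stdint.h>

/* Constants matching the tcg_qemu_tb_exec() return-value encoding:
 * the low two bits carry the exit reason, the rest is the TB pointer. */
#define TB_EXIT_MASK            3
#define TB_EXIT_IDX0            0
#define TB_EXIT_IDX1            1
#define TB_EXIT_ICOUNT_EXPIRED  2

/* Recover the TranslationBlock pointer by masking off the low bits. */
static uintptr_t tb_ptr_of(uintptr_t next_tb)
{
    return next_tb & ~(uintptr_t)TB_EXIT_MASK;
}

/* Recover the exit reason carried in the low two bits. */
static unsigned tb_exit_reason(uintptr_t next_tb)
{
    return next_tb & TB_EXIT_MASK;
}
```

This is why `tcg_gen_exit_tb((tcg_target_long)tb + TB_EXIT_ICOUNT_EXPIRED)` in gen-icount.h works: adding the reason into the low bits of an aligned pointer is the same as OR-ing it in.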
[Qemu-devel] [PATCH for-1.4.y 3/4] Handle CPU interrupts by inline checking of a flag
Fix some of the nasty TCG race conditions and crashes by implementing cpu_exit() as setting a flag which is checked at the start of each TB. This avoids crashes if a thread or signal handler calls cpu_exit() while the execution thread is itself modifying the TB graph (which may happen in system emulation mode as well as in linux-user mode with a multithreaded guest binary). This fixes the crashes seen in LP:668799; however there are another class of crashes described in LP:1098729 which stem from the fact that in linux-user with a multithreaded guest all threads will use and modify the same global TCG date structures (including the generated code buffer) without any kind of locking. This means that multithreaded guest binaries are still in the unsupported category. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Reviewed-by: Richard Henderson r...@twiddle.net Signed-off-by: Blue Swirl blauwir...@gmail.com (cherry picked from commit 378df4b23753a11be650af7664ca76bc75cb9f01) Conflicts: exec.c include/qom/cpu.h translate-all.c include/exec/gen-icount.h Signed-off-by: Michael Tokarev m...@tls.msk.ru Conflicts: cpu-exec.c --- cpu-exec.c| 25 - exec.c|2 +- include/exec/cpu-defs.h |1 + include/exec/gen-icount.h | 11 +++ tcg/tcg.h |5 + translate-all.c |4 ++-- 6 files changed, 44 insertions(+), 4 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 4ffae22..1c6af24 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -63,6 +63,12 @@ static inline tcg_target_ulong cpu_tb_exec(CPUArchState *env, uint8_t *tb_ptr) TranslationBlock *tb = (TranslationBlock *)(next_tb ~TB_EXIT_MASK); cpu_pc_from_tb(env, tb); } +if ((next_tb TB_EXIT_MASK) == TB_EXIT_REQUESTED) { +/* We were asked to stop executing TBs (probably a pending + * interrupt. We've now stopped, so clear the flag. 
+ */ +env-tcg_exit_req = 0; +} return next_tb; } @@ -606,7 +612,20 @@ int cpu_exec(CPUArchState *env) tc_ptr = tb-tc_ptr; /* execute the generated code */ next_tb = cpu_tb_exec(env, tc_ptr); -if ((next_tb TB_EXIT_MASK) == TB_EXIT_ICOUNT_EXPIRED) { +switch (next_tb TB_EXIT_MASK) { +case TB_EXIT_REQUESTED: +/* Something asked us to stop executing + * chained TBs; just continue round the main + * loop. Whatever requested the exit will also + * have set something else (eg exit_request or + * interrupt_request) which we will handle + * next time around the loop. + */ +tb = (TranslationBlock *)(next_tb ~TB_EXIT_MASK); +next_tb = 0; +break; +case TB_EXIT_ICOUNT_EXPIRED: +{ /* Instruction counter expired. */ int insns_left; tb = (TranslationBlock *)(next_tb ~TB_EXIT_MASK); @@ -630,6 +649,10 @@ int cpu_exec(CPUArchState *env) next_tb = 0; cpu_loop_exit(env); } +break; +} +default: +break; } } env-current_tb = NULL; diff --git a/exec.c b/exec.c index b85508b..371713a 100644 --- a/exec.c +++ b/exec.c @@ -493,7 +493,7 @@ void cpu_reset_interrupt(CPUArchState *env, int mask) void cpu_exit(CPUArchState *env) { env-exit_request = 1; -cpu_unlink_tb(env); +env-tcg_exit_req = 1; } void cpu_abort(CPUArchState *env, const char *fmt, ...) diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h index 2911b9f..07fce69 100644 --- a/include/exec/cpu-defs.h +++ b/include/exec/cpu-defs.h @@ -161,6 +161,7 @@ typedef struct CPUWatchpoint { uint32_t halted; /* Nonzero if the CPU is in suspend state */ \ uint32_t interrupt_request; \ volatile sig_atomic_t exit_request; \ +volatile sig_atomic_t tcg_exit_req; \ CPU_COMMON_TLB \ struct TranslationBlock *tb_jmp_cache[TB_JMP_CACHE_SIZE]; \ /* buffer for temporaries in the code generator */ \ diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h index c858a73..f45f975 100644 --- a/include/exec/gen-icount.h +++ b/include/exec/gen-icount.h @@ -7,10 +7,18 @@ static TCGArg *icount_arg; static int icount_label; +static int exitreq_label;
[Qemu-devel] [PATCH for-1.4.y 4/4] translate-all.c: Remove cpu_unlink_tb()
From: Peter Maydell peter.mayd...@linaro.org The (unsafe) function cpu_unlink_tb() is now unused, so we can simply remove it and any code that was only used by it. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Reviewed-by: Richard Henderson r...@twiddle.net Signed-off-by: Blue Swirl blauwir...@gmail.com (cherry picked from commit 3a808cc407744c30daa7470b5f191cde1fbc1aae) Conflicts: translate-all.c Signed-off-by: Michael Tokarev m...@tls.msk.ru --- translate-all.c | 69 --- 1 file changed, 69 deletions(-) diff --git a/translate-all.c b/translate-all.c index 1288b2a..ba4d3f6 100644 --- a/translate-all.c +++ b/translate-all.c @@ -1350,55 +1350,6 @@ static TranslationBlock *tb_find_pc(uintptr_t tc_ptr) return tbs[m_max]; } -static void tb_reset_jump_recursive(TranslationBlock *tb); - -static inline void tb_reset_jump_recursive2(TranslationBlock *tb, int n) -{ -TranslationBlock *tb1, *tb_next, **ptb; -unsigned int n1; - -tb1 = tb-jmp_next[n]; -if (tb1 != NULL) { -/* find head of list */ -for (;;) { -n1 = (uintptr_t)tb1 3; -tb1 = (TranslationBlock *)((uintptr_t)tb1 ~3); -if (n1 == 2) { -break; -} -tb1 = tb1-jmp_next[n1]; -} -/* we are now sure now that tb jumps to tb1 */ -tb_next = tb1; - -/* remove tb from the jmp_first list */ -ptb = tb_next-jmp_first; -for (;;) { -tb1 = *ptb; -n1 = (uintptr_t)tb1 3; -tb1 = (TranslationBlock *)((uintptr_t)tb1 ~3); -if (n1 == n tb1 == tb) { -break; -} -ptb = tb1-jmp_next[n1]; -} -*ptb = tb-jmp_next[n]; -tb-jmp_next[n] = NULL; - -/* suppress the jump to next tb in generated code */ -tb_reset_jump(tb, n); - -/* suppress jumps in the tb on which we could have jumped */ -tb_reset_jump_recursive(tb_next); -} -} - -static void tb_reset_jump_recursive(TranslationBlock *tb) -{ -tb_reset_jump_recursive2(tb, 0); -tb_reset_jump_recursive2(tb, 1); -} - #if defined(TARGET_HAS_ICE) !defined(CONFIG_USER_ONLY) void tb_invalidate_phys_addr(hwaddr addr) { @@ -1417,26 +1368,6 @@ void tb_invalidate_phys_addr(hwaddr addr) } #endif /* TARGET_HAS_ICE 
!defined(CONFIG_USER_ONLY) */ -void cpu_unlink_tb(CPUArchState *env) -{ -/* FIXME: TB unchaining isn't SMP safe. For now just ignore the - problem and hope the cpu will stop of its own accord. For userspace - emulation this often isn't actually as bad as it sounds. Often - signals are used primarily to interrupt blocking syscalls. */ -TranslationBlock *tb; -static spinlock_t interrupt_lock = SPIN_LOCK_UNLOCKED; - -spin_lock(interrupt_lock); -tb = env-current_tb; -/* if the cpu is currently executing code, we must unlink it and - all the potentially executing TB */ -if (tb) { -env-current_tb = NULL; -tb_reset_jump_recursive(tb); -} -spin_unlock(interrupt_lock); -} - void tb_check_watchpoint(CPUArchState *env) { TranslationBlock *tb; -- 1.7.10.4
Re: [Qemu-devel] [PATCH for-1.4.y 3/4] Handle CPU interrupts by inline checking of a flag
09.05.2013 19:30, Michael Tokarev wrote: Fix some of the nasty TCG race conditions and crashes by implementing ... Somehow this went out without a proper authorship line, dunno why. I can resend if needed, adding: From: Peter Maydell peter.mayd...@linaro.org Thanks, /mjt
[Qemu-devel] [Bug 1100843] Re: Live Migration Causes Performance Issues
** Changed in: linux (Ubuntu) Status: Incomplete = Confirmed -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1100843 Title: Live Migration Causes Performance Issues Status in QEMU: New Status in “linux” package in Ubuntu: Confirmed Status in “qemu-kvm” package in Ubuntu: Triaged Bug description: I have 2 physical hosts running Ubuntu Precise. With 1.0+noroms- 0ubuntu14.7 and qemu-kvm 1.2.0+noroms-0ubuntu7 (source from quantal, built for Precise with pbuilder.) I attempted to build qemu-1.3.0 debs from source to test, but libvirt seems to have an issue with it that I haven't been able to track down yet. I'm seeing a performance degradation after live migration on Precise, but not Lucid. These hosts are managed by libvirt (tested both 0.9.8-2ubuntu17 and 1.0.0-0ubuntu4) in conjunction with OpenNebula. I don't seem to have this problem with lucid guests (running a number of standard kernels, 3.2.5 mainline and backported linux- image-3.2.0-35-generic as well.) I first noticed this problem with phoronix doing compilation tests, and then tried lmbench where even simple calls experience performance degradation. I've attempted to post to the kvm mailing list, but so far the only suggestion was it may be related to transparent hugepages not being used after migration, but this didn't pan out. 
Someone else has a similar problem here - http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592 qemu command line example: /usr/bin/kvm -name one-2 -S -M pc-1.2 -cpu Westmere -enable-kvm -m 73728 -smp 16,sockets=2,cores=8,threads=1 -uuid f89e31a4-4945-c12c-6544-149ba0746c2f -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-2.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/one//datastores/0/2/disk.0,if=none,id=drive-virtio- disk0,format=raw,cache=none -device virtio-blk- pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio- disk0,bootindex=1 -drive file=/var/lib/one//datastores/0/2/disk.1,if=none,id=drive- ide0-0-0,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=0,drive =drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net- pci,netdev=hostnet0,id=net0,mac=02:00:0a:64:02:fe,bus=pci.0,addr=0x3 -vnc 0.0.0.0:2,password -vga cirrus -incoming tcp:0.0.0.0:49155 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 Disk backend is LVM running on SAN via FC connection (using symlink from /var/lib/one/datastores/0/2/disk.0 above) ubuntu-12.04 - first boot == Simple syscall: 0.0527 microseconds Simple read: 0.1143 microseconds Simple write: 0.0953 microseconds Simple open/close: 1.0432 microseconds Using phoronix pts/compuational ImageMagick - 31.54s Linux Kernel 3.1 - 43.91s Mplayer - 30.49s PHP - 22.25s ubuntu-12.04 - post live migration == Simple syscall: 0.0621 microseconds Simple read: 0.2485 microseconds Simple write: 0.2252 microseconds Simple open/close: 1.4626 microseconds Using phoronix pts/compilation ImageMagick - 43.29s Linux Kernel 3.1 - 76.67s Mplayer - 45.41s PHP - 29.1s I don't have phoronix results for 10.04 handy, but they were within 1% of each other... 
ubuntu-10.04 - first boot == Simple syscall: 0.0524 microseconds Simple read: 0.1135 microseconds Simple write: 0.0972 microseconds Simple open/close: 1.1261 microseconds ubuntu-10.04 - post live migration == Simple syscall: 0.0526 microseconds Simple read: 0.1075 microseconds Simple write: 0.0951 microseconds Simple open/close: 1.0413 microseconds To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1100843/+subscriptions
[Qemu-devel] [PATCH] vfio-pci: Add support for MSI affinity
When MSI is accelerated through KVM the vectors are only programmed when the guest first enables MSI support. Subsequent writes to the vector address or data fields are ignored. Unfortunately that means we're ignoring updates done to adjust SMP affinity of the vectors. MSI SMP affinity already works in non-KVM mode because the address and data fields are read from their backing store on each interrupt. This patch stores the MSIMessage programmed into KVM so that we can determine when changes are made and update the routes.

Signed-off-by: Alex Williamson alex.william...@redhat.com
---

For qemu-1.6

 hw/misc/vfio.c | 47 ---
 1 file changed, 40 insertions(+), 7 deletions(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 693a9ff..67a1580 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -102,6 +102,7 @@ typedef struct VFIOINTx {
 typedef struct VFIOMSIVector {
     EventNotifier interrupt; /* eventfd triggered on interrupt */
     struct VFIODevice *vdev; /* back pointer to device */
+    MSIMessage msg; /* cache the MSI message so we know when it changes */
     int virq; /* KVM irqchip route for QEMU bypass */
     bool use;
 } VFIOMSIVector;
@@ -776,7 +777,6 @@ retry:
     vdev->msi_vectors = g_malloc0(vdev->nr_vectors * sizeof(VFIOMSIVector));

     for (i = 0; i < vdev->nr_vectors; i++) {
-        MSIMessage msg;
         VFIOMSIVector *vector = &vdev->msi_vectors[i];

         vector->vdev = vdev;
@@ -786,13 +786,13 @@ retry:
             error_report("vfio: Error: event_notifier_init failed");
         }

-        msg = msi_get_message(&vdev->pdev, i);
+        vector->msg = msi_get_message(&vdev->pdev, i);

         /*
          * Attempt to enable route through KVM irqchip,
          * default to userspace handling if unavailable.
          */
-        vector->virq = kvm_irqchip_add_msi_route(kvm_state, msg);
+        vector->virq = kvm_irqchip_add_msi_route(kvm_state, vector->msg);
         if (vector->virq < 0 ||
             kvm_irqchip_add_irqfd_notifier(kvm_state, &vector->interrupt,
                                            vector->virq) < 0) {
@@ -898,6 +898,33 @@ static void vfio_disable_msi(VFIODevice *vdev)
             vdev->host.bus, vdev->host.slot, vdev->host.function);
 }

+static void vfio_update_msi(VFIODevice *vdev)
+{
+    int i;
+
+    for (i = 0; i < vdev->nr_vectors; i++) {
+        VFIOMSIVector *vector = &vdev->msi_vectors[i];
+        MSIMessage msg;
+
+        if (!vector->use || vector->virq < 0) {
+            continue;
+        }
+
+        msg = msi_get_message(&vdev->pdev, i);
+
+        if (msg.address != vector->msg.address ||
+            msg.data != vector->msg.data) {
+
+            DPRINTF("%s(%04x:%02x:%02x.%x) MSI vector %d changed\n",
+                    __func__, vdev->host.domain, vdev->host.bus,
+                    vdev->host.slot, vdev->host.function, i);
+
+            kvm_irqchip_update_msi_route(kvm_state, vector->virq, msg);
+            vector->msg = msg;
+        }
+    }
+}
+
 /*
  * IO Port/MMIO - Beware of the endians, VFIO is always little endian
  */
@@ -1850,10 +1877,16 @@ static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,

         is_enabled = msi_enabled(pdev);

-        if (!was_enabled && is_enabled) {
-            vfio_enable_msi(vdev);
-        } else if (was_enabled && !is_enabled) {
-            vfio_disable_msi(vdev);
+        if (!was_enabled) {
+            if (is_enabled) {
+                vfio_enable_msi(vdev);
+            }
+        } else {
+            if (!is_enabled) {
+                vfio_disable_msi(vdev);
+            } else {
+                vfio_update_msi(vdev);
+            }
         }
     } else if (pdev->cap_present & QEMU_PCI_CAP_MSIX &&
         ranges_overlap(addr, len, pdev->msix_cap, MSIX_CAP_LENGTH)) {
[Qemu-devel] [PATCH 0/2] pci-assign: MSI affinity support
I posted these about 6 months ago and Jan felt we should implement MSI notifiers like we have for MSI-X. That still hasn't happened. This is a pretty trivial addition to pci-assign and could easily be ported to MSI notifiers if anyone cares enough down the road to implement them. I don't feel that this, or the vfio patch that adds the same, warrants that kind of infrastructure. I expect these are for qemu-1.6.

Thanks,

Alex

---

Alex Williamson (2):
      pci-assign: Refactor MSI virq array
      pci-assign: Add MSI affinity support

 hw/i386/kvm/pci-assign.c | 78 +++---
 1 file changed, 53 insertions(+), 25 deletions(-)
[Qemu-devel] [PATCH 1/2] pci-assign: Refactor MSI virq array
Convert the msi_virq array into a struct array so we can easily add a place to track the MSIMessage for each vector.

Signed-off-by: Alex Williamson alex.william...@redhat.com
---
 hw/i386/kvm/pci-assign.c | 51 +++---
 1 file changed, 26 insertions(+), 25 deletions(-)

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index c1e08ec..0f83a4c 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -131,8 +131,10 @@ typedef struct AssignedDevice {
     } cap;
     uint8_t emulate_config_read[PCI_CONFIG_SPACE_SIZE];
     uint8_t emulate_config_write[PCI_CONFIG_SPACE_SIZE];
-    int msi_virq_nr;
-    int *msi_virq;
+    int msi_nr;
+    struct {
+        int virq;
+    } *msi;
     MSIXTableEntry *msix_table;
     hwaddr msix_table_addr;
     uint16_t msix_max;
@@ -689,19 +691,18 @@ again:
     return 0;
 }

-static void free_msi_virqs(AssignedDevice *dev)
+static void free_msi(AssignedDevice *dev)
 {
     int i;

-    for (i = 0; i < dev->msi_virq_nr; i++) {
-        if (dev->msi_virq[i] >= 0) {
-            kvm_irqchip_release_virq(kvm_state, dev->msi_virq[i]);
-            dev->msi_virq[i] = -1;
+    for (i = 0; i < dev->msi_nr; i++) {
+        if (dev->msi[i].virq >= 0) {
+            kvm_irqchip_release_virq(kvm_state, dev->msi[i].virq);
         }
     }
-    g_free(dev->msi_virq);
-    dev->msi_virq = NULL;
-    dev->msi_virq_nr = 0;
+    g_free(dev->msi);
+    dev->msi = NULL;
+    dev->msi_nr = 0;
 }

 static void free_assigned_device(AssignedDevice *dev)
@@ -756,7 +757,7 @@ static void free_assigned_device(AssignedDevice *dev)
         close(dev->real_device.config_fd);
     }

-    free_msi_virqs(dev);
+    free_msi(dev);
 }

 static void assign_failed_examine(AssignedDevice *dev)
@@ -995,7 +996,7 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
             perror("assigned_dev_update_msi: deassign irq");
         }

-        free_msi_virqs(assigned_dev);
+        free_msi(assigned_dev);

         assigned_dev->assigned_irq_type = ASSIGNED_IRQ_NONE;
         pci_device_set_intx_routing_notifier(pci_dev, NULL);
@@ -1011,9 +1012,9 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
             return;
         }

-        assigned_dev->msi_virq = g_malloc(sizeof(*assigned_dev->msi_virq));
-        assigned_dev->msi_virq_nr = 1;
-        assigned_dev->msi_virq[0] = virq;
+        assigned_dev->msi = g_malloc(sizeof(*assigned_dev->msi));
+        assigned_dev->msi_nr = 1;
+        assigned_dev->msi[0].virq = virq;
         if (kvm_device_msi_assign(kvm_state, assigned_dev->dev_id, virq) < 0) {
             perror("assigned_dev_update_msi: kvm_device_msi_assign");
         }
@@ -1074,14 +1075,14 @@ static int assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
         return r;
     }

-    free_msi_virqs(adev);
+    free_msi(adev);

-    adev->msi_virq_nr = adev->msix_max;
-    adev->msi_virq = g_malloc(adev->msix_max * sizeof(*adev->msi_virq));
+    adev->msi_nr = adev->msix_max;
+    adev->msi = g_malloc(adev->msix_max * sizeof(*adev->msi));

     entry = adev->msix_table;
     for (i = 0; i < adev->msix_max; i++, entry++) {
-        adev->msi_virq[i] = -1;
+        adev->msi[i].virq = -1;

         if (assigned_dev_msix_skipped(entry)) {
             continue;
@@ -1093,13 +1094,13 @@ static int assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
         if (r < 0) {
             return r;
         }
-        adev->msi_virq[i] = r;
+        adev->msi[i].virq = r;

         DEBUG("MSI-X vector %d, gsi %d, addr %08x_%08x, data %08x\n", i,
               r, entry->addr_hi, entry->addr_lo, entry->data);

         r = kvm_device_msix_set_vector(kvm_state, adev->dev_id, i,
-                                       adev->msi_virq[i]);
+                                       adev->msi[i].virq);
         if (r) {
             error_report("fail to set MSI-X entry! %s", strerror(-r));
             break;
@@ -1127,7 +1128,7 @@ static void assigned_dev_update_msix(PCIDevice *pci_dev)
             perror("assigned_dev_update_msix: deassign irq");
         }

-        free_msi_virqs(assigned_dev);
+        free_msi(assigned_dev);

         assigned_dev->assigned_irq_type = ASSIGNED_IRQ_NONE;
         pci_device_set_intx_routing_notifier(pci_dev, NULL);
@@ -1139,7 +1140,7 @@ static void assigned_dev_update_msix(PCIDevice *pci_dev)
         return;
     }

-    if (assigned_dev->msi_virq_nr > 0) {
+    if (assigned_dev->msi_nr > 0) {
         if (kvm_device_msix_assign(kvm_state, assigned_dev->dev_id) < 0) {
             perror("assigned_dev_enable_msix: assign irq");
             return;
@@ -1567,7 +1568,7 @@ static void assigned_dev_msix_mmio_write(void *opaque, hwaddr addr,
         } else if (assigned_dev_msix_masked(&orig) &&
                    !assigned_dev_msix_masked(entry)) {
             /* Vector unmasked */
-            if (i >= adev->msi_virq_nr || adev->msi_virq[i] < 0) {
+            if (i >= adev->msi_nr || adev->msi[i].virq < 0) {
[Qemu-devel] [PATCH 2/2] pci-assign: Add MSI affinity support
Track the last MSIMessage programmed so we can determine when it has changed and update the routing to the guest.

Signed-off-by: Alex Williamson alex.william...@redhat.com
---
 hw/i386/kvm/pci-assign.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index 0f83a4c..e09265b 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -134,6 +134,7 @@ typedef struct AssignedDevice {
     int msi_nr;
     struct {
         int virq;
+        MSIMessage msg;
     } *msi;
     MSIXTableEntry *msix_table;
     hwaddr msix_table_addr;
@@ -1015,6 +1016,7 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
         assigned_dev->msi = g_malloc(sizeof(*assigned_dev->msi));
         assigned_dev->msi_nr = 1;
         assigned_dev->msi[0].virq = virq;
+        assigned_dev->msi[0].msg = msg;
         if (kvm_device_msi_assign(kvm_state, assigned_dev->dev_id, virq) < 0) {
             perror("assigned_dev_update_msi: kvm_device_msi_assign");
         }
@@ -1027,6 +1029,27 @@ static void assigned_dev_update_msi(PCIDevice *pci_dev)
     }
 }

+static void assigned_dev_update_msi_msg(PCIDevice *pci_dev)
+{
+    AssignedDevice *assigned_dev = DO_UPCAST(AssignedDevice, dev, pci_dev);
+    uint8_t ctrl_byte = pci_get_byte(pci_dev->config + pci_dev->msi_cap +
+                                     PCI_MSI_FLAGS);
+    MSIMessage msg;
+
+    if (assigned_dev->assigned_irq_type != ASSIGNED_IRQ_MSI ||
+        !(ctrl_byte & PCI_MSI_FLAGS_ENABLE)) {
+        return;
+    }
+
+    msg = msi_get_message(pci_dev, 0);
+
+    if (msg.address != assigned_dev->msi[0].msg.address ||
+        msg.data != assigned_dev->msi[0].msg.data) {
+        kvm_irqchip_update_msi_route(kvm_state, assigned_dev->msi[0].virq, msg);
+        assigned_dev->msi[0].msg = msg;
+    }
+}
+
 static bool assigned_dev_msix_masked(MSIXTableEntry *entry)
 {
     return (entry->ctrl & cpu_to_le32(0x1)) != 0;
@@ -1202,6 +1225,10 @@ static void assigned_dev_pci_write_config(PCIDevice *pci_dev, uint32_t address,
         if (range_covers_byte(address, len,
                               pci_dev->msi_cap + PCI_MSI_FLAGS)) {
             assigned_dev_update_msi(pci_dev);
+        } else if (ranges_overlap(address, len,
+                                  pci_dev->msi_cap + PCI_MSI_ADDRESS_LO,
+                                  10 - PCI_MSI_ADDRESS_LO)) {
+            assigned_dev_update_msi_msg(pci_dev);
         }
     }
     if (assigned_dev->cap.available & ASSIGNED_DEVICE_CAP_MSIX) {
Re: [Qemu-devel] (i386) TCG is broken in 1.5-rc0 with guest kernel 3.2?
On 05/08/2013 02:03 AM, Michael Tokarev wrote:

I just tried 1.5-rc0 (32bit x86 userspace), and guest kernel crashes in an interesting way right at init time.

  qemu-system-x86_64 -net none -kernel /boot/vmlinuz-3.2-amd64 -append console=ttyS0 -serial file:serial

Irritatingly, this works for me, with the current fedora 18 kernel,

x86_64-softmmu/qemu-system-x86_64: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=0x3286b0bfb3272b9b3004abe8d0c306f12470d77d, not stripped

  ./x86_64-softmmu/qemu-system-x86_64 -net none -kernel /boot/vmlinuz-3.8.11-200.fc18.x86_64 -append console=ttyS0 -serial stdio

which makes it all the way until it panics for lack of root filesystem. Hopefully this is the problem fixed by Aurelien's tcg/optimize.c patch today (not yet applied to qemu head)?

r~
Re: [Qemu-devel] [PATCH 01/40] memory: assert that PhysPageEntry's ptr does not overflow
On 09/05/2013 05:41, liu ping fan wrote:

On Wed, May 8, 2013 at 12:08 AM, Paolo Bonzini pbonz...@redhat.com wrote:

----- Original Message -----
From: Peter Maydell peter.mayd...@linaro.org
To: Paolo Bonzini pbonz...@redhat.com
Cc: qemu-devel@nongnu.org, a...@ozlabs.ru, jan kiszka jan.kis...@siemens.com, qemul...@gmail.com, stefa...@redhat.com, da...@gibson.dropbear.id.au
Sent: Tuesday, May 7, 2013 17:44:59
Subject: Re: [Qemu-devel] [PATCH 01/40] memory: assert that PhysPageEntry's ptr does not overflow

On 7 May 2013 15:16, Paolo Bonzini pbonz...@redhat.com wrote:

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 exec.c | 2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/exec.c b/exec.c
index 19725db..2e5b89a 100644
--- a/exec.c
+++ b/exec.c
@@ -719,6 +719,8 @@ static void destroy_all_mappings(AddressSpaceDispatch *d)

 static uint16_t phys_section_add(MemoryRegionSection *section)
 {
+    assert(phys_sections_nb < TARGET_PAGE_SIZE);
+
     if (phys_sections_nb == phys_sections_nb_alloc) {
         phys_sections_nb_alloc = MAX(phys_sections_nb_alloc * 2, 16);
         phys_sections = g_renew(MemoryRegionSection, phys_sections,

Why is the limit we're asserting not the same as the maximum size that we pass to g_renew() below? That's a minimum size, isn't it?

I'm asserting that the physical section number doesn't overflow into the page, since the TLB entries are stored as a combination of the two.

Could you explain in more detail? Why TARGET_PAGE_SIZE, not 2^15?

Because the TLB entry is the OR of the page address and the phys_section. Look here:

hwaddr memory_region_section_get_iotlb(CPUArchState *env,
                                       MemoryRegionSection *section,
                                       target_ulong vaddr,
                                       hwaddr paddr,
                                       int prot,
                                       target_ulong *address)
{
    hwaddr iotlb;
    CPUWatchpoint *wp;

    if (memory_region_is_ram(section->mr)) {
        /* Normal RAM. */
        iotlb = (memory_region_get_ram_addr(section->mr) & TARGET_PAGE_MASK)
            + memory_region_section_addr(section, paddr);
        if (!section->readonly) {
            iotlb |= phys_section_notdirty;
        } else {
            iotlb |= phys_section_rom;
        }
    } else {
        iotlb = section - phys_sections;
        iotlb += memory_region_section_addr(section, paddr);
    }

where the else could be written better as:

    iotlb = memory_region_section_addr(section, paddr);
    iotlb |= section - phys_sections;

memory_region_section_addr will return a page-aligned value.

Paolo
Re: [Qemu-devel] (i386) TCG is broken in 1.5-rc0 with guest kernel 3.2?
On Thu, May 09, 2013 at 09:43:20AM -0700, Richard Henderson wrote:

On 05/08/2013 02:03 AM, Michael Tokarev wrote:

I just tried 1.5-rc0 (32bit x86 userspace), and guest kernel crashes in an interesting way right at init time.

  qemu-system-x86_64 -net none -kernel /boot/vmlinuz-3.2-amd64 -append console=ttyS0 -serial file:serial

Irritatingly, this works for me, with the current fedora 18 kernel,

x86_64-softmmu/qemu-system-x86_64: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=0x3286b0bfb3272b9b3004abe8d0c306f12470d77d, not stripped

  ./x86_64-softmmu/qemu-system-x86_64 -net none -kernel /boot/vmlinuz-3.8.11-200.fc18.x86_64 -append console=ttyS0 -serial stdio

which makes it all the way until it panics for lack of root filesystem. Hopefully this is the problem fixed by Aurelien's tcg/optimize.c patch today (not yet applied to qemu head)?

Yes, it is.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net
Re: [Qemu-devel] (i386) TCG is broken in 1.5-rc0 with guest kernel 3.2?
09.05.2013 20:52, Aurelien Jarno wrote:

On Thu, May 09, 2013 at 09:43:20AM -0700, Richard Henderson wrote:

On 05/08/2013 02:03 AM, Michael Tokarev wrote:

I just tried 1.5-rc0 (32bit x86 userspace), and guest kernel crashes in an interesting way right at init time.

  qemu-system-x86_64 -net none -kernel /boot/vmlinuz-3.2-amd64 -append console=ttyS0 -serial file:serial

Irritatingly, this works for me, with the current fedora 18 kernel,

x86_64-softmmu/qemu-system-x86_64: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=0x3286b0bfb3272b9b3004abe8d0c306f12470d77d, not stripped

  ./x86_64-softmmu/qemu-system-x86_64 -net none -kernel /boot/vmlinuz-3.8.11-200.fc18.x86_64 -append console=ttyS0 -serial stdio

Which makes it all the way until it panics for lack of root filesystem. Hopefully this is the problem fixed by Aurelien's tcg/optimize.c patch today (not yet applied to qemu head)?

Yes, it is.

Yes, the patch by Aurelien fixes the reported problem for me. Thanks!

/mjt
Re: [Qemu-devel] [PATCH v6 00/11] rdma: migration support
Comments inline. FYI: please CC mrhi...@us.ibm.com, because it helps me know when to scroll through the bazillion qemu-devel emails. I have things separated out into folders and rules, but a direct CC is better =)

On 05/03/2013 07:28 PM, Chegu Vinod wrote:

Hi Michael, I picked up the qemu bits from your github branch and gave it a try. (BTW the setup I was given temporary access to has a pair of MLX's IB QDR cards connected back to back via QSFP cables.) Observed a couple of things and wanted to share... perhaps you may be aware of them already, or perhaps these are unrelated to your specific changes? (Note: I still haven't finished the review of your changes.)

a) x-rdma-pin-all off case

Seems to only work sometimes but fails at other times. Here is an example...

(qemu) rdma: Accepting rdma connection...
rdma: Memory pin all: disabled
rdma: verbs context after listen: 0x56757d50
rdma: dest_connect Source GID: fe80::2:c903:9:53a5, Dest GID: fe80::2:c903:9:5855
rdma: Accepted migration
qemu-system-x86_64: VQ 1 size 0x100 Guest index 0x4d2 inconsistent with Host index 0x4ec: delta 0xffe6
qemu: warning: error while loading state for instance 0x0 of device 'virtio-net'
load of migration failed

Can you give me more details about the configuration of your VM?

b) x-rdma-pin-all on case:

The guest is not resuming on the target host, i.e. the source host's qemu states that migration is complete but the guest is not responsive anymore... (it doesn't seem to have crashed but it's stuck somewhere). Have you seen this behavior before? Any tips on how I could extract additional info?

Is the QEMU monitor still responsive? Can you capture a screenshot of the guest's console to see if there is a panic? What kind of storage is attached to the VM?

Besides the list of noted restrictions/issues around having to pin all of guest memory... if the pinning is done as part of starting the migration, it ends up taking a noticeably long time for larger guests. Wonder whether that should be counted as part of the total migration time?

That's a good question: the pin-all option should not be slowing down your VM too much, as the VM should still be running before the migration_thread() actually kicks in and starts the migration. I need more information on the configuration of your VM, guest operating system, architecture and so forth... and, as before, whether it is QEMU that is unresponsive or the guest that has panicked...

Also the act of pinning all the memory seems to freeze the guest. E.g.: for larger enterprise-sized guests (say 128GB and higher) the guest is frozen for anywhere from nearly a minute (~50 seconds) to multiple minutes as the guest size increases... which imo kind of defeats the purpose of live guest migration.

That's bad =) There must be a bug somewhere. The largest VM I can create on my hardware is ~16GB - so let me give that a try and try to track down the problem.

Would like to hear if you have already thought about any other alternatives to address this issue? For e.g. would it be better to pin all of the guest's memory as part of starting the guest itself? Yes, there are restrictions when we do pinning... but it can help with performance.

For such a large VM, I would definitely recommend pinning because I'm assuming you have enough processors or a large enough application to actually *use* that much memory, which would suggest that even after the bulk phase round of the migration has already completed, your VM is probably going to remain pretty busy. It's just a matter of me tracking down what's causing the freeze and fixing it. I'll look into it right now on my machine.

---

BTW, a different (yet sort of related) topic... recently a patch went into upstream that provided an option to qemu to mlock all of guest memory: https://lists.gnu.org/archive/html/qemu-devel/2013-04/msg03947.html

I had no idea... very interesting.

but when attempting to do the mlock for larger guests a lot of time is spent bringing each page into cache and clearing/zeroing it etc. etc. https://lists.gnu.org/archive/html/qemu-devel/2013-04/msg04161.html

Wow, I didn't know that either. Perhaps this is what's causing the entire QEMU process and its threads to seize up. It may be necessary to run the pinning command *outside* of QEMU's I/O lock in a separate thread if it's really that much overhead. Thanks a lot for pointing this out.

Note: The basic tcp-based live guest migration in the same qemu version still works fine on the same hosts over a pair of non-RDMA 10Gb NICs connected back-to-back.

Acknowledged.
[Qemu-devel] [PATCH] target-mips: clean-up in BIT_INSV
From: Petar Jovanovic petar.jovano...@imgtec.com

This is a small follow-up change to fix incorrect behaviour for INSV. It includes two minor modifications:
- sizefilter is constant so it can be moved inside of the block,
- (int64_t)0x01 is replaced with 1LL for ease of reading.

No functional change.

Signed-off-by: Petar Jovanovic petar.jovano...@imgtec.com
---
 target-mips/dsp_helper.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/target-mips/dsp_helper.c b/target-mips/dsp_helper.c
index 9212789..426a3b6 100644
--- a/target-mips/dsp_helper.c
+++ b/target-mips/dsp_helper.c
@@ -2900,11 +2900,12 @@ target_ulong helper_bitrev(target_ulong rt)
     return (target_ulong)rd;
 }

-#define BIT_INSV(name, posfilter, sizefilter, ret_type)                 \
+#define BIT_INSV(name, posfilter, ret_type)                             \
 target_ulong helper_##name(CPUMIPSState *env, target_ulong rs,          \
                            target_ulong rt)                             \
 {                                                                       \
     uint32_t pos, size, msb, lsb;                                       \
+    uint32_t const sizefilter = 0x3F;                                   \
     target_ulong filter;                                                \
     target_ulong temp, temprs, temprt;                                  \
     target_ulong dspc;                                                  \
@@ -2921,7 +2922,7 @@ target_ulong helper_##name(CPUMIPSState *env, target_ulong rs, \
         return rt;                                                      \
     }                                                                   \
                                                                         \
-    filter = ((int64_t)0x01 << size) - 1;                               \
+    filter = (1LL << size) - 1;                                         \
     filter = filter << pos;                                             \
     temprs = (rs << pos) & filter;                                      \
     temprt = rt & ~filter;                                              \
@@ -2930,9 +2931,9 @@ target_ulong helper_##name(CPUMIPSState *env, target_ulong rs, \
     return (target_long)(ret_type)temp;                                 \
 }

-BIT_INSV(insv, 0x1F, 0x3F, int32_t);
+BIT_INSV(insv, 0x1F, int32_t);
 #ifdef TARGET_MIPS64
-BIT_INSV(dinsv, 0x7F, 0x3F, target_long);
+BIT_INSV(dinsv, 0x7F, target_long);
 #endif

 #undef BIT_INSV
-- 
1.7.9.5
[Qemu-devel] [PATCH for 1.5] target-i386 ROR r8/r16 imm instruction fix
Fix EFLAGS corruption by ROR r8/r16 imm instruction located at the end of the TB, similarly to commit 089305ac for the non-immediate case.

Reported-by: Hervé Poussineau hpous...@reactos.org
Cc: Richard Henderson r...@twiddle.net
Signed-off-by: Aurelien Jarno aurel...@aurel32.net
---
 target-i386/translate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target-i386/translate.c b/target-i386/translate.c
index 524a0b4..0aeccdb 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -1871,6 +1871,7 @@ static void gen_rot_rm_im(DisasContext *s, int ot, int op1, int op2,
         if (is_right) {
             tcg_gen_shri_tl(cpu_cc_src2, cpu_T[0], mask - 1);
             tcg_gen_shri_tl(cpu_cc_dst, cpu_T[0], mask);
+            tcg_gen_andi_tl(cpu_cc_dst, cpu_cc_dst, 1);
         } else {
             tcg_gen_shri_tl(cpu_cc_src2, cpu_T[0], mask);
             tcg_gen_andi_tl(cpu_cc_dst, cpu_T[0], 1);
-- 
1.7.10.4
Re: [Qemu-devel] [PATCH for 1.5] Revert target-i386: Use movcond to implement rotate flags.
On Wed, May 01, 2013 at 08:06:51AM +0200, Hervé Poussineau wrote:

This commit breaks boot of MS-DOS 6.22, which stops at:
MODE CON CODEPAGE PREPARE=((850) C:\DOS\EGA.CPI)

This reverts part of commit 34d80a55ff8517fd37bcfea5063b9797e2bd9132.

Signed-off-by: Hervé Poussineau hpous...@reactos.org
---
checkpatch.pl complains about this patch, but I preferred to do a simple revert rather than fix code style.

 target-i386/translate.c | 98 ++-
 1 file changed, 45 insertions(+), 53 deletions(-)

The actual bug is the same as the one fixed in commit 089305ac, but for the immediate case. I have just sent a patch to fix the issue.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net
Re: [Qemu-devel] [PATCH for 1.5] target-i386 ROR r8/r16 imm instruction fix
On 05/09/2013 10:40 AM, Aurelien Jarno wrote:

Fix EFLAGS corruption by ROR r8/r16 imm instruction located at the end of the TB, similarly to commit 089305ac for the non-immediate case.

Reported-by: Hervé Poussineau hpous...@reactos.org
Cc: Richard Henderson r...@twiddle.net
Signed-off-by: Aurelien Jarno aurel...@aurel32.net
---
 target-i386/translate.c | 1 +
 1 file changed, 1 insertion(+)

Reviewed-by: Richard Henderson r...@twiddle.net

r~
Re: [Qemu-devel] [PATCH] target-mips: clean-up in BIT_INSV
On Thu, May 09, 2013 at 07:28:37PM +0200, Petar Jovanovic wrote:

From: Petar Jovanovic petar.jovano...@imgtec.com

This is a small follow-up change to fix incorrect behaviour for INSV. It includes two minor modifications:
- sizefilter is constant so it can be moved inside of the block,
- (int64_t)0x01 is replaced with 1LL for ease of reading.

No functional change.

Signed-off-by: Petar Jovanovic petar.jovano...@imgtec.com
---
 target-mips/dsp_helper.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

[...]

Thanks, queued for 1.6.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net
Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/7] pci: add MPC105 PCI host bridge emulation
On Tue, May 7, 2013 at 5:48 AM, Hervé Poussineau hpous...@reactos.org wrote:

Andreas Färber wrote:

On 06.05.2013 22:57, Hervé Poussineau wrote:

Alexander Graf wrote:

On 05/03/2013 07:57 AM, Hervé Poussineau wrote:

Alexander Graf wrote:

On 02.05.2013, at 22:08, Hervé Poussineau hpous...@reactos.org wrote:

Non-contiguous I/O is not implemented. There is also somewhere a bug in the memory controller, which means that some real firmwares may not detect the correct amount of memory. This can be bypassed by adding '-m 1G' on the command line. Add the x-auto-conf property, to automatically configure the memory controller at startup. This will be required by OpenBIOS, which doesn't know how to do it.

Why not teach it? I'd prefer to see that logic in firmware.

Me too, but I'm not confident enough in my capabilities to do it.

Huh? Why not? Most of the device initialization code in OpenBIOS happens in C, so you don't even have to touch Forth code :).

Autoconfiguration is only in one place of the code, so I think it can be removed easily once OpenBIOS has this logic.

I'd prefer if we could come up with a clean model from the start. It really shouldn't be hard at all.

I thought that for all other usages of OpenBIOS in QEMU, RAM was supposed to be available as soon as the machine was powered on. However, I checked the OpenBIOS code: one of the first things done in arch/ppc/qemu/start.S is to copy the exception vectors. So, should I add code before it to detect the memory controller, detect the RAM size and configure the memory controller?

No. Why? QEMU does not depend on the memory controller being initialized, only the OS might expect some registers to be filled in. So you should look at or add the MPC105 PHB initialization hook in OpenBIOS' PCI code, long after the memory is set up.

OpenBIOS depends on memory being available (at least the first KB/MB) even at its very startup, in arch/ppc/qemu/start.S. PCI initialization code comes much later.

OpenBIOS for Sparc32 and Sparc64 does not use RAM before the RAM size has been read from fw_cfg. A check for the QEMU signature is done and the machine ID is also read before that. See arch/sparc32/entry.S and arch/sparc64/entry.S. It should be easy to change PPC to check the machine ID before accessing RAM.

At boot, the MPC105 datasheet says that the memory controller is not configured, i.e. you must not use RAM in OpenBIOS before PCI initialization.

The memory controller should be initialized much earlier than PCI.

For other PPC targets (mac99, g3beige) using OpenBIOS, RAM is accessible at startup, so that's not a problem for OpenBIOS. So, no, QEMU does not depend on the memory controller being initialized, but OpenBIOS depends on the RAM being accessible way before PCI initialization. I am not speaking of the OS, which might (or might not) expect some registers to be filled in. The x-auto-conf property correctly sets some registers, so that memory is available at startup (like on the mac99 and g3beige emulations).

Hervé
Re: [Qemu-devel] [RFC] reverse execution.
On Tue, May 7, 2013 at 6:27 PM, KONRAD Frédéric fred.kon...@greensocs.com wrote:

Hi,

We are trying to find a way to make reverse execution work with QEMU. Actually, it is possible to debug the guest through the gdbstub; we want to make reverse execution possible with GDB as well.

How we are trying to make that work (basically, without optimisation):

- QEMU takes regular snapshots of the VM: that can be done with the savevm code without optimisation first.
- When the VM is stopped and GDB requests a reverse-step: load the last snapshot and replay to one instruction before the current PC.

There is one issue with that for now (for basic running reverse execution):

- How to stop one instruction before the actual PC.

Add a special translation mode for reverse execution where the next PC is checked after each instruction. Alternatively, you could make temporary snapshots during this mode (first 1s intervals, then 0.1s etc) which could be used to find the location. I think this was discussed briefly earlier on the list, please check the archives.

We thought that using -icount and stopping the guest a little time before the actual position would give us the right behavior (we use a qemu_timer with vm_clock to stop the vm at the right time), but it seems that it is not deterministic, and not reproducible. Is that normal? We don't make any input during the replay, and we thought that it could be caused by some timer interruption, but -icount uses a virtual timer as I understand?

We have two other ideas:

- Using TCI and counting each instruction executed by the processor, then stopping one instruction before the actual position. This seems slower.
- Using single-step to count each instruction, then stop one instruction before the actual position. Would that be better?

For now we can restore the VM from the last snapshot when we do a reverse-step, but we can't stop at the exact position.

Thanks,
Fred
Re: [Qemu-devel] [PATCH] target-mips: clean-up in BIT_INSV
On 9 May 2013 18:28, Petar Jovanovic petar.jovano...@rt-rk.com wrote:

> @@ -2921,7 +2922,7 @@ target_ulong helper_##name(CPUMIPSState *env, target_ulong rs, \
>          return rt;                                                      \
>      }                                                                   \
>                                                                          \
> -    filter = ((int64_t)0x01 << size) - 1;                               \
> +    filter = (1LL << size) - 1;                                         \
>      filter = filter << pos;                                             \
>      temprs = (rs << pos) & filter;                                      \
>      temprt = rt & ~filter;                                              \

This section of code is pretty much hand-coding

    temp = deposit64(rt, pos, size, rs);

isn't it?

-- PMM
Re: [Qemu-devel] sparc-linux-user: Fix missing symbols in .rel/.rela.plt sections
On Wed, May 8, 2013 at 7:18 AM, Michael Tokarev m...@tls.msk.ru wrote:

Ping #2?

06.04.2013 10:08, Michael Tokarev wrote:

This patch was submitted more than a year ago (in Jan-2012). Is it still needed? If yes, why hasn't it been applied?

Well, is it still needed? It still applies cleanly to the current git, with the exception of s|^|ldscripts/| - sparc.ld moved from the top directory to ldscripts/.

(Ref: http://patchwork.ozlabs.org/patch/135267 )

Thanks,
/mjt

10.01.2012 11:38, Aurelien Jarno wrote:

sparc-linux-user: Fix missing symbols in .rel/.rela.plt sections

Fix .rel.plt sections in the output to not only include .rel.plt sections from the input but also the .rel.iplt sections, and to define the hidden symbols __rel_iplt_start and __rel_iplt_end around .rel.iplt, as otherwise we get undefined references to these when linking statically to a multiarch-enabled libc (using STT_GNU_IFUNC).

Cc: Blue Swirl blauwir...@gmail.com
Signed-off-by: Aurelien Jarno aurel...@aurel32.net
---
 sparc.ld | 16 ++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/sparc.ld b/sparc.ld
index 56efe34..cec17c9 100644
--- a/sparc.ld
+++ b/sparc.ld
@@ -37,8 +37,20 @@ SECTIONS
   .rela.fini     : { *(.rela.fini) }
   .rel.bss       : { *(.rel.bss) }
   .rela.bss      : { *(.rela.bss) }
-  .rel.plt       : { *(.rel.plt) }
-  .rela.plt      : { *(.rela.plt) }
+  .rel.plt       :
+  {
+    *(.rel.plt)
+    PROVIDE (__rel_iplt_start = .);
+    *(.rel.iplt)
+    PROVIDE (__rel_iplt_end = .);
+  }
+  .rela.plt      :
+  {
+    *(.rela.plt)
+    PROVIDE (__rela_iplt_start = .);
+    *(.rela.iplt)
+    PROVIDE (__rela_iplt_end = .);
+  }
   .init          : { *(.init) } =0x47ff041f
   .text          :
   {
Re: [Qemu-devel] Reporting Heisenbugs in qemu
On Wed, May 8, 2013 at 10:18 AM, Paolo Bonzini pbonz...@redhat.com wrote: Paolo Bonzini pbonz...@redhat.com writes: I guess that's the register windows. There's only so much you can do to optimize them, and heavily recursive workloads (like Perl, or the RTL half of GCC) pay a hefty price. Two qemu targets stand out for slowness, sparc (32 and 64) and mips (64, don't know about 32). x86 (32 and 64), arm, and ppc run with a slowdown of 30 for my bogus benchmark of GMP configure+make. With FreeBSD x86_64 I see a slowdown of just 13. (My reference system runs FreeBSD, so running FreeBSD under qemu is only fair.) My claimed slowdown factors are affected by kernel, libraries, and unfortunately very much by gcc speed, which vary with target. If the sparc emulation speed is due to register windows, then why does mips seem just as slow? No idea. :) Of all the architectures you listed, MIPS is really the one that I have no inkling of... If a register window shortage is a problem, it should be easy to pretend to have lots of them, right? That can help to avoid trapping to the kernel. But you still have to spill the whole window to the stack on every call to a non-leaf function, which can be expensive even if you do it in the translated code. The kernel wouldn't be able to spill or fill more than 8 windows anyway. But maybe we could implement window cleaning in QEMU, if nobody minds not seeing any CLEANWIN traps. I don't think real HW can clean the windows. Register window access in QEMU could be improved slightly; for example, it may be possible to handle wrapping of the registers with sparse layout and page mapping tricks. Also the translator might be able to keep track of the current CWP and generate direct accesses instead of going through regwptr. Paolo
Re: [Qemu-devel] Profiling sparc64 emulation
On Wed, May 08, 2013 at 11:02:24PM +0200, Artyom Tarasenko wrote: On Wed, May 8, 2013 at 12:57 AM, Aurelien Jarno aurel...@aurel32.net wrote: On Tue, May 07, 2013 at 11:29:20PM +0200, Artyom Tarasenko wrote: On Tue, May 7, 2013 at 1:38 PM, Torbjorn Granlund t...@gmplib.org wrote: The 2nd table of http://gmplib.org/devel/testsystems.html shows all emulated systems I am using, most of which are qemu-based. Do I read it correctly that qemu-system-ppc64 with the slowdown factor of 33 is ~3 times faster than qemu-system-sparc64 with the slowdown factor of 96? Do they both use a Debian Wheezy guest? You have a remark that ppc64 has problems with its clock. Was it taken into account when the slowdown factors were calculated? Clock or not, it should be noted that qemu-system-sparc64 is undoubtedly slower (at least 5 to 10 times) than qemu-system-{arm,ppc,mips,...} on some types of load, like perl scripts. That's interesting. Actually it should be possible to launch perl under user-mode qemu-sparc32plus. Is it possible to launch perl under user-mode qemu-ppc{32,64} too? That would allow us to understand whether the bad performance has to do with TCG or the rest of the system emulation. I haven't done that yet, but I have run perf top while running a perl script (lintian), on both qemu-system-sparc64 and qemu-system-ppc64. The results are quite different: qemu-system-ppc64 - 49,73% perf-10672.map [.] 0x7f7853ab4e0f 13,23% qemu-system-ppc64 [.] cpu_ppc_exec 13,16% libglib-2.0.so.0.3200.4 [.] g_hash_table_lookup 8,18% libglib-2.0.so.0.3200.4 [.] g_str_hash 2,47% qemu-system-ppc64 [.] object_class_dynamic_cast 1,97% qemu-system-ppc64 [.] type_is_ancestor 1,05% libglib-2.0.so.0.3200.4 [.] g_str_equal 0,91% qemu-system-ppc64 [.] ppc_cpu_do_interrupt 0,90% qemu-system-ppc64 [.] object_dynamic_cast_assert 0,79% libc-2.13.so [.] __sigsetjmp 0,62% qemu-system-ppc64 [.] type_get_parent.isra.3 0,58% qemu-system-ppc64 [.] type_get_by_name 0,57% qemu-system-ppc64 [.]
qemu_log_mask
0,54% qemu-system-ppc64 [.] object_dynamic_cast

qemu-system-sparc64
---
17,43% perf-8154.map [.] 0x7f6ac10245c8
10,46% qemu-system-sparc64 [.] tcg_optimize
10,36% qemu-system-sparc64 [.] cpu_sparc_exec
6,35% qemu-system-sparc64 [.] tb_flush_jmp_cache
4,75% qemu-system-sparc64 [.] get_physical_address_data
4,45% qemu-system-sparc64 [.] tcg_liveness_analysis
4,35% qemu-system-sparc64 [.] tcg_reg_alloc_op
2,90% qemu-system-sparc64 [.] tlb_flush_page
2,35% qemu-system-sparc64 [.] disas_sparc_insn
2,28% qemu-system-sparc64 [.] get_physical_address_code
2,21% qemu-system-sparc64 [.] tlb_flush
1,64% qemu-system-sparc64 [.] tcg_out_opc
1,22% qemu-system-sparc64 [.] tcg_out_modrm_sib_offset.constprop.41
1,20% qemu-system-sparc64 [.] helper_ld_asi
1,14% qemu-system-sparc64 [.] gen_intermediate_code_pc
1,04% qemu-system-sparc64 [.] helper_st_asi
1,00% qemu-system-sparc64 [.] object_class_dynamic_cast
0,98% qemu-system-sparc64 [.] tb_find_pc
Re: [Qemu-devel] Profiling sparc64 emulation
On 09/05/2013 20:30, Aurelien Jarno wrote: 13,16% libglib-2.0.so.0.3200.4 [.] g_hash_table_lookup 8,18% libglib-2.0.so.0.3200.4 [.] g_str_hash 2,47% qemu-system-ppc64 [.] object_class_dynamic_cast 1,97% qemu-system-ppc64 [.] type_is_ancestor That's worrisome, but should be easy to fix... can you make a callgraph profile? Paolo
[Qemu-devel] [RFC PATCH v5 2/3] Add 'auto-converge' migration capability
The auto-converge migration capability allows the user to specify whether the live migration sequence should automatically detect and force convergence. Signed-off-by: Chegu Vinod chegu_vi...@hp.com --- include/migration/migration.h |2 ++ migration.c |9 + qapi-schema.json |5 - 3 files changed, 15 insertions(+), 1 deletions(-) diff --git a/include/migration/migration.h b/include/migration/migration.h index e2acec6..ace91b0 100644 --- a/include/migration/migration.h +++ b/include/migration/migration.h @@ -127,4 +127,6 @@ int migrate_use_xbzrle(void); int64_t migrate_xbzrle_cache_size(void); int64_t xbzrle_cache_resize(int64_t new_size); + +bool migrate_auto_converge(void); #endif diff --git a/migration.c b/migration.c index 3eb0fad..570cee5 100644 --- a/migration.c +++ b/migration.c @@ -474,6 +474,15 @@ void qmp_migrate_set_downtime(double value, Error **errp) max_downtime = (uint64_t)value; } +bool migrate_auto_converge(void) +{ +MigrationState *s; + +s = migrate_get_current(); + +return s->enabled_capabilities[MIGRATION_CAPABILITY_AUTO_CONVERGE]; +} + int migrate_use_xbzrle(void) { MigrationState *s; diff --git a/qapi-schema.json b/qapi-schema.json index 199744a..b33839c 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -602,10 +602,13 @@ # This feature allows us to minimize migration traffic for certain work # loads, by sending compressed difference of the pages # +# @auto-converge: Migration supports automatic throttling down of guest +# to force convergence. (since 1.6) +# # Since: 1.2 ## { 'enum': 'MigrationCapability', - 'data': ['xbzrle'] } + 'data': ['xbzrle', 'auto-converge'] } ## # @MigrationCapabilityStatus -- 1.7.1
[Qemu-devel] [RFC PATCH v5 0/3] Throttle-down guest to help with live migration convergence.
Busy enterprise workloads hosted on large-sized VMs tend to dirty memory faster than the transfer rate achieved via live guest migration. Despite some good recent improvements (using dedicated 10Gig NICs between hosts) the live migration does NOT converge. If a user chooses to force convergence of their migration via the new auto-converge migration capability, then this change will auto-detect the lack-of-convergence scenario and trigger a slowdown of the workload by explicitly disallowing the VCPUs from spending much time in the VM context. The migration thread tries to catch up and this eventually leads to convergence in some deterministic amount of time. Yes, it does impact the performance of all the VCPUs, but in my observation that lasts only for a short duration of time, i.e. we end up entering stage 3 (the downtime phase) soon after that. No external trigger is required. Thanks to Juan and Paolo for their useful suggestions. --- Changes from v4: - incorporated feedback from Paolo. - split into 3 patches. Changes from v3: - incorporated feedback from Paolo and Eric - rebased to latest qemu.git Changes from v2: - incorporated feedback from Orit, Juan and Eric - stop the throttling thread at the start of stage 3 - rebased to latest qemu.git Changes from v1: - rebased to latest qemu.git - added auto-converge capability (default off) - suggested by Anthony Liguori & Eric Blake. Signed-off-by: Chegu Vinod chegu_vi...@hp.com Chegu Vinod (3): Introduce async_run_on_cpu() Add 'auto-converge' migration capability Force auto-convergence of live migration arch_init.c | 68 + cpus.c| 29 + include/migration/migration.h |6 +++ include/qemu-common.h |1 + include/qom/cpu.h | 10 ++ migration.c | 10 ++ qapi-schema.json |5 ++- 7 files changed, 128 insertions(+), 1 deletions(-)
[Qemu-devel] [RFC PATCH v5 1/3] Introduce async_run_on_cpu()
Introduce an asynchronous version of run_on_cpu(), i.e. the caller doesn't have to block till the callback routine finishes execution on the target vcpu. Signed-off-by: Chegu Vinod chegu_vi...@hp.com --- cpus.c| 29 + include/qemu-common.h |1 + include/qom/cpu.h | 10 ++ 3 files changed, 40 insertions(+), 0 deletions(-) diff --git a/cpus.c b/cpus.c index c232265..8cd4eab 100644 --- a/cpus.c +++ b/cpus.c @@ -653,6 +653,7 @@ void run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data) wi.func = func; wi.data = data; +wi.free = false; if (cpu->queued_work_first == NULL) { cpu->queued_work_first = &wi; } else { @@ -671,6 +672,31 @@ void run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data) } } +void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data) +{ +struct qemu_work_item *wi; + +if (qemu_cpu_is_self(cpu)) { +func(data); +return; +} + +wi = g_malloc0(sizeof(struct qemu_work_item)); +wi->func = func; +wi->data = data; +wi->free = true; +if (cpu->queued_work_first == NULL) { +cpu->queued_work_first = wi; +} else { +cpu->queued_work_last->next = wi; +} +cpu->queued_work_last = wi; +wi->next = NULL; +wi->done = false; + +qemu_cpu_kick(cpu); +} + static void flush_queued_work(CPUState *cpu) { struct qemu_work_item *wi; @@ -683,6 +709,9 @@ static void flush_queued_work(CPUState *cpu) cpu->queued_work_first = wi->next; wi->func(wi->data); wi->done = true; +if (wi->free) { +g_free(wi); +} } cpu->queued_work_last = NULL; qemu_cond_broadcast(&qemu_work_cond); diff --git a/include/qemu-common.h b/include/qemu-common.h index b399d85..bad6e1f 100644 --- a/include/qemu-common.h +++ b/include/qemu-common.h @@ -286,6 +286,7 @@ struct qemu_work_item { void (*func)(void *data); void *data; int done; +bool free; }; #ifdef CONFIG_USER_ONLY diff --git a/include/qom/cpu.h b/include/qom/cpu.h index 7cd9442..46465e9 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -265,6 +265,16 @@ bool cpu_is_stopped(CPUState *cpu); void run_on_cpu(CPUState *cpu, void (*func)(void
*data), void *data); /** + * async_run_on_cpu: + * @cpu: The vCPU to run on. + * @func: The function to be executed. + * @data: Data to pass to the function. + * + * Schedules the function @func for execution on the vCPU @cpu asynchronously. + */ +void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data); + +/** * qemu_for_each_cpu: * @func: The function to be executed. * @data: Data to pass to the function. -- 1.7.1
[Qemu-devel] [RFC PATCH v5 3/3] Force auto-convergence of live migration
If a user chooses to turn on the auto-converge migration capability, these changes detect the lack of convergence and throttle down the guest, i.e. force the VCPUs out of the guest for some duration and let the migration thread catch up and help converge. Verified the convergence using the following: - SpecJbb2005 workload running on a 20VCPU/256G guest (~80% busy) - OLTP-like workload running on an 80VCPU/512G guest (~80% busy) Sample results with SpecJbb2005 workload: (migrate speed set to 20Gb and migrate downtime set to 4 seconds). (qemu) info migrate capabilities: xbzrle: off auto-converge: off Migration status: active total time: 1487503 milliseconds expected downtime: 519 milliseconds transferred ram: 383749347 kbytes remaining ram: 2753372 kbytes total ram: 268444224 kbytes duplicate: 65461532 pages skipped: 64901568 pages normal: 95750218 pages normal bytes: 383000872 kbytes dirty pages rate: 67551 pages --- (qemu) info migrate capabilities: xbzrle: off auto-converge: on Migration status: completed total time: 241161 milliseconds downtime: 6373 milliseconds transferred ram: 28235307 kbytes remaining ram: 0 kbytes total ram: 268444224 kbytes duplicate: 64946416 pages skipped: 64903523 pages normal: 7044971 pages normal bytes: 28179884 kbytes Signed-off-by: Chegu Vinod chegu_vi...@hp.com --- arch_init.c | 68 + include/migration/migration.h |4 ++ migration.c |1 + 3 files changed, 73 insertions(+), 0 deletions(-) diff --git a/arch_init.c b/arch_init.c index 49c5dc2..29788d6 100644 --- a/arch_init.c +++ b/arch_init.c @@ -49,6 +49,7 @@ #include "trace.h" #include "exec/cpu-all.h" #include "hw/acpi/acpi.h" +#include "sysemu/cpus.h" #ifdef DEBUG_ARCH_INIT #define DPRINTF(fmt, ...)
\ @@ -104,6 +105,8 @@ int graphic_depth = 15; #endif const uint32_t arch_type = QEMU_ARCH; +static bool mig_throttle_on; + /***/ /* ram save/restore */ @@ -378,8 +381,15 @@ static void migration_bitmap_sync(void) uint64_t num_dirty_pages_init = migration_dirty_pages; MigrationState *s = migrate_get_current(); static int64_t start_time; +static int64_t bytes_xfer_prev; static int64_t num_dirty_pages_period; int64_t end_time; +int64_t bytes_xfer_now; +static int dirty_rate_high_cnt; + +if (!bytes_xfer_prev) { +bytes_xfer_prev = ram_bytes_transferred(); +} if (!start_time) { start_time = qemu_get_clock_ms(rt_clock); @@ -404,6 +414,23 @@ static void migration_bitmap_sync(void) /* more than 1 second = 1000 milliseconds */ if (end_time > start_time + 1000) { +if (migrate_auto_converge()) { +/* The following detection logic can be refined later. For now: + Check to see if the dirtied bytes are 50% more than the approx. + amount of bytes that just got transferred since the last time we + were in this routine. If that happens N times (for now N==5) + we turn on the throttle-down logic */ +bytes_xfer_now = ram_bytes_transferred(); +if (s->dirty_pages_rate && +((num_dirty_pages_period * TARGET_PAGE_SIZE) > +((bytes_xfer_now - bytes_xfer_prev)/2))) { +if (dirty_rate_high_cnt++ > 5) { +DPRINTF("Unable to converge. Throttling down guest\n"); +mig_throttle_on = true; +} + } + bytes_xfer_prev = bytes_xfer_now; +} s->dirty_pages_rate = num_dirty_pages_period * 1000 / (end_time - start_time); s->dirty_bytes_rate = s->dirty_pages_rate * TARGET_PAGE_SIZE; @@ -496,6 +523,15 @@ static int ram_save_block(QEMUFile *f, bool last_stage) return bytes_sent; } +bool throttling_needed(void) +{ +if (!migrate_auto_converge()) { +return false; +} + +return mig_throttle_on; +} + static uint64_t bytes_transferred; static ram_addr_t ram_save_remaining(void) @@ -1098,3 +1134,35 @@ TargetInfo *qmp_query_target(Error **errp) return info; } + +static void mig_delay_vcpu(void) +{ +qemu_mutex_unlock_iothread(); +g_usleep(50*1000); +qemu_mutex_lock_iothread(); +} + +/* Stub used for getting the vcpu out of VM and into qemu via + run_on_cpu() */ +static void mig_kick_cpu(void *opq) +{ +mig_delay_vcpu(); +return; +} + +/* To reduce the dirty rate, explicitly disallow the VCPUs from spending + much time in the VM. The migration thread will try to catch up. + Workload will experience a performance drop. +*/ +void migration_throttle_down(void) +{ +if (throttling_needed()) { +CPUArchState *penv = first_cpu; +while (penv) { +qemu_mutex_lock_iothread(); +async_run_on_cpu(ENV_GET_CPU(penv), mig_kick_cpu, NULL); +
Re: [Qemu-devel] [RFC PATCH v5 3/3] Force auto-convergence of live migration
On Thu, 9 May 2013 12:43:20 -0700 Chegu Vinod chegu_vi...@hp.com wrote: If a user chooses to turn on the auto-converge migration capability these changes detect the lack of convergence and throttle down the guest. i.e. force the VCPUs out of the guest for some duration and let the migration thread catch up and help converge. [...] +void migration_throttle_down(void) +{ +if (throttling_needed()) { +CPUArchState *penv = first_cpu; +while (penv) { +qemu_mutex_lock_iothread(); +async_run_on_cpu(ENV_GET_CPU(penv), mig_kick_cpu, NULL); +qemu_mutex_unlock_iothread(); +penv = penv->next_cpu; could you replace the open-coded loop with qemu_for_each_cpu()? +} +} +} -- Regards, Igor
Re: [Qemu-devel] Profiling sparc64 emulation
On Thu, May 9, 2013 at 8:30 PM, Aurelien Jarno aurel...@aurel32.net wrote: On Wed, May 08, 2013 at 11:02:24PM +0200, Artyom Tarasenko wrote: On Wed, May 8, 2013 at 12:57 AM, Aurelien Jarno aurel...@aurel32.net wrote: On Tue, May 07, 2013 at 11:29:20PM +0200, Artyom Tarasenko wrote: On Tue, May 7, 2013 at 1:38 PM, Torbjorn Granlund t...@gmplib.org wrote: The 2nd table of http://gmplib.org/devel/testsystems.html shows all emulated systems I am using, most of which are qemu-based. Do I read it correct that qemu-system-ppc64 with the slowdown factor of 33 is ~3 times faster than qemu-system-sparc64 with the slowdown factor of 96 ? Do they both use Debian Wheezy guest? You have a remark that ppc64 has problems with its clock. Was it taken into account when the slowdown factors were calculated? Clock or not, it should be noted that qemu-system-sparc64 is undoubtedly slower (at least 5 to 10 times) than qemu-system-{arm,ppc,mips,...} on some type of load like perl scripts. That's interesting. Actually it should be possible to lauch perl under user mode qemu-sparc32plus. Is it possible to launch perl under user mode qemu-ppc{32,64} too? That would allow to understand whether the bad performance has to do with TCG or the rest of the system emulation. I haven't done that yet, but I have run perf top while running perl script (lintian), on both qemu-system-sparc64 and qemu-system-ppc64. The results are quite different: qemu-system-ppc64 - 49,73% perf-10672.map [.] 0x7f7853ab4e0f 13,23% qemu-system-ppc64[.] cpu_ppc_exec 13,16% libglib-2.0.so.0.3200.4 [.] g_hash_table_lookup 8,18% libglib-2.0.so.0.3200.4 [.] g_str_hash 2,47% qemu-system-ppc64[.] object_class_dynamic_cast 1,97% qemu-system-ppc64[.] type_is_ancestor 1,05% libglib-2.0.so.0.3200.4 [.] g_str_equal 0,91% qemu-system-ppc64[.] ppc_cpu_do_interrupt 0,90% qemu-system-ppc64[.] object_dynamic_cast_assert 0,79% libc-2.13.so [.] __sigsetjmp 0,62% qemu-system-ppc64[.] type_get_parent.isra.3 0,58% qemu-system-ppc64[.] 
type_get_by_name
0,57% qemu-system-ppc64 [.] qemu_log_mask
0,54% qemu-system-ppc64 [.] object_dynamic_cast

qemu-system-sparc64
---
17,43% perf-8154.map [.] 0x7f6ac10245c8
10,46% qemu-system-sparc64 [.] tcg_optimize
10,36% qemu-system-sparc64 [.] cpu_sparc_exec
6,35% qemu-system-sparc64 [.] tb_flush_jmp_cache
4,75% qemu-system-sparc64 [.] get_physical_address_data
4,45% qemu-system-sparc64 [.] tcg_liveness_analysis
4,35% qemu-system-sparc64 [.] tcg_reg_alloc_op
2,90% qemu-system-sparc64 [.] tlb_flush_page
2,35% qemu-system-sparc64 [.] disas_sparc_insn
2,28% qemu-system-sparc64 [.] get_physical_address_code
2,21% qemu-system-sparc64 [.] tlb_flush
1,64% qemu-system-sparc64 [.] tcg_out_opc
1,22% qemu-system-sparc64 [.] tcg_out_modrm_sib_offset.constprop.41
1,20% qemu-system-sparc64 [.] helper_ld_asi
1,14% qemu-system-sparc64 [.] gen_intermediate_code_pc
1,04% qemu-system-sparc64 [.] helper_st_asi
1,00% qemu-system-sparc64 [.] object_class_dynamic_cast
0,98% qemu-system-sparc64 [.] tb_find_pc
0,94% qemu-system-sparc64 [.] get_page_addr_code
0,92% qemu-system-sparc64 [.] tcg_gen_code_search_pc
0,91% qemu-system-sparc64 [.] tlb_set_page
0,83% qemu-system-sparc64 [.] reset_temp
0,82% qemu-system-sparc64 [.] tcg_reg_alloc_start

The perf-.map entries correspond to execution of the translated code. As you can see, that share is a lot lower on sparc, while a lot of smaller code-generation/MMU functions appear. It seems that the optimizations have to be focused on the system part, not the TCG part, at least for now. A quick look at the MMU seems to show a performance issue here, due to the split code/data MMU on SPARC64, while the QEMU TLB is a joint one. As a consequence one can see a lot of ping-pong: setting a given page to read or read/write, then execution, and later read or read/write again. My guess is that it's related to constant tables being in the same page as the code. It should also be noted that tcg_optimize starts to take non-negligible time, in both cases.
The code has grown quite a lot recently, and it might be time to rework it. It's nice to have optimized code, but not if the gain is lower than the optimization time. Is it possible to disable some optimisations, or the whole optimisation pass completely? I see no command-line switches for that.
Re: [Qemu-devel] [RFC PATCH v5 3/3] Force auto-convergence of live migration
On Thu, 9 May 2013 12:43:20 -0700 Chegu Vinod chegu_vi...@hp.com wrote: If a user chooses to turn on the auto-converge migration capability these changes detect the lack of convergence and throttle down the guest. i.e. force the VCPUs out of the guest for some duration and let the migration thread catch up and help converge. [...] + +static void mig_delay_vcpu(void) +{ +qemu_mutex_unlock_iothread(); +g_usleep(50*1000); +qemu_mutex_lock_iothread(); +} + +/* Stub used for getting the vcpu out of VM and into qemu via + run_on_cpu() */ +static void mig_kick_cpu(void *opq) +{ +mig_delay_vcpu(); +return; +} + +/* To reduce the dirty rate explicitly disallow the VCPUs from spending + much time in the VM. The migration thread will try to catch up. + Workload will experience a performance drop. +*/ +void migration_throttle_down(void) +{ +if (throttling_needed()) { +CPUArchState *penv = first_cpu; +while (penv) { +qemu_mutex_lock_iothread(); Locking it here and then unlocking it inside of the queued work doesn't look nice. What exactly are you protecting with this lock? +async_run_on_cpu(ENV_GET_CPU(penv), mig_kick_cpu, NULL); +qemu_mutex_unlock_iothread(); +penv = penv->next_cpu; +} +} +} -- Regards, Igor
Re: [Qemu-devel] Profiling sparc64 emulation
On 05/09/2013 01:11 PM, Artyom Tarasenko wrote: Is it possible to disable some optimisations, or the whole optimisation completely? I see no command line switches for that. No command-line switches. See the top lines of tcg/tcg.c for compile-time disabling. r~
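For readers hitting this thread later: the compile-time switches Richard refers to sat near the top of tcg/tcg.c in the QEMU of that era. The exact names below are from memory and should be checked against your tree; commenting one out and rebuilding disables the corresponding pass:

```c
/* Near the top of tcg/tcg.c (QEMU ~1.5); comment out to disable. */
#define USE_LIVENESS_ANALYSIS    /* guards tcg_liveness_analysis() */
#define USE_TCG_OPTIMIZATIONS    /* guards the tcg_optimize() pass  */
```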
Re: [Qemu-devel] [PATCH v6 00/11] rdma: migration support
On 5/9/2013 10:20 AM, Michael R. Hines wrote: Comments inline. FYI: please CC mrhi...@us.ibm.com, because it helps me know when to scroll through the bazillion qemu-devel emails. I have things separated out into folders and rules, but a direct CC is better =) Sure, will do. On 05/03/2013 07:28 PM, Chegu Vinod wrote: Hi Michael, I picked up the qemu bits from your github branch and gave it a try. (BTW, the setup I was given temporary access to has a pair of MLX IB QDR cards connected back to back via QSFP cables.) Observed a couple of things and wanted to share... perhaps you may be aware of them already or perhaps these are unrelated to your specific changes? (Note: still haven't finished the review of your changes.) a) x-rdma-pin-all off case Seems to only work sometimes but fails at other times. Here is an example... (qemu) rdma: Accepting rdma connection... rdma: Memory pin all: disabled rdma: verbs context after listen: 0x56757d50 rdma: dest_connect Source GID: fe80::2:c903:9:53a5, Dest GID: fe80::2:c903:9:5855 rdma: Accepted migration qemu-system-x86_64: VQ 1 size 0x100 Guest index 0x4d2 inconsistent with Host index 0x4ec: delta 0xffe6 qemu: warning: error while loading state for instance 0x0 of device 'virtio-net' load of migration failed Can you give me more details about the configuration of your VM? The guest is a 10-VCPU/128GB one... and nothing really that fancy with respect to storage or networking. Hosted on a large Westmere-EX box (the target is a similarly configured Westmere-EX system). There is a shared SAN disk between the two hosts. Both hosts have the 3.9-rc7 kernel that I got at that time from the kvm.git tree. The guest was also running the same kernel. Since I was just trying it out, I was not running any workload either.
On the source host the qemu command line: /usr/local/bin/qemu-system-x86_64 \ -enable-kvm \ -cpu host \ -name vm1 \ -m 131072 -smp 10,sockets=1,cores=10,threads=1 \ -mem-path /dev/hugepages \ -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm1.monitor,server,nowait \ -drive file=/dev/libvirt_lvm3/vm1,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native \ -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \ -monitor stdio \ -net nic,model=virtio,macaddr=52:54:00:71:01:01,netdev=nic-0 \ -netdev tap,id=nic-0,ifname=tap0,script=no,downscript=no,vhost=on \ -vnc :4 On the destination host the command line was the same as the above with the following additional arg... -incoming x-rdma:<static private ipaddr of the IB>:<port #> b) x-rdma-pin-all on case: The guest is not resuming on the target host, i.e. the source host's qemu states that migration is complete but the guest is not responsive anymore... (doesn't seem to have crashed but it's stuck somewhere). Have you seen this behavior before? Any tips on how I could extract additional info? Is the QEMU monitor still responsive? They were responsive. Can you capture a screenshot of the guest's console to see if there is a panic? No panic on the guest's console :( What kind of storage is attached to the VM? Simple virtio disk hosted on a SAN disk (see the qemu command line). Besides the list of noted restrictions/issues around having to pin all of guest memory, if the pinning is done as part of starting the migration it ends up taking a noticeably long time for larger guests. Wonder whether that should be counted as part of the total migration time? That's a good question: the pin-all option should not be slowing down your VM too much, as the VM should still be running before the migration_thread() actually kicks in and starts the migration. Well, I had hoped that it would not have any serious impacts, but it ended up freezing the guest...
I need more information on the configuration of your VM, guest operating system, architecture, and so forth... Pl. see above. And similarly as before, whether or not QEMU is unresponsive or whether it's the guest that panicked... The guest just freezes... it doesn't panic when this pinning is in progress (i.e. after I set the capability and start the migration). After the pinning completes, the guest continues to run and the migration continues... till it completes (as per the source host's qemu)... but I never see it resume on the target host. Also, the act of pinning all the memory seems to freeze the guest. e.g.: for larger enterprise-sized guests (say 128GB and higher) the guest is frozen anywhere from nearly a minute (~50 seconds) to multiple minutes as the guest size increases... which imo kind of defeats the purpose of live guest migration. That's bad =) There must be a bug somewhere. The largest VM I can create on my hardware is ~16GB, so let me give that a try and try to track down the problem. Ok. Perhaps running a simple test inside the guest can help observe any scheduling
Re: [Qemu-devel] [RFC PATCH v5 3/3] Force auto-convergence of live migration
On 5/9/2013 1:05 PM, Igor Mammedov wrote: On Thu, 9 May 2013 12:43:20 -0700 Chegu Vinod chegu_vi...@hp.com wrote: If a user chooses to turn on the auto-converge migration capability these changes detect the lack of convergence and throttle down the guest. i.e. force the VCPUs out of the guest for some duration and let the migration thread catch up and help converge. [...] +void migration_throttle_down(void) +{ +if (throttling_needed()) { +CPUArchState *penv = first_cpu; +while (penv) { +qemu_mutex_lock_iothread(); +async_run_on_cpu(ENV_GET_CPU(penv), mig_kick_cpu, NULL); +qemu_mutex_unlock_iothread(); +penv = penv->next_cpu; could you replace the open-coded loop with qemu_for_each_cpu()? Yes, will try to replace it in the next version. Vinod +} +} +}
Re: [Qemu-devel] [PATCH v6 00/11] rdma: migration support
Some more followup questions below to help me debug before I start digging in... On 05/09/2013 06:20 PM, Chegu Vinod wrote: Setting aside the mlock() freezes for the moment, let's first fix your crashing problem on the destination-side. Let's make that a priority before we fix the mlock problem. When the migration completes, can you provide me with more detailed information about the state of QEMU on the destination? Is it responding? What's on the VNC console? Is QEMU responding? Is the network responding? Was the VM idle? Or running an application? Can you attach GDB to QEMU after the migration? /usr/local/bin/qemu-system-x86_64 \ -enable-kvm \ -cpu host \ -name vm1 \ -m 131072 -smp 10,sockets=1,cores=10,threads=1 \ -mem-path /dev/hugepages \ Can you disable hugepages and re-test? I'll get back to the other mlock() issues later after we at least first make sure the migration itself is working.
Re: [Qemu-devel] [RFC PATCH v5 3/3] Force auto-convergence of live migration
On 5/9/2013 1:24 PM, Igor Mammedov wrote: On Thu, 9 May 2013 12:43:20 -0700 Chegu Vinod chegu_vi...@hp.com wrote: If a user chooses to turn on the auto-converge migration capability these changes detect the lack of convergence and throttle down the guest. i.e. force the VCPUs out of the guest for some duration and let the migration thread catch up and help converge. [...] + +static void mig_delay_vcpu(void) +{ +qemu_mutex_unlock_iothread(); +g_usleep(50*1000); +qemu_mutex_lock_iothread(); +} + +/* Stub used for getting the vcpu out of VM and into qemu via + run_on_cpu() */ +static void mig_kick_cpu(void *opq) +{ +mig_delay_vcpu(); +return; +} + +/* To reduce the dirty rate explicitly disallow the VCPUs from spending + much time in the VM. The migration thread will try to catch up. + Workload will experience a performance drop. +*/ +void migration_throttle_down(void) +{ +if (throttling_needed()) { +CPUArchState *penv = first_cpu; +while (penv) { +qemu_mutex_lock_iothread(); Locking it here and then unlocking it inside of the queued work doesn't look nice. Yes... but see below. What exactly are you protecting with this lock? It was my understanding that the BQL is supposed to be held when the vcpu threads start entering and executing in the qemu context (as qemu is not MP safe). Still true? In this specific use case I was concerned about the fraction of the time when a given vcpu thread is in the qemu context but not executing the callback routine... and was hence holding the BQL. Holding the BQL and g_usleep'ing is not only bad but would slow down the migration thread... hence the "doesn't look nice" stuff :( For this specific use case, if it's not really required to even bother with the BQL then pl. do let me know. Also pl. refer to version 3 of my patch... I was doing a g_usleep() in kvm_cpu_exec() and was not messing much with the BQL, but that was deemed not a good thing either.
Thanks, Vinod +async_run_on_cpu(ENV_GET_CPU(penv), mig_kick_cpu, NULL); +qemu_mutex_unlock_iothread(); +penv = penv->next_cpu; +} +} +}
Re: [Qemu-devel] [RFC PATCH 0/8] MemoryRegion and FlatView refcounting, replace hostmem with memory_region_find
On Thu, May 9, 2013 at 10:50 PM, Paolo Bonzini pbonz...@redhat.com wrote: On 09/05/2013 02:53, liu ping fan wrote: On Wed, May 8, 2013 at 11:44 PM, Paolo Bonzini pbonz...@redhat.com wrote: On 08/05/2013 08:20, liu ping fan wrote: On Mon, May 6, 2013 at 10:25 PM, Paolo Bonzini pbonz...@redhat.com wrote: Hi, this is an alternative approach to refactoring of dataplane's HostMem code. Here, I take Ping Fan's idea of RCU-style updating of the region list and apply it to the AddressSpace's FlatView. With this In fact, I am worrying about the priority of MemoryListener; if it is true, then we should drop the RCU-style idea. You mean in hostmem, or in general as in this patch? Note that this patch releases the old FlatView at the end of all MemoryListener operations. Both in hostmem and this patch, they all broke the original design of the MemoryListener, see notes for priority in code. I think both hostmem and this patch are fine. The hypervisor is never involved, all accesses go through the old FlatView and regions cannot disappear thanks to ref/unref. Here, we worry about add, not del. In fact, we need _more_ RCU-style updates, not less. For BQL-less dispatch, address space mapping/translation can race against the MemoryListeners in exec.c. To fix this, phys_sections and AddressSpaceDispatch need to be reference counted and RCU-ified as well. Agree, I like RCU too. Paolo I have sent out 2 patches to highlight this issue, and have CC'd you and Stefanha. Regards, Pingfan Paolo Also if it is true, there is already a bug with the hostmem listener. It should use region_del, not region_nop, to reconstruct the local view. But just let me give it some deeper thought. Regards, Pingfan change, dataplane can simply use memory_region_find instead of hostmem. This is a somewhat larger change, but I prefer it for two reasons. 1) it splits the task of adding BQL-less memory dispatch in two parts, tackling memory_region_find first (which is simpler because locking is left to the caller).
2) HostMem duplicates a lot of the FlatView logic, and adding the RCU-style update in FlatView benefits everyone. The missing ingredients here are: 1) remember and unreference the MemoryRegions that are used in a vring entry. In order to implement this, it is probably simpler to change vring.c to use virtio.c's VirtQueueElement data structure. We want something like that anyway in order to support migration. 2) add an owner field to MemoryRegion, and set it for all MemoryRegions for hot-unpluggable devices. In this series, ref/unref are stubs. For simplicity I based the patches on my IOMMU rebase. I placed the tree at git://github.com/bonzini/qemu.git, branch iommu. Paolo Paolo Bonzini (8): memory: add ref/unref calls exec: check MRU in qemu_ram_addr_from_host memory: return MemoryRegion from qemu_ram_addr_from_host memory: ref/unref memory across address_space_map/unmap memory: access FlatView from a local variable memory: use a new FlatView pointer on every topology update memory: add reference counting to FlatView dataplane: replace hostmem with memory_region_find exec.c| 63 +--- hw/core/loader.c |1 + hw/display/exynos4210_fimd.c |6 + hw/display/framebuffer.c | 10 +- hw/i386/kvm/ioapic.c |2 + hw/i386/kvmvapic.c|1 + hw/misc/vfio.c|2 + hw/virtio/dataplane/Makefile.objs |2 +- hw/virtio/dataplane/hostmem.c | 176 - hw/virtio/dataplane/vring.c | 56 +-- hw/virtio/vhost.c |2 + hw/virtio/virtio-balloon.c|1 + hw/xen/xen_pt.c |4 + include/exec/cpu-common.h |2 +- include/exec/memory.h |9 ++ include/hw/virtio/dataplane/hostmem.h | 57 --- include/hw/virtio/dataplane/vring.h |3 +- kvm-all.c |2 + memory.c | 142 +- target-arm/kvm.c |2 + target-i386/kvm.c |4 +- target-sparc/mmu_helper.c |1 + xen-all.c |2 + 23 files changed, 253 insertions(+), 297 deletions(-) delete mode 100644 hw/virtio/dataplane/hostmem.c delete mode 100644 include/hw/virtio/dataplane/hostmem.h