Re: [Qemu-devel] [RFC] How to deal with the conflict between DE keymap and qemu console ?
Hi, Works perfectly fine for me. What vnc client is this? I tested two kinds of vnc clients: 1) VNC Viewer Free Edition 4.1.2 2) TightVNC Viewer for Windows Version 2.6.4 Ah, windows vnc clients. I've tested with linux. I'll have a look. Do you have download links for me? thanks, Gerd
Re: [Qemu-devel] [RFC] How to deal with the conflict between DE keymap and qemu console ?
-Original Message- From: Gerd Hoffmann [mailto:kra...@redhat.com] Sent: Thursday, May 22, 2014 2:00 PM To: Gonglei (Arei) Cc: qemu-devel@nongnu.org Subject: Re: [RFC] How to deal with the conflict between DE keymap and qemu console ? Hi, Works perfectly fine for me. What vnc client is this? I tested two kinds of vnc clients: 1) VNC Viewer Free Edition 4.1.2 2) TightVNC Viewer for Windows Version 2.6.4 Ah, windows vnc clients. I've tested with linux. I'll have a look. Do you have download links for me? Thanks, Sure. 1)VNC Viewer: www.realvnc.com 2) TightVNC: http://www.tightvnc.com/?f=va Best regards, -Gonglei
Re: [Qemu-devel] [PATCH] inet_listen_opts: add error checking
-Original Message- From: Gerd Hoffmann [mailto:kra...@redhat.com] Sent: Thursday, May 22, 2014 1:44 PM To: Gonglei (Arei) Cc: qemu-devel@nongnu.org Subject: Re: [Qemu-devel] [PATCH] inet_listen_opts: add error checking /* lookup */ -if (port_offset) -snprintf(port, sizeof(port), %d, atoi(port) + port_offset); +if (port_offset) { +unsigned long long baseport; unsigned long long is not necessary, unsigned long is enough. +if (parse_uint_full(port, baseport, 10) 0) { parse_uint_full prototype uses unsigned long long. And on 32bit machines it actually is a difference (long long is 64bit integer whereas long is 32bit only). Okay, Thanks. Best regards, -Gonglei
Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD
-Original Message- From: Gerd Hoffmann [mailto:kra...@redhat.com] Sent: Thursday, May 22, 2014 1:40 PM To: Chen, Tiejun Cc: Anthony PERARD; Daniel P. Berrange; peter.mayd...@linaro.org; xen-de...@lists.xensource.com; m...@redhat.com; stefano.stabell...@eu.citrix.com; Kay, Allen M; kelly.zyta...@amd.com; qemu-devel@nongnu.org; Zhang, Yang Z; anth...@codemonkey.ws Subject: Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD Hi, According to our discussions, I realize we may have some plans or policies dedicated to how to assign devfn, but to support GFX passthrough for XEN, I think currently it may be a better solution to adopt #1 simply like this: diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index eaf3e61..500b3c2 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -386,7 +386,7 @@ static void pc_xen_hvm_init(QEMUMachineInitArgs *args) bus = pci_find_primary_bus(); if (bus != NULL) { -pci_create_simple(bus, -1, xen-platform); +pci_create_simple(bus, PCI_DEVFN(3,0), xen-platform); } } #endif Then we can go out to plan how to assign devfn in common, is this fine? Somewhere else in the thread someone listed a libxl config file snippet which is supposed to disable the xen platform device. Which doesn't work with upstream qemu for some reason. Searching ... ah, here: http://lists.gnu.org/archive/html/qemu-devel/2014-05/msg03996.html So, how about making xen_platform_pci=0 working correctly ? Yes, this is exactly what I did as Anthony told me to add these two lines: xen_platform_pci=0 device_model_args_hvm = [ '-device', 'xen-platform,addr=0x3' ] But also as I already reply to Anthony yesterday, this doesn't work http://lists.gnu.org/archive/html/qemu-devel/2014-05/msg04101.html I think this only work with some legacy xen/qemu which don't introduce to support ACPI pcihp. Another useful thing would be to not create the xen platform device in case -nodefaults was specified on the command line (that switch turns off a bunch of other devices present by default: vga, nic, cdrom, ...). Currently looks 'xen-platform' itself can't be created, not those devices existed on that. Thanks Tiejun
Re: [Qemu-devel] [PATCH 00/25] qemu gtk ui overhaul
Hi, I noticed that some issues with the old code still exist: when Show tabs is enabled and disabled again, QEMU adds some black scan lines at the top of the VGA screen. This can be easily reproduced. Easily fixed, was just a missing gd_update_windowsize call. Another small problem is Zoom To Fit which sometimes changes the display even if it was optimal before (more difficult to reproduce). Changes in what way? Fullscreen shows some new issues with your code (tested on a 1024x600 Linux netbook). With Alt+Ctrl+F in non detached compatmonitor0 I get a display which continuously switches from fullscreen to non fullscreen. Hmm, doesn't reproduce here. In detached mode, a fullscreen VGA display shows some white scan lines at the top. I suspect this is the menu bar. Current code tries to hide it by forcing the widget size to 0x0. In gtk2 this results in a small white line at the top of the screen, probably because the minimum widget size in gtk is 1x1. In gtk3 this doesn't work at all and the menu bar remains visible. This isn't a new issue btw. I've tried to fix that with a simple gtk_widget_hide(menu_bar) call. Which has the side effect to also turn off all menu accelerators. Which is a non-starter, especially for Ctrl-Alt-F not working any more. If someone has an idea how to tackle this, I'm all ears. It would be nice to have scroll bars when necessary. On my small display it is no longer possible to see the whole compatmonitor0 contents (it's larger than my display, zooming or resizing the window does not work and there are no scroll bars). Try setting the VC_TERM_{X,Y}_MIN defines at the top of gtk.c to something smaller. Then you should be able to make the vte tabs small enough that they fit onto the screen. Making the terminal font configurable is probably a good idea too. cheers, Gerd
[Qemu-devel] [PATCH 0/2] dataplane: Enable scsi=on
This makes the SG_IO code of non-dataplane available to dataplane, so that dataplane can use to allow scsi=on. Fam Fam Zheng (2): virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi dataplane: Support VIRTIO_BLK_T_SCSI_CMD hw/block/dataplane/virtio-blk.c | 19 +- hw/block/virtio-blk.c | 77 + include/hw/virtio/virtio-blk.h | 3 ++ 3 files changed, 54 insertions(+), 45 deletions(-) -- 1.9.2
[Qemu-devel] [PATCH 1/2] virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi
The common logic to process a scsi request in a VirtQueueElement is extracted to a function to share with dataplane. Signed-off-by: Fam Zheng f...@redhat.com --- hw/block/virtio-blk.c | 77 ++ include/hw/virtio/virtio-blk.h | 3 ++ 2 files changed, 43 insertions(+), 37 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 5e9433d..11ffa21 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -125,13 +125,15 @@ static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s) return req; } -static void virtio_blk_handle_scsi(VirtIOBlockReq *req) +int virtio_blk_handle_scsi_req(BlockDriverState *bs, + VirtQueueElement *elem) { +int status = VIRTIO_BLK_S_OK; +struct virtio_scsi_inhdr *scsi = NULL; #ifdef __linux__ -int ret; int i; +struct sg_io_hdr hdr; #endif -int status = VIRTIO_BLK_S_OK; /* * We require at least one output segment each for the virtio_blk_outhdr @@ -140,63 +142,56 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) * We also at least require the virtio_blk_inhdr, the virtio_scsi_inhdr * and the sense buffer pointer in the input segments. */ -if (req-elem.out_num 2 || req-elem.in_num 3) { -virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR); -g_free(req); -return; +if (elem-out_num 2 || elem-in_num 3) { +status = VIRTIO_BLK_S_IOERR; +goto fail; } /* * The scsi inhdr is placed in the second-to-last input segment, just * before the regular inhdr. */ -req-scsi = (void *)req-elem.in_sg[req-elem.in_num - 2].iov_base; - -if (!req-dev-blk.scsi) { -status = VIRTIO_BLK_S_UNSUPP; -goto fail; -} +scsi = (void *)elem-in_sg[elem-in_num - 2].iov_base; /* * No support for bidirection commands yet. */ -if (req-elem.out_num 2 req-elem.in_num 3) { +if (elem-out_num 2 elem-in_num 3) { status = VIRTIO_BLK_S_UNSUPP; goto fail; } #ifdef __linux__ -struct sg_io_hdr hdr; memset(hdr, 0, sizeof(struct sg_io_hdr)); hdr.interface_id = 'S'; -hdr.cmd_len = req-elem.out_sg[1].iov_len; -hdr.cmdp = req-elem.out_sg[1].iov_base; +hdr.cmd_len = elem-out_sg[1].iov_len; +hdr.cmdp = elem-out_sg[1].iov_base; hdr.dxfer_len = 0; -if (req-elem.out_num 2) { +if (elem-out_num 2) { /* * If there are more than the minimally required 2 output segments * there is write payload starting from the third iovec. */ hdr.dxfer_direction = SG_DXFER_TO_DEV; -hdr.iovec_count = req-elem.out_num - 2; +hdr.iovec_count = elem-out_num - 2; for (i = 0; i hdr.iovec_count; i++) -hdr.dxfer_len += req-elem.out_sg[i + 2].iov_len; +hdr.dxfer_len += elem-out_sg[i + 2].iov_len; -hdr.dxferp = req-elem.out_sg + 2; +hdr.dxferp = elem-out_sg + 2; -} else if (req-elem.in_num 3) { +} else if (elem-in_num 3) { /* * If we have more than 3 input segments the guest wants to actually * read data. */ hdr.dxfer_direction = SG_DXFER_FROM_DEV; -hdr.iovec_count = req-elem.in_num - 3; +hdr.iovec_count = elem-in_num - 3; for (i = 0; i hdr.iovec_count; i++) -hdr.dxfer_len += req-elem.in_sg[i].iov_len; +hdr.dxfer_len += elem-in_sg[i].iov_len; -hdr.dxferp = req-elem.in_sg; +hdr.dxferp = elem-in_sg; } else { /* * Some SCSI commands don't actually transfer any data. @@ -204,11 +199,11 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) hdr.dxfer_direction = SG_DXFER_NONE; } -hdr.sbp = req-elem.in_sg[req-elem.in_num - 3].iov_base; -hdr.mx_sb_len = req-elem.in_sg[req-elem.in_num - 3].iov_len; +hdr.sbp = elem-in_sg[elem-in_num - 3].iov_base; +hdr.mx_sb_len = elem-in_sg[elem-in_num - 3].iov_len; -ret = bdrv_ioctl(req-dev-bs, SG_IO, hdr); -if (ret) { +status = bdrv_ioctl(bs, SG_IO, hdr); +if (status) { status = VIRTIO_BLK_S_UNSUPP; goto fail; } @@ -224,23 +219,31 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) hdr.status = CHECK_CONDITION; } -stl_p(req-scsi-errors, +stl_p(scsi-errors, hdr.status | (hdr.msg_status 8) | (hdr.host_status 16) | (hdr.driver_status 24)); -stl_p(req-scsi-residual, hdr.resid); -stl_p(req-scsi-sense_len, hdr.sb_len_wr); -stl_p(req-scsi-data_len, hdr.dxfer_len); +stl_p(scsi-residual, hdr.resid); +stl_p(scsi-sense_len, hdr.sb_len_wr); +stl_p(scsi-data_len, hdr.dxfer_len); -virtio_blk_req_complete(req, status); -g_free(req); -return; +return status; #else abort(); #endif fail: /* Just put anything nonzero so
[Qemu-devel] [PATCH 2/2] dataplane: Support VIRTIO_BLK_T_SCSI_CMD
Signed-off-by: Fam Zheng f...@redhat.com --- hw/block/dataplane/virtio-blk.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 46a6824..6a66196 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -201,6 +201,16 @@ static void do_flush_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem, bdrv_aio_flush(s-blk-conf.bs, complete_flush, req); } +static void do_scsi_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem, + QEMUIOVector *inhdr) +{ +int status; + +status = virtio_blk_handle_scsi_req(s-blk-conf.bs, elem); +complete_request_early(s, elem, inhdr, status); + +} + static int process_request(VirtIOBlockDataPlane *s, VirtQueueElement *elem) { struct iovec *iov = elem-out_sg; @@ -249,8 +259,7 @@ static int process_request(VirtIOBlockDataPlane *s, VirtQueueElement *elem) return 0; case VIRTIO_BLK_T_SCSI_CMD: -/* TODO support SCSI commands */ -complete_request_early(s, elem, inhdr, VIRTIO_BLK_S_UNSUPP); +do_scsi_cmd(s, elem, inhdr); return 0; case VIRTIO_BLK_T_FLUSH: @@ -326,12 +335,6 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk, return; } -if (blk-scsi) { -error_setg(errp, - device is incompatible with x-data-plane, use scsi=off); -return; -} - /* If dataplane is (re-)enabled while the guest is running there could be * block jobs that can conflict. */ -- 1.9.2
Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD
Hi, Another useful thing would be to not create the xen platform device in case -nodefaults was specified on the command line (that switch turns off a bunch of other devices present by default: vga, nic, cdrom, ...). Currently looks 'xen-platform' itself can't be created, not those devices existed on that. The error message looks more like libxl tries to hot-unplug the xen platform device. Attached patch (untested!) hooks up the xen platform device to the default device code we have in qemu. Two effects: (1) As mentioned above the device will not be created in case -nodefaults is specified on the command line. (2) Autocreating the device is also turned off in case xen-platform is added manually via -device. With the patch applied you should be able to move the xen-platform device to some other place with a simple 'qemu -device xen-platform,addr=$slot'. cheers, Gerd From 405349bf38fc05653bfb8c971c61796e981eee98 Mon Sep 17 00:00:00 2001 From: Gerd Hoffmann kra...@redhat.com Date: Thu, 22 May 2014 08:34:28 +0200 Subject: [PATCH] xen: make xen-platform a default device --- hw/i386/pc_piix.c| 2 +- include/hw/xen/xen.h | 1 + vl.c | 3 +++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index eaf3e61..f987d03 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -385,7 +385,7 @@ static void pc_xen_hvm_init(QEMUMachineInitArgs *args) pc_init_pci(args); bus = pci_find_primary_bus(); -if (bus != NULL) { +if (bus != NULL default_xenplatform) { pci_create_simple(bus, -1, xen-platform); } } diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h index 85fda3d..b350413 100644 --- a/include/hw/xen/xen.h +++ b/include/hw/xen/xen.h @@ -20,6 +20,7 @@ enum xen_mode { extern uint32_t xen_domid; extern enum xen_mode xen_mode; +extern int default_xenplatform; extern bool xen_allowed; diff --git a/vl.c b/vl.c index 709d8cd..673148e 100644 --- a/vl.c +++ b/vl.c @@ -226,6 +226,7 @@ static int default_floppy = 1; static int default_cdrom = 1; static int default_sdcard = 1; static int default_vga = 1; +int default_xenplatform = 1; static struct { const char *driver; @@ -247,6 +248,7 @@ static struct { { .driver = isa-cirrus-vga, .flag = default_vga }, { .driver = vmware-svga, .flag = default_vga }, { .driver = qxl-vga, .flag = default_vga }, +{ .driver = xen-platform, .flag = default_xenplatform }, }; static QemuOptsList qemu_rtc_opts = { @@ -4101,6 +4103,7 @@ int main(int argc, char **argv, char **envp) default_monitor = 0; default_net = 0; default_vga = 0; +default_xenplatform = 0; } if (is_daemonized()) { -- 1.8.3.1
Re: [Qemu-devel] [PATCH v2 1/7] iotests: Allow out-of-tree run
On Tue, 05/20 22:23, Max Reitz wrote: As out-of-tree builds are preferred for qemu, running the qemu-iotests in that out-of-tree build should be supported as well. To do so, a symbolic link has to be created pointing to the check script in the source directory. That script will check whether it has been run through a symlink, and if so, will assume it is run in the build tree. All output and temporary operations performed by iotests are then redirected here and, unless specified otherwise by the user, QEMU_PROG etc. will be set to paths appropriate for the build tree. Also, drop making every test case executable if it is not yet, as this would modify the source tree which is not desired for out-of-tree runs and should be fixed in the repository anyway. Signed-off-by: Max Reitz mre...@redhat.com --- tests/qemu-iotests/check | 95 +--- tests/qemu-iotests/common| 8 ++-- tests/qemu-iotests/common.config | 2 +- tests/qemu-iotests/common.rc | 8 ++-- tests/qemu-iotests/iotests.py| 3 +- 5 files changed, 91 insertions(+), 25 deletions(-) diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check index e2ed5a9..aa30ce5 100755 --- a/tests/qemu-iotests/check +++ b/tests/qemu-iotests/check @@ -34,22 +34,86 @@ timestamp=${TIMESTAMP:=false} # generic initialization iam=check +_init_error() +{ +echo $iam: $1 2 +exit 1 +} + +if [ -L $0 ] +then +# called from the build tree +source_iotests=$(dirname $(readlink $0)) +if [ -z $source_iotests ] +then +_init_error failed to obtain source tree name from check symlink +fi +source_iotests=$(cd $source_iotests; pwd) || _init_error failed to enter source tree +build_iotests=$PWD +build_root=$build_iotests/../.. + +if [ -x $build_iotests/socket_scm_helper ] +then +export SOCKET_SCM_HELPER=$build_iotests/socket_scm_helper +fi +else +# called from the source tree +source_iotests=$PWD +fi + +if [ -n $build_root ] +then +if [ -z $QEMU_PROG ] +then I think it would be a good idea to uniform the two cases: if running from source tree, simply let build_root=$source_iotests/../.. So we don't need the condition here, and QEMU_*_PROG is automatically set for both cases, instead of only for out-of-tree. Fam +arch=$(uname -m 2 /dev/null) + +if [[ -n $arch -x $build_root/$arch-softmmu/qemu-system-$arch ]] +then +export QEMU_PROG=$build_root/$arch-softmmu/qemu-system-$arch +else +pushd -q $build_root +for binary in *-softmmu/qemu-system-* +do +if [ -x $binary ] +then +export QEMU_PROG=$build_root/$binary +break +fi +done +popd -q +fi +fi + +if [[ -z $QEMU_IMG_PROG -x $build_root/qemu-img ]] +then +export QEMU_IMG_PROG=$build_root/qemu-img +fi + +if [[ -z $QEMU_IO_PROG -x $build_root/qemu-io ]] +then +export QEMU_IO_PROG=$build_root/qemu-io +fi + +if [[ -z $QEMU_NBD_PROG -x $build_root/qemu-nbd ]] +then +export QEMU_NBD_PROG=$build_root/qemu-nbd +fi +fi + # we need common.config -if ! . ./common.config +if ! . $source_iotests/common.config then -echo $iam: failed to source common.config -exit 1 +_init_error failed to source common.config fi # we need common.rc -if ! . ./common.rc +if ! . $source_iotests/common.rc then -echo check: failed to source common.rc -exit 1 +_init_error failed to source common.rc fi # we need common -. ./common +. $source_iotests/common #if [ `id -u` -ne 0 ] #then @@ -194,7 +258,7 @@ do echo - expunged rm -f $seq.out.bad echo /^$seq\$/d $tmp.expunged -elif [ ! -f $seq ] +elif [ ! -f $source_iotests/$seq ] then echo - no such test? echo /^$seq\$/d $tmp.expunged @@ -215,9 +279,10 @@ do start=`_wallclock` $timestamp echo -n [`date +%T`] -[ ! -x $seq ] chmod u+x $seq # ensure we can run it +export OUTPUT_DIR=$PWD +(cd $source_iotests; MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(($RANDOM % 255 + 1))} \ -./$seq $tmp.out 21 +./$seq $tmp.out 21) sts=$? $timestamp _timestamp stop=`_wallclock` @@ -242,17 +307,17 @@ do err=true fi -reference=$seq.out +reference=$source_iotests/$seq.out if [ $CACHEMODE = none ]; then -[ -f $seq.out.nocache ] reference=$seq.out.nocache +[ -f $source_iotests/$seq.out.nocache ] reference=$source_iotests/$seq.out.nocache fi -if [ !
Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD
On Thu, May 22, 2014 at 08:44:41AM +0200, Gerd Hoffmann wrote: Hi, Another useful thing would be to not create the xen platform device in case -nodefaults was specified on the command line (that switch turns off a bunch of other devices present by default: vga, nic, cdrom, ...). Currently looks 'xen-platform' itself can't be created, not those devices existed on that. The error message looks more like libxl tries to hot-unplug the xen platform device. Attached patch (untested!) hooks up the xen platform device to the default device code we have in qemu. Two effects: (1) As mentioned above the device will not be created in case -nodefaults is specified on the command line. (2) Autocreating the device is also turned off in case xen-platform is added manually via -device. With the patch applied you should be able to move the xen-platform device to some other place with a simple 'qemu -device xen-platform,addr=$slot'. cheers, Gerd From 405349bf38fc05653bfb8c971c61796e981eee98 Mon Sep 17 00:00:00 2001 From: Gerd Hoffmann kra...@redhat.com Date: Thu, 22 May 2014 08:34:28 +0200 Subject: [PATCH] xen: make xen-platform a default device --- Looks good to me, though Sob is missing. Can some Xen folks pls ack this one? I'll queue it up then. hw/i386/pc_piix.c| 2 +- include/hw/xen/xen.h | 1 + vl.c | 3 +++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index eaf3e61..f987d03 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -385,7 +385,7 @@ static void pc_xen_hvm_init(QEMUMachineInitArgs *args) pc_init_pci(args); bus = pci_find_primary_bus(); -if (bus != NULL) { +if (bus != NULL default_xenplatform) { pci_create_simple(bus, -1, xen-platform); } } diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h index 85fda3d..b350413 100644 --- a/include/hw/xen/xen.h +++ b/include/hw/xen/xen.h @@ -20,6 +20,7 @@ enum xen_mode { extern uint32_t xen_domid; extern enum xen_mode xen_mode; +extern int default_xenplatform; extern bool xen_allowed; diff --git a/vl.c b/vl.c index 709d8cd..673148e 100644 --- a/vl.c +++ b/vl.c @@ -226,6 +226,7 @@ static int default_floppy = 1; static int default_cdrom = 1; static int default_sdcard = 1; static int default_vga = 1; +int default_xenplatform = 1; static struct { const char *driver; @@ -247,6 +248,7 @@ static struct { { .driver = isa-cirrus-vga, .flag = default_vga }, { .driver = vmware-svga, .flag = default_vga }, { .driver = qxl-vga, .flag = default_vga }, +{ .driver = xen-platform, .flag = default_xenplatform }, }; static QemuOptsList qemu_rtc_opts = { @@ -4101,6 +4103,7 @@ int main(int argc, char **argv, char **envp) default_monitor = 0; default_net = 0; default_vga = 0; +default_xenplatform = 0; } if (is_daemonized()) { -- 1.8.3.1
Re: [Qemu-devel] [PATCH v2 8/8] spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
On 05/21/2014 10:42 PM, Alexey Kardashevskiy wrote: On 05/21/2014 08:35 PM, Alexander Graf wrote: On 21.05.14 12:13, Alexey Kardashevskiy wrote: On 05/21/2014 07:50 PM, Alexander Graf wrote: On 21.05.14 11:33, Alexey Kardashevskiy wrote: On 05/21/2014 07:13 PM, Alexander Graf wrote: On 21.05.14 11:11, Michael S. Tsirkin wrote: On Wed, May 21, 2014 at 11:06:09AM +0200, Alexander Graf wrote: On 21.05.14 10:52, Alexey Kardashevskiy wrote: On 05/21/2014 06:40 PM, Alexander Graf wrote: On 15.05.14 11:59, Alexey Kardashevskiy wrote: Currently SPAPR PHB keeps track of all allocated MSI/MISX interrupt as XICS used to be unable to reuse interrupts which becomes a problem for dynamic MSI reconfiguration which is happening on guest driver reload or PCI hot (un)plug. Another problem is that PHB has a limit of devices supporting MSI/MSIX (SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that. This makes use of new XICS ability to reuse interrupts. This removes cached MSI configuration from SPAPR PHB so the first IRQ number of a device is stored in MSI/MSIX config space so there is no need to store this anywhere else. From now on, SPAPR PHB only keeps flags telling what type of interrupt for which device it has configured in order to return error if (for example) MSIX was enabled and the guest is trying to disable MSI which it has not enabled. This removes a limit for the maximum number of MSIX-enabled devices per PHB, now XICS and PCI bus capacity are the only limitation. This changes migration stream as it fixes vmstate_spapr_pci_msi::name which was wrong since the beginning. This fixed traces to be more informative. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- In reality either MSIX or MSI is enabled, never both. So I could remove msi/msix bitmaps from this patch, would it make sense? Is this a hard requirement? Does a device have to choose between MSIX and MSI or could it theoretically have both enabled? Is this a PCI limitation, a PAPR/XICS limitation or just a limitation of your implementation? My implementation does not have this limitation, I asked if I can simplify code by introducing one :) I cannot see any reason why PCI cannot have both MSI and MSIX enabled but it does not seem to be used by anyone = cannot debug and confirm. PAPR spec assumes that if the guest tries enabling MSIX when MSI is already enabled, this is a change, not enabling both types. But it also says MSI and MSIX vector numbers are not shared. Hm. Yeah, I'm not aware of any limitation on hardware here and I'd rather not impose one. Michael, do you know of any hardware that uses MSI *and* MSI-X at the same time? Alex No, and the PCI spec says: A function is permitted to implement both MSI and MSI-X, but system software is prohibited from enabling both at the same time. If system software enables both at the same time, the result is undefined. Ah, cool. So yes Alexey, feel free to impose it :). Heh. This solves just half of the problem - I still have to keep track of what device got MSI/MSIX configured via that ibm,change-msi interface. I was hoping I can store such flag somewhere in a device PCI config space but MSI/MSIX enable bit is not good as it is not set when those calls are made. And I cannot rely on address/data fields much as the guest can change them (I already use them to store IRQ numbers and btw it is missing checks when I read them back for disposal, I'll fix in next round). Or on enable event I could put IRQ numbers to .data of MSI config space and on disable check if it is not zero, then configuration took place, then I can remove my msi[]/msix[] flag arrays. If the guest did any change to MSI/MSIX config space (it does not on SPAPR except weird selftest cases), I compare .data with what ICS can possibly have and either reject disable or handle it and if it breaks XICS - that's too bad for the stupid guest. Would that be acceptable? Can't you prohibit the guest from writing to the MSI configuration registers itself? Then you don't need to do sanity checks. I could for emulated devices but VFIO uses the same code. For example, there is an IBM SCSI IPR card which does a self test. For that, it saves MSIX BAR content, does reboot via some backdoor interface and restores MSIX BAR. It has been solved for VFIO in the host kernel by restoring MSIX data from cached values when guest is trying to restore it with what it thinks is actual MSIX data (it is virtualized because of x86). But there is cache We already have a cache because we don't access the real PCI registers with msi_set_message(), no? For emulated devices there is no cache. And in any case the guest is allowed to write to it... Who knows what AIX does? I do not. Tried GHashTable for keeping bus:dev.fn - (irq, num), more or less ok but how to migrate such thing? Temporary cache somewhere and then unpack
Re: [Qemu-devel] [PATCH 00/25] qemu gtk ui overhaul
Il 22/05/2014 08:24, Gerd Hoffmann ha scritto: I suspect this is the menu bar. Current code tries to hide it by forcing the widget size to 0x0. In gtk2 this results in a small white line at the top of the screen, probably because the minimum widget size in gtk is 1x1. In gtk3 this doesn't work at all and the menu bar remains visible. This isn't a new issue btw. I've tried to fix that with a simple gtk_widget_hide(menu_bar) call. Which has the side effect to also turn off all menu accelerators. Which is a non-starter, especially for Ctrl-Alt-F not working any more. Subclass the menu bar and override the draw and size_allocate methods when the widget is hidden? Paolo
Re: [Qemu-devel] [PATCH 1/2] virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi
Il 22/05/2014 08:36, Fam Zheng ha scritto: - -if (!req-dev-blk.scsi) { -status = VIRTIO_BLK_S_UNSUPP; -goto fail; -} Where did you move this condition? Also, the scsi field of VirtIOBlockReq is now unused. Paolo
Re: [Qemu-devel] [PATCH 2/2] dataplane: Support VIRTIO_BLK_T_SCSI_CMD
Il 22/05/2014 08:36, Fam Zheng ha scritto: Signed-off-by: Fam Zheng f...@redhat.com --- hw/block/dataplane/virtio-blk.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 46a6824..6a66196 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -201,6 +201,16 @@ static void do_flush_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem, bdrv_aio_flush(s-blk-conf.bs, complete_flush, req); } +static void do_scsi_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem, + QEMUIOVector *inhdr) +{ +int status; + +status = virtio_blk_handle_scsi_req(s-blk-conf.bs, elem); +complete_request_early(s, elem, inhdr, status); + +} + static int process_request(VirtIOBlockDataPlane *s, VirtQueueElement *elem) { struct iovec *iov = elem-out_sg; @@ -249,8 +259,7 @@ static int process_request(VirtIOBlockDataPlane *s, VirtQueueElement *elem) return 0; case VIRTIO_BLK_T_SCSI_CMD: -/* TODO support SCSI commands */ -complete_request_early(s, elem, inhdr, VIRTIO_BLK_S_UNSUPP); +do_scsi_cmd(s, elem, inhdr); return 0; case VIRTIO_BLK_T_FLUSH: @@ -326,12 +335,6 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk, return; } -if (blk-scsi) { -error_setg(errp, - device is incompatible with x-data-plane, use scsi=off); -return; -} - /* If dataplane is (re-)enabled while the guest is running there could be * block jobs that can conflict. */ Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Re: [Qemu-devel] [PATCH 3/7] monitor: Add migrate_set_capability completion.
Dr. David Alan Gilbert dgilb...@redhat.com writes: * Hani Benhabiles (kroo...@gmail.com) wrote: Signed-off-by: Hani Benhabiles h...@linux.com --- hmp-commands.hx | 1 + hmp.h | 2 ++ monitor.c | 21 + 3 files changed, 24 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index 45e1763..919af6e 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -975,6 +975,7 @@ ETEXI .params = capability state, .help = Enable/Disable the usage of a capability for migration, .mhandler.cmd = hmp_migrate_set_capability, +.command_completion = migrate_set_capability_completion, }, STEXI diff --git a/hmp.h b/hmp.h index a70804d..0c814d0 100644 --- a/hmp.h +++ b/hmp.h @@ -107,5 +107,7 @@ void ringbuf_write_completion(ReadLineState *rs, int nb_args, const char *str); void ringbuf_read_completion(ReadLineState *rs, int nb_args, const char *str); void watchdog_action_completion(ReadLineState *rs, int nb_args, const char *str); +void migrate_set_capability_completion(ReadLineState *rs, int nb_args, + const char *str); Thank you for doing this; I spend way too much time typing these commands. #endif diff --git a/monitor.c b/monitor.c index fb300c2..6a3a5c9 100644 --- a/monitor.c +++ b/monitor.c @@ -4572,6 +4572,27 @@ void watchdog_action_completion(ReadLineState *rs, int nb_args, const char *str) add_completion_option(rs, str, none); } +void migrate_set_capability_completion(ReadLineState *rs, int nb_args, + const char *str) +{ +size_t len; + +len = strlen(str); +readline_set_completion_index(rs, len); +if (nb_args == 2) { +int i; +for (i = 0; i MIGRATION_CAPABILITY_MAX; i++) { +const char *name = MigrationCapability_lookup[i]; +if (!strncmp(str, name, len)) { +readline_add_completion(rs, name); +} +} +} else if (nb_args == 3) { +add_completion_option(rs, str, on); +add_completion_option(rs, str, off); +} It's a shame you have to do all of these manually; if we could tell something that we had an enum of 'MigrationCapability' then it could remove the command specific glue. HMP command arguments are specified by an args_type string, which can't express enumeration, so we use strings instead. We could hack in another arguments type 'E', abusing the parameter name like type 'O'. But I think the args_type string is long past its best before date. We could replace it by a command schema, like QMP's. Differences to QMP include: arguments are positional, fewer types are accepted, and there's convenience syntax such as size suffixes. Not exactly an easy task, I'm afraid. [...]
Re: [Qemu-devel] [PATCH 0/2] dataplane: Enable scsi=on
Il 22/05/2014 08:36, Fam Zheng ha scritto: This makes the SG_IO code of non-dataplane available to dataplane, so that dataplane can use to allow scsi=on. I like the idea of using more hw/block/virtio-blk.c code in the dataplane thread. We should turn this into an actual plan to remove more and more hw/block/dataplane/ code. It will help for rerror=/werror= too. Paolo
Re: [Qemu-devel] [PATCH 2/9] target-ppc: Refactor init_proc_POWER7
Am 22.05.2014 um 05:59 schrieb Alexey Kardashevskiy a...@ozlabs.ru: On 05/21/2014 11:23 PM, Alexander Graf wrote: On 21.05.14 14:30, Alexey Kardashevskiy wrote: On 05/21/2014 08:44 PM, Alexander Graf wrote: On 21.05.14 08:20, Alexey Kardashevskiy wrote: This moves SPR initialization to helper functions. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru I like the idea, but please refactor all book3s CPUs, not just POWER7. I also think we can cover a lot of the SPR registration by matching on feature fields. VR for example is coupled to Altivec. Ok. Maybe we could also introduce an enum for the exact cpu type, similar to how we do it on e500? Then we could do fun things like if (cpu_type = CPU_TYPE_970) { gen_spr_book3s_vr(env); } if (cpu_type = CPU_TYPE_POWER7) { gen_spr_lpar(env); } switch (cpu_type) { case CPU_TYPE_POWER7: env-slb_nr = 32; break; default: env-slb_nr = 64; break; } and thus combine all those book3s init functions into a single, more maintainable version. If I can, I would like not to do it in this way, I'd rather have explicit list of gen_spr_FACILITY() calls always. For example, DABR/DABRX/whateverPOWER8has - it is not going to be always , and this breaks my weak mind :( Could you give me some examples where a newer POWER has lost features over an older POWER? DABR/DABRX, also Paul mentioned one exception is the instructions in power6 that are register moves between gpr and fpr registers. For those cases, just switch() explicitly for the cpu type. We don't have to arithmetically solve all of the feature matches :) Alex I do not know anything else though. So your point is taken, I'll try to do what you want :) -- Alexey
Re: [Qemu-devel] [PATCH 8/9] spapr_iommu: Introduce page_shift in sPAPRTCETable
Am 22.05.2014 um 06:25 schrieb Alexey Kardashevskiy a...@ozlabs.ru: On 05/22/2014 08:11 AM, Alexander Graf wrote: On 21.05.14 16:21, Alexey Kardashevskiy wrote: At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR spec allows other page sizes and we are going to implement them, we need page size to be configrable. This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT with it whereever it is possible. This removes SPAPR_TCE_PAGE_MASK as it is no longer used. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/ppc/spapr_iommu.c | 54 +- hw/ppc/spapr_pci.c | 1 + hw/ppc/spapr_vio.c | 1 + include/hw/ppc/spapr.h | 3 ++- 4 files changed, 35 insertions(+), 24 deletions(-) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index 90de3e3..e765a6d 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -70,12 +70,13 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr) if (tcet-bypass) { ret.perm = IOMMU_RW; -} else if ((addr SPAPR_TCE_PAGE_SHIFT) tcet-nb_table) { +} else if ((addr tcet-page_shift) tcet-nb_table) { /* Check if we are in bound */ -tce = tcet-table[addr SPAPR_TCE_PAGE_SHIFT]; -ret.iova = addr ~SPAPR_TCE_PAGE_MASK; -ret.translated_addr = tce ~SPAPR_TCE_PAGE_MASK; -ret.addr_mask = SPAPR_TCE_PAGE_MASK; +target_ulong mask = ~((1 tcet-page_shift) - 1); Why target_ulong? This should be u64 or hwaddr or something along those lines, no? Should be hwaddr, right, will fix. Also, can the mask grow bigger than 31bits? If so you probably want 1ULL (below as well). It cannot now but PAPR allows pages to be 16GB so you are making good point here, will fix too. In fact, we might be better off with a few more fields to tcet. Just add page_mask and page_size in addition to the page_shift one and use them instead of calculating them over and over again. This is only used for emulated devices, not even for virtio. How much will we earn from that optimization? Will it be even measurable? On the other hand, this creates an opportunity (subtle, yes) to screw things up. Still worth? I'm not concerned about speed here, but readibility :). Inline calculations are just harder to read. The alternative would be to extract every calculation into a local variable and use that instead ;). Alex -- Alexey
Re: [Qemu-devel] [PATCH 1/2] virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi
On Thu, 05/22 09:03, Paolo Bonzini wrote: Il 22/05/2014 08:36, Fam Zheng ha scritto: - -if (!req-dev-blk.scsi) { -status = VIRTIO_BLK_S_UNSUPP; -goto fail; -} Where did you move this condition? Also, the scsi field of VirtIOBlockReq is now unused. Just realized I missed that. Will fix. Thanks for reviewing! Fam
Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD
-Original Message- From: Gerd Hoffmann [mailto:kra...@redhat.com] Sent: Thursday, May 22, 2014 2:45 PM To: Chen, Tiejun Cc: Anthony PERARD; Daniel P. Berrange; peter.mayd...@linaro.org; xen-de...@lists.xensource.com; m...@redhat.com; stefano.stabell...@eu.citrix.com; Kay, Allen M; kelly.zyta...@amd.com; qemu-devel@nongnu.org; Zhang, Yang Z; anth...@codemonkey.ws Subject: Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD Hi, Another useful thing would be to not create the xen platform device in case -nodefaults was specified on the command line (that switch turns off a bunch of other devices present by default: vga, nic, cdrom, ...). Currently looks 'xen-platform' itself can't be created, not those devices existed on that. The error message looks more like libxl tries to hot-unplug the xen platform device. Attached patch (untested!) hooks up the xen platform device to the default device code we have in qemu. Two effects: (1) As mentioned above the device will not be created in case -nodefaults is specified on the command line. I think this is unnecessary to add this option to support GFX passthrough. (2) Autocreating the device is also turned off in case xen-platform is added manually via -device. With the patch applied you should be able to move the xen-platform device to some other place with a simple 'qemu -device xen-platform,addr=$slot'. Yes, this patch works now. So I assume this would be merged into ML soon. Thanks for your great help, Gerd! Tiejun
Re: [Qemu-devel] [PATCH v2 8/8] spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
Am 22.05.2014 um 08:53 schrieb Alexey Kardashevskiy a...@ozlabs.ru: On 05/21/2014 10:42 PM, Alexey Kardashevskiy wrote: On 05/21/2014 08:35 PM, Alexander Graf wrote: On 21.05.14 12:13, Alexey Kardashevskiy wrote: On 05/21/2014 07:50 PM, Alexander Graf wrote: On 21.05.14 11:33, Alexey Kardashevskiy wrote: On 05/21/2014 07:13 PM, Alexander Graf wrote: On 21.05.14 11:11, Michael S. Tsirkin wrote: On Wed, May 21, 2014 at 11:06:09AM +0200, Alexander Graf wrote: On 21.05.14 10:52, Alexey Kardashevskiy wrote: On 05/21/2014 06:40 PM, Alexander Graf wrote: On 15.05.14 11:59, Alexey Kardashevskiy wrote: Currently SPAPR PHB keeps track of all allocated MSI/MISX interrupt as XICS used to be unable to reuse interrupts which becomes a problem for dynamic MSI reconfiguration which is happening on guest driver reload or PCI hot (un)plug. Another problem is that PHB has a limit of devices supporting MSI/MSIX (SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that. This makes use of new XICS ability to reuse interrupts. This removes cached MSI configuration from SPAPR PHB so the first IRQ number of a device is stored in MSI/MSIX config space so there is no need to store this anywhere else. From now on, SPAPR PHB only keeps flags telling what type of interrupt for which device it has configured in order to return error if (for example) MSIX was enabled and the guest is trying to disable MSI which it has not enabled. This removes a limit for the maximum number of MSIX-enabled devices per PHB, now XICS and PCI bus capacity are the only limitation. This changes migration stream as it fixes vmstate_spapr_pci_msi::name which was wrong since the beginning. This fixed traces to be more informative. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- In reality either MSIX or MSI is enabled, never both. So I could remove msi/msix bitmaps from this patch, would it make sense? Is this a hard requirement? Does a device have to choose between MSIX and MSI or could it theoretically have both enabled? Is this a PCI limitation, a PAPR/XICS limitation or just a limitation of your implementation? My implementation does not have this limitation, I asked if I can simplify code by introducing one :) I cannot see any reason why PCI cannot have both MSI and MSIX enabled but it does not seem to be used by anyone = cannot debug and confirm. PAPR spec assumes that if the guest tries enabling MSIX when MSI is already enabled, this is a change, not enabling both types. But it also says MSI and MSIX vector numbers are not shared. Hm. Yeah, I'm not aware of any limitation on hardware here and I'd rather not impose one. Michael, do you know of any hardware that uses MSI *and* MSI-X at the same time? Alex No, and the PCI spec says: A function is permitted to implement both MSI and MSI-X, but system software is prohibited from enabling both at the same time. If system software enables both at the same time, the result is undefined. Ah, cool. So yes Alexey, feel free to impose it :). Heh. This solves just half of the problem - I still have to keep track of what device got MSI/MSIX configured via that ibm,change-msi interface. I was hoping I can store such flag somewhere in a device PCI config space but MSI/MSIX enable bit is not good as it is not set when those calls are made. And I cannot rely on address/data fields much as the guest can change them (I already use them to store IRQ numbers and btw it is missing checks when I read them back for disposal, I'll fix in next round). Or on enable event I could put IRQ numbers to .data of MSI config space and on disable check if it is not zero, then configuration took place, then I can remove my msi[]/msix[] flag arrays. If the guest did any change to MSI/MSIX config space (it does not on SPAPR except weird selftest cases), I compare .data with what ICS can possibly have and either reject disable or handle it and if it breaks XICS - that's too bad for the stupid guest. Would that be acceptable? Can't you prohibit the guest from writing to the MSI configuration registers itself? Then you don't need to do sanity checks. I could for emulated devices but VFIO uses the same code. For example, there is an IBM SCSI IPR card which does a self test. For that, it saves MSIX BAR content, does reboot via some backdoor interface and restores MSIX BAR. It has been solved for VFIO in the host kernel by restoring MSIX data from cached values when guest is trying to restore it with what it thinks is actual MSIX data (it is virtualized because of x86). But there is cache We already have a cache because we don't access the real PCI registers with msi_set_message(), no? For emulated devices there is no cache. And in any case the guest is allowed to write to it... Who knows what AIX does? I do not. Tried GHashTable for keeping bus:dev.fn - (irq,
[Qemu-devel] [PATCH] xen: make xen-platform a default device
Patch hooks up the xen platform device to the default device code we have in qemu. Two effects: (1) The device will not be created in case -nodefaults is specified on the command line. (2) Autocreating the device is also turned off in case xen-platform is added manually via -device. With the patch applied you can move the xen-platform device to some other place with a simple 'qemu -device xen-platform,addr=$slot'. Tested-by: Tiejun Chen tiejun.c...@intel.com Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/i386/pc_piix.c| 2 +- include/hw/xen/xen.h | 1 + vl.c | 3 +++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index eaf3e61..f987d03 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -385,7 +385,7 @@ static void pc_xen_hvm_init(QEMUMachineInitArgs *args) pc_init_pci(args); bus = pci_find_primary_bus(); -if (bus != NULL) { +if (bus != NULL default_xenplatform) { pci_create_simple(bus, -1, xen-platform); } } diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h index 85fda3d..b350413 100644 --- a/include/hw/xen/xen.h +++ b/include/hw/xen/xen.h @@ -20,6 +20,7 @@ enum xen_mode { extern uint32_t xen_domid; extern enum xen_mode xen_mode; +extern int default_xenplatform; extern bool xen_allowed; diff --git a/vl.c b/vl.c index 709d8cd..673148e 100644 --- a/vl.c +++ b/vl.c @@ -226,6 +226,7 @@ static int default_floppy = 1; static int default_cdrom = 1; static int default_sdcard = 1; static int default_vga = 1; +int default_xenplatform = 1; static struct { const char *driver; @@ -247,6 +248,7 @@ static struct { { .driver = isa-cirrus-vga, .flag = default_vga }, { .driver = vmware-svga, .flag = default_vga }, { .driver = qxl-vga, .flag = default_vga }, +{ .driver = xen-platform, .flag = default_xenplatform }, }; static QemuOptsList qemu_rtc_opts = { @@ -4101,6 +4103,7 @@ int main(int argc, char **argv, char **envp) default_monitor = 0; default_net = 0; default_vga = 0; +default_xenplatform = 0; } if (is_daemonized()) { -- 1.8.3.1
[Qemu-devel] [PATCH v2 1/2] openpic: Move definition of openpic_reset
This patch moves the definition of openpic_reset after the various register read/write functions. No functional change. It is in preparation for using the register read/write functions in openpic_reset. Signed-off-by: Paul Janzen p...@pauljanzen.org --- hw/intc/openpic.c | 99 +++-- 1 files changed, 50 insertions(+), 49 deletions(-) diff --git a/hw/intc/openpic.c b/hw/intc/openpic.c index 17136c9..81469ff 100644 --- a/hw/intc/openpic.c +++ b/hw/intc/openpic.c @@ -192,6 +192,7 @@ static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx); static void openpic_cpu_write_internal(void *opaque, hwaddr addr, uint32_t val, int idx); +static void openpic_reset(DeviceState *d); typedef enum IRQType { IRQ_TYPE_NORMAL = 0, @@ -534,55 +535,6 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level) } } -static void openpic_reset(DeviceState *d) -{ -OpenPICState *opp = OPENPIC(d); -int i; - -opp-gcr = GCR_RESET; -/* Initialise controller registers */ -opp-frr = ((opp-nb_irqs - 1) FRR_NIRQ_SHIFT) | - ((opp-nb_cpus - 1) FRR_NCPU_SHIFT) | - (opp-vid FRR_VID_SHIFT); - -opp-pir = 0; -opp-spve = -1 opp-vector_mask; -opp-tfrr = opp-tfrr_reset; -/* Initialise IRQ sources */ -for (i = 0; i opp-max_irq; i++) { -opp-src[i].ivpr = opp-ivpr_reset; -opp-src[i].idr = opp-idr_reset; - -switch (opp-src[i].type) { -case IRQ_TYPE_NORMAL: -opp-src[i].level = !!(opp-ivpr_reset IVPR_SENSE_MASK); -break; - -case IRQ_TYPE_FSLINT: -opp-src[i].ivpr |= IVPR_POLARITY_MASK; -break; - -case IRQ_TYPE_FSLSPECIAL: -break; -} -} -/* Initialise IRQ destinations */ -for (i = 0; i MAX_CPU; i++) { -opp-dst[i].ctpr = 15; -memset(opp-dst[i].raised, 0, sizeof(IRQQueue)); -opp-dst[i].raised.next = -1; -memset(opp-dst[i].servicing, 0, sizeof(IRQQueue)); -opp-dst[i].servicing.next = -1; -} -/* Initialise timers */ -for (i = 0; i OPENPIC_MAX_TMR; i++) { -opp-timers[i].tccr = 0; -opp-timers[i].tbcr = TBCR_CI; -} -/* Go out of RESET state */ -opp-gcr = 0; -} - static inline uint32_t read_IRQreg_idr(OpenPICState *opp, int n_IRQ) { return opp-src[n_IRQ].idr; @@ -1466,6 +1418,55 @@ static int openpic_load(QEMUFile* f, void *opaque, int version_id) return 0; } +static void openpic_reset(DeviceState *d) +{ +OpenPICState *opp = OPENPIC(d); +int i; + +opp-gcr = GCR_RESET; +/* Initialise controller registers */ +opp-frr = ((opp-nb_irqs - 1) FRR_NIRQ_SHIFT) | + ((opp-nb_cpus - 1) FRR_NCPU_SHIFT) | + (opp-vid FRR_VID_SHIFT); + +opp-pir = 0; +opp-spve = -1 opp-vector_mask; +opp-tfrr = opp-tfrr_reset; +/* Initialise IRQ sources */ +for (i = 0; i opp-max_irq; i++) { +opp-src[i].ivpr = opp-ivpr_reset; +opp-src[i].idr = opp-idr_reset; + +switch (opp-src[i].type) { +case IRQ_TYPE_NORMAL: +opp-src[i].level = !!(opp-ivpr_reset IVPR_SENSE_MASK); +break; + +case IRQ_TYPE_FSLINT: +opp-src[i].ivpr |= IVPR_POLARITY_MASK; +break; + +case IRQ_TYPE_FSLSPECIAL: +break; +} +} +/* Initialise IRQ destinations */ +for (i = 0; i MAX_CPU; i++) { +opp-dst[i].ctpr = 15; +memset(opp-dst[i].raised, 0, sizeof(IRQQueue)); +opp-dst[i].raised.next = -1; +memset(opp-dst[i].servicing, 0, sizeof(IRQQueue)); +opp-dst[i].servicing.next = -1; +} +/* Initialise timers */ +for (i = 0; i OPENPIC_MAX_TMR; i++) { +opp-timers[i].tccr = 0; +opp-timers[i].tbcr = TBCR_CI; +} +/* Go out of RESET state */ +opp-gcr = 0; +} + typedef struct MemReg { const char *name; MemoryRegionOps const *ops; -- 1.7.1
Re: [Qemu-devel] [PATCH] xen: make xen-platform a default device
On Thu, May 22, 2014 at 09:20:50AM +0200, Gerd Hoffmann wrote: Patch hooks up the xen platform device to the default device code we have in qemu. Two effects: (1) The device will not be created in case -nodefaults is specified on the command line. (2) Autocreating the device is also turned off in case xen-platform is added manually via -device. With the patch applied you can move the xen-platform device to some other place with a simple 'qemu -device xen-platform,addr=$slot'. Tested-by: Tiejun Chen tiejun.c...@intel.com Signed-off-by: Gerd Hoffmann kra...@redhat.com Applied, thanks! --- hw/i386/pc_piix.c| 2 +- include/hw/xen/xen.h | 1 + vl.c | 3 +++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index eaf3e61..f987d03 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -385,7 +385,7 @@ static void pc_xen_hvm_init(QEMUMachineInitArgs *args) pc_init_pci(args); bus = pci_find_primary_bus(); -if (bus != NULL) { +if (bus != NULL default_xenplatform) { pci_create_simple(bus, -1, xen-platform); } } diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h index 85fda3d..b350413 100644 --- a/include/hw/xen/xen.h +++ b/include/hw/xen/xen.h @@ -20,6 +20,7 @@ enum xen_mode { extern uint32_t xen_domid; extern enum xen_mode xen_mode; +extern int default_xenplatform; extern bool xen_allowed; diff --git a/vl.c b/vl.c index 709d8cd..673148e 100644 --- a/vl.c +++ b/vl.c @@ -226,6 +226,7 @@ static int default_floppy = 1; static int default_cdrom = 1; static int default_sdcard = 1; static int default_vga = 1; +int default_xenplatform = 1; static struct { const char *driver; @@ -247,6 +248,7 @@ static struct { { .driver = isa-cirrus-vga, .flag = default_vga }, { .driver = vmware-svga, .flag = default_vga }, { .driver = qxl-vga, .flag = default_vga }, +{ .driver = xen-platform, .flag = default_xenplatform }, }; static QemuOptsList qemu_rtc_opts = { @@ -4101,6 +4103,7 @@ int main(int argc, char **argv, char **envp) default_monitor = 0; default_net = 0; default_vga = 0; +default_xenplatform = 0; } if (is_daemonized()) { -- 1.8.3.1
[Qemu-devel] [PATCH v2 2/2] openpic: Reset IRQ source private members
The openpic emulation code maintains an allowable-CPU's bitmap (destmask) for each IRQ source which is calculated from the IDR register value whenever the guest OS writes to it. However, if the guest OS relies on the system to set the IDR register to a default value at reset, and does not write IDR, then destmask does not get updated, and interrupts do not get propagated to the guest. Additionally, if an IRQ source is marked as critical, the source's internal output and nomask fields are not correctly reset when the PIC is reset. Fix both these issues by calling write_IRQreg_idr from within openpic_reset, instead of simply setting the IDR register to the specified idr_reset value. Signed-off-by: Paul Janzen p...@pauljanzen.org --- hw/intc/openpic.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/intc/openpic.c b/hw/intc/openpic.c index 81469ff..811db6f 100644 --- a/hw/intc/openpic.c +++ b/hw/intc/openpic.c @@ -1435,8 +1435,6 @@ static void openpic_reset(DeviceState *d) /* Initialise IRQ sources */ for (i = 0; i opp-max_irq; i++) { opp-src[i].ivpr = opp-ivpr_reset; -opp-src[i].idr = opp-idr_reset; - switch (opp-src[i].type) { case IRQ_TYPE_NORMAL: opp-src[i].level = !!(opp-ivpr_reset IVPR_SENSE_MASK); @@ -1449,6 +1447,8 @@ static void openpic_reset(DeviceState *d) case IRQ_TYPE_FSLSPECIAL: break; } + +write_IRQreg_idr(opp, i, opp-idr_reset); } /* Initialise IRQ destinations */ for (i = 0; i MAX_CPU; i++) { -- 1.7.1
Re: [Qemu-devel] [PATCH v3 16/22] target-arm: A64: Generalize ERET to various ELs
On 22 May 2014 01:48, Edgar E. Iglesias edgar.igles...@gmail.com wrote: On Wed, May 21, 2014 at 08:20:20PM +0100, Peter Maydell wrote: it needs to also fix the bit in the returning to an exception level which is 32 bit which says new_el = 0 since that's not guaranteed to be true any more. (Also I think the register mapping for AArch32 EL2/EL1 needs handling correctly.) I've tried to stay away from touching too much of the AArch32 code as I haven't had a setup to test 64/32 transitions beyond a64/el1 and a32/el0. OK; if we put in a TODO comment that we assume EL1..EL3 are 64 bit currently, we'll have a marker to come back and fix later. thanks -- PMM
Re: [Qemu-devel] [PATCH] spapr_iommu: Replace @instance_id with LIOBN for migration
On 05/12/2014 06:46 PM, Alexey Kardashevskiy wrote: SPAPR IOMMU is a bus-less device and therefore its only ID in migration stream is an instance id which is not reliable ID as it depends on the command line parameters order. Since libvirt may change the order, we need something better than that. This removes VMSD descriptor from the class definitiion and registers it with @liobn as an intance ID to let the destination side find the right device to receive migration data. Ping? Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/ppc/spapr_iommu.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index 72493d8..7c3f8c2 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -140,6 +140,9 @@ static int spapr_tce_table_realize(DeviceState *dev) QLIST_INSERT_HEAD(spapr_tce_tables, tcet, list); +vmstate_register(DEVICE(tcet), tcet-liobn, vmstate_spapr_tce_table, + tcet); + return 0; } @@ -323,7 +326,6 @@ int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname, static void spapr_tce_table_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); -dc-vmsd = vmstate_spapr_tce_table; dc-init = spapr_tce_table_realize; dc-reset = spapr_tce_reset; -- Alexey
Re: [Qemu-devel] [PATCH v2 16/23] target-arm: Use arm_current_sctlr to access SCTLR
On Tue, May 13, 2014 at 06:16:01PM +0200, Fabian Aggeler wrote: Add SCTLR_EL3 and introduce new function to access correct instance of SCTLR in different modes/worlds. Hi, AArch64 has a couple of insn/regs that do address translation as seen by other ELs. E.g, from EL3 you can perform address translation as seen by EL0. See for example ATS12E0R. AArch32 has similar features, it can also lower S to NS when in S mode. With regards to arm_current_sctlr() and the use in get_phys_addr, I don't think it is enough here. I was planning to post was patches that add new args to get_phys_addr() with the translation-EL and flags to control if stage-2 translation should be done or not. That way ats_write() can keep reusing get_phys_addr(). We would need to pass the security state as an argument aswell to handle AArch32. Does that make sense? Cheers, Edgar Signed-off-by: Fabian Aggeler aggel...@ethz.ch --- hw/arm/pxa2xx.c| 2 +- target-arm/cpu-qom.h | 1 + target-arm/cpu.c | 4 +-- target-arm/cpu.h | 14 ++- target-arm/helper.c| 67 ++ target-arm/op_helper.c | 2 +- 6 files changed, 69 insertions(+), 21 deletions(-) diff --git a/hw/arm/pxa2xx.c b/hw/arm/pxa2xx.c index e0cd847..8b69e72 100644 --- a/hw/arm/pxa2xx.c +++ b/hw/arm/pxa2xx.c @@ -274,7 +274,7 @@ static void pxa2xx_pwrmode_write(CPUARMState *env, const ARMCPRegInfo *ri, case 3: s-cpu-env.uncached_cpsr = ARM_CPU_MODE_SVC; s-cpu-env.daif = PSTATE_A | PSTATE_F | PSTATE_I; -s-cpu-env.cp15.c1_sys = 0; +s-cpu-env.cp15.c1_sys_el1 = 0; s-cpu-env.cp15.c1_coproc = 0; s-cpu-env.cp15.ttbr0_el1 = 0; s-cpu-env.cp15.c3 = 0; diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h index edc7f26..38cbd43 100644 --- a/target-arm/cpu-qom.h +++ b/target-arm/cpu-qom.h @@ -119,6 +119,7 @@ typedef struct ARMCPU { uint32_t mvfr2; uint32_t ctr; uint32_t reset_sctlr; +uint32_t reset_sctlr_el3; uint32_t id_pfr0; uint32_t id_pfr1; uint32_t id_dfr0; diff --git a/target-arm/cpu.c b/target-arm/cpu.c index b0d4a9b..bdbdbb1 100644 --- a/target-arm/cpu.c +++ b/target-arm/cpu.c @@ -100,7 +100,7 @@ static void arm_cpu_reset(CPUState *s) #if defined(CONFIG_USER_ONLY) env-pstate = PSTATE_MODE_EL0t; /* Userspace expects access to CTL_EL0 and the cache ops */ -env-cp15.c1_sys |= SCTLR_UCT | SCTLR_UCI; +env-cp15.c1_sys_el1 |= SCTLR_UCT | SCTLR_UCI; /* and to the FP/Neon instructions */ env-cp15.c1_coproc = deposit64(env-cp15.c1_coproc, 20, 2, 3); #else @@ -146,7 +146,7 @@ static void arm_cpu_reset(CPUState *s) } } -if (env-cp15.c1_sys SCTLR_V) { +if (arm_current_sctlr(env) SCTLR_V) { env-regs[15] = 0x; } diff --git a/target-arm/cpu.h b/target-arm/cpu.h index a4bb6bd..780c1f5 100644 --- a/target-arm/cpu.h +++ b/target-arm/cpu.h @@ -180,7 +180,8 @@ typedef struct CPUARMState { struct { uint32_t c0_cpuid; uint64_t c0_cssel; /* Cache size selection. */ -uint64_t c1_sys; /* System control register. */ +uint64_t c1_sys_el1; /* System control register (EL1). */ +uint64_t c1_sys_el3; /* System control register (EL3). */ uint64_t c1_coproc; /* Coprocessor access register. */ uint32_t c1_xscaleauxcr; /* XScale auxiliary control register. */ uint32_t c1_scr; /* secure config register. */ @@ -971,6 +972,17 @@ static inline int arm_current_pl(CPUARMState *env) return 1; } +static inline uint64_t arm_current_sctlr(CPUARMState *env) +{ +if (is_a64(env) arm_current_pl(env) == 3) { +/* EL3 has its own SCTLR */ +return env-cp15.c1_sys_el3; +} else { +/* Only A32 with NS-bit clear accesses secure instance of SCTLR */ +return A32_MAPPED_EL3_REG_GET(env, c1_sys); +} +} + typedef struct ARMCPRegInfo ARMCPRegInfo; typedef enum CPAccessResult { diff --git a/target-arm/helper.c b/target-arm/helper.c index 98c3dc9..ac8b15a 100644 --- a/target-arm/helper.c +++ b/target-arm/helper.c @@ -1767,7 +1767,7 @@ static void aa64_fpsr_write(CPUARMState *env, const ARMCPRegInfo *ri, static CPAccessResult aa64_daif_access(CPUARMState *env, const ARMCPRegInfo *ri) { -if (arm_current_pl(env) == 0 !(env-cp15.c1_sys SCTLR_UMA)) { +if (arm_current_pl(env) == 0 !(arm_current_sctlr(env) SCTLR_UMA)) { return CP_ACCESS_TRAP; } return CP_ACCESS_OK; @@ -1785,7 +1785,7 @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env, /* Cache invalidate/clean: NOP, but EL0 must UNDEF unless * SCTLR_EL1.UCI is set. */ -if (arm_current_pl(env) == 0 !(env-cp15.c1_sys SCTLR_UCI)) { +if (arm_current_pl(env) == 0 !(arm_current_sctlr(env)
[Qemu-devel] [PATCH v2 0/2] dataplane: Enable scsi=on
This makes the SG_IO code of non-dataplane available to dataplane, so that dataplane can use to allow scsi=on. v2: [1/2] Fix scsi=off case and drop VirtIOBlockReq.scsi. [2/2] Pass conf to virtio_blk_handle_scsi_req. Fam Fam Zheng (2): virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi dataplane: Support VIRTIO_BLK_T_SCSI_CMD hw/block/dataplane/virtio-blk.c | 18 + hw/block/virtio-blk.c | 83 +++-- include/hw/virtio/virtio-blk.h | 4 ++ 3 files changed, 60 insertions(+), 45 deletions(-) -- 1.9.2
[Qemu-devel] [PATCH v2 2/2] dataplane: Support VIRTIO_BLK_T_SCSI_CMD
Signed-off-by: Fam Zheng f...@redhat.com --- hw/block/dataplane/virtio-blk.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 46a6824..e833045 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -201,6 +201,15 @@ static void do_flush_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem, bdrv_aio_flush(s-blk-conf.bs, complete_flush, req); } +static void do_scsi_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem, +QEMUIOVector *inhdr) +{ +int status; + +status = virtio_blk_handle_scsi_req(s-blk-conf.bs, s-blk, elem); +complete_request_early(s, elem, inhdr, status); +} + static int process_request(VirtIOBlockDataPlane *s, VirtQueueElement *elem) { struct iovec *iov = elem-out_sg; @@ -249,8 +258,7 @@ static int process_request(VirtIOBlockDataPlane *s, VirtQueueElement *elem) return 0; case VIRTIO_BLK_T_SCSI_CMD: -/* TODO support SCSI commands */ -complete_request_early(s, elem, inhdr, VIRTIO_BLK_S_UNSUPP); +do_scsi_cmd(s, elem, inhdr); return 0; case VIRTIO_BLK_T_FLUSH: @@ -326,12 +334,6 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk, return; } -if (blk-scsi) { -error_setg(errp, - device is incompatible with x-data-plane, use scsi=off); -return; -} - /* If dataplane is (re-)enabled while the guest is running there could be * block jobs that can conflict. */ -- 1.9.2
[Qemu-devel] [PATCH v2 1/2] virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi
The common logic to process a scsi request in a VirtQueueElement is extracted to a function to share with dataplane. This makes VirtIOBlockReq.scsi unused, so drop it. Signed-off-by: Fam Zheng f...@redhat.com --- hw/block/virtio-blk.c | 83 +++--- include/hw/virtio/virtio-blk.h | 4 ++ 2 files changed, 50 insertions(+), 37 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 5e9433d..78e81a6 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -33,7 +33,6 @@ typedef struct VirtIOBlockReq VirtQueueElement elem; struct virtio_blk_inhdr *in; struct virtio_blk_outhdr *out; -struct virtio_scsi_inhdr *scsi; QEMUIOVector qiov; struct VirtIOBlockReq *next; BlockAcctCookie acct; @@ -125,13 +124,16 @@ static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s) return req; } -static void virtio_blk_handle_scsi(VirtIOBlockReq *req) +int virtio_blk_handle_scsi_req(BlockDriverState *bs, + VirtIOBlkConf *conf, + VirtQueueElement *elem) { +int status = VIRTIO_BLK_S_OK; +struct virtio_scsi_inhdr *scsi = NULL; #ifdef __linux__ -int ret; int i; +struct sg_io_hdr hdr; #endif -int status = VIRTIO_BLK_S_OK; /* * We require at least one output segment each for the virtio_blk_outhdr @@ -140,63 +142,61 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) * We also at least require the virtio_blk_inhdr, the virtio_scsi_inhdr * and the sense buffer pointer in the input segments. */ -if (req-elem.out_num 2 || req-elem.in_num 3) { -virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR); -g_free(req); -return; +if (elem-out_num 2 || elem-in_num 3) { +status = VIRTIO_BLK_S_IOERR; +goto fail; } /* * The scsi inhdr is placed in the second-to-last input segment, just * before the regular inhdr. */ -req-scsi = (void *)req-elem.in_sg[req-elem.in_num - 2].iov_base; +scsi = (void *)elem-in_sg[elem-in_num - 2].iov_base; -if (!req-dev-blk.scsi) { +/* + * No support for bidirection commands yet. + */ +if (elem-out_num 2 elem-in_num 3) { status = VIRTIO_BLK_S_UNSUPP; goto fail; } -/* - * No support for bidirection commands yet. - */ -if (req-elem.out_num 2 req-elem.in_num 3) { +if (!conf-scsi) { status = VIRTIO_BLK_S_UNSUPP; goto fail; } #ifdef __linux__ -struct sg_io_hdr hdr; memset(hdr, 0, sizeof(struct sg_io_hdr)); hdr.interface_id = 'S'; -hdr.cmd_len = req-elem.out_sg[1].iov_len; -hdr.cmdp = req-elem.out_sg[1].iov_base; +hdr.cmd_len = elem-out_sg[1].iov_len; +hdr.cmdp = elem-out_sg[1].iov_base; hdr.dxfer_len = 0; -if (req-elem.out_num 2) { +if (elem-out_num 2) { /* * If there are more than the minimally required 2 output segments * there is write payload starting from the third iovec. */ hdr.dxfer_direction = SG_DXFER_TO_DEV; -hdr.iovec_count = req-elem.out_num - 2; +hdr.iovec_count = elem-out_num - 2; for (i = 0; i hdr.iovec_count; i++) -hdr.dxfer_len += req-elem.out_sg[i + 2].iov_len; +hdr.dxfer_len += elem-out_sg[i + 2].iov_len; -hdr.dxferp = req-elem.out_sg + 2; +hdr.dxferp = elem-out_sg + 2; -} else if (req-elem.in_num 3) { +} else if (elem-in_num 3) { /* * If we have more than 3 input segments the guest wants to actually * read data. */ hdr.dxfer_direction = SG_DXFER_FROM_DEV; -hdr.iovec_count = req-elem.in_num - 3; +hdr.iovec_count = elem-in_num - 3; for (i = 0; i hdr.iovec_count; i++) -hdr.dxfer_len += req-elem.in_sg[i].iov_len; +hdr.dxfer_len += elem-in_sg[i].iov_len; -hdr.dxferp = req-elem.in_sg; +hdr.dxferp = elem-in_sg; } else { /* * Some SCSI commands don't actually transfer any data. @@ -204,11 +204,11 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) hdr.dxfer_direction = SG_DXFER_NONE; } -hdr.sbp = req-elem.in_sg[req-elem.in_num - 3].iov_base; -hdr.mx_sb_len = req-elem.in_sg[req-elem.in_num - 3].iov_len; +hdr.sbp = elem-in_sg[elem-in_num - 3].iov_base; +hdr.mx_sb_len = elem-in_sg[elem-in_num - 3].iov_len; -ret = bdrv_ioctl(req-dev-bs, SG_IO, hdr); -if (ret) { +status = bdrv_ioctl(bs, SG_IO, hdr); +if (status) { status = VIRTIO_BLK_S_UNSUPP; goto fail; } @@ -224,23 +224,32 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) hdr.status = CHECK_CONDITION; } -stl_p(req-scsi-errors, +stl_p(scsi-errors, hdr.status | (hdr.msg_status 8) | (hdr.host_status
Re: [Qemu-devel] [PATCH v2 14/23] target-arm: add banked coprocessor register type and macros
On Tue, May 13, 2014 at 06:15:59PM +0200, Fabian Aggeler wrote: Banked CP registers can be defined with a A32_BANKED_REG macro which defines a non-secure instance of the register followed by an adjacent secure instance. Using a union makes the code backwards-compatible since the non-secure instance can normally be accessed by its existing name. When translating a banked CP register access instruction in monitor mode, the SCR.NS bit determines which instance is going to be accessed. If EL3 is operating in Aarch64 state coprocessor registers are not banked anymore but in some cases have its own _EL3 instance. Hi Regarding the sctlr regs and the banking of S/NS regs in general, I think the general pattern should be to arrayify the regs that need to be indexed by EL. This is an example of a structure (indexed by EL) with the aarch32 struct beeing here to help clarify. union { struct { uint64_t pad; uint64_t sctlr_ns; uint64_t hsctlr; uint64_t sctlr_s; } aarch32; uint64_t sctlr_el[4]; } I think we would naturally want to register this for AArch32 as banked with NS=sctlr_el[1] and S=sctlr_el[3]. Another register example is FAR. For FAR, things look like this (when EL2 is available and ignoring DFAR for clarity): union { struct { uint64_t pad; uint64_t ifar_ns; uint64_t ifar_s; } aarch32; uint64_t far_el[4]; } Preferably we need something that can handle both cases. An option could be to arrayify the .fieldoffset in reginfos? If we don't want hardcoded TZ knowledge in the generic cpreg accessors, maybe there could be a generic function that returns the .fieldoffset index based on CPUState (e.g NS=0, S=1 etc). Or maybe specialized read/write fns would be enough. Just an idea to get the discussion going. struct ARMCPRegInfo { union { ptrdiff_t fieldoffset; ptrdiff_t fieldoffsets[2]; }; }; unsigned int arm_cpreg_tzbank_idx(CPUARMState *env) { return is_a64(env) ? 0 : arm_is_secure(env); } Example: { .name = FAR_EL1, .state = ARM_CP_STATE_BOTH, .opc0 = 3, .crn = 6, .crm = 0, .opc1 = 0, .opc2 = 0, .access = PL1_RW, .fieldindex_fn = arm_cpreg_tzbank_idx, .fieldoffsets[] = { offsetof(CPUARMState, cp15.far_el[1]), offsetof(CPUARMState, cp15.far_el[2])}, .resetvalue = 0, }, ARMCPRegInfo sctlr = { .name = SCTLR, .state = ARM_CP_STATE_BOTH, .opc0 = 3, .crn = 1, .crm = 0, .opc1 = 0, .opc2 = 0, .access = PL1_RW, .fieldindex_fn = arm_cpreg_tzbank_idx, .fieldoffsets[] = { offsetof(CPUARMState, cp15.sctlr_el[1]), offsetof(CPUARMState, cp15.sctlr_el[3]), }, /* Assuming raw_write and raw_read respect the indexing. */ .writefn = sctlr_write, .resetvalue = cpu-reset_sctlr, .raw_writefn = raw_write, }; Best regards, Edgar Signed-off-by: Sergey Fedorov s.fedo...@samsung.com Signed-off-by: Fabian Aggeler aggel...@ethz.ch --- target-arm/cpu.h | 121 + target-arm/helper.c| 64 -- target-arm/translate.c | 19 +--- 3 files changed, 184 insertions(+), 20 deletions(-) diff --git a/target-arm/cpu.h b/target-arm/cpu.h index a970d55..9e325ac 100644 --- a/target-arm/cpu.h +++ b/target-arm/cpu.h @@ -80,6 +80,16 @@ #define offsetofhigh32(S, M) (offsetof(S, M) + sizeof(uint32_t)) #endif +/* Define a banked coprocessor register state field. Use %name as the + * register field name for the secure instance. The non-secure instance + * has a _nonsecure suffix. + */ +#define A32_BANKED_REG(type, name) \ +union { \ +type name; \ +type name##_banked[2]; \ +} + /* Meanings of the ARMCPU object's two inbound GPIO lines */ #define ARM_CPU_IRQ 0 #define ARM_CPU_FIQ 1 @@ -89,6 +99,7 @@ typedef void ARMWriteCPFunc(void *opaque, int cp_info, typedef uint32_t ARMReadCPFunc(void *opaque, int cp_info, int dstreg, int operand); + struct arm_boot_info; #define NB_MMU_MODES 5 @@ -673,6 +684,78 @@ static inline bool arm_el_is_aa64(CPUARMState *env, int el) return arm_feature(env, ARM_FEATURE_AARCH64); } +/* When EL3 is operating in Aarch32 state, the NS-bit determines + * whether the secure instance of a cp-register should be used. */ +#define USE_SECURE_REG(env) ( \ +arm_feature(env, ARM_FEATURE_SECURITY_EXTENSIONS) \ +!arm_el_is_aa64(env, 3) \ +!((env)-cp15.c1_scr 1/*NS*/)) + +#define NONSECURE_BANK 0 +#define SECURE_BANK 1 + +#define A32_BANKED_REG_GET(env, regname) \ +((USE_SECURE_REG(env)) ? \ +(env)-cp15.regname##_banked[SECURE_BANK] : \ +(env)-cp15.regname) + +#define A32_MAPPED_EL3_REG_GET(env,
Re: [Qemu-devel] [PATCH v2 1/2] virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi
Il 22/05/2014 09:37, Fam Zheng ha scritto: -static void virtio_blk_handle_scsi(VirtIOBlockReq *req) +int virtio_blk_handle_scsi_req(BlockDriverState *bs, + VirtIOBlkConf *conf, + VirtQueueElement *elem) Two more comments... please pass the VirtIOBlock * instead of bs/conf here. { +int status = VIRTIO_BLK_S_OK; +struct virtio_scsi_inhdr *scsi = NULL; #ifdef __linux__ -int ret; int i; +struct sg_io_hdr hdr; #endif -int status = VIRTIO_BLK_S_OK; /* * We require at least one output segment each for the virtio_blk_outhdr @@ -140,63 +142,61 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) * We also at least require the virtio_blk_inhdr, the virtio_scsi_inhdr * and the sense buffer pointer in the input segments. */ -if (req-elem.out_num 2 || req-elem.in_num 3) { -virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR); -g_free(req); -return; +if (elem-out_num 2 || elem-in_num 3) { +status = VIRTIO_BLK_S_IOERR; +goto fail; } /* * The scsi inhdr is placed in the second-to-last input segment, just * before the regular inhdr. */ -req-scsi = (void *)req-elem.in_sg[req-elem.in_num - 2].iov_base; +scsi = (void *)elem-in_sg[elem-in_num - 2].iov_base; -if (!req-dev-blk.scsi) { +/* + * No support for bidirection commands yet. + */ +if (elem-out_num 2 elem-in_num 3) { status = VIRTIO_BLK_S_UNSUPP; goto fail; } -/* - * No support for bidirection commands yet. - */ -if (req-elem.out_num 2 req-elem.in_num 3) { +if (!conf-scsi) { status = VIRTIO_BLK_S_UNSUPP; goto fail; } Swapping the two conditionals unnecessarily makes the patch bigger. Paolo #ifdef __linux__ -struct sg_io_hdr hdr; memset(hdr, 0, sizeof(struct sg_io_hdr)); hdr.interface_id = 'S'; -hdr.cmd_len = req-elem.out_sg[1].iov_len; -hdr.cmdp = req-elem.out_sg[1].iov_base; +hdr.cmd_len = elem-out_sg[1].iov_len; +hdr.cmdp = elem-out_sg[1].iov_base; hdr.dxfer_len = 0; -if (req-elem.out_num 2) { +if (elem-out_num 2) { /* * If there are more than the minimally required 2 output segments * there is write payload starting from the third iovec. */ hdr.dxfer_direction = SG_DXFER_TO_DEV; -hdr.iovec_count = req-elem.out_num - 2; +hdr.iovec_count = elem-out_num - 2; for (i = 0; i hdr.iovec_count; i++) -hdr.dxfer_len += req-elem.out_sg[i + 2].iov_len; +hdr.dxfer_len += elem-out_sg[i + 2].iov_len; -hdr.dxferp = req-elem.out_sg + 2; +hdr.dxferp = elem-out_sg + 2; -} else if (req-elem.in_num 3) { +} else if (elem-in_num 3) { /* * If we have more than 3 input segments the guest wants to actually * read data. */ hdr.dxfer_direction = SG_DXFER_FROM_DEV; -hdr.iovec_count = req-elem.in_num - 3; +hdr.iovec_count = elem-in_num - 3; for (i = 0; i hdr.iovec_count; i++) -hdr.dxfer_len += req-elem.in_sg[i].iov_len; +hdr.dxfer_len += elem-in_sg[i].iov_len; -hdr.dxferp = req-elem.in_sg; +hdr.dxferp = elem-in_sg; } else { /* * Some SCSI commands don't actually transfer any data. @@ -204,11 +204,11 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) hdr.dxfer_direction = SG_DXFER_NONE; } -hdr.sbp = req-elem.in_sg[req-elem.in_num - 3].iov_base; -hdr.mx_sb_len = req-elem.in_sg[req-elem.in_num - 3].iov_len; +hdr.sbp = elem-in_sg[elem-in_num - 3].iov_base; +hdr.mx_sb_len = elem-in_sg[elem-in_num - 3].iov_len; -ret = bdrv_ioctl(req-dev-bs, SG_IO, hdr); -if (ret) { +status = bdrv_ioctl(bs, SG_IO, hdr); +if (status) { status = VIRTIO_BLK_S_UNSUPP; goto fail; } @@ -224,23 +224,32 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) hdr.status = CHECK_CONDITION; } -stl_p(req-scsi-errors, +stl_p(scsi-errors, hdr.status | (hdr.msg_status 8) | (hdr.host_status 16) | (hdr.driver_status 24)); -stl_p(req-scsi-residual, hdr.resid); -stl_p(req-scsi-sense_len, hdr.sb_len_wr); -stl_p(req-scsi-data_len, hdr.dxfer_len); +stl_p(scsi-residual, hdr.resid); +stl_p(scsi-sense_len, hdr.sb_len_wr); +stl_p(scsi-data_len, hdr.dxfer_len); -virtio_blk_req_complete(req, status); -g_free(req); -return; +return status; #else abort(); #endif fail: /* Just put anything nonzero so that the ioctl fails in the guest. */ -stl_p(req-scsi-errors, 255); +if (scsi) { +stl_p(scsi-errors, 255); +} +return status; +} + +static void virtio_blk_handle_scsi(VirtIOBlockReq *req) +{ +int status; + +status =
[Qemu-devel] [PATCH v3 0/2] dataplane: Enable scsi=on
This makes the SG_IO code of non-dataplane available to dataplane, so that dataplane can use to allow scsi=on. v3: Change parameter to VirtIOBlock pointer. Undo the swap in code movement. (Paolo) v2: [1/2] Fix scsi=off case and drop VirtIOBlockReq.scsi. [2/2] Pass conf to virtio_blk_handle_scsi_req. Fam Fam Zheng (2): virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi dataplane: Support VIRTIO_BLK_T_SCSI_CMD hw/block/dataplane/virtio-blk.c | 18 +- hw/block/virtio-blk.c | 75 ++--- include/hw/virtio/virtio-blk.h | 3 ++ 3 files changed, 54 insertions(+), 42 deletions(-) -- 1.9.2
[Qemu-devel] [PATCH v3 1/2] virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi
The common logic to process a scsi request in a VirtQueueElement is extracted to a function to share with dataplane. This makes VirtIOBlockReq.scsi unused, so drop it. Signed-off-by: Fam Zheng f...@redhat.com --- hw/block/virtio-blk.c | 75 +++--- include/hw/virtio/virtio-blk.h | 3 ++ 2 files changed, 44 insertions(+), 34 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 5e9433d..0b1446e 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -33,7 +33,6 @@ typedef struct VirtIOBlockReq VirtQueueElement elem; struct virtio_blk_inhdr *in; struct virtio_blk_outhdr *out; -struct virtio_scsi_inhdr *scsi; QEMUIOVector qiov; struct VirtIOBlockReq *next; BlockAcctCookie acct; @@ -125,13 +124,15 @@ static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s) return req; } -static void virtio_blk_handle_scsi(VirtIOBlockReq *req) +int virtio_blk_handle_scsi_req(VirtIOBlock *blk, + VirtQueueElement *elem) { +int status = VIRTIO_BLK_S_OK; +struct virtio_scsi_inhdr *scsi = NULL; #ifdef __linux__ -int ret; int i; +struct sg_io_hdr hdr; #endif -int status = VIRTIO_BLK_S_OK; /* * We require at least one output segment each for the virtio_blk_outhdr @@ -140,19 +141,18 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) * We also at least require the virtio_blk_inhdr, the virtio_scsi_inhdr * and the sense buffer pointer in the input segments. */ -if (req-elem.out_num 2 || req-elem.in_num 3) { -virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR); -g_free(req); -return; +if (elem-out_num 2 || elem-in_num 3) { +status = VIRTIO_BLK_S_IOERR; +goto fail; } /* * The scsi inhdr is placed in the second-to-last input segment, just * before the regular inhdr. */ -req-scsi = (void *)req-elem.in_sg[req-elem.in_num - 2].iov_base; +scsi = (void *)elem-in_sg[elem-in_num - 2].iov_base; -if (!req-dev-blk.scsi) { +if (!blk-blk.scsi) { status = VIRTIO_BLK_S_UNSUPP; goto fail; } @@ -160,43 +160,42 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) /* * No support for bidirection commands yet. */ -if (req-elem.out_num 2 req-elem.in_num 3) { +if (elem-out_num 2 elem-in_num 3) { status = VIRTIO_BLK_S_UNSUPP; goto fail; } #ifdef __linux__ -struct sg_io_hdr hdr; memset(hdr, 0, sizeof(struct sg_io_hdr)); hdr.interface_id = 'S'; -hdr.cmd_len = req-elem.out_sg[1].iov_len; -hdr.cmdp = req-elem.out_sg[1].iov_base; +hdr.cmd_len = elem-out_sg[1].iov_len; +hdr.cmdp = elem-out_sg[1].iov_base; hdr.dxfer_len = 0; -if (req-elem.out_num 2) { +if (elem-out_num 2) { /* * If there are more than the minimally required 2 output segments * there is write payload starting from the third iovec. */ hdr.dxfer_direction = SG_DXFER_TO_DEV; -hdr.iovec_count = req-elem.out_num - 2; +hdr.iovec_count = elem-out_num - 2; for (i = 0; i hdr.iovec_count; i++) -hdr.dxfer_len += req-elem.out_sg[i + 2].iov_len; +hdr.dxfer_len += elem-out_sg[i + 2].iov_len; -hdr.dxferp = req-elem.out_sg + 2; +hdr.dxferp = elem-out_sg + 2; -} else if (req-elem.in_num 3) { +} else if (elem-in_num 3) { /* * If we have more than 3 input segments the guest wants to actually * read data. */ hdr.dxfer_direction = SG_DXFER_FROM_DEV; -hdr.iovec_count = req-elem.in_num - 3; +hdr.iovec_count = elem-in_num - 3; for (i = 0; i hdr.iovec_count; i++) -hdr.dxfer_len += req-elem.in_sg[i].iov_len; +hdr.dxfer_len += elem-in_sg[i].iov_len; -hdr.dxferp = req-elem.in_sg; +hdr.dxferp = elem-in_sg; } else { /* * Some SCSI commands don't actually transfer any data. @@ -204,11 +203,11 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) hdr.dxfer_direction = SG_DXFER_NONE; } -hdr.sbp = req-elem.in_sg[req-elem.in_num - 3].iov_base; -hdr.mx_sb_len = req-elem.in_sg[req-elem.in_num - 3].iov_len; +hdr.sbp = elem-in_sg[elem-in_num - 3].iov_base; +hdr.mx_sb_len = elem-in_sg[elem-in_num - 3].iov_len; -ret = bdrv_ioctl(req-dev-bs, SG_IO, hdr); -if (ret) { +status = bdrv_ioctl(blk-bs, SG_IO, hdr); +if (status) { status = VIRTIO_BLK_S_UNSUPP; goto fail; } @@ -224,23 +223,31 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) hdr.status = CHECK_CONDITION; } -stl_p(req-scsi-errors, +stl_p(scsi-errors, hdr.status | (hdr.msg_status 8) | (hdr.host_status 16) | (hdr.driver_status 24)); -
[Qemu-devel] [PATCH v3 2/2] dataplane: Support VIRTIO_BLK_T_SCSI_CMD
Signed-off-by: Fam Zheng f...@redhat.com --- hw/block/dataplane/virtio-blk.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 46a6824..03cbdec 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -201,6 +201,15 @@ static void do_flush_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem, bdrv_aio_flush(s-blk-conf.bs, complete_flush, req); } +static void do_scsi_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem, +QEMUIOVector *inhdr) +{ +int status; + +status = virtio_blk_handle_scsi_req(VIRTIO_BLK(s-vdev), elem); +complete_request_early(s, elem, inhdr, status); +} + static int process_request(VirtIOBlockDataPlane *s, VirtQueueElement *elem) { struct iovec *iov = elem-out_sg; @@ -249,8 +258,7 @@ static int process_request(VirtIOBlockDataPlane *s, VirtQueueElement *elem) return 0; case VIRTIO_BLK_T_SCSI_CMD: -/* TODO support SCSI commands */ -complete_request_early(s, elem, inhdr, VIRTIO_BLK_S_UNSUPP); +do_scsi_cmd(s, elem, inhdr); return 0; case VIRTIO_BLK_T_FLUSH: @@ -326,12 +334,6 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk, return; } -if (blk-scsi) { -error_setg(errp, - device is incompatible with x-data-plane, use scsi=off); -return; -} - /* If dataplane is (re-)enabled while the guest is running there could be * block jobs that can conflict. */ -- 1.9.2
Re: [Qemu-devel] [PATCH v3 0/2] dataplane: Enable scsi=on
Il 22/05/2014 10:22, Fam Zheng ha scritto: This makes the SG_IO code of non-dataplane available to dataplane, so that dataplane can use to allow scsi=on. v3: Change parameter to VirtIOBlock pointer. Undo the swap in code movement. (Paolo) v2: [1/2] Fix scsi=off case and drop VirtIOBlockReq.scsi. [2/2] Pass conf to virtio_blk_handle_scsi_req. Fam Fam Zheng (2): virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi dataplane: Support VIRTIO_BLK_T_SCSI_CMD hw/block/dataplane/virtio-blk.c | 18 +- hw/block/virtio-blk.c | 75 ++--- include/hw/virtio/virtio-blk.h | 3 ++ 3 files changed, 54 insertions(+), 42 deletions(-) Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Re: [Qemu-devel] [PATCH v1 RFC 6/6] KVM: s390: add cpu model support
Il 22/05/2014 10:23, Michael Mueller ha scritto: On Wed, 21 May 2014 15:22:35 +0200 Alexander Graf ag...@suse.de wrote: I have seen the slides from Eduardo which he presented during this years DevConf in Brno and made my comments according the s390x implementation on that. Is you will see, this is mostly overlapping except for the model definition authority that I clearly see on qemu's side. See pdf attachment. More comments: - Only one machine type in s390 case which is -machine s390-virtio-ccw This probably should change sooner or later, as soon as the implementation becomes stable enough. Versioning is necessary for live migration across different QEMU version. Perhaps start versioning in 2.2, i.e. start making s390-virtio-ccw-2.1 an alias for s390-virtio-ccw now? Note that new virtio device features can appear at any time outside the s390 code, and will take part in versioning as well. - No enforce option Strongly suggest making enforce the only possible behavior. - Not in the s390x case, because the KVM facility mask limits the cpu model specific facilities What if the KVM facility mask changes? For x86, nowadays new CPUID bits are only introduced in KVM when a new processors comes out. But if we introduced an older CPUID bit, it would be a huge complication for backwards compatibility. Is it different for s390? Paolo
Re: [Qemu-devel] [PATCH v2 04/23] target-arm: preserve RAO/WI bits of ARMv7 SCTLR
On 21 May 2014, at 18:12, Peter Maydell peter.mayd...@linaro.org wrote: On 14 May 2014 06:43, Sergey Fedorov serge.f...@gmail.com wrote: On 13.05.2014 20:15, Fabian Aggeler wrote: From: Svetlana Fedoseeva s.fedose...@samsung.com Signed-off-by: Svetlana Fedoseeva s.fedose...@samsung.com Signed-off-by: Sergey Fedorov s.fedo...@samsung.com Signed-off-by: Fabian Aggeler aggel...@ethz.ch --- target-arm/helper.c | 5 + 1 file changed, 5 insertions(+) diff --git a/target-arm/helper.c b/target-arm/helper.c index 9c3269f..2b57ad9 100644 --- a/target-arm/helper.c +++ b/target-arm/helper.c @@ -2083,6 +2083,11 @@ static void sctlr_write(CPUARMState *env, const ARMCPRegInfo *ri, { ARMCPU *cpu = arm_env_get_cpu(env); +if (arm_feature(env, ARM_FEATURE_V7)) { +value |= SCTLR_XP | SCTLR_U | SCTLR_nTWE | SCTLR_nTWI | SCTLR_L +| SCTLR_CP15BEN | SCTLR_P; /* These bits are RAO/WI */ Actually, some of these bits are RAO/WI since v6. Also, there are some RAZ/WI bits varying over architecture variants. There is some overview at ARM ARM v7-AP section L.7.4. Maybe it is worth to fix more precisely over supported architecture variants? By the way, this patch could be separated from security extensions support patch set. Agreed. Our compliance for bits that should-be-0/1 is not great, but if we don't actually need to do those fixes for TZ support then they're probably better separated out (ie drop them from this patchset for the moment and submit them separately or later...) Also for v8 many of these RAZ/RAO bits become RES0/RES1 and the rules are different... thanks — PMM Okay, I will separate them and submit them separately. Thanks, Fabian
[Qemu-devel] [PATCH] virtio-pci: report an error when disable msix
QEMU remains 4k memory for PCI BAR, each msix entry takes 16 bytes. If user assigns more than 128 vectors, msix resource isn't enough, so msix will be disabled. This patch addes a note when fail to init exclusive bars for msix. qemu -device virtio-net-pci,netdev=h1,vectors=129,mq=on \ -netdev tap,id=h1,queues=8 Signed-off-by: Amos Kong ak...@redhat.com --- hw/virtio/virtio-pci.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index ce97514..ea5dcdf 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -976,6 +976,8 @@ static void virtio_pci_device_plugged(DeviceState *d) if (proxy-nvectors msix_init_exclusive_bar(proxy-pci_dev, proxy-nvectors, 1)) { +error_report(%s: unable to init exclusive bars for msix, disable msix, + __func__); proxy-nvectors = 0; } -- 1.9.0
Re: [Qemu-devel] [PATCH v2 01/23] target-arm: add new CPU feature for Security Extensions
On 21 May 2014, at 16:51, Peter Maydell peter.mayd...@linaro.org wrote: On 13 May 2014 17:15, Fabian Aggeler aggel...@ethz.ch wrote: --- a/target-arm/cpu.h +++ b/target-arm/cpu.h @@ -631,6 +631,7 @@ enum arm_features { ARM_FEATURE_CBAR, /* has cp15 CBAR */ ARM_FEATURE_CRC, /* ARMv8 CRC instructions */ ARM_FEATURE_CBAR_RO, /* has cp15 CBAR and it is read-only */ +ARM_FEATURE_SECURITY_EXTENSIONS, /* has Security Extensions */ }; Also this feature name is pretty long and unwieldy; something shorter would be nice. I think we should probably go with ARM_FEATURE_EL2/ARM_FEATURE_EL3 as per Edgar's patchset. thanks -- PMM Makes sense, I will use Edgar’s ARM_FEATURE_EL3 in the next version. Fabian
Re: [Qemu-devel] [PATCH] virtio-pci: report an error when disable msix
On Thu, May 22, 2014 at 05:02:17PM +0800, Amos Kong wrote: QEMU remains 4k memory for PCI BAR, each msix entry takes 16 bytes. If user assigns more than 128 vectors, msix resource isn't enough, so msix will be disabled. This patch addes a note when fail to init exclusive bars for msix. qemu -device virtio-net-pci,netdev=h1,vectors=129,mq=on \ -netdev tap,id=h1,queues=8 Signed-off-by: Amos Kong ak...@redhat.com OK I guess, but how about removing the limit instead? --- hw/virtio/virtio-pci.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index ce97514..ea5dcdf 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -976,6 +976,8 @@ static void virtio_pci_device_plugged(DeviceState *d) if (proxy-nvectors msix_init_exclusive_bar(proxy-pci_dev, proxy-nvectors, 1)) { +error_report(%s: unable to init exclusive bars for msix, disable msix, + __func__); proxy-nvectors = 0; } -- 1.9.0
Re: [Qemu-devel] [PATCH] virtio-pci: report an error when disable msix
On 22 May 2014 10:02, Amos Kong ak...@redhat.com wrote: QEMU remains 4k memory for PCI BAR, each msix entry takes 16 bytes. If user assigns more than 128 vectors, msix resource isn't enough, so msix will be disabled. This patch addes a note when fail to init exclusive bars for msix. qemu -device virtio-net-pci,netdev=h1,vectors=129,mq=on \ -netdev tap,id=h1,queues=8 Signed-off-by: Amos Kong ak...@redhat.com --- hw/virtio/virtio-pci.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index ce97514..ea5dcdf 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -976,6 +976,8 @@ static void virtio_pci_device_plugged(DeviceState *d) if (proxy-nvectors msix_init_exclusive_bar(proxy-pci_dev, proxy-nvectors, 1)) { +error_report(%s: unable to init exclusive bars for msix, disable msix, + __func__); I think you probably mean disabling. proxy-nvectors = 0; } thanks -- PMM
[Qemu-devel] [PATCH] docs: clarify that qcow2 file size is not always a cluster multiple
Normally one would expect that qcow2 image file lengths are multiples of the cluster size. This is not true in all cases and the spec should document this so implementers remember to accept such files. $ qemu-img create -f qcow2 foo.qcow2 2G Formatting 'foo.qcow2', fmt=qcow2 size=2147483648 encryption=off cluster_size=65536 lazy_refcounts=off $ ls -l foo.qcow2 -rw-r--r-- 1 stefanha stefanha 197120 May 22 11:40 foo.qcow2 $ bc -q 3 * (64 * 1024) + 512 197120 The extra sector are the 4 L1 table entries that a 2 GB disk with 64 KB cluster size needs. The rest of the L1 table is omitted from the file but allocation will continue at the next cluster boundary. Reported-by: Maria Kustova m...@catit.be Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- docs/specs/qcow2.txt | 4 1 file changed, 4 insertions(+) diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt index f19536a..a46ee57 100644 --- a/docs/specs/qcow2.txt +++ b/docs/specs/qcow2.txt @@ -4,6 +4,10 @@ A qcow2 image file is organized in units of constant size, which are called (host) clusters. A cluster is the unit in which all allocations are done, both for actual guest data and for image metadata. +If the end of the image file is not on a cluster boundary, the missing data is +treated as zeroes. This layout can occur when an L1 table or refcount table is +the last thing in the file and not all entries in the table are used. + Likewise, the virtual disk as seen by the guest is divided into (guest) clusters of the same size. -- 1.9.0
Re: [Qemu-devel] [Xen-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD
On Thu, May 22, 2014 at 07:39:36AM +0200, Gerd Hoffmann wrote: Hi, According to our discussions, I realize we may have some plans or policies dedicated to how to assign devfn, but to support GFX passthrough for XEN, I think currently it may be a better solution to adopt #1 simply like this: diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index eaf3e61..500b3c2 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -386,7 +386,7 @@ static void pc_xen_hvm_init(QEMUMachineInitArgs *args) bus = pci_find_primary_bus(); if (bus != NULL) { -pci_create_simple(bus, -1, xen-platform); +pci_create_simple(bus, PCI_DEVFN(3,0), xen-platform); } } #endif Then we can go out to plan how to assign devfn in common, is this fine? Somewhere else in the thread someone listed a libxl config file snippet which is supposed to disable the xen platform device. Which doesn't work with upstream qemu for some reason. Searching ... ah, here: http://lists.gnu.org/archive/html/qemu-devel/2014-05/msg03996.html So, how about making xen_platform_pci=0 working correctly ? You don't want that enabled by default. Otherwise the Linux (and Windows) drivers won't enable the PV functionality and we are using the emulated drivers (slow). Another useful thing would be to not create the xen platform device in case -nodefaults was specified on the command line (that switch turns off a bunch of other devices present by default: vga, nic, cdrom, ...). cheers, Gerd ___ Xen-devel mailing list xen-de...@lists.xen.org http://lists.xen.org/xen-devel
Re: [Qemu-devel] [PATCH 8/9] spapr_iommu: Introduce page_shift in sPAPRTCETable
On 22.05.14 01:45, Alexey Kardashevskiy wrote: On 05/22/2014 08:11 AM, Alexander Graf wrote: On 21.05.14 16:21, Alexey Kardashevskiy wrote: At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR spec allows other page sizes and we are going to implement them, we need page size to be configrable. This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT with it whereever it is possible. This removes SPAPR_TCE_PAGE_MASK as it is no longer used. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/ppc/spapr_iommu.c | 54 +- hw/ppc/spapr_pci.c | 1 + hw/ppc/spapr_vio.c | 1 + include/hw/ppc/spapr.h | 3 ++- 4 files changed, 35 insertions(+), 24 deletions(-) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index 90de3e3..e765a6d 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -70,12 +70,13 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr) if (tcet-bypass) { ret.perm = IOMMU_RW; -} else if ((addr SPAPR_TCE_PAGE_SHIFT) tcet-nb_table) { +} else if ((addr tcet-page_shift) tcet-nb_table) { /* Check if we are in bound */ -tce = tcet-table[addr SPAPR_TCE_PAGE_SHIFT]; -ret.iova = addr ~SPAPR_TCE_PAGE_MASK; -ret.translated_addr = tce ~SPAPR_TCE_PAGE_MASK; -ret.addr_mask = SPAPR_TCE_PAGE_MASK; +target_ulong mask = ~((1 tcet-page_shift) - 1); Why target_ulong? This should be u64 or hwaddr or something along those lines, no? Also, can the mask grow bigger than 31bits? If so you probably want 1ULL (below as well). In fact, we might be better off with a few more fields to tcet. Just add page_mask and page_size in addition to the page_shift one and use them instead of calculating them over and over again. +tce = tcet-table[addr tcet-page_shift]; +ret.iova = addr mask; +ret.translated_addr = tce mask; +ret.addr_mask = ~mask; ret.perm = tce; } trace_spapr_iommu_xlate(tcet-liobn, addr, ret.iova, ret.perm, @@ -113,7 +114,7 @@ static int spapr_tce_table_realize(DeviceState *dev) if (kvm_enabled()) { tcet-table = kvmppc_create_spapr_tce(tcet-liobn, tcet-nb_table - SPAPR_TCE_PAGE_SHIFT, + tcet-page_shift, tcet-fd); } @@ -133,6 +134,7 @@ static int spapr_tce_table_realize(DeviceState *dev) } sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, + uint32_t page_shift, uint32_t nb_table) { sPAPRTCETable *tcet; @@ -149,6 +151,7 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, tcet = SPAPR_TCE_TABLE(object_new(TYPE_SPAPR_TCE_TABLE)); tcet-liobn = liobn; +tcet-page_shift = page_shift; tcet-nb_table = nb_table; object_property_add_child(OBJECT(owner), tce-table, OBJECT(tcet), NULL); @@ -194,19 +197,20 @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba, target_ulong tce) { IOMMUTLBEntry entry; +target_ulong mask = ~((1 tcet-page_shift) - 1); -if ((ioba SPAPR_TCE_PAGE_SHIFT) = tcet-nb_table) { +if ((ioba tcet-page_shift) = tcet-nb_table) { hcall_dprintf(spapr_vio_put_tce on out-of-bounds IOBA 0x TARGET_FMT_lx \n, ioba); return H_PARAMETER; } -tcet-table[ioba SPAPR_TCE_PAGE_SHIFT] = tce; +tcet-table[ioba tcet-page_shift] = tce; entry.target_as = address_space_memory, -entry.iova = ioba ~SPAPR_TCE_PAGE_MASK; -entry.translated_addr = tce ~SPAPR_TCE_PAGE_MASK; -entry.addr_mask = SPAPR_TCE_PAGE_MASK; +entry.iova = ioba mask; +entry.translated_addr = tce mask; +entry.addr_mask = ~mask; entry.perm = tce; memory_region_notify_iommu(tcet-iommu, entry); @@ -226,6 +230,7 @@ static target_ulong h_put_tce_indirect(PowerPCCPU *cpu, target_ulong ret = H_PARAMETER; sPAPRTCETable *tcet = spapr_tce_find_by_liobn(liobn); CPUState *cs = CPU(cpu); +target_ulong mask; if (!tcet) { return H_PARAMETER; @@ -235,12 +240,14 @@ static target_ulong h_put_tce_indirect(PowerPCCPU *cpu, return H_PARAMETER; } -ioba = ~SPAPR_TCE_PAGE_MASK; -tce_list = ~SPAPR_TCE_PAGE_MASK; +mask = ~((1 tcet-page_shift) - 1); +ioba = mask; + +for (i = 0; i npages; ++i, ioba += (1 tcet-page_shift)) { +target_ulong off = (tce_list ~SPAPR_TCE_RW) + +i * sizeof(target_ulong); +target_ulong tce = ldq_phys(cs-as, off); -for (i = 0; i npages; ++i, ioba
Re: [Qemu-devel] [PATCH v2 1/2] openpic: Move definition of openpic_reset
On 22.05.14 06:46, Paul Janzen wrote: This patch moves the definition of openpic_reset after the various register read/write functions. No functional change. It is in preparation for using the register read/write functions in openpic_reset. Signed-off-by: Paul Janzen p...@pauljanzen.org Thanks, applied both to ppc-next. Alex
Re: [Qemu-devel] [Qemu-ppc] [PATCH] openpic: Initialize destmask at reset
On 21.05.14 21:47, Alexander Graf wrote: Am 21.05.2014 um 18:33 schrieb Paul Janzen p...@pauljanzen.org: On Wed, May 21, 2014 at 12:45 AM, Alexander Graf ag...@suse.de wrote: Or maybe it's safer overall to just call write_IRQreg_idr() instead of setting idr directly? That would update destmask along the way as well and we would catch all subtle corner cases. I considered this initially but decided to go with the simpler approach. But you have convinced me. Specifically, the current implementation does not reset src-output or src-nomask, which write_IRQreg_idr() will. I will test this change and re-submit the patch. Do you have a simple test case for this patch? We seem to have the same bug in the in-kernel KVM MPIC emulation code and I'd like to have it fixed there as well, but I don't really like to do that change blindly. I have tested this patch using a proprietary embedded operating system which doesn't bother to initialize the IDR registers. Do you think it'd be possible for you to test whether it works with kvm when I give you a patch? Do you have real hardware to run kvm on? :) I just checked - we don't properly implement reset at all in the in-kernel MPIC we do in KVM. So it's merely something to keep in mind for now until someone sits down and implements proper reset support ;). Alex
Re: [Qemu-devel] [PATCH 4/7] virtio-blk: use aliases instead of duplicate qdev properties
On Thu, May 22, 2014 at 12:04 AM, Paolo Bonzini pbonz...@redhat.com wrote: Il 21/05/2014 22:22, Stefan Hajnoczi ha scritto: virtio-blk-pci, virtio-blk-s390, and virtio-blk-ccw all duplicate the qdev properties of their VirtIOBlock child. This approach does not work well with string or pointer properties since we must be careful about leaking or double-freeing them. Use the QOM alias property to forward property accesses to the VirtIOBlock child. This way no duplication is necessary. Remember to stop calling virtio_blk_set_conf() so that we don't clobber the values already set on the VirtIOBlock instance. Which properties are _not_ being added? This is probably needed for all other virtio devices so a generic solution would be nice. I think your idea is something like: def qdev_add_alias_properties(obj, dev): for qdev_prop in dev: object_property_add_alias(obj, qdev_prop.name, dev, qdev_prop.name) That way we don't explicitly list all properties and duplicate code when fixing up net, rng, serial, scsi, etc. I took a quick look at net, rng, and serial. This approach should work because the VirtIODevice qdev properties need to be exposed wholesale on the transport device. If that's what you're hinting at, I'll try it in v2. Stefan
Re: [Qemu-devel] [PATCH 4/7] virtio-blk: use aliases instead of duplicate qdev properties
Am 22.05.2014 00:04, schrieb Paolo Bonzini: Il 21/05/2014 22:22, Stefan Hajnoczi ha scritto: virtio-blk-pci, virtio-blk-s390, and virtio-blk-ccw all duplicate the qdev properties of their VirtIOBlock child. This approach does not work well with string or pointer properties since we must be careful about leaking or double-freeing them. Use the QOM alias property to forward property accesses to the VirtIOBlock child. This way no duplication is necessary. Remember to stop calling virtio_blk_set_conf() so that we don't clobber the values already set on the VirtIOBlock instance. Which properties are _not_ being added? This is probably needed for all other virtio devices so a generic solution would be nice. type, realized and the child property for VirtIODevice come to mind, possibly one or two more. If we follow a generic scheme, we could add an .instance_post_init hook for VirtIOPCIProxy iterating over all properties and blacklisting some. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 1/7] qom: add object_property_add_alias()
On Thu, May 22, 2014 at 12:05 AM, Paolo Bonzini pbonz...@redhat.com wrote: Il 21/05/2014 22:22, Stefan Hajnoczi ha scritto: +void object_property_add_alias(Object *obj, const char *name, + Object *target_obj, const char *target_name, + Error **errp) +{ +AliasProperty *prop; +ObjectProperty *target_prop; + +target_prop = object_property_find(target_obj, target_name, errp); +if (!target_prop) { +return; +} + +prop = g_malloc(sizeof(*prop)); +prop-target_obj = target_obj; +prop-target_name = target_name; + +object_property_add(obj, name, target_prop-type, +property_get_alias, +property_set_alias, +property_release_alias, +prop, errp); +} ref target_obj here, and unref in property_release_alias? No. I originally did this but realized it is probably not a good idea. 1. If you want to alias a property on the same object instance (foo.a - foo.b) this would create a refcount leak (unless someone explicitly deletes this property to bring the refcount back to 1). 2. The virtio callers already have a child reference and I suspect other callers would too. Therefore we don't need to do extra refcounting. I had an intermediate version of this patch with a flag so you could tell object_property_add_alias() whether or not to ref the target object. But in the end it seems like overengineering since the refcount case is rare or non-existent. The refcount is a convenience feature just for the case where you want to alias to a random object that you don't otherwise hold a refcount on (I couldn't think of an example). Do you see what I mean? Stefan
Re: [Qemu-devel] [PATCH 4/7] virtio-blk: use aliases instead of duplicate qdev properties
On Thu, May 22, 2014 at 12:18 PM, Andreas Färber afaer...@suse.de wrote: Am 22.05.2014 00:04, schrieb Paolo Bonzini: Il 21/05/2014 22:22, Stefan Hajnoczi ha scritto: virtio-blk-pci, virtio-blk-s390, and virtio-blk-ccw all duplicate the qdev properties of their VirtIOBlock child. This approach does not work well with string or pointer properties since we must be careful about leaking or double-freeing them. Use the QOM alias property to forward property accesses to the VirtIOBlock child. This way no duplication is necessary. Remember to stop calling virtio_blk_set_conf() so that we don't clobber the values already set on the VirtIOBlock instance. Which properties are _not_ being added? This is probably needed for all other virtio devices so a generic solution would be nice. type, realized and the child property for VirtIODevice come to mind, possibly one or two more. If we follow a generic scheme, we could add an .instance_post_init hook for VirtIOPCIProxy iterating over all properties and blacklisting some. I think the trick is to alias all the qdev properties, not the QOM ones. That way we get all the explicitly declared properties and none of the implicit ones. Stefan
Re: [Qemu-devel] [PATCH 8/9] spapr_iommu: Introduce page_shift in sPAPRTCETable
On 05/22/2014 08:09 PM, Alexander Graf wrote: On 22.05.14 01:45, Alexey Kardashevskiy wrote: On 05/22/2014 08:11 AM, Alexander Graf wrote: On 21.05.14 16:21, Alexey Kardashevskiy wrote: At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR spec allows other page sizes and we are going to implement them, we need page size to be configrable. This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT with it whereever it is possible. This removes SPAPR_TCE_PAGE_MASK as it is no longer used. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/ppc/spapr_iommu.c | 54 +- hw/ppc/spapr_pci.c | 1 + hw/ppc/spapr_vio.c | 1 + include/hw/ppc/spapr.h | 3 ++- 4 files changed, 35 insertions(+), 24 deletions(-) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index 90de3e3..e765a6d 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -70,12 +70,13 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr) if (tcet-bypass) { ret.perm = IOMMU_RW; -} else if ((addr SPAPR_TCE_PAGE_SHIFT) tcet-nb_table) { +} else if ((addr tcet-page_shift) tcet-nb_table) { /* Check if we are in bound */ -tce = tcet-table[addr SPAPR_TCE_PAGE_SHIFT]; -ret.iova = addr ~SPAPR_TCE_PAGE_MASK; -ret.translated_addr = tce ~SPAPR_TCE_PAGE_MASK; -ret.addr_mask = SPAPR_TCE_PAGE_MASK; +target_ulong mask = ~((1 tcet-page_shift) - 1); Why target_ulong? This should be u64 or hwaddr or something along those lines, no? Also, can the mask grow bigger than 31bits? If so you probably want 1ULL (below as well). In fact, we might be better off with a few more fields to tcet. Just add page_mask and page_size in addition to the page_shift one and use them instead of calculating them over and over again. +tce = tcet-table[addr tcet-page_shift]; +ret.iova = addr mask; +ret.translated_addr = tce mask; +ret.addr_mask = ~mask; ret.perm = tce; } trace_spapr_iommu_xlate(tcet-liobn, addr, ret.iova, ret.perm, @@ -113,7 +114,7 @@ static int spapr_tce_table_realize(DeviceState *dev) if (kvm_enabled()) { tcet-table = kvmppc_create_spapr_tce(tcet-liobn, tcet-nb_table - SPAPR_TCE_PAGE_SHIFT, + tcet-page_shift, tcet-fd); } @@ -133,6 +134,7 @@ static int spapr_tce_table_realize(DeviceState *dev) } sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, + uint32_t page_shift, uint32_t nb_table) { sPAPRTCETable *tcet; @@ -149,6 +151,7 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, tcet = SPAPR_TCE_TABLE(object_new(TYPE_SPAPR_TCE_TABLE)); tcet-liobn = liobn; +tcet-page_shift = page_shift; tcet-nb_table = nb_table; object_property_add_child(OBJECT(owner), tce-table, OBJECT(tcet), NULL); @@ -194,19 +197,20 @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba, target_ulong tce) { IOMMUTLBEntry entry; +target_ulong mask = ~((1 tcet-page_shift) - 1); -if ((ioba SPAPR_TCE_PAGE_SHIFT) = tcet-nb_table) { +if ((ioba tcet-page_shift) = tcet-nb_table) { hcall_dprintf(spapr_vio_put_tce on out-of-bounds IOBA 0x TARGET_FMT_lx \n, ioba); return H_PARAMETER; } -tcet-table[ioba SPAPR_TCE_PAGE_SHIFT] = tce; +tcet-table[ioba tcet-page_shift] = tce; entry.target_as = address_space_memory, -entry.iova = ioba ~SPAPR_TCE_PAGE_MASK; -entry.translated_addr = tce ~SPAPR_TCE_PAGE_MASK; -entry.addr_mask = SPAPR_TCE_PAGE_MASK; +entry.iova = ioba mask; +entry.translated_addr = tce mask; +entry.addr_mask = ~mask; entry.perm = tce; memory_region_notify_iommu(tcet-iommu, entry); @@ -226,6 +230,7 @@ static target_ulong h_put_tce_indirect(PowerPCCPU *cpu, target_ulong ret = H_PARAMETER; sPAPRTCETable *tcet = spapr_tce_find_by_liobn(liobn); CPUState *cs = CPU(cpu); +target_ulong mask; if (!tcet) { return H_PARAMETER; @@ -235,12 +240,14 @@ static target_ulong h_put_tce_indirect(PowerPCCPU *cpu, return H_PARAMETER; } -ioba = ~SPAPR_TCE_PAGE_MASK; -tce_list = ~SPAPR_TCE_PAGE_MASK; +mask = ~((1 tcet-page_shift) - 1); +ioba = mask; + +for (i = 0; i npages; ++i, ioba += (1 tcet-page_shift)) { +target_ulong off = (tce_list
Re: [Qemu-devel] [PATCH 4/7] virtio-blk: use aliases instead of duplicate qdev properties
Am 22.05.2014 12:24, schrieb Stefan Hajnoczi: On Thu, May 22, 2014 at 12:18 PM, Andreas Färber afaer...@suse.de wrote: Am 22.05.2014 00:04, schrieb Paolo Bonzini: Il 21/05/2014 22:22, Stefan Hajnoczi ha scritto: virtio-blk-pci, virtio-blk-s390, and virtio-blk-ccw all duplicate the qdev properties of their VirtIOBlock child. This approach does not work well with string or pointer properties since we must be careful about leaking or double-freeing them. Use the QOM alias property to forward property accesses to the VirtIOBlock child. This way no duplication is necessary. Remember to stop calling virtio_blk_set_conf() so that we don't clobber the values already set on the VirtIOBlock instance. Which properties are _not_ being added? This is probably needed for all other virtio devices so a generic solution would be nice. type, realized and the child property for VirtIODevice come to mind, possibly one or two more. If we follow a generic scheme, we could add an .instance_post_init hook for VirtIOPCIProxy iterating over all properties and blacklisting some. I think the trick is to alias all the qdev properties, not the QOM ones. That way we get all the explicitly declared properties and none of the implicit ones. I wouldn't oppose that, but you then need to iterate over parent classes until you hit VirtioDeviceClass (or DeviceClass?), to avoid properties falling through the cracks. I just figured it easier and in line with your QMP patch to avoid distinguishing them in new code. But a quick solution is more important than futureproofness here, so I'll take or ack whatever works here. Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 8/9] spapr_iommu: Introduce page_shift in sPAPRTCETable
On 22.05.14 12:24, Alexey Kardashevskiy wrote: On 05/22/2014 08:09 PM, Alexander Graf wrote: On 22.05.14 01:45, Alexey Kardashevskiy wrote: On 05/22/2014 08:11 AM, Alexander Graf wrote: On 21.05.14 16:21, Alexey Kardashevskiy wrote: At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR spec allows other page sizes and we are going to implement them, we need page size to be configrable. This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT with it whereever it is possible. This removes SPAPR_TCE_PAGE_MASK as it is no longer used. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru [...] diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index fdd4c07..c9850d4 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -656,6 +656,7 @@ static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp) sPAPRTCETable *tcet; tcet = spapr_tce_new_table(DEVICE(sphb), sphb-dma_liobn, + SPAPR_TCE_PAGE_SHIFT, 0x4000 SPAPR_TCE_PAGE_SHIFT); if (!tcet) { error_setg(errp, Unable to create TCE table for %s, diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c index b84e481..d7e9e6a 100644 --- a/hw/ppc/spapr_vio.c +++ b/hw/ppc/spapr_vio.c @@ -457,6 +457,7 @@ static int spapr_vio_busdev_init(DeviceState *qdev) if (pc-rtce_window_size) { uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev-reg; dev-tcet = spapr_tce_new_table(qdev, liobn, +SPAPR_TCE_PAGE_SHIFT, I don't quite understand who defines what the TCE page size is for a given device. Can you try to explain this to me? If it is default window (for PCI) or window for VIO - it is 4K. If it is a dynamic DMA window - page size is a parameter of RTAS call which creates the window. Could we change that default size for non-dynamic windows somehow? 4k is really fine grained. No, this is hardcoded in a million places and old distros. Normally it is maximum 1GB or 2MB table, not too bad. And with DDW support, we can make the default one really small (64MB?) and lose even less. SPAPR: R1–7.3.31–4. For the Dynamic DMA Windows (DDW) option: The platform must provide a default DMA window for each PE, and all of the following must be true: a. The window is defined by the “ibm,dma-window” property in the OF device tree. b. The window is defined with 4 KB I/O pages. c. The window is located entirely below 4 GB. Cool, would be interesting to see how that affects performance. I like most of this patch set btw, I think it should be good to go in in the next round. Alex
Re: [Qemu-devel] [PATCH v2] spapr: Enable dynamic change of the supported hypercalls list
On 21.05.14 17:21, Alexey Kardashevskiy wrote: At the moment the ibm,hypertas-functions list is fixed. However some calls should be listed there if they are supported by QEMU or the host kernel. This enables hyperrtas_prop to grow on stack by adding a SPAPR_HYPERRTAS_ADD macro. qemu,hypertas-functions is converted as well. The first user of this is going to be a multi-tce property. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- Changes: v2: * replaced alloca() with GString --- hw/ppc/spapr.c | 30 +++--- 1 file changed, 23 insertions(+), 7 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 0a61246..3b28211 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -293,6 +293,10 @@ static size_t create_page_sizes_prop(CPUPPCState *env, uint32_t *prop, } \ } while (0) +static inline void add_str(GString *s, const gchar *s1) Please remove the inline :). Otherwise this looks a lot nicer than before :) Alex
Re: [Qemu-devel] [PATCH 8/9] spapr_iommu: Introduce page_shift in sPAPRTCETable
On 05/22/2014 08:45 PM, Alexander Graf wrote: On 22.05.14 12:24, Alexey Kardashevskiy wrote: On 05/22/2014 08:09 PM, Alexander Graf wrote: On 22.05.14 01:45, Alexey Kardashevskiy wrote: On 05/22/2014 08:11 AM, Alexander Graf wrote: On 21.05.14 16:21, Alexey Kardashevskiy wrote: At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR spec allows other page sizes and we are going to implement them, we need page size to be configrable. This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT with it whereever it is possible. This removes SPAPR_TCE_PAGE_MASK as it is no longer used. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru [...] diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index fdd4c07..c9850d4 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -656,6 +656,7 @@ static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp) sPAPRTCETable *tcet; tcet = spapr_tce_new_table(DEVICE(sphb), sphb-dma_liobn, + SPAPR_TCE_PAGE_SHIFT, 0x4000 SPAPR_TCE_PAGE_SHIFT); if (!tcet) { error_setg(errp, Unable to create TCE table for %s, diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c index b84e481..d7e9e6a 100644 --- a/hw/ppc/spapr_vio.c +++ b/hw/ppc/spapr_vio.c @@ -457,6 +457,7 @@ static int spapr_vio_busdev_init(DeviceState *qdev) if (pc-rtce_window_size) { uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev-reg; dev-tcet = spapr_tce_new_table(qdev, liobn, +SPAPR_TCE_PAGE_SHIFT, I don't quite understand who defines what the TCE page size is for a given device. Can you try to explain this to me? If it is default window (for PCI) or window for VIO - it is 4K. If it is a dynamic DMA window - page size is a parameter of RTAS call which creates the window. Could we change that default size for non-dynamic windows somehow? 4k is really fine grained. No, this is hardcoded in a million places and old distros. Normally it is maximum 1GB or 2MB table, not too bad. And with DDW support, we can make the default one really small (64MB?) and lose even less. SPAPR: R1–7.3.31–4. For the Dynamic DMA Windows (DDW) option: The platform must provide a default DMA window for each PE, and all of the following must be true: a. The window is defined by the “ibm,dma-window” property in the OF device tree. b. The window is defined with 4 KB I/O pages. c. The window is located entirely below 4 GB. Cool, would be interesting to see how that affects performance. I like most of this patch set btw, I think it should be good to go in in the next round. 10Gbit ethernet is ok but 40Gbit seems to suffer from this limitation. -- Alexey
Re: [Qemu-devel] [PATCH 8/9] spapr_iommu: Introduce page_shift in sPAPRTCETable
On 22.05.14 12:46, Alexey Kardashevskiy wrote: On 05/22/2014 08:45 PM, Alexander Graf wrote: On 22.05.14 12:24, Alexey Kardashevskiy wrote: On 05/22/2014 08:09 PM, Alexander Graf wrote: On 22.05.14 01:45, Alexey Kardashevskiy wrote: On 05/22/2014 08:11 AM, Alexander Graf wrote: On 21.05.14 16:21, Alexey Kardashevskiy wrote: At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR spec allows other page sizes and we are going to implement them, we need page size to be configrable. This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT with it whereever it is possible. This removes SPAPR_TCE_PAGE_MASK as it is no longer used. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru [...] diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index fdd4c07..c9850d4 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -656,6 +656,7 @@ static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp) sPAPRTCETable *tcet; tcet = spapr_tce_new_table(DEVICE(sphb), sphb-dma_liobn, + SPAPR_TCE_PAGE_SHIFT, 0x4000 SPAPR_TCE_PAGE_SHIFT); if (!tcet) { error_setg(errp, Unable to create TCE table for %s, diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c index b84e481..d7e9e6a 100644 --- a/hw/ppc/spapr_vio.c +++ b/hw/ppc/spapr_vio.c @@ -457,6 +457,7 @@ static int spapr_vio_busdev_init(DeviceState *qdev) if (pc-rtce_window_size) { uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev-reg; dev-tcet = spapr_tce_new_table(qdev, liobn, +SPAPR_TCE_PAGE_SHIFT, I don't quite understand who defines what the TCE page size is for a given device. Can you try to explain this to me? If it is default window (for PCI) or window for VIO - it is 4K. If it is a dynamic DMA window - page size is a parameter of RTAS call which creates the window. Could we change that default size for non-dynamic windows somehow? 4k is really fine grained. No, this is hardcoded in a million places and old distros. Normally it is maximum 1GB or 2MB table, not too bad. And with DDW support, we can make the default one really small (64MB?) and lose even less. SPAPR: R1–7.3.31–4. For the Dynamic DMA Windows (DDW) option: The platform must provide a default DMA window for each PE, and all of the following must be true: a. The window is defined by the “ibm,dma-window” property in the OF device tree. b. The window is defined with 4 KB I/O pages. c. The window is located entirely below 4 GB. Cool, would be interesting to see how that affects performance. I like most of this patch set btw, I think it should be good to go in in the next round. 10Gbit ethernet is ok but 40Gbit seems to suffer from this limitation. Oh, I was implying that it'd be interesting to see how we fare when we do a really small default DMA window to force the guest to use DDW and preferably even with very large pages. Alex
Re: [Qemu-devel] [RFC 0/3] cpu: add device_add foo-x86_64-cpu support
Hi, Am 22.05.2014 04:33, schrieb chen.fan.f...@cn.fujitsu.com: I think if we want to use 'device/device_add' to implement CPU, we must do some check before qemu_init_vcpu(). how can we to do that? We ran into such problems before... If need be, we can change from the old parent_realize scheme to the base class calling the derived realize function in-order, or we can add new hooks to CPUClass as necessary. Consider me a bit skeptical about MAX_CPUMASK_BITS in 1/3. This should at least be tied to the maximum allowed for QEMUMachine/MachineClass rather than hardcoded to 255, which people may forget to synchronize. There was a recent attempt to increase the limits. 2/3 looks good apart from the subject; I could cherry-pick that, seeing there is Reviewed-by, if you like. 3/3 is doing a bit much to digest at once for my taste. Regards, Andreas On Tue, 2014-05-13 at 18:08 +0800, Chen Fan wrote: this patches tried to make cpu hotplug with device_add, and made -device foo-x86_64-cpu available,also we can set apic-id property with command line, if without setting apic-id property, we added first unoccupied apic id as the default new apic id. and hotplug cpu with device_add, we must make check of APIC ID after cpu object initialization that was different from 'cpu_add' command which check 'ids' at the beginning. Chen Fan (3): using CPUMASK bitmaps to calculate cpu index cpu: introduce CpuTopoInfo structure for argument simplification cpu: add device_add foo-x86_64-cpu support exec.c | 9 +++-- include/qom/cpu.h | 11 ++ include/sysemu/sysemu.h | 7 qdev-monitor.c | 11 ++ target-i386/cpu.c | 91 - target-i386/topology.h | 51 ++- 6 files changed, 151 insertions(+), 29 deletions(-) -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD
-Original Message- From: Gerd Hoffmann [mailto:kra...@redhat.com] Sent: Thursday, May 22, 2014 2:45 PM To: Chen, Tiejun Cc: Anthony PERARD; Daniel P. Berrange; peter.mayd...@linaro.org; xen-de...@lists.xensource.com; m...@redhat.com; stefano.stabell...@eu.citrix.com; Kay, Allen M; kelly.zyta...@amd.com; qemu-devel@nongnu.org; Zhang, Yang Z; anth...@codemonkey.ws Subject: Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD Hi, Another useful thing would be to not create the xen platform device in case -nodefaults was specified on the command line (that switch turns off a bunch of other devices present by default: vga, nic, cdrom, ...). Currently looks 'xen-platform' itself can't be created, not those devices existed on that. The error message looks more like libxl tries to hot-unplug the xen platform device. Attached patch (untested!) hooks up the xen platform device to the default device code we have in qemu. Two effects: (1) As mentioned above the device will not be created in case -nodefaults is specified on the command line. (2) Autocreating the device is also turned off in case xen-platform is added manually via -device. With the patch applied you should be able to move the xen-platform device to some other place with a simple 'qemu -device xen-platform,addr=$slot'. Gerd, Sorry, I may misunderstand what you mean previously then have a wrong test. So this still doesn't work actually. After applied your patch, 'xen-platform' is always disabled by default, right? So 00:02.0 is left naturally to be assigned to IGD as we expect like this, tchen0@tchen0-HVM-domU:~$ lspci 00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06) 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03) 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06) 00:03.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 04) 00:1f.0 ISA bridge: Intel Corporation Q87 Express LPC Controller (rev 04) Then this is fine but if you intend to add a 'qemu -device xen-platform,addr=$slot', this doesn't work well. In my case, gfx_passthru=1 pci=[00:02.0@2, 00:1a.0] xen_platform_pci=0 device_model_args_hvm = ['-device', 'xen-platform,addr=0x3'] tchen0@tchen0-linux:~/workspace$ sudo xl cr domu-cfg Parsing config from domu-cfg libxl: error: libxl_qmp.c:287:qmp_handle_error_response: received an error message from QMP server: Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set libxl: error: libxl_create.c:1277:domcreate_attach_pci: libxl_device_pci_add failed: -3 Thanks Tiejun
[Qemu-devel] [PATCH 1/2] vmstate: Add helper to enable GHashTable migration
This adds a VMSTATE_HASH_V macro. This implements put/get callbacks for it. This implements a qemu_hash_init() wrapper to save key/value sizes. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- include/migration/vmstate.h | 10 + include/qemu-common.h | 13 +++ vmstate.c | 54 + 3 files changed, 77 insertions(+) diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h index 7e45048..6af599d 100644 --- a/include/migration/vmstate.h +++ b/include/migration/vmstate.h @@ -166,6 +166,7 @@ extern const VMStateInfo vmstate_info_timer; extern const VMStateInfo vmstate_info_buffer; extern const VMStateInfo vmstate_info_unused_buffer; extern const VMStateInfo vmstate_info_bitmap; +static const VMStateInfo vmstate_info_hash; #define type_check_2darray(t1,t2,n,m) ((t1(*)[n][m])0 - (t2*)0) #define type_check_array(t1,t2,n) ((t1(*)[n])0 - (t2*)0) @@ -740,6 +741,15 @@ extern const VMStateInfo vmstate_info_bitmap; #define VMSTATE_BUFFER_UNSAFE(_field, _state, _version, _size)\ VMSTATE_BUFFER_UNSAFE_INFO(_field, _state, _version, vmstate_info_buffer, _size) +#define VMSTATE_HASH_V(_field, _state, _version) { \ +.name = (stringify(_field)), \ +.version_id = (_version), \ +.size = sizeof(qemu_hash),\ +.info = vmstate_info_hash, \ +.flags = VMS_SINGLE, \ +.offset = vmstate_offset_value(_state, _field, qemu_hash), \ +} + #define VMSTATE_UNUSED_V(_v, _size) \ VMSTATE_UNUSED_BUFFER(NULL, _v, _size) diff --git a/include/qemu-common.h b/include/qemu-common.h index 3f3fd60..ee973e7 100644 --- a/include/qemu-common.h +++ b/include/qemu-common.h @@ -462,4 +462,17 @@ int parse_debug_env(const char *name, int max, int initial); const char *qemu_ether_ntoa(const MACAddr *mac); +typedef struct qemu_hash { +GHashTable *hash; +int key_size; +int value_size; +} qemu_hash; + +static inline void qemu_hash_init(qemu_hash *h, int key_size, int value_size) +{ +h-key_size = key_size; +h-value_size = value_size; +h-hash = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free); +} + #endif diff --git a/vmstate.c b/vmstate.c index b5882fa..2148e73 100644 --- a/vmstate.c +++ b/vmstate.c @@ -667,3 +667,57 @@ const VMStateInfo vmstate_info_bitmap = { .get = get_bitmap, .put = put_bitmap, }; + +/* Save restore for qemu_hash which is a wrapper over GHashTable */ +static int get_hash(QEMUFile *f, void *pv, size_t size) +{ +qemu_hash *h = pv; +uint32_t num = g_hash_table_size(h-hash); +gpointer key, value; + +num = qemu_get_be32(f); + +for ( ; num; --num) { +int i; + +key = g_malloc0(h-key_size); +for (i = 0; i h-key_size; ++i) { +((uint8_t *)key)[i] = qemu_get_byte(f); +} +value = g_malloc0(h-value_size); +for (i = 0; i h-value_size; ++i) { +((uint8_t *)value)[i] = qemu_get_byte(f); +} +g_hash_table_insert(h-hash, key, value); +} + +return 0; +} + +static void put_hash(QEMUFile *f, void *pv, size_t size) +{ +qemu_hash *h = pv; +uint32_t num = g_hash_table_size(h-hash); +GHashTableIter iter; +gpointer key, value; + +qemu_put_be32(f, num); + +g_hash_table_iter_init(iter, h-hash); +while (g_hash_table_iter_next (iter, key, value)) { +int i; + +for (i = 0; i h-key_size; ++i) { +qemu_put_byte(f, ((uint8_t *)key)[i]); +} +for (i = 0; i h-value_size; ++i) { +qemu_put_byte(f, ((uint8_t *)value)[i]); +} +} +} + +static const VMStateInfo vmstate_info_hash = { +.name = qemu-hash-desc, +.get = get_hash, +.put = put_hash, +}; -- 1.9.rc0
[Qemu-devel] [PATCH 2/2] spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
Currently SPAPR PHB keeps track of all allocated MSI/MISX interrupt as XICS used to be unable to reuse interrupts which becomes a problem for dynamic MSI reconfiguration which is happening on guest driver reload or PCI hot (un)plug. Another problem is that PHB has a limit of devices supporting MSI/MSIX (SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that. This makes use of new XICS ability to reuse interrupts. This removes cached MSI configuration from SPAPR PHB so the first IRQ number of a device is stored in MSI/MSIX config space so there is no need to store this anywhere else. From now on, SPAPR PHB only keeps flags telling what type of interrupt for which device it has configured in order to return error if (for example) MSIX was enabled and the guest is trying to disable MSI which it has not enabled. This removes a limit for the maximum number of MSIX-enabled devices per PHB, now XICS and PCI bus capacity are the only limitation. This changes migration stream as it fixes vmstate_spapr_pci_msi::name which was wrong since the beginning. This fixed traces to be more informative. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/ppc/spapr_pci.c | 144 +--- include/hw/pci-host/spapr.h | 13 ++-- trace-events| 5 +- 3 files changed, 64 insertions(+), 98 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index e574014..6dc0c50 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -220,36 +220,12 @@ static void rtas_write_pci_config(PowerPCCPU *cpu, sPAPREnvironment *spapr, } /* - * Find an entry with config_addr or returns the empty one if not found AND - * alloc_new is set. - * At the moment the msi_table entries are never released so there is - * no point to look till the end of the list if we need to find the free entry. - */ -static int spapr_msicfg_find(sPAPRPHBState *phb, uint32_t config_addr, - bool alloc_new) -{ -int i; - -for (i = 0; i SPAPR_MSIX_MAX_DEVS; ++i) { -if (!phb-msi_table[i].nvec) { -break; -} -if (phb-msi_table[i].config_addr == config_addr) { -return i; -} -} -if ((i SPAPR_MSIX_MAX_DEVS) alloc_new) { -trace_spapr_pci_msi(Allocating new MSI config, i, config_addr); -return i; -} - -return -1; -} - -/* * Set MSI/MSIX message data. * This is required for msi_notify()/msix_notify() which * will write at the addresses via spapr_msi_write(). + * + * If hwaddr == 0, all entries will have .data == first_irq i.e. + * table will be reset. */ static void spapr_msi_setmsg(PCIDevice *pdev, hwaddr addr, bool msix, unsigned first_irq, unsigned req_num) @@ -263,9 +239,12 @@ static void spapr_msi_setmsg(PCIDevice *pdev, hwaddr addr, bool msix, return; } -for (i = 0; i req_num; ++i, ++msg.data) { +for (i = 0; i req_num; ++i) { msix_set_message(pdev, i, msg); trace_spapr_pci_msi_setup(pdev-name, i, msg.address); +if (addr) { +++msg.data; +} } } @@ -280,9 +259,12 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPREnvironment *spapr, unsigned int req_num = rtas_ld(args, 4); /* 0 == remove all */ unsigned int seq_num = rtas_ld(args, 5); unsigned int ret_intr_type; -int ndev, irq, max_irqs = 0; +unsigned int irq, max_irqs = 0, num = 0; sPAPRPHBState *phb = NULL; PCIDevice *pdev = NULL; +bool msix = false; +spapr_pci_msi *msi; +int *config_addr_key; switch (func) { case RTAS_CHANGE_MSI_FN: @@ -310,13 +292,18 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPREnvironment *spapr, /* Releasing MSIs */ if (!req_num) { -ndev = spapr_msicfg_find(phb, config_addr, false); -if (ndev 0) { -trace_spapr_pci_msi(MSI has not been enabled, -1, config_addr); +msi = (spapr_pci_msi *) g_hash_table_lookup(phb-msi.hash, config_addr); +if (!msi) { +trace_spapr_pci_msi(Releasing wrong config, config_addr); rtas_st(rets, 0, RTAS_OUT_HW_ERROR); return; } -trace_spapr_pci_msi(Released MSIs, ndev, config_addr); + +xics_free(spapr-icp, msi-first_irq, msi-num); +spapr_msi_setmsg(pdev, 0, msix, 0, num); +g_hash_table_remove(phb-msi.hash, config_addr); + +trace_spapr_pci_msi(Released MSIs, config_addr); rtas_st(rets, 0, RTAS_OUT_SUCCESS); rtas_st(rets, 1, 0); return; @@ -324,15 +311,6 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPREnvironment *spapr, /* Enabling MSI */ -/* Find a device number in the map to add or reuse the existing one */ -ndev = spapr_msicfg_find(phb, config_addr, true); -if (ndev = SPAPR_MSIX_MAX_DEVS || ndev 0) { -error_report(No free entry for a new MSI device); -
Re: [Qemu-devel] [PATCH v2 8/8] spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
On 05/22/2014 05:16 PM, Alexander Graf wrote: Am 22.05.2014 um 08:53 schrieb Alexey Kardashevskiy a...@ozlabs.ru: On 05/21/2014 10:42 PM, Alexey Kardashevskiy wrote: On 05/21/2014 08:35 PM, Alexander Graf wrote: On 21.05.14 12:13, Alexey Kardashevskiy wrote: On 05/21/2014 07:50 PM, Alexander Graf wrote: On 21.05.14 11:33, Alexey Kardashevskiy wrote: On 05/21/2014 07:13 PM, Alexander Graf wrote: On 21.05.14 11:11, Michael S. Tsirkin wrote: On Wed, May 21, 2014 at 11:06:09AM +0200, Alexander Graf wrote: On 21.05.14 10:52, Alexey Kardashevskiy wrote: On 05/21/2014 06:40 PM, Alexander Graf wrote: On 15.05.14 11:59, Alexey Kardashevskiy wrote: Currently SPAPR PHB keeps track of all allocated MSI/MISX interrupt as XICS used to be unable to reuse interrupts which becomes a problem for dynamic MSI reconfiguration which is happening on guest driver reload or PCI hot (un)plug. Another problem is that PHB has a limit of devices supporting MSI/MSIX (SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that. This makes use of new XICS ability to reuse interrupts. This removes cached MSI configuration from SPAPR PHB so the first IRQ number of a device is stored in MSI/MSIX config space so there is no need to store this anywhere else. From now on, SPAPR PHB only keeps flags telling what type of interrupt for which device it has configured in order to return error if (for example) MSIX was enabled and the guest is trying to disable MSI which it has not enabled. This removes a limit for the maximum number of MSIX-enabled devices per PHB, now XICS and PCI bus capacity are the only limitation. This changes migration stream as it fixes vmstate_spapr_pci_msi::name which was wrong since the beginning. This fixed traces to be more informative. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- In reality either MSIX or MSI is enabled, never both. So I could remove msi/msix bitmaps from this patch, would it make sense? Is this a hard requirement? Does a device have to choose between MSIX and MSI or could it theoretically have both enabled? Is this a PCI limitation, a PAPR/XICS limitation or just a limitation of your implementation? My implementation does not have this limitation, I asked if I can simplify code by introducing one :) I cannot see any reason why PCI cannot have both MSI and MSIX enabled but it does not seem to be used by anyone = cannot debug and confirm. PAPR spec assumes that if the guest tries enabling MSIX when MSI is already enabled, this is a change, not enabling both types. But it also says MSI and MSIX vector numbers are not shared. Hm. Yeah, I'm not aware of any limitation on hardware here and I'd rather not impose one. Michael, do you know of any hardware that uses MSI *and* MSI-X at the same time? Alex No, and the PCI spec says: A function is permitted to implement both MSI and MSI-X, but system software is prohibited from enabling both at the same time. If system software enables both at the same time, the result is undefined. Ah, cool. So yes Alexey, feel free to impose it :). Heh. This solves just half of the problem - I still have to keep track of what device got MSI/MSIX configured via that ibm,change-msi interface. I was hoping I can store such flag somewhere in a device PCI config space but MSI/MSIX enable bit is not good as it is not set when those calls are made. And I cannot rely on address/data fields much as the guest can change them (I already use them to store IRQ numbers and btw it is missing checks when I read them back for disposal, I'll fix in next round). Or on enable event I could put IRQ numbers to .data of MSI config space and on disable check if it is not zero, then configuration took place, then I can remove my msi[]/msix[] flag arrays. If the guest did any change to MSI/MSIX config space (it does not on SPAPR except weird selftest cases), I compare .data with what ICS can possibly have and either reject disable or handle it and if it breaks XICS - that's too bad for the stupid guest. Would that be acceptable? Can't you prohibit the guest from writing to the MSI configuration registers itself? Then you don't need to do sanity checks. I could for emulated devices but VFIO uses the same code. For example, there is an IBM SCSI IPR card which does a self test. For that, it saves MSIX BAR content, does reboot via some backdoor interface and restores MSIX BAR. It has been solved for VFIO in the host kernel by restoring MSIX data from cached values when guest is trying to restore it with what it thinks is actual MSIX data (it is virtualized because of x86). But there is cache We already have a cache because we don't access the real PCI registers with msi_set_message(), no? For emulated devices there is no cache. And in any case the guest is allowed to write to it... Who knows what AIX does? I do not. Tried
Re: [Qemu-devel] [PATCH 8/9] spapr_iommu: Introduce page_shift in sPAPRTCETable
On 05/22/2014 08:48 PM, Alexander Graf wrote: On 22.05.14 12:46, Alexey Kardashevskiy wrote: On 05/22/2014 08:45 PM, Alexander Graf wrote: On 22.05.14 12:24, Alexey Kardashevskiy wrote: On 05/22/2014 08:09 PM, Alexander Graf wrote: On 22.05.14 01:45, Alexey Kardashevskiy wrote: On 05/22/2014 08:11 AM, Alexander Graf wrote: On 21.05.14 16:21, Alexey Kardashevskiy wrote: At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR spec allows other page sizes and we are going to implement them, we need page size to be configrable. This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT with it whereever it is possible. This removes SPAPR_TCE_PAGE_MASK as it is no longer used. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru [...] diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index fdd4c07..c9850d4 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -656,6 +656,7 @@ static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp) sPAPRTCETable *tcet; tcet = spapr_tce_new_table(DEVICE(sphb), sphb-dma_liobn, + SPAPR_TCE_PAGE_SHIFT, 0x4000 SPAPR_TCE_PAGE_SHIFT); if (!tcet) { error_setg(errp, Unable to create TCE table for %s, diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c index b84e481..d7e9e6a 100644 --- a/hw/ppc/spapr_vio.c +++ b/hw/ppc/spapr_vio.c @@ -457,6 +457,7 @@ static int spapr_vio_busdev_init(DeviceState *qdev) if (pc-rtce_window_size) { uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev-reg; dev-tcet = spapr_tce_new_table(qdev, liobn, +SPAPR_TCE_PAGE_SHIFT, I don't quite understand who defines what the TCE page size is for a given device. Can you try to explain this to me? If it is default window (for PCI) or window for VIO - it is 4K. If it is a dynamic DMA window - page size is a parameter of RTAS call which creates the window. Could we change that default size for non-dynamic windows somehow? 4k is really fine grained. No, this is hardcoded in a million places and old distros. Normally it is maximum 1GB or 2MB table, not too bad. And with DDW support, we can make the default one really small (64MB?) and lose even less. SPAPR: R1–7.3.31–4. For the Dynamic DMA Windows (DDW) option: The platform must provide a default DMA window for each PE, and all of the following must be true: a. The window is defined by the “ibm,dma-window” property in the OF device tree. b. The window is defined with 4 KB I/O pages. c. The window is located entirely below 4 GB. Cool, would be interesting to see how that affects performance. I like most of this patch set btw, I think it should be good to go in in the next round. 10Gbit ethernet is ok but 40Gbit seems to suffer from this limitation. Oh, I was implying that it'd be interesting to see how we fare when we do a really small default DMA window to force the guest to use DDW and preferably even with very large pages. Guest switches to DDW always if the feature is there. SPAPR DDW API even allows deleting the default one if the platform supports reset (which I do not implement for now). -- Alexey
Re: [Qemu-devel] [PATCH 1/2] vmstate: Add helper to enable GHashTable migration
On 22.05.14 12:53, Alexey Kardashevskiy wrote: This adds a VMSTATE_HASH_V macro. This implements put/get callbacks for it. This implements a qemu_hash_init() wrapper to save key/value sizes. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- include/migration/vmstate.h | 10 + include/qemu-common.h | 13 +++ vmstate.c | 54 + 3 files changed, 77 insertions(+) diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h index 7e45048..6af599d 100644 --- a/include/migration/vmstate.h +++ b/include/migration/vmstate.h @@ -166,6 +166,7 @@ extern const VMStateInfo vmstate_info_timer; extern const VMStateInfo vmstate_info_buffer; extern const VMStateInfo vmstate_info_unused_buffer; extern const VMStateInfo vmstate_info_bitmap; +static const VMStateInfo vmstate_info_hash; #define type_check_2darray(t1,t2,n,m) ((t1(*)[n][m])0 - (t2*)0) #define type_check_array(t1,t2,n) ((t1(*)[n])0 - (t2*)0) @@ -740,6 +741,15 @@ extern const VMStateInfo vmstate_info_bitmap; #define VMSTATE_BUFFER_UNSAFE(_field, _state, _version, _size)\ VMSTATE_BUFFER_UNSAFE_INFO(_field, _state, _version, vmstate_info_buffer, _size) +#define VMSTATE_HASH_V(_field, _state, _version) { \ +.name = (stringify(_field)), \ +.version_id = (_version), \ +.size = sizeof(qemu_hash),\ +.info = vmstate_info_hash, \ +.flags = VMS_SINGLE, \ +.offset = vmstate_offset_value(_state, _field, qemu_hash), \ +} + #define VMSTATE_UNUSED_V(_v, _size) \ VMSTATE_UNUSED_BUFFER(NULL, _v, _size) diff --git a/include/qemu-common.h b/include/qemu-common.h index 3f3fd60..ee973e7 100644 --- a/include/qemu-common.h +++ b/include/qemu-common.h @@ -462,4 +462,17 @@ int parse_debug_env(const char *name, int max, int initial); const char *qemu_ether_ntoa(const MACAddr *mac); +typedef struct qemu_hash { +GHashTable *hash; +int key_size; +int value_size; +} qemu_hash; + +static inline void qemu_hash_init(qemu_hash *h, int key_size, int value_size) +{ +h-key_size = key_size; +h-value_size = value_size; +h-hash = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free); +} + #endif diff --git a/vmstate.c b/vmstate.c index b5882fa..2148e73 100644 --- a/vmstate.c +++ b/vmstate.c @@ -667,3 +667,57 @@ const VMStateInfo vmstate_info_bitmap = { .get = get_bitmap, .put = put_bitmap, }; + +/* Save restore for qemu_hash which is a wrapper over GHashTable */ +static int get_hash(QEMUFile *f, void *pv, size_t size) +{ +qemu_hash *h = pv; +uint32_t num = g_hash_table_size(h-hash); +gpointer key, value; + +num = qemu_get_be32(f); + +for ( ; num; --num) { +int i; + +key = g_malloc0(h-key_size); +for (i = 0; i h-key_size; ++i) { +((uint8_t *)key)[i] = qemu_get_byte(f); +} +value = g_malloc0(h-value_size); +for (i = 0; i h-value_size; ++i) { +((uint8_t *)value)[i] = qemu_get_byte(f); +} +g_hash_table_insert(h-hash, key, value); +} + +return 0; +} + +static void put_hash(QEMUFile *f, void *pv, size_t size) +{ +qemu_hash *h = pv; +uint32_t num = g_hash_table_size(h-hash); +GHashTableIter iter; +gpointer key, value; + +qemu_put_be32(f, num); + +g_hash_table_iter_init(iter, h-hash); +while (g_hash_table_iter_next (iter, key, value)) { +int i; + +for (i = 0; i h-key_size; ++i) { +qemu_put_byte(f, ((uint8_t *)key)[i]); +} +for (i = 0; i h-value_size; ++i) { +qemu_put_byte(f, ((uint8_t *)value)[i]); I think the only way to do this safely would be to do a recursive introspective walk through the hash table's key and value properties. Every key and every property can be an object again, right? We would basically impose a limit onto ourselves that every member of the hash table would have to be a GObject again. Alex
Re: [Qemu-devel] [PATCH v2 8/8] spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
On 22.05.14 12:53, Alexey Kardashevskiy wrote: On 05/22/2014 05:16 PM, Alexander Graf wrote: Am 22.05.2014 um 08:53 schrieb Alexey Kardashevskiy a...@ozlabs.ru: On 05/21/2014 10:42 PM, Alexey Kardashevskiy wrote: On 05/21/2014 08:35 PM, Alexander Graf wrote: On 21.05.14 12:13, Alexey Kardashevskiy wrote: On 05/21/2014 07:50 PM, Alexander Graf wrote: On 21.05.14 11:33, Alexey Kardashevskiy wrote: On 05/21/2014 07:13 PM, Alexander Graf wrote: On 21.05.14 11:11, Michael S. Tsirkin wrote: On Wed, May 21, 2014 at 11:06:09AM +0200, Alexander Graf wrote: On 21.05.14 10:52, Alexey Kardashevskiy wrote: On 05/21/2014 06:40 PM, Alexander Graf wrote: On 15.05.14 11:59, Alexey Kardashevskiy wrote: Currently SPAPR PHB keeps track of all allocated MSI/MISX interrupt as XICS used to be unable to reuse interrupts which becomes a problem for dynamic MSI reconfiguration which is happening on guest driver reload or PCI hot (un)plug. Another problem is that PHB has a limit of devices supporting MSI/MSIX (SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that. This makes use of new XICS ability to reuse interrupts. This removes cached MSI configuration from SPAPR PHB so the first IRQ number of a device is stored in MSI/MSIX config space so there is no need to store this anywhere else. From now on, SPAPR PHB only keeps flags telling what type of interrupt for which device it has configured in order to return error if (for example) MSIX was enabled and the guest is trying to disable MSI which it has not enabled. This removes a limit for the maximum number of MSIX-enabled devices per PHB, now XICS and PCI bus capacity are the only limitation. This changes migration stream as it fixes vmstate_spapr_pci_msi::name which was wrong since the beginning. This fixed traces to be more informative. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- In reality either MSIX or MSI is enabled, never both. So I could remove msi/msix bitmaps from this patch, would it make sense? Is this a hard requirement? Does a device have to choose between MSIX and MSI or could it theoretically have both enabled? Is this a PCI limitation, a PAPR/XICS limitation or just a limitation of your implementation? My implementation does not have this limitation, I asked if I can simplify code by introducing one :) I cannot see any reason why PCI cannot have both MSI and MSIX enabled but it does not seem to be used by anyone = cannot debug and confirm. PAPR spec assumes that if the guest tries enabling MSIX when MSI is already enabled, this is a change, not enabling both types. But it also says MSI and MSIX vector numbers are not shared. Hm. Yeah, I'm not aware of any limitation on hardware here and I'd rather not impose one. Michael, do you know of any hardware that uses MSI *and* MSI-X at the same time? Alex No, and the PCI spec says: A function is permitted to implement both MSI and MSI-X, but system software is prohibited from enabling both at the same time. If system software enables both at the same time, the result is undefined. Ah, cool. So yes Alexey, feel free to impose it :). Heh. This solves just half of the problem - I still have to keep track of what device got MSI/MSIX configured via that ibm,change-msi interface. I was hoping I can store such flag somewhere in a device PCI config space but MSI/MSIX enable bit is not good as it is not set when those calls are made. And I cannot rely on address/data fields much as the guest can change them (I already use them to store IRQ numbers and btw it is missing checks when I read them back for disposal, I'll fix in next round). Or on enable event I could put IRQ numbers to .data of MSI config space and on disable check if it is not zero, then configuration took place, then I can remove my msi[]/msix[] flag arrays. If the guest did any change to MSI/MSIX config space (it does not on SPAPR except weird selftest cases), I compare .data with what ICS can possibly have and either reject disable or handle it and if it breaks XICS - that's too bad for the stupid guest. Would that be acceptable? Can't you prohibit the guest from writing to the MSI configuration registers itself? Then you don't need to do sanity checks. I could for emulated devices but VFIO uses the same code. For example, there is an IBM SCSI IPR card which does a self test. For that, it saves MSIX BAR content, does reboot via some backdoor interface and restores MSIX BAR. It has been solved for VFIO in the host kernel by restoring MSIX data from cached values when guest is trying to restore it with what it thinks is actual MSIX data (it is virtualized because of x86). But there is cache We already have a cache because we don't access the real PCI registers with msi_set_message(), no? For emulated devices there is no cache. And in any case the guest is allowed to write to it... Who knows what AIX does? I do not. Tried GHashTable for keeping bus:dev.fn - (irq, num), more or less ok
Re: [Qemu-devel] [PATCH] xen: make xen-platform a default device
-Original Message- From: qemu-devel-bounces+tiejun.chen=intel@nongnu.org [mailto:qemu-devel-bounces+tiejun.chen=intel@nongnu.org] On Behalf Of Michael S. Tsirkin Sent: Thursday, May 22, 2014 3:22 PM To: Gerd Hoffmann Cc: qemu-devel@nongnu.org; Anthony Liguori Subject: Re: [Qemu-devel] [PATCH] xen: make xen-platform a default device On Thu, May 22, 2014 at 09:20:50AM +0200, Gerd Hoffmann wrote: Patch hooks up the xen platform device to the default device code we have in qemu. Two effects: (1) The device will not be created in case -nodefaults is specified on the command line. (2) Autocreating the device is also turned off in case xen-platform is added manually via -device. With the patch applied you can move the xen-platform device to some other place with a simple 'qemu -device xen-platform,addr=$slot'. Tested-by: Tiejun Chen tiejun.c...@intel.com Sorry, I'm not sure if this works well as Gerd expect. I already reply something online, please take a look at that before apply. Thanks Tiejun Signed-off-by: Gerd Hoffmann kra...@redhat.com Applied, thanks! --- hw/i386/pc_piix.c| 2 +- include/hw/xen/xen.h | 1 + vl.c | 3 +++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index eaf3e61..f987d03 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -385,7 +385,7 @@ static void pc_xen_hvm_init(QEMUMachineInitArgs *args) pc_init_pci(args); bus = pci_find_primary_bus(); -if (bus != NULL) { +if (bus != NULL default_xenplatform) { pci_create_simple(bus, -1, xen-platform); } } diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h index 85fda3d..b350413 100644 --- a/include/hw/xen/xen.h +++ b/include/hw/xen/xen.h @@ -20,6 +20,7 @@ enum xen_mode { extern uint32_t xen_domid; extern enum xen_mode xen_mode; +extern int default_xenplatform; extern bool xen_allowed; diff --git a/vl.c b/vl.c index 709d8cd..673148e 100644 --- a/vl.c +++ b/vl.c @@ -226,6 +226,7 @@ static int default_floppy = 1; static int default_cdrom = 1; static int default_sdcard = 1; static int default_vga = 1; +int default_xenplatform = 1; static struct { const char *driver; @@ -247,6 +248,7 @@ static struct { { .driver = isa-cirrus-vga, .flag = default_vga }, { .driver = vmware-svga, .flag = default_vga }, { .driver = qxl-vga, .flag = default_vga }, +{ .driver = xen-platform, .flag = default_xenplatform }, }; static QemuOptsList qemu_rtc_opts = { @@ -4101,6 +4103,7 @@ int main(int argc, char **argv, char **envp) default_monitor = 0; default_net = 0; default_vga = 0; +default_xenplatform = 0; } if (is_daemonized()) { -- 1.8.3.1
[Qemu-devel] [PATCH 0/3] console: text terminal updates
Hi, These tree patches make the text terminal consoles support multiple windows, i.e. work correctly in case a displaychangelistener binds a text terminal explicitly to a fixed QemuConsole by setting dcl-con. Use case: https://www.kraxel.org/cgit/qemu/commit/?h=rebase/ui-gtk-nextid=dc0bdb463d0b9cb0f3f2bd898ced9b9322f7d3ce please review, Gerd Gerd Hoffmann (3): console: update text terminal surface unconditionally console: rework text terminal cursor logic console: add kbd_put_keysym_console include/ui/console.h | 1 + ui/console.c | 184 ++- 2 files changed, 94 insertions(+), 91 deletions(-) -- 1.8.3.1
[Qemu-devel] [PATCH 1/3] console: update text terminal surface unconditionally
These days each QemuConsole has its own private DisplaySurface, so we can simply render updates all the time. Signed-off-by: Gerd Hoffmann kra...@redhat.com --- ui/console.c | 127 ++- 1 file changed, 56 insertions(+), 71 deletions(-) diff --git a/ui/console.c b/ui/console.c index c6087d8..85f46ae 100644 --- a/ui/console.c +++ b/ui/console.c @@ -475,6 +475,9 @@ static inline void text_update_xy(QemuConsole *s, int x, int y) static void invalidate_xy(QemuConsole *s, int x, int y) { +if (!qemu_console_is_visible(s)) { +return; +} if (s-update_x0 x * FONT_WIDTH) s-update_x0 = x * FONT_WIDTH; if (s-update_y0 y * FONT_HEIGHT) @@ -490,25 +493,20 @@ static void update_xy(QemuConsole *s, int x, int y) TextCell *c; int y1, y2; -if (!qemu_console_is_visible(s)) { -return; -} - if (s-ds-have_text) { text_update_xy(s, x, y); } -if (s-ds-have_gfx) { -y1 = (s-y_base + y) % s-total_height; -y2 = y1 - s-y_displayed; -if (y2 0) -y2 += s-total_height; -if (y2 s-height) { -c = s-cells[y1 * s-width + x]; -vga_putcharxy(s, x, y2, c-ch, - (c-t_attrib)); -invalidate_xy(s, x, y2); -} +y1 = (s-y_base + y) % s-total_height; +y2 = y1 - s-y_displayed; +if (y2 0) { +y2 += s-total_height; +} +if (y2 s-height) { +c = s-cells[y1 * s-width + x]; +vga_putcharxy(s, x, y2, c-ch, + (c-t_attrib)); +invalidate_xy(s, x, y2); } } @@ -518,33 +516,28 @@ static void console_show_cursor(QemuConsole *s, int show) int y, y1; int x = s-x; -if (!qemu_console_is_visible(s)) { -return; -} - if (s-ds-have_text) { s-cursor_invalidate = 1; } -if (s-ds-have_gfx) { -if (x = s-width) { -x = s-width - 1; -} -y1 = (s-y_base + s-y) % s-total_height; -y = y1 - s-y_displayed; -if (y 0) -y += s-total_height; -if (y s-height) { -c = s-cells[y1 * s-width + x]; -if (show s-cursor_visible_phase) { -TextAttributes t_attrib = s-t_attrib_default; -t_attrib.invers = !(t_attrib.invers); /* invert fg and bg */ -vga_putcharxy(s, x, y, c-ch, t_attrib); -} else { -vga_putcharxy(s, x, y, c-ch, (c-t_attrib)); -} -invalidate_xy(s, x, y); +if (x = s-width) { +x = s-width - 1; +} +y1 = (s-y_base + s-y) % s-total_height; +y = y1 - s-y_displayed; +if (y 0) { +y += s-total_height; +} +if (y s-height) { +c = s-cells[y1 * s-width + x]; +if (show s-cursor_visible_phase) { +TextAttributes t_attrib = s-t_attrib_default; +t_attrib.invers = !(t_attrib.invers); /* invert fg and bg */ +vga_putcharxy(s, x, y, c-ch, t_attrib); +} else { +vga_putcharxy(s, x, y, c-ch, (c-t_attrib)); } +invalidate_xy(s, x, y); } } @@ -554,10 +547,6 @@ static void console_refresh(QemuConsole *s) TextCell *c; int x, y, y1; -if (!qemu_console_is_visible(s)) { -return; -} - if (s-ds-have_text) { s-text_x[0] = 0; s-text_y[0] = 0; @@ -566,25 +555,23 @@ static void console_refresh(QemuConsole *s) s-cursor_invalidate = 1; } -if (s-ds-have_gfx) { -vga_fill_rect(s, 0, 0, surface_width(surface), surface_height(surface), - color_table_rgb[0][COLOR_BLACK]); -y1 = s-y_displayed; -for (y = 0; y s-height; y++) { -c = s-cells + y1 * s-width; -for (x = 0; x s-width; x++) { -vga_putcharxy(s, x, y, c-ch, - (c-t_attrib)); -c++; -} -if (++y1 == s-total_height) { -y1 = 0; -} +vga_fill_rect(s, 0, 0, surface_width(surface), surface_height(surface), + color_table_rgb[0][COLOR_BLACK]); +y1 = s-y_displayed; +for (y = 0; y s-height; y++) { +c = s-cells + y1 * s-width; +for (x = 0; x s-width; x++) { +vga_putcharxy(s, x, y, c-ch, + (c-t_attrib)); +c++; +} +if (++y1 == s-total_height) { +y1 = 0; } -console_show_cursor(s, 1); -dpy_gfx_update(s, 0, 0, - surface_width(surface), surface_height(surface)); } +console_show_cursor(s, 1); +dpy_gfx_update(s, 0, 0, + surface_width(surface), surface_height(surface)); } static void console_scroll(QemuConsole *s, int ydelta) @@ -640,7 +627,7 @@ static void console_put_lf(QemuConsole *s) c-t_attrib =
Re: [Qemu-devel] [PATCH v2] spapr: Enable dynamic change of the supported hypercalls list
On 05/22/2014 08:47 PM, Alexander Graf wrote: On 21.05.14 17:21, Alexey Kardashevskiy wrote: At the moment the ibm,hypertas-functions list is fixed. However some calls should be listed there if they are supported by QEMU or the host kernel. This enables hyperrtas_prop to grow on stack by adding a SPAPR_HYPERRTAS_ADD macro. qemu,hypertas-functions is converted as well. The first user of this is going to be a multi-tce property. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- Changes: v2: * replaced alloca() with GString --- hw/ppc/spapr.c | 30 +++--- 1 file changed, 23 insertions(+), 7 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 0a61246..3b28211 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -293,6 +293,10 @@ static size_t create_page_sizes_prop(CPUPPCState *env, uint32_t *prop, } \ } while (0) +static inline void add_str(GString *s, const gchar *s1) Please remove the inline :). Otherwise this looks a lot nicer than before :) Good :) What now? Have you finished with this set and I can repost it? -- Alexey
Re: [Qemu-devel] [PATCH] xen: make xen-platform a default device
On Thu, May 22, 2014 at 10:57:54AM +, Chen, Tiejun wrote: -Original Message- From: qemu-devel-bounces+tiejun.chen=intel@nongnu.org [mailto:qemu-devel-bounces+tiejun.chen=intel@nongnu.org] On Behalf Of Michael S. Tsirkin Sent: Thursday, May 22, 2014 3:22 PM To: Gerd Hoffmann Cc: qemu-devel@nongnu.org; Anthony Liguori Subject: Re: [Qemu-devel] [PATCH] xen: make xen-platform a default device On Thu, May 22, 2014 at 09:20:50AM +0200, Gerd Hoffmann wrote: Patch hooks up the xen platform device to the default device code we have in qemu. Two effects: (1) The device will not be created in case -nodefaults is specified on the command line. (2) Autocreating the device is also turned off in case xen-platform is added manually via -device. With the patch applied you can move the xen-platform device to some other place with a simple 'qemu -device xen-platform,addr=$slot'. Tested-by: Tiejun Chen tiejun.c...@intel.com Sorry, I'm not sure if this works well as Gerd expect. I already reply something online, please take a look at that before apply. Thanks Tiejun No worries, I'll wait. Signed-off-by: Gerd Hoffmann kra...@redhat.com Applied, thanks! --- hw/i386/pc_piix.c| 2 +- include/hw/xen/xen.h | 1 + vl.c | 3 +++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index eaf3e61..f987d03 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -385,7 +385,7 @@ static void pc_xen_hvm_init(QEMUMachineInitArgs *args) pc_init_pci(args); bus = pci_find_primary_bus(); -if (bus != NULL) { +if (bus != NULL default_xenplatform) { pci_create_simple(bus, -1, xen-platform); } } diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h index 85fda3d..b350413 100644 --- a/include/hw/xen/xen.h +++ b/include/hw/xen/xen.h @@ -20,6 +20,7 @@ enum xen_mode { extern uint32_t xen_domid; extern enum xen_mode xen_mode; +extern int default_xenplatform; extern bool xen_allowed; diff --git a/vl.c b/vl.c index 709d8cd..673148e 100644 --- a/vl.c +++ b/vl.c @@ -226,6 +226,7 @@ static int default_floppy = 1; static int default_cdrom = 1; static int default_sdcard = 1; static int default_vga = 1; +int default_xenplatform = 1; static struct { const char *driver; @@ -247,6 +248,7 @@ static struct { { .driver = isa-cirrus-vga, .flag = default_vga }, { .driver = vmware-svga, .flag = default_vga }, { .driver = qxl-vga, .flag = default_vga }, +{ .driver = xen-platform, .flag = default_xenplatform }, }; static QemuOptsList qemu_rtc_opts = { @@ -4101,6 +4103,7 @@ int main(int argc, char **argv, char **envp) default_monitor = 0; default_net = 0; default_vga = 0; +default_xenplatform = 0; } if (is_daemonized()) { -- 1.8.3.1
[Qemu-devel] [PATCH 3/3] console: add kbd_put_keysym_console
So you can send keysyms to a specific (text terminal) console. Signed-off-by: Gerd Hoffmann kra...@redhat.com --- include/ui/console.h | 1 + ui/console.c | 9 ++--- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/include/ui/console.h b/include/ui/console.h index 8a86617..b513e20 100644 --- a/include/ui/console.h +++ b/include/ui/console.h @@ -81,6 +81,7 @@ void do_mouse_set(Monitor *mon, const QDict *qdict); #define QEMU_KEY_CTRL_PAGEUP 0xe406 #define QEMU_KEY_CTRL_PAGEDOWN 0xe407 +void kbd_put_keysym_console(QemuConsole *s, int keysym); void kbd_put_keysym(int keysym); /* consoles */ diff --git a/ui/console.c b/ui/console.c index f6ce0ef..75ec3af 100644 --- a/ui/console.c +++ b/ui/console.c @@ -1056,13 +1056,11 @@ static void kbd_send_chars(void *opaque) } /* called when an ascii key is pressed */ -void kbd_put_keysym(int keysym) +void kbd_put_keysym_console(QemuConsole *s, int keysym) { -QemuConsole *s; uint8_t buf[16], *q; int c; -s = active_console; if (!s || (s-console_type == GRAPHIC_CONSOLE)) return; @@ -,6 +1109,11 @@ void kbd_put_keysym(int keysym) } } +void kbd_put_keysym(int keysym) +{ +kbd_put_keysym_console(active_console, keysym); +} + static void text_console_invalidate(void *opaque) { QemuConsole *s = (QemuConsole *) opaque; -- 1.8.3.1
Re: [Qemu-devel] [PATCH v2] spapr: Enable dynamic change of the supported hypercalls list
On 22.05.14 13:01, Alexey Kardashevskiy wrote: On 05/22/2014 08:47 PM, Alexander Graf wrote: On 21.05.14 17:21, Alexey Kardashevskiy wrote: At the moment the ibm,hypertas-functions list is fixed. However some calls should be listed there if they are supported by QEMU or the host kernel. This enables hyperrtas_prop to grow on stack by adding a SPAPR_HYPERRTAS_ADD macro. qemu,hypertas-functions is converted as well. The first user of this is going to be a multi-tce property. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- Changes: v2: * replaced alloca() with GString --- hw/ppc/spapr.c | 30 +++--- 1 file changed, 23 insertions(+), 7 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 0a61246..3b28211 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -293,6 +293,10 @@ static size_t create_page_sizes_prop(CPUPPCState *env, uint32_t *prop, } \ } while (0) +static inline void add_str(GString *s, const gchar *s1) Please remove the inline :). Otherwise this looks a lot nicer than before :) Good :) What now? Have you finished with this set and I can repost it? Yup Alex
Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD
-Original Message- From: qemu-devel-bounces+arei.gonglei=huawei@nongnu.org [mailto:qemu-devel-bounces+arei.gonglei=huawei@nongnu.org] On Behalf Of Chen, Tiejun Sent: Thursday, May 22, 2014 6:50 PM To: Gerd Hoffmann Cc: peter.mayd...@linaro.org; xen-de...@lists.xensource.com; stefano.stabell...@eu.citrix.com; m...@redhat.com; Kay, Allen M; kelly.zyta...@amd.com; qemu-devel@nongnu.org; Zhang, Yang Z; anth...@codemonkey.ws; Anthony PERARD Subject: Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD -Original Message- From: Gerd Hoffmann [mailto:kra...@redhat.com] Sent: Thursday, May 22, 2014 2:45 PM To: Chen, Tiejun Cc: Anthony PERARD; Daniel P. Berrange; peter.mayd...@linaro.org; xen-de...@lists.xensource.com; m...@redhat.com; stefano.stabell...@eu.citrix.com; Kay, Allen M; kelly.zyta...@amd.com; qemu-devel@nongnu.org; Zhang, Yang Z; anth...@codemonkey.ws Subject: Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD Hi, Another useful thing would be to not create the xen platform device in case -nodefaults was specified on the command line (that switch turns off a bunch of other devices present by default: vga, nic, cdrom, ...). Currently looks 'xen-platform' itself can't be created, not those devices existed on that. The error message looks more like libxl tries to hot-unplug the xen platform device. Attached patch (untested!) hooks up the xen platform device to the default device code we have in qemu. Two effects: (1) As mentioned above the device will not be created in case -nodefaults is specified on the command line. (2) Autocreating the device is also turned off in case xen-platform is added manually via -device. With the patch applied you should be able to move the xen-platform device to some other place with a simple 'qemu -device xen-platform,addr=$slot'. Gerd, Sorry, I may misunderstand what you mean previously then have a wrong test. So this still doesn't work actually. After applied your patch, 'xen-platform' is always disabled by default, right? So 00:02.0 is left naturally to be assigned to IGD as we expect like this, tchen0@tchen0-HVM-domU:~$ lspci 00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06) 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03) 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06) 00:03.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 04) 00:1f.0 ISA bridge: Intel Corporation Q87 Express LPC Controller (rev 04) Then this is fine but if you intend to add a 'qemu -device xen-platform,addr=$slot', this doesn't work well. In my case, gfx_passthru=1 pci=[00:02.0@2, 00:1a.0] xen_platform_pci=0 device_model_args_hvm = ['-device', 'xen-platform,addr=0x3'] tchen0@tchen0-linux:~/workspace$ sudo xl cr domu-cfg Parsing config from domu-cfg libxl: error: libxl_qmp.c:287:qmp_handle_error_response: received an error message from QMP server: Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set libxl: error: libxl_create.c:1277:domcreate_attach_pci: libxl_device_pci_add failed: -3 Maybe you can assign explicitly the bus name, such as: device_model_args_hvm = ['-device', 'xen-platform,bus=pci.0,addr=0x3'] Best regards, -Gonglei
Re: [Qemu-devel] [PATCH 1/2] vmstate: Add helper to enable GHashTable migration
On 05/22/2014 08:57 PM, Alexander Graf wrote: On 22.05.14 12:53, Alexey Kardashevskiy wrote: This adds a VMSTATE_HASH_V macro. This implements put/get callbacks for it. This implements a qemu_hash_init() wrapper to save key/value sizes. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- include/migration/vmstate.h | 10 + include/qemu-common.h | 13 +++ vmstate.c | 54 + 3 files changed, 77 insertions(+) diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h index 7e45048..6af599d 100644 --- a/include/migration/vmstate.h +++ b/include/migration/vmstate.h @@ -166,6 +166,7 @@ extern const VMStateInfo vmstate_info_timer; extern const VMStateInfo vmstate_info_buffer; extern const VMStateInfo vmstate_info_unused_buffer; extern const VMStateInfo vmstate_info_bitmap; +static const VMStateInfo vmstate_info_hash; #define type_check_2darray(t1,t2,n,m) ((t1(*)[n][m])0 - (t2*)0) #define type_check_array(t1,t2,n) ((t1(*)[n])0 - (t2*)0) @@ -740,6 +741,15 @@ extern const VMStateInfo vmstate_info_bitmap; #define VMSTATE_BUFFER_UNSAFE(_field, _state, _version, _size)\ VMSTATE_BUFFER_UNSAFE_INFO(_field, _state, _version, vmstate_info_buffer, _size) +#define VMSTATE_HASH_V(_field, _state, _version) { \ +.name = (stringify(_field)), \ +.version_id = (_version), \ +.size = sizeof(qemu_hash),\ +.info = vmstate_info_hash, \ +.flags = VMS_SINGLE, \ +.offset = vmstate_offset_value(_state, _field, qemu_hash), \ +} + #define VMSTATE_UNUSED_V(_v, _size) \ VMSTATE_UNUSED_BUFFER(NULL, _v, _size) diff --git a/include/qemu-common.h b/include/qemu-common.h index 3f3fd60..ee973e7 100644 --- a/include/qemu-common.h +++ b/include/qemu-common.h @@ -462,4 +462,17 @@ int parse_debug_env(const char *name, int max, int initial); const char *qemu_ether_ntoa(const MACAddr *mac); +typedef struct qemu_hash { +GHashTable *hash; +int key_size; +int value_size; +} qemu_hash; + +static inline void qemu_hash_init(qemu_hash *h, int key_size, int value_size) +{ +h-key_size = key_size; +h-value_size = value_size; +h-hash = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free); +} + #endif diff --git a/vmstate.c b/vmstate.c index b5882fa..2148e73 100644 --- a/vmstate.c +++ b/vmstate.c @@ -667,3 +667,57 @@ const VMStateInfo vmstate_info_bitmap = { .get = get_bitmap, .put = put_bitmap, }; + +/* Save restore for qemu_hash which is a wrapper over GHashTable */ +static int get_hash(QEMUFile *f, void *pv, size_t size) +{ +qemu_hash *h = pv; +uint32_t num = g_hash_table_size(h-hash); +gpointer key, value; + +num = qemu_get_be32(f); + +for ( ; num; --num) { +int i; + +key = g_malloc0(h-key_size); +for (i = 0; i h-key_size; ++i) { +((uint8_t *)key)[i] = qemu_get_byte(f); +} +value = g_malloc0(h-value_size); +for (i = 0; i h-value_size; ++i) { +((uint8_t *)value)[i] = qemu_get_byte(f); +} +g_hash_table_insert(h-hash, key, value); +} + +return 0; +} + +static void put_hash(QEMUFile *f, void *pv, size_t size) +{ +qemu_hash *h = pv; +uint32_t num = g_hash_table_size(h-hash); +GHashTableIter iter; +gpointer key, value; + +qemu_put_be32(f, num); + +g_hash_table_iter_init(iter, h-hash); +while (g_hash_table_iter_next (iter, key, value)) { +int i; + +for (i = 0; i h-key_size; ++i) { +qemu_put_byte(f, ((uint8_t *)key)[i]); +} +for (i = 0; i h-value_size; ++i) { +qemu_put_byte(f, ((uint8_t *)value)[i]); I think the only way to do this safely would be to do a recursive introspective walk through the hash table's key and value properties. Every key and every property can be an object again, right? We would basically impose a limit onto ourselves that every member of the hash table would have to be a GObject again. This is a bit too much for the task I am trying to solve :) @key here is uin32_t and there is no point in making it an object... -- Alexey
Re: [Qemu-devel] [PATCH 4/7] virtio-blk: use aliases instead of duplicate qdev properties
Il 22/05/2014 12:17, Stefan Hajnoczi ha scritto: I took a quick look at net, rng, and serial. This approach should work because the VirtIODevice qdev properties need to be exposed wholesale on the transport device. If that's what you're hinting at, I'll try it in v2. Yes, exactly. scsi too. Paolo
Re: [Qemu-devel] [PATCH 1/7] qom: add object_property_add_alias()
Il 22/05/2014 12:23, Stefan Hajnoczi ha scritto: I had an intermediate version of this patch with a flag so you could tell object_property_add_alias() whether or not to ref the target object. But in the end it seems like overengineering since the refcount case is rare or non-existent. The refcount is a convenience feature just for the case where you want to alias to a random object that you don't otherwise hold a refcount on (I couldn't think of an example). Do you see what I mean? Yes, and it makes sense. Perhaps add a comment for v2. Paolo
[Qemu-devel] [PATCH V6 0/8] PSCI v0.2 support for KVM ARM/ARM64
This patchset adds the QEMU side changes for providing PSCI v0.2 to VM. ChangeLog: V6: - Add psci_version field in ARMCPU struct. - Misc cleanups suggested on RFC V5 of this patch. ( http://www.spinics.net/lists/kvm-arm/msg09400.html) - Dropping RFC prefix from patch. V5: - Updated scripts/update-linux-headers.sh to include linux/psci.h - Synced linux headers using update-linux-headers.sh with linux-next tree - Added per cpu field for kvm init features. - Set psci-0.2 compatible string in generated dtb sucn that it works for guest kernel not having psci 0.2 support. - Added QEMU_KVM_CAP_ARM_PSCI_0_2 define in target-arm/kvm-consts.h V4: - Rebase this patch against v11 patchset for in-kernel PSCI v0.2 emulation (http://www.spinics.net/lists/kvm-arm/msg09182.html) - Used PSCI 0.2 DT bindings for linux kernel. (http://www.spinics.net/lists/arm-kernel/msg326044.html). V3: - Rebase this patchset against v8 patchset for in-kernel PSCI v0.2 emulation (http://www.spinics.net/lists/kvm-arm/msg08780.html) - Added common kvm_arm_vcpu_init() function for kvm arm and kvm arm64 V2: - Rebase this patchset against v6 patchset for in-kernel PSCI v0.2 emulation (http://www.spinics.net/lists/arm-kernel/msg319037.html) - Handle KVM_EXIT_SYSTEM_EVENT in kvm-all.c:kvm_cpu_exec() - Drop change in kvm_arm_get_host_cpu_features() - Improve comments and description of kvm_arch_reset_vcpu() implementation V1: - Initial RFC patchset Pranavkumar Sawargaonkar (8): update-linux-headers.sh: Add psci.h to linux header sync-up script linux-headers: Update KVM headers from linux-next tag next-20140508 kvm: Handle exit reason KVM_EXIT_SYSTEM_EVENT target-arm: Common kvm_arm_vcpu_init() for KVM ARM and KVM ARM64 target-arm: Enable KVM_ARM_VCPU_PSCI_0_2 feature when possible target-arm: Implement kvm_arch_reset_vcpu() for KVM ARM64 target-arm: Introduce per-CPU field for PSCI version hw/arm/virt: Use PSCI v0.2 compatible string when KVM or TCG provides it hw/arm/virt.c | 16 ++- kvm-all.c | 16 +++ linux-headers/asm-arm/kvm.h | 10 +++-- linux-headers/asm-arm64/kvm.h | 10 +++-- linux-headers/asm-s390/kvm.h| 28 linux-headers/linux/kvm.h | 14 +- linux-headers/linux/psci.h | 90 +++ scripts/update-linux-headers.sh |3 +- target-arm/cpu-qom.h|9 target-arm/cpu.c|1 + target-arm/kvm.c| 11 + target-arm/kvm32.c | 16 --- target-arm/kvm64.c | 26 --- target-arm/kvm_arm.h| 12 ++ 14 files changed, 239 insertions(+), 23 deletions(-) create mode 100644 linux-headers/linux/psci.h -- 1.7.9.5
[Qemu-devel] [PATCH V6 2/8] linux-headers: Update KVM headers from linux-next tag next-20140508
Syncup KVM related linux headers from linux-next tree using scripts/update-linux-headers.sh. Signed-off-by: Pranavkumar Sawargaonkar pranavku...@linaro.org Signed-off-by: Anup Patel anup.pa...@linaro.org --- linux-headers/asm-arm/kvm.h | 10 +++-- linux-headers/asm-arm64/kvm.h | 10 +++-- linux-headers/asm-s390/kvm.h | 28 + linux-headers/linux/kvm.h | 14 ++- linux-headers/linux/psci.h| 90 + 5 files changed, 143 insertions(+), 9 deletions(-) create mode 100644 linux-headers/linux/psci.h diff --git a/linux-headers/asm-arm/kvm.h b/linux-headers/asm-arm/kvm.h index ef0c878..e6ebdd3 100644 --- a/linux-headers/asm-arm/kvm.h +++ b/linux-headers/asm-arm/kvm.h @@ -20,6 +20,7 @@ #define __ARM_KVM_H__ #include linux/types.h +#include linux/psci.h #include asm/ptrace.h #define __KVM_HAVE_GUEST_DEBUG @@ -83,6 +84,7 @@ struct kvm_regs { #define KVM_VGIC_V2_CPU_SIZE 0x2000 #define KVM_ARM_VCPU_POWER_OFF 0 /* CPU is started in OFF state */ +#define KVM_ARM_VCPU_PSCI_0_2 1 /* CPU uses PSCI v0.2 */ struct kvm_vcpu_init { __u32 target; @@ -201,9 +203,9 @@ struct kvm_arch_memory_slot { #define KVM_PSCI_FN_CPU_ON KVM_PSCI_FN(2) #define KVM_PSCI_FN_MIGRATEKVM_PSCI_FN(3) -#define KVM_PSCI_RET_SUCCESS 0 -#define KVM_PSCI_RET_NI((unsigned long)-1) -#define KVM_PSCI_RET_INVAL ((unsigned long)-2) -#define KVM_PSCI_RET_DENIED((unsigned long)-3) +#define KVM_PSCI_RET_SUCCESS PSCI_RET_SUCCESS +#define KVM_PSCI_RET_NIPSCI_RET_NOT_SUPPORTED +#define KVM_PSCI_RET_INVAL PSCI_RET_INVALID_PARAMS +#define KVM_PSCI_RET_DENIEDPSCI_RET_DENIED #endif /* __ARM_KVM_H__ */ diff --git a/linux-headers/asm-arm64/kvm.h b/linux-headers/asm-arm64/kvm.h index eaf54a3..e6471da 100644 --- a/linux-headers/asm-arm64/kvm.h +++ b/linux-headers/asm-arm64/kvm.h @@ -31,6 +31,7 @@ #define KVM_NR_SPSR5 #ifndef __ASSEMBLY__ +#include linux/psci.h #include asm/types.h #include asm/ptrace.h @@ -77,6 +78,7 @@ struct kvm_regs { #define KVM_ARM_VCPU_POWER_OFF 0 /* CPU is started in OFF state */ #define KVM_ARM_VCPU_EL1_32BIT 1 /* CPU running a 32bit VM */ +#define KVM_ARM_VCPU_PSCI_0_2 2 /* CPU uses PSCI v0.2 */ struct kvm_vcpu_init { __u32 target; @@ -186,10 +188,10 @@ struct kvm_arch_memory_slot { #define KVM_PSCI_FN_CPU_ON KVM_PSCI_FN(2) #define KVM_PSCI_FN_MIGRATEKVM_PSCI_FN(3) -#define KVM_PSCI_RET_SUCCESS 0 -#define KVM_PSCI_RET_NI((unsigned long)-1) -#define KVM_PSCI_RET_INVAL ((unsigned long)-2) -#define KVM_PSCI_RET_DENIED((unsigned long)-3) +#define KVM_PSCI_RET_SUCCESS PSCI_RET_SUCCESS +#define KVM_PSCI_RET_NIPSCI_RET_NOT_SUPPORTED +#define KVM_PSCI_RET_INVAL PSCI_RET_INVALID_PARAMS +#define KVM_PSCI_RET_DENIEDPSCI_RET_DENIED #endif diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h index c003c6a..98bedf3 100644 --- a/linux-headers/asm-s390/kvm.h +++ b/linux-headers/asm-s390/kvm.h @@ -15,6 +15,7 @@ #include linux/types.h #define __KVM_S390 +#define __KVM_HAVE_GUEST_DEBUG /* Device control API: s390-specific devices */ #define KVM_DEV_FLIC_GET_ALL_IRQS 1 @@ -54,6 +55,13 @@ struct kvm_s390_io_adapter_req { __u64 addr; }; +/* kvm attr_group on vm fd */ +#define KVM_S390_VM_MEM_CTRL 0 + +/* kvm attributes for mem_ctrl */ +#define KVM_S390_VM_MEM_ENABLE_CMMA0 +#define KVM_S390_VM_MEM_CLR_CMMA 1 + /* for KVM_GET_REGS and KVM_SET_REGS */ struct kvm_regs { /* general purpose regs for s390 */ @@ -72,11 +80,31 @@ struct kvm_fpu { __u64 fprs[16]; }; +#define KVM_GUESTDBG_USE_HW_BP 0x0001 + +#define KVM_HW_BP 1 +#define KVM_HW_WP_WRITE2 +#define KVM_SINGLESTEP 4 + struct kvm_debug_exit_arch { + __u64 addr; + __u8 type; + __u8 pad[7]; /* Should be set to 0 */ +}; + +struct kvm_hw_breakpoint { + __u64 addr; + __u64 phys_addr; + __u64 len; + __u8 type; + __u8 pad[7]; /* Should be set to 0 */ }; /* for KVM_SET_GUEST_DEBUG */ struct kvm_guest_debug_arch { + __u32 nr_hw_bp; + __u32 pad; /* Should be set to 0 */ + struct kvm_hw_breakpoint *hw_bp; }; #define KVM_SYNC_PREFIX (1UL 0) diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h index b278ab3..6a9a0fc 100644 --- a/linux-headers/linux/kvm.h +++ b/linux-headers/linux/kvm.h @@ -171,6 +171,7 @@ struct kvm_pit_config { #define KVM_EXIT_WATCHDOG 21 #define KVM_EXIT_S390_TSCH22 #define KVM_EXIT_EPR 23 +#define KVM_EXIT_SYSTEM_EVENT 24 /* For KVM_EXIT_INTERNAL_ERROR */ /*
[Qemu-devel] [PATCH V6 3/8] kvm: Handle exit reason KVM_EXIT_SYSTEM_EVENT
In-kernel PSCI v0.2 emulation of KVM ARM/ARM64 forwards SYSTEM_OFF and SYSTEM_RESET function calls to QEMU using KVM_EXIT_SYSTEM_EVENT exit reason. This patch updates kvm_cpu_exec() to handle KVM_SYSTEM_EVENT_SHUTDOWN and KVM_SYSTEM_EVENT_RESET system-level events from QEMU-side. Signed-off-by: Pranavkumar Sawargaonkar pranavku...@linaro.org Signed-off-by: Anup Patel anup.pa...@linaro.org Reviewed-by: Peter Maydell peter.mayd...@linaro.org --- kvm-all.c | 16 1 file changed, 16 insertions(+) diff --git a/kvm-all.c b/kvm-all.c index a343ede..f87c1d5 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -1713,6 +1713,22 @@ int kvm_cpu_exec(CPUState *cpu) case KVM_EXIT_INTERNAL_ERROR: ret = kvm_handle_internal_error(cpu, run); break; +case KVM_EXIT_SYSTEM_EVENT: +switch (run-system_event.type) { +case KVM_SYSTEM_EVENT_SHUTDOWN: +qemu_system_shutdown_request(); +ret = EXCP_INTERRUPT; +break; +case KVM_SYSTEM_EVENT_RESET: +qemu_system_reset_request(); +ret = EXCP_INTERRUPT; +break; +default: +DPRINTF(kvm_arch_handle_exit\n); +ret = kvm_arch_handle_exit(cpu, run); +break; +} +break; default: DPRINTF(kvm_arch_handle_exit\n); ret = kvm_arch_handle_exit(cpu, run); -- 1.7.9.5
[Qemu-devel] [PATCH V6 1/8] update-linux-headers.sh: Add psci.h to linux header sync-up script
We will be using linux/psci.h for KVM ARM/ARM64 hence add it to linux header sync-up script. Signed-off-by: Pranavkumar Sawargaonkar pranavku...@linaro.org Signed-off-by: Anup Patel anup.pa...@linaro.org --- scripts/update-linux-headers.sh |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh index 120a694..35d1960 100755 --- a/scripts/update-linux-headers.sh +++ b/scripts/update-linux-headers.sh @@ -61,7 +61,8 @@ done rm -rf $output/linux-headers/linux mkdir -p $output/linux-headers/linux -for header in kvm.h kvm_para.h vfio.h vhost.h virtio_config.h virtio_ring.h; do +for header in kvm.h kvm_para.h psci.h vfio.h vhost.h virtio_config.h \ +virtio_ring.h; do cp $tmpdir/include/linux/$header $output/linux-headers/linux done rm -rf $output/linux-headers/asm-generic -- 1.7.9.5
[Qemu-devel] [PATCH V6 4/8] target-arm: Common kvm_arm_vcpu_init() for KVM ARM and KVM ARM64
Introduce a common kvm_arm_vcpu_init() for doing KVM_ARM_VCPU_INIT ioctl in KVM ARM and KVM ARM64. This also helps us factor-out few common code lines from kvm_arch_init_vcpu() for KVM ARM/ARM64. Signed-off-by: Pranavkumar Sawargaonkar pranavku...@linaro.org Signed-off-by: Anup Patel anup.pa...@linaro.org --- target-arm/cpu-qom.h |3 +++ target-arm/kvm.c | 11 +++ target-arm/kvm32.c | 12 +++- target-arm/kvm64.c | 18 +++--- target-arm/kvm_arm.h | 12 5 files changed, 44 insertions(+), 12 deletions(-) diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h index edc7f26..2bd7df8 100644 --- a/target-arm/cpu-qom.h +++ b/target-arm/cpu-qom.h @@ -102,6 +102,9 @@ typedef struct ARMCPU { */ uint32_t kvm_target; +/* KVM init features for this CPU */ +uint32_t kvm_init_features[7]; + /* The instance init functions for implementation-specific subclasses * set these fields to specify the implementation-dependent values of * various constant registers and reset values of non-constant diff --git a/target-arm/kvm.c b/target-arm/kvm.c index 39202d7..319784d 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -27,6 +27,17 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = { KVM_CAP_LAST_INFO }; +int kvm_arm_vcpu_init(CPUState *cs) +{ +ARMCPU *cpu = ARM_CPU(cs); +struct kvm_vcpu_init init; + +init.target = cpu-kvm_target; +memcpy(init.features, cpu-kvm_init_features, sizeof(init.features)); + +return kvm_vcpu_ioctl(cs, KVM_ARM_VCPU_INIT, init); +} + bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try, int *fdarray, struct kvm_vcpu_init *init) diff --git a/target-arm/kvm32.c b/target-arm/kvm32.c index b79750c..b142e90 100644 --- a/target-arm/kvm32.c +++ b/target-arm/kvm32.c @@ -166,7 +166,6 @@ static int compare_u64(const void *a, const void *b) int kvm_arch_init_vcpu(CPUState *cs) { -struct kvm_vcpu_init init; int i, ret, arraylen; uint64_t v; struct kvm_one_reg r; @@ -179,15 +178,18 @@ int kvm_arch_init_vcpu(CPUState *cs) return -EINVAL; } -init.target = cpu-kvm_target; -memset(init.features, 0, sizeof(init.features)); +/* Determine init features for this CPU */ +memset(cpu-kvm_init_features, 0, sizeof(cpu-kvm_init_features)); if (cpu-start_powered_off) { -init.features[0] = 1 KVM_ARM_VCPU_POWER_OFF; +cpu-kvm_init_features[0] |= 1 KVM_ARM_VCPU_POWER_OFF; } -ret = kvm_vcpu_ioctl(cs, KVM_ARM_VCPU_INIT, init); + +/* Do KVM_ARM_VCPU_INIT ioctl */ +ret = kvm_arm_vcpu_init(cs); if (ret) { return ret; } + /* Query the kernel to make sure it supports 32 VFP * registers: QEMU's cortex-a15 CPU is always a * VFP-D32 core. The simplest way to do this is just diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index c729b9e..1980ddc 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -77,9 +77,8 @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUClass *ahcc) int kvm_arch_init_vcpu(CPUState *cs) { -ARMCPU *cpu = ARM_CPU(cs); -struct kvm_vcpu_init init; int ret; +ARMCPU *cpu = ARM_CPU(cs); if (cpu-kvm_target == QEMU_KVM_ARM_TARGET_NONE || !arm_feature(cpu-env, ARM_FEATURE_AARCH64)) { @@ -87,16 +86,21 @@ int kvm_arch_init_vcpu(CPUState *cs) return -EINVAL; } -init.target = cpu-kvm_target; -memset(init.features, 0, sizeof(init.features)); +/* Determine init features for this CPU */ +memset(cpu-kvm_init_features, 0, sizeof(cpu-kvm_init_features)); if (cpu-start_powered_off) { -init.features[0] = 1 KVM_ARM_VCPU_POWER_OFF; +cpu-kvm_init_features[0] |= 1 KVM_ARM_VCPU_POWER_OFF; +} + +/* Do KVM_ARM_VCPU_INIT ioctl */ +ret = kvm_arm_vcpu_init(cs); +if (ret) { +return ret; } -ret = kvm_vcpu_ioctl(cs, KVM_ARM_VCPU_INIT, init); /* TODO : support for save/restore/reset of system regs via tuple list */ -return ret; +return 0; } #define AARCH64_CORE_REG(x) (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ diff --git a/target-arm/kvm_arm.h b/target-arm/kvm_arm.h index dc4e233..af93105 100644 --- a/target-arm/kvm_arm.h +++ b/target-arm/kvm_arm.h @@ -15,6 +15,18 @@ #include exec/memory.h /** + * kvm_arm_vcpu_init: + * @cs: CPUState + * + * Initialize (or reinitialize) the VCPU by invoking the + * KVM_ARM_VCPU_INIT ioctl with the CPU type and feature + * bitmask specified in the CPUState. + * + * Returns: 0 if success else 0 error code + */ +int kvm_arm_vcpu_init(CPUState *cs); + +/** * kvm_arm_register_device: * @mr: memory region for this device * @devid: the KVM device ID -- 1.7.9.5
[Qemu-devel] [PATCH v2 7/9] spapr_iommu: Get rid of window_size in sPAPRTCETable
This removes window_size as it is basically a copy of nb_table shifted by SPAPR_TCE_PAGE_SHIFT. As new dynamic DMA windows are going to support windows as big as the entire RAM and this number will be bigger that 32 capacity, we will have to do something about @window_size anyway and removal seems to be the right way to go. This removes dma_window_start/dma_window_size from sPAPRPHBState as they are no longer used. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- Changes: v2: * bumped minimum version in VMState descriptor --- hw/ppc/spapr_iommu.c| 43 --- hw/ppc/spapr_pci.c | 6 ++ hw/ppc/spapr_vio.c | 4 +++- include/hw/pci-host/spapr.h | 2 -- include/hw/ppc/spapr.h | 3 +-- target-ppc/kvm.c| 4 ++-- target-ppc/kvm_ppc.h| 2 +- 7 files changed, 25 insertions(+), 39 deletions(-) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index 0dd6509..99d1d6e 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -70,7 +70,7 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr) if (tcet-bypass) { ret.perm = IOMMU_RW; -} else if (addr tcet-window_size) { +} else if ((addr SPAPR_TCE_PAGE_SHIFT) tcet-nb_table) { /* Check if we are in bound */ tce = tcet-table[addr SPAPR_TCE_PAGE_SHIFT]; ret.iova = addr ~SPAPR_TCE_PAGE_MASK; @@ -84,25 +84,15 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr) return ret; } -static int spapr_tce_table_pre_load(void *opaque) -{ -sPAPRTCETable *tcet = SPAPR_TCE_TABLE(opaque); - -tcet-nb_table = tcet-window_size SPAPR_TCE_PAGE_SHIFT; - -return 0; -} - static const VMStateDescription vmstate_spapr_tce_table = { .name = spapr_iommu, -.version_id = 1, -.minimum_version_id = 1, +.version_id = 2, +.minimum_version_id = 2, .minimum_version_id_old = 1, -.pre_load = spapr_tce_table_pre_load, .fields = (VMStateField []) { /* Sanity check */ VMSTATE_UINT32_EQUAL(liobn, sPAPRTCETable), -VMSTATE_UINT32_EQUAL(window_size, sPAPRTCETable), +VMSTATE_UINT32_EQUAL(nb_table, sPAPRTCETable), /* IOMMU state */ VMSTATE_BOOL(bypass, sPAPRTCETable), @@ -122,16 +112,15 @@ static int spapr_tce_table_realize(DeviceState *dev) if (kvm_enabled()) { tcet-table = kvmppc_create_spapr_tce(tcet-liobn, - tcet-window_size, + tcet-nb_table + SPAPR_TCE_PAGE_SHIFT, tcet-fd); } if (!tcet-table) { -size_t table_size = (tcet-window_size SPAPR_TCE_PAGE_SHIFT) -* sizeof(uint64_t); +size_t table_size = tcet-nb_table * sizeof(uint64_t); tcet-table = g_malloc0(table_size); } -tcet-nb_table = tcet-window_size SPAPR_TCE_PAGE_SHIFT; trace_spapr_iommu_new_table(tcet-liobn, tcet, tcet-table, tcet-fd); @@ -143,7 +132,8 @@ static int spapr_tce_table_realize(DeviceState *dev) return 0; } -sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, size_t window_size) +sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, + uint32_t nb_table) { sPAPRTCETable *tcet; @@ -153,13 +143,13 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, size_t wi return NULL; } -if (!window_size) { +if (!nb_table) { return NULL; } tcet = SPAPR_TCE_TABLE(object_new(TYPE_SPAPR_TCE_TABLE)); tcet-liobn = liobn; -tcet-window_size = window_size; +tcet-nb_table = nb_table; object_property_add_child(OBJECT(owner), tce-table, OBJECT(tcet), NULL); @@ -176,7 +166,7 @@ static void spapr_tce_table_finalize(Object *obj) if (!kvm_enabled() || (kvmppc_remove_spapr_tce(tcet-table, tcet-fd, - tcet-window_size) != 0)) { + tcet-nb_table) != 0)) { g_free(tcet-table); } } @@ -194,8 +184,7 @@ void spapr_tce_set_bypass(sPAPRTCETable *tcet, bool bypass) static void spapr_tce_reset(DeviceState *dev) { sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev); -size_t table_size = (tcet-window_size SPAPR_TCE_PAGE_SHIFT) -* sizeof(uint64_t); +size_t table_size = tcet-nb_table * sizeof(uint64_t); tcet-bypass = false; memset(tcet-table, 0, table_size); @@ -206,7 +195,7 @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba, { IOMMUTLBEntry entry; -if (ioba = tcet-window_size) { +if ((ioba SPAPR_TCE_PAGE_SHIFT) = tcet-nb_table) { hcall_dprintf(spapr_vio_put_tce on out-of-bounds IOBA 0x TARGET_FMT_lx \n, ioba); return
[Qemu-devel] [PATCH V6 7/8] target-arm: Introduce per-CPU field for PSCI version
We require to know the PSCI version available to given CPU at potentially many places. Currently, we need to know PSCI version when generating DTB for virt machine. This patch introduce per-CPU 32bit field representing the PSCI version available to the CPU. The encoding of this 32bit field is same as described in PSCI v0.2 spec. Signed-off-by: Pranavkumar Sawargaonkar pranavku...@linaro.org Signed-off-by: Anup Patel anup.pa...@linaro.org --- target-arm/cpu-qom.h |6 ++ target-arm/cpu.c |1 + target-arm/kvm32.c |1 + target-arm/kvm64.c |1 + 4 files changed, 9 insertions(+) diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h index 2bd7df8..eaee944 100644 --- a/target-arm/cpu-qom.h +++ b/target-arm/cpu-qom.h @@ -94,6 +94,12 @@ typedef struct ARMCPU { /* 'compatible' string for this CPU for Linux device trees */ const char *dtb_compatible; +/* PSCI version for this CPU + * Bits[31:16] = Major Version + * Bits[15:0] = Minor Version + */ +uint32_t psci_version; + /* Should CPU start in PSCI powered-off state? */ bool start_powered_off; diff --git a/target-arm/cpu.c b/target-arm/cpu.c index 6c6f2b3..589f34d 100644 --- a/target-arm/cpu.c +++ b/target-arm/cpu.c @@ -260,6 +260,7 @@ static void arm_cpu_initfn(Object *obj) * picky DTB consumer will also provide a helpful error message. */ cpu-dtb_compatible = qemu,unknown; +cpu-psci_version = 1; /* By default assume PSCI v0.1 */ cpu-kvm_target = QEMU_KVM_ARM_TARGET_NONE; if (tcg_enabled() !inited) { diff --git a/target-arm/kvm32.c b/target-arm/kvm32.c index 52d626c..068af7d 100644 --- a/target-arm/kvm32.c +++ b/target-arm/kvm32.c @@ -184,6 +184,7 @@ int kvm_arch_init_vcpu(CPUState *cs) cpu-kvm_init_features[0] |= 1 KVM_ARM_VCPU_POWER_OFF; } if (kvm_check_extension(cs-kvm_state, KVM_CAP_ARM_PSCI_0_2)) { +cpu-psci_version = 2; cpu-kvm_init_features[0] |= 1 KVM_ARM_VCPU_PSCI_0_2; } diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index feeac4a..2634baf 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -92,6 +92,7 @@ int kvm_arch_init_vcpu(CPUState *cs) cpu-kvm_init_features[0] |= 1 KVM_ARM_VCPU_POWER_OFF; } if (kvm_check_extension(cs-kvm_state, KVM_CAP_ARM_PSCI_0_2)) { +cpu-psci_version = 2; cpu-kvm_init_features[0] |= 1 KVM_ARM_VCPU_PSCI_0_2; } -- 1.7.9.5
[Qemu-devel] [PATCH V6 5/8] target-arm: Enable KVM_ARM_VCPU_PSCI_0_2 feature when possible
Latest linux kernel supports in-kernel emulation of PSCI v0.2 but to enable it we need to select KVM_ARM_VCPU_PSCI_0_2 feature using KVM_ARM_VCPU_INIT ioctl. Also, we can use KVM_ARM_VCPU_PSCI_0_2 feature for VCPU only when linux kernel has KVM_CAP_ARM_PSCI_0_2 capability. This patch updates kvm_arch_init_vcpu() to enable KVM_ARM_VCPU_PSCI_0_2 feature for VCPU when KVM ARM/ARM64 has KVM_CAP_ARM_PSCI_0_2 capability. Signed-off-by: Pranavkumar Sawargaonkar pranavku...@linaro.org Signed-off-by: Anup Patel anup.pa...@linaro.org --- target-arm/kvm32.c |3 +++ target-arm/kvm64.c |3 +++ 2 files changed, 6 insertions(+) diff --git a/target-arm/kvm32.c b/target-arm/kvm32.c index b142e90..52d626c 100644 --- a/target-arm/kvm32.c +++ b/target-arm/kvm32.c @@ -183,6 +183,9 @@ int kvm_arch_init_vcpu(CPUState *cs) if (cpu-start_powered_off) { cpu-kvm_init_features[0] |= 1 KVM_ARM_VCPU_POWER_OFF; } +if (kvm_check_extension(cs-kvm_state, KVM_CAP_ARM_PSCI_0_2)) { +cpu-kvm_init_features[0] |= 1 KVM_ARM_VCPU_PSCI_0_2; +} /* Do KVM_ARM_VCPU_INIT ioctl */ ret = kvm_arm_vcpu_init(cs); diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index 1980ddc..dcf621f 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -91,6 +91,9 @@ int kvm_arch_init_vcpu(CPUState *cs) if (cpu-start_powered_off) { cpu-kvm_init_features[0] |= 1 KVM_ARM_VCPU_POWER_OFF; } +if (kvm_check_extension(cs-kvm_state, KVM_CAP_ARM_PSCI_0_2)) { +cpu-kvm_init_features[0] |= 1 KVM_ARM_VCPU_PSCI_0_2; +} /* Do KVM_ARM_VCPU_INIT ioctl */ ret = kvm_arm_vcpu_init(cs); -- 1.7.9.5
[Qemu-devel] [PATCH V6 6/8] target-arm: Implement kvm_arch_reset_vcpu() for KVM ARM64
To implement kvm_arch_reset_vcpu(), we simply re-init the VCPU using kvm_arm_vcpu_init() so that all registers of VCPU are set to their reset values by in-kernel KVM code. Signed-off-by: Pranavkumar Sawargaonkar pranavku...@linaro.org Signed-off-by: Anup Patel anup.pa...@linaro.org Reviewed-by: Peter Maydell peter.mayd...@linaro.org --- target-arm/kvm64.c |4 1 file changed, 4 insertions(+) diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index dcf621f..feeac4a 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -269,4 +269,8 @@ int kvm_arch_get_registers(CPUState *cs) void kvm_arm_reset_vcpu(ARMCPU *cpu) { +/* Re-init VCPU so that all registers are set to + * their respective reset values. + */ +kvm_arm_vcpu_init(CPU(cpu)); } -- 1.7.9.5
[Qemu-devel] [PATCH v2 0/9] spapr_pci: Prepare for VFIO
This patchset prepares QEMU for VFIO support on SPAPR. It also does preparations for Dynamic DMA window feature which allows to create DMA windows with 16MB IOMMU pages which will allow to map the entire guest RAM for DMA at almost no cost. Changelogs are in the patches. Alexey Kardashevskiy (9): spapr: Enable dynamic change of the supported hypercalls list spapr_iommu: Enable multiple TCE requests spapr_pci: Introduce a finish_realize() callback spapr_pci: spapr_iommu: Make DMA window a subregion spapr_pci: Allow multiple TCE tables per PHB spapr_iommu: Convert old qdev_init_nofail() to object_property_set_bool spapr_iommu: Get rid of window_size in sPAPRTCETable spapr_iommu: Introduce page_shift in sPAPRTCETable spapr_iommu: Introduce bus_offset in sPAPRTCETable hw/ppc/spapr.c | 33 +++-- hw/ppc/spapr_iommu.c| 172 +--- hw/ppc/spapr_pci.c | 96 - hw/ppc/spapr_vio.c | 6 +- include/hw/pci-host/spapr.h | 18 - include/hw/ppc/spapr.h | 8 ++- target-ppc/kvm.c| 11 ++- target-ppc/kvm_ppc.h| 8 ++- trace-events| 2 + 9 files changed, 278 insertions(+), 76 deletions(-) -- 1.9.rc0
[Qemu-devel] [PATCH v2 4/9] spapr_pci: spapr_iommu: Make DMA window a subregion
Currently the default DMA window is represented by a single MemoryRegion. However there can be more than just one window so we need a root memory region to be separated from the actual DMA window(s). This introduces a root IOMMU memory region and adds a subregion for the default DMA 32bit window. Following patches will add other subregion(s). This initializes a default DMA window subregion size to the guest RAM size as this window can be switched into bypass mode which implements direct DMA mapping. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/ppc/spapr_iommu.c| 2 +- hw/ppc/spapr_pci.c | 20 ++-- include/hw/pci-host/spapr.h | 1 + 3 files changed, 20 insertions(+), 3 deletions(-) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index ab5037c..241ceeb 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -136,7 +136,7 @@ static int spapr_tce_table_realize(DeviceState *dev) trace_spapr_iommu_new_table(tcet-liobn, tcet, tcet-table, tcet-fd); memory_region_init_iommu(tcet-iommu, OBJECT(dev), spapr_iommu_ops, - iommu-spapr, UINT64_MAX); + iommu-spapr, ram_size); QLIST_INSERT_HEAD(spapr_tce_tables, tcet, list); diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index b141e83..f1684c2 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -610,6 +610,20 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) PCI_DEVFN(0, 0), PCI_NUM_PINS, TYPE_PCI_BUS); phb-bus = bus; +/* + * Initialize PHB address space. + * By default there will be at least one subregion for default + * 32bit DMA window. + * Later the guest might want to create another DMA window + * which will become another memory subregion. + */ +sprintf(namebuf, %s.iommu-root, sphb-dtbusname); + +memory_region_init(sphb-iommu_root, OBJECT(sphb), + namebuf, UINT64_MAX); +address_space_init(sphb-iommu_as, sphb-iommu_root, + sphb-dtbusname); + pci_setup_iommu(bus, spapr_pci_dma_iommu, sphb); pci_bus_set_route_irq_fn(bus, spapr_route_intx_pin_to_irq); @@ -648,8 +662,10 @@ static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp) sphb-dtbusname); return ; } -address_space_init(sphb-iommu_as, spapr_tce_get_iommu(sphb-tcet), - sphb-dtbusname); + +/* Register default 32bit DMA window */ +memory_region_add_subregion(sphb-iommu_root, 0, +spapr_tce_get_iommu(sphb-tcet)); } static void spapr_phb_reset(DeviceState *qdev) diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index ab29281..c98ebdf 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -64,6 +64,7 @@ typedef struct sPAPRPHBState { uint64_t dma_window_size; sPAPRTCETable *tcet; AddressSpace iommu_as; +MemoryRegion iommu_root; struct spapr_pci_lsi { uint32_t irq; -- 1.9.rc0
[Qemu-devel] [PATCH V6 8/8] hw/arm/virt: Use PSCI v0.2 compatible string when KVM or TCG provides it
If we have PSCI v0.2 emulation available for KVM ARM/ARM64 or TCG then we need to provide PSCI v0.2 compatible string via generated DTB. Signed-off-by: Pranavkumar Sawargaonkar pranavku...@linaro.org Signed-off-by: Anup Patel anup.pa...@linaro.org --- hw/arm/virt.c | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index ea4f02d..442363c 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -180,10 +180,23 @@ static void create_fdt(VirtBoardInfo *vbi) clk24mhz); qemu_fdt_setprop_cell(fdt, /apb-pclk, phandle, vbi-clock_phandle); +} + +static void fdt_add_psci_node(const VirtBoardInfo *vbi) +{ +void *fdt = vbi-fdt; +ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(0)); + /* No PSCI for TCG yet */ if (kvm_enabled()) { qemu_fdt_add_subnode(fdt, /psci); -qemu_fdt_setprop_string(fdt, /psci, compatible, arm,psci); +if (armcpu-psci_version == 2) { +const char comp[] = arm,psci-0.2\0arm,psci; +qemu_fdt_setprop(fdt, /psci, compatible, comp, sizeof(comp)); +} else { +qemu_fdt_setprop_string(fdt, /psci, compatible, arm,psci); +} + qemu_fdt_setprop_string(fdt, /psci, method, hvc); qemu_fdt_setprop_cell(fdt, /psci, cpu_suspend, PSCI_FN_CPU_SUSPEND); @@ -446,6 +459,7 @@ static void machvirt_init(QEMUMachineInitArgs *args) object_property_set_bool(cpuobj, true, realized, NULL); } fdt_add_cpu_nodes(vbi); +fdt_add_psci_node(vbi); memory_region_init_ram(ram, NULL, mach-virt.ram, args-ram_size); vmstate_register_ram_global(ram); -- 1.7.9.5
[Qemu-devel] [PATCH v2 8/9] spapr_iommu: Introduce page_shift in sPAPRTCETable
At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR spec allows other page sizes and we are going to implement them, we need page size to be configrable. This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT with it whereever it is possible. This removes SPAPR_TCE_PAGE_MASK as it is no longer used. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- Changes: v2: * target_ulong for mask replaced with hwaddr * added page_mask and page_size local variables where possible --- hw/ppc/spapr_iommu.c | 66 +++--- hw/ppc/spapr_pci.c | 1 + hw/ppc/spapr_vio.c | 1 + include/hw/ppc/spapr.h | 3 ++- 4 files changed, 45 insertions(+), 26 deletions(-) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index 99d1d6e..65f9a89 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -70,12 +70,14 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr) if (tcet-bypass) { ret.perm = IOMMU_RW; -} else if ((addr SPAPR_TCE_PAGE_SHIFT) tcet-nb_table) { +} else if ((addr tcet-page_shift) tcet-nb_table) { /* Check if we are in bound */ -tce = tcet-table[addr SPAPR_TCE_PAGE_SHIFT]; -ret.iova = addr ~SPAPR_TCE_PAGE_MASK; -ret.translated_addr = tce ~SPAPR_TCE_PAGE_MASK; -ret.addr_mask = SPAPR_TCE_PAGE_MASK; +hwaddr page_mask = ~((1 tcet-page_shift) - 1); + +tce = tcet-table[addr tcet-page_shift]; +ret.iova = addr page_mask; +ret.translated_addr = tce page_mask; +ret.addr_mask = ~page_mask; ret.perm = tce; } trace_spapr_iommu_xlate(tcet-liobn, addr, ret.iova, ret.perm, @@ -113,7 +115,7 @@ static int spapr_tce_table_realize(DeviceState *dev) if (kvm_enabled()) { tcet-table = kvmppc_create_spapr_tce(tcet-liobn, tcet-nb_table - SPAPR_TCE_PAGE_SHIFT, + tcet-page_shift, tcet-fd); } @@ -133,6 +135,7 @@ static int spapr_tce_table_realize(DeviceState *dev) } sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, + uint32_t page_shift, uint32_t nb_table) { sPAPRTCETable *tcet; @@ -149,6 +152,7 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, tcet = SPAPR_TCE_TABLE(object_new(TYPE_SPAPR_TCE_TABLE)); tcet-liobn = liobn; +tcet-page_shift = page_shift; tcet-nb_table = nb_table; object_property_add_child(OBJECT(owner), tce-table, OBJECT(tcet), NULL); @@ -194,19 +198,20 @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba, target_ulong tce) { IOMMUTLBEntry entry; +hwaddr page_mask = ~((1 tcet-page_shift) - 1); -if ((ioba SPAPR_TCE_PAGE_SHIFT) = tcet-nb_table) { +if ((ioba tcet-page_shift) = tcet-nb_table) { hcall_dprintf(spapr_vio_put_tce on out-of-bounds IOBA 0x TARGET_FMT_lx \n, ioba); return H_PARAMETER; } -tcet-table[ioba SPAPR_TCE_PAGE_SHIFT] = tce; +tcet-table[ioba tcet-page_shift] = tce; entry.target_as = address_space_memory, -entry.iova = ioba ~SPAPR_TCE_PAGE_MASK; -entry.translated_addr = tce ~SPAPR_TCE_PAGE_MASK; -entry.addr_mask = SPAPR_TCE_PAGE_MASK; +entry.iova = ioba page_mask; +entry.translated_addr = tce page_mask; +entry.addr_mask = ~page_mask; entry.perm = tce; memory_region_notify_iommu(tcet-iommu, entry); @@ -226,6 +231,7 @@ static target_ulong h_put_tce_indirect(PowerPCCPU *cpu, target_ulong ret = H_PARAMETER; sPAPRTCETable *tcet = spapr_tce_find_by_liobn(liobn); CPUState *cs = CPU(cpu); +hwaddr page_mask, page_size; if (!tcet) { return H_PARAMETER; @@ -235,12 +241,15 @@ static target_ulong h_put_tce_indirect(PowerPCCPU *cpu, return H_PARAMETER; } -ioba = ~SPAPR_TCE_PAGE_MASK; -tce_list = ~SPAPR_TCE_PAGE_MASK; +page_mask = ~((1 tcet-page_shift) - 1); +page_size = (1 tcet-page_shift); +ioba = page_mask; + +for (i = 0; i npages; ++i, ioba += page_size) { +target_ulong off = (tce_list ~SPAPR_TCE_RW) + +i * sizeof(target_ulong); +target_ulong tce = ldq_phys(cs-as, off); -for (i = 0; i npages; ++i, ioba += SPAPR_TCE_PAGE_SIZE) { -target_ulong tce = ldq_phys(cs-as, tce_list + -i * sizeof(target_ulong)); ret = put_tce_emu(tcet, ioba, tce); if (ret) { break; @@ -267,6 +276,7 @@ static target_ulong h_stuff_tce(PowerPCCPU *cpu, sPAPREnvironment *spapr, target_ulong npages = args[3]; target_ulong ret = H_PARAMETER;
[Qemu-devel] [PATCH 2/3] console: rework text terminal cursor logic
Have a global timer. Update all visible terminal windows syncronously. Right now this can be the active_console only, but that will change soon. The global timer will disable itself if not needed, so we only have to care start it if needed. Which might be at console switch time or when a new displaychangelistener is registered. Signed-off-by: Gerd Hoffmann kra...@redhat.com --- ui/console.c | 50 -- 1 file changed, 32 insertions(+), 18 deletions(-) diff --git a/ui/console.c b/ui/console.c index 85f46ae..f6ce0ef 100644 --- a/ui/console.c +++ b/ui/console.c @@ -143,8 +143,6 @@ struct QemuConsole { TextCell *cells; int text_x[2], text_y[2], cursor_invalidate; int echo; -bool cursor_visible_phase; -QEMUTimer *cursor_timer; int update_x0; int update_y0; @@ -177,10 +175,14 @@ static DisplayState *display_state; static QemuConsole *active_console; static QemuConsole *consoles[MAX_CONSOLES]; static int nb_consoles = 0; +static bool cursor_visible_phase; +static QEMUTimer *cursor_timer; static void text_console_do_init(CharDriverState *chr, DisplayState *ds); static void dpy_refresh(DisplayState *s); static DisplayState *get_alloc_displaystate(void); +static void text_console_update_cursor_timer(void); +static void text_console_update_cursor(void *opaque); static void gui_update(void *opaque) { @@ -530,7 +532,7 @@ static void console_show_cursor(QemuConsole *s, int show) } if (y s-height) { c = s-cells[y1 * s-width + x]; -if (show s-cursor_visible_phase) { +if (show cursor_visible_phase) { TextAttributes t_attrib = s-t_attrib_default; t_attrib.invers = !(t_attrib.invers); /* invert fg and bg */ vga_putcharxy(s, x, y, c-ch, t_attrib); @@ -989,9 +991,6 @@ void console_select(unsigned int index) if (s) { DisplayState *ds = s-ds; -if (active_console active_console-cursor_timer) { -timer_del(active_console-cursor_timer); -} active_console = s; if (ds-have_gfx) { QLIST_FOREACH(dcl, ds-listeners, next) { @@ -1008,10 +1007,7 @@ void console_select(unsigned int index) if (ds-have_text) { dpy_text_resize(s, s-width, s-height); } -if (s-cursor_timer) { -timer_mod(s-cursor_timer, - qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + CONSOLE_CURSOR_PERIOD / 2); -} +text_console_update_cursor(NULL); } } @@ -1314,6 +1310,7 @@ void register_displaychangelistener(DisplayChangeListener *dcl) dcl-ops-dpy_gfx_switch(dcl, dummy); } } +text_console_update_cursor(NULL); } void update_displaychangelistener(DisplayChangeListener *dcl, @@ -1537,6 +1534,8 @@ static DisplayState *get_alloc_displaystate(void) { if (!display_state) { display_state = g_new0(DisplayState, 1); +cursor_timer = timer_new_ms(QEMU_CLOCK_REALTIME, +text_console_update_cursor, NULL); } return display_state; } @@ -1696,14 +1695,32 @@ static void text_console_set_echo(CharDriverState *chr, bool echo) s-echo = echo; } +static void text_console_update_cursor_timer(void) +{ +timer_mod(cursor_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + + CONSOLE_CURSOR_PERIOD / 2); +} + static void text_console_update_cursor(void *opaque) { -QemuConsole *s = opaque; +QemuConsole *s; +int i, count = 0; + +cursor_visible_phase = !cursor_visible_phase; -s-cursor_visible_phase = !s-cursor_visible_phase; -graphic_hw_invalidate(s); -timer_mod(s-cursor_timer, - qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + CONSOLE_CURSOR_PERIOD / 2); +for (i = 0; i nb_consoles; i++) { +s = consoles[i]; +if (qemu_console_is_graphic(s) || +!qemu_console_is_visible(s)) { +continue; +} +count++; +graphic_hw_invalidate(s); +} + +if (count) { +text_console_update_cursor_timer(); +} } static const GraphicHwOps text_console_ops = { @@ -1739,9 +1756,6 @@ static void text_console_do_init(CharDriverState *chr, DisplayState *ds) s-surface = qemu_create_displaysurface(g_width, g_height); } -s-cursor_timer = -timer_new_ms(QEMU_CLOCK_REALTIME, text_console_update_cursor, s); - s-hw_ops = text_console_ops; s-hw = s; -- 1.8.3.1
Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD
Hi, After applied your patch, 'xen-platform' is always disabled by default, right? Only in case -nodefaults is passed on the qemu command line (don't know whenever libxl does that). gfx_passthru=1 pci=[00:02.0@2, 00:1a.0] xen_platform_pci=0 ^^ That line isn't needed ... device_model_args_hvm = ['-device', 'xen-platform,addr=0x3'] That alone should be enough to move away the platform device. cheers, Gerd
[Qemu-devel] [PATCH v2 3/9] spapr_pci: Introduce a finish_realize() callback
The spapr-pci PHB initializes IOMMU for emulated devices only. The upcoming VFIO support will do it different. However both emulated and VFIO PHB types share most of the initialization code. For the type specific things a new finish_realize() callback is introduced. This introduces sPAPRPHBClass derived from PCIHostBridgeClass and adds the callback pointer. This implements finish_realize() for emulated devices. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- The difference to VFIO is that for VFIO we have to ask the kernel about DMA window size before calling spapr_tce_new_table() and then we have to tell VFIO KVM device about LIOBN - IOMMU link. --- hw/ppc/spapr_pci.c | 38 ++ include/hw/pci-host/spapr.h | 14 ++ 2 files changed, 40 insertions(+), 12 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 1db73f2..b141e83 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -528,6 +528,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) SysBusDevice *s = SYS_BUS_DEVICE(dev); sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(s); PCIHostState *phb = PCI_HOST_BRIDGE(s); +sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(s); char *namebuf; int i; PCIBus *bus; @@ -609,18 +610,6 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) PCI_DEVFN(0, 0), PCI_NUM_PINS, TYPE_PCI_BUS); phb-bus = bus; -sphb-dma_window_start = 0; -sphb-dma_window_size = 0x4000; -sphb-tcet = spapr_tce_new_table(dev, sphb-dma_liobn, - sphb-dma_window_size); -if (!sphb-tcet) { -error_setg(errp, Unable to create TCE table for %s, - sphb-dtbusname); -return; -} -address_space_init(sphb-iommu_as, spapr_tce_get_iommu(sphb-tcet), - sphb-dtbusname); - pci_setup_iommu(bus, spapr_pci_dma_iommu, sphb); pci_bus_set_route_irq_fn(bus, spapr_route_intx_pin_to_irq); @@ -639,6 +628,28 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) sphb-lsi_table[i].irq = irq; } + +if (!info-finish_realize) { +error_setg(errp, finish_realize not defined); +return; +} + +info-finish_realize(sphb, errp); +} + +static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp) +{ +sphb-dma_window_start = 0; +sphb-dma_window_size = 0x4000; +sphb-tcet = spapr_tce_new_table(DEVICE(sphb), sphb-dma_liobn, + sphb-dma_window_size); +if (!sphb-tcet) { +error_setg(errp, Unable to create TCE table for %s, + sphb-dtbusname); +return ; +} +address_space_init(sphb-iommu_as, spapr_tce_get_iommu(sphb-tcet), + sphb-dtbusname); } static void spapr_phb_reset(DeviceState *qdev) @@ -722,6 +733,7 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data) { PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass); DeviceClass *dc = DEVICE_CLASS(klass); +sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_CLASS(klass); hc-root_bus_path = spapr_phb_root_bus_path; dc-realize = spapr_phb_realize; @@ -730,6 +742,7 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data) dc-vmsd = vmstate_spapr_pci; set_bit(DEVICE_CATEGORY_BRIDGE, dc-categories); dc-cannot_instantiate_with_device_add_yet = false; +spc-finish_realize = spapr_phb_finish_realize; } static const TypeInfo spapr_phb_info = { @@ -737,6 +750,7 @@ static const TypeInfo spapr_phb_info = { .parent= TYPE_PCI_HOST_BRIDGE, .instance_size = sizeof(sPAPRPHBState), .class_init= spapr_phb_class_init, +.class_size= sizeof(sPAPRPHBClass), }; PCIHostState *spapr_create_phb(sPAPREnvironment *spapr, int index) diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index 970b4a9..ab29281 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -34,6 +34,20 @@ #define SPAPR_PCI_HOST_BRIDGE(obj) \ OBJECT_CHECK(sPAPRPHBState, (obj), TYPE_SPAPR_PCI_HOST_BRIDGE) +#define SPAPR_PCI_HOST_BRIDGE_CLASS(klass) \ + OBJECT_CLASS_CHECK(sPAPRPHBClass, (klass), TYPE_SPAPR_PCI_HOST_BRIDGE) +#define SPAPR_PCI_HOST_BRIDGE_GET_CLASS(obj) \ + OBJECT_GET_CLASS(sPAPRPHBClass, (obj), TYPE_SPAPR_PCI_HOST_BRIDGE) + +typedef struct sPAPRPHBClass sPAPRPHBClass; +typedef struct sPAPRPHBState sPAPRPHBState; + +struct sPAPRPHBClass { +PCIHostBridgeClass parent_class; + +void (*finish_realize)(sPAPRPHBState *sphb, Error **errp); +}; + typedef struct sPAPRPHBState { PCIHostState parent_obj; -- 1.9.rc0
[Qemu-devel] [PATCH v2 9/9] spapr_iommu: Introduce bus_offset in sPAPRTCETable
This adds @bus_offset into sPAPRTCETable to tell where TCE table starts from. It is set to 0 for emulated devices. Dynamic DMA windows will use other offset. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/ppc/spapr_iommu.c | 13 + hw/ppc/spapr_pci.c | 5 +++-- hw/ppc/spapr_vio.c | 1 + include/hw/ppc/spapr.h | 2 ++ 4 files changed, 15 insertions(+), 6 deletions(-) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index 65f9a89..1c69cf7 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -135,6 +135,7 @@ static int spapr_tce_table_realize(DeviceState *dev) } sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, + uint64_t bus_offset, uint32_t page_shift, uint32_t nb_table) { @@ -152,6 +153,7 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, tcet = SPAPR_TCE_TABLE(object_new(TYPE_SPAPR_TCE_TABLE)); tcet-liobn = liobn; +tcet-bus_offset = bus_offset; tcet-page_shift = page_shift; tcet-nb_table = nb_table; @@ -199,14 +201,15 @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba, { IOMMUTLBEntry entry; hwaddr page_mask = ~((1 tcet-page_shift) - 1); +unsigned long index = (ioba - tcet-bus_offset) tcet-page_shift; -if ((ioba tcet-page_shift) = tcet-nb_table) { +if (index = tcet-nb_table) { hcall_dprintf(spapr_vio_put_tce on out-of-bounds IOBA 0x TARGET_FMT_lx \n, ioba); return H_PARAMETER; } -tcet-table[ioba tcet-page_shift] = tce; +tcet-table[index] = tce; entry.target_as = address_space_memory, entry.iova = ioba page_mask; @@ -325,13 +328,15 @@ static target_ulong h_put_tce(PowerPCCPU *cpu, sPAPREnvironment *spapr, static target_ulong get_tce_emu(sPAPRTCETable *tcet, target_ulong ioba, target_ulong *tce) { -if ((ioba tcet-page_shift) = tcet-nb_table) { +unsigned long index = (ioba - tcet-bus_offset) tcet-page_shift; + +if (index = tcet-nb_table) { hcall_dprintf(spapr_iommu_get_tce on out-of-bounds IOBA 0x TARGET_FMT_lx \n, ioba); return H_PARAMETER; } -*tce = tcet-table[ioba tcet-page_shift]; +*tce = tcet-table[index]; return H_SUCCESS; } diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index c9850d4..ddfd8bb 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -656,6 +656,7 @@ static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp) sPAPRTCETable *tcet; tcet = spapr_tce_new_table(DEVICE(sphb), sphb-dma_liobn, + 0, SPAPR_TCE_PAGE_SHIFT, 0x4000 SPAPR_TCE_PAGE_SHIFT); if (!tcet) { @@ -816,8 +817,8 @@ static int spapr_phb_children_dt(Object *child, void *opaque) } spapr_dma_dt(p-fdt, p-node_off, ibm,dma-window, - tcet-liobn, 0, - tcet-nb_table SPAPR_TCE_PAGE_SHIFT); + tcet-liobn, tcet-bus_offset, + tcet-nb_table tcet-page_shift); /* Stop after the first window */ return 1; diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c index d7e9e6a..48b0125 100644 --- a/hw/ppc/spapr_vio.c +++ b/hw/ppc/spapr_vio.c @@ -457,6 +457,7 @@ static int spapr_vio_busdev_init(DeviceState *qdev) if (pc-rtce_window_size) { uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev-reg; dev-tcet = spapr_tce_new_table(qdev, liobn, +0, SPAPR_TCE_PAGE_SHIFT, pc-rtce_window_size SPAPR_TCE_PAGE_SHIFT); diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h index 2e88e52..c848c50 100644 --- a/include/hw/ppc/spapr.h +++ b/include/hw/ppc/spapr.h @@ -391,6 +391,7 @@ struct sPAPRTCETable { DeviceState parent; uint32_t liobn; uint32_t nb_table; +uint64_t bus_offset; uint32_t page_shift; uint64_t *table; bool bypass; @@ -402,6 +403,7 @@ struct sPAPRTCETable { void spapr_events_init(sPAPREnvironment *spapr); void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq); sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, + uint64_t bus_offset, uint32_t page_shift, uint32_t nb_table); MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet); -- 1.9.rc0
[Qemu-devel] [PATCH v2 6/9] spapr_iommu: Convert old qdev_init_nofail() to object_property_set_bool
qdev_init_nofail() was replaced by object_property_set_bool(realized) all over the QEMU so do we. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/ppc/spapr_iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index 241ceeb..0dd6509 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -163,7 +163,7 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, size_t wi object_property_add_child(OBJECT(owner), tce-table, OBJECT(tcet), NULL); -qdev_init_nofail(DEVICE(tcet)); +object_property_set_bool(OBJECT(tcet), true, realized, NULL); return tcet; } -- 1.9.rc0
[Qemu-devel] [PATCH v2 2/9] spapr_iommu: Enable multiple TCE requests
Currently only single TCE entry per request is supported (H_PUT_TCE). However PAPR+ specification allows multiple entry requests such as H_PUT_TCE_INDIRECT and H_STUFF_TCE. Having less transitions to the host kernel via ioctls, support of these calls can accelerate IOMMU operations. This implements H_STUFF_TCE and H_PUT_TCE_INDIRECT. This advertises multi-tce capability to the guest if the host kernel supports it (KVM_CAP_SPAPR_MULTITCE) or guest is running in TCG mode. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- Changes: v2: * multi-tce enabled explicitely for TCG, it was implicit * kvmppc_spapr_use_multitce() does not handle TCG anymore v1: * removed checks for liobn as the check is performed already in spapr_tce_find_by_liobn * added hcall-multi-tce if the host kernel supports the capability --- hw/ppc/spapr.c | 3 ++ hw/ppc/spapr_iommu.c | 78 target-ppc/kvm.c | 7 + target-ppc/kvm_ppc.h | 6 trace-events | 2 ++ 5 files changed, 96 insertions(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index a16945d..921fe5d 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -498,6 +498,9 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base, /* RTAS */ _FDT((fdt_begin_node(fdt, rtas))); +if (!kvm_enabled() || kvmppc_spapr_use_multitce()) { +add_str(hypertas, hcall-multi-tce); +} _FDT((fdt_property(fdt, ibm,hypertas-functions, hypertas-str, hypertas-len))); g_string_free(hypertas, TRUE); diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index 72493d8..ab5037c 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -224,6 +224,82 @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba, return H_SUCCESS; } +static target_ulong h_put_tce_indirect(PowerPCCPU *cpu, + sPAPREnvironment *spapr, + target_ulong opcode, target_ulong *args) +{ +int i; +target_ulong liobn = args[0]; +target_ulong ioba = args[1]; +target_ulong ioba1 = ioba; +target_ulong tce_list = args[2]; +target_ulong npages = args[3]; +target_ulong ret = H_PARAMETER; +sPAPRTCETable *tcet = spapr_tce_find_by_liobn(liobn); +CPUState *cs = CPU(cpu); + +if (!tcet) { +return H_PARAMETER; +} + +if (npages 512) { +return H_PARAMETER; +} + +ioba = ~SPAPR_TCE_PAGE_MASK; +tce_list = ~SPAPR_TCE_PAGE_MASK; + +for (i = 0; i npages; ++i, ioba += SPAPR_TCE_PAGE_SIZE) { +target_ulong tce = ldq_phys(cs-as, tce_list + +i * sizeof(target_ulong)); +ret = put_tce_emu(tcet, ioba, tce); +if (ret) { +break; +} +} + +/* Trace last successful or the first problematic entry */ +i = i ? (i - 1) : 0; +trace_spapr_iommu_indirect(liobn, ioba1, tce_list, i, + ldq_phys(cs-as, + tce_list + i * sizeof(target_ulong)), + ret); + +return ret; +} + +static target_ulong h_stuff_tce(PowerPCCPU *cpu, sPAPREnvironment *spapr, + target_ulong opcode, target_ulong *args) +{ +int i; +target_ulong liobn = args[0]; +target_ulong ioba = args[1]; +target_ulong tce_value = args[2]; +target_ulong npages = args[3]; +target_ulong ret = H_PARAMETER; +sPAPRTCETable *tcet = spapr_tce_find_by_liobn(liobn); + +if (!tcet) { +return H_PARAMETER; +} + +if (npages tcet-nb_table) { +return H_PARAMETER; +} + +ioba = ~SPAPR_TCE_PAGE_MASK; + +for (i = 0; i npages; ++i, ioba += SPAPR_TCE_PAGE_SIZE) { +ret = put_tce_emu(tcet, ioba, tce_value); +if (ret) { +break; +} +} +trace_spapr_iommu_stuff(liobn, ioba, tce_value, npages, ret); + +return ret; +} + static target_ulong h_put_tce(PowerPCCPU *cpu, sPAPREnvironment *spapr, target_ulong opcode, target_ulong *args) { @@ -332,6 +408,8 @@ static void spapr_tce_table_class_init(ObjectClass *klass, void *data) /* hcall-tce */ spapr_register_hypercall(H_PUT_TCE, h_put_tce); spapr_register_hypercall(H_GET_TCE, h_get_tce); +spapr_register_hypercall(H_PUT_TCE_INDIRECT, h_put_tce_indirect); +spapr_register_hypercall(H_STUFF_TCE, h_stuff_tce); } static TypeInfo spapr_tce_table_info = { diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index ca31027..bcf2db8 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -62,6 +62,7 @@ static int cap_booke_sregs; static int cap_ppc_smt; static int cap_ppc_rma; static int cap_spapr_tce; +static int cap_spapr_multitce; static int cap_hior; static int cap_one_reg; static int cap_epr; @@ -98,6 +99,7 @@ int kvm_arch_init(KVMState *s) cap_ppc_smt = kvm_check_extension(s, KVM_CAP_PPC_SMT);
[Qemu-devel] [PATCH v2 5/9] spapr_pci: Allow multiple TCE tables per PHB
At the moment sPAPRPHBState contains a @tcet pointer to the only TCE table. However sPAPR spec allows having more than one DMA window. Since the TCE object is already a child of SPAPR PHB object, there is no need to keep an additional pointer to it in sPAPRPHBState so remove it. This changes the way sPAPRPHBState::reset performs reset of sPAPRTCETable objects. This changes the default DMA window properties calculation. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- The only reason for sPAPRPHBState to keep a pointer to sPAPRTCETable is to have a direct link to calculate default 32bit window properties. So I decided to replace the link with a spapr_phb_children_dt() loop and use first TCE table for default window. Is that ok or ugly? --- hw/ppc/spapr_pci.c | 54 - include/hw/pci-host/spapr.h | 1 - 2 files changed, 43 insertions(+), 12 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index f1684c2..aa29116 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -653,11 +653,13 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp) { +sPAPRTCETable *tcet; + sphb-dma_window_start = 0; sphb-dma_window_size = 0x4000; -sphb-tcet = spapr_tce_new_table(DEVICE(sphb), sphb-dma_liobn, - sphb-dma_window_size); -if (!sphb-tcet) { +tcet = spapr_tce_new_table(DEVICE(sphb), sphb-dma_liobn, + sphb-dma_window_size); +if (!tcet) { error_setg(errp, Unable to create TCE table for %s, sphb-dtbusname); return ; @@ -665,16 +667,24 @@ static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp) /* Register default 32bit DMA window */ memory_region_add_subregion(sphb-iommu_root, 0, -spapr_tce_get_iommu(sphb-tcet)); +spapr_tce_get_iommu(tcet)); +} + +static int spapr_phb_children_reset(Object *child, void *opaque) +{ +DeviceState *dev = (DeviceState *) object_dynamic_cast(child, TYPE_DEVICE); + +if (dev) { +device_reset(dev); +} + +return 0; } static void spapr_phb_reset(DeviceState *qdev) { -SysBusDevice *s = SYS_BUS_DEVICE(qdev); -sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(s); - /* Reset the IOMMU state */ -device_reset(DEVICE(sphb-tcet)); +object_child_foreach(OBJECT(qdev), spapr_phb_children_reset, NULL); } static Property spapr_phb_properties[] = { @@ -791,6 +801,29 @@ PCIHostState *spapr_create_phb(sPAPREnvironment *spapr, int index) #define b_fff(x)b_x((x), 8, 3) /* function number */ #define b_(x) b_x((x), 0, 8) /* register number */ +typedef struct sPAPRTCEDT { +void *fdt; +int node_off; +} sPAPRTCEDT; + +static int spapr_phb_children_dt(Object *child, void *opaque) +{ +sPAPRTCEDT *p = opaque; +sPAPRTCETable *tcet; + +tcet = (sPAPRTCETable *) object_dynamic_cast(child, TYPE_SPAPR_TCE_TABLE); +if (!tcet) { +return 0; +} + +spapr_dma_dt(p-fdt, p-node_off, ibm,dma-window, + tcet-liobn, 0, + tcet-window_size); +/* Stop after the first window */ + +return 1; +} + int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t xics_phandle, void *fdt) @@ -870,9 +903,8 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb, _FDT(fdt_setprop(fdt, bus_off, interrupt-map, interrupt_map, sizeof(interrupt_map))); -spapr_dma_dt(fdt, bus_off, ibm,dma-window, - phb-dma_liobn, phb-dma_window_start, - phb-dma_window_size); +object_child_foreach(OBJECT(phb), spapr_phb_children_dt, + ((sPAPRTCEDT){ .fdt = fdt, .node_off = bus_off })); return 0; } diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index c98ebdf..5ea4745 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -62,7 +62,6 @@ typedef struct sPAPRPHBState { uint32_t dma_liobn; uint64_t dma_window_start; uint64_t dma_window_size; -sPAPRTCETable *tcet; AddressSpace iommu_as; MemoryRegion iommu_root; -- 1.9.rc0
[Qemu-devel] [PATCH v2 1/9] spapr: Enable dynamic change of the supported hypercalls list
At the moment the ibm,hypertas-functions list is fixed. However some calls should be listed there if they are supported by QEMU or the host kernel. This enables hyperrtas_prop to grow on stack by adding a SPAPR_HYPERRTAS_ADD macro. qemu,hypertas-functions is converted as well. The first user of this is going to be a multi-tce property. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- Changes: v2: * replaced alloca() with GString * removed inline from add_str() definition --- hw/ppc/spapr.c | 30 +++--- 1 file changed, 23 insertions(+), 7 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 0a61246..a16945d 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -293,6 +293,10 @@ static size_t create_page_sizes_prop(CPUPPCState *env, uint32_t *prop, } \ } while (0) +static void add_str(GString *s, const gchar *s1) +{ +g_string_append_len(s, s1, strlen(s1) + 1); +} static void *spapr_create_fdt_skel(hwaddr initrd_base, hwaddr initrd_size, @@ -306,9 +310,8 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base, CPUState *cs; uint32_t start_prop = cpu_to_be32(initrd_base); uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size); -char hypertas_prop[] = hcall-pft\0hcall-term\0hcall-dabr\0hcall-interrupt -\0hcall-tce\0hcall-vio\0hcall-splpar\0hcall-bulk\0hcall-set-mode; -char qemu_hypertas_prop[] = hcall-memop1; +GString *hypertas = g_string_sized_new(256); +GString *qemu_hypertas = g_string_sized_new(256); uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)}; uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)}; int i, smt = kvmppc_smt_threads(); @@ -317,6 +320,17 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base, unsigned sockets = opts ? qemu_opt_get_number(opts, sockets, 0) : 0; uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1; +add_str(hypertas, hcall-pft); +add_str(hypertas, hcall-term); +add_str(hypertas, hcall-dabr); +add_str(hypertas, hcall-interrupt); +add_str(hypertas, hcall-tce); +add_str(hypertas, hcall-vio); +add_str(hypertas, hcall-splpar); +add_str(hypertas, hcall-bulk); +add_str(hypertas, hcall-set-mode); +add_str(qemu_hypertas, hcall-memop1); + fdt = g_malloc0(FDT_MAX_SIZE); _FDT((fdt_create(fdt, FDT_MAX_SIZE))); @@ -484,10 +498,12 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base, /* RTAS */ _FDT((fdt_begin_node(fdt, rtas))); -_FDT((fdt_property(fdt, ibm,hypertas-functions, hypertas_prop, - sizeof(hypertas_prop; -_FDT((fdt_property(fdt, qemu,hypertas-functions, qemu_hypertas_prop, - sizeof(qemu_hypertas_prop; +_FDT((fdt_property(fdt, ibm,hypertas-functions, hypertas-str, + hypertas-len))); +g_string_free(hypertas, TRUE); +_FDT((fdt_property(fdt, qemu,hypertas-functions, qemu_hypertas-str, + qemu_hypertas-len))); +g_string_free(qemu_hypertas, TRUE); _FDT((fdt_property(fdt, ibm,associativity-reference-points, refpoints, sizeof(refpoints; -- 1.9.rc0
Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD
On Thu, May 22, 2014 at 10:50:10AM +, Chen, Tiejun wrote: -Original Message- From: Gerd Hoffmann [mailto:kra...@redhat.com] Sent: Thursday, May 22, 2014 2:45 PM To: Chen, Tiejun Cc: Anthony PERARD; Daniel P. Berrange; peter.mayd...@linaro.org; xen-de...@lists.xensource.com; m...@redhat.com; stefano.stabell...@eu.citrix.com; Kay, Allen M; kelly.zyta...@amd.com; qemu-devel@nongnu.org; Zhang, Yang Z; anth...@codemonkey.ws Subject: Re: [Qemu-devel] [v2][PATCH 4/8] xen, gfx passthrough: reserve 00:02.0 for INTEL IGD Hi, Another useful thing would be to not create the xen platform device in case -nodefaults was specified on the command line (that switch turns off a bunch of other devices present by default: vga, nic, cdrom, ...). Currently looks 'xen-platform' itself can't be created, not those devices existed on that. The error message looks more like libxl tries to hot-unplug the xen platform device. Attached patch (untested!) hooks up the xen platform device to the default device code we have in qemu. Two effects: (1) As mentioned above the device will not be created in case -nodefaults is specified on the command line. (2) Autocreating the device is also turned off in case xen-platform is added manually via -device. With the patch applied you should be able to move the xen-platform device to some other place with a simple 'qemu -device xen-platform,addr=$slot'. Gerd, Sorry, I may misunderstand what you mean previously then have a wrong test. So this still doesn't work actually. After applied your patch, 'xen-platform' is always disabled by default, right? So 00:02.0 is left naturally to be assigned to IGD as we expect like this, tchen0@tchen0-HVM-domU:~$ lspci 00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06) 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03) 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06) 00:03.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 04) 00:1f.0 ISA bridge: Intel Corporation Q87 Express LPC Controller (rev 04) Then this is fine but if you intend to add a 'qemu -device xen-platform,addr=$slot', this doesn't work well. In my case, gfx_passthru=1 pci=[00:02.0@2, 00:1a.0] xen_platform_pci=0 device_model_args_hvm = ['-device', 'xen-platform,addr=0x3'] tchen0@tchen0-linux:~/workspace$ sudo xl cr domu-cfg Parsing config from domu-cfg libxl: error: libxl_qmp.c:287:qmp_handle_error_response: received an error message from QMP server: Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set libxl: error: libxl_create.c:1277:domcreate_attach_pci: libxl_device_pci_add failed: -3 Thanks Tiejun Weird: this implies use_acpi_pci_hotplug got set somehow. -- MST