Re: [Qemu-devel] [PATCH v5] vfio-ccw: support async command subregion
On 06/12/2019 05:38 AM, Cornelia Huck wrote: On Tue, 11 Jun 2019 15:33:59 -0400 Farhan Ali wrote: On 06/07/2019 10:53 AM, Cornelia Huck wrote: A vfio-ccw device may provide an async command subregion for issuing halt/clear subchannel requests. If it is present, use it for sending halt/clear request to the device; if not, fall back to emulation (as done today). Signed-off-by: Cornelia Huck --- v4->v5: - It seems we need to take the indirection via the class for the callbacks after all :( - Dropped Eric's R-b: for that reason --- hw/s390x/css.c | 27 +++-- hw/s390x/s390-ccw.c | 20 +++ hw/vfio/ccw.c | 112 +++- include/hw/s390x/css.h | 3 + include/hw/s390x/s390-ccw.h | 2 + 5 files changed, 158 insertions(+), 6 deletions(-) @@ -309,11 +400,26 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp) vcdev->io_region_offset = info->offset; vcdev->io_region = g_malloc0(info->size); +/* check for the optional async command region */ +ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW, + VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD, ); +if (!ret) { +vcdev->async_cmd_region_size = info->size; +if (sizeof(*vcdev->async_cmd_region) != vcdev->async_cmd_region_size) { +error_setg(errp, "vfio: Unexpected size of the async cmd region"); +g_free(info); +return; +} +vcdev->async_cmd_region_offset = info->offset; +vcdev->async_cmd_region = g_malloc0(info->size); +} + g_free(info); } static void vfio_ccw_put_region(VFIOCCWDevice *vcdev) { +g_free(vcdev->async_cmd_region); g_free(vcdev->io_region); } I think we can have a memory leak given how the code is currently structured and how we call vfio_ccw_get_region. vfio_ccw_get_region is called in vfio_ccw_realize. So if we return an error from vfio_ccw_get_region, we would jump to out_region_err in vfio_ccw_realize which would call vfio_ccw_put_device. Now we can also return an error from vfio_ccw_get_region for the async region, and so we might never end up freeing the io_region for which we allocated memory successfully. I think we would also need to change vfio_ccw_realize, no? Indeed, you're right. I have the following change on top: diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index c9d1c76b4d04..d21ac24f743c 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -407,6 +407,7 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp) vcdev->async_cmd_region_size = info->size; if (sizeof(*vcdev->async_cmd_region) != vcdev->async_cmd_region_size) { error_setg(errp, "vfio: Unexpected size of the async cmd region"); +g_free(vcdev->io_region); g_free(info); return; } This looks good to me. Anything else before I respin? Nothing else that jumped out to me. Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH v5] vfio-ccw: support async command subregion
On 06/11/2019 07:37 AM, Cornelia Huck wrote: On Fri, 7 Jun 2019 11:19:09 -0400 Farhan Ali wrote: On 06/07/2019 11:09 AM, Cornelia Huck wrote: On Fri, 7 Jun 2019 11:02:36 -0400 Farhan Ali wrote: On 06/07/2019 10:53 AM, Cornelia Huck wrote: diff --git a/hw/s390x/css.c b/hw/s390x/css.c index ad310b9f94bc..b92395f165e6 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -22,6 +22,7 @@ #include "trace.h" #include "hw/s390x/s390_flic.h" #include "hw/s390x/s390-virtio-ccw.h" +#include "hw/s390x/s390-ccw.h" typedef struct CrwContainer { CRW crw; @@ -1205,6 +1206,26 @@ static void sch_handle_start_func_virtual(SubchDev *sch) } +static void sch_handle_halt_func_passthrough(SubchDev *sch) +{ +int ret; + +ret = s390_ccw_halt(sch); +if (ret == -ENOSYS) { +sch_handle_halt_func(sch); +} +} + +static void sch_handle_clear_func_passthrough(SubchDev *sch) +{ +int ret; + +ret = s390_ccw_clear(sch); +if (ret == -ENOSYS) { +sch_handle_clear_func(sch); +} +} + do we need an extra s390_ccw_clear/halt functions? can't we just call cdc->clear/halt in the passthrough functions? I mostly added them for symmetry reasons... we still need to check for presence of the callback in any case, though. (vfio is not always built, e.g. on windows or os x.) right, but if we are calling do_subchannel_work_passthrough, then we know for sure we are building the S390CCWDevice which is the vfio device, no? So we could just add checks for callbacks in sch_handle_clear/halt_func_passthrough, no? I would even like to get rid of the s390_ccw_cmd_request if we can, but that is out of scope for this patch. :) Ok, I just walked through various source files (some of which are a bit confusingly named) again and now I understand again why it was done that way in the first place. - hw/s390x/s390-ccw.c provides some interfaces for pass-through ccw devices. It is built unconditionally, and its interfaces are called unconditionally from the css code. It also provides a device class where code can hook up callbacks. - hw/vfio/ccw.c (which is not built for !KVM) actually hooks up callbacks in that device class. So, s390-ccw.c (not to be confused with ccw-device.c...) provides a layer that makes it possible to call things unconditionally, regardless whether we have vfio-ccw available or not. Not that the code ends up being called without vfio-ccw support; but the class indirection enables the code to be built. Okay, now I get it. Thanks for the explanation, I really do appreciate it! :) There's possibly a way to make this work without the class indirection as well, but I think we'd end up doing code juggling before ending up with something that's not really nicer than the code we have now. Therefore, I'd prefer to keep the class handling and hook up the halt/clear callbacks there as well. Yeah I agree, given the constraints I don't think any code juggling would look any prettier. Thanks Farhan
Re: [Qemu-devel] [PATCH v5] vfio-ccw: support async command subregion
On 06/07/2019 10:53 AM, Cornelia Huck wrote: A vfio-ccw device may provide an async command subregion for issuing halt/clear subchannel requests. If it is present, use it for sending halt/clear request to the device; if not, fall back to emulation (as done today). Signed-off-by: Cornelia Huck --- v4->v5: - It seems we need to take the indirection via the class for the callbacks after all :( - Dropped Eric's R-b: for that reason --- hw/s390x/css.c | 27 +++-- hw/s390x/s390-ccw.c | 20 +++ hw/vfio/ccw.c | 112 +++- include/hw/s390x/css.h | 3 + include/hw/s390x/s390-ccw.h | 2 + 5 files changed, 158 insertions(+), 6 deletions(-) diff --git a/hw/s390x/css.c b/hw/s390x/css.c index ad310b9f94bc..b92395f165e6 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -22,6 +22,7 @@ #include "trace.h" #include "hw/s390x/s390_flic.h" #include "hw/s390x/s390-virtio-ccw.h" +#include "hw/s390x/s390-ccw.h" typedef struct CrwContainer { CRW crw; @@ -1205,6 +1206,26 @@ static void sch_handle_start_func_virtual(SubchDev *sch) } +static void sch_handle_halt_func_passthrough(SubchDev *sch) +{ +int ret; + +ret = s390_ccw_halt(sch); +if (ret == -ENOSYS) { +sch_handle_halt_func(sch); +} +} + +static void sch_handle_clear_func_passthrough(SubchDev *sch) +{ +int ret; + +ret = s390_ccw_clear(sch); +if (ret == -ENOSYS) { +sch_handle_clear_func(sch); +} +} + static IOInstEnding sch_handle_start_func_passthrough(SubchDev *sch) { SCHIB *schib = >curr_status; @@ -1244,11 +1265,9 @@ IOInstEnding do_subchannel_work_passthrough(SubchDev *sch) SCHIB *schib = >curr_status; if (schib->scsw.ctrl & SCSW_FCTL_CLEAR_FUNC) { -/* TODO: Clear handling */ -sch_handle_clear_func(sch); +sch_handle_clear_func_passthrough(sch); } else if (schib->scsw.ctrl & SCSW_FCTL_HALT_FUNC) { -/* TODO: Halt handling */ -sch_handle_halt_func(sch); +sch_handle_halt_func_passthrough(sch); } else if (schib->scsw.ctrl & SCSW_FCTL_START_FUNC) { return sch_handle_start_func_passthrough(sch); } diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c index f5f025d1b6ca..951be5ab0245 100644 --- a/hw/s390x/s390-ccw.c +++ b/hw/s390x/s390-ccw.c @@ -29,6 +29,26 @@ IOInstEnding s390_ccw_cmd_request(SubchDev *sch) return cdc->handle_request(sch); } +int s390_ccw_halt(SubchDev *sch) +{ +S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data); + +if (!cdc->handle_halt) { +return -ENOSYS; +} +return cdc->handle_halt(sch); +} + +int s390_ccw_clear(SubchDev *sch) +{ +S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data); + +if (!cdc->handle_clear) { +return -ENOSYS; +} +return cdc->handle_clear(sch); +} + static void s390_ccw_get_dev_info(S390CCWDevice *cdev, char *sysfsdev, Error **errp) diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index d9e39552e237..c9d1c76b4d04 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -2,9 +2,12 @@ * vfio based subchannel assignment support * * Copyright 2017 IBM Corp. + * Copyright 2019 Red Hat, Inc. + * * Author(s): Dong Jia Shi *Xiao Feng Ren *Pierre Morel + *Cornelia Huck * * This work is licensed under the terms of the GNU GPL, version 2 or (at * your option) any later version. See the COPYING file in the top-level @@ -32,6 +35,9 @@ struct VFIOCCWDevice { uint64_t io_region_size; uint64_t io_region_offset; struct ccw_io_region *io_region; +uint64_t async_cmd_region_size; +uint64_t async_cmd_region_offset; +struct ccw_cmd_region *async_cmd_region; EventNotifier io_notifier; bool force_orb_pfch; bool warned_orb_pfch; @@ -114,6 +120,87 @@ again: } } +static int vfio_ccw_handle_clear(SubchDev *sch) +{ +S390CCWDevice *cdev = sch->driver_data; +VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev); +struct ccw_cmd_region *region = vcdev->async_cmd_region; +int ret; + +if (!vcdev->async_cmd_region) { +/* Async command region not available, fall back to emulation */ +return -ENOSYS; +} + +memset(region, 0, sizeof(*region)); +region->command = VFIO_CCW_ASYNC_CMD_CSCH; + +again: +ret = pwrite(vcdev->vdev.fd, region, + vcdev->async_cmd_region_size, vcdev->async_cmd_region_offset); +if (ret != vcdev->async_cmd_region_size) { +if (errno == EAGAIN) { +goto again; +} +error_report("vfio-ccw: write cmd region failed with errno=%d", errno); +ret = -errno; +} else { +ret = region->ret_code; +} +switch (ret) { +case 0: +case -ENODEV: +case -EACCES: +
Re: [Qemu-devel] [PATCH v5] vfio-ccw: support async command subregion
On 06/07/2019 11:09 AM, Cornelia Huck wrote: On Fri, 7 Jun 2019 11:02:36 -0400 Farhan Ali wrote: On 06/07/2019 10:53 AM, Cornelia Huck wrote: A vfio-ccw device may provide an async command subregion for issuing halt/clear subchannel requests. If it is present, use it for sending halt/clear request to the device; if not, fall back to emulation (as done today). Signed-off-by: Cornelia Huck --- v4->v5: - It seems we need to take the indirection via the class for the callbacks after all :( - Dropped Eric's R-b: for that reason --- hw/s390x/css.c | 27 +++-- hw/s390x/s390-ccw.c | 20 +++ hw/vfio/ccw.c | 112 +++- include/hw/s390x/css.h | 3 + include/hw/s390x/s390-ccw.h | 2 + 5 files changed, 158 insertions(+), 6 deletions(-) diff --git a/hw/s390x/css.c b/hw/s390x/css.c index ad310b9f94bc..b92395f165e6 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -22,6 +22,7 @@ #include "trace.h" #include "hw/s390x/s390_flic.h" #include "hw/s390x/s390-virtio-ccw.h" +#include "hw/s390x/s390-ccw.h" typedef struct CrwContainer { CRW crw; @@ -1205,6 +1206,26 @@ static void sch_handle_start_func_virtual(SubchDev *sch) } +static void sch_handle_halt_func_passthrough(SubchDev *sch) +{ +int ret; + +ret = s390_ccw_halt(sch); +if (ret == -ENOSYS) { +sch_handle_halt_func(sch); +} +} + +static void sch_handle_clear_func_passthrough(SubchDev *sch) +{ +int ret; + +ret = s390_ccw_clear(sch); +if (ret == -ENOSYS) { +sch_handle_clear_func(sch); +} +} + do we need an extra s390_ccw_clear/halt functions? can't we just call cdc->clear/halt in the passthrough functions? I mostly added them for symmetry reasons... we still need to check for presence of the callback in any case, though. (vfio is not always built, e.g. on windows or os x.) right, but if we are calling do_subchannel_work_passthrough, then we know for sure we are building the S390CCWDevice which is the vfio device, no? So we could just add checks for callbacks in sch_handle_clear/halt_func_passthrough, no? I would even like to get rid of the s390_ccw_cmd_request if we can, but that is out of scope for this patch. :) static IOInstEnding sch_handle_start_func_passthrough(SubchDev *sch) { SCHIB *schib = >curr_status; @@ -1244,11 +1265,9 @@ IOInstEnding do_subchannel_work_passthrough(SubchDev *sch) SCHIB *schib = >curr_status; if (schib->scsw.ctrl & SCSW_FCTL_CLEAR_FUNC) { -/* TODO: Clear handling */ -sch_handle_clear_func(sch); +sch_handle_clear_func_passthrough(sch); } else if (schib->scsw.ctrl & SCSW_FCTL_HALT_FUNC) { -/* TODO: Halt handling */ -sch_handle_halt_func(sch); +sch_handle_halt_func_passthrough(sch); } else if (schib->scsw.ctrl & SCSW_FCTL_START_FUNC) { return sch_handle_start_func_passthrough(sch); } diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c index f5f025d1b6ca..951be5ab0245 100644 --- a/hw/s390x/s390-ccw.c +++ b/hw/s390x/s390-ccw.c @@ -29,6 +29,26 @@ IOInstEnding s390_ccw_cmd_request(SubchDev *sch) return cdc->handle_request(sch); } +int s390_ccw_halt(SubchDev *sch) +{ +S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data); + +if (!cdc->handle_halt) { +return -ENOSYS; +} +return cdc->handle_halt(sch); +} + +int s390_ccw_clear(SubchDev *sch) +{ +S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data); + +if (!cdc->handle_clear) { +return -ENOSYS; +} +return cdc->handle_clear(sch); +} +
Re: [Qemu-devel] [PATCH v5] vfio-ccw: support async command subregion
On 06/07/2019 10:53 AM, Cornelia Huck wrote: A vfio-ccw device may provide an async command subregion for issuing halt/clear subchannel requests. If it is present, use it for sending halt/clear request to the device; if not, fall back to emulation (as done today). Signed-off-by: Cornelia Huck --- v4->v5: - It seems we need to take the indirection via the class for the callbacks after all :( - Dropped Eric's R-b: for that reason --- hw/s390x/css.c | 27 +++-- hw/s390x/s390-ccw.c | 20 +++ hw/vfio/ccw.c | 112 +++- include/hw/s390x/css.h | 3 + include/hw/s390x/s390-ccw.h | 2 + 5 files changed, 158 insertions(+), 6 deletions(-) diff --git a/hw/s390x/css.c b/hw/s390x/css.c index ad310b9f94bc..b92395f165e6 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -22,6 +22,7 @@ #include "trace.h" #include "hw/s390x/s390_flic.h" #include "hw/s390x/s390-virtio-ccw.h" +#include "hw/s390x/s390-ccw.h" typedef struct CrwContainer { CRW crw; @@ -1205,6 +1206,26 @@ static void sch_handle_start_func_virtual(SubchDev *sch) } +static void sch_handle_halt_func_passthrough(SubchDev *sch) +{ +int ret; + +ret = s390_ccw_halt(sch); +if (ret == -ENOSYS) { +sch_handle_halt_func(sch); +} +} + +static void sch_handle_clear_func_passthrough(SubchDev *sch) +{ +int ret; + +ret = s390_ccw_clear(sch); +if (ret == -ENOSYS) { +sch_handle_clear_func(sch); +} +} + do we need an extra s390_ccw_clear/halt functions? can't we just call cdc->clear/halt in the passthrough functions? static IOInstEnding sch_handle_start_func_passthrough(SubchDev *sch) { SCHIB *schib = >curr_status; @@ -1244,11 +1265,9 @@ IOInstEnding do_subchannel_work_passthrough(SubchDev *sch) SCHIB *schib = >curr_status; if (schib->scsw.ctrl & SCSW_FCTL_CLEAR_FUNC) { -/* TODO: Clear handling */ -sch_handle_clear_func(sch); +sch_handle_clear_func_passthrough(sch); } else if (schib->scsw.ctrl & SCSW_FCTL_HALT_FUNC) { -/* TODO: Halt handling */ -sch_handle_halt_func(sch); +sch_handle_halt_func_passthrough(sch); } else if (schib->scsw.ctrl & SCSW_FCTL_START_FUNC) { return sch_handle_start_func_passthrough(sch); } diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c index f5f025d1b6ca..951be5ab0245 100644 --- a/hw/s390x/s390-ccw.c +++ b/hw/s390x/s390-ccw.c @@ -29,6 +29,26 @@ IOInstEnding s390_ccw_cmd_request(SubchDev *sch) return cdc->handle_request(sch); } +int s390_ccw_halt(SubchDev *sch) +{ +S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data); + +if (!cdc->handle_halt) { +return -ENOSYS; +} +return cdc->handle_halt(sch); +} + +int s390_ccw_clear(SubchDev *sch) +{ +S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data); + +if (!cdc->handle_clear) { +return -ENOSYS; +} +return cdc->handle_clear(sch); +} +
Re: [Qemu-devel] [PATCH v4 2/2] vfio-ccw: support async command subregion
On 05/22/2019 06:17 AM, Cornelia Huck wrote: On Tue, 21 May 2019 16:50:47 -0400 Farhan Ali wrote: On 05/07/2019 11:47 AM, Cornelia Huck wrote: A vfio-ccw device may provide an async command subregion for issuing halt/clear subchannel requests. If it is present, use it for sending halt/clear request to the device; if not, fall back to emulation (as done today). Signed-off-by: Cornelia Huck --- hw/s390x/css.c | 27 +++-- hw/vfio/ccw.c | 110 +++- include/hw/s390x/s390-ccw.h | 3 + 3 files changed, 134 insertions(+), 6 deletions(-) diff --git a/include/hw/s390x/s390-ccw.h b/include/hw/s390x/s390-ccw.h index 901d805d79a3..e9c7e1db5761 100644 --- a/include/hw/s390x/s390-ccw.h +++ b/include/hw/s390x/s390-ccw.h @@ -37,4 +37,7 @@ typedef struct S390CCWDeviceClass { IOInstEnding (*handle_request) (SubchDev *sch); } S390CCWDeviceClass; +int vfio_ccw_handle_clear(SubchDev *sch); +int vfio_ccw_handle_halt(SubchDev *sch); + We are not making clear and halt functions part of the S390CCWDeviceClass, is there are reason for doing this? Currently we handle ssch through the handle_request function, it just looks a little inconsistent. I don't quite remember why I did it this way; not sure if there is a good reason for that (that patch has been around for too long...) We can change such internal details later on, though. (And I think your comment has merit.) Yes, sure we could change it later on. I do prefer your way though, it avoids one more layer of indirection. Thanks Farhan
Re: [Qemu-devel] [PATCH v4 2/2] vfio-ccw: support async command subregion
On 05/21/2019 12:32 PM, Cornelia Huck wrote: On Mon, 20 May 2019 12:29:56 -0400 Eric Farman wrote: On 5/7/19 11:47 AM, Cornelia Huck wrote: A vfio-ccw device may provide an async command subregion for issuing halt/clear subchannel requests. If it is present, use it for sending halt/clear request to the device; if not, fall back to emulation (as done today). Signed-off-by: Cornelia Huck --- hw/s390x/css.c | 27 +++-- hw/vfio/ccw.c | 110 +++- include/hw/s390x/s390-ccw.h | 3 + 3 files changed, 134 insertions(+), 6 deletions(-) +int vfio_ccw_handle_clear(SubchDev *sch) +{ +S390CCWDevice *cdev = sch->driver_data; +VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev); +struct ccw_cmd_region *region = vcdev->async_cmd_region; +int ret; + +if (!vcdev->async_cmd_region) { +/* Async command region not available, fall back to emulation */ +return -ENOSYS; +} + +memset(region, 0, sizeof(*region)); +region->command = VFIO_CCW_ASYNC_CMD_CSCH; Considering the serialization you added on the kernel side, what happens if another vcpu runs this code (or _halt) and clears the async region before the kernel code gains control from the pwrite() call below? Asked another way, there's nothing preventing us from issuing more than one asynchronous command concurrently, so how do we make sure the command gets to the kernel rather than "current command wins" ? This send me digging through the code, because if two threads can call this at the same time for passthrough, we'd also have the same problem for virtual. If I followed the code correctly, all I/O instruction interpretation is currently serialized via qemu_mutex_{lock,unlock}_iothread() (see target/s390x/kvm.c respectively target/s390x/misc_helper.c). This should mostly be enough to avoid stepping on each other's toes. Why mostly? I'm not sure yet whether we handling multiple requests for passthrough devices correctly yet (virtual should be fine.) But don't virtual and passthrough device end up calling the same ioinst_handle_* functions to interpret the I/O instructions? As you mentioned all the ioinst_handle_* functions are called with the qemu_mutex held. So I am confused as why virtual devices should be fine and not passthrough? :) Start vs. (start|halt|clear) is fine, as the code checks whether something is already pending before poking the kernel interface. Likewise, halt vs. (start|halt|clear) is fine, as the code checks for halt or clear and start and halt use different regions. The problematic one is clear, as that's something that's always supposed to go through. Probably fine if clear should always "win", but I need to think some more about that. That possibly worrisome question aside, this seems generally fine. + +again: +ret = pwrite(vcdev->vdev.fd, region, + vcdev->async_cmd_region_size, vcdev->async_cmd_region_offset); +if (ret != vcdev->async_cmd_region_size) { +if (errno == EAGAIN) { +goto again; +} +error_report("vfio-ccw: write cmd region failed with errno=%d", errno); +ret = -errno; +} else { +ret = region->ret_code; +} +switch (ret) { +case 0: +case -ENODEV: +case -EACCES: +return 0; +case -EFAULT: +default: +sch_gen_unit_exception(sch); +css_inject_io_interrupt(sch); +return 0; +} +}
Re: [Qemu-devel] [PATCH v4 2/2] vfio-ccw: support async command subregion
On 05/07/2019 11:47 AM, Cornelia Huck wrote: A vfio-ccw device may provide an async command subregion for issuing halt/clear subchannel requests. If it is present, use it for sending halt/clear request to the device; if not, fall back to emulation (as done today). Signed-off-by: Cornelia Huck --- hw/s390x/css.c | 27 +++-- hw/vfio/ccw.c | 110 +++- include/hw/s390x/s390-ccw.h | 3 + 3 files changed, 134 insertions(+), 6 deletions(-) diff --git a/hw/s390x/css.c b/hw/s390x/css.c index 8fc9e35ba5d3..5b21a74b5c29 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -22,6 +22,7 @@ #include "trace.h" #include "hw/s390x/s390_flic.h" #include "hw/s390x/s390-virtio-ccw.h" +#include "hw/s390x/s390-ccw.h" typedef struct CrwContainer { CRW crw; @@ -1191,6 +1192,26 @@ static void sch_handle_start_func_virtual(SubchDev *sch) } +static void sch_handle_halt_func_passthrough(SubchDev *sch) +{ +int ret; + +ret = vfio_ccw_handle_halt(sch); +if (ret == -ENOSYS) { +sch_handle_halt_func(sch); +} +} + +static void sch_handle_clear_func_passthrough(SubchDev *sch) +{ +int ret; + +ret = vfio_ccw_handle_clear(sch); +if (ret == -ENOSYS) { +sch_handle_clear_func(sch); +} +} + static IOInstEnding sch_handle_start_func_passthrough(SubchDev *sch) { SCHIB *schib = >curr_status; @@ -1230,11 +1251,9 @@ IOInstEnding do_subchannel_work_passthrough(SubchDev *sch) SCHIB *schib = >curr_status; if (schib->scsw.ctrl & SCSW_FCTL_CLEAR_FUNC) { -/* TODO: Clear handling */ -sch_handle_clear_func(sch); +sch_handle_clear_func_passthrough(sch); } else if (schib->scsw.ctrl & SCSW_FCTL_HALT_FUNC) { -/* TODO: Halt handling */ -sch_handle_halt_func(sch); +sch_handle_halt_func_passthrough(sch); } else if (schib->scsw.ctrl & SCSW_FCTL_START_FUNC) { return sch_handle_start_func_passthrough(sch); } diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index 31dd3a2a87b6..175a17b1772a 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -2,9 +2,12 @@ * vfio based subchannel assignment support * * Copyright 2017 IBM Corp. + * Copyright 2019 Red Hat, Inc. + * * Author(s): Dong Jia Shi *Xiao Feng Ren *Pierre Morel + *Cornelia Huck * * This work is licensed under the terms of the GNU GPL, version 2 or (at * your option) any later version. See the COPYING file in the top-level @@ -32,6 +35,9 @@ struct VFIOCCWDevice { uint64_t io_region_size; uint64_t io_region_offset; struct ccw_io_region *io_region; +uint64_t async_cmd_region_size; +uint64_t async_cmd_region_offset; +struct ccw_cmd_region *async_cmd_region; EventNotifier io_notifier; bool force_orb_pfch; bool warned_orb_pfch; @@ -114,6 +120,87 @@ again: } } +int vfio_ccw_handle_clear(SubchDev *sch) +{ +S390CCWDevice *cdev = sch->driver_data; +VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev); +struct ccw_cmd_region *region = vcdev->async_cmd_region; +int ret; + +if (!vcdev->async_cmd_region) { +/* Async command region not available, fall back to emulation */ +return -ENOSYS; +} + +memset(region, 0, sizeof(*region)); +region->command = VFIO_CCW_ASYNC_CMD_CSCH; + +again: +ret = pwrite(vcdev->vdev.fd, region, + vcdev->async_cmd_region_size, vcdev->async_cmd_region_offset); +if (ret != vcdev->async_cmd_region_size) { +if (errno == EAGAIN) { +goto again; +} +error_report("vfio-ccw: write cmd region failed with errno=%d", errno); +ret = -errno; +} else { +ret = region->ret_code; +} +switch (ret) { +case 0: +case -ENODEV: +case -EACCES: +return 0; +case -EFAULT: +default: +sch_gen_unit_exception(sch); +css_inject_io_interrupt(sch); +return 0; +} +} + +int vfio_ccw_handle_halt(SubchDev *sch) +{ +S390CCWDevice *cdev = sch->driver_data; +VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev); +struct ccw_cmd_region *region = vcdev->async_cmd_region; +int ret; + +if (!vcdev->async_cmd_region) { +/* Async command region not available, fall back to emulation */ +return -ENOSYS; +} + +memset(region, 0, sizeof(*region)); +region->command = VFIO_CCW_ASYNC_CMD_HSCH; + +again: +ret = pwrite(vcdev->vdev.fd, region, + vcdev->async_cmd_region_size, vcdev->async_cmd_region_offset); +if (ret != vcdev->async_cmd_region_size) { +if (errno == EAGAIN) { +goto again; +} +error_report("vfio-ccw: write cmd region failed with errno=%d", errno); +ret = -errno; +} else { +ret = region->ret_code; +} +switch (ret) { +case 0: +case -EBUSY: +
Re: [Qemu-devel] [PATCH v4 4/6] vfio-ccw: add capabilities chain
On 03/01/2019 04:39 AM, Cornelia Huck wrote: Allow to extend the regions used by vfio-ccw. The first user will be handling of halt and clear subchannel. Signed-off-by: Cornelia Huck --- drivers/s390/cio/vfio_ccw_ops.c | 186 drivers/s390/cio/vfio_ccw_private.h | 38 ++ include/uapi/linux/vfio.h | 2 + 3 files changed, 200 insertions(+), 26 deletions(-) Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH v4 6/6] vfio-ccw: add handling for async channel instructions
On 03/01/2019 04:39 AM, Cornelia Huck wrote: Add a region to the vfio-ccw device that can be used to submit asynchronous I/O instructions. ssch continues to be handled by the existing I/O region; the new region handles hsch and csch. Interrupt status continues to be reported through the same channels as for ssch. Signed-off-by: Cornelia Huck Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH v4 1/6] vfio-ccw: make it safe to access channel programs
On 04/08/2019 01:07 PM, Cornelia Huck wrote: On Mon, 8 Apr 2019 13:02:12 -0400 Farhan Ali wrote: On 03/01/2019 04:38 AM, Cornelia Huck wrote: When we get a solicited interrupt, the start function may have been cleared by a csch, but we still have a channel program structure allocated. Make it safe to call the cp accessors in any case, so we can call them unconditionally. While at it, also make sure that functions called from other parts of the code return gracefully if the channel program structure has not been initialized (even though that is a bug in the caller). Reviewed-by: Eric Farman Signed-off-by: Cornelia Huck --- Hi Connie, My series of fixes for vfio-ccw depends on this patch as I would like to call cp_free unconditionally :) (I had developed my code on top of your patches). Could we pick this patch up as well when/if you pick up my patch series? I am in the process of sending out a v2. Regarding this patch we could merge it as a stand alone patch, separate from this series. And also the patch LGTM Reviewed-by: Farhan Ali Actually, I wanted to ask how people felt about merging this whole series for the next release :) It would be one thing less on my plate... I have been testing with your patches for a while now and I haven't hit any problems due to the patches. IMHO I think we can merge the patches for the next release. Thanks Farhan
Re: [Qemu-devel] [PATCH v4 1/6] vfio-ccw: make it safe to access channel programs
On 03/01/2019 04:38 AM, Cornelia Huck wrote: When we get a solicited interrupt, the start function may have been cleared by a csch, but we still have a channel program structure allocated. Make it safe to call the cp accessors in any case, so we can call them unconditionally. While at it, also make sure that functions called from other parts of the code return gracefully if the channel program structure has not been initialized (even though that is a bug in the caller). Reviewed-by: Eric Farman Signed-off-by: Cornelia Huck --- Hi Connie, My series of fixes for vfio-ccw depends on this patch as I would like to call cp_free unconditionally :) (I had developed my code on top of your patches). Could we pick this patch up as well when/if you pick up my patch series? I am in the process of sending out a v2. Regarding this patch we could merge it as a stand alone patch, separate from this series. And also the patch LGTM Reviewed-by: Farhan Ali Thanks Farhan
Re: [Qemu-devel] [PATCH 11/14] hw/vfio/ccw: avoid taking address members in packed structs
On 03/29/2019 07:11 AM, Daniel P. Berrangé wrote: The GCC 9 compiler complains about many places in s390 code that take the address of members of the 'struct SCHIB' which is marked packed: hw/vfio/ccw.c: In function ‘vfio_ccw_io_notifier_handler’: hw/vfio/ccw.c:133:15: warning: taking address of packed member of ‘struct SCHIB’ may result in an unaligned pointer value \ [-Waddress-of-packed-member] 133 | SCSW *s = >curr_status.scsw; | ^~ hw/vfio/ccw.c:134:15: warning: taking address of packed member of ‘struct SCHIB’ may result in an unaligned pointer value \ [-Waddress-of-packed-member] 134 | PMCW *p = >curr_status.pmcw; | ^~ ...snip many more... Almost all of these are just done for convenience to avoid typing out long variable/field names when referencing struct members. We can get most of this convenience by taking the address of the 'struct SCHIB' instead, avoiding triggering the compiler warnings. In a couple of places we copy via a local variable which is a technique already applied elsewhere in s390 code for this problem. Signed-off-by: Daniel P. Berrangé --- hw/vfio/ccw.c | 42 ++ 1 file changed, 22 insertions(+), 20 deletions(-) diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index 9246729a75..c44d13cc50 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -130,8 +130,8 @@ static void vfio_ccw_io_notifier_handler(void *opaque) S390CCWDevice *cdev = S390_CCW_DEVICE(vcdev); CcwDevice *ccw_dev = CCW_DEVICE(cdev); SubchDev *sch = ccw_dev->sch; -SCSW *s = >curr_status.scsw; -PMCW *p = >curr_status.pmcw; +SCHIB *schib = >curr_status; +SCSW s; IRB irb; int size; @@ -145,33 +145,33 @@ static void vfio_ccw_io_notifier_handler(void *opaque) switch (errno) { case ENODEV: /* Generate a deferred cc 3 condition. */ -s->flags |= SCSW_FLAGS_MASK_CC; -s->ctrl &= ~SCSW_CTRL_MASK_STCTL; -s->ctrl |= (SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND); +schib->scsw.flags |= SCSW_FLAGS_MASK_CC; +schib->scsw.ctrl &= ~SCSW_CTRL_MASK_STCTL; +schib->scsw.ctrl |= (SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND); goto read_err; case EFAULT: /* Memory problem, generate channel data check. */ -s->ctrl &= ~SCSW_ACTL_START_PEND; -s->cstat = SCSW_CSTAT_DATA_CHECK; -s->ctrl &= ~SCSW_CTRL_MASK_STCTL; -s->ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY | +schib->scsw.ctrl &= ~SCSW_ACTL_START_PEND; +schib->scsw.cstat = SCSW_CSTAT_DATA_CHECK; +schib->scsw.ctrl &= ~SCSW_CTRL_MASK_STCTL; +schib->scsw.ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY | SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND; goto read_err; default: /* Error, generate channel program check. */ -s->ctrl &= ~SCSW_ACTL_START_PEND; -s->cstat = SCSW_CSTAT_PROG_CHECK; -s->ctrl &= ~SCSW_CTRL_MASK_STCTL; -s->ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY | +schib->scsw.ctrl &= ~SCSW_ACTL_START_PEND; +schib->scsw.cstat = SCSW_CSTAT_PROG_CHECK; +schib->scsw.ctrl &= ~SCSW_CTRL_MASK_STCTL; +schib->scsw.ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY | SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND; goto read_err; } } else if (size != vcdev->io_region_size) { /* Information transfer error, generate channel-control check. */ -s->ctrl &= ~SCSW_ACTL_START_PEND; -s->cstat = SCSW_CSTAT_CHN_CTRL_CHK; -s->ctrl &= ~SCSW_CTRL_MASK_STCTL; -s->ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY | +schib->scsw.ctrl &= ~SCSW_ACTL_START_PEND; +schib->scsw.cstat = SCSW_CSTAT_CHN_CTRL_CHK; +schib->scsw.ctrl &= ~SCSW_CTRL_MASK_STCTL; +schib->scsw.ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY | SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND; goto read_err; } @@ -179,11 +179,13 @@ static void vfio_ccw_io_notifier_handler(void *opaque) memcpy(, region->irb_area, sizeof(IRB)); /* Update control block via irb. */ -copy_scsw_to_guest(s, ); +s = schib->scsw; +copy_scsw_to_guest(, ); +schib->scsw = s; /* If a uint check is pending, copy sense data. */ -if ((s->dstat & SCSW_DSTAT_UNIT_CHECK) && -(p->chars & PMCW_CHARS_MASK_CSENSE)) { + if ((schib->scsw.dstat & SCSW_DSTAT_UNIT_CHECK) && +(schib->pmcw.chars & PMCW_CHARS_MASK_CSENSE)) { memcpy(sch->sense_data, irb.ecw, sizeof(irb.ecw)); } Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH 13/14] hw/s390x/ipl: avoid taking address of fields in packed struct
On 03/29/2019 07:11 AM, Daniel P. Berrangé wrote: Compiling with GCC 9 complains hw/s390x/ipl.c: In function ‘s390_ipl_set_boot_menu’: hw/s390x/ipl.c:256:25: warning: taking address of packed member of ‘struct QemuIplParameters’ may result in an unaligned pointer value [-Waddress-of-packed-member] 256 | uint32_t *timeout = >qipl.boot_menu_timeout; | ^~~~ This local variable is only present to save a little bit of typing when setting the field later. Get rid of this to avoid the warning about unaligned accesses. Signed-off-by: Daniel P. Berrangé --- hw/s390x/ipl.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c index 896888bf8f..51b272e190 100644 --- a/hw/s390x/ipl.c +++ b/hw/s390x/ipl.c @@ -252,8 +252,6 @@ static void s390_ipl_set_boot_menu(S390IPLState *ipl) { QemuOptsList *plist = qemu_find_opts("boot-opts"); QemuOpts *opts = QTAILQ_FIRST(>head); -uint8_t *flags = >qipl.qipl_flags; -uint32_t *timeout = >qipl.boot_menu_timeout; const char *tmp; unsigned long splash_time = 0; @@ -269,7 +267,7 @@ static void s390_ipl_set_boot_menu(S390IPLState *ipl) case S390_IPL_TYPE_CCW: /* In the absence of -boot menu, use zipl parameters */ if (!qemu_opt_get(opts, "menu")) { -*flags |= QIPL_FLAG_BM_OPTS_ZIPL; +ipl->qipl.qipl_flags |= QIPL_FLAG_BM_OPTS_ZIPL; return; } break; @@ -286,23 +284,23 @@ static void s390_ipl_set_boot_menu(S390IPLState *ipl) return; } -*flags |= QIPL_FLAG_BM_OPTS_CMD; +ipl->qipl.qipl_flags |= QIPL_FLAG_BM_OPTS_CMD; tmp = qemu_opt_get(opts, "splash-time"); if (tmp && qemu_strtoul(tmp, NULL, 10, _time)) { error_report("splash-time is invalid, forcing it to 0"); -*timeout = 0; +ipl->qipl.boot_menu_timeout = 0; return; } if (splash_time > 0x) { error_report("splash-time is too large, forcing it to max value"); -*timeout = 0x; +ipl->qipl.boot_menu_timeout = 0x; return; } -*timeout = cpu_to_be32(splash_time); +ipl->qipl.boot_menu_timeout = cpu_to_be32(splash_time); } static CcwDevice *s390_get_ccw_device(DeviceState *dev_st) Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH v5 10/15] s390-bios: Support for running format-0/1 channel programs
On 03/13/2019 12:31 PM, Jason J. Herne wrote: Introduce a library function for executing format-0 and format-1 channel programs and waiting for their completion before continuing execution. Add cu_type() to channel io library. This will be used to query control unit type which is used to determine if we are booting a virtio device or a real dasd device. Signed-off-by: Jason J. Herne Reviewed-by: Cornelia Huck --- pc-bios/s390-ccw/cio.c | 144 pc-bios/s390-ccw/cio.h | 130 ++- pc-bios/s390-ccw/s390-ccw.h | 1 + pc-bios/s390-ccw/start.S| 29 + 4 files changed, 301 insertions(+), 3 deletions(-) Reviewed-by: Farhan Ali
Re: [Qemu-devel] qemu-iotest 55 broken
On 03/14/2019 01:11 PM, Eric Blake wrote: On 3/14/19 11:33 AM, Farhan Ali wrote: Hi, I noticed qemu iotest 55 was broken on the latest master (My head commit is dbbc277 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging). Patch has been proposed: https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg04398.html Thanks for pointing it out. Didn't notice there was a patch out there. Thanks Farhan
[Qemu-devel] qemu-iotest 55 broken
Hi, I noticed qemu iotest 55 was broken on the latest master (My head commit is dbbc277 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging). The backtrace of the qemu process when the test fails: #0 0x7f82047cd428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54 #1 0x7f82047cf02a in __GI_abort () at abort.c:89 #2 0x7f82047c5bd7 in __assert_fail_base (fmt=, assertion=assertion@entry=0x55b6f83f91e4 "current_migration", file=file@entry=0x55b6f83f9197 "migration/migration.c", line=line@entry=187, function=function@entry=0x55b6f83fa420 <__PRETTY_FUNCTION__.30526> "migrate_get_current") at assert.c:92 #3 0x7f82047c5c82 in __GI___assert_fail (assertion=0x55b6f83f91e4 "current_migration", file=0x55b6f83f9197 "migration/migration.c", line=187, function=0x55b6f83fa420 <__PRETTY_FUNCTION__.30526> "migrate_get_current") at assert.c:101 #4 0x55b6f8068c93 in migrate_get_current () at migration/migration.c:187 #5 0x55b6f806c16e in migrate_add_blocker (reason=0x55b6f91797e0, errp=0x7ffe73ee6d20) at migration/migration.c:1715 #6 0x55b6f810caf9 in vmdk_open (bs=0x55b6f91835e0, options=0x55b6f91888b0, flags=139266, errp=0x7ffe73ee6d90) at block/vmdk.c:1019 #7 0x55b6f80f55d1 in bdrv_open_driver (bs=0x55b6f91835e0, drv=0x55b6f8bba300 , node_name=0x0, options=0x55b6f91888b0, open_flags=139266, errp=0x7ffe73ee6ea0) at block.c:1272 #8 0x55b6f80f5f89 in bdrv_open_common (bs=0x55b6f91835e0, file=0x55b6f9179490, options=0x55b6f91888b0, errp=0x7ffe73ee6ea0) at block.c:1530 #9 0x55b6f80f8fb6 in bdrv_open_inherit (filename=0x55b6f9182270 "/home/alifm/kvmdev/qemu/tests/qemu-iotests/scratch/blockdev-target.img", reference=0x0, options=0x55b6f91888b0, flags=8194, parent=0x0, child_role=0x0, errp=0x7ffe73ee7298) at block.c:2889 #10 0x55b6f80f94ef in bdrv_open (filename=0x55b6f9182270 "/home/alifm/kvmdev/qemu/tests/qemu-iotests/scratch/blockdev-target.img", reference=0x0, options=0x55b6f91713a0, flags=0, errp=0x7ffe73ee7298) at block.c:2982 #11 0x55b6f815baf0 in blk_new_open (filename=0x55b6f9182270 "/home/alifm/kvmdev/qemu/tests/qemu-iotests/scratch/blockdev-target.img", reference=0x0, options=0x55b6f91713a0, flags=0, errp=0x7ffe73ee7298) at block/block-backend.c:377 #12 0x55b6f7e6939d in blockdev_init (file=0x55b6f9182270 "/home/alifm/kvmdev/qemu/tests/qemu-iotests/scratch/blockdev-target.img", bs_opts=0x55b6f91713a0, errp=0x7ffe73ee7298) at blockdev.c:602 #13 0x55b6f7e6a2de in drive_new (all_opts=0x55b6f90ba630, block_default_type=IF_IDE, errp=0x55b6f8c21fd0 ) at blockdev.c:994 #14 0x55b6f7e7b2f2 in drive_init_func (opaque=0x55b6f90f2628, opts=0x55b6f90ba630, errp=0x55b6f8c21fd0 ) at vl.c:1162 #15 0x55b6f823e17b in qemu_opts_foreach (list=0x55b6f8a8b720 , func=0x55b6f7e7b2be , opaque=0x55b6f90f2628, errp=0x55b6f8c21fd0 ) at util/qemu-option.c:1171 #16 0x55b6f7e7b528 in configure_blockdev (bdo_queue=0x7ffe73ee75d0, machine_class=0x55b6f90f2580, snapshot=0) at vl.c:1229 #17 0x55b6f7e83323 in main (argc=20, argv=0x7ffe73ee76e8, envp=0x7ffe73ee7790) at vl.c:4284 Git bisect points to: commit cda4aa9a5a08777cf13e164c0543bd4888b8adce Author: Markus Armbruster Date: Fri Mar 8 14:14:40 2019 +0100 vl: Create block backends before setting machine properties qemu-system-FOO's main() acts on command line arguments in its own idiosyncratic order. There's not much method to its madness. Whenever we find a case where one kind of command line argument needs to refer to something created for another kind later, we rejigger the order. Block devices get created long after machine properties get processed. Therefore, block device machine properties can be created, but not set. No such properties exist. But the next commit will create some. Time to rejigger again: create block devices earlier. Signed-off-by: Markus Armbruster Reviewed-by: Philippe Mathieu-Daudé Message-Id: <20190308131445.17502-8-arm...@redhat.com> Reviewed-by: Michael S. Tsirkin :100644 100644 c22ca447fa480673e9ecfbddad5e8b2965c4d3c4 e9239d55ad6afc431c16cb9602e167c6ac95b9a1 M vl.c It seems the reason is now we are configuring block devices earlier than we setting up the migration state, and this breaks for vmdk block devices. Thanks Farhan
Re: [Qemu-devel] [PATCH v3 01/16] s390 vfio-ccw: Add bootindex property and IPLB data
On 03/01/2019 01:59 PM, Jason J. Herne wrote: Add bootindex property and iplb data for vfio-ccw devices. This allows us to forward boot information into the bios for vfio-ccw devices. Refactor s390_get_ccw_device() to return device type. This prevents us from having to use messy casting logic in several places. Signed-off-by: Jason J. Herne Acked-by: Halil Pasic --- MAINTAINERS | 1 + hw/s390x/ipl.c | 39 +-- hw/s390x/s390-ccw.c | 9 + hw/vfio/ccw.c | 2 +- include/hw/s390x/s390-ccw.h | 1 + include/hw/s390x/vfio-ccw.h | 28 6 files changed, 73 insertions(+), 7 deletions(-) create mode 100644 include/hw/s390x/vfio-ccw.h diff --git a/MAINTAINERS b/MAINTAINERS index 9a76845..a780916 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1400,6 +1400,7 @@ S: Supported F: hw/vfio/ccw.c F: hw/s390x/s390-ccw.c F: include/hw/s390x/s390-ccw.h +F: include/hw/s390x/vfio-ccw.h T: githttps://github.com/cohuck/qemu.git s390-next L:qemu-s3...@nongnu.org diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c index 896888b..df891bb 100644 --- a/hw/s390x/ipl.c +++ b/hw/s390x/ipl.c @@ -19,6 +19,7 @@ #include "hw/loader.h" #include "hw/boards.h" #include "hw/s390x/virtio-ccw.h" +#include "hw/s390x/vfio-ccw.h" #include "hw/s390x/css.h" #include "hw/s390x/ebcdic.h" #include "ipl.h" @@ -305,16 +306,29 @@ static void s390_ipl_set_boot_menu(S390IPLState *ipl) *timeout = cpu_to_be32(splash_time); } -static CcwDevice *s390_get_ccw_device(DeviceState *dev_st) +#define CCW_DEVTYPE_NONE0x00 +#define CCW_DEVTYPE_VIRTIO 0x01 +#define CCW_DEVTYPE_SCSI0x02 +#define CCW_DEVTYPE_VFIO0x03 + +static CcwDevice*s390_get_ccw_device(DeviceState *dev_st, int* devtype) { CcwDevice *ccw_dev = NULL; +*devtype = CCW_DEVTYPE_NONE; + if (dev_st) { VirtioCcwDevice *virtio_ccw_dev = (VirtioCcwDevice *) object_dynamic_cast(OBJECT(qdev_get_parent_bus(dev_st)->parent), TYPE_VIRTIO_CCW_DEVICE); +VFIOCCWDevice *vfio_ccw_dev = (VFIOCCWDevice *) +object_dynamic_cast(OBJECT(dev_st), TYPE_VFIO_CCW); if (virtio_ccw_dev) { ccw_dev = CCW_DEVICE(virtio_ccw_dev); +*devtype = CCW_DEVTYPE_VIRTIO; +} else if (vfio_ccw_dev) { +ccw_dev = CCW_DEVICE(vfio_ccw_dev); +*devtype = CCW_DEVTYPE_VFIO; } else { SCSIDevice *sd = (SCSIDevice *) object_dynamic_cast(OBJECT(dev_st), @@ -327,6 +341,7 @@ static CcwDevice *s390_get_ccw_device(DeviceState *dev_st) ccw_dev = (CcwDevice *)object_dynamic_cast(OBJECT(scsi_ccw), TYPE_CCW_DEVICE); +*devtype = CCW_DEVTYPE_SCSI; } } } @@ -337,10 +352,11 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl) { DeviceState *dev_st; CcwDevice *ccw_dev = NULL; +int devtype; dev_st = get_boot_device(0); if (dev_st) { -ccw_dev = s390_get_ccw_device(dev_st); +ccw_dev = s390_get_ccw_device(dev_st, ); } /* @@ -349,8 +365,10 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl) if (ccw_dev) { SCSIDevice *sd = (SCSIDevice *) object_dynamic_cast(OBJECT(dev_st), TYPE_SCSI_DEVICE); +VirtIONet *vn; -if (sd) { +switch (devtype) { +case CCW_DEVTYPE_SCSI: ipl->iplb.len = cpu_to_be32(S390_IPLB_MIN_QEMU_SCSI_LEN); ipl->iplb.blk0_len = cpu_to_be32(S390_IPLB_MIN_QEMU_SCSI_LEN - S390_IPLB_HEADER_LEN); @@ -360,8 +378,15 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl) ipl->iplb.scsi.channel = cpu_to_be16(sd->channel); ipl->iplb.scsi.devno = cpu_to_be16(ccw_dev->sch->devno); ipl->iplb.scsi.ssid = ccw_dev->sch->ssid & 3; -} else { -VirtIONet *vn = (VirtIONet *) object_dynamic_cast(OBJECT(dev_st), +break; +case CCW_DEVTYPE_VFIO: +ipl->iplb.len = cpu_to_be32(S390_IPLB_MIN_CCW_LEN); +ipl->iplb.pbt = S390_IPL_TYPE_CCW; +ipl->iplb.ccw.devno = cpu_to_be16(ccw_dev->sch->devno); +ipl->iplb.ccw.ssid = ccw_dev->sch->ssid & 3; +break; Also for vfio-ccw we are not setting the ipl->iplb.blk0_len like we do for virtio. Is there a reason? I know we don't reference the value in the bios for both virtio and vfio. +case CCW_DEVTYPE_VIRTIO: +vn = (VirtIONet *) object_dynamic_cast(OBJECT(dev_st), TYPE_VIRTIO_NET); ipl->iplb.len = cpu_to_be32(S390_IPLB_MIN_CCW_LEN); @@ -374,6 +399,7 @@ static bool
Re: [Qemu-devel] [PATCH v2] s390x: add zPCI feature to "qemu" CPU model
On 02/13/2019 11:26 AM, Cornelia Huck wrote: On Wed, 13 Feb 2019 10:54:23 -0500 Farhan Ali wrote: I noticed the qemu-iotests 200 was failing due to the commit "703fef6fcf3 : s390x/pci: Warn when adding PCI devices without the 'zpci' feature". This patch fixes the failure :). Huh. So it was testing without pci before and now with and it's also working? Awesome :) It was testing with pci, but the reason it fails was because qemu was spitting out the warning due to commit 703fef6fcf3. Now David's commit adds the zpci to qemu cpu model and qemu doesn't give out a warning message. Thanks Farhan On 02/12/2019 06:23 AM, David Hildenbrand wrote: As we now always have PCI support, let's add it to the "qemu" CPU model, taking care of backwards compatibility. Signed-off-by: David Hildenbrand --- v1 -> v2: - Use correct model identifiction of the z12 we emulate hw/s390x/s390-virtio-ccw.c | 3 +++ target/s390x/gen-features.c | 8 ++-- 2 files changed, 9 insertions(+), 2 deletions(-)
Re: [Qemu-devel] [PATCH v2] s390x: add zPCI feature to "qemu" CPU model
I noticed the qemu-iotests 200 was failing due to the commit "703fef6fcf3 : s390x/pci: Warn when adding PCI devices without the 'zpci' feature". This patch fixes the failure :). Thanks Farhan On 02/12/2019 06:23 AM, David Hildenbrand wrote: As we now always have PCI support, let's add it to the "qemu" CPU model, taking care of backwards compatibility. Signed-off-by: David Hildenbrand --- v1 -> v2: - Use correct model identifiction of the z12 we emulate hw/s390x/s390-virtio-ccw.c | 3 +++ target/s390x/gen-features.c | 8 ++-- 2 files changed, 9 insertions(+), 2 deletions(-)
Re: [Qemu-devel] [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs
On 02/05/2019 09:48 AM, Eric Farman wrote: On 02/05/2019 07:35 AM, Cornelia Huck wrote: On Tue, 5 Feb 2019 12:52:29 +0100 Halil Pasic wrote: On Mon, 4 Feb 2019 16:31:02 +0100 Cornelia Huck wrote: On Thu, 31 Jan 2019 13:34:55 +0100 Halil Pasic wrote: On Thu, 31 Jan 2019 12:52:20 +0100 Cornelia Huck wrote: On Wed, 30 Jan 2019 19:51:27 +0100 Halil Pasic wrote: On Wed, 30 Jan 2019 14:22:07 +0100 Cornelia Huck wrote: When we get a solicited interrupt, the start function may have been cleared by a csch, but we still have a channel program structure allocated. Make it safe to call the cp accessors in any case, so we can call them unconditionally. I read this like it is supposed to be safe regardless of parallelism and threads. However I don't see any explicit synchronization done for cp->initialized. I've managed to figure out how is that supposed to be safe for the cp_free() (which is probably our main concern) in vfio_ccw_sch_io_todo(), but if fail when it comes to the one in vfio_ccw_mdev_notifier(). Can you explain us how does the synchronization work? You read that wrong, I don't add synchronization, I just add a check. Now I'm confused. Does that mean we don't need synchronization for this? If we lack synchronization (that is not provided by the current state machine handling, or the rework here), we should do a patch on top (preferably on top of the whole series, so this does not get even more tangled up.) This is really just about the extra check. I'm not a huge fan of keeping or introducing races -- it makes things difficult to reason about, but I do have some understanging your position. The only thing I want to avoid is knowingly making things worse than before, and I don't think this patch does that. This patch-series is AFAICT a big improvement over what we have. I would like Farhan confirming that it makes these hick-ups when he used to hit BUSY with another ssch request disappear. If it does (I hope it does) it's definitely a good thing for anybody who wants to use vfio-ccw. Yep. There remains a lot to be done, but it's a first step. s/a first step/an excellent first step/ :) Can't speak for Farhan, but this makes things somewhat better for me. I'm still getting some periodic errors, but they happen infrequently enough now that debugging them is frustrating. ;-) - Eric I ran the my workloads/tests with the patches and like Eric I notice the errors I previously hit less frequently. Yet I find it difficult to slap my r-b over racy code, or partial solutions. In the latter case, when I lack conceptual clarity, I find it difficult to tell if we are heading into the right direction, or is what we build today going to turn against us tomorrow. Sorry for being a drag. As long as we don't introduce bad user space interfaces we have to drag around forever, I think anything is fair game if we think it's a good idea at that moment. We can rewrite things if it turned out to be a bad idea (although I'm not arguing for doing random crap, of course :)
Re: [Qemu-devel] [PATCH 10/15] s390-bios: Support for running format-0/1 channel programs
On 02/04/2019 06:13 AM, Cornelia Huck wrote: On Thu, 31 Jan 2019 12:31:00 -0500 Farhan Ali wrote: On 01/29/2019 08:29 AM, Jason J. Herne wrote: Add struct for format-0 ccws. Support executing format-0 channel programs and waiting for their completion before continuing execution. This will be used for real dasd ipl. Add cu_type() to channel io library. This will be used to query control unit type which is used to determine if we are booting a virtio device or a real dasd device. Signed-off-by: Jason J. Herne --- pc-bios/s390-ccw/cio.c | 114 +++ pc-bios/s390-ccw/cio.h | 127 ++-- pc-bios/s390-ccw/s390-ccw.h | 1 + pc-bios/s390-ccw/start.S| 33 +++- 4 files changed, 270 insertions(+), 5 deletions(-) +/* + * Executes a channel program at a given subchannel. The request to run the + * channel program is sent to the subchannel, we then wait for the interrupt + * signaling completion of the I/O operation(s) performed by the channel + * program. Lastly we verify that the i/o operation completed without error and + * that the interrupt we received was for the subchannel used to run the + * channel program. + * + * Note: This function assumes it is running in an environment where no other + * cpus are generating or receiving I/O interrupts. So either run it in a + * single-cpu environment or make sure all other cpus are not doing I/O and + * have I/O interrupts masked off. + */ +int do_cio(SubChannelId schid, uint32_t ccw_addr, int fmt) +{ +CmdOrb orb = {}; +Irb irb = {}; +sense_data_eckd_dasd sd; +int rc, retries = 0; + +IPL_assert(fmt == 0 || fmt == 1, "Invalid ccw format"); + +/* ccw_addr must be <= 24 bits and point to at least one whole ccw. */ +if (fmt == 0) { +IPL_assert(ccw_addr <= 0xFF - 8, "Invalid ccw address"); +} + +orb.fmt = fmt ; +orb.pfch = 1; /* QEMU's cio implementation requires prefetch */ +orb.c64 = 1; /* QEMU's cio implementation requires 64-bit idaws */ +orb.lpm = 0xFF; /* All paths allowed */ +orb.cpa = ccw_addr; + +while (true) { +rc = ssch(schid, ); +if (rc == 1) { +/* Status pending, not sure why. Let's eat the status and retry. */ +tsch(schid, ); +retries++; +continue; +} +if (rc) { +print_int("ssch failed with rc=", rc); +break; +} + +consume_io_int(); + +/* collect status */ +rc = tsch(schid, ); +if (rc) { +print_int("tsch failed with rc=", rc); +break; +} + +if (!irb_error()) { +break; +} + +/* + * Unexpected unit check, or interface-control-check. Use sense to + * clear unit check then retry. + */ +if ((unit_check() || iface_ctrl_check()) && retries <= 2) { +basic_sense(schid, , sizeof(sd)); We are using basic sense to clear any unit check or ifcc, but is it possible for the basic sense to cause another unit check? The chapter on Basic Sense in the Common I/O Device Commands (http://publibz.boulder.ibm.com/support/libraryserver/FRAMESET/dz9ar501/2.1?SHELF==19920409154647=) says this: "" The basic sense command initiates a sense operation at all devices and cannot cause the command-reject, intervention-required, data-check, or overrun bit to be set to one. If the control unit detects an equipment malfunction or invalid checking-block code (CBC) on the sense-command code, the equipment-check or bus-out-check bit is set to one, and unit check is indicated in the device-status byte. "" If my understanding is correct, if there is an equipment malfunction then the control unit can return a unit check even for a basic sense. This can lead to infinite recursion in the bios. I think the retries variable is supposed to take care of that. If I understand the code correctly, the retries variable cannot prevent infinite recursion. Because every time we get a unit check we do a basic sense which calls the do_cio function again. If that basic sense returns a unit check we do another basic sense What I don't understand is why we do the basic sense after an IFCC? Wouldn't it make more sense to simply retry the original command in that case? +retries++; +continue; +} + +break; +} + +return rc; +}
Re: [Qemu-devel] [PATCH 10/15] s390-bios: Support for running format-0/1 channel programs
On 01/29/2019 08:29 AM, Jason J. Herne wrote: Add struct for format-0 ccws. Support executing format-0 channel programs and waiting for their completion before continuing execution. This will be used for real dasd ipl. Add cu_type() to channel io library. This will be used to query control unit type which is used to determine if we are booting a virtio device or a real dasd device. Signed-off-by: Jason J. Herne --- pc-bios/s390-ccw/cio.c | 114 +++ pc-bios/s390-ccw/cio.h | 127 ++-- pc-bios/s390-ccw/s390-ccw.h | 1 + pc-bios/s390-ccw/start.S| 33 +++- 4 files changed, 270 insertions(+), 5 deletions(-) diff --git a/pc-bios/s390-ccw/cio.c b/pc-bios/s390-ccw/cio.c index 095f79b..63581c6 100644 --- a/pc-bios/s390-ccw/cio.c +++ b/pc-bios/s390-ccw/cio.c @@ -10,6 +10,7 @@ #include "libc.h" #include "s390-ccw.h" +#include "s390-arch.h" #include "cio.h" static char chsc_page[PAGE_SIZE] __attribute__((__aligned__(PAGE_SIZE))); @@ -39,3 +40,116 @@ void enable_subchannel(SubChannelId schid) schib.pmcw.ena = 1; msch(schid, ); } + +uint16_t cu_type(SubChannelId schid) +{ +Ccw1 sense_id_ccw; +SenseId sense_data; + +sense_id_ccw.cmd_code = CCW_CMD_SENSE_ID; +sense_id_ccw.cda = ptr2u32(_data); +sense_id_ccw.count = sizeof(sense_data); +sense_id_ccw.flags |= CCW_FLAG_SLI; + +if (do_cio(schid, ptr2u32(_id_ccw), CCW_FMT1)) { +panic("Failed to run SenseID CCw\n"); +} + +return sense_data.cu_type; +} + +void basic_sense(SubChannelId schid, void *sense_data, uint16_t data_size) +{ +Ccw1 senseCcw; + +senseCcw.cmd_code = CCW_CMD_BASIC_SENSE; +senseCcw.cda = ptr2u32(sense_data); +senseCcw.count = data_size; + +if (do_cio(schid, ptr2u32(), CCW_FMT1)) { +panic("Failed to run Basic Sense CCW\n"); +} +} + +static bool irb_error(Irb *irb) +{ +if (irb->scsw.cstat) { +return true; +} +return irb->scsw.dstat != (SCSW_DSTAT_DEVEND | SCSW_DSTAT_CHEND); +} + +/* + * Executes a channel program at a given subchannel. The request to run the + * channel program is sent to the subchannel, we then wait for the interrupt + * signaling completion of the I/O operation(s) performed by the channel + * program. Lastly we verify that the i/o operation completed without error and + * that the interrupt we received was for the subchannel used to run the + * channel program. + * + * Note: This function assumes it is running in an environment where no other + * cpus are generating or receiving I/O interrupts. So either run it in a + * single-cpu environment or make sure all other cpus are not doing I/O and + * have I/O interrupts masked off. + */ +int do_cio(SubChannelId schid, uint32_t ccw_addr, int fmt) +{ +CmdOrb orb = {}; +Irb irb = {}; +sense_data_eckd_dasd sd; +int rc, retries = 0; + +IPL_assert(fmt == 0 || fmt == 1, "Invalid ccw format"); + +/* ccw_addr must be <= 24 bits and point to at least one whole ccw. */ +if (fmt == 0) { +IPL_assert(ccw_addr <= 0xFF - 8, "Invalid ccw address"); +} + +orb.fmt = fmt ; +orb.pfch = 1; /* QEMU's cio implementation requires prefetch */ +orb.c64 = 1; /* QEMU's cio implementation requires 64-bit idaws */ +orb.lpm = 0xFF; /* All paths allowed */ +orb.cpa = ccw_addr; + +while (true) { +rc = ssch(schid, ); +if (rc == 1) { +/* Status pending, not sure why. Let's eat the status and retry. */ +tsch(schid, ); +retries++; +continue; +} +if (rc) { +print_int("ssch failed with rc=", rc); +break; +} + +consume_io_int(); + +/* collect status */ +rc = tsch(schid, ); +if (rc) { +print_int("tsch failed with rc=", rc); +break; +} + +if (!irb_error()) { +break; +} + +/* + * Unexpected unit check, or interface-control-check. Use sense to + * clear unit check then retry. + */ +if ((unit_check() || iface_ctrl_check()) && retries <= 2) { +basic_sense(schid, , sizeof(sd)); We are using basic sense to clear any unit check or ifcc, but is it possible for the basic sense to cause another unit check? The chapter on Basic Sense in the Common I/O Device Commands (http://publibz.boulder.ibm.com/support/libraryserver/FRAMESET/dz9ar501/2.1?SHELF==19920409154647=) says this: "" The basic sense command initiates a sense operation at all devices and cannot cause the command-reject, intervention-required, data-check, or overrun bit to be set to one. If the control unit detects an equipment malfunction or invalid checking-block code (CBC) on the sense-command code, the equipment-check or bus-out-check bit is set to one, and unit check is indicated in the
Re: [Qemu-devel] [PATCH 07/15] s390-bios: Decouple channel i/o logic from virtio
On 01/29/2019 08:29 AM, Jason J. Herne wrote: Create a separate library for channel i/o related code. This decouples channel i/o operations from virtio and allows us to make use of them for the real dasd boot path. Signed-off-by: Jason J. Herne --- pc-bios/s390-ccw/Makefile| 2 +- pc-bios/s390-ccw/cio.c | 41 pc-bios/s390-ccw/cio.h | 3 +++ pc-bios/s390-ccw/main.c | 1 + pc-bios/s390-ccw/netboot.mak | 2 +- pc-bios/s390-ccw/netmain.c | 1 + pc-bios/s390-ccw/s390-ccw.h | 1 - pc-bios/s390-ccw/virtio-blkdev.c | 1 + pc-bios/s390-ccw/virtio.c| 27 ++ 9 files changed, 51 insertions(+), 28 deletions(-) create mode 100644 pc-bios/s390-ccw/cio.c diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile index 1eb316b..12ad9c1 100644 --- a/pc-bios/s390-ccw/Makefile +++ b/pc-bios/s390-ccw/Makefile @@ -10,7 +10,7 @@ $(call set-vpath, $(SRC_PATH)/pc-bios/s390-ccw) .PHONY : all clean build-all OBJECTS = start.o main.o bootmap.o jump2ipl.o sclp.o menu.o \ - virtio.o virtio-scsi.o virtio-blkdev.o libc.o + virtio.o virtio-scsi.o virtio-blkdev.o libc.o cio.o QEMU_CFLAGS := $(filter -W%, $(QEMU_CFLAGS)) QEMU_CFLAGS += -ffreestanding -fno-delete-null-pointer-checks -msoft-float diff --git a/pc-bios/s390-ccw/cio.c b/pc-bios/s390-ccw/cio.c new file mode 100644 index 000..095f79b --- /dev/null +++ b/pc-bios/s390-ccw/cio.c @@ -0,0 +1,41 @@ +/* + * S390 Channel I/O + * + * Copyright (c) 2018 Jason J. Herne + * + * This work is licensed under the terms of the GNU GPL, version 2 or (at + * your option) any later version. See the COPYING file in the top-level + * directory. + */ + shouldn't the year be 2019 now? :)
Re: [Qemu-devel] [PATCH 06/15] s390-bios: Clean up cio.h
{ /* * sense-id response buffer layout */ -struct senseid { +typedef struct senseid { /* common part */ __u8 reserved; /* always 0x'FF' */ __u16 cu_type;/* control unit type */ @@ -203,15 +203,15 @@ struct senseid { __u8 unused; /* padding byte */ /* extended part */ struct ciw ciw[62]; -} __attribute__ ((packed, aligned(4))); +} __attribute__ ((packed, aligned(4))) SenseId; /* interruption response block */ -struct irb { +typedef struct irb { struct scsw scsw; __u32 esw[5]; __u32 ecw[8]; __u32 emw[8]; -} __attribute__ ((packed, aligned(4))); +} __attribute__ ((packed, aligned(4))) Irb; /* * Some S390 specific IO instructions as inline diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h index 9828aa2..241c6d0 100644 --- a/pc-bios/s390-ccw/s390-ccw.h +++ b/pc-bios/s390-ccw/s390-ccw.h @@ -49,14 +49,6 @@ typedef unsigned long long __u64; #include "cio.h" #include "iplb.h" -typedef struct irb Irb; -typedef struct ccw1 Ccw1; -typedef struct cmd_orb CmdOrb; -typedef struct schib Schib; -typedef struct chsc_area_sda ChscAreaSda; -typedef struct senseid SenseId; -typedef struct subchannel_id SubChannelId; - /* start.s */ void disabled_wait(void); void consume_sclp_int(void); Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH 05/15] s390-bios: Factor finding boot device out of virtio code path
; boot_setup(); -virtio_setup(); +find_boot_device(); +virtio_setup(); zipl_load(); /* no return */ panic("Failed to load OS from hard disk\n"); diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c index 58a48f3..9daf2cb 100644 --- a/tests/boot-serial-test.c +++ b/tests/boot-serial-test.c @@ -112,7 +112,7 @@ static testdef_t tests[] = { { "sparc", "SS-4", "", "MB86904" }, { "sparc", "SS-600MP", "", "TMS390Z55" }, { "sparc64", "sun4u", "", "UltraSPARC" }, -{ "s390x", "s390-ccw-virtio", "", "virtio device" }, +{ "s390x", "s390-ccw-virtio", "", "device" }, { "m68k", "mcf5208evb", "", "TT", sizeof(kernel_mcf5208), kernel_mcf5208 }, { "microblaze", "petalogix-s3adsp1800", "", "TT", sizeof(kernel_pls3adsp1800), kernel_pls3adsp1800 }, Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH 01/15] s390 vfio-ccw: Add bootindex property and IPLB data
On 01/29/2019 08:29 AM, Jason J. Herne wrote: Add bootindex property and iplb data for vfio-ccw devices. This allows us to forward boot information into the bios for vfio-ccw devices. Signed-off-by: Jason J. Herne Acked-by: Halil Pasic --- hw/s390x/ipl.c | 14 ++ hw/s390x/s390-ccw.c | 9 + hw/vfio/ccw.c | 13 + include/hw/s390x/s390-ccw.h | 1 + include/hw/s390x/vfio-ccw.h | 38 ++ 5 files changed, 63 insertions(+), 12 deletions(-) create mode 100644 include/hw/s390x/vfio-ccw.h diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c index 21f64ad..a993f65 100644 --- a/hw/s390x/ipl.c +++ b/hw/s390x/ipl.c @@ -19,6 +19,7 @@ #include "hw/loader.h" #include "hw/boards.h" #include "hw/s390x/virtio-ccw.h" +#include "hw/s390x/vfio-ccw.h" #include "hw/s390x/css.h" #include "hw/s390x/ebcdic.h" #include "ipl.h" @@ -311,8 +312,12 @@ static CcwDevice *s390_get_ccw_device(DeviceState *dev_st) VirtioCcwDevice *virtio_ccw_dev = (VirtioCcwDevice *) object_dynamic_cast(OBJECT(qdev_get_parent_bus(dev_st)->parent), TYPE_VIRTIO_CCW_DEVICE); +VFIOCCWDevice *vfio_ccw_dev = (VFIOCCWDevice *) +object_dynamic_cast(OBJECT(dev_st), TYPE_VFIO_CCW); if (virtio_ccw_dev) { ccw_dev = CCW_DEVICE(virtio_ccw_dev); +} else if (vfio_ccw_dev) { +ccw_dev = CCW_DEVICE(vfio_ccw_dev); } else { SCSIDevice *sd = (SCSIDevice *) object_dynamic_cast(OBJECT(dev_st), @@ -347,6 +352,8 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl) if (ccw_dev) { SCSIDevice *sd = (SCSIDevice *) object_dynamic_cast(OBJECT(dev_st), TYPE_SCSI_DEVICE); +VFIOCCWDevice *vc = (VFIOCCWDevice *) +object_dynamic_cast(OBJECT(dev_st), TYPE_VFIO_CCW); if (sd) { ipl->iplb.len = cpu_to_be32(S390_IPLB_MIN_QEMU_SCSI_LEN); @@ -358,6 +365,13 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl) ipl->iplb.scsi.channel = cpu_to_be16(sd->channel); ipl->iplb.scsi.devno = cpu_to_be16(ccw_dev->sch->devno); ipl->iplb.scsi.ssid = ccw_dev->sch->ssid & 3; +} else if (vc) { +CcwDevice *ccw_dev = CCW_DEVICE(vc); Do we need this line here? I though ccw_dev was already correctly casted in s390_get_ccw_device? + +ipl->iplb.len = cpu_to_be32(S390_IPLB_MIN_CCW_LEN); +ipl->iplb.pbt = S390_IPL_TYPE_CCW; +ipl->iplb.ccw.devno = cpu_to_be16(ccw_dev->sch->devno); +ipl->iplb.ccw.ssid = ccw_dev->sch->ssid & 3; } else { VirtIONet *vn = (VirtIONet *) object_dynamic_cast(OBJECT(dev_st), TYPE_VIRTIO_NET);
Re: [Qemu-devel] [PATCH 03/15] s390-bios: decouple common boot logic from virtio
On 01/29/2019 08:29 AM, Jason J. Herne wrote: Create a boot_setup function to handle getting boot information from the machine/hypervisor. This decouples common boot logic from the virtio code path and allows us to make use of it for the real dasd boot scenario. Signed-off-by: Jason J. Herne Acked-by: Halil Pasic Reviewed-by: Collin Walling --- pc-bios/s390-ccw/main.c | 28 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c index e82fe2c..67df421 100644 --- a/pc-bios/s390-ccw/main.c +++ b/pc-bios/s390-ccw/main.c @@ -14,16 +14,17 @@ char stack[PAGE_SIZE * 8] __attribute__((__aligned__(PAGE_SIZE))); static SubChannelId blk_schid = { .one = 1 }; -IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE))); static char loadparm_str[LOADPARM_LEN + 1] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }; QemuIplParameters qipl; +IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE))); +static bool have_iplb; #define LOADPARM_PROMPT "PROMPT " #define LOADPARM_EMPTY "" #define BOOT_MENU_FLAG_MASK (QIPL_FLAG_BM_OPTS_CMD | QIPL_FLAG_BM_OPTS_ZIPL) /* - * Priniciples of Operations (SA22-7832-09) chapter 17 requires that + * Principles of Operations (SA22-7832-09) chapter 17 requires that * a subsystem-identification is at 184-187 and bytes 188-191 are zero * after list-directed-IPL and ccw-IPL. */ @@ -111,23 +112,33 @@ static void css_setup(void) enable_mss_facility(); } +/* + * Collect various pieces of information from the hypervisor/hardware that + * we'll use to determine exactly how we'll boot. + */ +static void boot_setup(void) +{ +char lpmsg[] = "LOADPARM=[]\n"; + +sclp_get_loadparm_ascii(loadparm_str); +memcpy(lpmsg + 10, loadparm_str, 8); +sclp_print(lpmsg); + +have_iplb = store_iplb(); +} + static void virtio_setup(void) { Schib schib; int ssid; bool found = false; uint16_t dev_no; -char ldp[] = "LOADPARM=[]\n"; VDev *vdev = virtio_get_device(); QemuIplParameters *early_qipl = (QemuIplParameters *)QIPL_ADDRESS; -sclp_get_loadparm_ascii(loadparm_str); -memcpy(ldp + 10, loadparm_str, LOADPARM_LEN); -sclp_print(ldp); - memcpy(, early_qipl, sizeof(QemuIplParameters)); -if (store_iplb()) { +if (have_iplb) { switch (iplb.pbt) { case S390_IPL_TYPE_CCW: dev_no = iplb.ccw.devno; @@ -174,6 +185,7 @@ int main(void) { sclp_setup(); css_setup(); +boot_setup(); virtio_setup(); zipl_load(); /* no return */ Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH 02/15] s390-bios: decouple cio setup from virtio
On 01/29/2019 08:29 AM, Jason J. Herne wrote: Move channel i/o setup code out to a separate function. This decouples cio setup from the virtio code path and allows us to make use of it for booting dasd devices. Signed-off-by: Jason J. Herne Acked-by: Halil Pasic Reviewed-by: Collin Walling --- pc-bios/s390-ccw/main.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c index 544851d..e82fe2c 100644 --- a/pc-bios/s390-ccw/main.c +++ b/pc-bios/s390-ccw/main.c @@ -99,6 +99,18 @@ static void menu_setup(void) } } +/* + * Initialize the channel I/O subsystem so we can talk to our ipl/boot device. + */ +static void css_setup(void) +{ +/* + * Unconditionally enable mss support. In every sane configuration this + * will succeed; and even if it doesn't, stsch_err() can handle it. + */ +enable_mss_facility(); +} + static void virtio_setup(void) { Schib schib; @@ -109,13 +121,6 @@ static void virtio_setup(void) VDev *vdev = virtio_get_device(); QemuIplParameters *early_qipl = (QemuIplParameters *)QIPL_ADDRESS; -/* - * We unconditionally enable mss support. In every sane configuration, - * this will succeed; and even if it doesn't, stsch_err() can deal - * with the consequences. - */ -enable_mss_facility(); - sclp_get_loadparm_ascii(loadparm_str); memcpy(ldp + 10, loadparm_str, LOADPARM_LEN); sclp_print(ldp); @@ -168,6 +173,7 @@ static void virtio_setup(void) int main(void) { sclp_setup(); +css_setup(); virtio_setup(); zipl_load(); /* no return */ Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH v2 2/5] vfio-ccw: concurrent I/O handling
On 01/30/2019 08:29 AM, Cornelia Huck wrote: On Tue, 29 Jan 2019 20:39:33 +0100 Halil Pasic wrote: On Tue, 29 Jan 2019 10:58:40 +0100 Cornelia Huck wrote: The problem I see with the let the hardware sort it out is that, for that to work, we need to juggle multiple translations simultaneously (or am I wrong?). Doing that does not appear particularly simple to me. None in the first stage, at most two in the second stage, I guess. Expected benefit of the second stage over the first stage? (I see none.) Making something possible that is allowed by the architecture. Not really important, though. I had a chat with Farhan, and he suggested that by 'allowed by architecture' you mean " You can submit a new request if the subchannel is pending with primary, but not with secondary state." (from Message-ID: <20190125152154.05120461.coh...@redhat.com>). Yes. I might have mixed things up, though. So I re-read the PoP. From the description of the start subchannel instruction: """ Special Conditions Condition code 1 is set, and no other action is taken, when the subchannel is status pending when START SUBCHANNEL is executed. On some mod- els, condition code 1 is not set when the subchannel is status pending with only secondary status; instead, the status-pending condition is discarded. Condition code 2 is set, and no other action is taken, when a start, halt, or clear function is currently in progress at the subchannel (see “Function Control (FC)” on page 13). """ So I guess you mixed primary and secondary up and wanted to say: "You can submit a new request if the subchannel is pending with _secondary_, but not with _primary_ _status_." But does that really mean architecture allows the subchannel to accept multiple ssch() instructions so that it ends up processing two or more channel programs in parallel. That's not what I meant. The vfio-ccw driver still holds on to one cp, while a second one could be submitted. But let's just end discussing this here, and continue with discussing the reworked state machine, ok? It's not really relevant for going forward with halt/clear. +1 I think we should move forward with halt/clear. Thanks Farhan
Re: [Qemu-devel] [PATCH 11/15] s390-bios: cio error handling
On 12/12/2018 09:11 AM, Jason J. Herne wrote: Add verbose error output for when unexpected i/o errors happen. This eases the burden of debugging and reporting i/o errors. No error information is printed in the success case, here is an example of what is output on error: vfio-ccw device I/O error - Interrupt Response Block Data: Function Ctrl : [Start] Activity Ctrl : [Start-Pending] Status Ctrl : [Alert] [Primary] [Secondary] [Status-Pending] Device Status : [Unit-Check] Channel Status : cpa=: 0x01e67098 prev_ccw=: 0x this_ccw=: 0x Sense Data (fmt 32-bytes): Sense Condition Flags : [Equipment-Check] Residual Count =: 0x Phys Drive ID =: 0x009e low cyl address=: 0x head addr & hi cyl =: 0x format/message =: 0x0008 fmt-dependent[0-7] =: 0x0004 fmt-dependent[8-15]=: 0xe561282305082fff prog action code =: 0x0016 Configuration info =: 0x40e0 mcode / hi-cyl =: 0x cyl & head addr [0]=: 0x cyl & head addr [1]=: 0x cyl & head addr [2]=: 0x Signed-off-by: Jason J. Herne --- pc-bios/s390-ccw/cio.c | 225 pc-bios/s390-ccw/libc.h | 11 +++ 2 files changed, 236 insertions(+) diff --git a/pc-bios/s390-ccw/cio.c b/pc-bios/s390-ccw/cio.c index 9019250..c72e13c 100644 --- a/pc-bios/s390-ccw/cio.c +++ b/pc-bios/s390-ccw/cio.c @@ -83,6 +83,228 @@ static bool irb_error(Irb *irb) return irb->scsw.dstat != 0xc; } +static void print_sense_data(SenseData *sd) +{ +char msgline[512]; + +if (sd->config_info & 0x8000) { +sclp_print("Sense Data (fmt 24-bytes):\n"); +} else { +sclp_print("Sense Data (fmt 32-bytes):\n"); +} + +strcat(msgline, "Sense Condition Flags :"); +if (sd->status[0] & SNS0_CMD_REJECT) { +strcat(msgline, " [Cmd-Reject]"); +} +if (sd->status[0] & SNS0_INTERVENTION_REQ) { +strcat(msgline, " [Intervention-Required]"); +} +if (sd->status[0] & SNS0_BUS_OUT_CHECK) { +strcat(msgline, " [Bus-Out-Parity-Check]"); +} +if (sd->status[0] & SNS0_EQUIPMENT_CHECK) { +strcat(msgline, " [Equipment-Check]"); +} +if (sd->status[0] & SNS0_DATA_CHECK) { +strcat(msgline, " [Data-Check]"); +} +if (sd->status[0] & SNS0_OVERRUN) { +strcat(msgline, " [Overrun]"); +} +if (sd->status[0] & SNS0_INCOMPL_DOMAIN) { +strcat(msgline, " [Incomplete-Domain]"); +} + +if (sd->status[1] & SNS1_PERM_ERR) { +strcat(msgline, " [Permanent-Error]"); +} +if (sd->status[1] & SNS1_INV_TRACK_FORMAT) { +strcat(msgline, " [Invalid-Track-Fmt]"); +} +if (sd->status[1] & SNS1_EOC) { +strcat(msgline, " [End-of-Cyl]"); +} +if (sd->status[1] & SNS1_MESSAGE_TO_OPER) { +strcat(msgline, " [Operator-Msg]"); +} +if (sd->status[1] & SNS1_NO_REC_FOUND) { +strcat(msgline, " [No-Record-Found]"); +} +if (sd->status[1] & SNS1_FILE_PROTECTED) { +strcat(msgline, " [File-Protected]"); +} +if (sd->status[1] & SNS1_WRITE_INHIBITED) { +strcat(msgline, " [Write-Inhibited]"); +} +if (sd->status[1] & SNS1_INPRECISE_END) { +strcat(msgline, " [Imprecise-Ending]"); +} + +if (sd->status[2] & SNS2_REQ_INH_WRITE) { +strcat(msgline, " [Req-Inhibit-Write]"); +} +if (sd->status[2] & SNS2_CORRECTABLE) { +strcat(msgline, " [Correctable-Data-Check]"); +} +if (sd->status[2] & SNS2_FIRST_LOG_ERR) { +strcat(msgline, " [First-Error-Log]"); +} +if (sd->status[2] & SNS2_ENV_DATA_PRESENT) { +strcat(msgline, " [Env-Data-Present]"); +} +if (sd->status[2] & SNS2_INPRECISE_END) { +strcat(msgline, " [Imprecise-End]"); +} +strcat(msgline, "\n"); +sclp_print(msgline); + +print_int("Residual Count =", sd->res_count); +print_int("Phys Drive ID =", sd->phys_drive_id); +print_int("low cyl address=", sd->low_cyl_addr); +print_int("head addr & hi cyl =", sd->head_high_cyl_addr); +print_int("format/message =", sd->fmt_msg); +print_int("fmt-dependent[0-7] =", sd->fmt_dependent_info[0]); +print_int("fmt-dependent[8-15]=", sd->fmt_dependent_info[1]); +print_int("prog action code =", sd->program_action_code); +print_int("Configuration info =", sd->config_info); +print_int("mcode / hi-cyl =", sd->mcode_hicyl); +print_int("cyl & head addr [0]=", sd->cyl_head_addr[0]); +print_int("cyl & head addr [1]=", sd->cyl_head_addr[1]); +print_int("cyl & head addr [2]=", sd->cyl_head_addr[2]); +} + +static void print_irb_err(Irb *irb) +{ +
Re: [Qemu-devel] [PATCH 10/15] s390-bios: Support for running format-0/1 channel programs
On 12/12/2018 09:11 AM, Jason J. Herne wrote: Add struct for format-0 ccws. Support executing format-0 channel programs and waiting for their completion before continuing execution. This will be used for real dasd ipl. Add cu_type() to channel io library. This will be used to query control unit type which is used to determine if we are booting a virtio device or a real dasd device. Signed-off-by: Jason J. Herne --- pc-bios/s390-ccw/cio.c | 108 ++ pc-bios/s390-ccw/cio.h | 124 ++-- pc-bios/s390-ccw/s390-ccw.h | 1 + pc-bios/s390-ccw/start.S| 33 +++- 4 files changed, 261 insertions(+), 5 deletions(-) diff --git a/pc-bios/s390-ccw/cio.c b/pc-bios/s390-ccw/cio.c index 095f79b..9019250 100644 --- a/pc-bios/s390-ccw/cio.c +++ b/pc-bios/s390-ccw/cio.c @@ -10,6 +10,7 @@ #include "libc.h" #include "s390-ccw.h" +#include "s390-arch.h" #include "cio.h" static char chsc_page[PAGE_SIZE] __attribute__((__aligned__(PAGE_SIZE))); @@ -39,3 +40,110 @@ void enable_subchannel(SubChannelId schid) schib.pmcw.ena = 1; msch(schid, ); } + +uint16_t cu_type(SubChannelId schid) +{ +Ccw1 sense_id_ccw; +SenseId sense_data; + +sense_id_ccw.cmd_code = CCW_CMD_SENSE_ID; +sense_id_ccw.cda = ptr2u32(_data); +sense_id_ccw.count = sizeof(sense_data); + +if (do_cio(schid, ptr2u32(_id_ccw), CCW_FMT1)) { +panic("Failed to run SenseID CCw\n"); +} + +return sense_data.cu_type; +} + +void basic_sense(SubChannelId schid, SenseData *sd) +{ +Ccw1 senseCcw; + +senseCcw.cmd_code = CCW_CMD_BASIC_SENSE; +senseCcw.cda = ptr2u32(sd); +senseCcw.count = sizeof(*sd); + +if (do_cio(schid, ptr2u32(), CCW_FMT1)) { +panic("Failed to run Basic Sense CCW\n"); +} +} + +static bool irb_error(Irb *irb) +{ +/* We have to ignore Incorrect Length (cstat == 0x40) indicators because + * real devices expect a 24 byte SenseID buffer, and virtio devices expect + * a much larger buffer. Neither device type can tolerate a buffer size + * different from what they expect so they set this indicator. + */ +if (irb->scsw.cstat != 0x00 && irb->scsw.cstat != 0x40) { +return true; +} +return irb->scsw.dstat != 0xc; +} + +/* Executes a channel program at a given subchannel. The request to run the + * channel program is sent to the subchannel, we then wait for the interrupt + * singaling completion of the I/O operation(s) perfomed by the channel + * program. Lastly we verify that the i/o operation completed without error and + * that the interrupt we received was for the subchannel used to run the + * channel program. + * + * Note: This function assumes it is running in an environment where no other + * cpus are generating or receiving I/O interrupts. So either run it in a + * single-cpu environment or make sure all other cpus are not doing I/O and + * have I/O interrupts masked off. + */ +int do_cio(SubChannelId schid, uint32_t ccw_addr, int fmt) +{ +CmdOrb orb = {}; +Irb irb = {}; +SenseData sd; +int rc, retries = 0; + +IPL_assert(fmt == 0 || fmt == 1, "Invalid ccw format"); + +/* ccw_addr must be <= 24 bits and point to at least one whole ccw. */ +if (fmt == 0) { +IPL_assert(ccw_addr <= 0xFF - 8, "Invalid ccw address"); +} + +orb.fmt = fmt ; +orb.pfch = 1; /* QEMU's cio implementation requires prefetch */ +orb.c64 = 1; /* QEMU's cio implementation requires 64-bit idaws */ +orb.lpm = 0xFF; /* All paths allowed */ +orb.cpa = ccw_addr; + +while (true) { +rc = ssch(schid, ); +if (rc) { +print_int("ssch failed with rc=", rc); +break; +} + +consume_io_int(); + +/* Clear read */ +rc = tsch(schid, ); +if (rc) { +print_int("tsch failed with rc=", rc); +break; +} + +if (!irb_error()) { +break; +} Now that we enable i/o interrupts, do we still need drain_irqs function anymore? Maybe we could refactor that. My question is more out of curiosity and I know it's out of the score for this patch series. + +/* Unexpected unit check. Use sense to clear unit check then retry. */ +if (unit_check() && retries <= 2) { +basic_sense(schid, ); +retries++; +continue; +} + +break; +} + +return rc; +} diff --git a/pc-bios/s390-ccw/cio.h b/pc-bios/s390-ccw/cio.h index 7b07d75..5c16635 100644 --- a/pc-bios/s390-ccw/cio.h +++ b/pc-bios/s390-ccw/cio.h @@ -70,9 +70,46 @@ struct scsw { __u16 count; } __attribute__ ((packed)); -#define SCSW_FCTL_CLEAR_FUNC 0x1000 -#define SCSW_FCTL_HALT_FUNC 0x2000 +/* Function Control */ #define SCSW_FCTL_START_FUNC 0x4000 +#define SCSW_FCTL_HALT_FUNC 0x2000 +#define SCSW_FCTL_CLEAR_FUNC 0x1000 + +/* Activity Control */ +#define
Re: [Qemu-devel] [PATCH 05/15] s390-bios: Factor finding boot device out of virtio code path
On 12/12/2018 09:11 AM, Jason J. Herne wrote: Make a new routine find_boot_device to locate the boot device for all cases. not just virtio. I don't think there should be a period after cases? In one case no boot device is specified and a suitable boot device can not be auto detected. The error message for this case was specific to virtio devices. We update this message to remove virtio specific wording. Signed-off-by: Jason J. Herne --- pc-bios/s390-ccw/main.c | 87 ++-- tests/boot-serial-test.c | 2 +- 2 files changed, 49 insertions(+), 40 deletions(-) diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c index 7e3f65e..2457752 100644 --- a/pc-bios/s390-ccw/main.c +++ b/pc-bios/s390-ccw/main.c @@ -55,17 +55,18 @@ unsigned int get_loadparm_index(void) * NOTE: The global variable blk_schid is updated to contain the subchannel * information. */ -static bool find_dev(Schib *schib, int dev_no) +static bool find_subch(int dev_no) { +Schib schib; int i, r; for (i = 0; i < 0x1; i++) { blk_schid.sch_no = i; -r = stsch_err(blk_schid, schib); +r = stsch_err(blk_schid, ); if ((r == 3) || (r == -EIO)) { break; } -if (!schib->pmcw.dnv) { +if (!schib.pmcw.dnv) { continue; } @@ -77,7 +78,7 @@ static bool find_dev(Schib *schib, int dev_no) continue; } -if ((dev_no < 0) || (schib->pmcw.dev == dev_no)) { +if ((dev_no < 0) || (schib.pmcw.dev == dev_no)) { return true; } } @@ -133,56 +134,63 @@ static void boot_setup(void) have_iplb = store_iplb(); } -static void virtio_setup(void) +static void find_boot_device(void) { -Schib schib; -int ssid; -bool found = false; -uint16_t dev_no; VDev *vdev = virtio_get_device(); -QemuIplParameters *early_qipl = (QemuIplParameters *)QIPL_ADDRESS; - -memcpy(, early_qipl, sizeof(QemuIplParameters)); +int ssid; +bool found; -if (have_iplb) { -switch (iplb.pbt) { -case S390_IPL_TYPE_CCW: -dev_no = iplb.ccw.devno; -debug_print_int("device no. ", dev_no); -blk_schid.ssid = iplb.ccw.ssid & 0x3; -debug_print_int("ssid ", blk_schid.ssid); -found = find_dev(, dev_no); -break; -case S390_IPL_TYPE_QEMU_SCSI: -vdev->scsi_device_selected = true; -vdev->selected_scsi_device.channel = iplb.scsi.channel; -vdev->selected_scsi_device.target = iplb.scsi.target; -vdev->selected_scsi_device.lun = iplb.scsi.lun; -blk_schid.ssid = iplb.scsi.ssid & 0x3; -found = find_dev(, iplb.scsi.devno); -break; -default: -panic("List-directed IPL not supported yet!\n"); -} -menu_setup(); -} else { +if (!have_iplb) { for (ssid = 0; ssid < 0x3; ssid++) { blk_schid.ssid = ssid; -found = find_dev(, -1); +found = find_subch(-1); if (found) { -break; +return; } } +panic("Could not find a suitable boot device (none specified)\n"); } -IPL_assert(found, "No virtio device found"); +switch (iplb.pbt) { +case S390_IPL_TYPE_CCW: +debug_print_int("device no. ", iplb.ccw.devno); +blk_schid.ssid = iplb.ccw.ssid & 0x3; +debug_print_int("ssid ", blk_schid.ssid); +found = find_subch(iplb.ccw.devno); +break; +case S390_IPL_TYPE_QEMU_SCSI: +vdev->scsi_device_selected = true; +vdev->selected_scsi_device.channel = iplb.scsi.channel; +vdev->selected_scsi_device.target = iplb.scsi.target; +vdev->selected_scsi_device.lun = iplb.scsi.lun; +blk_schid.ssid = iplb.scsi.ssid & 0x3; +found = find_subch(iplb.scsi.devno); +break; +default: +panic("List-directed IPL not supported yet!\n"); +} + +if (!found) { +IPL_assert(found, "Boot device not found\n"); +} +} + +static void virtio_setup(void) +{ +VDev *vdev = virtio_get_device(); +QemuIplParameters *early_qipl = (QemuIplParameters *)QIPL_ADDRESS; + +memcpy(, early_qipl, sizeof(QemuIplParameters)); + +if (have_iplb) { +menu_setup(); +} if (virtio_get_device_type() == VIRTIO_ID_NET) { sclp_print("Network boot device detected\n"); vdev->netboot_start_addr = qipl.netboot_start_addr; } else { virtio_blk_setup_device(blk_schid); - IPL_assert(virtio_ipl_disk_is_valid(), "No valid IPL device detected"); } } @@ -192,8 +200,9 @@ int main(void) sclp_setup(); css_setup(); boot_setup(); -virtio_setup(); +find_boot_device(); +virtio_setup(); zipl_load(); /* no return */
Re: [Qemu-devel] [PATCH 03/15] s390-bios: decouple common boot logic from virtio
On 12/12/2018 09:11 AM, Jason J. Herne wrote: Create a boot_setup function to handle getting boot information from the machine/hypervisor. This decouples common boot logic from the virtio code path and allows us to make use of it for the real dasd boot scenario. Signed-off-by: Jason J. Herne Acked-by: Halil Pasic Reviewed-by: Collin Walling --- pc-bios/s390-ccw/main.c | 28 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c index e82fe2c..67df421 100644 --- a/pc-bios/s390-ccw/main.c +++ b/pc-bios/s390-ccw/main.c @@ -14,16 +14,17 @@ char stack[PAGE_SIZE * 8] __attribute__((__aligned__(PAGE_SIZE))); static SubChannelId blk_schid = { .one = 1 }; -IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE))); static char loadparm_str[LOADPARM_LEN + 1] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }; QemuIplParameters qipl; +IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE))); +static bool have_iplb; #define LOADPARM_PROMPT "PROMPT " #define LOADPARM_EMPTY "" #define BOOT_MENU_FLAG_MASK (QIPL_FLAG_BM_OPTS_CMD | QIPL_FLAG_BM_OPTS_ZIPL) /* - * Priniciples of Operations (SA22-7832-09) chapter 17 requires that + * Principles of Operations (SA22-7832-09) chapter 17 requires that * a subsystem-identification is at 184-187 and bytes 188-191 are zero * after list-directed-IPL and ccw-IPL. */ @@ -111,23 +112,33 @@ static void css_setup(void) enable_mss_facility(); } +/* + * Collect various pieces of information from the hypervisor/hardware that + * we'll use to determine exactly how we'll boot. + */ +static void boot_setup(void) +{ +char lpmsg[] = "LOADPARM=[]\n"; + +sclp_get_loadparm_ascii(loadparm_str); +memcpy(lpmsg + 10, loadparm_str, 8); +sclp_print(lpmsg); + +have_iplb = store_iplb(); +} + static void virtio_setup(void) { Schib schib; int ssid; bool found = false; uint16_t dev_no; -char ldp[] = "LOADPARM=[]\n"; VDev *vdev = virtio_get_device(); QemuIplParameters *early_qipl = (QemuIplParameters *)QIPL_ADDRESS; -sclp_get_loadparm_ascii(loadparm_str); -memcpy(ldp + 10, loadparm_str, LOADPARM_LEN); -sclp_print(ldp); - memcpy(, early_qipl, sizeof(QemuIplParameters)); -if (store_iplb()) { +if (have_iplb) { switch (iplb.pbt) { case S390_IPL_TYPE_CCW: dev_no = iplb.ccw.devno; @@ -174,6 +185,7 @@ int main(void) { sclp_setup(); css_setup(); +boot_setup(); virtio_setup(); zipl_load(); /* no return */ Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH 02/15] s390-bios: decouple cio setup from virtio
On 12/12/2018 09:11 AM, Jason J. Herne wrote: Move channel i/o setup code out to a separate function. This decouples cio setup from the virtio code path and allows us to make use of it for booting dasd devices. Signed-off-by: Jason J. Herne Acked-by: Halil Pasic Reviewed-by: Collin Walling --- pc-bios/s390-ccw/main.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c index 544851d..e82fe2c 100644 --- a/pc-bios/s390-ccw/main.c +++ b/pc-bios/s390-ccw/main.c @@ -99,6 +99,18 @@ static void menu_setup(void) } } +/* + * Initialize the channel I/O subsystem so we can talk to our ipl/boot device. + */ +static void css_setup(void) +{ +/* + * Unconditionally enable mss support. In every sane configuration this + * will succeed; and even if it doesn't, stsch_err() can handle it. + */ +enable_mss_facility(); +} + static void virtio_setup(void) { Schib schib; @@ -109,13 +121,6 @@ static void virtio_setup(void) VDev *vdev = virtio_get_device(); QemuIplParameters *early_qipl = (QemuIplParameters *)QIPL_ADDRESS; -/* - * We unconditionally enable mss support. In every sane configuration, - * this will succeed; and even if it doesn't, stsch_err() can deal - * with the consequences. - */ -enable_mss_facility(); - sclp_get_loadparm_ascii(loadparm_str); memcpy(ldp + 10, loadparm_str, LOADPARM_LEN); sclp_print(ldp); @@ -168,6 +173,7 @@ static void virtio_setup(void) int main(void) { sclp_setup(); +css_setup(); virtio_setup(); zipl_load(); /* no return */ Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH v2] hw/s390/ccw.c: Don't take address of packed members
On 12/13/2018 07:02 AM, Peter Maydell wrote: Taking the address of a field in a packed struct is a bad idea, because it might not be actually aligned enough for that pointer type (and thus cause a crash on dereference on some host architectures). Newer versions of clang warn about this. Avoid the problem by using local copies of the PMCW and SCSW struct fields in copy_schib_from_guest() and copy_schib_to_guest(). Signed-off-by: Peter Maydell --- v1->v2 changes: * add comment about why we're using locals * name locals with underscores, as QEMU's naming conventions recommend hw/s390x/css.c | 28 1 file changed, 24 insertions(+), 4 deletions(-) diff --git a/hw/s390x/css.c b/hw/s390x/css.c index 04ec5cc9705..f92b046cd33 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -1290,9 +1290,19 @@ void copy_scsw_to_guest(SCSW *dest, const SCSW *src) static void copy_schib_to_guest(SCHIB *dest, const SCHIB *src) { int i; +/* + * We copy the PMCW and SCSW in and out of local variables to + * avoid taking the address of members of a packed struct. + */ +PMCW src_pmcw, dest_pmcw; +SCSW src_scsw, dest_scsw; -copy_pmcw_to_guest(>pmcw, >pmcw); -copy_scsw_to_guest(>scsw, >scsw); +src_pmcw = src->pmcw; +copy_pmcw_to_guest(_pmcw, _pmcw); +dest->pmcw = dest_pmcw; +src_scsw = src->scsw; +copy_scsw_to_guest(_scsw, _scsw); +dest->scsw = dest_scsw; dest->mba = cpu_to_be64(src->mba); for (i = 0; i < ARRAY_SIZE(dest->mda); i++) { dest->mda[i] = src->mda[i]; @@ -1339,9 +1349,19 @@ static void copy_scsw_from_guest(SCSW *dest, const SCSW *src) static void copy_schib_from_guest(SCHIB *dest, const SCHIB *src) { int i; +/* + * We copy the PMCW and SCSW in and out of local variables to + * avoid taking the address of members of a packed struct. + */ +PMCW src_pmcw, dest_pmcw; +SCSW src_scsw, dest_scsw; -copy_pmcw_from_guest(>pmcw, >pmcw); -copy_scsw_from_guest(>scsw, >scsw); +src_pmcw = src->pmcw; +copy_pmcw_from_guest(_pmcw, _pmcw); +dest->pmcw = dest_pmcw; +src_scsw = src->scsw; +copy_scsw_from_guest(_scsw, _scsw); +dest->scsw = dest_scsw; dest->mba = be64_to_cpu(src->mba); for (i = 0; i < ARRAY_SIZE(dest->mda); i++) { dest->mda[i] = src->mda[i]; Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH] hw/s390/ccw.c: Don't take address of packed members
On 12/10/2018 08:58 AM, Peter Maydell wrote: Taking the address of a field in a packed struct is a bad idea, because it might not be actually aligned enough for that pointer type (and thus cause a crash on dereference on some host architectures). Newer versions of clang warn about this. Avoid the problem by using local copies of the PMCW and SCSW struct fields in copy_schib_from_guest() and copy_schib_to_guest(). Signed-off-by: Peter Maydell --- This seemed like a not totally ugly and reasonably localised fix that satisfies clang. Oddly, this makes the generated object file 15K smaller (421K vs 406K), so it might even be better code... hw/s390x/css.c | 20 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/hw/s390x/css.c b/hw/s390x/css.c index 04ec5cc9705..ef07691e36b 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -1290,9 +1290,15 @@ void copy_scsw_to_guest(SCSW *dest, const SCSW *src) static void copy_schib_to_guest(SCHIB *dest, const SCHIB *src) { int i; +PMCW srcpmcw, destpmcw; +SCSW srcscsw, destscsw; -copy_pmcw_to_guest(>pmcw, >pmcw); -copy_scsw_to_guest(>scsw, >scsw); +srcpmcw = src->pmcw; +copy_pmcw_to_guest(, ); +dest->pmcw = destpmcw; +srcscsw = src->scsw; +copy_scsw_to_guest(, ); +dest->scsw = destscsw; dest->mba = cpu_to_be64(src->mba); for (i = 0; i < ARRAY_SIZE(dest->mda); i++) { dest->mda[i] = src->mda[i]; @@ -1339,9 +1345,15 @@ static void copy_scsw_from_guest(SCSW *dest, const SCSW *src) static void copy_schib_from_guest(SCHIB *dest, const SCHIB *src) { int i; +PMCW srcpmcw, destpmcw; +SCSW srcscsw, destscsw; -copy_pmcw_from_guest(>pmcw, >pmcw); -copy_scsw_from_guest(>scsw, >scsw); +srcpmcw = src->pmcw; +copy_pmcw_from_guest(, ); +dest->pmcw = destpmcw; +srcscsw = src->scsw; +copy_scsw_from_guest(, ); +dest->scsw = destscsw; dest->mba = be64_to_cpu(src->mba); for (i = 0; i < ARRAY_SIZE(dest->mda); i++) { dest->mda[i] = src->mda[i]; Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH 0/3] vfio-ccw: support hsch/csch (kernel part)
On 12/06/2018 11:21 AM, Cornelia Huck wrote: On Thu, 6 Dec 2018 10:26:12 -0500 Farhan Ali wrote: On 12/06/2018 09:39 AM, Cornelia Huck wrote: On Wed, 5 Dec 2018 13:34:11 -0500 Farhan Ali wrote: On 12/05/2018 07:54 AM, Cornelia Huck wrote: Yeah, that is perfectly clear, but it ain't the complete story. E.g. are subsequent commands blocking until the preceding command finishes is part of the interface. And what is good implementation depends on the answer. What I mean, I first need to understand how things are supposed to work (together) so I can double check that against the implementation. Otherwise all I can do is nitpicking. To get more tangible: we are in the middle of processing an SSCH request (e.g. doing the translation) when a HSCH comes in. What should happen? Should we start processing HSCH after he instruction part of SSCH is done -- which currently includes translation? Or should we -EBUSY? Or do we abort START related activities and do the HALT stuff? I think most of the sorting-out-the-operations stuff should be done by the hardware itself, and we should not really try to enforce anything special in our vfio code. For your example, it might be best if a hsch is always accepted and send on towards the hardware. Probably best to reflect back -EAGAIN if we're currently processing another instruction from another vcpu, so that the user space caller can retry. Same for ssch, if another ssch is already being processed. We*could* reflect cc 2 if the fctl bit is already set, but that won't do for csch, so it is probably best to have the hardware figure that out in any case. If I read the code correctly, we currently reflect -EBUSY and not -EAGAIN if we get a ssch request while already processing another one. QEMU hands that back to the guest as a cc 2, which is not 100% correct. In practice, we don't see this with Linux guests due to locking. If we have a ssch and a csch immediately afterwards from userspace, will we end up issuing csch first and then ssch to the hardware? If I understand correctly, the ccw translation as part of the ssch can be a slow operation so it might be possible we issue the csch first? In that case we won't actually clear the original start function as intended. When we start processing the ssch request (translation and so on), we set the state to BUSY. This means that any csch request will get a -EBUSY, no overtaking possible. (I think maybe I'll need to check what this series looks like if I rebase it on top of Pierre's rework, as he did some changes in the state machine.) I think you meant the state is set to BOXED? otherwise the patch 3 says if state is BUSY and CLEAR event request comes in, we issue the clear instruction, no? That's what I meant with "need to rebase" :) The BOXED state is gone; I just had not rebased on top of it. There's more changes in the state machine; if we are on the same page as to what should happen, I can start massaging the patches. Sorry maybe I missed it, but are you referring to Pierre's latest cleanup patches? I don't see him removing the BOXED state. I think returning -EAGAIN and asking the userspace to retry the operation sounds reasonable to me. But how do we handle the issue of protecting the cmd_region from simultaneous hsch and csch calls? Do we agree on Pierre's method of making write calls mutually exclusive?
Re: [Qemu-devel] [PATCH 0/3] vfio-ccw: support hsch/csch (kernel part)
On 12/06/2018 09:39 AM, Cornelia Huck wrote: On Wed, 5 Dec 2018 13:34:11 -0500 Farhan Ali wrote: On 12/05/2018 07:54 AM, Cornelia Huck wrote: Yeah, that is perfectly clear, but it ain't the complete story. E.g. are subsequent commands blocking until the preceding command finishes is part of the interface. And what is good implementation depends on the answer. What I mean, I first need to understand how things are supposed to work (together) so I can double check that against the implementation. Otherwise all I can do is nitpicking. To get more tangible: we are in the middle of processing an SSCH request (e.g. doing the translation) when a HSCH comes in. What should happen? Should we start processing HSCH after he instruction part of SSCH is done -- which currently includes translation? Or should we -EBUSY? Or do we abort START related activities and do the HALT stuff? I think most of the sorting-out-the-operations stuff should be done by the hardware itself, and we should not really try to enforce anything special in our vfio code. For your example, it might be best if a hsch is always accepted and send on towards the hardware. Probably best to reflect back -EAGAIN if we're currently processing another instruction from another vcpu, so that the user space caller can retry. Same for ssch, if another ssch is already being processed. We*could* reflect cc 2 if the fctl bit is already set, but that won't do for csch, so it is probably best to have the hardware figure that out in any case. If I read the code correctly, we currently reflect -EBUSY and not -EAGAIN if we get a ssch request while already processing another one. QEMU hands that back to the guest as a cc 2, which is not 100% correct. In practice, we don't see this with Linux guests due to locking. If we have a ssch and a csch immediately afterwards from userspace, will we end up issuing csch first and then ssch to the hardware? If I understand correctly, the ccw translation as part of the ssch can be a slow operation so it might be possible we issue the csch first? In that case we won't actually clear the original start function as intended. When we start processing the ssch request (translation and so on), we set the state to BUSY. This means that any csch request will get a -EBUSY, no overtaking possible. (I think maybe I'll need to check what this series looks like if I rebase it on top of Pierre's rework, as he did some changes in the state machine.) I think you meant the state is set to BOXED? otherwise the patch 3 says if state is BUSY and CLEAR event request comes in, we issue the clear instruction, no? My idea above was to return -EAGAIN instead of -EBUSY, so that user space can retry the operation.
Re: [Qemu-devel] [PATCH 0/3] vfio-ccw: support hsch/csch (kernel part)
On 12/05/2018 07:54 AM, Cornelia Huck wrote: Yeah, that is perfectly clear, but it ain't the complete story. E.g. are subsequent commands blocking until the preceding command finishes is part of the interface. And what is good implementation depends on the answer. What I mean, I first need to understand how things are supposed to work (together) so I can double check that against the implementation. Otherwise all I can do is nitpicking. To get more tangible: we are in the middle of processing an SSCH request (e.g. doing the translation) when a HSCH comes in. What should happen? Should we start processing HSCH after he instruction part of SSCH is done -- which currently includes translation? Or should we -EBUSY? Or do we abort START related activities and do the HALT stuff? I think most of the sorting-out-the-operations stuff should be done by the hardware itself, and we should not really try to enforce anything special in our vfio code. For your example, it might be best if a hsch is always accepted and send on towards the hardware. Probably best to reflect back -EAGAIN if we're currently processing another instruction from another vcpu, so that the user space caller can retry. Same for ssch, if another ssch is already being processed. We*could* reflect cc 2 if the fctl bit is already set, but that won't do for csch, so it is probably best to have the hardware figure that out in any case. If I read the code correctly, we currently reflect -EBUSY and not -EAGAIN if we get a ssch request while already processing another one. QEMU hands that back to the guest as a cc 2, which is not 100% correct. In practice, we don't see this with Linux guests due to locking. If we have a ssch and a csch immediately afterwards from userspace, will we end up issuing csch first and then ssch to the hardware? If I understand correctly, the ccw translation as part of the ssch can be a slow operation so it might be possible we issue the csch first? In that case we won't actually clear the original start function as intended. Thanks Farhan
[Qemu-devel] [PATCH for 3.1? or 4 v4 1/1] qemu-iotests: Don't run the test when user is root
Test 232 creates image files with read-only permission and expects an error message when trying to access the image files with read-only and auto-read-only turned off. Don't run as root user, since root can open files with read/write access for read-only files. Signed-off-by: Farhan Ali Reviewed-by: Eric Blake --- tests/qemu-iotests/232 | 5 + 1 file changed, 5 insertions(+) diff --git a/tests/qemu-iotests/232 b/tests/qemu-iotests/232 index 0708b8b..1d34242 100755 --- a/tests/qemu-iotests/232 +++ b/tests/qemu-iotests/232 @@ -92,6 +92,11 @@ echo chmod a-w $TEST_IMG +if [ -w $TEST_IMG ] +then +_notrun "Cannot run this test as root user" +fi + run_qemu_info_block -drive driver=file,file="$TEST_IMG",if=none,read-only=on,auto-read-only=off run_qemu_info_block -drive driver=file,file="$TEST_IMG",if=none,read-only=on,auto-read-only=on run_qemu_info_block -drive driver=file,file="$TEST_IMG",if=none,read-only=on -- 2.7.4
Re: [Qemu-devel] [RFC for 3.1? or 4 v3 1/1] qemu-iotests: Don't run the test when user is root
On 11/30/2018 03:52 PM, Eric Blake wrote: On 11/30/18 2:37 PM, Farhan Ali wrote: Test 232 creates image files with read-only permission and expects an error message when trying to access the image files with read-only and auto-read-only turned off. Don't run as root user, since root can open files with read/write access for read-only files. Signed-off-by: Farhan Ali --- tests/qemu-iotests/232 | 5 + 1 file changed, 5 insertions(+) Did you mean to keep RFC in the subject line? Will patchew even spot it without PATCH in the subject line? Reviewed-by: Eric Blake Thanks for reviewing the patch. I am not sure if patchew will pick it up; I have no problem in spinning another patch but I will wait till Monday to see if anyone else has any feedback. Thanks Farhan
[Qemu-devel] [RFC for 3.1? or 4 v3 1/1] qemu-iotests: Don't run the test when user is root
Test 232 creates image files with read-only permission and expects an error message when trying to access the image files with read-only and auto-read-only turned off. Don't run as root user, since root can open files with read/write access for read-only files. Signed-off-by: Farhan Ali --- tests/qemu-iotests/232 | 5 + 1 file changed, 5 insertions(+) diff --git a/tests/qemu-iotests/232 b/tests/qemu-iotests/232 index 0708b8b..1d34242 100755 --- a/tests/qemu-iotests/232 +++ b/tests/qemu-iotests/232 @@ -92,6 +92,11 @@ echo chmod a-w $TEST_IMG +if [ -w $TEST_IMG ] +then +_notrun "Cannot run this test as root user" +fi + run_qemu_info_block -drive driver=file,file="$TEST_IMG",if=none,read-only=on,auto-read-only=off run_qemu_info_block -drive driver=file,file="$TEST_IMG",if=none,read-only=on,auto-read-only=on run_qemu_info_block -drive driver=file,file="$TEST_IMG",if=none,read-only=on -- 2.7.4
Re: [Qemu-devel] [RFC for 3.1? or 4 v2 1/1] qemu-iotests: Don't run the test when user is root
On 11/30/2018 12:50 PM, Eric Blake wrote: Adding qemu-devel - all patches should go there, especially if you want to get Peter's attention that this might be a 3.1 candidate if we have other reasons to spin -rc4. On 11/30/18 10:04 AM, Farhan Ali wrote: Test 232 creates image files with read-only permission and expects an error message when trying to access the image files with read-only and auto-read-only turned off. Don't run as root user, since root can open files with read/write access for read-only files. Signed-off-by: Farhan Ali --- tests/qemu-iotests/232 | 8 1 file changed, 8 insertions(+) diff --git a/tests/qemu-iotests/232 b/tests/qemu-iotests/232 index 0708b8b..05d5f2f 100755 --- a/tests/qemu-iotests/232 +++ b/tests/qemu-iotests/232 @@ -41,6 +41,14 @@ _supported_fmt generic _supported_proto file _supported_os Linux +tmp='file' +touch $tmp +chmod a-w $tmp +if [ -w $tmp ] +then + _notrun "Cannot run this test as root user" +fi + I know you just copied from my suggestion, but now looking at it, this leaves 'tmp' around in the directory for both success and skip. Better might be to just check whether $TEST_IMG is writable, immediately after the existing 'chmod a-w $TEST_IMG' line (Hmm - that line is already broken for not quoting "$TEST_IMG" in case it contains whitespace). Ah yes, I forgot about the fact the 'file' will linger around. I will spin up a v3. I don't see this being a reason for -rc4 on its own (most people don't run iotests as root); and the fact that we're still working on the final contents of what the patch should contain, as well as the fact that the patch doesn't affect the main binaries, means that if it were up to me, I'd defer it to 4.0. Kevin may have a different opinion, though, since it is his test, and new to 3.1.
Re: [Qemu-devel] qemu-iotests 232 fails when running the test as root user
On 11/29/2018 04:07 PM, Eric Blake wrote: On 11/29/18 3:03 PM, Farhan Ali wrote: Hi, I am seeing a failure of the qemu-iotest number 232 when running the test as a root user. Is this the expected behavior? Here is the output of the failure: -QEMU_PROG: -drive driver=file,file=TEST_DIR/t.IMGFMT,if=none,read-only=off,auto-read-only=off: Could not open 'TEST_DIR/t.IMGFMT': Permission denied -NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only) -NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only) - -QEMU_PROG: -drive driver=file,file=TEST_DIR/t.IMGFMT,if=none,auto-read-only=off: Could not open 'TEST_DIR/t.IMGFMT': Permission denied -NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only) -NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only) +NODE_NAME: TEST_DIR/t.IMGFMT (file) Not entirely unexpected (since root can read what are otherwise read-only files), but obviously unexpected by the test. The test should probably refuse to run as root. Yes, I believe it would be better if there was a check to prevent running this test as root. Thanks Farhan
[Qemu-devel] qemu-iotests 232 fails when running the test as root user
Hi, I am seeing a failure of the qemu-iotest number 232 when running the test as a root user. Is this the expected behavior? Here is the output of the failure: sudo ./check -qcow2 232 QEMU -- "/home/alifm/kvmdev/qemu/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64" -nodefaults -machine accel=qtest QEMU_IMG -- "/home/alifm/kvmdev/qemu/tests/qemu-iotests/../../qemu-img" QEMU_IO -- "/home/alifm/kvmdev/qemu/tests/qemu-iotests/../../qemu-io" --cache writeback -f qcow2 QEMU_NBD -- "/home/alifm/kvmdev/qemu/tests/qemu-iotests/../../qemu-nbd" IMGFMT-- qcow2 (compat=1.1) IMGPROTO -- file PLATFORM -- Linux/x86_64 alifm-ThinkPad-T470p 4.10.0-40-generic TEST_DIR -- /home/alifm/kvmdev/qemu/tests/qemu-iotests/scratch SOCKET_SCM_HELPER -- /home/alifm/kvmdev/qemu/tests/qemu-iotests/socket_scm_helper 232 1s ... - output mismatch (see 232.out.bad) --- /home/alifm/kvmdev/qemu/tests/qemu-iotests/232.out 2018-11-26 11:45:58.368023999 -0500 +++ /home/alifm/kvmdev/qemu/tests/qemu-iotests/232.out.bad 2018-11-29 15:58:41.768561694 -0500 @@ -21,13 +21,13 @@ NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only) NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only) -QEMU_PROG: -drive driver=file,file=TEST_DIR/t.IMGFMT,if=none,read-only=off,auto-read-only=off: Could not open 'TEST_DIR/t.IMGFMT': Permission denied -NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only) -NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only) - -QEMU_PROG: -drive driver=file,file=TEST_DIR/t.IMGFMT,if=none,auto-read-only=off: Could not open 'TEST_DIR/t.IMGFMT': Permission denied -NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only) -NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only) +NODE_NAME: TEST_DIR/t.IMGFMT (file) +NODE_NAME: TEST_DIR/t.IMGFMT (file) +NODE_NAME: TEST_DIR/t.IMGFMT (file) + +NODE_NAME: TEST_DIR/t.IMGFMT (file) +NODE_NAME: TEST_DIR/t.IMGFMT (file) +NODE_NAME: TEST_DIR/t.IMGFMT (file) === -blockdev with read-write image: read-only/auto-read-only combinations === @@ -49,11 +49,11 @@ node0: TEST_DIR/t.IMGFMT (file, read-only) node0: TEST_DIR/t.IMGFMT (file, read-only) -QEMU_PROG: -blockdev driver=file,filename=TEST_DIR/t.IMGFMT,node-name=node0,read-only=off,auto-read-only=off: Could not open 'TEST_DIR/t.IMGFMT': Permission denied -node0: TEST_DIR/t.IMGFMT (file, read-only) -QEMU_PROG: -blockdev driver=file,filename=TEST_DIR/t.IMGFMT,node-name=node0,read-only=off: Could not open 'TEST_DIR/t.IMGFMT': Permission denied +node0: TEST_DIR/t.IMGFMT (file) +node0: TEST_DIR/t.IMGFMT (file) +node0: TEST_DIR/t.IMGFMT (file) -QEMU_PROG: -blockdev driver=file,filename=TEST_DIR/t.IMGFMT,node-name=node0,auto-read-only=off: Could not open 'TEST_DIR/t.IMGFMT': Permission denied -node0: TEST_DIR/t.IMGFMT (file, read-only) -QEMU_PROG: -blockdev driver=file,filename=TEST_DIR/t.IMGFMT,node-name=node0: Could not open 'TEST_DIR/t.IMGFMT': Permission denied +node0: TEST_DIR/t.IMGFMT (file) +node0: TEST_DIR/t.IMGFMT (file) +node0: TEST_DIR/t.IMGFMT (file) *** done Failures: 232 Failed 1 of 1 tests Thank you Farhan
Re: [Qemu-devel] [PATCH 3/3] vfio-ccw: add handling for asnyc channel instructions
On 11/28/2018 10:35 AM, Cornelia Huck wrote: On Wed, 28 Nov 2018 10:00:59 -0500 Farhan Ali wrote: On 11/28/2018 09:52 AM, Cornelia Huck wrote: On Wed, 28 Nov 2018 09:31:51 -0500 Farhan Ali wrote: On 11/28/2018 04:02 AM, Cornelia Huck wrote: On Tue, 27 Nov 2018 14:09:49 -0500 Farhan Ali wrote: On 11/22/2018 11:54 AM, Cornelia Huck wrote: Add a region to the vfio-ccw device that can be used to submit asynchronous I/O instructions. ssch continues to be handled by the existing I/O region; the new region handles hsch and csch. Interrupt status continues to be reported through the same channels as for ssch. Signed-off-by: Cornelia Huck --- drivers/s390/cio/Makefile | 3 +- drivers/s390/cio/vfio_ccw_async.c | 88 drivers/s390/cio/vfio_ccw_drv.c | 48 ++--- drivers/s390/cio/vfio_ccw_fsm.c | 158 +++- drivers/s390/cio/vfio_ccw_ops.c | 13 ++- drivers/s390/cio/vfio_ccw_private.h | 6 ++ include/uapi/linux/vfio.h | 4 + include/uapi/linux/vfio_ccw.h | 12 +++ 8 files changed, 313 insertions(+), 19 deletions(-) create mode 100644 drivers/s390/cio/vfio_ccw_async.c @@ -76,7 +79,8 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work) private = container_of(work, struct vfio_ccw_private, io_work); irb = >irb; - if (scsw_is_solicited(>scsw)) { + if (scsw_is_solicited(>scsw) && + (scsw_fctl(>scsw) & SCSW_FCTL_START_FUNC)) { cp_update_scsw(>cp, >scsw); cp_free(>cp); } I am a little confused about this. Why do we need to update the scsw.cpa if we have the start function function control bit set? Is it an optimization? No, it's not an optimization. This is the work function that is scheduled if we get an interrupt for the device. Previously, we only got an interrupt if either the device presented us an unsolicited status or if we got an interrupt as a response to the channel program we submitted. Now, we can get an interrupt for halt/clear subchannel as well, and in that case, we don't necessarily have a cp. [Thinking some more about it, we need to verify if the start function actually remains set if we try to terminate a running channel program with halt/clear. A clear might scrub too much. If that's the case, we also need to free the cp if the start function is not set.] According to PoPs (Chapter 16: I/O interruptions, under function control): The start-function indication is also cleared at the subchannel during the execution of CLEAR SUB- CHANNEL. So maybe we do need to free the cp. Hm... so we need to make sure that cp_update_scsw() and cp_free() only do something when there's actually a valid cp around and call them unconditionally. Yes, I agree. Maybe add a ->valid flag to struct channel_program? We could do that. So we would set the flag once we have copied the channel program to kernel memory? since that's when we should care about freeing it. I hacked up the following (still untested): From e771c8dc5abbfbd19688b452096bab9d032e0df5 Mon Sep 17 00:00:00 2001 From: Cornelia Huck Date: Wed, 28 Nov 2018 16:30:51 +0100 Subject: [PATCH] vfio-ccw: make it safe to access channel programs When we get a solicited interrupt, the start function may have been cleared by a csch, but we still have a channel program structure allocated. Make it safe to call the cp accessors in any case, so we can call them unconditionally. Signed-off-by: Cornelia Huck --- drivers/s390/cio/vfio_ccw_cp.c | 9 - drivers/s390/cio/vfio_ccw_cp.h | 2 ++ drivers/s390/cio/vfio_ccw_drv.c | 3 +-- 3 files changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c index 70a006ba4d05..35f87514276b 100644 --- a/drivers/s390/cio/vfio_ccw_cp.c +++ b/drivers/s390/cio/vfio_ccw_cp.c @@ -335,6 +335,7 @@ static void cp_unpin_free(struct channel_program *cp) struct ccwchain *chain, *temp; int i; + cp->initialized = false; list_for_each_entry_safe(chain, temp, >ccwchain_list, next) { for (i = 0; i < chain->ch_len; i++) { pfn_array_table_unpin_free(chain->ch_pat + i, @@ -701,6 +702,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb) */ cp->orb.cmd.c64 = 1; + cp->initialized = true; + return ret; } @@ -715,7 +718,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb) */ void cp_free(struct channel_program *cp) { - cp_unpin_free(cp); + if (cp->initialized) + cp_unpin_free(cp); } /** @@ -831,6 +835,9 @@ void cp_update_scsw(struct channel_program *cp, union scsw *scsw) u32 cpa = scsw->cmd.cpa; u32 ccw_hea
Re: [Qemu-devel] [PATCH 3/3] vfio-ccw: add handling for asnyc channel instructions
On 11/28/2018 09:52 AM, Cornelia Huck wrote: On Wed, 28 Nov 2018 09:31:51 -0500 Farhan Ali wrote: On 11/28/2018 04:02 AM, Cornelia Huck wrote: On Tue, 27 Nov 2018 14:09:49 -0500 Farhan Ali wrote: On 11/22/2018 11:54 AM, Cornelia Huck wrote: Add a region to the vfio-ccw device that can be used to submit asynchronous I/O instructions. ssch continues to be handled by the existing I/O region; the new region handles hsch and csch. Interrupt status continues to be reported through the same channels as for ssch. Signed-off-by: Cornelia Huck --- drivers/s390/cio/Makefile | 3 +- drivers/s390/cio/vfio_ccw_async.c | 88 drivers/s390/cio/vfio_ccw_drv.c | 48 ++--- drivers/s390/cio/vfio_ccw_fsm.c | 158 +++- drivers/s390/cio/vfio_ccw_ops.c | 13 ++- drivers/s390/cio/vfio_ccw_private.h | 6 ++ include/uapi/linux/vfio.h | 4 + include/uapi/linux/vfio_ccw.h | 12 +++ 8 files changed, 313 insertions(+), 19 deletions(-) create mode 100644 drivers/s390/cio/vfio_ccw_async.c @@ -76,7 +79,8 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work) private = container_of(work, struct vfio_ccw_private, io_work); irb = >irb; - if (scsw_is_solicited(>scsw)) { + if (scsw_is_solicited(>scsw) && + (scsw_fctl(>scsw) & SCSW_FCTL_START_FUNC)) { cp_update_scsw(>cp, >scsw); cp_free(>cp); } I am a little confused about this. Why do we need to update the scsw.cpa if we have the start function function control bit set? Is it an optimization? No, it's not an optimization. This is the work function that is scheduled if we get an interrupt for the device. Previously, we only got an interrupt if either the device presented us an unsolicited status or if we got an interrupt as a response to the channel program we submitted. Now, we can get an interrupt for halt/clear subchannel as well, and in that case, we don't necessarily have a cp. [Thinking some more about it, we need to verify if the start function actually remains set if we try to terminate a running channel program with halt/clear. A clear might scrub too much. If that's the case, we also need to free the cp if the start function is not set.] According to PoPs (Chapter 16: I/O interruptions, under function control): The start-function indication is also cleared at the subchannel during the execution of CLEAR SUB- CHANNEL. So maybe we do need to free the cp. Hm... so we need to make sure that cp_update_scsw() and cp_free() only do something when there's actually a valid cp around and call them unconditionally. Yes, I agree. Maybe add a ->valid flag to struct channel_program? We could do that. So we would set the flag once we have copied the channel program to kernel memory? since that's when we should care about freeing it.
Re: [Qemu-devel] [PATCH 3/3] vfio-ccw: add handling for asnyc channel instructions
On 11/28/2018 04:02 AM, Cornelia Huck wrote: On Tue, 27 Nov 2018 14:09:49 -0500 Farhan Ali wrote: On 11/22/2018 11:54 AM, Cornelia Huck wrote: Add a region to the vfio-ccw device that can be used to submit asynchronous I/O instructions. ssch continues to be handled by the existing I/O region; the new region handles hsch and csch. Interrupt status continues to be reported through the same channels as for ssch. Signed-off-by: Cornelia Huck --- drivers/s390/cio/Makefile | 3 +- drivers/s390/cio/vfio_ccw_async.c | 88 drivers/s390/cio/vfio_ccw_drv.c | 48 ++--- drivers/s390/cio/vfio_ccw_fsm.c | 158 +++- drivers/s390/cio/vfio_ccw_ops.c | 13 ++- drivers/s390/cio/vfio_ccw_private.h | 6 ++ include/uapi/linux/vfio.h | 4 + include/uapi/linux/vfio_ccw.h | 12 +++ 8 files changed, 313 insertions(+), 19 deletions(-) create mode 100644 drivers/s390/cio/vfio_ccw_async.c @@ -76,7 +79,8 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work) private = container_of(work, struct vfio_ccw_private, io_work); irb = >irb; - if (scsw_is_solicited(>scsw)) { + if (scsw_is_solicited(>scsw) && + (scsw_fctl(>scsw) & SCSW_FCTL_START_FUNC)) { cp_update_scsw(>cp, >scsw); cp_free(>cp); } I am a little confused about this. Why do we need to update the scsw.cpa if we have the start function function control bit set? Is it an optimization? No, it's not an optimization. This is the work function that is scheduled if we get an interrupt for the device. Previously, we only got an interrupt if either the device presented us an unsolicited status or if we got an interrupt as a response to the channel program we submitted. Now, we can get an interrupt for halt/clear subchannel as well, and in that case, we don't necessarily have a cp. [Thinking some more about it, we need to verify if the start function actually remains set if we try to terminate a running channel program with halt/clear. A clear might scrub too much. If that's the case, we also need to free the cp if the start function is not set.] According to PoPs (Chapter 16: I/O interruptions, under function control): The start-function indication is also cleared at the subchannel during the execution of CLEAR SUB- CHANNEL. So maybe we do need to free the cp.
Re: [Qemu-devel] [PATCH 3/3] vfio-ccw: add handling for asnyc channel instructions
On 11/22/2018 11:54 AM, Cornelia Huck wrote: diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 565669f95534..c01472ec77ea 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -304,6 +304,7 @@ struct vfio_region_info_cap_type { #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG(2) #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG (3) + Whitespace error? #define VFIO_REGION_TYPE_GFX(1) #define VFIO_REGION_SUBTYPE_GFX_EDID(1) @@ -354,6 +355,9 @@ struct vfio_region_gfx_edid { #define VFIO_DEVICE_GFX_LINK_STATE_DOWN 2 }; +/* ccw sub-types */ +#define VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD (1) +
Re: [Qemu-devel] [PATCH 3/3] vfio-ccw: add handling for asnyc channel instructions
On 11/22/2018 11:54 AM, Cornelia Huck wrote: Add a region to the vfio-ccw device that can be used to submit asynchronous I/O instructions. ssch continues to be handled by the existing I/O region; the new region handles hsch and csch. Interrupt status continues to be reported through the same channels as for ssch. Signed-off-by: Cornelia Huck --- drivers/s390/cio/Makefile | 3 +- drivers/s390/cio/vfio_ccw_async.c | 88 drivers/s390/cio/vfio_ccw_drv.c | 48 ++--- drivers/s390/cio/vfio_ccw_fsm.c | 158 +++- drivers/s390/cio/vfio_ccw_ops.c | 13 ++- drivers/s390/cio/vfio_ccw_private.h | 6 ++ include/uapi/linux/vfio.h | 4 + include/uapi/linux/vfio_ccw.h | 12 +++ 8 files changed, 313 insertions(+), 19 deletions(-) create mode 100644 drivers/s390/cio/vfio_ccw_async.c diff --git a/drivers/s390/cio/Makefile b/drivers/s390/cio/Makefile index f230516abb96..f6a8db04177c 100644 --- a/drivers/s390/cio/Makefile +++ b/drivers/s390/cio/Makefile @@ -20,5 +20,6 @@ obj-$(CONFIG_CCWGROUP) += ccwgroup.o qdio-objs := qdio_main.o qdio_thinint.o qdio_debug.o qdio_setup.o obj-$(CONFIG_QDIO) += qdio.o -vfio_ccw-objs += vfio_ccw_drv.o vfio_ccw_cp.o vfio_ccw_ops.o vfio_ccw_fsm.o +vfio_ccw-objs += vfio_ccw_drv.o vfio_ccw_cp.o vfio_ccw_ops.o vfio_ccw_fsm.o \ + vfio_ccw_async.o obj-$(CONFIG_VFIO_CCW) += vfio_ccw.o diff --git a/drivers/s390/cio/vfio_ccw_async.c b/drivers/s390/cio/vfio_ccw_async.c new file mode 100644 index ..8c7f51d17d70 --- /dev/null +++ b/drivers/s390/cio/vfio_ccw_async.c @@ -0,0 +1,88 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Async I/O region for vfio_ccw + * + * Copyright Red Hat, Inc. 2018 + * + * Author(s): Cornelia Huck + */ + +#include +#include + +#include "vfio_ccw_private.h" + +static size_t vfio_ccw_async_region_read(struct vfio_ccw_private *private, +char __user *buf, size_t count, +loff_t *ppos) +{ + unsigned int i = VFIO_CCW_OFFSET_TO_INDEX(*ppos) - VFIO_CCW_NUM_REGIONS; + loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK; + struct ccw_cmd_region *region; + + if (pos + count > sizeof(*region)) + return -EINVAL; + + region = private->region[i].data; + if (copy_to_user(buf, (void *)region + pos, count)) + return -EFAULT; + + return count; + +} + +static size_t vfio_ccw_async_region_write(struct vfio_ccw_private *private, + const char __user *buf, size_t count, + loff_t *ppos) +{ + unsigned int i = VFIO_CCW_OFFSET_TO_INDEX(*ppos) - VFIO_CCW_NUM_REGIONS; + loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK; + struct ccw_cmd_region *region; + + if (pos + count > sizeof(*region)) + return -EINVAL; + + if (private->state == VFIO_CCW_STATE_NOT_OPER || + private->state == VFIO_CCW_STATE_STANDBY) + return -EACCES; + + region = private->region[i].data; + if (copy_from_user((void *)region + pos, buf, count)) + return -EFAULT; + + switch (region->command) { + case VFIO_CCW_ASYNC_CMD_HSCH: + vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_HALT_REQ); + break; + case VFIO_CCW_ASYNC_CMD_CSCH: + vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_CLEAR_REQ); + break; + default: + return -EINVAL; + } + + return region->ret_code ? region->ret_code : count; +} + +static void vfio_ccw_async_region_release(struct vfio_ccw_private *private, + struct vfio_ccw_region *region) +{ + +} + +const struct vfio_ccw_regops vfio_ccw_async_region_ops = { + .read = vfio_ccw_async_region_read, + .write = vfio_ccw_async_region_write, + .release = vfio_ccw_async_region_release, +}; + +int vfio_ccw_register_async_dev_regions(struct vfio_ccw_private *private) +{ + return vfio_ccw_register_dev_region(private, + VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD, + _ccw_async_region_ops, + sizeof(struct ccw_cmd_region), + VFIO_REGION_INFO_FLAG_READ | + VFIO_REGION_INFO_FLAG_WRITE, + private->cmd_region); +} diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c index a10cec0e86eb..890c588a3a61 100644 --- a/drivers/s390/cio/vfio_ccw_drv.c +++ b/drivers/s390/cio/vfio_ccw_drv.c @@ -3,9 +3,11 @@ * VFIO based Physical Subchannel device driver * * Copyright IBM Corp. 2017 + * Copyright Red Hat, Inc. 2018 * * Author(s): Dong Jia Shi *Xiao Feng Ren + *
Re: [Qemu-devel] [PATCH 1/3] vfio-ccw: add capabilities chain
On 11/22/2018 11:54 AM, Cornelia Huck wrote: diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h index 078e46f9623d..a6f9f84526e2 100644 --- a/drivers/s390/cio/vfio_ccw_private.h +++ b/drivers/s390/cio/vfio_ccw_private.h @@ -3,9 +3,11 @@ * Private stuff for vfio_ccw driver * * Copyright IBM Corp. 2017 + * Copyright Red Hat, Inc. 2018 * * Author(s): Dong Jia Shi *Xiao Feng Ren + *Cornelia Huck */ #ifndef_VFIO_CCW_PRIVATE_H_ @@ -19,6 +21,38 @@ #include "css.h" #include "vfio_ccw_cp.h" +#define VFIO_CCW_OFFSET_SHIFT 40 +#define VFIO_CCW_OFFSET_TO_INDEX(off) (off >> VFIO_CCW_OFFSET_SHIFT) +#define VFIO_CCW_INDEX_TO_OFFSET(index)((u64)(index) << VFIO_CCW_OFFSET_SHIFT) +#define VFIO_CCW_OFFSET_MASK (((u64)(1) << VFIO_CCW_OFFSET_SHIFT) - 1) + Why is the offset shift 40? I know vfio-pci is also using the same offset shift, but I am curious about the reasoning behind why we are using this? :)
Re: [Qemu-devel] [PATCH 0/3] vfio-ccw: support hsch/csch (kernel part)
On 11/22/2018 11:54 AM, Cornelia Huck wrote: [This is the Linux kernel part, git tree is available at https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw.git vfio-ccw-caps The companion QEMU patches are available at https://github.com/cohuck/qemu vfio-ccw-caps] Currently, vfio-ccw only relays START SUBCHANNEL requests to the real device. This tends to work well for the most common 'good path' scenarios; however, as we emulate {HALT,CLEAR} SUBCHANNEL in QEMU, things like clearing pending requests at the device is currently not supported. This may be a problem for e.g. error recovery. This patch series introduces capabilities (similar to what vfio-pci uses) and exposes a new async region for handling hsch/csch. Very lightly tested (I can interact with a dasd as before; I have not found a reliable way to trigger hsch/csch in the Linux dasd guest driver.) I was able to trigger the guest DASD driver to issue a csch instruction, and from my brief testing it seems to be working just like it would on the LPAR. (I basically tested with 2 threads trying to issue DASD device reserve and release ioctls, on the same DASD device, in a busy loop). I am going to spend some time doing a deeper review. Thanks Farhan Cornelia Huck (3): vfio-ccw: add capabilities chain s390/cio: export hsch to modules vfio-ccw: add handling for asnyc channel instructions drivers/s390/cio/Makefile | 3 +- drivers/s390/cio/ioasm.c| 1 + drivers/s390/cio/vfio_ccw_async.c | 88 + drivers/s390/cio/vfio_ccw_drv.c | 48 +-- drivers/s390/cio/vfio_ccw_fsm.c | 158 +- drivers/s390/cio/vfio_ccw_ops.c | 195 drivers/s390/cio/vfio_ccw_private.h | 44 +++ include/uapi/linux/vfio.h | 5 + include/uapi/linux/vfio_ccw.h | 12 ++ 9 files changed, 509 insertions(+), 45 deletions(-) create mode 100644 drivers/s390/cio/vfio_ccw_async.c
Re: [Qemu-devel] [PATCH 1/4] MAINTAINERS: s390: more maintainers for vfio-ccw
On 10/29/2018 11:42 AM, Christian Borntraeger wrote: Eric and Farhan will help with maintaining vfio-ccw. Signed-off-by: Christian Borntraeger --- MAINTAINERS | 2 ++ 1 file changed, 2 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index d794bd7..10045b6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1204,6 +1204,8 @@ F: include/hw/vfio/ vfio-ccw M: Cornelia Huck +M: Eric Farman +M: Farhan Ali S: Supported F: hw/vfio/ccw.c F: hw/s390x/s390-ccw.c Acked-by: Farhan Ali
Re: [Qemu-devel] [RFC v1 1/1] qemu-iotests: Fix output for testcase 082
On 10/15/2018 11:45 AM, Max Reitz wrote: 082 is failing for me on master (046936ed), and this fixes it. I'm wondering if/why other people are not running into this, as 082 runs with qcow2 and is included on `make check-block`. + Max, FIY, I ran into this while reviewing/testing your Python 3 series. Tested-by: Cleber Rosa Thanks for CC-ing me, Cleber. Farhan, I disagree with this fix, as I wrote here: http://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg01488.html I don't like the new output, so I think what we should do is change the code that produces it. Marc-André seemed to agree, so I just need to write a patch (which I didn't get to so far). Max Hi Max, That is fine with me. The reason I sent this patch out as an RFC was because I wasn't sure about the fix. So for the time being is it okay for the test to fail? And if you do submit a patch, would you kindly cc me as well. Thank you Farhan
[Qemu-devel] [RFC v1 1/1] qemu-iotests: Fix output for testcase 082
Commit 9cbef9d68ee: qemu-option: improve qemu_opts_print_help() output, changed some of the output for qemu-img tool but did not update the corresponding 082 test case file. This broke qemu-iotests. Signed-off-by: Farhan Ali --- tests/qemu-iotests/082.out | 956 ++--- 1 file changed, 478 insertions(+), 478 deletions(-) diff --git a/tests/qemu-iotests/082.out b/tests/qemu-iotests/082.out index 19e9fb1..2672349 100644 --- a/tests/qemu-iotests/082.out +++ b/tests/qemu-iotests/082.out @@ -44,171 +44,171 @@ cluster_size: 8192 Testing: create -f qcow2 -o help TEST_DIR/t.qcow2 128M Supported options: -size Virtual disk size -compat Compatibility level (0.10 or 1.1) -backing_file File name of a base image -backing_fmt Image format of the base image -encryption Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes) -encrypt.format Encrypt the image, format choices: 'aes', 'luks' -encrypt.key-secret ID of secret providing qcow AES key or LUKS passphrase -encrypt.cipher-alg Name of encryption cipher algorithm -encrypt.cipher-mode Name of encryption cipher mode -encrypt.ivgen-alg Name of IV generator algorithm -encrypt.ivgen-hash-alg Name of IV generator hash algorithm -encrypt.hash-alg Name of encryption hash algorithm -encrypt.iter-time Time to spend in PBKDF in milliseconds -cluster_size qcow2 cluster size -preallocationPreallocation mode (allowed values: off, metadata, falloc, full) -lazy_refcounts Postpone refcount updates -refcount_bitsWidth of a reference count entry in bits -nocowTurn off copy-on-write (valid only on btrfs) +backing_file=str - File name of a base image +backing_fmt=str - Image format of the base image +cluster_size=size - qcow2 cluster size +compat=str - Compatibility level (0.10 or 1.1) +encrypt.cipher-alg=str - Name of encryption cipher algorithm +encrypt.cipher-mode=str - Name of encryption cipher mode +encrypt.format=str - Encrypt the image, format choices: 'aes', 'luks' +encrypt.hash-alg=str - Name of encryption hash algorithm +encrypt.iter-time=num - Time to spend in PBKDF in milliseconds +encrypt.ivgen-alg=str - Name of IV generator algorithm +encrypt.ivgen-hash-alg=str - Name of IV generator hash algorithm +encrypt.key-secret=str - ID of secret providing qcow AES key or LUKS passphrase +encryption=bool (on/off) - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes) +lazy_refcounts=bool (on/off) - Postpone refcount updates +nocow=bool (on/off) - Turn off copy-on-write (valid only on btrfs) +preallocation=str - Preallocation mode (allowed values: off, metadata, falloc, full) +refcount_bits=num - Width of a reference count entry in bits +size=size - Virtual disk size Testing: create -f qcow2 -o ? TEST_DIR/t.qcow2 128M Supported options: -size Virtual disk size -compat Compatibility level (0.10 or 1.1) -backing_file File name of a base image -backing_fmt Image format of the base image -encryption Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes) -encrypt.format Encrypt the image, format choices: 'aes', 'luks' -encrypt.key-secret ID of secret providing qcow AES key or LUKS passphrase -encrypt.cipher-alg Name of encryption cipher algorithm -encrypt.cipher-mode Name of encryption cipher mode -encrypt.ivgen-alg Name of IV generator algorithm -encrypt.ivgen-hash-alg Name of IV generator hash algorithm -encrypt.hash-alg Name of encryption hash algorithm -encrypt.iter-time Time to spend in PBKDF in milliseconds -cluster_size qcow2 cluster size -preallocationPreallocation mode (allowed values: off, metadata, falloc, full) -lazy_refcounts Postpone refcount updates -refcount_bitsWidth of a reference count entry in bits -nocowTurn off copy-on-write (valid only on btrfs) +backing_file=str - File name of a base image +backing_fmt=str - Image format of the base image +cluster_size=size - qcow2 cluster size +compat=str - Compatibility level (0.10 or 1.1) +encrypt.cipher-alg=str - Name of encryption cipher algorithm +encrypt.cipher-mode=str - Name of encryption cipher mode +encrypt.format=str - Encrypt the image, format choices: 'aes', 'luks' +encrypt.hash-alg=str - Name of encryption hash algorithm +encrypt.iter-time=num - Time to spend in PBKDF in milliseconds +encrypt.ivgen-alg=str - Name of IV generator algorithm +encrypt.ivgen-hash-alg=str - Name of IV generator hash algorithm +encrypt.key-secret=str - ID of secret providing qcow AES key or LUKS passphrase +encryption=bool (on/off) - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes) +lazy_refcounts=bool (on/off) - Postpone refcount updates +nocow=bool (on/off) - Turn off copy-on-write (valid only on btrfs) +preallocation=str - Preallocation mode (allowed values: off, metadata, falloc, full) +refcount_bits=num - Width of a reference count entry
Re: [Qemu-devel] [Qemu-block] [PATCH] block/file-posix: add bdrv_attach_aio_context callback for host dev and cdrom
On 07/27/2018 09:26 AM, Stefan Hajnoczi wrote: On Mon, Jul 23, 2018 at 12:42:02PM -0400, Farhan Ali wrote: On 07/23/2018 12:30 PM, Stefan Hajnoczi wrote: On Fri, Jul 20, 2018 at 03:11:14PM -0400, Farhan Ali wrote: I am seeing another issue pop up, in a different test. Even though it's a different assertion, it might be related based on the call trace. Which test case? This test case involved one guest with 2 disks, with an iothread for each disk. The guest was running a memory workload. Please post a link to the test case. Stefan Hi Stefan, Thanks for your response. The test case was run in our internal infrastructure, so unfortunately I cannot post any link to the test case. I have been unable to reproduce this exact issue. On the other hand the same test case is throwing another assertion error: qemu-kvm: /builddir/build/BUILD/qemu-2.12.91/exec.c:3695: address_space_unmap: Assertion `mr != NULL' failed. Again this to me is very strange error. I have been able to reproduce this assertion couple of times, though I don't hit it on every run of the test case. The qemu command line for the test case: /usr/bin/qemu-kvm -name guest=sles,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-6-sles/master-key.aes -machine s390-ccw-virtio-3.0,accel=kvm,usb=off,dump-guest-core=off -m 4096 -realtime mlock=off -smp 8,sockets=8,cores=1,threads=1 -object iothread,id=iothread1 -object iothread,id=iothread2 -uuid 094a20ff-e881-44db-a772-fb4029cf8f09 -display none -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=28,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -drive file=/dev/mapper/360050763998b08839800333a,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native -device virtio-blk-ccw,iothread=iothread1,scsi=off,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on -drive file=/dev/mapper/360050763998b088398001e25,format=raw,if=none,id=drive-virtio-disk1,cache=none,aio=native -device virtio-blk-ccw,iothread=iothread2,scsi=off,devno=fe.0.0002,drive=drive-virtio-disk1,id=virtio-disk1,write-cache=on -netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=31 -device virtio-net-ccw,netdev=hostnet0,id=net0,mac=02:f5:5d:7d:7d:ef,devno=fe.0. -chardev pty,id=charconsole0 -device sclpconsole,chardev=charconsole0,id=console0 -device virtio-balloon-ccw,id=balloon0,devno=fe.3.ffba -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on The backtrace of the thread which trips the assertion: #0 0x03ffb553e274 in raise () from /lib64/libc.so.6 #1 0x03ffb55239a8 in abort () from /lib64/libc.so.6 #2 0x03ffb55362ce in __assert_fail_base () from /lib64/libc.so.6 #3 0x03ffb553634c in __assert_fail () from /lib64/libc.so.6 #4 0x02aa39b487e6 in address_space_unmap (as=, buffer=, len=, is_write=, access_len=) at /usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/exec.c:3695 #5 0x02aa39bfb550 in dma_memory_unmap (access_len=0, dir=DMA_DIRECTION_FROM_DEVICE, len=, buffer=out>, as=0x2aa3a085b80 ) at /usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/include/sysemu/dma.h:146 #6 virtqueue_unmap_sg (elem=elem@entry=0x3ff9c01c9d0, len=len@entry=1, vq=) at /usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/hw/virtio/virtio.c:401 #7 0x02aa39bfc840 in virtqueue_fill (vq=vq@entry=0x3ffb6c9e010, elem=0x3ff9c01c9d0, len=, idx=idx@entry=0) at /usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/hw/virtio/virtio.c:476 #8 0x02aa39bfcb80 in virtqueue_push (vq=0x3ffb6c9e010, elem=elem@entry=0x3ff9c01c9d0, len=) at /usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/hw/virtio/virtio.c:522 #9 0x02aa39bd78c2 in virtio_blk_req_complete (req=0x3ff9c01c9d0, status=) at /usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/hw/block/virtio-blk.c:55 #10 0x02aa39bd8540 in virtio_blk_rw_complete (opaque=out>, ret=) at /usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/hw/block/virtio-blk.c:121 #11 0x02aa39d9a15e in blk_aio_complete (acb=0x3ff9c04f670) at /usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/block/block-backend.c:1336 #12 0x02aa39e48e98 in coroutine_trampoline (i0=, i1=) at /usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/util/coroutine-ucontext.c:116 #13 0x03ffb5553b7a in __makecontext_ret () from /lib64/libc.so.6 I don't know if these issues are related to the same underlying problem. Any help is really appreciated. Thanks Farhan
Re: [Qemu-devel] [PATCH] block/file-posix: add bdrv_attach_aio_context callback for host dev and cdrom
On 07/23/2018 12:30 PM, Stefan Hajnoczi wrote: On Fri, Jul 20, 2018 at 03:11:14PM -0400, Farhan Ali wrote: I am seeing another issue pop up, in a different test. Even though it's a different assertion, it might be related based on the call trace. Which test case? This test case involved one guest with 2 disks, with an iothread for each disk. The guest was running a memory workload. Stack trace of thread 276199: #0 0x03ff8473e274 raise (libc.so.6) #1 0x03ff847239a8 abort (libc.so.6) #2 0x03ff847362ce __assert_fail_base (libc.so.6) #3 0x03ff8473634c __assert_fail (libc.so.6) #4 0x02aa30aba0c4 iov_memset (qemu-system-s390x) #5 0x02aa30aba9a6 qemu_iovec_memset (qemu-system-s390x) #6 0x02aa30a23e88 qemu_laio_process_completion (qemu-system-s390x) What are the values of laiocb->qiov->size and laiocb->ret? The laiocb->qiov->size was 4096 and laiocb->ret was 8192 #7 0x02aa30a23f68 qemu_laio_process_completions (qemu-system-s390x) #8 0x02aa30a2418e qemu_laio_process_completions_and_submit (qemu-system-s390x) #9 0x02aa30a24220 qemu_laio_poll_cb (qemu-system-s390x) #10 0x02aa30ab22c4 run_poll_handlers_once (qemu-system-s390x) #11 0x02aa30ab2e78 aio_poll (qemu-system-s390x) #12 0x02aa30a29f4e bdrv_do_drained_begin (qemu-system-s390x) #13 0x02aa30a2a276 bdrv_drain (qemu-system-s390x) #14 0x02aa309d45aa bdrv_set_aio_context (qemu-system-s390x) #15 0x02aa3085acfe virtio_blk_data_plane_stop (qemu-system-s390x) #16 0x02aa3096994c virtio_bus_stop_ioeventfd.part.1 (qemu-system-s390x) #17 0x02aa3087d1d6 virtio_vmstate_change (qemu-system-s390x) #18 0x02aa308e8a12 vm_state_notify (qemu-system-s390x) #19 0x02aa3080ed54 do_vm_stop (qemu-system-s390x) #20 0x02aa307bea04 main (qemu-system-s390x) #21 0x03ff84723dd2 __libc_start_main (libc.so.6) #22 0x02aa307c0414 _start (qemu-system-s390x) The failing assertion is: qemu-kvm: util/iov.c:78: iov_memset: Assertion `offset == 0' failed. I wonder if the offset is beyond the end of the iovecs. Thanks, Stefan Thanks Farhan
Re: [Qemu-devel] [PATCH] block/file-posix: add bdrv_attach_aio_context callback for host dev and cdrom
On 07/20/2018 03:11 PM, Farhan Ali wrote: I am seeing another issue pop up, in a different test. Even though it's a different assertion, it might be related based on the call trace. Stack trace of thread 276199: #0 0x03ff8473e274 raise (libc.so.6) #1 0x03ff847239a8 abort (libc.so.6) #2 0x03ff847362ce __assert_fail_base (libc.so.6) #3 0x03ff8473634c __assert_fail (libc.so.6) #4 0x02aa30aba0c4 iov_memset (qemu-system-s390x) #5 0x02aa30aba9a6 qemu_iovec_memset (qemu-system-s390x) #6 0x02aa30a23e88 qemu_laio_process_completion (qemu-system-s390x) #7 0x02aa30a23f68 qemu_laio_process_completions (qemu-system-s390x) #8 0x02aa30a2418e qemu_laio_process_completions_and_submit (qemu-system-s390x) #9 0x02aa30a24220 qemu_laio_poll_cb (qemu-system-s390x) #10 0x02aa30ab22c4 run_poll_handlers_once (qemu-system-s390x) #11 0x02aa30ab2e78 aio_poll (qemu-system-s390x) #12 0x02aa30a29f4e bdrv_do_drained_begin (qemu-system-s390x) #13 0x02aa30a2a276 bdrv_drain (qemu-system-s390x) #14 0x02aa309d45aa bdrv_set_aio_context (qemu-system-s390x) #15 0x02aa3085acfe virtio_blk_data_plane_stop (qemu-system-s390x) #16 0x02aa3096994c virtio_bus_stop_ioeventfd.part.1 (qemu-system-s390x) #17 0x02aa3087d1d6 virtio_vmstate_change (qemu-system-s390x) #18 0x02aa308e8a12 vm_state_notify (qemu-system-s390x) #19 0x02aa3080ed54 do_vm_stop (qemu-system-s390x) #20 0x02aa307bea04 main (qemu-system-s390x) #21 0x03ff84723dd2 __libc_start_main (libc.so.6) #22 0x02aa307c0414 _start (qemu-system-s390x) The failing assertion is: qemu-kvm: util/iov.c:78: iov_memset: Assertion `offset == 0' failed. Just to give some context, this a guest with 2 disks with each assigned an iothread. The guest was running a memory intensive workload. From the coredump of the qemu process, I see there were 2 threads that were trying to call aio_poll with the same AioContext on the same BlockDeviceDriver Thread 1: #0 0x03ff8473e274 in raise () from /lib64/libc.so.6 #1 0x03ff847239a8 in abort () from /lib64/libc.so.6 #2 0x03ff847362ce in __assert_fail_base () from /lib64/libc.so.6 #3 0x03ff8473634c in __assert_fail () from /lib64/libc.so.6 #4 0x02aa30aba0c4 in iov_memset (iov=, iov_cnt=, offset=, fillc=, bytes=18446744073709547520) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/util/iov.c:78 #5 0x02aa30aba9a6 in qemu_iovec_memset (qiov=, offset=offset@entry=8192, fillc=fillc@entry=0, bytes=18446744073709547520) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/util/iov.c:410 #6 0x02aa30a23e88 in qemu_laio_process_completion (laiocb=0x3fe36a6a3f0) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/linux-aio.c:88 #7 0x02aa30a23f68 in qemu_laio_process_completions (s=s@entry=0x3fe60001910) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/linux-aio.c:222 #8 0x02aa30a2418e in qemu_laio_process_completions_and_submit (s=0x3fe60001910) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/linux-aio.c:237 #9 0x02aa30a24220 in qemu_laio_poll_cb (opaque=) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/linux-aio.c:272 #10 0x02aa30ab22c4 in run_poll_handlers_once (ctx=ctx@entry=0x2aa4f35df50) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/util/aio-posix.c:494 #11 0x02aa30ab2e78 in try_poll_mode (blocking=, ctx=) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/util/aio-posix.c:573 > #12 aio_poll (ctx=0x2aa4f35df50, blocking=blocking@entry=false) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/util/aio-posix.c:602 #13 0x02aa30a29f4e in bdrv_drain_poll_top_level (ignore_parent=, recursive=, bs=out>) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/io.c:390 #14 bdrv_do_drained_begin (bs=0x2aa4f392510, recursive=, parent=0x0, ignore_bds_parents=, poll=) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/io.c:390 #15 0x02aa30a2a276 in bdrv_drained_begin (bs=0x2aa4f392510) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/io.c:396 #16 bdrv_drain (bs=bs@entry=0x2aa4f392510) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/io.c:478 #17 0x02aa309d45aa in bdrv_set_aio_context (bs=0x2aa4f392510, new_context=0x2aa4f3594f0) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block.c:4954 #18 0x02aa30a1c228 in blk_set_aio_context (blk=blk@entry=0x2aa4f38ed90, new_context=) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/block-backend.c:1894 #19 0x02aa3085acfe in virtio_blk_data_plane_stop (vdev=out>) at /usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/hw/block/dataplane/virtio-blk.c:285 #20 0x02aa3096994c in virtio_bus_stop_ioeventfd (bus=0x2aa4f4f61f0) at /usr/src/debug/qem
Re: [Qemu-devel] [PATCH] block/file-posix: add bdrv_attach_aio_context callback for host dev and cdrom
On 07/20/2018 03:32 PM, Nishanth Aravamudan wrote: On 20.07.2018 [15:11:14 -0400], Farhan Ali wrote: I am seeing another issue pop up, in a different test. Even though it's a different assertion, it might be related based on the call trace. Just to be clear, this does not happen if you revert the original patch (i.e., the one you bisected to before)? I'm digging into the code now. -Nish I had not seen this issue before. I just ran my regression tests with your fix and saw the failure in one of my test. The patch in itself fixes the original issue I reported. I am going to try and debug some more. Thanks Farhan
Re: [Qemu-devel] [PATCH] block/file-posix: add bdrv_attach_aio_context callback for host dev and cdrom
I am seeing another issue pop up, in a different test. Even though it's a different assertion, it might be related based on the call trace. Stack trace of thread 276199: #0 0x03ff8473e274 raise (libc.so.6) #1 0x03ff847239a8 abort (libc.so.6) #2 0x03ff847362ce __assert_fail_base (libc.so.6) #3 0x03ff8473634c __assert_fail (libc.so.6) #4 0x02aa30aba0c4 iov_memset (qemu-system-s390x) #5 0x02aa30aba9a6 qemu_iovec_memset (qemu-system-s390x) #6 0x02aa30a23e88 qemu_laio_process_completion (qemu-system-s390x) #7 0x02aa30a23f68 qemu_laio_process_completions (qemu-system-s390x) #8 0x02aa30a2418e qemu_laio_process_completions_and_submit (qemu-system-s390x) #9 0x02aa30a24220 qemu_laio_poll_cb (qemu-system-s390x) #10 0x02aa30ab22c4 run_poll_handlers_once (qemu-system-s390x) #11 0x02aa30ab2e78 aio_poll (qemu-system-s390x) #12 0x02aa30a29f4e bdrv_do_drained_begin (qemu-system-s390x) #13 0x02aa30a2a276 bdrv_drain (qemu-system-s390x) #14 0x02aa309d45aa bdrv_set_aio_context (qemu-system-s390x) #15 0x02aa3085acfe virtio_blk_data_plane_stop (qemu-system-s390x) #16 0x02aa3096994c virtio_bus_stop_ioeventfd.part.1 (qemu-system-s390x) #17 0x02aa3087d1d6 virtio_vmstate_change (qemu-system-s390x) #18 0x02aa308e8a12 vm_state_notify (qemu-system-s390x) #19 0x02aa3080ed54 do_vm_stop (qemu-system-s390x) #20 0x02aa307bea04 main (qemu-system-s390x) #21 0x03ff84723dd2 __libc_start_main (libc.so.6) #22 0x02aa307c0414 _start (qemu-system-s390x) The failing assertion is: qemu-kvm: util/iov.c:78: iov_memset: Assertion `offset == 0' failed. On 07/18/2018 05:12 PM, Nishanth Aravamudan wrote: In ed6e2161 ("linux-aio: properly bubble up errors from initialzation"), I only added a bdrv_attach_aio_context callback for the bdrv_file driver. There are several other drivers that use the shared aio_plug callback, though, and they will trip the assertion added to aio_get_linux_aio because they did not call aio_setup_linux_aio first. Add the appropriate callback definition to the affected driver definitions. Fixes: ed6e2161 ("linux-aio: properly bubble up errors from initialization") Reported-by: Farhan Ali Signed-off-by: Nishanth Aravamudan Cc: Eric Blake Cc: Kevin Wolf Cc: John Snow Cc: Max Reitz Cc: Stefan Hajnoczi Cc: Fam Zheng Cc: Paolo Bonzini Cc: qemu-bl...@nongnu.org Cc: qemu-devel@nongnu.org --- block/file-posix.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/block/file-posix.c b/block/file-posix.c index 60af4b3d51..ad299beb38 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -3158,6 +3158,7 @@ static BlockDriver bdrv_host_device = { .bdrv_refresh_limits = raw_refresh_limits, .bdrv_io_plug = raw_aio_plug, .bdrv_io_unplug = raw_aio_unplug, +.bdrv_attach_aio_context = raw_aio_attach_aio_context, .bdrv_co_truncate = raw_co_truncate, .bdrv_getlength = raw_getlength, @@ -3280,6 +3281,7 @@ static BlockDriver bdrv_host_cdrom = { .bdrv_refresh_limits = raw_refresh_limits, .bdrv_io_plug = raw_aio_plug, .bdrv_io_unplug = raw_aio_unplug, +.bdrv_attach_aio_context = raw_aio_attach_aio_context, .bdrv_co_truncate= raw_co_truncate, .bdrv_getlength = raw_getlength, @@ -3410,6 +3412,7 @@ static BlockDriver bdrv_host_cdrom = { .bdrv_refresh_limits = raw_refresh_limits, .bdrv_io_plug = raw_aio_plug, .bdrv_io_unplug = raw_aio_unplug, +.bdrv_attach_aio_context = raw_aio_attach_aio_context, .bdrv_co_truncate= raw_co_truncate, .bdrv_getlength = raw_getlength,
Re: [Qemu-devel] [BUG?] aio_get_linux_aio: Assertion `ctx->linux_aio' failed
On 07/18/2018 09:42 AM, Farhan Ali wrote: On 07/17/2018 04:52 PM, Nishanth Aravamudan wrote: iiuc, this possibly implies AIO was not actually used previously on this guest (it might have silently been falling back to threaded IO?). I don't have access to s390x, but would it be possible to run qemu under gdb and see if aio_setup_linux_aio is being called at all (I think it might not be, but I'm not sure why), and if so, if it's for the context in question? If it's not being called first, could you see what callpath is calling aio_get_linux_aio when this assertion trips? Thanks! -Nish Hi Nishant, From the coredump of the guest this is the call trace that calls aio_get_linux_aio: Stack trace of thread 145158: #0 0x03ff94dbe274 raise (libc.so.6) #1 0x03ff94da39a8 abort (libc.so.6) #2 0x03ff94db62ce __assert_fail_base (libc.so.6) #3 0x03ff94db634c __assert_fail (libc.so.6) #4 0x02aa20db067a aio_get_linux_aio (qemu-system-s390x) #5 0x02aa20d229a8 raw_aio_plug (qemu-system-s390x) #6 0x02aa20d309ee bdrv_io_plug (qemu-system-s390x) #7 0x02aa20b5a8ea virtio_blk_handle_vq (qemu-system-s390x) #8 0x02aa20db2f6e aio_dispatch_handlers (qemu-system-s390x) #9 0x02aa20db3c34 aio_poll (qemu-system-s390x) #10 0x02aa20be32a2 iothread_run (qemu-system-s390x) #11 0x03ff94f879a8 start_thread (libpthread.so.0) #12 0x03ff94e797ee thread_start (libc.so.6) Thanks for taking a look and responding. Thanks Farhan Trying to debug a little further, the block device in this case is a "host device". And looking at your commit carefully you use the bdrv_attach_aio_context callback to setup a Linux AioContext. For some reason the "host device" struct (BlockDriver bdrv_host_device in block/file-posix.c) does not have a bdrv_attach_aio_context defined. So a simple change of adding the callback to the struct solves the issue and the guest starts fine. diff --git a/block/file-posix.c b/block/file-posix.c index 28824aa..b8d59fb 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -3135,6 +3135,7 @@ static BlockDriver bdrv_host_device = { .bdrv_refresh_limits = raw_refresh_limits, .bdrv_io_plug = raw_aio_plug, .bdrv_io_unplug = raw_aio_unplug, +.bdrv_attach_aio_context = raw_aio_attach_aio_context, .bdrv_co_truncate = raw_co_truncate, .bdrv_getlength= raw_getlength, I am not too familiar with block device code in QEMU, so not sure if this is the right fix or if there are some underlying problems. Thanks Farhan
Re: [Qemu-devel] [BUG?] aio_get_linux_aio: Assertion `ctx->linux_aio' failed
On 07/17/2018 04:52 PM, Nishanth Aravamudan wrote: iiuc, this possibly implies AIO was not actually used previously on this guest (it might have silently been falling back to threaded IO?). I don't have access to s390x, but would it be possible to run qemu under gdb and see if aio_setup_linux_aio is being called at all (I think it might not be, but I'm not sure why), and if so, if it's for the context in question? If it's not being called first, could you see what callpath is calling aio_get_linux_aio when this assertion trips? Thanks! -Nish Hi Nishant, From the coredump of the guest this is the call trace that calls aio_get_linux_aio: Stack trace of thread 145158: #0 0x03ff94dbe274 raise (libc.so.6) #1 0x03ff94da39a8 abort (libc.so.6) #2 0x03ff94db62ce __assert_fail_base (libc.so.6) #3 0x03ff94db634c __assert_fail (libc.so.6) #4 0x02aa20db067a aio_get_linux_aio (qemu-system-s390x) #5 0x02aa20d229a8 raw_aio_plug (qemu-system-s390x) #6 0x02aa20d309ee bdrv_io_plug (qemu-system-s390x) #7 0x02aa20b5a8ea virtio_blk_handle_vq (qemu-system-s390x) #8 0x02aa20db2f6e aio_dispatch_handlers (qemu-system-s390x) #9 0x02aa20db3c34 aio_poll (qemu-system-s390x) #10 0x02aa20be32a2 iothread_run (qemu-system-s390x) #11 0x03ff94f879a8 start_thread (libpthread.so.0) #12 0x03ff94e797ee thread_start (libc.so.6) Thanks for taking a look and responding. Thanks Farhan
[Qemu-devel] [BUG?] aio_get_linux_aio: Assertion `ctx->linux_aio' failed
Hi, I am seeing some strange QEMU assertion failures for qemu on s390x, which prevents a guest from starting. Git bisecting points to the following commit as the source of the error. commit ed6e2161715c527330f936d44af4c547f25f687e Author: Nishanth Aravamudan Date: Fri Jun 22 12:37:00 2018 -0700 linux-aio: properly bubble up errors from initialization laio_init() can fail for a couple of reasons, which will lead to a NULL pointer dereference in laio_attach_aio_context(). To solve this, add a aio_setup_linux_aio() function which is called early in raw_open_common. If this fails, propagate the error up. The signature of aio_get_linux_aio() was not modified, because it seems preferable to return the actual errno from the possible failing initialization calls. Additionally, when the AioContext changes, we need to associate a LinuxAioState with the new AioContext. Use the bdrv_attach_aio_context callback and call the new aio_setup_linux_aio(), which will allocate a new AioContext if needed, and return errors on failures. If it fails for any reason, fallback to threaded AIO with an error message, as the device is already in-use by the guest. Add an assert that aio_get_linux_aio() cannot return NULL. Signed-off-by: Nishanth Aravamudan Message-id: 20180622193700.6523-1-naravamu...@digitalocean.com Signed-off-by: Stefan Hajnoczi Not sure what is causing this assertion to fail. Here is the qemu command line of the guest, from qemu log, which throws this error: LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/local/bin/qemu-system-s390x -name guest=rt_vm1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-21-rt_vm1/master-key.aes -machine s390-ccw-virtio-2.12,accel=kvm,usb=off,dump-guest-core=off -m 1024 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -object iothread,id=iothread1 -uuid 0cde16cd-091d-41bd-9ac2-5243df5c9a0d -display none -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=28,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -drive file=/dev/mapper/360050763998b088398002a31,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native -device virtio-blk-ccw,iothread=iothread1,scsi=off,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on -netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=31 -device virtio-net-ccw,netdev=hostnet0,id=net0,mac=02:3a:c8:67:95:84,devno=fe.0. -netdev tap,fd=32,id=hostnet1,vhost=on,vhostfd=33 -device virtio-net-ccw,netdev=hostnet1,id=net1,mac=52:54:00:2a:e5:08,devno=fe.0.0002 -chardev pty,id=charconsole0 -device sclpconsole,chardev=charconsole0,id=console0 -device virtio-balloon-ccw,id=balloon0,devno=fe.3.ffba -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on 2018-07-17 15:48:42.252+: Domain id=21 is tainted: high-privileges 2018-07-17T15:48:42.279380Z qemu-system-s390x: -chardev pty,id=charconsole0: char device redirected to /dev/pts/3 (label charconsole0) qemu-system-s390x: util/async.c:339: aio_get_linux_aio: Assertion `ctx->linux_aio' failed. 2018-07-17 15:48:43.309+: shutting down, reason=failed Any help debugging this would be greatly appreciated. Thank you Farhan
Re: [Qemu-devel] [PATCH v3 0/2] Detect & register virtio-crypto algos only if it can be supported by backend
Polite ping :) Would like to know how to merge these patches in the mainline kernel. Thanks Farhan On 06/19/2018 11:41 AM, Farhan Ali wrote: Hi, Currently the Linux virtio-crypto driver registers the crypto algorithm without verifying if the backend actually supports the algorithm. This kernel patch series adds support for registering algorithm with Linux crypto layer, only if the algorithm is supported by the backend device. This also makes the driver more compliant with the virtio-crypto spec [1]. I would appreciate any feedback or comments on this. Thank you Farhan Reference - [1] Virtio crypto spec proposal https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg00816.html ChangeLog - v2 -> v3 - Add Christian Borntrager's acks for the patches. v1 -> v2 - Modify comment as suggested by Arei (patch 1) - Modify error message as suggested by Arei (patch 2) Farhan Ali (2): crypto/virtio-crypto: Read crypto services and algorithm masks crypto/virtio-crypto: Register an algo only if it's supported drivers/crypto/virtio/virtio_crypto_algs.c | 112 ++- drivers/crypto/virtio/virtio_crypto_common.h | 25 +- drivers/crypto/virtio/virtio_crypto_core.c | 29 +++ drivers/crypto/virtio/virtio_crypto_mgr.c| 81 +-- 4 files changed, 202 insertions(+), 45 deletions(-)
[Qemu-devel] [PATCH v3 2/2] crypto/virtio-crypto: Register an algo only if it's supported
From: Farhan Ali Register a crypto algo with the Linux crypto layer only if the algorithm is supported by the backend virtio-crypto device. Also route crypto requests to a virtio-crypto device, only if it can support the requested service and algorithm. Signed-off-by: Farhan Ali Acked-by: Gonglei Acked-by: Christian Borntraeger --- drivers/crypto/virtio/virtio_crypto_algs.c | 112 ++- drivers/crypto/virtio/virtio_crypto_common.h | 11 ++- drivers/crypto/virtio/virtio_crypto_mgr.c| 81 +-- 3 files changed, 159 insertions(+), 45 deletions(-) diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c index af6a908..7a104f6 100644 --- a/drivers/crypto/virtio/virtio_crypto_algs.c +++ b/drivers/crypto/virtio/virtio_crypto_algs.c @@ -49,12 +49,18 @@ struct virtio_crypto_sym_request { bool encrypt; }; +struct virtio_crypto_algo { + uint32_t algonum; + uint32_t service; + unsigned int active_devs; + struct crypto_alg algo; +}; + /* * The algs_lock protects the below global virtio_crypto_active_devs * and crypto algorithms registion. */ static DEFINE_MUTEX(algs_lock); -static unsigned int virtio_crypto_active_devs; static void virtio_crypto_ablkcipher_finalize_req( struct virtio_crypto_sym_request *vc_sym_req, struct ablkcipher_request *req, @@ -312,15 +318,21 @@ static int virtio_crypto_ablkcipher_setkey(struct crypto_ablkcipher *tfm, unsigned int keylen) { struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(tfm); + uint32_t alg; int ret; + ret = virtio_crypto_alg_validate_key(keylen, ); + if (ret) + return ret; + if (!ctx->vcrypto) { /* New key */ int node = virtio_crypto_get_current_node(); struct virtio_crypto *vcrypto = - virtcrypto_get_dev_node(node); + virtcrypto_get_dev_node(node, + VIRTIO_CRYPTO_SERVICE_CIPHER, alg); if (!vcrypto) { - pr_err("virtio_crypto: Could not find a virtio device in the system\n"); + pr_err("virtio_crypto: Could not find a virtio device in the system or unsupported algo\n"); return -ENODEV; } @@ -571,57 +583,85 @@ static void virtio_crypto_ablkcipher_finalize_req( virtcrypto_clear_request(_sym_req->base); } -static struct crypto_alg virtio_crypto_algs[] = { { - .cra_name = "cbc(aes)", - .cra_driver_name = "virtio_crypto_aes_cbc", - .cra_priority = 150, - .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC, - .cra_blocksize = AES_BLOCK_SIZE, - .cra_ctxsize = sizeof(struct virtio_crypto_ablkcipher_ctx), - .cra_alignmask = 0, - .cra_module = THIS_MODULE, - .cra_type = _ablkcipher_type, - .cra_init = virtio_crypto_ablkcipher_init, - .cra_exit = virtio_crypto_ablkcipher_exit, - .cra_u = { - .ablkcipher = { - .setkey = virtio_crypto_ablkcipher_setkey, - .decrypt = virtio_crypto_ablkcipher_decrypt, - .encrypt = virtio_crypto_ablkcipher_encrypt, - .min_keysize = AES_MIN_KEY_SIZE, - .max_keysize = AES_MAX_KEY_SIZE, - .ivsize = AES_BLOCK_SIZE, +static struct virtio_crypto_algo virtio_crypto_algs[] = { { + .algonum = VIRTIO_CRYPTO_CIPHER_AES_CBC, + .service = VIRTIO_CRYPTO_SERVICE_CIPHER, + .algo = { + .cra_name = "cbc(aes)", + .cra_driver_name = "virtio_crypto_aes_cbc", + .cra_priority = 150, + .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct virtio_crypto_ablkcipher_ctx), + .cra_alignmask = 0, + .cra_module = THIS_MODULE, + .cra_type = _ablkcipher_type, + .cra_init = virtio_crypto_ablkcipher_init, + .cra_exit = virtio_crypto_ablkcipher_exit, + .cra_u = { + .ablkcipher = { + .setkey = virtio_crypto_ablkcipher_setkey, + .decrypt = virtio_crypto_ablkcipher_decrypt, + .encrypt = virtio_crypto_ablkcipher_encrypt, + .min_keysize = AES_MIN_KEY_SIZE, + .max_keysize = AES_MAX_KEY_SIZE, + .ivsize = AES_BLOCK_SIZE, + }, }, }, } }; -int virtio_crypto_algs_register(vo
[Qemu-devel] [PATCH v3 0/2] Detect & register virtio-crypto algos only if it can be supported by backend
Hi, Currently the Linux virtio-crypto driver registers the crypto algorithm without verifying if the backend actually supports the algorithm. This kernel patch series adds support for registering algorithm with Linux crypto layer, only if the algorithm is supported by the backend device. This also makes the driver more compliant with the virtio-crypto spec [1]. I would appreciate any feedback or comments on this. Thank you Farhan Reference - [1] Virtio crypto spec proposal https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg00816.html ChangeLog - v2 -> v3 - Add Christian Borntrager's acks for the patches. v1 -> v2 - Modify comment as suggested by Arei (patch 1) - Modify error message as suggested by Arei (patch 2) Farhan Ali (2): crypto/virtio-crypto: Read crypto services and algorithm masks crypto/virtio-crypto: Register an algo only if it's supported drivers/crypto/virtio/virtio_crypto_algs.c | 112 ++- drivers/crypto/virtio/virtio_crypto_common.h | 25 +- drivers/crypto/virtio/virtio_crypto_core.c | 29 +++ drivers/crypto/virtio/virtio_crypto_mgr.c| 81 +-- 4 files changed, 202 insertions(+), 45 deletions(-) -- 2.7.4
[Qemu-devel] [PATCH v3 1/2] crypto/virtio-crypto: Read crypto services and algorithm masks
Read the crypto services and algorithm masks which provides information about the services and algorithms supported by virtio-crypto backend. Signed-off-by: Farhan Ali Acked-by: Gonglei Acked-by: Christian Borntraeger --- drivers/crypto/virtio/virtio_crypto_common.h | 14 ++ drivers/crypto/virtio/virtio_crypto_core.c | 29 2 files changed, 43 insertions(+) diff --git a/drivers/crypto/virtio/virtio_crypto_common.h b/drivers/crypto/virtio/virtio_crypto_common.h index 66501a5..931a3bd 100644 --- a/drivers/crypto/virtio/virtio_crypto_common.h +++ b/drivers/crypto/virtio/virtio_crypto_common.h @@ -55,6 +55,20 @@ struct virtio_crypto { /* Number of queue currently used by the driver */ u32 curr_queue; + /* +* Specifies the services mask which the device support, +* see VIRTIO_CRYPTO_SERVICE_* +*/ + u32 crypto_services; + + /* Detailed algorithms mask */ + u32 cipher_algo_l; + u32 cipher_algo_h; + u32 hash_algo; + u32 mac_algo_l; + u32 mac_algo_h; + u32 aead_algo; + /* Maximum length of cipher key */ u32 max_cipher_key_len; /* Maximum length of authenticated key */ diff --git a/drivers/crypto/virtio/virtio_crypto_core.c b/drivers/crypto/virtio/virtio_crypto_core.c index 8332698..8f745f2 100644 --- a/drivers/crypto/virtio/virtio_crypto_core.c +++ b/drivers/crypto/virtio/virtio_crypto_core.c @@ -303,6 +303,13 @@ static int virtcrypto_probe(struct virtio_device *vdev) u32 max_data_queues = 0, max_cipher_key_len = 0; u32 max_auth_key_len = 0; u64 max_size = 0; + u32 cipher_algo_l = 0; + u32 cipher_algo_h = 0; + u32 hash_algo = 0; + u32 mac_algo_l = 0; + u32 mac_algo_h = 0; + u32 aead_algo = 0; + u32 crypto_services = 0; if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) return -ENODEV; @@ -339,6 +346,20 @@ static int virtcrypto_probe(struct virtio_device *vdev) max_auth_key_len, _auth_key_len); virtio_cread(vdev, struct virtio_crypto_config, max_size, _size); + virtio_cread(vdev, struct virtio_crypto_config, + crypto_services, _services); + virtio_cread(vdev, struct virtio_crypto_config, + cipher_algo_l, _algo_l); + virtio_cread(vdev, struct virtio_crypto_config, + cipher_algo_h, _algo_h); + virtio_cread(vdev, struct virtio_crypto_config, + hash_algo, _algo); + virtio_cread(vdev, struct virtio_crypto_config, + mac_algo_l, _algo_l); + virtio_cread(vdev, struct virtio_crypto_config, + mac_algo_h, _algo_h); + virtio_cread(vdev, struct virtio_crypto_config, + aead_algo, _algo); /* Add virtio crypto device to global table */ err = virtcrypto_devmgr_add_dev(vcrypto); @@ -358,6 +379,14 @@ static int virtcrypto_probe(struct virtio_device *vdev) vcrypto->max_cipher_key_len = max_cipher_key_len; vcrypto->max_auth_key_len = max_auth_key_len; vcrypto->max_size = max_size; + vcrypto->crypto_services = crypto_services; + vcrypto->cipher_algo_l = cipher_algo_l; + vcrypto->cipher_algo_h = cipher_algo_h; + vcrypto->mac_algo_l = mac_algo_l; + vcrypto->mac_algo_h = mac_algo_h; + vcrypto->hash_algo = hash_algo; + vcrypto->aead_algo = aead_algo; + dev_info(>dev, "max_queues: %u, max_cipher_key_len: %u, max_auth_key_len: %u, max_size 0x%llx\n", -- 2.7.4
Re: [Qemu-devel] [PATCH v2 0/2] Detect & register virtio-crypto algos only if it can be supported by backend
On 06/19/2018 10:32 AM, Herbert Xu wrote: On Tue, Jun 19, 2018 at 04:08:05PM +0200, Christian Borntraeger wrote: On 06/19/2018 03:26 PM, Farhan Ali wrote: A polite ping :) If no one has any more comments/feedback then I would like to how to merge these patches to the mainline kernel tree. I think it should go via Michael Tsirkin or Jason Wang (Adding Jason). Also adding Herbert Xu as FYI. I have not looked into the details of the patch, but I like the idea. Acked-by: Christian Borntraeger Please post crypto patches to linux-crypto. Thanks, I have cc'ed linux-crypto folks. Sorry I missed them Thanks Farhan
Re: [Qemu-devel] [PATCH v2 0/2] Detect & register virtio-crypto algos only if it can be supported by backend
A polite ping :) If no one has any more comments/feedback then I would like to how to merge these patches to the mainline kernel tree. Thanks Farhan On 06/13/2018 04:38 PM, Farhan Ali wrote: Hi, Currently the Linux virtio-crypto driver registers the crypto algorithm without verifying if the backend actually supports the algorithm. This kernel patch series adds support for registering algorithm with Linux crypto layer, only if the algorithm is supported by the backend device. This also makes the driver more compliant with the virtio-crypto spec [1]. I would appreciate any feedback or comments on this. Thank you Farhan Reference - [1] Virtio crypto spec proposal https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg00816.html ChangeLog - v1 -> v2 - Modify comment as suggested by Arei (patch 1) - Modify error message as suggested by Arei (patch 2) Farhan Ali (2): crypto/virtio-crypto: Read crypto services and algorithm masks crypto/virtio-crypto: Register an algo only if it's supported drivers/crypto/virtio/virtio_crypto_algs.c | 112 ++- drivers/crypto/virtio/virtio_crypto_common.h | 25 +- drivers/crypto/virtio/virtio_crypto_core.c | 29 +++ drivers/crypto/virtio/virtio_crypto_mgr.c| 81 +-- 4 files changed, 202 insertions(+), 45 deletions(-)
Re: [Qemu-devel] [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device
On 06/15/2018 09:17 AM, Viktor VM Mihajlovski wrote: On 14.06.2018 18:12, Farhan Ali wrote: On 06/14/2018 11:10 AM, Daniel P. Berrangé wrote: On Thu, Jun 14, 2018 at 10:50:40AM -0400, Farhan Ali wrote: On 06/14/2018 04:21 AM, Daniel P. Berrangé wrote: On Wed, Jun 13, 2018 at 07:28:08PM +0200, Halil Pasic wrote: On 06/13/2018 05:05 PM, Daniel P. Berrangé wrote: On Wed, Jun 13, 2018 at 11:01:05AM -0400, Farhan Ali wrote: Hi Daniel On 06/13/2018 05:37 AM, Daniel P. Berrangé wrote: On Tue, Jun 12, 2018 at 03:48:34PM -0400, Farhan Ali wrote: The virtio-crypto driver currently propagates to the guest all the cipher algorithms that the backend cryptodev can support. But in certain cases where the guest has more performant mechanism to handle some algorithms, it would be useful to propagate only a subset of the algorithms. I'm not really convinced by this. The performance of crypto algorithms has many influencing factors, making it pretty hard to decide which is best without actively testing specific impls and comparing them in a manner which matches the application usage pattern. eg in theory the kernel crypto impl of an alg is faster than a userspace impl, if the kernel uses hardware accel and userspace does not. This, however, ignores the overhead of the kernel/userspace switch. The real world performance winner, thus depends on the amount of data being processed in each operation. Some times userspace can win & sometimes kernel space can win. This is even more relevant to virtio-crypto as it has more expensive context switches. True. But what if the guest can perform some crypto algorithms without a incurring a VM exit? For example in s390 we have the cpacf instructions to perform crypto and this instruction is implemented for us by our hardware virtualization technology. In such a case it would be better not to use virtio-crypto's implementation of such a crypto algorithm. At the same time we would like to take advantage of virtio-crypto's acceleration capabilities for certain crypto algorithms for which there is no hardware assistance. IIUC, the kernel's crypto layer can support multiple implementations of any algorithm. Providers can report a priority against implementations which influences which impl is used in practice. So if there's a native instruction for a partiuclar algorithm I would expect the impl registered for that to be designated higher priority than other impls, so that it is used in preference to other impls. AFAIR the problem here is that in (the guest) kernel the virtio-crypto driver has to register it's crypto algo implementations with a priority (single number), which dictates if it's going to be the preferred (used) implementation of the algorithm or not. The virtio-crypto driver does this without having information about the (comparative or absolute) performance of it's implementation (which depends on the backend among others). I don't think any dynamic re-prioritization of the algorithms takes place (e.g. based on how these perform in for the given configuration). I think the strategy of the virtio-crypto is to rather overstate, than understate the performance of it's implementation. If we were to 'be conservative' and say, 'hey we don't know nothing about the performance, let's make it lowest priority implementation' the implementations provided by virtio-crypto would end up being used only if there is no other implementation. And that does not sound like a good idea either. This problem you describe, however, is something that applies to *any* kerenl code that is registering a crypto algo impl for accelerator hardware. A non-virtualized crypto cards in bare metal likewise cannot assume that its AES impl is better then the host CPU's aes-ni instruction. So the idea is to give the user the power to effectively not provide an algorithm via virtio-crypto. That is, if the user observes a performance degradation because of virtio-crypto, he can turn off the bad algorithms at the device. That way overstatement becomes a much smaller problem. The user can turn off the bad algorithms for reasons other than performance too. Of course there are other ways to deal with the problem of virtio-crypto driver not knowing how good it's implementation of a given algo is. We could make the in kernel crypto priorities dynamically adjustable in general or we could provide the user with means to specify the priorities (e.g. as module parameter) with which the virtio-crypto driver registers each algo. Both of these would be knobs in the guest. It's hard to tell if these first one would be useful in scenarios not involving virtualization. Same goes for some kind of dynamic priority management for crypto algorithm implementations in the Linux kernel. I assume the people involved with the respective subsystem do not see the necessity for something like that. It still feels like this is a problem for the guest OS to solve. If you put a physical crypto acceler
Re: [Qemu-devel] [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device
On 06/14/2018 11:10 AM, Daniel P. Berrangé wrote: On Thu, Jun 14, 2018 at 10:50:40AM -0400, Farhan Ali wrote: On 06/14/2018 04:21 AM, Daniel P. Berrangé wrote: On Wed, Jun 13, 2018 at 07:28:08PM +0200, Halil Pasic wrote: On 06/13/2018 05:05 PM, Daniel P. Berrangé wrote: On Wed, Jun 13, 2018 at 11:01:05AM -0400, Farhan Ali wrote: Hi Daniel On 06/13/2018 05:37 AM, Daniel P. Berrangé wrote: On Tue, Jun 12, 2018 at 03:48:34PM -0400, Farhan Ali wrote: The virtio-crypto driver currently propagates to the guest all the cipher algorithms that the backend cryptodev can support. But in certain cases where the guest has more performant mechanism to handle some algorithms, it would be useful to propagate only a subset of the algorithms. I'm not really convinced by this. The performance of crypto algorithms has many influencing factors, making it pretty hard to decide which is best without actively testing specific impls and comparing them in a manner which matches the application usage pattern. eg in theory the kernel crypto impl of an alg is faster than a userspace impl, if the kernel uses hardware accel and userspace does not. This, however, ignores the overhead of the kernel/userspace switch. The real world performance winner, thus depends on the amount of data being processed in each operation. Some times userspace can win & sometimes kernel space can win. This is even more relevant to virtio-crypto as it has more expensive context switches. True. But what if the guest can perform some crypto algorithms without a incurring a VM exit? For example in s390 we have the cpacf instructions to perform crypto and this instruction is implemented for us by our hardware virtualization technology. In such a case it would be better not to use virtio-crypto's implementation of such a crypto algorithm. At the same time we would like to take advantage of virtio-crypto's acceleration capabilities for certain crypto algorithms for which there is no hardware assistance. IIUC, the kernel's crypto layer can support multiple implementations of any algorithm. Providers can report a priority against implementations which influences which impl is used in practice. So if there's a native instruction for a partiuclar algorithm I would expect the impl registered for that to be designated higher priority than other impls, so that it is used in preference to other impls. AFAIR the problem here is that in (the guest) kernel the virtio-crypto driver has to register it's crypto algo implementations with a priority (single number), which dictates if it's going to be the preferred (used) implementation of the algorithm or not. The virtio-crypto driver does this without having information about the (comparative or absolute) performance of it's implementation (which depends on the backend among others). I don't think any dynamic re-prioritization of the algorithms takes place (e.g. based on how these perform in for the given configuration). I think the strategy of the virtio-crypto is to rather overstate, than understate the performance of it's implementation. If we were to 'be conservative' and say, 'hey we don't know nothing about the performance, let's make it lowest priority implementation' the implementations provided by virtio-crypto would end up being used only if there is no other implementation. And that does not sound like a good idea either. This problem you describe, however, is something that applies to *any* kerenl code that is registering a crypto algo impl for accelerator hardware. A non-virtualized crypto cards in bare metal likewise cannot assume that its AES impl is better then the host CPU's aes-ni instruction. So the idea is to give the user the power to effectively not provide an algorithm via virtio-crypto. That is, if the user observes a performance degradation because of virtio-crypto, he can turn off the bad algorithms at the device. That way overstatement becomes a much smaller problem. The user can turn off the bad algorithms for reasons other than performance too. Of course there are other ways to deal with the problem of virtio-crypto driver not knowing how good it's implementation of a given algo is. We could make the in kernel crypto priorities dynamically adjustable in general or we could provide the user with means to specify the priorities (e.g. as module parameter) with which the virtio-crypto driver registers each algo. Both of these would be knobs in the guest. It's hard to tell if these first one would be useful in scenarios not involving virtualization. Same goes for some kind of dynamic priority management for crypto algorithm implementations in the Linux kernel. I assume the people involved with the respective subsystem do not see the necessity for something like that. It still feels like this is a problem for the guest OS to solve. If you put a physical crypto accelerator in a bare metal machine, that has the same problem you describe here, so the kernel su
Re: [Qemu-devel] [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device
On 06/14/2018 04:21 AM, Daniel P. Berrangé wrote: On Wed, Jun 13, 2018 at 07:28:08PM +0200, Halil Pasic wrote: On 06/13/2018 05:05 PM, Daniel P. Berrangé wrote: On Wed, Jun 13, 2018 at 11:01:05AM -0400, Farhan Ali wrote: Hi Daniel On 06/13/2018 05:37 AM, Daniel P. Berrangé wrote: On Tue, Jun 12, 2018 at 03:48:34PM -0400, Farhan Ali wrote: The virtio-crypto driver currently propagates to the guest all the cipher algorithms that the backend cryptodev can support. But in certain cases where the guest has more performant mechanism to handle some algorithms, it would be useful to propagate only a subset of the algorithms. I'm not really convinced by this. The performance of crypto algorithms has many influencing factors, making it pretty hard to decide which is best without actively testing specific impls and comparing them in a manner which matches the application usage pattern. eg in theory the kernel crypto impl of an alg is faster than a userspace impl, if the kernel uses hardware accel and userspace does not. This, however, ignores the overhead of the kernel/userspace switch. The real world performance winner, thus depends on the amount of data being processed in each operation. Some times userspace can win & sometimes kernel space can win. This is even more relevant to virtio-crypto as it has more expensive context switches. True. But what if the guest can perform some crypto algorithms without a incurring a VM exit? For example in s390 we have the cpacf instructions to perform crypto and this instruction is implemented for us by our hardware virtualization technology. In such a case it would be better not to use virtio-crypto's implementation of such a crypto algorithm. At the same time we would like to take advantage of virtio-crypto's acceleration capabilities for certain crypto algorithms for which there is no hardware assistance. IIUC, the kernel's crypto layer can support multiple implementations of any algorithm. Providers can report a priority against implementations which influences which impl is used in practice. So if there's a native instruction for a partiuclar algorithm I would expect the impl registered for that to be designated higher priority than other impls, so that it is used in preference to other impls. AFAIR the problem here is that in (the guest) kernel the virtio-crypto driver has to register it's crypto algo implementations with a priority (single number), which dictates if it's going to be the preferred (used) implementation of the algorithm or not. The virtio-crypto driver does this without having information about the (comparative or absolute) performance of it's implementation (which depends on the backend among others). I don't think any dynamic re-prioritization of the algorithms takes place (e.g. based on how these perform in for the given configuration). I think the strategy of the virtio-crypto is to rather overstate, than understate the performance of it's implementation. If we were to 'be conservative' and say, 'hey we don't know nothing about the performance, let's make it lowest priority implementation' the implementations provided by virtio-crypto would end up being used only if there is no other implementation. And that does not sound like a good idea either. This problem you describe, however, is something that applies to *any* kerenl code that is registering a crypto algo impl for accelerator hardware. A non-virtualized crypto cards in bare metal likewise cannot assume that its AES impl is better then the host CPU's aes-ni instruction. So the idea is to give the user the power to effectively not provide an algorithm via virtio-crypto. That is, if the user observes a performance degradation because of virtio-crypto, he can turn off the bad algorithms at the device. That way overstatement becomes a much smaller problem. The user can turn off the bad algorithms for reasons other than performance too. Of course there are other ways to deal with the problem of virtio-crypto driver not knowing how good it's implementation of a given algo is. We could make the in kernel crypto priorities dynamically adjustable in general or we could provide the user with means to specify the priorities (e.g. as module parameter) with which the virtio-crypto driver registers each algo. Both of these would be knobs in the guest. It's hard to tell if these first one would be useful in scenarios not involving virtualization. Same goes for some kind of dynamic priority management for crypto algorithm implementations in the Linux kernel. I assume the people involved with the respective subsystem do not see the necessity for something like that. It still feels like this is a problem for the guest OS to solve. If you put a physical crypto accelerator in a bare metal machine, that has the same problem you describe here, so the kernel surely already needs to find a viable solution for this problem. How would the guest OS know which algo is better? As
[Qemu-devel] [PATCH v2 1/2] crypto/virtio-crypto: Read crypto services and algorithm masks
Read the crypto services and algorithm masks which provides information about the services and algorithms supported by virtio-crypto backend. Signed-off-by: Farhan Ali Acked-by: Gonglei --- drivers/crypto/virtio/virtio_crypto_common.h | 14 ++ drivers/crypto/virtio/virtio_crypto_core.c | 29 2 files changed, 43 insertions(+) diff --git a/drivers/crypto/virtio/virtio_crypto_common.h b/drivers/crypto/virtio/virtio_crypto_common.h index 66501a5..931a3bd 100644 --- a/drivers/crypto/virtio/virtio_crypto_common.h +++ b/drivers/crypto/virtio/virtio_crypto_common.h @@ -55,6 +55,20 @@ struct virtio_crypto { /* Number of queue currently used by the driver */ u32 curr_queue; + /* +* Specifies the services mask which the device support, +* see VIRTIO_CRYPTO_SERVICE_* +*/ + u32 crypto_services; + + /* Detailed algorithms mask */ + u32 cipher_algo_l; + u32 cipher_algo_h; + u32 hash_algo; + u32 mac_algo_l; + u32 mac_algo_h; + u32 aead_algo; + /* Maximum length of cipher key */ u32 max_cipher_key_len; /* Maximum length of authenticated key */ diff --git a/drivers/crypto/virtio/virtio_crypto_core.c b/drivers/crypto/virtio/virtio_crypto_core.c index 8332698..8f745f2 100644 --- a/drivers/crypto/virtio/virtio_crypto_core.c +++ b/drivers/crypto/virtio/virtio_crypto_core.c @@ -303,6 +303,13 @@ static int virtcrypto_probe(struct virtio_device *vdev) u32 max_data_queues = 0, max_cipher_key_len = 0; u32 max_auth_key_len = 0; u64 max_size = 0; + u32 cipher_algo_l = 0; + u32 cipher_algo_h = 0; + u32 hash_algo = 0; + u32 mac_algo_l = 0; + u32 mac_algo_h = 0; + u32 aead_algo = 0; + u32 crypto_services = 0; if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) return -ENODEV; @@ -339,6 +346,20 @@ static int virtcrypto_probe(struct virtio_device *vdev) max_auth_key_len, _auth_key_len); virtio_cread(vdev, struct virtio_crypto_config, max_size, _size); + virtio_cread(vdev, struct virtio_crypto_config, + crypto_services, _services); + virtio_cread(vdev, struct virtio_crypto_config, + cipher_algo_l, _algo_l); + virtio_cread(vdev, struct virtio_crypto_config, + cipher_algo_h, _algo_h); + virtio_cread(vdev, struct virtio_crypto_config, + hash_algo, _algo); + virtio_cread(vdev, struct virtio_crypto_config, + mac_algo_l, _algo_l); + virtio_cread(vdev, struct virtio_crypto_config, + mac_algo_h, _algo_h); + virtio_cread(vdev, struct virtio_crypto_config, + aead_algo, _algo); /* Add virtio crypto device to global table */ err = virtcrypto_devmgr_add_dev(vcrypto); @@ -358,6 +379,14 @@ static int virtcrypto_probe(struct virtio_device *vdev) vcrypto->max_cipher_key_len = max_cipher_key_len; vcrypto->max_auth_key_len = max_auth_key_len; vcrypto->max_size = max_size; + vcrypto->crypto_services = crypto_services; + vcrypto->cipher_algo_l = cipher_algo_l; + vcrypto->cipher_algo_h = cipher_algo_h; + vcrypto->mac_algo_l = mac_algo_l; + vcrypto->mac_algo_h = mac_algo_h; + vcrypto->hash_algo = hash_algo; + vcrypto->aead_algo = aead_algo; + dev_info(>dev, "max_queues: %u, max_cipher_key_len: %u, max_auth_key_len: %u, max_size 0x%llx\n", -- 2.7.4
[Qemu-devel] [PATCH v2 0/2] Detect & register virtio-crypto algos only if it can be supported by backend
Hi, Currently the Linux virtio-crypto driver registers the crypto algorithm without verifying if the backend actually supports the algorithm. This kernel patch series adds support for registering algorithm with Linux crypto layer, only if the algorithm is supported by the backend device. This also makes the driver more compliant with the virtio-crypto spec [1]. I would appreciate any feedback or comments on this. Thank you Farhan Reference - [1] Virtio crypto spec proposal https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg00816.html ChangeLog - v1 -> v2 - Modify comment as suggested by Arei (patch 1) - Modify error message as suggested by Arei (patch 2) Farhan Ali (2): crypto/virtio-crypto: Read crypto services and algorithm masks crypto/virtio-crypto: Register an algo only if it's supported drivers/crypto/virtio/virtio_crypto_algs.c | 112 ++- drivers/crypto/virtio/virtio_crypto_common.h | 25 +- drivers/crypto/virtio/virtio_crypto_core.c | 29 +++ drivers/crypto/virtio/virtio_crypto_mgr.c| 81 +-- 4 files changed, 202 insertions(+), 45 deletions(-) -- 2.7.4
[Qemu-devel] [PATCH v2 2/2] crypto/virtio-crypto: Register an algo only if it's supported
From: Farhan Ali Register a crypto algo with the Linux crypto layer only if the algorithm is supported by the backend virtio-crypto device. Also route crypto requests to a virtio-crypto device, only if it can support the requested service and algorithm. Signed-off-by: Farhan Ali Acked-by: Gonglei --- drivers/crypto/virtio/virtio_crypto_algs.c | 112 ++- drivers/crypto/virtio/virtio_crypto_common.h | 11 ++- drivers/crypto/virtio/virtio_crypto_mgr.c| 81 +-- 3 files changed, 159 insertions(+), 45 deletions(-) diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c index ba190cf..11db62f 100644 --- a/drivers/crypto/virtio/virtio_crypto_algs.c +++ b/drivers/crypto/virtio/virtio_crypto_algs.c @@ -49,12 +49,18 @@ struct virtio_crypto_sym_request { bool encrypt; }; +struct virtio_crypto_algo { + uint32_t algonum; + uint32_t service; + unsigned int active_devs; + struct crypto_alg algo; +}; + /* * The algs_lock protects the below global virtio_crypto_active_devs * and crypto algorithms registion. */ static DEFINE_MUTEX(algs_lock); -static unsigned int virtio_crypto_active_devs; static void virtio_crypto_ablkcipher_finalize_req( struct virtio_crypto_sym_request *vc_sym_req, struct ablkcipher_request *req, @@ -312,15 +318,21 @@ static int virtio_crypto_ablkcipher_setkey(struct crypto_ablkcipher *tfm, unsigned int keylen) { struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(tfm); + uint32_t alg; int ret; + ret = virtio_crypto_alg_validate_key(keylen, ); + if (ret) + return ret; + if (!ctx->vcrypto) { /* New key */ int node = virtio_crypto_get_current_node(); struct virtio_crypto *vcrypto = - virtcrypto_get_dev_node(node); + virtcrypto_get_dev_node(node, + VIRTIO_CRYPTO_SERVICE_CIPHER, alg); if (!vcrypto) { - pr_err("virtio_crypto: Could not find a virtio device in the system\n"); + pr_err("virtio_crypto: Could not find a virtio device in the system or unsupported algo\n"); return -ENODEV; } @@ -571,57 +583,85 @@ static void virtio_crypto_ablkcipher_finalize_req( virtcrypto_clear_request(_sym_req->base); } -static struct crypto_alg virtio_crypto_algs[] = { { - .cra_name = "cbc(aes)", - .cra_driver_name = "virtio_crypto_aes_cbc", - .cra_priority = 150, - .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC, - .cra_blocksize = AES_BLOCK_SIZE, - .cra_ctxsize = sizeof(struct virtio_crypto_ablkcipher_ctx), - .cra_alignmask = 0, - .cra_module = THIS_MODULE, - .cra_type = _ablkcipher_type, - .cra_init = virtio_crypto_ablkcipher_init, - .cra_exit = virtio_crypto_ablkcipher_exit, - .cra_u = { - .ablkcipher = { - .setkey = virtio_crypto_ablkcipher_setkey, - .decrypt = virtio_crypto_ablkcipher_decrypt, - .encrypt = virtio_crypto_ablkcipher_encrypt, - .min_keysize = AES_MIN_KEY_SIZE, - .max_keysize = AES_MAX_KEY_SIZE, - .ivsize = AES_BLOCK_SIZE, +static struct virtio_crypto_algo virtio_crypto_algs[] = { { + .algonum = VIRTIO_CRYPTO_CIPHER_AES_CBC, + .service = VIRTIO_CRYPTO_SERVICE_CIPHER, + .algo = { + .cra_name = "cbc(aes)", + .cra_driver_name = "virtio_crypto_aes_cbc", + .cra_priority = 150, + .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct virtio_crypto_ablkcipher_ctx), + .cra_alignmask = 0, + .cra_module = THIS_MODULE, + .cra_type = _ablkcipher_type, + .cra_init = virtio_crypto_ablkcipher_init, + .cra_exit = virtio_crypto_ablkcipher_exit, + .cra_u = { + .ablkcipher = { + .setkey = virtio_crypto_ablkcipher_setkey, + .decrypt = virtio_crypto_ablkcipher_decrypt, + .encrypt = virtio_crypto_ablkcipher_encrypt, + .min_keysize = AES_MIN_KEY_SIZE, + .max_keysize = AES_MAX_KEY_SIZE, + .ivsize = AES_BLOCK_SIZE, + }, }, }, } }; -int virtio_crypto_algs_register(void) +int virtio_crypto_algs_register(struct
Re: [Qemu-devel] [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device
Hi Arei On 06/12/2018 08:57 PM, Gonglei (Arei) wrote: -Original Message- From: Farhan Ali [mailto:al...@linux.ibm.com] Sent: Wednesday, June 13, 2018 3:49 AM To: qemu-devel@nongnu.org Cc: m...@redhat.com; Gonglei (Arei) ; longpeng ; pa...@linux.ibm.com; borntrae...@de.ibm.com; fran...@linux.ibm.com; al...@linux.ibm.com Subject: [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device The virtio-crypto driver currently propagates to the guest all the cipher algorithms that the backend cryptodev can support. But in certain cases where the guest has more performant mechanism to handle some algorithms, it would be useful to propagate only a subset of the algorithms. It makes sense to me. E.g. current Intel CPU has the AES-NI instruction for accelerating AES algo. We don't need to propagate AES algos. This patch adds support for disabling the cipher algorithms of the backend cryptodev. eg: -object cryptodev-backend-builtin,id=cryptodev0 -device virtio-crypto-ccw,id=crypto0,cryptodev=cryptodev0,cipher-aes-cbc=off Signed-off-by: Farhan Ali --- Please note this patch is not complete, and there are TODOs to handle for other types of algorithms such Hash, AEAD and MAC algorithms. This is mainly intended to get some feedback on the design approach from the community. hw/virtio/virtio-crypto.c | 46 --- include/hw/virtio/virtio-crypto.h | 3 +++ 2 files changed, 46 insertions(+), 3 deletions(-) diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c index 9a9fa49..4aed9ca 100644 --- a/hw/virtio/virtio-crypto.c +++ b/hw/virtio/virtio-crypto.c @@ -754,12 +754,22 @@ static void virtio_crypto_reset(VirtIODevice *vdev) static void virtio_crypto_init_config(VirtIODevice *vdev) { VirtIOCrypto *vcrypto = VIRTIO_CRYPTO(vdev); +uint32_t user_crypto_services = (1u << VIRTIO_CRYPTO_SERVICE_CIPHER) | +(1u << VIRTIO_CRYPTO_SERVICE_HASH) | +(1u << VIRTIO_CRYPTO_SERVICE_AEAD) | +(1u << VIRTIO_CRYPTO_SERVICE_MAC); + +if (vcrypto->user_cipher_algo_l & (1u << VIRTIO_CRYPTO_NO_CIPHER)) { +vcrypto->user_cipher_algo_l = 1u << VIRTIO_CRYPTO_NO_CIPHER; +vcrypto->user_cipher_algo_h = 0; +user_crypto_services &= ~(1u << VIRTIO_CRYPTO_SERVICE_CIPHER); +} -vcrypto->conf.crypto_services = +vcrypto->conf.crypto_services = user_crypto_services & vcrypto->conf.cryptodev->conf.crypto_services; -vcrypto->conf.cipher_algo_l = +vcrypto->conf.cipher_algo_l = vcrypto->user_cipher_algo_l & vcrypto->conf.cryptodev->conf.cipher_algo_l; -vcrypto->conf.cipher_algo_h = +vcrypto->conf.cipher_algo_h = vcrypto->user_cipher_algo_h & vcrypto->conf.cryptodev->conf.cipher_algo_h; vcrypto->conf.hash_algo = vcrypto->conf.cryptodev->conf.hash_algo; vcrypto->conf.mac_algo_l = vcrypto->conf.cryptodev->conf.mac_algo_l; @@ -853,6 +863,34 @@ static const VMStateDescription vmstate_virtio_crypto = { static Property virtio_crypto_properties[] = { DEFINE_PROP_LINK("cryptodev", VirtIOCrypto, conf.cryptodev, TYPE_CRYPTODEV_BACKEND, CryptoDevBackend *), +DEFINE_PROP_BIT("no-cipher", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_ARC4, false), s/ VIRTIO_CRYPTO_CIPHER_ARC4/VIRTIO_CRYPTO_NO_CIPHER/ Thanks for catching that. I will change. +DEFINE_PROP_BIT("cipher-arc4", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_ARC4, false), +DEFINE_PROP_BIT("cipher-aes-ecb", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_ECB, false), +DEFINE_PROP_BIT("cipher-aes-cbc", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_CBC, false), +DEFINE_PROP_BIT("cipher-aes-ctr", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_CTR, false), +DEFINE_PROP_BIT("cipher-des-ecb", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_DES_ECB, false), +DEFINE_PROP_BIT("cipher-3des-ecb", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_3DES_ECB, false), +DEFINE_PROP_BIT("cipher-3des-cbc", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_3DES_CBC, false), +DEFINE_PROP_BIT("cipher-3des-ctr", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_3DES_CTR, false), +DEFINE_PROP_BIT("cipher-kasumi-f8", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CR
Re: [Qemu-devel] [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device
Hi Daniel On 06/13/2018 05:37 AM, Daniel P. Berrangé wrote: On Tue, Jun 12, 2018 at 03:48:34PM -0400, Farhan Ali wrote: The virtio-crypto driver currently propagates to the guest all the cipher algorithms that the backend cryptodev can support. But in certain cases where the guest has more performant mechanism to handle some algorithms, it would be useful to propagate only a subset of the algorithms. I'm not really convinced by this. The performance of crypto algorithms has many influencing factors, making it pretty hard to decide which is best without actively testing specific impls and comparing them in a manner which matches the application usage pattern. eg in theory the kernel crypto impl of an alg is faster than a userspace impl, if the kernel uses hardware accel and userspace does not. This, however, ignores the overhead of the kernel/userspace switch. The real world performance winner, thus depends on the amount of data being processed in each operation. Some times userspace can win & sometimes kernel space can win. This is even more relevant to virtio-crypto as it has more expensive context switches. True. But what if the guest can perform some crypto algorithms without a incurring a VM exit? For example in s390 we have the cpacf instructions to perform crypto and this instruction is implemented for us by our hardware virtualization technology. In such a case it would be better not to use virtio-crypto's implementation of such a crypto algorithm. At the same time we would like to take advantage of virtio-crypto's acceleration capabilities for certain crypto algorithms for which there is no hardware assistance. IOW, when we expose a virtio-crypto dev to a guest, it is never reasonable for the guest to blindly assume that anything it does is faster than a pure software impl running in the guest. It will depend on the usage pattern. This is no different to bare metal where you should not assume kernel crypto is faster. IMHO this is not a compelling reason to be able to turn off algorithms in virtio-crypto, as any decision will always be at best incomplete & inaccurate. But shouldn't the user have the option to try and test by turning off certain algorithms? You are right the performance will depend on usage patterns, but I believe at least the user should have the option to test and see what works and does not work. It would be far easier to do so with the virtio-crypto dev than changing code in the kernel or userspace IMHO. @@ -853,6 +863,34 @@ static const VMStateDescription vmstate_virtio_crypto = { static Property virtio_crypto_properties[] = { DEFINE_PROP_LINK("cryptodev", VirtIOCrypto, conf.cryptodev, TYPE_CRYPTODEV_BACKEND, CryptoDevBackend *), +DEFINE_PROP_BIT("no-cipher", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_ARC4, false), +DEFINE_PROP_BIT("cipher-arc4", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_ARC4, false), +DEFINE_PROP_BIT("cipher-aes-ecb", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_ECB, false), +DEFINE_PROP_BIT("cipher-aes-cbc", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_CBC, false), +DEFINE_PROP_BIT("cipher-aes-ctr", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_CTR, false), +DEFINE_PROP_BIT("cipher-des-ecb", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_DES_ECB, false), +DEFINE_PROP_BIT("cipher-3des-ecb", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_3DES_ECB, false), +DEFINE_PROP_BIT("cipher-3des-cbc", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_3DES_CBC, false), +DEFINE_PROP_BIT("cipher-3des-ctr", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_3DES_CTR, false), +DEFINE_PROP_BIT("cipher-kasumi-f8", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_KASUMI_F8, false), +DEFINE_PROP_BIT("cipher-snow3g-uea2", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_SNOW3G_UEA2, false), +DEFINE_PROP_BIT("cipher-aes-f8", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_F8, false), +DEFINE_PROP_BIT("cipher-aes-xts", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_XTS, false), +DEFINE_PROP_BIT("cipher-zuc-eea3", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_ZUC_EEA3, false), This does not scale as an approach IMHO which just reinforces to me that we shouldn't do this. I am open suggestions on better implementation. I thought defining a property bit for the virtio-crypto dev would
[Qemu-devel] [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device
The virtio-crypto driver currently propagates to the guest all the cipher algorithms that the backend cryptodev can support. But in certain cases where the guest has more performant mechanism to handle some algorithms, it would be useful to propagate only a subset of the algorithms. This patch adds support for disabling the cipher algorithms of the backend cryptodev. eg: -object cryptodev-backend-builtin,id=cryptodev0 -device virtio-crypto-ccw,id=crypto0,cryptodev=cryptodev0,cipher-aes-cbc=off Signed-off-by: Farhan Ali --- Please note this patch is not complete, and there are TODOs to handle for other types of algorithms such Hash, AEAD and MAC algorithms. This is mainly intended to get some feedback on the design approach from the community. hw/virtio/virtio-crypto.c | 46 --- include/hw/virtio/virtio-crypto.h | 3 +++ 2 files changed, 46 insertions(+), 3 deletions(-) diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c index 9a9fa49..4aed9ca 100644 --- a/hw/virtio/virtio-crypto.c +++ b/hw/virtio/virtio-crypto.c @@ -754,12 +754,22 @@ static void virtio_crypto_reset(VirtIODevice *vdev) static void virtio_crypto_init_config(VirtIODevice *vdev) { VirtIOCrypto *vcrypto = VIRTIO_CRYPTO(vdev); +uint32_t user_crypto_services = (1u << VIRTIO_CRYPTO_SERVICE_CIPHER) | +(1u << VIRTIO_CRYPTO_SERVICE_HASH) | +(1u << VIRTIO_CRYPTO_SERVICE_AEAD) | +(1u << VIRTIO_CRYPTO_SERVICE_MAC); + +if (vcrypto->user_cipher_algo_l & (1u << VIRTIO_CRYPTO_NO_CIPHER)) { +vcrypto->user_cipher_algo_l = 1u << VIRTIO_CRYPTO_NO_CIPHER; +vcrypto->user_cipher_algo_h = 0; +user_crypto_services &= ~(1u << VIRTIO_CRYPTO_SERVICE_CIPHER); +} -vcrypto->conf.crypto_services = +vcrypto->conf.crypto_services = user_crypto_services & vcrypto->conf.cryptodev->conf.crypto_services; -vcrypto->conf.cipher_algo_l = +vcrypto->conf.cipher_algo_l = vcrypto->user_cipher_algo_l & vcrypto->conf.cryptodev->conf.cipher_algo_l; -vcrypto->conf.cipher_algo_h = +vcrypto->conf.cipher_algo_h = vcrypto->user_cipher_algo_h & vcrypto->conf.cryptodev->conf.cipher_algo_h; vcrypto->conf.hash_algo = vcrypto->conf.cryptodev->conf.hash_algo; vcrypto->conf.mac_algo_l = vcrypto->conf.cryptodev->conf.mac_algo_l; @@ -853,6 +863,34 @@ static const VMStateDescription vmstate_virtio_crypto = { static Property virtio_crypto_properties[] = { DEFINE_PROP_LINK("cryptodev", VirtIOCrypto, conf.cryptodev, TYPE_CRYPTODEV_BACKEND, CryptoDevBackend *), +DEFINE_PROP_BIT("no-cipher", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_ARC4, false), +DEFINE_PROP_BIT("cipher-arc4", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_ARC4, false), +DEFINE_PROP_BIT("cipher-aes-ecb", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_ECB, false), +DEFINE_PROP_BIT("cipher-aes-cbc", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_CBC, false), +DEFINE_PROP_BIT("cipher-aes-ctr", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_CTR, false), +DEFINE_PROP_BIT("cipher-des-ecb", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_DES_ECB, false), +DEFINE_PROP_BIT("cipher-3des-ecb", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_3DES_ECB, false), +DEFINE_PROP_BIT("cipher-3des-cbc", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_3DES_CBC, false), +DEFINE_PROP_BIT("cipher-3des-ctr", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_3DES_CTR, false), +DEFINE_PROP_BIT("cipher-kasumi-f8", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_KASUMI_F8, false), +DEFINE_PROP_BIT("cipher-snow3g-uea2", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_SNOW3G_UEA2, false), +DEFINE_PROP_BIT("cipher-aes-f8", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_F8, false), +DEFINE_PROP_BIT("cipher-aes-xts", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_AES_XTS, false), +DEFINE_PROP_BIT("cipher-zuc-eea3", VirtIOCrypto, user_cipher_algo_l, +VIRTIO_CRYPTO_CIPHER_ZUC_EEA3, false), DEFINE_PROP_END_OF_LIST(), }; @@ -974,6 +1012,8 @@ static void virtio_crypto_instance_init(Objec
Re: [Qemu-devel] [RFC v1 2/2] crypto/virtio-crypto: Register an algo only if it's supported
On 06/11/2018 04:48 AM, Gonglei (Arei) wrote: -Original Message- From: Farhan Ali [mailto:al...@linux.ibm.com] Sent: Saturday, June 09, 2018 3:09 AM To: linux-ker...@vger.kernel.org; k...@vger.kernel.org Cc: m...@redhat.com; qemu-devel@nongnu.org; Gonglei (Arei) ; longpeng ; pa...@linux.ibm.com; fran...@linux.ibm.com; borntrae...@de.ibm.com; al...@linux.ibm.com Subject: [RFC v1 2/2] crypto/virtio-crypto: Register an algo only if it's supported From: Farhan Ali Register a crypto algo with the Linux crypto layer only if the algorithm is supported by the backend virtio-crypto device. Also route crypto requests to a virtio-crypto device, only if it can support the requested service and algorithm. Signed-off-by: Farhan Ali --- drivers/crypto/virtio/virtio_crypto_algs.c | 110 ++- drivers/crypto/virtio/virtio_crypto_common.h | 11 ++- drivers/crypto/virtio/virtio_crypto_mgr.c| 81 ++-- 3 files changed, 158 insertions(+), 44 deletions(-) diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c index ba190cf..fef112a 100644 --- a/drivers/crypto/virtio/virtio_crypto_algs.c +++ b/drivers/crypto/virtio/virtio_crypto_algs.c @@ -49,12 +49,18 @@ struct virtio_crypto_sym_request { bool encrypt; }; +struct virtio_crypto_algo { + uint32_t algonum; + uint32_t service; + unsigned int active_devs; + struct crypto_alg algo; +}; + /* * The algs_lock protects the below global virtio_crypto_active_devs * and crypto algorithms registion. */ static DEFINE_MUTEX(algs_lock); -static unsigned int virtio_crypto_active_devs; static void virtio_crypto_ablkcipher_finalize_req( struct virtio_crypto_sym_request *vc_sym_req, struct ablkcipher_request *req, @@ -312,13 +318,19 @@ static int virtio_crypto_ablkcipher_setkey(struct crypto_ablkcipher *tfm, unsigned int keylen) { struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(tfm); + uint32_t alg; int ret; + ret = virtio_crypto_alg_validate_key(keylen, ); + if (ret) + return ret; + if (!ctx->vcrypto) { /* New key */ int node = virtio_crypto_get_current_node(); struct virtio_crypto *vcrypto = - virtcrypto_get_dev_node(node); + virtcrypto_get_dev_node(node, + VIRTIO_CRYPTO_SERVICE_CIPHER, alg); if (!vcrypto) { pr_err("virtio_crypto: Could not find a virtio device in the system\n"); We'd better change the above error message now. What about: " virtio_crypto: Could not find a virtio device in the system or unsupported algo" ? Regards, -Gonglei Sure, I will update the error message. But other than that does the rest of the code looks good to you? Thanks Farhan
Re: [Qemu-devel] [RFC v1 1/2] crypto/virtio-crypto: Read crypto services and algorithm masks
Hi Arei On 06/11/2018 02:43 AM, Gonglei (Arei) wrote: -Original Message- From: Farhan Ali [mailto:al...@linux.ibm.com] Sent: Saturday, June 09, 2018 3:09 AM To: linux-ker...@vger.kernel.org; k...@vger.kernel.org Cc: m...@redhat.com; qemu-devel@nongnu.org; Gonglei (Arei) ; longpeng ; pa...@linux.ibm.com; fran...@linux.ibm.com; borntrae...@de.ibm.com; al...@linux.ibm.com Subject: [RFC v1 1/2] crypto/virtio-crypto: Read crypto services and algorithm masks Read the crypto services and algorithm masks which provides information about the services and algorithms supported by virtio-crypto backend. Signed-off-by: Farhan Ali --- drivers/crypto/virtio/virtio_crypto_common.h | 14 ++ drivers/crypto/virtio/virtio_crypto_core.c | 29 2 files changed, 43 insertions(+) diff --git a/drivers/crypto/virtio/virtio_crypto_common.h b/drivers/crypto/virtio/virtio_crypto_common.h index 66501a5..05eca12e 100644 --- a/drivers/crypto/virtio/virtio_crypto_common.h +++ b/drivers/crypto/virtio/virtio_crypto_common.h @@ -55,6 +55,20 @@ struct virtio_crypto { /* Number of queue currently used by the driver */ u32 curr_queue; + /* +* Specifies the services mask which the device support, +* see VIRTIO_CRYPTO_SERVICE_* above +*/ Pls update the above comments. Except that: Acked-by: Gonglei Sure will update the comment. How about " Specifies the services mask which the device support, * see VIRTIO_CRYPTO_SERVICE_*" ? or should I specify the file where the VIRTIO_CRYPTO_SERVICE_* are defined? Thanks Farhan + u32 crypto_services; + + /* Detailed algorithms mask */ + u32 cipher_algo_l; + u32 cipher_algo_h; + u32 hash_algo; + u32 mac_algo_l; + u32 mac_algo_h; + u32 aead_algo; + /* Maximum length of cipher key */ u32 max_cipher_key_len; /* Maximum length of authenticated key */ diff --git a/drivers/crypto/virtio/virtio_crypto_core.c b/drivers/crypto/virtio/virtio_crypto_core.c index 8332698..8f745f2 100644 --- a/drivers/crypto/virtio/virtio_crypto_core.c +++ b/drivers/crypto/virtio/virtio_crypto_core.c @@ -303,6 +303,13 @@ static int virtcrypto_probe(struct virtio_device *vdev) u32 max_data_queues = 0, max_cipher_key_len = 0; u32 max_auth_key_len = 0; u64 max_size = 0; + u32 cipher_algo_l = 0; + u32 cipher_algo_h = 0; + u32 hash_algo = 0; + u32 mac_algo_l = 0; + u32 mac_algo_h = 0; + u32 aead_algo = 0; + u32 crypto_services = 0; if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) return -ENODEV; @@ -339,6 +346,20 @@ static int virtcrypto_probe(struct virtio_device *vdev) max_auth_key_len, _auth_key_len); virtio_cread(vdev, struct virtio_crypto_config, max_size, _size); + virtio_cread(vdev, struct virtio_crypto_config, + crypto_services, _services); + virtio_cread(vdev, struct virtio_crypto_config, + cipher_algo_l, _algo_l); + virtio_cread(vdev, struct virtio_crypto_config, + cipher_algo_h, _algo_h); + virtio_cread(vdev, struct virtio_crypto_config, + hash_algo, _algo); + virtio_cread(vdev, struct virtio_crypto_config, + mac_algo_l, _algo_l); + virtio_cread(vdev, struct virtio_crypto_config, + mac_algo_h, _algo_h); + virtio_cread(vdev, struct virtio_crypto_config, + aead_algo, _algo); /* Add virtio crypto device to global table */ err = virtcrypto_devmgr_add_dev(vcrypto); @@ -358,6 +379,14 @@ static int virtcrypto_probe(struct virtio_device *vdev) vcrypto->max_cipher_key_len = max_cipher_key_len; vcrypto->max_auth_key_len = max_auth_key_len; vcrypto->max_size = max_size; + vcrypto->crypto_services = crypto_services; + vcrypto->cipher_algo_l = cipher_algo_l; + vcrypto->cipher_algo_h = cipher_algo_h; + vcrypto->mac_algo_l = mac_algo_l; + vcrypto->mac_algo_h = mac_algo_h; + vcrypto->hash_algo = hash_algo; + vcrypto->aead_algo = aead_algo; + dev_info(>dev, "max_queues: %u, max_cipher_key_len: %u, max_auth_key_len: %u, max_size 0x%llx\n", -- 2.7.4
[Qemu-devel] [RFC v1 2/2] crypto/virtio-crypto: Register an algo only if it's supported
From: Farhan Ali Register a crypto algo with the Linux crypto layer only if the algorithm is supported by the backend virtio-crypto device. Also route crypto requests to a virtio-crypto device, only if it can support the requested service and algorithm. Signed-off-by: Farhan Ali --- drivers/crypto/virtio/virtio_crypto_algs.c | 110 ++- drivers/crypto/virtio/virtio_crypto_common.h | 11 ++- drivers/crypto/virtio/virtio_crypto_mgr.c| 81 ++-- 3 files changed, 158 insertions(+), 44 deletions(-) diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c index ba190cf..fef112a 100644 --- a/drivers/crypto/virtio/virtio_crypto_algs.c +++ b/drivers/crypto/virtio/virtio_crypto_algs.c @@ -49,12 +49,18 @@ struct virtio_crypto_sym_request { bool encrypt; }; +struct virtio_crypto_algo { + uint32_t algonum; + uint32_t service; + unsigned int active_devs; + struct crypto_alg algo; +}; + /* * The algs_lock protects the below global virtio_crypto_active_devs * and crypto algorithms registion. */ static DEFINE_MUTEX(algs_lock); -static unsigned int virtio_crypto_active_devs; static void virtio_crypto_ablkcipher_finalize_req( struct virtio_crypto_sym_request *vc_sym_req, struct ablkcipher_request *req, @@ -312,13 +318,19 @@ static int virtio_crypto_ablkcipher_setkey(struct crypto_ablkcipher *tfm, unsigned int keylen) { struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(tfm); + uint32_t alg; int ret; + ret = virtio_crypto_alg_validate_key(keylen, ); + if (ret) + return ret; + if (!ctx->vcrypto) { /* New key */ int node = virtio_crypto_get_current_node(); struct virtio_crypto *vcrypto = - virtcrypto_get_dev_node(node); + virtcrypto_get_dev_node(node, + VIRTIO_CRYPTO_SERVICE_CIPHER, alg); if (!vcrypto) { pr_err("virtio_crypto: Could not find a virtio device in the system\n"); return -ENODEV; @@ -571,57 +583,85 @@ static void virtio_crypto_ablkcipher_finalize_req( virtcrypto_clear_request(_sym_req->base); } -static struct crypto_alg virtio_crypto_algs[] = { { - .cra_name = "cbc(aes)", - .cra_driver_name = "virtio_crypto_aes_cbc", - .cra_priority = 150, - .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC, - .cra_blocksize = AES_BLOCK_SIZE, - .cra_ctxsize = sizeof(struct virtio_crypto_ablkcipher_ctx), - .cra_alignmask = 0, - .cra_module = THIS_MODULE, - .cra_type = _ablkcipher_type, - .cra_init = virtio_crypto_ablkcipher_init, - .cra_exit = virtio_crypto_ablkcipher_exit, - .cra_u = { - .ablkcipher = { - .setkey = virtio_crypto_ablkcipher_setkey, - .decrypt = virtio_crypto_ablkcipher_decrypt, - .encrypt = virtio_crypto_ablkcipher_encrypt, - .min_keysize = AES_MIN_KEY_SIZE, - .max_keysize = AES_MAX_KEY_SIZE, - .ivsize = AES_BLOCK_SIZE, +static struct virtio_crypto_algo virtio_crypto_algs[] = { { + .algonum = VIRTIO_CRYPTO_CIPHER_AES_CBC, + .service = VIRTIO_CRYPTO_SERVICE_CIPHER, + .algo = { + .cra_name = "cbc(aes)", + .cra_driver_name = "virtio_crypto_aes_cbc", + .cra_priority = 150, + .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct virtio_crypto_ablkcipher_ctx), + .cra_alignmask = 0, + .cra_module = THIS_MODULE, + .cra_type = _ablkcipher_type, + .cra_init = virtio_crypto_ablkcipher_init, + .cra_exit = virtio_crypto_ablkcipher_exit, + .cra_u = { + .ablkcipher = { + .setkey = virtio_crypto_ablkcipher_setkey, + .decrypt = virtio_crypto_ablkcipher_decrypt, + .encrypt = virtio_crypto_ablkcipher_encrypt, + .min_keysize = AES_MIN_KEY_SIZE, + .max_keysize = AES_MAX_KEY_SIZE, + .ivsize = AES_BLOCK_SIZE, + }, }, }, } }; -int virtio_crypto_algs_register(void) +int virtio_crypto_algs_register(struct virtio_crypto *vcrypto) { int ret = 0; + int i = 0; mutex_lock(_lock); - if (++virtio_crypto_active_devs != 1) - goto unlock;
[Qemu-devel] [RFC v1 1/2] crypto/virtio-crypto: Read crypto services and algorithm masks
Read the crypto services and algorithm masks which provides information about the services and algorithms supported by virtio-crypto backend. Signed-off-by: Farhan Ali --- drivers/crypto/virtio/virtio_crypto_common.h | 14 ++ drivers/crypto/virtio/virtio_crypto_core.c | 29 2 files changed, 43 insertions(+) diff --git a/drivers/crypto/virtio/virtio_crypto_common.h b/drivers/crypto/virtio/virtio_crypto_common.h index 66501a5..05eca12e 100644 --- a/drivers/crypto/virtio/virtio_crypto_common.h +++ b/drivers/crypto/virtio/virtio_crypto_common.h @@ -55,6 +55,20 @@ struct virtio_crypto { /* Number of queue currently used by the driver */ u32 curr_queue; + /* +* Specifies the services mask which the device support, +* see VIRTIO_CRYPTO_SERVICE_* above +*/ + u32 crypto_services; + + /* Detailed algorithms mask */ + u32 cipher_algo_l; + u32 cipher_algo_h; + u32 hash_algo; + u32 mac_algo_l; + u32 mac_algo_h; + u32 aead_algo; + /* Maximum length of cipher key */ u32 max_cipher_key_len; /* Maximum length of authenticated key */ diff --git a/drivers/crypto/virtio/virtio_crypto_core.c b/drivers/crypto/virtio/virtio_crypto_core.c index 8332698..8f745f2 100644 --- a/drivers/crypto/virtio/virtio_crypto_core.c +++ b/drivers/crypto/virtio/virtio_crypto_core.c @@ -303,6 +303,13 @@ static int virtcrypto_probe(struct virtio_device *vdev) u32 max_data_queues = 0, max_cipher_key_len = 0; u32 max_auth_key_len = 0; u64 max_size = 0; + u32 cipher_algo_l = 0; + u32 cipher_algo_h = 0; + u32 hash_algo = 0; + u32 mac_algo_l = 0; + u32 mac_algo_h = 0; + u32 aead_algo = 0; + u32 crypto_services = 0; if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) return -ENODEV; @@ -339,6 +346,20 @@ static int virtcrypto_probe(struct virtio_device *vdev) max_auth_key_len, _auth_key_len); virtio_cread(vdev, struct virtio_crypto_config, max_size, _size); + virtio_cread(vdev, struct virtio_crypto_config, + crypto_services, _services); + virtio_cread(vdev, struct virtio_crypto_config, + cipher_algo_l, _algo_l); + virtio_cread(vdev, struct virtio_crypto_config, + cipher_algo_h, _algo_h); + virtio_cread(vdev, struct virtio_crypto_config, + hash_algo, _algo); + virtio_cread(vdev, struct virtio_crypto_config, + mac_algo_l, _algo_l); + virtio_cread(vdev, struct virtio_crypto_config, + mac_algo_h, _algo_h); + virtio_cread(vdev, struct virtio_crypto_config, + aead_algo, _algo); /* Add virtio crypto device to global table */ err = virtcrypto_devmgr_add_dev(vcrypto); @@ -358,6 +379,14 @@ static int virtcrypto_probe(struct virtio_device *vdev) vcrypto->max_cipher_key_len = max_cipher_key_len; vcrypto->max_auth_key_len = max_auth_key_len; vcrypto->max_size = max_size; + vcrypto->crypto_services = crypto_services; + vcrypto->cipher_algo_l = cipher_algo_l; + vcrypto->cipher_algo_h = cipher_algo_h; + vcrypto->mac_algo_l = mac_algo_l; + vcrypto->mac_algo_h = mac_algo_h; + vcrypto->hash_algo = hash_algo; + vcrypto->aead_algo = aead_algo; + dev_info(>dev, "max_queues: %u, max_cipher_key_len: %u, max_auth_key_len: %u, max_size 0x%llx\n", -- 2.7.4
[Qemu-devel] [RFC v1 0/2] Detect & register virtio-crypto algos only if it can be supported by backend
Hi, Currently the Linux virtio-crypto driver registers the crypto algorithm without verifying if the backend actually supports the algorithm. This kernel patch series adds support for registering algorithm with Linux crypto layer, only if the algorithm is supported by the backend device. This also makes the driver more compliant with the virtio-crypto spec [1]. I would appreciate any feedback or comments on this. Thank you Farhan Reference - [1] Virtio crypto spec proposal https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg00816.html Farhan Ali (2): crypto/virtio-crypto: Read crypto services and algorithm masks crypto/virtio-crypto: Register an algo only if it's supported drivers/crypto/virtio/virtio_crypto_algs.c | 110 ++- drivers/crypto/virtio/virtio_crypto_common.h | 25 +- drivers/crypto/virtio/virtio_crypto_core.c | 29 +++ drivers/crypto/virtio/virtio_crypto_mgr.c| 81 ++-- 4 files changed, 201 insertions(+), 44 deletions(-) -- 2.7.4
Re: [Qemu-devel] [PATCH 2/3] pc-bios/s390-ccw/net: Add support for pxelinux-style config files
On 05/31/2018 11:21 PM, Thomas Huth wrote: Just a question do we want to clear cfgbuf here, before calling pxelinux_load_parse_cfg? That's theoretically not necessary. The buffer either gets populated with data, or the function errors out. The code also makes sure that there is a final NUL-character in the buffer: https://github.com/aik/SLOF/blob/64c526a/lib/libnet/pxelinux.c#L169 ... but I think I've got to double check that there is also a NUL-character immediately at the end of the downloaded data ... so there's indeed a change required, but likely rather in the SLOF code than here. Thomas Can't we do that in net_try_direct_tftp_load, or it is better to put that in SLOF code?
Re: [Qemu-devel] [PATCH 3/3] pc-bios/s390-ccw/net: Try to load pxelinux.cfg file accoring to the UUID
Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH 2/3] pc-bios/s390-ccw/net: Add support for pxelinux-style config files
On 05/30/2018 05:16 AM, Thomas Huth wrote: Since it is quite cumbersome to manually create a combined kernel with initrd image for network booting, we now support loading via pxelinux configuration files, too. In these files, the kernel, initrd and command line parameters can be specified seperately, and the firmware then takes care of glueing everything together in memory after the files have been downloaded. See this URL for details about the config file layout: https://www.syslinux.org/wiki/index.php?title=PXELINUX The user can either specify a config file directly as bootfile via DHCP (but in this case, the file has to start either with "default" or a "#" comment so we can distinguish it from binary kernels), or a folder (i.e. the bootfile name must end with "/") where the firmware should look for the typical pxelinux.cfg file names, e.g. based on MAC or IP address. We also support the pxelinux.cfg DHCP options 209 and 210 from RFC 5071. Signed-off-by: Thomas Huth --- pc-bios/s390-ccw/netboot.mak | 7 ++-- pc-bios/s390-ccw/netmain.c | 79 +++- 2 files changed, 82 insertions(+), 4 deletions(-) diff --git a/pc-bios/s390-ccw/netboot.mak b/pc-bios/s390-ccw/netboot.mak index a73be36..8af0cfd 100644 --- a/pc-bios/s390-ccw/netboot.mak +++ b/pc-bios/s390-ccw/netboot.mak @@ -25,8 +25,9 @@ CTYPE_OBJS = isdigit.o isxdigit.o toupper.o %.o : $(SLOF_DIR)/lib/libc/ctype/%.c $(call quiet-command,$(CC) $(LIBC_CFLAGS) -c -o $@ $<,"CC","$(TARGET_DIR)$@") -STRING_OBJS = strcat.o strchr.o strcmp.o strcpy.o strlen.o strncmp.o strncpy.o \ - strstr.o memset.o memcpy.o memmove.o memcmp.o +STRING_OBJS = strcat.o strchr.o strrchr.o strcpy.o strlen.o strncpy.o \ + strcmp.o strncmp.o strcasecmp.o strncasecmp.o strstr.o \ + memset.o memcpy.o memmove.o memcmp.o %.o : $(SLOF_DIR)/lib/libc/string/%.c $(call quiet-command,$(CC) $(LIBC_CFLAGS) -c -o $@ $<,"CC","$(TARGET_DIR)$@") @@ -50,7 +51,7 @@ libc.a: $(LIBCOBJS) # libnet files: LIBNETOBJS := args.o dhcp.o dns.o icmpv6.o ipv6.o tcp.o udp.o bootp.o \ - dhcpv6.o ethernet.o ipv4.o ndp.o tftp.o + dhcpv6.o ethernet.o ipv4.o ndp.o tftp.o pxelinux.o LIBNETCFLAGS := $(QEMU_CFLAGS) -DDHCPARCH=0x1F $(LIBC_INC) $(LIBNET_INC) %.o : $(SLOF_DIR)/lib/libnet/%.c diff --git a/pc-bios/s390-ccw/netmain.c b/pc-bios/s390-ccw/netmain.c index 7533cf7..e84bb2b 100644 --- a/pc-bios/s390-ccw/netmain.c +++ b/pc-bios/s390-ccw/netmain.c @@ -30,6 +30,7 @@ #include #include #include +#include #include "s390-ccw.h" #include "virtio.h" @@ -41,12 +42,14 @@ extern char _start[]; #define KERNEL_ADDR ((void *)0L) #define KERNEL_MAX_SIZE ((long)_start) +#define ARCH_COMMAND_LINE_SIZE 896 /* Taken from Linux kernel */ char stack[PAGE_SIZE * 8] __attribute__((aligned(PAGE_SIZE))); IplParameterBlock iplb __attribute__((aligned(PAGE_SIZE))); static char cfgbuf[2048]; static SubChannelId net_schid = { .one = 1 }; +static uint8_t mac[6]; static uint64_t dest_timer; static uint64_t get_timer_ms(void) @@ -158,7 +161,6 @@ static int tftp_load(filename_ip_t *fnip, void *buffer, int len) static int net_init(filename_ip_t *fn_ip) { -uint8_t mac[6]; int rc; memset(fn_ip, 0, sizeof(filename_ip_t)); @@ -233,6 +235,66 @@ static void net_release(filename_ip_t *fn_ip) } /** + * Load a kernel with initrd (i.e. with the information that we've got from + * a pxelinux.cfg config file) + */ +static int load_kernel_with_initrd(filename_ip_t *fn_ip, + struct pl_cfg_entry *entry) +{ +int rc; + +printf("Loading pxelinux.cfg entry '%s'\n", entry->label); + +if (!entry->kernel) { +printf("Kernel entry is missing!\n"); +return -1; +} + +strncpy(fn_ip->filename, entry->kernel, sizeof(fn_ip->filename)); +rc = tftp_load(fn_ip, KERNEL_ADDR, KERNEL_MAX_SIZE); +if (rc < 0) { +return rc; +} + +if (entry->initrd) { +uint64_t iaddr = (rc + 0xfff) & ~0xfffUL; + +strncpy(fn_ip->filename, entry->initrd, sizeof(fn_ip->filename)); +rc = tftp_load(fn_ip, (void *)iaddr, KERNEL_MAX_SIZE - iaddr); +if (rc < 0) { +return rc; +} +/* Patch location and size: */ +*(uint64_t *)0x10408 = iaddr; +*(uint64_t *)0x10410 = rc; +rc += iaddr; +} + +if (entry->append) { +strncpy((char *)0x10480, entry->append, ARCH_COMMAND_LINE_SIZE); +} + +return rc; +} + +#define MAX_PXELINUX_ENTRIES 16 + +static int net_try_pxelinux_cfg(filename_ip_t *fn_ip) +{ +struct pl_cfg_entry entries[MAX_PXELINUX_ENTRIES]; +int num_ent, def_ent = 0; + +num_ent = pxelinux_load_parse_cfg(fn_ip, mac, NULL, DEFAULT_TFTP_RETRIES, + cfgbuf, sizeof(cfgbuf), +
Re: [Qemu-devel] [PATCH] pc-bios/s390-ccw: define loadparm length
On 05/29/2018 12:40 AM, Collin Walling wrote: Loadparm is defined by the s390 architecture to be 8 bytes in length. Let's define this size in the s390-ccw bios. Suggested-by: Laszlo Ersek Signed-off-by: Collin Walling --- pc-bios/s390-ccw/iplb.h | 4 +++- pc-bios/s390-ccw/main.c | 8 pc-bios/s390-ccw/sclp.c | 2 +- pc-bios/s390-ccw/sclp.h | 2 +- 4 files changed, 9 insertions(+), 7 deletions(-) Reviewed-by: Farhan Ali
Re: [Qemu-devel] [PATCH] s390-ccw: force diag 308 subcode to unsigned long
On 05/03/2018 11:44 AM, Cornelia Huck wrote: On Thu, 3 May 2018 11:25:08 -0400 Farhan Ali <al...@linux.ibm.com> wrote: On 05/02/2018 08:52 AM, Cornelia Huck wrote: We currently pass an integer as the subcode parameter. However, the upper bits of the register containing the subcode need to be 0, which is not guaranteed unless we explicitly specify the subcode to be an unsigned long value. Fixes: d046c51dad3 ("pc-bios/s390-ccw: Get device address via diag 308/6") Cc:qemu-sta...@nongnu.org Signed-off-by: Cornelia Huck<coh...@redhat.com> Sorry for my ignorance, but is there a C standard that says upper bits of an int is not guaranteed to be 0? The value (5 resp. 6) is small enough to fit into a regular integer, and the compiler generated a lhi for the load, which did not change any upper values that might have been in the register previously. Telling the compiler to treat the value as an unsigned long makes it generate a lghi. This makes sense now. Thanks for the explanation :)
Re: [Qemu-devel] [PATCH] s390-ccw: force diag 308 subcode to unsigned long
On 05/03/2018 11:48 AM, Eric Blake wrote: On 05/03/2018 10:25 AM, Farhan Ali wrote: On 05/02/2018 08:52 AM, Cornelia Huck wrote: We currently pass an integer as the subcode parameter. However, the upper bits of the register containing the subcode need to be 0, which is not guaranteed unless we explicitly specify the subcode to be an unsigned long value. Fixes: d046c51dad3 ("pc-bios/s390-ccw: Get device address via diag 308/6") Cc:qemu-sta...@nongnu.org Signed-off-by: Cornelia Huck<coh...@redhat.com> Sorry for my ignorance, but is there a C standard that says upper bits of an int is not guaranteed to be 0? We're outside the bounds of the C standard because of the use of asm(). The problem here is that the compiler assigning a 32-bit int into a 64-bit register uses the shortest sequence possible (leaving the upper 64 bits garbage), because the compiler assumes you correctly wrote the assembly to only use 32-bit operations on that register (which don't care about the upper bits). By using an unsigned long (a 64-bit value), the compiler instead emits assembly to write the full 64-bit register value, rather than leaving the upper bits as garbage; and this matters because we are subsequently using all 64 bits of the register in a later operation. We could also use a signed long, even long long, or written it as: (store ? 6ULL : 5ULL) instead of using a temporary variable. The crux of the fix is that you have to tell asm() that you want a 64-bit value written (the unpatched (store ? 6 : 5) is only a 32-bit value), and not whether that value is signed or unsigned (since the representation of both 6 and 5 are the same regardless of whether the type being written into the register is signed or not). Thank you so much for the detailed explanation :). I did not think about the instruction that will be used by the compiler to handle the values. Definitely learned something new!
Re: [Qemu-devel] [PATCH] s390-ccw: force diag 308 subcode to unsigned long
On 05/02/2018 08:52 AM, Cornelia Huck wrote: We currently pass an integer as the subcode parameter. However, the upper bits of the register containing the subcode need to be 0, which is not guaranteed unless we explicitly specify the subcode to be an unsigned long value. Fixes: d046c51dad3 ("pc-bios/s390-ccw: Get device address via diag 308/6") Cc:qemu-sta...@nongnu.org Signed-off-by: Cornelia HuckSorry for my ignorance, but is there a C standard that says upper bits of an int is not guaranteed to be 0? Thanks Farhan
Re: [Qemu-devel] [PATCH v3 1/3] pc-bios/s390-ccw/net: Split up net_load() into init, load and uninit parts
On 04/25/2018 05:08 AM, Thomas Huth wrote: When we want to support pxelinux-style network booting later, we've got to do several TFTP transfers - and we do not want to apply for a new IP address via DHCP each time. So split up net_load into three parts: 1. net_init(), which initializes virtio-net, gets an IP address via DHCP and prints out the related information. 2. The tftp_load call is now moved directly into the main() function 3. A new net_release() function which should tear down the network stack before we are done in the firmware. This will make it easier to extend the code in the next patches. Signed-off-by: Thomas HuthYou might want to change the commit header "uninit parts" to "release parts" :)
Re: [Qemu-devel] [qemu-s390x] [PATCH v1 for-2.13 0/4] pc-bios/s390-ccw: Network boot improvements
On 04/19/2018 01:27 AM, Thomas Huth wrote: Ah, that's not a memory leak, though it might look like one if you don't know what the vring_send_buf function is doing: vring_send_buf adds a buffer to a virtio ring for later use. So the buffer is not unused after this function, but gets filled in later by the virtio-net device. Thus we must not free the buffer here. We could free it in the "uninit" function after shutting down the device, but OTOH we are then leaving the firmware code anyway by jumping into the OS kernel, so it does not really matter whether we free()ed the buffers or not. Thomas That makes a lot of sense now :) thanks a lot for that explanation and sorry for my ignorance.
Re: [Qemu-devel] [PATCH v1 for-2.13 0/4] pc-bios/s390-ccw: Network boot improvements
On 04/18/2018 08:31 AM, Thomas Huth wrote: Some patches to improve the network boot experience on s390x: First, make sure that we shut down the virtio-net device before jumping into the kernel. Otherwise some incoming packets might destroy some of the kernel's data if it has not taken over the device yet. Then the last two patches add support for loading kernels via configuration files - pxelinux-style and .INS-file style. This way you don't have to manually glue your ramdisk to your kernel anymore, so this should be quite a relieve for all users who want to boot Linux via the network. The config file parsers have been completely written by myself from scratch and only tested with some config files that I came up with on my own. So if anybody has some pre-existing pxelinux config files already for booting a s390x, I'd appreciate some testing to see whether this works as expected for you, too! Thomas Huth (4): pc-bios/s390-ccw/net: Split up net_load() into init, load and uninit parts pc-bios/s390-ccw/net: Stop virtio-net device before jumping into the OS pc-bios/s390-ccw/net: Add support for pxelinux-style config files pc-bios/s390-ccw/net: Add support for .INS config files pc-bios/s390-ccw/netboot.mak | 5 +- pc-bios/s390-ccw/netmain.c| 312 ++ pc-bios/s390-ccw/virtio-net.c | 8 ++ pc-bios/s390-ccw/virtio.c | 19 ++- pc-bios/s390-ccw/virtio.h | 3 + 5 files changed, 312 insertions(+), 35 deletions(-) I have tried with pxelinux default config file I had and it worked fine. I will try a couple more tests. Also while going through the code again, I noticed a memory leak in the function virtio_net_init, and fixed it. I could send a formal patch for it, if you want to queue it as part of this series as I think we might have missed 2.12 window? diff --git a/pc-bios/s390-ccw/virtio-net.c b/pc-bios/s390-ccw/virtio-net.c index ff7f4da..e83ba08 100644 --- a/pc-bios/s390-ccw/virtio-net.c +++ b/pc-bios/s390-ccw/virtio-net.c @@ -61,6 +61,7 @@ int virtio_net_init(void *mac_addr) IPL_assert(buf != NULL, "Can not allocate memory for receive buffers"); vring_send_buf(rxvq, buf, ETH_MTU_SIZE + sizeof(VirtioNetHdr), VRING_DESC_F_WRITE); +free(buf); } vring_notify(rxvq);
Re: [Qemu-devel] [PATCH v1 for-2.13 1/4] pc-bios/s390-ccw/net: Split up net_load() into init, load and uninit parts
On 04/18/2018 08:31 AM, Thomas Huth wrote: When we want to support pxelinux-style network booting later, we've got to do several TFTP transfers - and we do not want to apply for a new IP address via DHCP each time. So split up net_load into three parts: 1. net_init(), which initializes virtio-net, gets an IP address via DHCP and prints out the related information. 2. The tftp_load call is now moved directly into the main() function 3. A new net_uninit() function which should tear down the network stack before we are done in the firmware. This will make it easier to extend the code in the next patches. Signed-off-by: Thomas HuthJust a minor nit, if we could rename *_uninit functions to destroy/release? I think it's just easier to read
Re: [Qemu-devel] [PATCH 2/4] pc-bios/s390-ccw: fix loadparm initialization and int conversion
On 04/12/2018 04:57 PM, Collin Walling wrote: On 04/12/2018 02:57 PM, Thomas Huth wrote: On 10.04.2018 17:01, Collin Walling wrote: Rename the loadparm char array in main.c to loadparm_str and increase the size by one byte to account for a null termination when converting the loadparm string to an int via atoui. Also allow the boot menu to be enabled when loadparm is set to an empty string or a series of spaces. Signed-off-by: Collin WallingReported-by: Vasily Gorbik --- hw/s390x/ipl.c | 2 ++ pc-bios/s390-ccw/main.c | 14 +++--- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c index fdeaec3..23b5b54 100644 --- a/hw/s390x/ipl.c +++ b/hw/s390x/ipl.c @@ -352,6 +352,8 @@ int s390_ipl_set_loadparm(uint8_t *loadparm) loadparm[i] = ascii2ebcdic[(uint8_t) lp[i]]; } +memset(loadparm + i, 0x40, 8 - i); /* fill with EBCDIC spaces */ + g_free(lp); return 0; } diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c index 9d9f8cf..26f9adf 100644 --- a/pc-bios/s390-ccw/main.c +++ b/pc-bios/s390-ccw/main.c @@ -15,11 +15,11 @@ char stack[PAGE_SIZE * 8] __attribute__((__aligned__(PAGE_SIZE))); static SubChannelId blk_schid = { .one = 1 }; IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE))); -static char loadparm[8] = { 0, 0, 0, 0, 0, 0, 0, 0 }; +static char loadparm_str[9] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }; QemuIplParameters qipl; #define LOADPARM_PROMPT "PROMPT " -#define LOADPARM_EMPTY "" +#define LOADPARM_EMPTY "" Sorry for my ignorance, but why was the old string containing dots? Thomas No need for apologies :) If -machine loadparm is *not* present on the command line, then the loadparm in the sclp will be a series of nulls. For whatever reason, that gets translated into a series of dots. It's because of the ebc2asc table we use for conversion, which results in the dots when converting from ebcdic_to_ascii. If -machine loadparm="" is present, then loadparm will in the sclp will be a series of spaces. > We want to enable the boot menu for both of these cases and, to make things easier, this patch replaces any nulls (dots) to spaces when setting loadparm so that we only have to handle one case instead of two within the bios.
Re: [Qemu-devel] [PATCH 2/3] s390: Ensure IPL from SCSI works as expected
On 04/05/2018 11:07 AM, Viktor Mihajlovski wrote: Operating systems may request an IPL from a virtio-scsi device by specifying an IPL parameter type of CCW. In this case QEMU won't set up the IPLB correctly. The BIOS will still detect it's a SCSI device to boot from, but it will now have to search for the first LUN and attempt to boot from there. However this may not be the original boot LUN if there's more than one SCSI disk attached to the HBA. With this change QEMU will detect that the request is for a SCSI device and will rebuild the initial IPL parameter info if it's the SCSI device used for the first boot. In consequence the BIOS can use the boot LUN from the IPL information block. In case a different SCSI device has been set, the BIOS will find and use the first available LUN. Signed-off-by: Viktor Mihajlovski <mihaj...@linux.vnet.ibm.com> --- hw/s390x/ipl.c | 31 +-- 1 file changed, 29 insertions(+), 2 deletions(-) diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c index 58e33c5..fb554ab 100644 --- a/hw/s390x/ipl.c +++ b/hw/s390x/ipl.c @@ -427,7 +427,8 @@ unref_mr: return img_size; } -static bool is_virtio_net_device(IplParameterBlock *iplb) +static bool is_virtio_ccw_device_of_type(IplParameterBlock *iplb, + int virtio_id) { uint8_t cssid; uint8_t ssid; @@ -447,13 +448,23 @@ static bool is_virtio_net_device(IplParameterBlock *iplb) sch = css_find_subch(1, cssid, ssid, schid); if (sch && sch->devno == devno) { -return sch->id.cu_model == VIRTIO_ID_NET; +return sch->id.cu_model == virtio_id; } } } return false; } +static bool is_virtio_net_device(IplParameterBlock *iplb) +{ +return is_virtio_ccw_device_of_type(iplb, VIRTIO_ID_NET); +} + +static bool is_virtio_scsi_device(IplParameterBlock *iplb) +{ +return is_virtio_ccw_device_of_type(iplb, VIRTIO_ID_SCSI); +} + void s390_ipl_update_diag308(IplParameterBlock *iplb) { S390IPLState *ipl = get_ipl_device(); @@ -478,6 +489,22 @@ void s390_reipl_request(void) S390IPLState *ipl = get_ipl_device(); ipl->reipl_requested = true; +if (ipl->iplb_valid && +!ipl->netboot && +ipl->iplb.pbt == S390_IPL_TYPE_CCW && +is_virtio_scsi_device(>iplb)) { +CcwDevice *ccw_dev = s390_get_ccw_device(get_boot_device(0)); + +if (ccw_dev && +cpu_to_be16(ccw_dev->sch->devno) == ipl->iplb.ccw.devno && +(ccw_dev->sch->ssid & 3) == ipl->iplb.ccw.ssid) { +/* + * this is the original boot device's SCSI + * so restore IPL parameter info from it + */ +ipl->iplb_valid = s390_gen_initial_iplb(ipl); +} +} qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET); } This does feel a little hackish, but I can't think of a better way to ensure the correct SCSI device is used for booting. Reviewed-by: Farhan Ali <al...@linux.vnet.ibm.com>
Re: [Qemu-devel] [PATCH 1/3] s390: Refactor IPL parameter block generation
On 04/05/2018 11:07 AM, Viktor Mihajlovski wrote: Splitting out the the CCW device extraction allows reuse. Signed-off-by: Viktor Mihajlovski<mihaj...@linux.vnet.ibm.com> --- hw/s390x/ipl.c | 81 -- 1 file changed, 51 insertions(+), 30 deletions(-) Reviewed-by: Farhan Ali <al...@linux.vnet.ibm.com>
Re: [Qemu-devel] [BUG] I/O thread segfault for QEMU on s390x
On 03/06/2018 01:34 AM, Martin Schwidefsky wrote: On Mon, 5 Mar 2018 20:08:45 +0100 Christian Borntraegerwrote: Do you happen to run with a recent host kernel that has commit 7041d28115e91f2144f811ffe8a195c696b1e1d0 s390: scrub registers on kernel entry and KVM exit Can you run with this on top diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 13a133a6015c..d6dc0e5e8f74 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -426,13 +426,13 @@ ENTRY(system_call) UPDATE_VTIME %r8,%r9,__LC_SYNC_ENTER_TIMER BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP stmg%r0,%r7,__PT_R0(%r11) - # clear user controlled register to prevent speculative use - xgr %r0,%r0 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC stg %r14,__PT_FLAGS(%r11) .Lsysc_do_svc: + # clear user controlled register to prevent speculative use + xgr %r0,%r0 # load address of system call table lg %r10,__THREAD_sysc_table(%r13,%r12) llgh%r8,__PT_INT_CODE+2(%r11) To me it looks like that the critical section cleanup (interrupt during system call entry) might save the registers again into ptregs but we have already zeroed out r0. This patch moves the clearing of r0 after sysc_do_svc, which should fix the critical section cleanup. Adding Martin and Heiko. Will spin a patch. Argh, yes. Thanks Chrisitan, this is it. I have been searching for the bug for days now. The point is that if the system call handler is interrupted after the xgr but before .Lsysc_do_svc the code at .Lcleanup_system_call repeats the stmg for %r0-%r7 but now %r0 is already zero. Please commit a patch for this and I'll will queue it up immediately. This patch does fix the QEMU crash. I haven't seen the crash after running the test case for more than a day. Thanks to everyone for taking a look at this problem :) Thanks Farhan
Re: [Qemu-devel] [BUG] I/O thread segfault for QEMU on s390x
On 03/05/2018 02:08 PM, Christian Borntraeger wrote: Do you happen to run with a recent host kernel that has commit 7041d28115e91f2144f811ffe8a195c696b1e1d0 s390: scrub registers on kernel entry and KVM exit Yes. Can you run with this on top diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 13a133a6015c..d6dc0e5e8f74 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -426,13 +426,13 @@ ENTRY(system_call) UPDATE_VTIME %r8,%r9,__LC_SYNC_ENTER_TIMER BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP stmg%r0,%r7,__PT_R0(%r11) - # clear user controlled register to prevent speculative use - xgr %r0,%r0 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC stg %r14,__PT_FLAGS(%r11) .Lsysc_do_svc: + # clear user controlled register to prevent speculative use + xgr %r0,%r0 # load address of system call table lg %r10,__THREAD_sysc_table(%r13,%r12) llgh%r8,__PT_INT_CODE+2(%r11) To me it looks like that the critical section cleanup (interrupt during system call entry) might save the registers again into ptregs but we have already zeroed out r0. This patch moves the clearing of r0 after sysc_do_svc, which should fix the critical section cleanup. Okay I will run with this. Adding Martin and Heiko. Will spin a patch. On 03/05/2018 07:54 PM, Christian Borntraeger wrote: On 03/05/2018 07:45 PM, Farhan Ali wrote: On 03/05/2018 06:03 AM, Stefan Hajnoczi wrote: Please include the following gdb output: (gdb) disas swapcontext (gdb) i r That way it's possible to see which instruction faulted and which registers were being accessed. here is the disas out for swapcontext, this is on a coredump with debugging symbols enabled for qemu. So the addresses from the previous dump is a little different. (gdb) disas swapcontext Dump of assembler code for function swapcontext: 0x03ff90751fb8 <+0>: lgr %r1,%r2 0x03ff90751fbc <+4>: lgr %r0,%r3 0x03ff90751fc0 <+8>: stfpc 248(%r1) 0x03ff90751fc4 <+12>: std %f0,256(%r1) 0x03ff90751fc8 <+16>: std %f1,264(%r1) 0x03ff90751fcc <+20>: std %f2,272(%r1) 0x03ff90751fd0 <+24>: std %f3,280(%r1) 0x03ff90751fd4 <+28>: std %f4,288(%r1) 0x03ff90751fd8 <+32>: std %f5,296(%r1) 0x03ff90751fdc <+36>: std %f6,304(%r1) 0x03ff90751fe0 <+40>: std %f7,312(%r1) 0x03ff90751fe4 <+44>: std %f8,320(%r1) 0x03ff90751fe8 <+48>: std %f9,328(%r1) 0x03ff90751fec <+52>: std %f10,336(%r1) 0x03ff90751ff0 <+56>: std %f11,344(%r1) 0x03ff90751ff4 <+60>: std %f12,352(%r1) 0x03ff90751ff8 <+64>: std %f13,360(%r1) 0x03ff90751ffc <+68>: std %f14,368(%r1) 0x03ff90752000 <+72>: std %f15,376(%r1) 0x03ff90752004 <+76>: slgr %r2,%r2 0x03ff90752008 <+80>: stam %a0,%a15,184(%r1) 0x03ff9075200c <+84>: stmg %r0,%r15,56(%r1) 0x03ff90752012 <+90>: la %r2,2 0x03ff90752016 <+94>: lgr %r5,%r0 0x03ff9075201a <+98>: la %r3,384(%r5) 0x03ff9075201e <+102>: la %r4,384(%r1) 0x03ff90752022 <+106>: lghi %r5,8 0x03ff90752026 <+110>: svc 175 sys_rt_sigprocmask. r0 should not be changed by the system call. 0x03ff90752028 <+112>: lgr %r5,%r0 => 0x03ff9075202c <+116>: lfpc 248(%r5) so r5 is zero and it was loaded from r0. r0 was loaded from r3 (which is the 2nd parameter to this function). Now this is odd. 0x03ff90752030 <+120>: ld %f0,256(%r5) 0x03ff90752034 <+124>: ld %f1,264(%r5) 0x03ff90752038 <+128>: ld %f2,272(%r5) 0x03ff9075203c <+132>: ld %f3,280(%r5) 0x03ff90752040 <+136>: ld %f4,288(%r5) 0x03ff90752044 <+140>: ld %f5,296(%r5) 0x03ff90752048 <+144>: ld %f6,304(%r5) 0x03ff9075204c <+148>: ld %f7,312(%r5) 0x03ff90752050 <+152>: ld %f8,320(%r5) 0x03ff90752054 <+156>: ld %f9,328(%r5) 0x03ff90752058 <+160>: ld %f10,336(%r5) 0x03ff9075205c <+164>: ld %f11,344(%r5) 0x03ff90752060 <+168>: ld %f12,352(%r5) 0x03ff90752064 <+172>: ld %f13,360(%r5) 0x03ff90752068 <+176>: ld %f14,368(%r5) 0x03ff9075206c <+180>: ld %f15,376(%r5) 0x03f