Re: [Qemu-devel] [PATCH v5] vfio-ccw: support async command subregion

2019-06-12 Thread Farhan Ali




On 06/12/2019 05:38 AM, Cornelia Huck wrote:

On Tue, 11 Jun 2019 15:33:59 -0400
Farhan Ali  wrote:


On 06/07/2019 10:53 AM, Cornelia Huck wrote:

A vfio-ccw device may provide an async command subregion for
issuing halt/clear subchannel requests. If it is present, use
it for sending halt/clear request to the device; if not, fall
back to emulation (as done today).

Signed-off-by: Cornelia Huck 
---

v4->v5:
- It seems we need to take the indirection via the class for the
callbacks after all :(
- Dropped Eric's R-b: for that reason

---
   hw/s390x/css.c  |  27 +++--
   hw/s390x/s390-ccw.c |  20 +++
   hw/vfio/ccw.c   | 112 +++-
   include/hw/s390x/css.h  |   3 +
   include/hw/s390x/s390-ccw.h |   2 +
   5 files changed, 158 insertions(+), 6 deletions(-)




@@ -309,11 +400,26 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, 
Error **errp)
   vcdev->io_region_offset = info->offset;
   vcdev->io_region = g_malloc0(info->size);
   
+/* check for the optional async command region */

+ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW,
+   VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD, );
+if (!ret) {
+vcdev->async_cmd_region_size = info->size;
+if (sizeof(*vcdev->async_cmd_region) != vcdev->async_cmd_region_size) {
+error_setg(errp, "vfio: Unexpected size of the async cmd region");
+g_free(info);
+return;
+}
+vcdev->async_cmd_region_offset = info->offset;
+vcdev->async_cmd_region = g_malloc0(info->size);
+}
+
   g_free(info);
   }
   
   static void vfio_ccw_put_region(VFIOCCWDevice *vcdev)

   {
+g_free(vcdev->async_cmd_region);
   g_free(vcdev->io_region);
   }


I think we can have a memory leak given how the code is currently
structured and how we call vfio_ccw_get_region.

vfio_ccw_get_region is called in vfio_ccw_realize. So if we return an
error from vfio_ccw_get_region, we would jump to out_region_err in
vfio_ccw_realize which would call vfio_ccw_put_device.

Now we can also return an error from vfio_ccw_get_region for the async
region, and so we might never end up freeing the io_region for which we
allocated memory successfully.

I think we would also need to change vfio_ccw_realize, no?


Indeed, you're right. I have the following change on top:

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index c9d1c76b4d04..d21ac24f743c 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -407,6 +407,7 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error 
**errp)
  vcdev->async_cmd_region_size = info->size;
  if (sizeof(*vcdev->async_cmd_region) != vcdev->async_cmd_region_size) 
{
  error_setg(errp, "vfio: Unexpected size of the async cmd region");
+g_free(vcdev->io_region);
  g_free(info);
  return;
  }


This looks good to me.



Anything else before I respin?




Nothing else that jumped out to me.

Reviewed-by: Farhan Ali 






Re: [Qemu-devel] [PATCH v5] vfio-ccw: support async command subregion

2019-06-11 Thread Farhan Ali




On 06/11/2019 07:37 AM, Cornelia Huck wrote:

On Fri, 7 Jun 2019 11:19:09 -0400
Farhan Ali  wrote:


On 06/07/2019 11:09 AM, Cornelia Huck wrote:

On Fri, 7 Jun 2019 11:02:36 -0400
Farhan Ali  wrote:
   

On 06/07/2019 10:53 AM, Cornelia Huck wrote:



diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index ad310b9f94bc..b92395f165e6 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -22,6 +22,7 @@
#include "trace.h"
#include "hw/s390x/s390_flic.h"
#include "hw/s390x/s390-virtio-ccw.h"
+#include "hw/s390x/s390-ccw.h"

typedef struct CrwContainer {

CRW crw;
@@ -1205,6 +1206,26 @@ static void sch_handle_start_func_virtual(SubchDev *sch)

}

+static void sch_handle_halt_func_passthrough(SubchDev *sch)

+{
+int ret;
+
+ret = s390_ccw_halt(sch);
+if (ret == -ENOSYS) {
+sch_handle_halt_func(sch);
+}
+}
+
+static void sch_handle_clear_func_passthrough(SubchDev *sch)
+{
+int ret;
+
+ret = s390_ccw_clear(sch);
+if (ret == -ENOSYS) {
+sch_handle_clear_func(sch);
+}
+}
+


do we need an extra s390_ccw_clear/halt functions? can't we just call
cdc->clear/halt in the passthrough functions?


I mostly added them for symmetry reasons... we still need to check for
presence of the callback in any case, though.

(vfio is not always built, e.g. on windows or os x.)



right, but if we are calling do_subchannel_work_passthrough, then we
know for sure we are building the S390CCWDevice which is the vfio
device, no?

So we could just add checks for callbacks in
sch_handle_clear/halt_func_passthrough, no?

I would even like to get rid of the s390_ccw_cmd_request if we can, but
that is out of scope for this patch. :)


Ok, I just walked through various source files (some of which are a bit
confusingly named) again and now I understand again why it was done
that way in the first place.

- hw/s390x/s390-ccw.c provides some interfaces for pass-through ccw
   devices. It is built unconditionally, and its interfaces are called
   unconditionally from the css code.
   It also provides a device class where code can hook up callbacks.
- hw/vfio/ccw.c (which is not built for !KVM) actually hooks up
   callbacks in that device class.

So, s390-ccw.c (not to be confused with ccw-device.c...) provides a
layer that makes it possible to call things unconditionally, regardless
whether we have vfio-ccw available or not. Not that the code ends up
being called without vfio-ccw support; but the class indirection
enables the code to be built.



Okay, now I get it. Thanks for the explanation, I really do appreciate 
it! :)



There's possibly a way to make this work without the class indirection
as well, but I think we'd end up doing code juggling before ending up
with something that's not really nicer than the code we have now.
Therefore, I'd prefer to keep the class handling and hook up the
halt/clear callbacks there as well.




Yeah I agree, given the constraints I don't think any code juggling 
would look any prettier.


Thanks
Farhan




Re: [Qemu-devel] [PATCH v5] vfio-ccw: support async command subregion

2019-06-11 Thread Farhan Ali




On 06/07/2019 10:53 AM, Cornelia Huck wrote:

A vfio-ccw device may provide an async command subregion for
issuing halt/clear subchannel requests. If it is present, use
it for sending halt/clear request to the device; if not, fall
back to emulation (as done today).

Signed-off-by: Cornelia Huck 
---

v4->v5:
- It seems we need to take the indirection via the class for the
   callbacks after all :(
- Dropped Eric's R-b: for that reason

---
  hw/s390x/css.c  |  27 +++--
  hw/s390x/s390-ccw.c |  20 +++
  hw/vfio/ccw.c   | 112 +++-
  include/hw/s390x/css.h  |   3 +
  include/hw/s390x/s390-ccw.h |   2 +
  5 files changed, 158 insertions(+), 6 deletions(-)

diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index ad310b9f94bc..b92395f165e6 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -22,6 +22,7 @@
  #include "trace.h"
  #include "hw/s390x/s390_flic.h"
  #include "hw/s390x/s390-virtio-ccw.h"
+#include "hw/s390x/s390-ccw.h"
  
  typedef struct CrwContainer {

  CRW crw;
@@ -1205,6 +1206,26 @@ static void sch_handle_start_func_virtual(SubchDev *sch)
  
  }
  
+static void sch_handle_halt_func_passthrough(SubchDev *sch)

+{
+int ret;
+
+ret = s390_ccw_halt(sch);
+if (ret == -ENOSYS) {
+sch_handle_halt_func(sch);
+}
+}
+
+static void sch_handle_clear_func_passthrough(SubchDev *sch)
+{
+int ret;
+
+ret = s390_ccw_clear(sch);
+if (ret == -ENOSYS) {
+sch_handle_clear_func(sch);
+}
+}
+
  static IOInstEnding sch_handle_start_func_passthrough(SubchDev *sch)
  {
  SCHIB *schib = >curr_status;
@@ -1244,11 +1265,9 @@ IOInstEnding do_subchannel_work_passthrough(SubchDev 
*sch)
  SCHIB *schib = >curr_status;
  
  if (schib->scsw.ctrl & SCSW_FCTL_CLEAR_FUNC) {

-/* TODO: Clear handling */
-sch_handle_clear_func(sch);
+sch_handle_clear_func_passthrough(sch);
  } else if (schib->scsw.ctrl & SCSW_FCTL_HALT_FUNC) {
-/* TODO: Halt handling */
-sch_handle_halt_func(sch);
+sch_handle_halt_func_passthrough(sch);
  } else if (schib->scsw.ctrl & SCSW_FCTL_START_FUNC) {
  return sch_handle_start_func_passthrough(sch);
  }
diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
index f5f025d1b6ca..951be5ab0245 100644
--- a/hw/s390x/s390-ccw.c
+++ b/hw/s390x/s390-ccw.c
@@ -29,6 +29,26 @@ IOInstEnding s390_ccw_cmd_request(SubchDev *sch)
  return cdc->handle_request(sch);
  }
  
+int s390_ccw_halt(SubchDev *sch)

+{
+S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data);
+
+if (!cdc->handle_halt) {
+return -ENOSYS;
+}
+return cdc->handle_halt(sch);
+}
+
+int s390_ccw_clear(SubchDev *sch)
+{
+S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data);
+
+if (!cdc->handle_clear) {
+return -ENOSYS;
+}
+return cdc->handle_clear(sch);
+}
+
  static void s390_ccw_get_dev_info(S390CCWDevice *cdev,
char *sysfsdev,
Error **errp)
diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index d9e39552e237..c9d1c76b4d04 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -2,9 +2,12 @@
   * vfio based subchannel assignment support
   *
   * Copyright 2017 IBM Corp.
+ * Copyright 2019 Red Hat, Inc.
+ *
   * Author(s): Dong Jia Shi 
   *Xiao Feng Ren 
   *Pierre Morel 
+ *Cornelia Huck 
   *
   * This work is licensed under the terms of the GNU GPL, version 2 or (at
   * your option) any later version. See the COPYING file in the top-level
@@ -32,6 +35,9 @@ struct VFIOCCWDevice {
  uint64_t io_region_size;
  uint64_t io_region_offset;
  struct ccw_io_region *io_region;
+uint64_t async_cmd_region_size;
+uint64_t async_cmd_region_offset;
+struct ccw_cmd_region *async_cmd_region;
  EventNotifier io_notifier;
  bool force_orb_pfch;
  bool warned_orb_pfch;
@@ -114,6 +120,87 @@ again:
  }
  }
  
+static int vfio_ccw_handle_clear(SubchDev *sch)

+{
+S390CCWDevice *cdev = sch->driver_data;
+VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev);
+struct ccw_cmd_region *region = vcdev->async_cmd_region;
+int ret;
+
+if (!vcdev->async_cmd_region) {
+/* Async command region not available, fall back to emulation */
+return -ENOSYS;
+}
+
+memset(region, 0, sizeof(*region));
+region->command = VFIO_CCW_ASYNC_CMD_CSCH;
+
+again:
+ret = pwrite(vcdev->vdev.fd, region,
+ vcdev->async_cmd_region_size, vcdev->async_cmd_region_offset);
+if (ret != vcdev->async_cmd_region_size) {
+if (errno == EAGAIN) {
+goto again;
+}
+error_report("vfio-ccw: write cmd region failed with errno=%d", errno);
+ret = -errno;
+} else {
+ret = region->ret_code;
+}
+switch (ret) {
+case 0:
+case -ENODEV:
+case -EACCES:
+

Re: [Qemu-devel] [PATCH v5] vfio-ccw: support async command subregion

2019-06-07 Thread Farhan Ali




On 06/07/2019 11:09 AM, Cornelia Huck wrote:

On Fri, 7 Jun 2019 11:02:36 -0400
Farhan Ali  wrote:


On 06/07/2019 10:53 AM, Cornelia Huck wrote:

A vfio-ccw device may provide an async command subregion for
issuing halt/clear subchannel requests. If it is present, use
it for sending halt/clear request to the device; if not, fall
back to emulation (as done today).

Signed-off-by: Cornelia Huck 
---

v4->v5:
- It seems we need to take the indirection via the class for the
callbacks after all :(
- Dropped Eric's R-b: for that reason

---
   hw/s390x/css.c  |  27 +++--
   hw/s390x/s390-ccw.c |  20 +++
   hw/vfio/ccw.c   | 112 +++-
   include/hw/s390x/css.h  |   3 +
   include/hw/s390x/s390-ccw.h |   2 +
   5 files changed, 158 insertions(+), 6 deletions(-)

diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index ad310b9f94bc..b92395f165e6 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -22,6 +22,7 @@
   #include "trace.h"
   #include "hw/s390x/s390_flic.h"
   #include "hw/s390x/s390-virtio-ccw.h"
+#include "hw/s390x/s390-ccw.h"
   
   typedef struct CrwContainer {

   CRW crw;
@@ -1205,6 +1206,26 @@ static void sch_handle_start_func_virtual(SubchDev *sch)
   
   }
   
+static void sch_handle_halt_func_passthrough(SubchDev *sch)

+{
+int ret;
+
+ret = s390_ccw_halt(sch);
+if (ret == -ENOSYS) {
+sch_handle_halt_func(sch);
+}
+}
+
+static void sch_handle_clear_func_passthrough(SubchDev *sch)
+{
+int ret;
+
+ret = s390_ccw_clear(sch);
+if (ret == -ENOSYS) {
+sch_handle_clear_func(sch);
+}
+}
+


do we need an extra s390_ccw_clear/halt functions? can't we just call
cdc->clear/halt in the passthrough functions?


I mostly added them for symmetry reasons... we still need to check for
presence of the callback in any case, though.

(vfio is not always built, e.g. on windows or os x.)



right, but if we are calling do_subchannel_work_passthrough, then we 
know for sure we are building the S390CCWDevice which is the vfio 
device, no?


So we could just add checks for callbacks in 
sch_handle_clear/halt_func_passthrough, no?


I would even like to get rid of the s390_ccw_cmd_request if we can, but 
that is out of scope for this patch. :)








   static IOInstEnding sch_handle_start_func_passthrough(SubchDev *sch)
   {
   SCHIB *schib = >curr_status;
@@ -1244,11 +1265,9 @@ IOInstEnding do_subchannel_work_passthrough(SubchDev 
*sch)
   SCHIB *schib = >curr_status;
   
   if (schib->scsw.ctrl & SCSW_FCTL_CLEAR_FUNC) {

-/* TODO: Clear handling */
-sch_handle_clear_func(sch);
+sch_handle_clear_func_passthrough(sch);
   } else if (schib->scsw.ctrl & SCSW_FCTL_HALT_FUNC) {
-/* TODO: Halt handling */
-sch_handle_halt_func(sch);
+sch_handle_halt_func_passthrough(sch);
   } else if (schib->scsw.ctrl & SCSW_FCTL_START_FUNC) {
   return sch_handle_start_func_passthrough(sch);
   }
diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
index f5f025d1b6ca..951be5ab0245 100644
--- a/hw/s390x/s390-ccw.c
+++ b/hw/s390x/s390-ccw.c
@@ -29,6 +29,26 @@ IOInstEnding s390_ccw_cmd_request(SubchDev *sch)
   return cdc->handle_request(sch);
   }
   
+int s390_ccw_halt(SubchDev *sch)

+{
+S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data);
+
+if (!cdc->handle_halt) {
+return -ENOSYS;
+}
+return cdc->handle_halt(sch);
+}
+
+int s390_ccw_clear(SubchDev *sch)
+{
+S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data);
+
+if (!cdc->handle_clear) {
+return -ENOSYS;
+}
+return cdc->handle_clear(sch);
+}
+










Re: [Qemu-devel] [PATCH v5] vfio-ccw: support async command subregion

2019-06-07 Thread Farhan Ali




On 06/07/2019 10:53 AM, Cornelia Huck wrote:

A vfio-ccw device may provide an async command subregion for
issuing halt/clear subchannel requests. If it is present, use
it for sending halt/clear request to the device; if not, fall
back to emulation (as done today).

Signed-off-by: Cornelia Huck 
---

v4->v5:
- It seems we need to take the indirection via the class for the
   callbacks after all :(
- Dropped Eric's R-b: for that reason

---
  hw/s390x/css.c  |  27 +++--
  hw/s390x/s390-ccw.c |  20 +++
  hw/vfio/ccw.c   | 112 +++-
  include/hw/s390x/css.h  |   3 +
  include/hw/s390x/s390-ccw.h |   2 +
  5 files changed, 158 insertions(+), 6 deletions(-)

diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index ad310b9f94bc..b92395f165e6 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -22,6 +22,7 @@
  #include "trace.h"
  #include "hw/s390x/s390_flic.h"
  #include "hw/s390x/s390-virtio-ccw.h"
+#include "hw/s390x/s390-ccw.h"
  
  typedef struct CrwContainer {

  CRW crw;
@@ -1205,6 +1206,26 @@ static void sch_handle_start_func_virtual(SubchDev *sch)
  
  }
  
+static void sch_handle_halt_func_passthrough(SubchDev *sch)

+{
+int ret;
+
+ret = s390_ccw_halt(sch);
+if (ret == -ENOSYS) {
+sch_handle_halt_func(sch);
+}
+}
+
+static void sch_handle_clear_func_passthrough(SubchDev *sch)
+{
+int ret;
+
+ret = s390_ccw_clear(sch);
+if (ret == -ENOSYS) {
+sch_handle_clear_func(sch);
+}
+}
+


do we need an extra s390_ccw_clear/halt functions? can't we just call 
cdc->clear/halt in the passthrough functions?



  static IOInstEnding sch_handle_start_func_passthrough(SubchDev *sch)
  {
  SCHIB *schib = >curr_status;
@@ -1244,11 +1265,9 @@ IOInstEnding do_subchannel_work_passthrough(SubchDev 
*sch)
  SCHIB *schib = >curr_status;
  
  if (schib->scsw.ctrl & SCSW_FCTL_CLEAR_FUNC) {

-/* TODO: Clear handling */
-sch_handle_clear_func(sch);
+sch_handle_clear_func_passthrough(sch);
  } else if (schib->scsw.ctrl & SCSW_FCTL_HALT_FUNC) {
-/* TODO: Halt handling */
-sch_handle_halt_func(sch);
+sch_handle_halt_func_passthrough(sch);
  } else if (schib->scsw.ctrl & SCSW_FCTL_START_FUNC) {
  return sch_handle_start_func_passthrough(sch);
  }
diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
index f5f025d1b6ca..951be5ab0245 100644
--- a/hw/s390x/s390-ccw.c
+++ b/hw/s390x/s390-ccw.c
@@ -29,6 +29,26 @@ IOInstEnding s390_ccw_cmd_request(SubchDev *sch)
  return cdc->handle_request(sch);
  }
  
+int s390_ccw_halt(SubchDev *sch)

+{
+S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data);
+
+if (!cdc->handle_halt) {
+return -ENOSYS;
+}
+return cdc->handle_halt(sch);
+}
+
+int s390_ccw_clear(SubchDev *sch)
+{
+S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data);
+
+if (!cdc->handle_clear) {
+return -ENOSYS;
+}
+return cdc->handle_clear(sch);
+}
+





Re: [Qemu-devel] [PATCH v4 2/2] vfio-ccw: support async command subregion

2019-05-22 Thread Farhan Ali




On 05/22/2019 06:17 AM, Cornelia Huck wrote:

On Tue, 21 May 2019 16:50:47 -0400
Farhan Ali  wrote:


On 05/07/2019 11:47 AM, Cornelia Huck wrote:

A vfio-ccw device may provide an async command subregion for
issuing halt/clear subchannel requests. If it is present, use
it for sending halt/clear request to the device; if not, fall
back to emulation (as done today).

Signed-off-by: Cornelia Huck 
---
   hw/s390x/css.c  |  27 +++--
   hw/vfio/ccw.c   | 110 +++-
   include/hw/s390x/s390-ccw.h |   3 +
   3 files changed, 134 insertions(+), 6 deletions(-)



diff --git a/include/hw/s390x/s390-ccw.h b/include/hw/s390x/s390-ccw.h
index 901d805d79a3..e9c7e1db5761 100644
--- a/include/hw/s390x/s390-ccw.h
+++ b/include/hw/s390x/s390-ccw.h
@@ -37,4 +37,7 @@ typedef struct S390CCWDeviceClass {
   IOInstEnding (*handle_request) (SubchDev *sch);
   } S390CCWDeviceClass;
   
+int vfio_ccw_handle_clear(SubchDev *sch);

+int vfio_ccw_handle_halt(SubchDev *sch);
+


We are not making clear and halt functions part of the
S390CCWDeviceClass, is there are reason for doing this?
Currently we handle ssch through the handle_request function, it just
looks a little inconsistent.


I don't quite remember why I did it this way; not sure if there is a
good reason for that (that patch has been around for too long...)

We can change such internal details later on, though. (And I think your
comment has merit.)


Yes, sure we could change it later on. I do prefer your way though, it 
avoids one more layer of indirection.


Thanks
Farhan




Re: [Qemu-devel] [PATCH v4 2/2] vfio-ccw: support async command subregion

2019-05-21 Thread Farhan Ali




On 05/21/2019 12:32 PM, Cornelia Huck wrote:

On Mon, 20 May 2019 12:29:56 -0400
Eric Farman  wrote:


On 5/7/19 11:47 AM, Cornelia Huck wrote:

A vfio-ccw device may provide an async command subregion for
issuing halt/clear subchannel requests. If it is present, use
it for sending halt/clear request to the device; if not, fall
back to emulation (as done today).

Signed-off-by: Cornelia Huck 
---
  hw/s390x/css.c  |  27 +++--
  hw/vfio/ccw.c   | 110 +++-
  include/hw/s390x/s390-ccw.h |   3 +
  3 files changed, 134 insertions(+), 6 deletions(-)




+int vfio_ccw_handle_clear(SubchDev *sch)
+{
+S390CCWDevice *cdev = sch->driver_data;
+VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev);
+struct ccw_cmd_region *region = vcdev->async_cmd_region;
+int ret;
+
+if (!vcdev->async_cmd_region) {
+/* Async command region not available, fall back to emulation */
+return -ENOSYS;
+}
+
+memset(region, 0, sizeof(*region));
+region->command = VFIO_CCW_ASYNC_CMD_CSCH;


Considering the serialization you added on the kernel side, what happens
if another vcpu runs this code (or _halt) and clears the async region
before the kernel code gains control from the pwrite() call below?
Asked another way, there's nothing preventing us from issuing more than
one asynchronous command concurrently, so how do we make sure the
command gets to the kernel rather than "current command wins"  ?


This send me digging through the code, because if two threads can call
this at the same time for passthrough, we'd also have the same problem
for virtual.

If I followed the code correctly, all I/O instruction interpretation is
currently serialized via qemu_mutex_{lock,unlock}_iothread() (see
target/s390x/kvm.c respectively target/s390x/misc_helper.c). This
should mostly be enough to avoid stepping on each other's toes.

Why mostly? I'm not sure yet whether we handling multiple requests for
passthrough devices correctly yet (virtual should be fine.)



But don't virtual and passthrough device end up calling the same 
ioinst_handle_* functions to interpret the I/O instructions?


As you mentioned all the ioinst_handle_* functions are called with the 
qemu_mutex held. So I am confused as why virtual devices should be fine 
and not passthrough? :)





Start vs. (start|halt|clear) is fine, as the code checks whether
something is already pending before poking the kernel interface.
Likewise, halt vs. (start|halt|clear) is fine, as the code checks for
halt or clear and start and halt use different regions. The problematic
one is clear, as that's something that's always supposed to go through.
Probably fine if clear should always "win", but I need to think some
more about that.



That possibly worrisome question aside, this seems generally fine.



+
+again:
+ret = pwrite(vcdev->vdev.fd, region,
+ vcdev->async_cmd_region_size, vcdev->async_cmd_region_offset);
+if (ret != vcdev->async_cmd_region_size) {
+if (errno == EAGAIN) {
+goto again;
+}
+error_report("vfio-ccw: write cmd region failed with errno=%d", errno);
+ret = -errno;
+} else {
+ret = region->ret_code;
+}
+switch (ret) {
+case 0:
+case -ENODEV:
+case -EACCES:
+return 0;
+case -EFAULT:
+default:
+sch_gen_unit_exception(sch);
+css_inject_io_interrupt(sch);
+return 0;
+}
+}








Re: [Qemu-devel] [PATCH v4 2/2] vfio-ccw: support async command subregion

2019-05-21 Thread Farhan Ali




On 05/07/2019 11:47 AM, Cornelia Huck wrote:

A vfio-ccw device may provide an async command subregion for
issuing halt/clear subchannel requests. If it is present, use
it for sending halt/clear request to the device; if not, fall
back to emulation (as done today).

Signed-off-by: Cornelia Huck 
---
  hw/s390x/css.c  |  27 +++--
  hw/vfio/ccw.c   | 110 +++-
  include/hw/s390x/s390-ccw.h |   3 +
  3 files changed, 134 insertions(+), 6 deletions(-)

diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index 8fc9e35ba5d3..5b21a74b5c29 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -22,6 +22,7 @@
  #include "trace.h"
  #include "hw/s390x/s390_flic.h"
  #include "hw/s390x/s390-virtio-ccw.h"
+#include "hw/s390x/s390-ccw.h"
  
  typedef struct CrwContainer {

  CRW crw;
@@ -1191,6 +1192,26 @@ static void sch_handle_start_func_virtual(SubchDev *sch)
  
  }
  
+static void sch_handle_halt_func_passthrough(SubchDev *sch)

+{
+int ret;
+
+ret = vfio_ccw_handle_halt(sch);
+if (ret == -ENOSYS) {
+sch_handle_halt_func(sch);
+}
+}
+
+static void sch_handle_clear_func_passthrough(SubchDev *sch)
+{
+int ret;
+
+ret = vfio_ccw_handle_clear(sch);
+if (ret == -ENOSYS) {
+sch_handle_clear_func(sch);
+}
+}
+
  static IOInstEnding sch_handle_start_func_passthrough(SubchDev *sch)
  {
  SCHIB *schib = >curr_status;
@@ -1230,11 +1251,9 @@ IOInstEnding do_subchannel_work_passthrough(SubchDev 
*sch)
  SCHIB *schib = >curr_status;
  
  if (schib->scsw.ctrl & SCSW_FCTL_CLEAR_FUNC) {

-/* TODO: Clear handling */
-sch_handle_clear_func(sch);
+sch_handle_clear_func_passthrough(sch);
  } else if (schib->scsw.ctrl & SCSW_FCTL_HALT_FUNC) {
-/* TODO: Halt handling */
-sch_handle_halt_func(sch);
+sch_handle_halt_func_passthrough(sch);
  } else if (schib->scsw.ctrl & SCSW_FCTL_START_FUNC) {
  return sch_handle_start_func_passthrough(sch);
  }
diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 31dd3a2a87b6..175a17b1772a 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -2,9 +2,12 @@
   * vfio based subchannel assignment support
   *
   * Copyright 2017 IBM Corp.
+ * Copyright 2019 Red Hat, Inc.
+ *
   * Author(s): Dong Jia Shi 
   *Xiao Feng Ren 
   *Pierre Morel 
+ *Cornelia Huck 
   *
   * This work is licensed under the terms of the GNU GPL, version 2 or (at
   * your option) any later version. See the COPYING file in the top-level
@@ -32,6 +35,9 @@ struct VFIOCCWDevice {
  uint64_t io_region_size;
  uint64_t io_region_offset;
  struct ccw_io_region *io_region;
+uint64_t async_cmd_region_size;
+uint64_t async_cmd_region_offset;
+struct ccw_cmd_region *async_cmd_region;
  EventNotifier io_notifier;
  bool force_orb_pfch;
  bool warned_orb_pfch;
@@ -114,6 +120,87 @@ again:
  }
  }
  
+int vfio_ccw_handle_clear(SubchDev *sch)

+{
+S390CCWDevice *cdev = sch->driver_data;
+VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev);
+struct ccw_cmd_region *region = vcdev->async_cmd_region;
+int ret;
+
+if (!vcdev->async_cmd_region) {
+/* Async command region not available, fall back to emulation */
+return -ENOSYS;
+}
+
+memset(region, 0, sizeof(*region));
+region->command = VFIO_CCW_ASYNC_CMD_CSCH;
+
+again:
+ret = pwrite(vcdev->vdev.fd, region,
+ vcdev->async_cmd_region_size, vcdev->async_cmd_region_offset);
+if (ret != vcdev->async_cmd_region_size) {
+if (errno == EAGAIN) {
+goto again;
+}
+error_report("vfio-ccw: write cmd region failed with errno=%d", errno);
+ret = -errno;
+} else {
+ret = region->ret_code;
+}
+switch (ret) {
+case 0:
+case -ENODEV:
+case -EACCES:
+return 0;
+case -EFAULT:
+default:
+sch_gen_unit_exception(sch);
+css_inject_io_interrupt(sch);
+return 0;
+}
+}
+
+int vfio_ccw_handle_halt(SubchDev *sch)
+{
+S390CCWDevice *cdev = sch->driver_data;
+VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev);
+struct ccw_cmd_region *region = vcdev->async_cmd_region;
+int ret;
+
+if (!vcdev->async_cmd_region) {
+/* Async command region not available, fall back to emulation */
+return -ENOSYS;
+}
+
+memset(region, 0, sizeof(*region));
+region->command = VFIO_CCW_ASYNC_CMD_HSCH;
+
+again:
+ret = pwrite(vcdev->vdev.fd, region,
+ vcdev->async_cmd_region_size, vcdev->async_cmd_region_offset);
+if (ret != vcdev->async_cmd_region_size) {
+if (errno == EAGAIN) {
+goto again;
+}
+error_report("vfio-ccw: write cmd region failed with errno=%d", errno);
+ret = -errno;
+} else {
+ret = region->ret_code;
+}
+switch (ret) {
+case 0:
+case -EBUSY:
+

Re: [Qemu-devel] [PATCH v4 4/6] vfio-ccw: add capabilities chain

2019-04-15 Thread Farhan Ali




On 03/01/2019 04:39 AM, Cornelia Huck wrote:

Allow to extend the regions used by vfio-ccw. The first user will be
handling of halt and clear subchannel.

Signed-off-by: Cornelia Huck
---
  drivers/s390/cio/vfio_ccw_ops.c | 186 
  drivers/s390/cio/vfio_ccw_private.h |  38 ++
  include/uapi/linux/vfio.h   |   2 +
  3 files changed, 200 insertions(+), 26 deletions(-)


Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH v4 6/6] vfio-ccw: add handling for async channel instructions

2019-04-15 Thread Farhan Ali




On 03/01/2019 04:39 AM, Cornelia Huck wrote:

Add a region to the vfio-ccw device that can be used to submit
asynchronous I/O instructions. ssch continues to be handled by the
existing I/O region; the new region handles hsch and csch.

Interrupt status continues to be reported through the same channels
as for ssch.

Signed-off-by: Cornelia Huck


Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH v4 1/6] vfio-ccw: make it safe to access channel programs

2019-04-08 Thread Farhan Ali




On 04/08/2019 01:07 PM, Cornelia Huck wrote:

On Mon, 8 Apr 2019 13:02:12 -0400
Farhan Ali  wrote:


On 03/01/2019 04:38 AM, Cornelia Huck wrote:

When we get a solicited interrupt, the start function may have
been cleared by a csch, but we still have a channel program
structure allocated. Make it safe to call the cp accessors in
any case, so we can call them unconditionally.

While at it, also make sure that functions called from other parts
of the code return gracefully if the channel program structure
has not been initialized (even though that is a bug in the caller).

Reviewed-by: Eric Farman
Signed-off-by: Cornelia Huck
---


Hi Connie,

My series of fixes for vfio-ccw depends on this patch as I would like to
call cp_free unconditionally :) (I had developed my code on top of your
patches).

Could we pick this patch up as well when/if you pick up my patch series?
I am in the process of sending out a v2.

Regarding this patch we could merge it as a stand alone patch, separate
from this series. And also the patch LGTM

Reviewed-by: Farhan Ali 


Actually, I wanted to ask how people felt about merging this whole
series for the next release :) It would be one thing less on my plate...




I have been testing with your patches for a while now and I haven't hit 
any problems due to the patches.


IMHO I think we can merge the patches for the next release.

Thanks
Farhan




Re: [Qemu-devel] [PATCH v4 1/6] vfio-ccw: make it safe to access channel programs

2019-04-08 Thread Farhan Ali




On 03/01/2019 04:38 AM, Cornelia Huck wrote:

When we get a solicited interrupt, the start function may have
been cleared by a csch, but we still have a channel program
structure allocated. Make it safe to call the cp accessors in
any case, so we can call them unconditionally.

While at it, also make sure that functions called from other parts
of the code return gracefully if the channel program structure
has not been initialized (even though that is a bug in the caller).

Reviewed-by: Eric Farman
Signed-off-by: Cornelia Huck
---


Hi Connie,

My series of fixes for vfio-ccw depends on this patch as I would like to 
call cp_free unconditionally :) (I had developed my code on top of your 
patches).


Could we pick this patch up as well when/if you pick up my patch series? 
I am in the process of sending out a v2.


Regarding this patch we could merge it as a stand alone patch, separate 
from this series. And also the patch LGTM


Reviewed-by: Farhan Ali 

Thanks
Farhan




Re: [Qemu-devel] [PATCH 11/14] hw/vfio/ccw: avoid taking address members in packed structs

2019-04-02 Thread Farhan Ali




On 03/29/2019 07:11 AM, Daniel P. Berrangé wrote:

The GCC 9 compiler complains about many places in s390 code
that take the address of members of the 'struct SCHIB' which
is marked packed:

hw/vfio/ccw.c: In function ‘vfio_ccw_io_notifier_handler’:
hw/vfio/ccw.c:133:15: warning: taking address of packed member of ‘struct 
SCHIB’ may result in an unaligned pointer value \
[-Waddress-of-packed-member]
   133 | SCSW *s = >curr_status.scsw;
   |   ^~
hw/vfio/ccw.c:134:15: warning: taking address of packed member of ‘struct 
SCHIB’ may result in an unaligned pointer value \
[-Waddress-of-packed-member]
   134 | PMCW *p = >curr_status.pmcw;
   |   ^~

...snip many more...

Almost all of these are just done for convenience to avoid
typing out long variable/field names when referencing struct
members. We can get most of this convenience by taking the
address of the 'struct SCHIB' instead, avoiding triggering
the compiler warnings.

In a couple of places we copy via a local variable which is
a technique already applied elsewhere in s390 code for this
problem.

Signed-off-by: Daniel P. Berrangé 
---
  hw/vfio/ccw.c | 42 ++
  1 file changed, 22 insertions(+), 20 deletions(-)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 9246729a75..c44d13cc50 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -130,8 +130,8 @@ static void vfio_ccw_io_notifier_handler(void *opaque)
  S390CCWDevice *cdev = S390_CCW_DEVICE(vcdev);
  CcwDevice *ccw_dev = CCW_DEVICE(cdev);
  SubchDev *sch = ccw_dev->sch;
-SCSW *s = >curr_status.scsw;
-PMCW *p = >curr_status.pmcw;
+SCHIB *schib = >curr_status;
+SCSW s;
  IRB irb;
  int size;
  
@@ -145,33 +145,33 @@ static void vfio_ccw_io_notifier_handler(void *opaque)

  switch (errno) {
  case ENODEV:
  /* Generate a deferred cc 3 condition. */
-s->flags |= SCSW_FLAGS_MASK_CC;
-s->ctrl &= ~SCSW_CTRL_MASK_STCTL;
-s->ctrl |= (SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND);
+schib->scsw.flags |= SCSW_FLAGS_MASK_CC;
+schib->scsw.ctrl &= ~SCSW_CTRL_MASK_STCTL;
+schib->scsw.ctrl |= (SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND);
  goto read_err;
  case EFAULT:
  /* Memory problem, generate channel data check. */
-s->ctrl &= ~SCSW_ACTL_START_PEND;
-s->cstat = SCSW_CSTAT_DATA_CHECK;
-s->ctrl &= ~SCSW_CTRL_MASK_STCTL;
-s->ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY |
+schib->scsw.ctrl &= ~SCSW_ACTL_START_PEND;
+schib->scsw.cstat = SCSW_CSTAT_DATA_CHECK;
+schib->scsw.ctrl &= ~SCSW_CTRL_MASK_STCTL;
+schib->scsw.ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY |
 SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND;
  goto read_err;
  default:
  /* Error, generate channel program check. */
-s->ctrl &= ~SCSW_ACTL_START_PEND;
-s->cstat = SCSW_CSTAT_PROG_CHECK;
-s->ctrl &= ~SCSW_CTRL_MASK_STCTL;
-s->ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY |
+schib->scsw.ctrl &= ~SCSW_ACTL_START_PEND;
+schib->scsw.cstat = SCSW_CSTAT_PROG_CHECK;
+schib->scsw.ctrl &= ~SCSW_CTRL_MASK_STCTL;
+schib->scsw.ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY |
 SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND;
  goto read_err;
  }
  } else if (size != vcdev->io_region_size) {
  /* Information transfer error, generate channel-control check. */
-s->ctrl &= ~SCSW_ACTL_START_PEND;
-s->cstat = SCSW_CSTAT_CHN_CTRL_CHK;
-s->ctrl &= ~SCSW_CTRL_MASK_STCTL;
-s->ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY |
+schib->scsw.ctrl &= ~SCSW_ACTL_START_PEND;
+schib->scsw.cstat = SCSW_CSTAT_CHN_CTRL_CHK;
+schib->scsw.ctrl &= ~SCSW_CTRL_MASK_STCTL;
+schib->scsw.ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY |
 SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND;
  goto read_err;
  }
@@ -179,11 +179,13 @@ static void vfio_ccw_io_notifier_handler(void *opaque)
  memcpy(, region->irb_area, sizeof(IRB));
  
  /* Update control block via irb. */

-copy_scsw_to_guest(s, );
+s = schib->scsw;
+copy_scsw_to_guest(, );
+schib->scsw = s;
  
  /* If a uint check is pending, copy sense data. */

-if ((s->dstat & SCSW_DSTAT_UNIT_CHECK) &&
-(p->chars & PMCW_CHARS_MASK_CSENSE)) {
+    if ((schib->scsw.dstat & SCSW_DSTAT_UNIT_CHECK) &&
+(schib->pmcw.chars & PMCW_CHARS_MASK_CSENSE)) {
  memcpy(sch->sense_data, irb.ecw, sizeof(irb.ecw));
  }
  



Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH 13/14] hw/s390x/ipl: avoid taking address of fields in packed struct

2019-04-02 Thread Farhan Ali




On 03/29/2019 07:11 AM, Daniel P. Berrangé wrote:

Compiling with GCC 9 complains

hw/s390x/ipl.c: In function ‘s390_ipl_set_boot_menu’:
hw/s390x/ipl.c:256:25: warning: taking address of packed member of ‘struct 
QemuIplParameters’ may result in an unaligned pointer value 
[-Waddress-of-packed-member]
   256 | uint32_t *timeout = >qipl.boot_menu_timeout;
   | ^~~~

This local variable is only present to save a little bit of
typing when setting the field later. Get rid of this to avoid
the warning about unaligned accesses.

Signed-off-by: Daniel P. Berrangé 
---
  hw/s390x/ipl.c | 12 +---
  1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index 896888bf8f..51b272e190 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -252,8 +252,6 @@ static void s390_ipl_set_boot_menu(S390IPLState *ipl)
  {
  QemuOptsList *plist = qemu_find_opts("boot-opts");
  QemuOpts *opts = QTAILQ_FIRST(>head);
-uint8_t *flags = >qipl.qipl_flags;
-uint32_t *timeout = >qipl.boot_menu_timeout;
  const char *tmp;
  unsigned long splash_time = 0;
  
@@ -269,7 +267,7 @@ static void s390_ipl_set_boot_menu(S390IPLState *ipl)

  case S390_IPL_TYPE_CCW:
  /* In the absence of -boot menu, use zipl parameters */
  if (!qemu_opt_get(opts, "menu")) {
-*flags |= QIPL_FLAG_BM_OPTS_ZIPL;
+ipl->qipl.qipl_flags |= QIPL_FLAG_BM_OPTS_ZIPL;
  return;
  }
  break;
@@ -286,23 +284,23 @@ static void s390_ipl_set_boot_menu(S390IPLState *ipl)
  return;
  }
  
-*flags |= QIPL_FLAG_BM_OPTS_CMD;

+ipl->qipl.qipl_flags |= QIPL_FLAG_BM_OPTS_CMD;
  
  tmp = qemu_opt_get(opts, "splash-time");
  
  if (tmp && qemu_strtoul(tmp, NULL, 10, _time)) {

  error_report("splash-time is invalid, forcing it to 0");
-*timeout = 0;
+ipl->qipl.boot_menu_timeout = 0;
  return;
  }
  
  if (splash_time > 0x) {

  error_report("splash-time is too large, forcing it to max value");
-*timeout = 0x;
+ipl->qipl.boot_menu_timeout = 0x;
  return;
  }
  
-*timeout = cpu_to_be32(splash_time);

+ipl->qipl.boot_menu_timeout = cpu_to_be32(splash_time);
  }
  
  static CcwDevice *s390_get_ccw_device(DeviceState *dev_st)




Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH v5 10/15] s390-bios: Support for running format-0/1 channel programs

2019-03-14 Thread Farhan Ali




On 03/13/2019 12:31 PM, Jason J. Herne wrote:

Introduce a library function for executing format-0 and format-1
channel programs and waiting for their completion before continuing
execution.

Add cu_type() to channel io library. This will be used to query control
unit type which is used to determine if we are booting a virtio device or a
real dasd device.

Signed-off-by: Jason J. Herne
Reviewed-by: Cornelia Huck
---
  pc-bios/s390-ccw/cio.c  | 144 
  pc-bios/s390-ccw/cio.h  | 130 ++-
  pc-bios/s390-ccw/s390-ccw.h |   1 +
  pc-bios/s390-ccw/start.S|  29 +
  4 files changed, 301 insertions(+), 3 deletions(-)


Reviewed-by: Farhan Ali 




Re: [Qemu-devel] qemu-iotest 55 broken

2019-03-14 Thread Farhan Ali




On 03/14/2019 01:11 PM, Eric Blake wrote:

On 3/14/19 11:33 AM, Farhan Ali wrote:

Hi,

I noticed qemu iotest 55 was broken on the latest master (My head commit
is dbbc277 Merge remote-tracking branch
'remotes/stefanha/tags/block-pull-request' into staging).


Patch has been proposed:

https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg04398.html



Thanks for pointing it out. Didn't notice there was a patch out there.

Thanks
Farhan




[Qemu-devel] qemu-iotest 55 broken

2019-03-14 Thread Farhan Ali

Hi,

I noticed qemu iotest 55 was broken on the latest master (My head commit 
is dbbc277 Merge remote-tracking branch 
'remotes/stefanha/tags/block-pull-request' into staging).


The backtrace of the qemu process when the test fails:

#0  0x7f82047cd428 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:54

#1  0x7f82047cf02a in __GI_abort () at abort.c:89
#2  0x7f82047c5bd7 in __assert_fail_base (fmt=, 
assertion=assertion@entry=0x55b6f83f91e4 "current_migration", 
file=file@entry=0x55b6f83f9197 "migration/migration.c", 
line=line@entry=187,
function=function@entry=0x55b6f83fa420 <__PRETTY_FUNCTION__.30526> 
"migrate_get_current") at assert.c:92
#3  0x7f82047c5c82 in __GI___assert_fail (assertion=0x55b6f83f91e4 
"current_migration", file=0x55b6f83f9197 "migration/migration.c", 
line=187, function=0x55b6f83fa420 <__PRETTY_FUNCTION__.30526> 
"migrate_get_current") at assert.c:101
#4  0x55b6f8068c93 in migrate_get_current () at 
migration/migration.c:187
#5  0x55b6f806c16e in migrate_add_blocker (reason=0x55b6f91797e0, 
errp=0x7ffe73ee6d20) at migration/migration.c:1715
#6  0x55b6f810caf9 in vmdk_open (bs=0x55b6f91835e0, 
options=0x55b6f91888b0, flags=139266, errp=0x7ffe73ee6d90) at 
block/vmdk.c:1019
#7  0x55b6f80f55d1 in bdrv_open_driver (bs=0x55b6f91835e0, 
drv=0x55b6f8bba300 , node_name=0x0, options=0x55b6f91888b0, 
open_flags=139266, errp=0x7ffe73ee6ea0) at block.c:1272
#8  0x55b6f80f5f89 in bdrv_open_common (bs=0x55b6f91835e0, 
file=0x55b6f9179490, options=0x55b6f91888b0, errp=0x7ffe73ee6ea0) at 
block.c:1530
#9  0x55b6f80f8fb6 in bdrv_open_inherit (filename=0x55b6f9182270 
"/home/alifm/kvmdev/qemu/tests/qemu-iotests/scratch/blockdev-target.img", 
reference=0x0, options=0x55b6f91888b0, flags=8194, parent=0x0, 
child_role=0x0,

errp=0x7ffe73ee7298) at block.c:2889
#10 0x55b6f80f94ef in bdrv_open (filename=0x55b6f9182270 
"/home/alifm/kvmdev/qemu/tests/qemu-iotests/scratch/blockdev-target.img", 
reference=0x0, options=0x55b6f91713a0, flags=0, errp=0x7ffe73ee7298) at 
block.c:2982
#11 0x55b6f815baf0 in blk_new_open (filename=0x55b6f9182270 
"/home/alifm/kvmdev/qemu/tests/qemu-iotests/scratch/blockdev-target.img", 
reference=0x0, options=0x55b6f91713a0, flags=0, errp=0x7ffe73ee7298) at 
block/block-backend.c:377
#12 0x55b6f7e6939d in blockdev_init (file=0x55b6f9182270 
"/home/alifm/kvmdev/qemu/tests/qemu-iotests/scratch/blockdev-target.img", 
bs_opts=0x55b6f91713a0, errp=0x7ffe73ee7298) at blockdev.c:602
#13 0x55b6f7e6a2de in drive_new (all_opts=0x55b6f90ba630, 
block_default_type=IF_IDE, errp=0x55b6f8c21fd0 ) at 
blockdev.c:994
#14 0x55b6f7e7b2f2 in drive_init_func (opaque=0x55b6f90f2628, 
opts=0x55b6f90ba630, errp=0x55b6f8c21fd0 ) at vl.c:1162
#15 0x55b6f823e17b in qemu_opts_foreach (list=0x55b6f8a8b720 
, func=0x55b6f7e7b2be , 
opaque=0x55b6f90f2628, errp=0x55b6f8c21fd0 ) at 
util/qemu-option.c:1171
#16 0x55b6f7e7b528 in configure_blockdev (bdo_queue=0x7ffe73ee75d0, 
machine_class=0x55b6f90f2580, snapshot=0) at vl.c:1229
#17 0x55b6f7e83323 in main (argc=20, argv=0x7ffe73ee76e8, 
envp=0x7ffe73ee7790) at vl.c:4284



Git bisect points to:

commit cda4aa9a5a08777cf13e164c0543bd4888b8adce
Author: Markus Armbruster 
Date:   Fri Mar 8 14:14:40 2019 +0100

vl: Create block backends before setting machine properties

qemu-system-FOO's main() acts on command line arguments in its own
idiosyncratic order.  There's not much method to its madness.
Whenever we find a case where one kind of command line argument needs
to refer to something created for another kind later, we rejigger the
order.

Block devices get created long after machine properties get processed.
Therefore, block device machine properties can be created, but not
set.  No such properties exist.  But the next commit will create some.
Time to rejigger again: create block devices earlier.

Signed-off-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20190308131445.17502-8-arm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 

:100644 100644 c22ca447fa480673e9ecfbddad5e8b2965c4d3c4 
e9239d55ad6afc431c16cb9602e167c6ac95b9a1 M	vl.c


It seems the reason is now we are configuring block devices earlier than 
we setting up the migration state, and this breaks for vmdk block devices.


Thanks
Farhan




Re: [Qemu-devel] [PATCH v3 01/16] s390 vfio-ccw: Add bootindex property and IPLB data

2019-03-04 Thread Farhan Ali




On 03/01/2019 01:59 PM, Jason J. Herne wrote:

Add bootindex property and iplb data for vfio-ccw devices. This allows us to
forward boot information into the bios for vfio-ccw devices.

Refactor s390_get_ccw_device() to return device type. This prevents us from
having to use messy casting logic in several places.

Signed-off-by: Jason J. Herne
Acked-by: Halil Pasic
---
  MAINTAINERS |  1 +
  hw/s390x/ipl.c  | 39 +--
  hw/s390x/s390-ccw.c |  9 +
  hw/vfio/ccw.c   |  2 +-
  include/hw/s390x/s390-ccw.h |  1 +
  include/hw/s390x/vfio-ccw.h | 28 
  6 files changed, 73 insertions(+), 7 deletions(-)
  create mode 100644 include/hw/s390x/vfio-ccw.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 9a76845..a780916 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1400,6 +1400,7 @@ S: Supported
  F: hw/vfio/ccw.c
  F: hw/s390x/s390-ccw.c
  F: include/hw/s390x/s390-ccw.h
+F: include/hw/s390x/vfio-ccw.h
  T: githttps://github.com/cohuck/qemu.git  s390-next
  L:qemu-s3...@nongnu.org
  
diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c

index 896888b..df891bb 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -19,6 +19,7 @@
  #include "hw/loader.h"
  #include "hw/boards.h"
  #include "hw/s390x/virtio-ccw.h"
+#include "hw/s390x/vfio-ccw.h"
  #include "hw/s390x/css.h"
  #include "hw/s390x/ebcdic.h"
  #include "ipl.h"
@@ -305,16 +306,29 @@ static void s390_ipl_set_boot_menu(S390IPLState *ipl)
  *timeout = cpu_to_be32(splash_time);
  }
  
-static CcwDevice *s390_get_ccw_device(DeviceState *dev_st)

+#define CCW_DEVTYPE_NONE0x00
+#define CCW_DEVTYPE_VIRTIO  0x01
+#define CCW_DEVTYPE_SCSI0x02
+#define CCW_DEVTYPE_VFIO0x03
+
+static CcwDevice*s390_get_ccw_device(DeviceState *dev_st, int*  devtype)
  {
  CcwDevice *ccw_dev = NULL;
  
+*devtype = CCW_DEVTYPE_NONE;

+
  if (dev_st) {
  VirtioCcwDevice *virtio_ccw_dev = (VirtioCcwDevice *)
  object_dynamic_cast(OBJECT(qdev_get_parent_bus(dev_st)->parent),
  TYPE_VIRTIO_CCW_DEVICE);
+VFIOCCWDevice *vfio_ccw_dev = (VFIOCCWDevice *)
+object_dynamic_cast(OBJECT(dev_st), TYPE_VFIO_CCW);
  if (virtio_ccw_dev) {
  ccw_dev = CCW_DEVICE(virtio_ccw_dev);
+*devtype = CCW_DEVTYPE_VIRTIO;
+} else if (vfio_ccw_dev) {
+ccw_dev = CCW_DEVICE(vfio_ccw_dev);
+*devtype = CCW_DEVTYPE_VFIO;
  } else {
  SCSIDevice *sd = (SCSIDevice *)
  object_dynamic_cast(OBJECT(dev_st),
@@ -327,6 +341,7 @@ static CcwDevice *s390_get_ccw_device(DeviceState *dev_st)
  
  ccw_dev = (CcwDevice *)object_dynamic_cast(OBJECT(scsi_ccw),

 TYPE_CCW_DEVICE);
+*devtype = CCW_DEVTYPE_SCSI;
  }
  }
  }
@@ -337,10 +352,11 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl)
  {
  DeviceState *dev_st;
  CcwDevice *ccw_dev = NULL;
+int devtype;
  
  dev_st = get_boot_device(0);

  if (dev_st) {
-ccw_dev = s390_get_ccw_device(dev_st);
+ccw_dev = s390_get_ccw_device(dev_st, );
  }
  
  /*

@@ -349,8 +365,10 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl)
  if (ccw_dev) {
  SCSIDevice *sd = (SCSIDevice *) object_dynamic_cast(OBJECT(dev_st),
  TYPE_SCSI_DEVICE);
+VirtIONet *vn;
  
-if (sd) {

+switch (devtype) {
+case CCW_DEVTYPE_SCSI:
  ipl->iplb.len = cpu_to_be32(S390_IPLB_MIN_QEMU_SCSI_LEN);
  ipl->iplb.blk0_len =
  cpu_to_be32(S390_IPLB_MIN_QEMU_SCSI_LEN - 
S390_IPLB_HEADER_LEN);
@@ -360,8 +378,15 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl)
  ipl->iplb.scsi.channel = cpu_to_be16(sd->channel);
  ipl->iplb.scsi.devno = cpu_to_be16(ccw_dev->sch->devno);
  ipl->iplb.scsi.ssid = ccw_dev->sch->ssid & 3;
-} else {
-VirtIONet *vn = (VirtIONet *) object_dynamic_cast(OBJECT(dev_st),
+break;
+case CCW_DEVTYPE_VFIO:
+ipl->iplb.len = cpu_to_be32(S390_IPLB_MIN_CCW_LEN);
+ipl->iplb.pbt = S390_IPL_TYPE_CCW;
+ipl->iplb.ccw.devno = cpu_to_be16(ccw_dev->sch->devno);
+ipl->iplb.ccw.ssid = ccw_dev->sch->ssid & 3;
+break;


Also for vfio-ccw we are not setting the ipl->iplb.blk0_len like we do 
for virtio. Is there a reason?


I know we don't reference the value in the bios for both virtio and vfio.


+case CCW_DEVTYPE_VIRTIO:
+vn = (VirtIONet *) object_dynamic_cast(OBJECT(dev_st),

TYPE_VIRTIO_NET);
  
  ipl->iplb.len = cpu_to_be32(S390_IPLB_MIN_CCW_LEN);

@@ -374,6 +399,7 @@ static bool 

Re: [Qemu-devel] [PATCH v2] s390x: add zPCI feature to "qemu" CPU model

2019-02-13 Thread Farhan Ali




On 02/13/2019 11:26 AM, Cornelia Huck wrote:

On Wed, 13 Feb 2019 10:54:23 -0500
Farhan Ali  wrote:


I noticed the qemu-iotests 200 was failing due to the commit
"703fef6fcf3 : s390x/pci: Warn when adding PCI devices without the
'zpci' feature".

This patch fixes the failure :).


Huh. So it was testing without pci before and now with and it's also
working? Awesome :)


It was testing with pci, but the reason it fails was because qemu was 
spitting out the warning due to commit 703fef6fcf3. Now David's commit 
adds the zpci to qemu cpu model and qemu doesn't give out a warning message.




Thanks
Farhan

On 02/12/2019 06:23 AM, David Hildenbrand wrote:

As we now always have PCI support, let's add it to the "qemu" CPU model,
taking care of backwards compatibility.

Signed-off-by: David Hildenbrand
---

v1 -> v2:
- Use correct model identifiction of the z12 we emulate

   hw/s390x/s390-virtio-ccw.c  | 3 +++
   target/s390x/gen-features.c | 8 ++--
   2 files changed, 9 insertions(+), 2 deletions(-)










Re: [Qemu-devel] [PATCH v2] s390x: add zPCI feature to "qemu" CPU model

2019-02-13 Thread Farhan Ali

I noticed the qemu-iotests 200 was failing due to the commit
"703fef6fcf3 : s390x/pci: Warn when adding PCI devices without the 
'zpci' feature".


This patch fixes the failure :).

Thanks
Farhan

On 02/12/2019 06:23 AM, David Hildenbrand wrote:

As we now always have PCI support, let's add it to the "qemu" CPU model,
taking care of backwards compatibility.

Signed-off-by: David Hildenbrand
---

v1 -> v2:
- Use correct model identifiction of the z12 we emulate

  hw/s390x/s390-virtio-ccw.c  | 3 +++
  target/s390x/gen-features.c | 8 ++--
  2 files changed, 9 insertions(+), 2 deletions(-)





Re: [Qemu-devel] [PATCH v3 1/6] vfio-ccw: make it safe to access channel programs

2019-02-05 Thread Farhan Ali




On 02/05/2019 09:48 AM, Eric Farman wrote:



On 02/05/2019 07:35 AM, Cornelia Huck wrote:

On Tue, 5 Feb 2019 12:52:29 +0100
Halil Pasic  wrote:


On Mon, 4 Feb 2019 16:31:02 +0100
Cornelia Huck  wrote:


On Thu, 31 Jan 2019 13:34:55 +0100
Halil Pasic  wrote:

On Thu, 31 Jan 2019 12:52:20 +0100
Cornelia Huck  wrote:

On Wed, 30 Jan 2019 19:51:27 +0100
Halil Pasic  wrote:

On Wed, 30 Jan 2019 14:22:07 +0100
Cornelia Huck  wrote:

When we get a solicited interrupt, the start function may have
been cleared by a csch, but we still have a channel program
structure allocated. Make it safe to call the cp accessors in
any case, so we can call them unconditionally.


I read this like it is supposed to be safe regardless of
parallelism and threads. However I don't see any explicit
synchronization done for cp->initialized.

I've managed to figure out how is that supposed to be safe
for the cp_free() (which is probably our main concern) in
vfio_ccw_sch_io_todo(), but if fail when it comes to the one
in vfio_ccw_mdev_notifier().

Can you explain us how does the synchronization work?


You read that wrong, I don't add synchronization, I just add a check.


Now I'm confused. Does that mean we don't need synchronization for 
this?


If we lack synchronization (that is not provided by the current state
machine handling, or the rework here), we should do a patch on top
(preferably on top of the whole series, so this does not get even more
tangled up.) This is really just about the extra check.


I'm not a huge fan of keeping or introducing races -- it makes things
difficult to reason about, but I do have some understanging your
position.


The only thing I want to avoid is knowingly making things worse than
before, and I don't think this patch does that.



This patch-series is AFAICT a big improvement over what we have. I would
like Farhan confirming that it makes these hick-ups when he used to hit
BUSY with another ssch request disappear. If it does (I hope it does)
it's definitely a good thing for anybody who wants to use vfio-ccw.


Yep. There remains a lot to be done, but it's a first step.


s/a first step/an excellent first step/  :)

Can't speak for Farhan, but this makes things somewhat better for me. 
I'm still getting some periodic errors, but they happen infrequently 
enough now that debugging them is frustrating.  ;-)


  - Eric



I ran the my workloads/tests with the patches and like Eric I notice the 
errors I previously hit less frequently.







Yet I find it difficult to slap my r-b over racy code, or partial
solutions. In the latter case, when I lack conceptual clarity, I find it
difficult to tell if we are heading into the right direction, or is what
we build today going to turn against us tomorrow. Sorry for being a 
drag.


As long as we don't introduce bad user space interfaces we have to drag
around forever, I think anything is fair game if we think it's a good
idea at that moment. We can rewrite things if it turned out to be a bad
idea (although I'm not arguing for doing random crap, of course :)








Re: [Qemu-devel] [PATCH 10/15] s390-bios: Support for running format-0/1 channel programs

2019-02-04 Thread Farhan Ali




On 02/04/2019 06:13 AM, Cornelia Huck wrote:

On Thu, 31 Jan 2019 12:31:00 -0500
Farhan Ali  wrote:


On 01/29/2019 08:29 AM, Jason J. Herne wrote:

Add struct for format-0 ccws. Support executing format-0 channel
programs and waiting for their completion before continuing execution.
This will be used for real dasd ipl.

Add cu_type() to channel io library. This will be used to query control
unit type which is used to determine if we are booting a virtio device or a
real dasd device.

Signed-off-by: Jason J. Herne
---
   pc-bios/s390-ccw/cio.c  | 114 +++
   pc-bios/s390-ccw/cio.h  | 127 
++--
   pc-bios/s390-ccw/s390-ccw.h |   1 +
   pc-bios/s390-ccw/start.S|  33 +++-
   4 files changed, 270 insertions(+), 5 deletions(-)



+/*
+ * Executes a channel program at a given subchannel. The request to run the
+ * channel program is sent to the subchannel, we then wait for the interrupt
+ * signaling completion of the I/O operation(s) performed by the channel
+ * program. Lastly we verify that the i/o operation completed without error and
+ * that the interrupt we received was for the subchannel used to run the
+ * channel program.
+ *
+ * Note: This function assumes it is running in an environment where no other
+ * cpus are generating or receiving I/O interrupts. So either run it in a
+ * single-cpu environment or make sure all other cpus are not doing I/O and
+ * have I/O interrupts masked off.
+ */
+int do_cio(SubChannelId schid, uint32_t ccw_addr, int fmt)
+{
+CmdOrb orb = {};
+Irb irb = {};
+sense_data_eckd_dasd sd;
+int rc, retries = 0;
+
+IPL_assert(fmt == 0 || fmt == 1, "Invalid ccw format");
+
+/* ccw_addr must be <= 24 bits and point to at least one whole ccw. */
+if (fmt == 0) {
+IPL_assert(ccw_addr <= 0xFF - 8, "Invalid ccw address");
+}
+
+orb.fmt = fmt ;
+orb.pfch = 1;  /* QEMU's cio implementation requires prefetch */
+orb.c64 = 1;   /* QEMU's cio implementation requires 64-bit idaws */
+orb.lpm = 0xFF; /* All paths allowed */
+orb.cpa = ccw_addr;
+
+while (true) {
+rc = ssch(schid, );
+if (rc == 1) {
+/* Status pending, not sure why. Let's eat the status and retry. */
+tsch(schid, );
+retries++;
+continue;
+}
+if (rc) {
+print_int("ssch failed with rc=", rc);
+break;
+}
+
+consume_io_int();
+
+/* collect status */
+rc = tsch(schid, );
+if (rc) {
+print_int("tsch failed with rc=", rc);
+break;
+}
+
+if (!irb_error()) {
+break;
+}
+
+/*
+ * Unexpected unit check, or interface-control-check. Use sense to
+ * clear unit check then retry.
+ */
+if ((unit_check() || iface_ctrl_check()) && retries <= 2) {
+basic_sense(schid, , sizeof(sd));


We are using basic sense to clear any unit check or ifcc, but is it
possible for the basic sense to cause another unit check?

The chapter on Basic Sense in the Common I/O Device Commands
(http://publibz.boulder.ibm.com/support/libraryserver/FRAMESET/dz9ar501/2.1?SHELF==19920409154647=)
   says this:

""
The basic sense command initiates a sense operation  at  all  devices
and cannot  cause  the  command-reject,  intervention-required,
data-check, or overrun bit to be set to one.  If the control unit
detects  an  equipment malfunction  or  invalid  checking-block  code
(CBC) on the sense-command code, the equipment-check or bus-out-check
bit is set  to  one,  and  unit check is indicated in the device-status
byte.
""

If my understanding is correct, if there is an equipment malfunction
then the control unit can return a unit check even for a basic sense.
This can lead to infinite recursion in the bios.


I think the retries variable is supposed to take care of that.



If I understand the code correctly, the retries variable cannot prevent 
infinite recursion. Because every time we get a unit check we do a basic 
sense which calls the do_cio function again. If that basic sense returns 
a unit check we do another basic sense



What I don't understand is why we do the basic sense after an IFCC?
Wouldn't it make more sense to simply retry the original command in
that case?






+retries++;
+continue;
+}
+
+break;
+}
+
+return rc;
+}










Re: [Qemu-devel] [PATCH 10/15] s390-bios: Support for running format-0/1 channel programs

2019-01-31 Thread Farhan Ali




On 01/29/2019 08:29 AM, Jason J. Herne wrote:

Add struct for format-0 ccws. Support executing format-0 channel
programs and waiting for their completion before continuing execution.
This will be used for real dasd ipl.

Add cu_type() to channel io library. This will be used to query control
unit type which is used to determine if we are booting a virtio device or a
real dasd device.

Signed-off-by: Jason J. Herne
---
  pc-bios/s390-ccw/cio.c  | 114 +++
  pc-bios/s390-ccw/cio.h  | 127 ++--
  pc-bios/s390-ccw/s390-ccw.h |   1 +
  pc-bios/s390-ccw/start.S|  33 +++-
  4 files changed, 270 insertions(+), 5 deletions(-)

diff --git a/pc-bios/s390-ccw/cio.c b/pc-bios/s390-ccw/cio.c
index 095f79b..63581c6 100644
--- a/pc-bios/s390-ccw/cio.c
+++ b/pc-bios/s390-ccw/cio.c
@@ -10,6 +10,7 @@
  
  #include "libc.h"

  #include "s390-ccw.h"
+#include "s390-arch.h"
  #include "cio.h"
  
  static char chsc_page[PAGE_SIZE] __attribute__((__aligned__(PAGE_SIZE)));

@@ -39,3 +40,116 @@ void enable_subchannel(SubChannelId schid)
  schib.pmcw.ena = 1;
  msch(schid, );
  }
+
+uint16_t cu_type(SubChannelId schid)
+{
+Ccw1 sense_id_ccw;
+SenseId sense_data;
+
+sense_id_ccw.cmd_code = CCW_CMD_SENSE_ID;
+sense_id_ccw.cda = ptr2u32(_data);
+sense_id_ccw.count = sizeof(sense_data);
+sense_id_ccw.flags |= CCW_FLAG_SLI;
+
+if (do_cio(schid, ptr2u32(_id_ccw), CCW_FMT1)) {
+panic("Failed to run SenseID CCw\n");
+}
+
+return sense_data.cu_type;
+}
+
+void basic_sense(SubChannelId schid, void *sense_data, uint16_t data_size)
+{
+Ccw1 senseCcw;
+
+senseCcw.cmd_code = CCW_CMD_BASIC_SENSE;
+senseCcw.cda = ptr2u32(sense_data);
+senseCcw.count = data_size;
+
+if (do_cio(schid, ptr2u32(), CCW_FMT1)) {
+panic("Failed to run Basic Sense CCW\n");
+}
+}
+
+static bool irb_error(Irb *irb)
+{
+if (irb->scsw.cstat) {
+return true;
+}
+return irb->scsw.dstat != (SCSW_DSTAT_DEVEND | SCSW_DSTAT_CHEND);
+}
+
+/*
+ * Executes a channel program at a given subchannel. The request to run the
+ * channel program is sent to the subchannel, we then wait for the interrupt
+ * signaling completion of the I/O operation(s) performed by the channel
+ * program. Lastly we verify that the i/o operation completed without error and
+ * that the interrupt we received was for the subchannel used to run the
+ * channel program.
+ *
+ * Note: This function assumes it is running in an environment where no other
+ * cpus are generating or receiving I/O interrupts. So either run it in a
+ * single-cpu environment or make sure all other cpus are not doing I/O and
+ * have I/O interrupts masked off.
+ */
+int do_cio(SubChannelId schid, uint32_t ccw_addr, int fmt)
+{
+CmdOrb orb = {};
+Irb irb = {};
+sense_data_eckd_dasd sd;
+int rc, retries = 0;
+
+IPL_assert(fmt == 0 || fmt == 1, "Invalid ccw format");
+
+/* ccw_addr must be <= 24 bits and point to at least one whole ccw. */
+if (fmt == 0) {
+IPL_assert(ccw_addr <= 0xFF - 8, "Invalid ccw address");
+}
+
+orb.fmt = fmt ;
+orb.pfch = 1;  /* QEMU's cio implementation requires prefetch */
+orb.c64 = 1;   /* QEMU's cio implementation requires 64-bit idaws */
+orb.lpm = 0xFF; /* All paths allowed */
+orb.cpa = ccw_addr;
+
+while (true) {
+rc = ssch(schid, );
+if (rc == 1) {
+/* Status pending, not sure why. Let's eat the status and retry. */
+tsch(schid, );
+retries++;
+continue;
+}
+if (rc) {
+print_int("ssch failed with rc=", rc);
+break;
+}
+
+consume_io_int();
+
+/* collect status */
+rc = tsch(schid, );
+if (rc) {
+print_int("tsch failed with rc=", rc);
+break;
+}
+
+if (!irb_error()) {
+break;
+}
+
+/*
+ * Unexpected unit check, or interface-control-check. Use sense to
+ * clear unit check then retry.
+ */
+if ((unit_check() || iface_ctrl_check()) && retries <= 2) {
+basic_sense(schid, , sizeof(sd));


We are using basic sense to clear any unit check or ifcc, but is it 
possible for the basic sense to cause another unit check?


The chapter on Basic Sense in the Common I/O Device Commands 
(http://publibz.boulder.ibm.com/support/libraryserver/FRAMESET/dz9ar501/2.1?SHELF==19920409154647=) 
 says this:


""
The basic sense command initiates a sense operation  at  all  devices 
and cannot  cause  the  command-reject,  intervention-required, 
data-check, or overrun bit to be set to one.  If the control unit 
detects  an  equipment malfunction  or  invalid  checking-block  code 
(CBC) on the sense-command code, the equipment-check or bus-out-check 
bit is set  to  one,  and  unit check is indicated in the 

Re: [Qemu-devel] [PATCH 07/15] s390-bios: Decouple channel i/o logic from virtio

2019-01-31 Thread Farhan Ali




On 01/29/2019 08:29 AM, Jason J. Herne wrote:

Create a separate library for channel i/o related code. This decouples
channel i/o operations from virtio and allows us to make use of them for
the real dasd boot path.

Signed-off-by: Jason J. Herne 
---
  pc-bios/s390-ccw/Makefile|  2 +-
  pc-bios/s390-ccw/cio.c   | 41 
  pc-bios/s390-ccw/cio.h   |  3 +++
  pc-bios/s390-ccw/main.c  |  1 +
  pc-bios/s390-ccw/netboot.mak |  2 +-
  pc-bios/s390-ccw/netmain.c   |  1 +
  pc-bios/s390-ccw/s390-ccw.h  |  1 -
  pc-bios/s390-ccw/virtio-blkdev.c |  1 +
  pc-bios/s390-ccw/virtio.c| 27 ++
  9 files changed, 51 insertions(+), 28 deletions(-)
  create mode 100644 pc-bios/s390-ccw/cio.c

diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
index 1eb316b..12ad9c1 100644
--- a/pc-bios/s390-ccw/Makefile
+++ b/pc-bios/s390-ccw/Makefile
@@ -10,7 +10,7 @@ $(call set-vpath, $(SRC_PATH)/pc-bios/s390-ccw)
  .PHONY : all clean build-all
  
  OBJECTS = start.o main.o bootmap.o jump2ipl.o sclp.o menu.o \

- virtio.o virtio-scsi.o virtio-blkdev.o libc.o
+ virtio.o virtio-scsi.o virtio-blkdev.o libc.o cio.o
  
  QEMU_CFLAGS := $(filter -W%, $(QEMU_CFLAGS))

  QEMU_CFLAGS += -ffreestanding -fno-delete-null-pointer-checks -msoft-float
diff --git a/pc-bios/s390-ccw/cio.c b/pc-bios/s390-ccw/cio.c
new file mode 100644
index 000..095f79b
--- /dev/null
+++ b/pc-bios/s390-ccw/cio.c
@@ -0,0 +1,41 @@
+/*
+ * S390 Channel I/O
+ *
+ * Copyright (c) 2018 Jason J. Herne 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at
+ * your option) any later version. See the COPYING file in the top-level
+ * directory.
+ */
+


shouldn't the year be 2019 now? :)




Re: [Qemu-devel] [PATCH 06/15] s390-bios: Clean up cio.h

2019-01-31 Thread Farhan Ali
 {
  /*
   * sense-id response buffer layout
   */
-struct senseid {
+typedef struct senseid {
  /* common part */
  __u8  reserved;   /* always 0x'FF' */
  __u16 cu_type;/* control unit type */
@@ -203,15 +203,15 @@ struct senseid {
  __u8  unused; /* padding byte */
  /* extended part */
  struct ciw ciw[62];
-}  __attribute__ ((packed, aligned(4)));
+}  __attribute__ ((packed, aligned(4))) SenseId;
  
  /* interruption response block */

-struct irb {
+typedef struct irb {
  struct scsw scsw;
  __u32 esw[5];
  __u32 ecw[8];
  __u32 emw[8];
-}  __attribute__ ((packed, aligned(4)));
+}  __attribute__ ((packed, aligned(4))) Irb;
  
  /*

   * Some S390 specific IO instructions as inline
diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h
index 9828aa2..241c6d0 100644
--- a/pc-bios/s390-ccw/s390-ccw.h
+++ b/pc-bios/s390-ccw/s390-ccw.h
@@ -49,14 +49,6 @@ typedef unsigned long long __u64;
  #include "cio.h"
  #include "iplb.h"
  
-typedef struct irb Irb;

-typedef struct ccw1 Ccw1;
-typedef struct cmd_orb CmdOrb;
-typedef struct schib Schib;
-typedef struct chsc_area_sda ChscAreaSda;
-typedef struct senseid SenseId;
-typedef struct subchannel_id SubChannelId;
-
  /* start.s */
  void disabled_wait(void);
  void consume_sclp_int(void);




Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH 05/15] s390-bios: Factor finding boot device out of virtio code path

2019-01-31 Thread Farhan Ali
;
  boot_setup();
-virtio_setup();
+find_boot_device();
  
+virtio_setup();

  zipl_load(); /* no return */
  
  panic("Failed to load OS from hard disk\n");

diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c
index 58a48f3..9daf2cb 100644
--- a/tests/boot-serial-test.c
+++ b/tests/boot-serial-test.c
@@ -112,7 +112,7 @@ static testdef_t tests[] = {
  { "sparc", "SS-4", "", "MB86904" },
  { "sparc", "SS-600MP", "", "TMS390Z55" },
  { "sparc64", "sun4u", "", "UltraSPARC" },
-{ "s390x", "s390-ccw-virtio", "", "virtio device" },
+{ "s390x", "s390-ccw-virtio", "", "device" },
  { "m68k", "mcf5208evb", "", "TT", sizeof(kernel_mcf5208), kernel_mcf5208 
},
  { "microblaze", "petalogix-s3adsp1800", "", "TT",
sizeof(kernel_pls3adsp1800), kernel_pls3adsp1800 },




Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH 01/15] s390 vfio-ccw: Add bootindex property and IPLB data

2019-01-30 Thread Farhan Ali




On 01/29/2019 08:29 AM, Jason J. Herne wrote:

Add bootindex property and iplb data for vfio-ccw devices. This allows us to
forward boot information into the bios for vfio-ccw devices.

Signed-off-by: Jason J. Herne
Acked-by: Halil Pasic
---
  hw/s390x/ipl.c  | 14 ++
  hw/s390x/s390-ccw.c |  9 +
  hw/vfio/ccw.c   | 13 +
  include/hw/s390x/s390-ccw.h |  1 +
  include/hw/s390x/vfio-ccw.h | 38 ++
  5 files changed, 63 insertions(+), 12 deletions(-)
  create mode 100644 include/hw/s390x/vfio-ccw.h

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index 21f64ad..a993f65 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -19,6 +19,7 @@
  #include "hw/loader.h"
  #include "hw/boards.h"
  #include "hw/s390x/virtio-ccw.h"
+#include "hw/s390x/vfio-ccw.h"
  #include "hw/s390x/css.h"
  #include "hw/s390x/ebcdic.h"
  #include "ipl.h"
@@ -311,8 +312,12 @@ static CcwDevice *s390_get_ccw_device(DeviceState *dev_st)
  VirtioCcwDevice *virtio_ccw_dev = (VirtioCcwDevice *)
  object_dynamic_cast(OBJECT(qdev_get_parent_bus(dev_st)->parent),
  TYPE_VIRTIO_CCW_DEVICE);
+VFIOCCWDevice *vfio_ccw_dev = (VFIOCCWDevice *)
+object_dynamic_cast(OBJECT(dev_st), TYPE_VFIO_CCW);
  if (virtio_ccw_dev) {
  ccw_dev = CCW_DEVICE(virtio_ccw_dev);
+} else if (vfio_ccw_dev) {
+ccw_dev = CCW_DEVICE(vfio_ccw_dev);
  } else {
  SCSIDevice *sd = (SCSIDevice *)
  object_dynamic_cast(OBJECT(dev_st),
@@ -347,6 +352,8 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl)
  if (ccw_dev) {
  SCSIDevice *sd = (SCSIDevice *) object_dynamic_cast(OBJECT(dev_st),
  TYPE_SCSI_DEVICE);
+VFIOCCWDevice *vc = (VFIOCCWDevice *)
+object_dynamic_cast(OBJECT(dev_st), TYPE_VFIO_CCW);
  
  if (sd) {

  ipl->iplb.len = cpu_to_be32(S390_IPLB_MIN_QEMU_SCSI_LEN);
@@ -358,6 +365,13 @@ static bool s390_gen_initial_iplb(S390IPLState *ipl)
  ipl->iplb.scsi.channel = cpu_to_be16(sd->channel);
  ipl->iplb.scsi.devno = cpu_to_be16(ccw_dev->sch->devno);
  ipl->iplb.scsi.ssid = ccw_dev->sch->ssid & 3;
+} else if (vc) {
+CcwDevice *ccw_dev = CCW_DEVICE(vc);


Do we need this line here? I though ccw_dev was already correctly casted 
in s390_get_ccw_device?



+
+ipl->iplb.len = cpu_to_be32(S390_IPLB_MIN_CCW_LEN);
+ipl->iplb.pbt = S390_IPL_TYPE_CCW;
+ipl->iplb.ccw.devno = cpu_to_be16(ccw_dev->sch->devno);
+ipl->iplb.ccw.ssid = ccw_dev->sch->ssid & 3;
  } else {
  VirtIONet *vn = (VirtIONet *) object_dynamic_cast(OBJECT(dev_st),

TYPE_VIRTIO_NET);





Re: [Qemu-devel] [PATCH 03/15] s390-bios: decouple common boot logic from virtio

2019-01-30 Thread Farhan Ali




On 01/29/2019 08:29 AM, Jason J. Herne wrote:

Create a boot_setup function to handle getting boot information from
the machine/hypervisor. This decouples common boot logic from the
virtio code path and allows us to make use of it for the real dasd boot
scenario.

Signed-off-by: Jason J. Herne 
Acked-by: Halil Pasic 
Reviewed-by: Collin Walling 
---
  pc-bios/s390-ccw/main.c | 28 
  1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c
index e82fe2c..67df421 100644
--- a/pc-bios/s390-ccw/main.c
+++ b/pc-bios/s390-ccw/main.c
@@ -14,16 +14,17 @@
  
  char stack[PAGE_SIZE * 8] __attribute__((__aligned__(PAGE_SIZE)));

  static SubChannelId blk_schid = { .one = 1 };
-IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE)));
  static char loadparm_str[LOADPARM_LEN + 1] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  QemuIplParameters qipl;
+IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE)));
+static bool have_iplb;
  
  #define LOADPARM_PROMPT "PROMPT  "

  #define LOADPARM_EMPTY  ""
  #define BOOT_MENU_FLAG_MASK (QIPL_FLAG_BM_OPTS_CMD | QIPL_FLAG_BM_OPTS_ZIPL)
  
  /*

- * Priniciples of Operations (SA22-7832-09) chapter 17 requires that
+ * Principles of Operations (SA22-7832-09) chapter 17 requires that
   * a subsystem-identification is at 184-187 and bytes 188-191 are zero
   * after list-directed-IPL and ccw-IPL.
   */
@@ -111,23 +112,33 @@ static void css_setup(void)
  enable_mss_facility();
  }
  
+/*

+ * Collect various pieces of information from the hypervisor/hardware that
+ * we'll use to determine exactly how we'll boot.
+ */
+static void boot_setup(void)
+{
+char lpmsg[] = "LOADPARM=[]\n";
+
+sclp_get_loadparm_ascii(loadparm_str);
+memcpy(lpmsg + 10, loadparm_str, 8);
+sclp_print(lpmsg);
+
+have_iplb = store_iplb();
+}
+
  static void virtio_setup(void)
  {
  Schib schib;
  int ssid;
  bool found = false;
  uint16_t dev_no;
-char ldp[] = "LOADPARM=[]\n";
  VDev *vdev = virtio_get_device();
  QemuIplParameters *early_qipl = (QemuIplParameters *)QIPL_ADDRESS;
  
-sclp_get_loadparm_ascii(loadparm_str);

-memcpy(ldp + 10, loadparm_str, LOADPARM_LEN);
-sclp_print(ldp);
-
  memcpy(, early_qipl, sizeof(QemuIplParameters));
  
-if (store_iplb()) {

+if (have_iplb) {
  switch (iplb.pbt) {
  case S390_IPL_TYPE_CCW:
  dev_no = iplb.ccw.devno;
@@ -174,6 +185,7 @@ int main(void)
  {
  sclp_setup();
  css_setup();
+boot_setup();
  virtio_setup();
  
  zipl_load(); /* no return */



Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH 02/15] s390-bios: decouple cio setup from virtio

2019-01-30 Thread Farhan Ali




On 01/29/2019 08:29 AM, Jason J. Herne wrote:

Move channel i/o setup code out to a separate function. This decouples cio
setup from the virtio code path and allows us to make use of it for booting
dasd devices.

Signed-off-by: Jason J. Herne 
Acked-by: Halil Pasic 
Reviewed-by: Collin Walling 
---
  pc-bios/s390-ccw/main.c | 20 +---
  1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c
index 544851d..e82fe2c 100644
--- a/pc-bios/s390-ccw/main.c
+++ b/pc-bios/s390-ccw/main.c
@@ -99,6 +99,18 @@ static void menu_setup(void)
  }
  }
  
+/*

+ * Initialize the channel I/O subsystem so we can talk to our ipl/boot device.
+ */
+static void css_setup(void)
+{
+/*
+ * Unconditionally enable mss support. In every sane configuration this
+ * will succeed; and even if it doesn't, stsch_err() can handle it.
+ */
+enable_mss_facility();
+}
+
  static void virtio_setup(void)
  {
  Schib schib;
@@ -109,13 +121,6 @@ static void virtio_setup(void)
  VDev *vdev = virtio_get_device();
  QemuIplParameters *early_qipl = (QemuIplParameters *)QIPL_ADDRESS;
  
-/*

- * We unconditionally enable mss support. In every sane configuration,
- * this will succeed; and even if it doesn't, stsch_err() can deal
- * with the consequences.
- */
-enable_mss_facility();
-
  sclp_get_loadparm_ascii(loadparm_str);
  memcpy(ldp + 10, loadparm_str, LOADPARM_LEN);
  sclp_print(ldp);
@@ -168,6 +173,7 @@ static void virtio_setup(void)
  int main(void)
  {
  sclp_setup();
+css_setup();
  virtio_setup();
  
  zipl_load(); /* no return */



Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH v2 2/5] vfio-ccw: concurrent I/O handling

2019-01-30 Thread Farhan Ali




On 01/30/2019 08:29 AM, Cornelia Huck wrote:

On Tue, 29 Jan 2019 20:39:33 +0100
Halil Pasic  wrote:


On Tue, 29 Jan 2019 10:58:40 +0100
Cornelia Huck  wrote:


The problem I see with the let the hardware sort it out is that, for
that to work, we need to juggle multiple translations simultaneously
(or am I wrong?). Doing that does not appear particularly simple to
me.


None in the first stage, at most two in the second stage, I guess.
   


Expected benefit of the second stage over the first stage? (I see none.)


Making something possible that is allowed by the architecture. Not
really important, though.
   


I had a chat with Farhan, and he suggested that by 'allowed by
architecture' you mean " You can submit a new request if the subchannel
is pending with primary, but not with secondary state." (from Message-ID:
<20190125152154.05120461.coh...@redhat.com>).


Yes. I might have mixed things up, though.



So I re-read the PoP.

 From the description of the start subchannel instruction:
"""
Special Conditions

Condition code 1 is set, and no other action is
taken, when the subchannel is status pending when
START SUBCHANNEL is executed. On some mod-
els, condition code 1 is not set when the subchannel
is status pending with only secondary status; instead,
the status-pending condition is discarded.

Condition code 2 is set, and no other action is
taken, when a start, halt, or clear function is currently
in progress at the subchannel (see “Function Control
(FC)” on page 13).

"""

So I guess you mixed primary and secondary up and wanted to say:
"You can submit a new request if the subchannel
is pending with _secondary_, but not with _primary_ _status_."

But does that really mean architecture allows the subchannel
to accept multiple ssch() instructions so that it ends up processing
two or more channel programs in parallel.


That's not what I meant. The vfio-ccw driver still holds on to one cp,
while a second one could be submitted.

But let's just end discussing this here, and continue with discussing
the reworked state machine, ok? It's not really relevant for going
forward with halt/clear.



+1
 I think we should move forward with halt/clear.

Thanks
Farhan




Re: [Qemu-devel] [PATCH 11/15] s390-bios: cio error handling

2018-12-13 Thread Farhan Ali




On 12/12/2018 09:11 AM, Jason J. Herne wrote:

Add verbose error output for when unexpected i/o errors happen. This eases the
burden of debugging and reporting i/o errors. No error information is printed
in the success case, here is an example of what is output on error:

vfio-ccw device I/O error - Interrupt Response Block Data:
 Function Ctrl : [Start]
 Activity Ctrl : [Start-Pending]
 Status Ctrl : [Alert] [Primary] [Secondary] [Status-Pending]
 Device Status : [Unit-Check]
 Channel Status :
 cpa=: 0x01e67098
 prev_ccw=: 0x
 this_ccw=: 0x

Sense Data (fmt 32-bytes):
 Sense Condition Flags : [Equipment-Check]
 Residual Count =: 0x
 Phys Drive ID  =: 0x009e
 low cyl address=: 0x
 head addr & hi cyl =: 0x
 format/message =: 0x0008
 fmt-dependent[0-7] =: 0x0004
 fmt-dependent[8-15]=: 0xe561282305082fff
 prog action code   =: 0x0016
 Configuration info =: 0x40e0
 mcode / hi-cyl =: 0x
 cyl & head addr [0]=: 0x
 cyl & head addr [1]=: 0x
 cyl & head addr [2]=: 0x

Signed-off-by: Jason J. Herne 
---
  pc-bios/s390-ccw/cio.c  | 225 
  pc-bios/s390-ccw/libc.h |  11 +++
  2 files changed, 236 insertions(+)

diff --git a/pc-bios/s390-ccw/cio.c b/pc-bios/s390-ccw/cio.c
index 9019250..c72e13c 100644
--- a/pc-bios/s390-ccw/cio.c
+++ b/pc-bios/s390-ccw/cio.c
@@ -83,6 +83,228 @@ static bool irb_error(Irb *irb)
  return irb->scsw.dstat != 0xc;
  }

+static void print_sense_data(SenseData *sd)
+{
+char msgline[512];
+
+if (sd->config_info & 0x8000) {
+sclp_print("Sense Data (fmt 24-bytes):\n");
+} else {
+sclp_print("Sense Data (fmt 32-bytes):\n");
+}
+
+strcat(msgline, "Sense Condition Flags :");
+if (sd->status[0] & SNS0_CMD_REJECT) {
+strcat(msgline, " [Cmd-Reject]");
+}
+if (sd->status[0] & SNS0_INTERVENTION_REQ) {
+strcat(msgline, " [Intervention-Required]");
+}
+if (sd->status[0] & SNS0_BUS_OUT_CHECK) {
+strcat(msgline, " [Bus-Out-Parity-Check]");
+}
+if (sd->status[0] & SNS0_EQUIPMENT_CHECK) {
+strcat(msgline, " [Equipment-Check]");
+}
+if (sd->status[0] & SNS0_DATA_CHECK) {
+strcat(msgline, " [Data-Check]");
+}
+if (sd->status[0] & SNS0_OVERRUN) {
+strcat(msgline, " [Overrun]");
+}
+if (sd->status[0] & SNS0_INCOMPL_DOMAIN) {
+strcat(msgline, " [Incomplete-Domain]");
+}
+
+if (sd->status[1] & SNS1_PERM_ERR) {
+strcat(msgline, " [Permanent-Error]");
+}
+if (sd->status[1] & SNS1_INV_TRACK_FORMAT) {
+strcat(msgline, " [Invalid-Track-Fmt]");
+}
+if (sd->status[1] & SNS1_EOC) {
+strcat(msgline, " [End-of-Cyl]");
+}
+if (sd->status[1] & SNS1_MESSAGE_TO_OPER) {
+strcat(msgline, " [Operator-Msg]");
+}
+if (sd->status[1] & SNS1_NO_REC_FOUND) {
+strcat(msgline, " [No-Record-Found]");
+}
+if (sd->status[1] & SNS1_FILE_PROTECTED) {
+strcat(msgline, " [File-Protected]");
+}
+if (sd->status[1] & SNS1_WRITE_INHIBITED) {
+strcat(msgline, " [Write-Inhibited]");
+}
+if (sd->status[1] & SNS1_INPRECISE_END) {
+strcat(msgline, " [Imprecise-Ending]");
+}
+
+if (sd->status[2] & SNS2_REQ_INH_WRITE) {
+strcat(msgline, " [Req-Inhibit-Write]");
+}
+if (sd->status[2] & SNS2_CORRECTABLE) {
+strcat(msgline, " [Correctable-Data-Check]");
+}
+if (sd->status[2] & SNS2_FIRST_LOG_ERR) {
+strcat(msgline, " [First-Error-Log]");
+}
+if (sd->status[2] & SNS2_ENV_DATA_PRESENT) {
+strcat(msgline, " [Env-Data-Present]");
+}
+if (sd->status[2] & SNS2_INPRECISE_END) {
+strcat(msgline, " [Imprecise-End]");
+}
+strcat(msgline, "\n");
+sclp_print(msgline);
+
+print_int("Residual Count =", sd->res_count);
+print_int("Phys Drive ID  =", sd->phys_drive_id);
+print_int("low cyl address=", sd->low_cyl_addr);
+print_int("head addr & hi cyl =", sd->head_high_cyl_addr);
+print_int("format/message =", sd->fmt_msg);
+print_int("fmt-dependent[0-7] =", sd->fmt_dependent_info[0]);
+print_int("fmt-dependent[8-15]=", sd->fmt_dependent_info[1]);
+print_int("prog action code   =", sd->program_action_code);
+print_int("Configuration info =", sd->config_info);
+print_int("mcode / hi-cyl =", sd->mcode_hicyl);
+print_int("cyl & head addr [0]=", sd->cyl_head_addr[0]);
+print_int("cyl & head addr [1]=", sd->cyl_head_addr[1]);
+print_int("cyl & head addr [2]=", sd->cyl_head_addr[2]);
+}
+
+static void print_irb_err(Irb *irb)
+{
+

Re: [Qemu-devel] [PATCH 10/15] s390-bios: Support for running format-0/1 channel programs

2018-12-13 Thread Farhan Ali




On 12/12/2018 09:11 AM, Jason J. Herne wrote:

Add struct for format-0 ccws. Support executing format-0 channel
programs and waiting for their completion before continuing execution.
This will be used for real dasd ipl.

Add cu_type() to channel io library. This will be used to query control
unit type which is used to determine if we are booting a virtio device or a
real dasd device.

Signed-off-by: Jason J. Herne 
---
  pc-bios/s390-ccw/cio.c  | 108 ++
  pc-bios/s390-ccw/cio.h  | 124 ++--
  pc-bios/s390-ccw/s390-ccw.h |   1 +
  pc-bios/s390-ccw/start.S|  33 +++-
  4 files changed, 261 insertions(+), 5 deletions(-)

diff --git a/pc-bios/s390-ccw/cio.c b/pc-bios/s390-ccw/cio.c
index 095f79b..9019250 100644
--- a/pc-bios/s390-ccw/cio.c
+++ b/pc-bios/s390-ccw/cio.c
@@ -10,6 +10,7 @@

  #include "libc.h"
  #include "s390-ccw.h"
+#include "s390-arch.h"
  #include "cio.h"

  static char chsc_page[PAGE_SIZE] __attribute__((__aligned__(PAGE_SIZE)));
@@ -39,3 +40,110 @@ void enable_subchannel(SubChannelId schid)
  schib.pmcw.ena = 1;
  msch(schid, );
  }
+
+uint16_t cu_type(SubChannelId schid)
+{
+Ccw1 sense_id_ccw;
+SenseId sense_data;
+
+sense_id_ccw.cmd_code = CCW_CMD_SENSE_ID;
+sense_id_ccw.cda = ptr2u32(_data);
+sense_id_ccw.count = sizeof(sense_data);
+
+if (do_cio(schid, ptr2u32(_id_ccw), CCW_FMT1)) {
+panic("Failed to run SenseID CCw\n");
+}
+
+return sense_data.cu_type;
+}
+
+void basic_sense(SubChannelId schid, SenseData *sd)
+{
+Ccw1 senseCcw;
+
+senseCcw.cmd_code = CCW_CMD_BASIC_SENSE;
+senseCcw.cda = ptr2u32(sd);
+senseCcw.count = sizeof(*sd);
+
+if (do_cio(schid, ptr2u32(), CCW_FMT1)) {
+panic("Failed to run Basic Sense CCW\n");
+}
+}
+
+static bool irb_error(Irb *irb)
+{
+/* We have to ignore Incorrect Length (cstat == 0x40) indicators because
+ * real devices expect a 24 byte SenseID  buffer, and virtio devices expect
+ * a much larger buffer. Neither device type can tolerate a buffer size
+ * different from what they expect so they set this indicator.
+ */
+if (irb->scsw.cstat != 0x00 && irb->scsw.cstat != 0x40) {
+return true;
+}
+return irb->scsw.dstat != 0xc;
+}
+
+/* Executes a channel program at a given subchannel. The request to run the
+ * channel program is sent to the subchannel, we then wait for the interrupt
+ * singaling completion of the I/O operation(s) perfomed by the channel
+ * program. Lastly we verify that the i/o operation completed without error and
+ * that the interrupt we received was for the subchannel used to run the
+ * channel program.
+ *
+ * Note: This function assumes it is running in an environment where no other
+ * cpus are generating or receiving I/O interrupts. So either run it in a
+ * single-cpu environment or make sure all other cpus are not doing I/O and
+ * have I/O interrupts masked off.
+ */
+int do_cio(SubChannelId schid, uint32_t ccw_addr, int fmt)
+{
+CmdOrb orb = {};
+Irb irb = {};
+SenseData sd;
+int rc, retries = 0;
+
+IPL_assert(fmt == 0 || fmt == 1, "Invalid ccw format");
+
+/* ccw_addr must be <= 24 bits and point to at least one whole ccw. */
+if (fmt == 0) {
+IPL_assert(ccw_addr <= 0xFF - 8, "Invalid ccw address");
+}
+
+orb.fmt = fmt ;
+orb.pfch = 1;  /* QEMU's cio implementation requires prefetch */
+orb.c64 = 1;   /* QEMU's cio implementation requires 64-bit idaws */
+orb.lpm = 0xFF; /* All paths allowed */
+orb.cpa = ccw_addr;
+
+while (true) {
+rc = ssch(schid, );
+if (rc) {
+print_int("ssch failed with rc=", rc);
+break;
+}
+
+consume_io_int();
+
+/* Clear read */
+rc = tsch(schid, );
+if (rc) {
+print_int("tsch failed with rc=", rc);
+break;
+}
+
+if (!irb_error()) {
+break;
+}


Now that we enable i/o interrupts, do we still need drain_irqs function 
anymore? Maybe we could refactor that.


My question is more out of curiosity and I know it's out of the score 
for this patch series.



+
+/* Unexpected unit check. Use sense to clear unit check then retry. */
+if (unit_check() && retries <= 2) {
+basic_sense(schid, );
+retries++;
+continue;
+}
+
+break;
+}
+
+return rc;
+}
diff --git a/pc-bios/s390-ccw/cio.h b/pc-bios/s390-ccw/cio.h
index 7b07d75..5c16635 100644
--- a/pc-bios/s390-ccw/cio.h
+++ b/pc-bios/s390-ccw/cio.h
@@ -70,9 +70,46 @@ struct scsw {
  __u16 count;
  } __attribute__ ((packed));

-#define SCSW_FCTL_CLEAR_FUNC 0x1000
-#define SCSW_FCTL_HALT_FUNC 0x2000
+/* Function Control */
  #define SCSW_FCTL_START_FUNC 0x4000
+#define SCSW_FCTL_HALT_FUNC 0x2000
+#define SCSW_FCTL_CLEAR_FUNC 0x1000
+
+/* Activity Control */
+#define 

Re: [Qemu-devel] [PATCH 05/15] s390-bios: Factor finding boot device out of virtio code path

2018-12-13 Thread Farhan Ali




On 12/12/2018 09:11 AM, Jason J. Herne wrote:

Make a new routine find_boot_device to locate the boot device for all
cases. not just virtio.


I don't think there should be a period after cases?



In one case no boot device is specified and a suitable boot device can not
be auto detected. The error message for this case was specific to virtio
devices. We update this message to remove virtio specific wording.

Signed-off-by: Jason J. Herne 
---
  pc-bios/s390-ccw/main.c  | 87 ++--
  tests/boot-serial-test.c |  2 +-
  2 files changed, 49 insertions(+), 40 deletions(-)

diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c
index 7e3f65e..2457752 100644
--- a/pc-bios/s390-ccw/main.c
+++ b/pc-bios/s390-ccw/main.c
@@ -55,17 +55,18 @@ unsigned int get_loadparm_index(void)
   * NOTE: The global variable blk_schid is updated to contain the subchannel
   * information.
   */
-static bool find_dev(Schib *schib, int dev_no)
+static bool find_subch(int dev_no)
  {
+Schib schib;
  int i, r;

  for (i = 0; i < 0x1; i++) {
  blk_schid.sch_no = i;
-r = stsch_err(blk_schid, schib);
+r = stsch_err(blk_schid, );
  if ((r == 3) || (r == -EIO)) {
  break;
  }
-if (!schib->pmcw.dnv) {
+if (!schib.pmcw.dnv) {
  continue;
  }

@@ -77,7 +78,7 @@ static bool find_dev(Schib *schib, int dev_no)
  continue;
  }

-if ((dev_no < 0) || (schib->pmcw.dev == dev_no)) {
+if ((dev_no < 0) || (schib.pmcw.dev == dev_no)) {
  return true;
  }
  }
@@ -133,56 +134,63 @@ static void boot_setup(void)
  have_iplb = store_iplb();
  }

-static void virtio_setup(void)
+static void find_boot_device(void)
  {
-Schib schib;
-int ssid;
-bool found = false;
-uint16_t dev_no;
  VDev *vdev = virtio_get_device();
-QemuIplParameters *early_qipl = (QemuIplParameters *)QIPL_ADDRESS;
-
-memcpy(, early_qipl, sizeof(QemuIplParameters));
+int ssid;
+bool found;

-if (have_iplb) {
-switch (iplb.pbt) {
-case S390_IPL_TYPE_CCW:
-dev_no = iplb.ccw.devno;
-debug_print_int("device no. ", dev_no);
-blk_schid.ssid = iplb.ccw.ssid & 0x3;
-debug_print_int("ssid ", blk_schid.ssid);
-found = find_dev(, dev_no);
-break;
-case S390_IPL_TYPE_QEMU_SCSI:
-vdev->scsi_device_selected = true;
-vdev->selected_scsi_device.channel = iplb.scsi.channel;
-vdev->selected_scsi_device.target = iplb.scsi.target;
-vdev->selected_scsi_device.lun = iplb.scsi.lun;
-blk_schid.ssid = iplb.scsi.ssid & 0x3;
-found = find_dev(, iplb.scsi.devno);
-break;
-default:
-panic("List-directed IPL not supported yet!\n");
-}
-menu_setup();
-} else {
+if (!have_iplb) {
  for (ssid = 0; ssid < 0x3; ssid++) {
  blk_schid.ssid = ssid;
-found = find_dev(, -1);
+found = find_subch(-1);
  if (found) {
-break;
+return;
  }
  }
+panic("Could not find a suitable boot device (none specified)\n");
  }

-IPL_assert(found, "No virtio device found");
+switch (iplb.pbt) {
+case S390_IPL_TYPE_CCW:
+debug_print_int("device no. ", iplb.ccw.devno);
+blk_schid.ssid = iplb.ccw.ssid & 0x3;
+debug_print_int("ssid ", blk_schid.ssid);
+found = find_subch(iplb.ccw.devno);
+break;
+case S390_IPL_TYPE_QEMU_SCSI:
+vdev->scsi_device_selected = true;
+vdev->selected_scsi_device.channel = iplb.scsi.channel;
+vdev->selected_scsi_device.target = iplb.scsi.target;
+vdev->selected_scsi_device.lun = iplb.scsi.lun;
+blk_schid.ssid = iplb.scsi.ssid & 0x3;
+found = find_subch(iplb.scsi.devno);
+break;
+default:
+panic("List-directed IPL not supported yet!\n");
+}
+
+if (!found) {
+IPL_assert(found, "Boot device not found\n");
+}
+}
+
+static void virtio_setup(void)
+{
+VDev *vdev = virtio_get_device();
+QemuIplParameters *early_qipl = (QemuIplParameters *)QIPL_ADDRESS;
+
+memcpy(, early_qipl, sizeof(QemuIplParameters));
+
+if (have_iplb) {
+menu_setup();
+}

  if (virtio_get_device_type() == VIRTIO_ID_NET) {
  sclp_print("Network boot device detected\n");
  vdev->netboot_start_addr = qipl.netboot_start_addr;
  } else {
  virtio_blk_setup_device(blk_schid);
-
  IPL_assert(virtio_ipl_disk_is_valid(), "No valid IPL device 
detected");
  }
  }
@@ -192,8 +200,9 @@ int main(void)
  sclp_setup();
  css_setup();
  boot_setup();
-virtio_setup();
+find_boot_device();

+virtio_setup();
  zipl_load(); /* no return */

  

Re: [Qemu-devel] [PATCH 03/15] s390-bios: decouple common boot logic from virtio

2018-12-13 Thread Farhan Ali




On 12/12/2018 09:11 AM, Jason J. Herne wrote:

Create a boot_setup function to handle getting boot information from
the machine/hypervisor. This decouples common boot logic from the
virtio code path and allows us to make use of it for the real dasd boot
scenario.

Signed-off-by: Jason J. Herne 
Acked-by: Halil Pasic 
Reviewed-by: Collin Walling 
---
  pc-bios/s390-ccw/main.c | 28 
  1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c
index e82fe2c..67df421 100644
--- a/pc-bios/s390-ccw/main.c
+++ b/pc-bios/s390-ccw/main.c
@@ -14,16 +14,17 @@

  char stack[PAGE_SIZE * 8] __attribute__((__aligned__(PAGE_SIZE)));
  static SubChannelId blk_schid = { .one = 1 };
-IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE)));
  static char loadparm_str[LOADPARM_LEN + 1] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  QemuIplParameters qipl;
+IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE)));
+static bool have_iplb;

  #define LOADPARM_PROMPT "PROMPT  "
  #define LOADPARM_EMPTY  ""
  #define BOOT_MENU_FLAG_MASK (QIPL_FLAG_BM_OPTS_CMD | QIPL_FLAG_BM_OPTS_ZIPL)

  /*
- * Priniciples of Operations (SA22-7832-09) chapter 17 requires that
+ * Principles of Operations (SA22-7832-09) chapter 17 requires that
   * a subsystem-identification is at 184-187 and bytes 188-191 are zero
   * after list-directed-IPL and ccw-IPL.
   */
@@ -111,23 +112,33 @@ static void css_setup(void)
  enable_mss_facility();
  }

+/*
+ * Collect various pieces of information from the hypervisor/hardware that
+ * we'll use to determine exactly how we'll boot.
+ */
+static void boot_setup(void)
+{
+char lpmsg[] = "LOADPARM=[]\n";
+
+sclp_get_loadparm_ascii(loadparm_str);
+memcpy(lpmsg + 10, loadparm_str, 8);
+sclp_print(lpmsg);
+
+have_iplb = store_iplb();
+}
+
  static void virtio_setup(void)
  {
  Schib schib;
  int ssid;
  bool found = false;
  uint16_t dev_no;
-char ldp[] = "LOADPARM=[]\n";
  VDev *vdev = virtio_get_device();
  QemuIplParameters *early_qipl = (QemuIplParameters *)QIPL_ADDRESS;

-sclp_get_loadparm_ascii(loadparm_str);
-memcpy(ldp + 10, loadparm_str, LOADPARM_LEN);
-sclp_print(ldp);
-
  memcpy(, early_qipl, sizeof(QemuIplParameters));

-if (store_iplb()) {
+if (have_iplb) {
  switch (iplb.pbt) {
  case S390_IPL_TYPE_CCW:
  dev_no = iplb.ccw.devno;
@@ -174,6 +185,7 @@ int main(void)
  {
  sclp_setup();
  css_setup();
+boot_setup();
  virtio_setup();

  zipl_load(); /* no return */




Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH 02/15] s390-bios: decouple cio setup from virtio

2018-12-13 Thread Farhan Ali




On 12/12/2018 09:11 AM, Jason J. Herne wrote:

Move channel i/o setup code out to a separate function. This decouples cio
setup from the virtio code path and allows us to make use of it for booting
dasd devices.

Signed-off-by: Jason J. Herne 
Acked-by: Halil Pasic 
Reviewed-by: Collin Walling 
---
  pc-bios/s390-ccw/main.c | 20 +---
  1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c
index 544851d..e82fe2c 100644
--- a/pc-bios/s390-ccw/main.c
+++ b/pc-bios/s390-ccw/main.c
@@ -99,6 +99,18 @@ static void menu_setup(void)
  }
  }

+/*
+ * Initialize the channel I/O subsystem so we can talk to our ipl/boot device.
+ */
+static void css_setup(void)
+{
+/*
+ * Unconditionally enable mss support. In every sane configuration this
+ * will succeed; and even if it doesn't, stsch_err() can handle it.
+ */
+enable_mss_facility();
+}
+
  static void virtio_setup(void)
  {
  Schib schib;
@@ -109,13 +121,6 @@ static void virtio_setup(void)
  VDev *vdev = virtio_get_device();
  QemuIplParameters *early_qipl = (QemuIplParameters *)QIPL_ADDRESS;

-/*
- * We unconditionally enable mss support. In every sane configuration,
- * this will succeed; and even if it doesn't, stsch_err() can deal
- * with the consequences.
- */
-enable_mss_facility();
-
  sclp_get_loadparm_ascii(loadparm_str);
  memcpy(ldp + 10, loadparm_str, LOADPARM_LEN);
  sclp_print(ldp);
@@ -168,6 +173,7 @@ static void virtio_setup(void)
  int main(void)
  {
  sclp_setup();
+css_setup();
  virtio_setup();

  zipl_load(); /* no return */



Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH v2] hw/s390/ccw.c: Don't take address of packed members

2018-12-13 Thread Farhan Ali




On 12/13/2018 07:02 AM, Peter Maydell wrote:

Taking the address of a field in a packed struct is a bad idea, because
it might not be actually aligned enough for that pointer type (and
thus cause a crash on dereference on some host architectures). Newer
versions of clang warn about this.

Avoid the problem by using local copies of the PMCW and SCSW
struct fields in copy_schib_from_guest() and copy_schib_to_guest().

Signed-off-by: Peter Maydell 
---
v1->v2 changes:
  * add comment about why we're using locals
  * name locals with underscores, as QEMU's naming conventions recommend


  hw/s390x/css.c | 28 
  1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index 04ec5cc9705..f92b046cd33 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -1290,9 +1290,19 @@ void copy_scsw_to_guest(SCSW *dest, const SCSW *src)
  static void copy_schib_to_guest(SCHIB *dest, const SCHIB *src)
  {
  int i;
+/*
+ * We copy the PMCW and SCSW in and out of local variables to
+ * avoid taking the address of members of a packed struct.
+ */
+PMCW src_pmcw, dest_pmcw;
+SCSW src_scsw, dest_scsw;
  
-copy_pmcw_to_guest(>pmcw, >pmcw);

-copy_scsw_to_guest(>scsw, >scsw);
+src_pmcw = src->pmcw;
+copy_pmcw_to_guest(_pmcw, _pmcw);
+dest->pmcw = dest_pmcw;
+src_scsw = src->scsw;
+copy_scsw_to_guest(_scsw, _scsw);
+dest->scsw = dest_scsw;
  dest->mba = cpu_to_be64(src->mba);
  for (i = 0; i < ARRAY_SIZE(dest->mda); i++) {
  dest->mda[i] = src->mda[i];
@@ -1339,9 +1349,19 @@ static void copy_scsw_from_guest(SCSW *dest, const SCSW 
*src)
  static void copy_schib_from_guest(SCHIB *dest, const SCHIB *src)
  {
  int i;
+/*
+ * We copy the PMCW and SCSW in and out of local variables to
+ * avoid taking the address of members of a packed struct.
+ */
+PMCW src_pmcw, dest_pmcw;
+SCSW src_scsw, dest_scsw;
  
-copy_pmcw_from_guest(>pmcw, >pmcw);

-copy_scsw_from_guest(>scsw, >scsw);
+src_pmcw = src->pmcw;
+copy_pmcw_from_guest(_pmcw, _pmcw);
+dest->pmcw = dest_pmcw;
+src_scsw = src->scsw;
+copy_scsw_from_guest(_scsw, _scsw);
+dest->scsw = dest_scsw;
  dest->mba = be64_to_cpu(src->mba);
  for (i = 0; i < ARRAY_SIZE(dest->mda); i++) {
  dest->mda[i] = src->mda[i];



Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH] hw/s390/ccw.c: Don't take address of packed members

2018-12-10 Thread Farhan Ali




On 12/10/2018 08:58 AM, Peter Maydell wrote:

Taking the address of a field in a packed struct is a bad idea, because
it might not be actually aligned enough for that pointer type (and
thus cause a crash on dereference on some host architectures). Newer
versions of clang warn about this.

Avoid the problem by using local copies of the PMCW and SCSW
struct fields in copy_schib_from_guest() and copy_schib_to_guest().

Signed-off-by: Peter Maydell 
---
This seemed like a not totally ugly and reasonably localised fix
that satisfies clang. Oddly, this makes the generated object file
15K smaller (421K vs 406K), so it might even be better code...

  hw/s390x/css.c | 20 
  1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index 04ec5cc9705..ef07691e36b 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -1290,9 +1290,15 @@ void copy_scsw_to_guest(SCSW *dest, const SCSW *src)
  static void copy_schib_to_guest(SCHIB *dest, const SCHIB *src)
  {
  int i;
+PMCW srcpmcw, destpmcw;
+SCSW srcscsw, destscsw;

-copy_pmcw_to_guest(>pmcw, >pmcw);
-copy_scsw_to_guest(>scsw, >scsw);
+srcpmcw = src->pmcw;
+copy_pmcw_to_guest(, );
+dest->pmcw = destpmcw;
+srcscsw = src->scsw;
+copy_scsw_to_guest(, );
+dest->scsw = destscsw;
  dest->mba = cpu_to_be64(src->mba);
  for (i = 0; i < ARRAY_SIZE(dest->mda); i++) {
  dest->mda[i] = src->mda[i];
@@ -1339,9 +1345,15 @@ static void copy_scsw_from_guest(SCSW *dest, const SCSW 
*src)
  static void copy_schib_from_guest(SCHIB *dest, const SCHIB *src)
  {
  int i;
+PMCW srcpmcw, destpmcw;
+SCSW srcscsw, destscsw;

-copy_pmcw_from_guest(>pmcw, >pmcw);
-copy_scsw_from_guest(>scsw, >scsw);
+srcpmcw = src->pmcw;
+copy_pmcw_from_guest(, );
+dest->pmcw = destpmcw;
+srcscsw = src->scsw;
+copy_scsw_from_guest(, );
+dest->scsw = destscsw;
  dest->mba = be64_to_cpu(src->mba);
  for (i = 0; i < ARRAY_SIZE(dest->mda); i++) {
  dest->mda[i] = src->mda[i];



Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH 0/3] vfio-ccw: support hsch/csch (kernel part)

2018-12-06 Thread Farhan Ali




On 12/06/2018 11:21 AM, Cornelia Huck wrote:

On Thu, 6 Dec 2018 10:26:12 -0500
Farhan Ali  wrote:


On 12/06/2018 09:39 AM, Cornelia Huck wrote:

On Wed, 5 Dec 2018 13:34:11 -0500
Farhan Ali  wrote:
   

On 12/05/2018 07:54 AM, Cornelia Huck wrote:

Yeah, that is perfectly clear, but it ain't the complete story. E.g.
are subsequent commands blocking until the preceding command finishes
is part of the interface. And what is good implementation depends on the
answer. What I mean, I first need to understand how things are supposed
to work (together) so I can double check that against the
implementation. Otherwise all I can do is nitpicking.

To get more tangible: we are in the middle of processing an SSCH request
(e.g. doing the translation) when a HSCH comes in. What should happen?
Should we start processing HSCH after he instruction part of SSCH is
done -- which currently includes translation? Or should we -EBUSY? Or do
we abort START related activities and do the HALT stuff?

I think most of the sorting-out-the-operations stuff should be done by
the hardware itself, and we should not really try to enforce anything
special in our vfio code.

For your example, it might be best if a hsch is always accepted and
send on towards the hardware. Probably best to reflect back -EAGAIN if
we're currently processing another instruction from another vcpu, so
that the user space caller can retry. Same for ssch, if another ssch is
already being processed. We*could*  reflect cc 2 if the fctl bit is
already set, but that won't do for csch, so it is probably best to have
the hardware figure that out in any case.

If I read the code correctly, we currently reflect -EBUSY and not
-EAGAIN if we get a ssch request while already processing another one.
QEMU hands that back to the guest as a cc 2, which is not 100% correct.
In practice, we don't see this with Linux guests due to locking.
  


If we have a ssch and a csch immediately afterwards from userspace, will
we end up issuing csch first and then ssch to the hardware?

If I understand correctly, the ccw translation as part of the ssch can
be a slow operation so it might be possible we issue the csch first?
In that case we won't actually clear the original start function as
intended.


When we start processing the ssch request (translation and so on), we
set the state to BUSY. This means that any csch request will get a
-EBUSY, no overtaking possible. (I think maybe I'll need to check what
this series looks like if I rebase it on top of Pierre's rework, as he
did some changes in the state machine.)


I think you meant the state is set to BOXED? otherwise the patch 3 says
if state is BUSY and CLEAR event request comes in, we issue the clear
instruction, no?


That's what I meant with "need to rebase" :) The BOXED state is gone; I
just had not rebased on top of it. There's more changes in the state
machine; if we are on the same page as to what should happen, I can
start massaging the patches.




Sorry maybe I missed it, but are you referring to Pierre's latest 
cleanup patches? I don't see him removing the BOXED state.


I think returning -EAGAIN and asking the userspace to retry the 
operation sounds reasonable to me.


But how do we handle the issue of protecting the cmd_region from 
simultaneous hsch and csch calls? Do we agree on Pierre's method of 
making write calls mutually exclusive?





Re: [Qemu-devel] [PATCH 0/3] vfio-ccw: support hsch/csch (kernel part)

2018-12-06 Thread Farhan Ali




On 12/06/2018 09:39 AM, Cornelia Huck wrote:

On Wed, 5 Dec 2018 13:34:11 -0500
Farhan Ali  wrote:


On 12/05/2018 07:54 AM, Cornelia Huck wrote:

Yeah, that is perfectly clear, but it ain't the complete story. E.g.
are subsequent commands blocking until the preceding command finishes
is part of the interface. And what is good implementation depends on the
answer. What I mean, I first need to understand how things are supposed
to work (together) so I can double check that against the
implementation. Otherwise all I can do is nitpicking.

To get more tangible: we are in the middle of processing an SSCH request
(e.g. doing the translation) when a HSCH comes in. What should happen?
Should we start processing HSCH after he instruction part of SSCH is
done -- which currently includes translation? Or should we -EBUSY? Or do
we abort START related activities and do the HALT stuff?

I think most of the sorting-out-the-operations stuff should be done by
the hardware itself, and we should not really try to enforce anything
special in our vfio code.

For your example, it might be best if a hsch is always accepted and
send on towards the hardware. Probably best to reflect back -EAGAIN if
we're currently processing another instruction from another vcpu, so
that the user space caller can retry. Same for ssch, if another ssch is
already being processed. We*could*  reflect cc 2 if the fctl bit is
already set, but that won't do for csch, so it is probably best to have
the hardware figure that out in any case.

If I read the code correctly, we currently reflect -EBUSY and not
-EAGAIN if we get a ssch request while already processing another one.
QEMU hands that back to the guest as a cc 2, which is not 100% correct.
In practice, we don't see this with Linux guests due to locking.
   


If we have a ssch and a csch immediately afterwards from userspace, will
we end up issuing csch first and then ssch to the hardware?

If I understand correctly, the ccw translation as part of the ssch can
be a slow operation so it might be possible we issue the csch first?
In that case we won't actually clear the original start function as
intended.


When we start processing the ssch request (translation and so on), we
set the state to BUSY. This means that any csch request will get a
-EBUSY, no overtaking possible. (I think maybe I'll need to check what
this series looks like if I rebase it on top of Pierre's rework, as he
did some changes in the state machine.)


I think you meant the state is set to BOXED? otherwise the patch 3 says 
if state is BUSY and CLEAR event request comes in, we issue the clear 
instruction, no?





My idea above was to return -EAGAIN instead of -EBUSY, so that user
space can retry the operation.







Re: [Qemu-devel] [PATCH 0/3] vfio-ccw: support hsch/csch (kernel part)

2018-12-05 Thread Farhan Ali




On 12/05/2018 07:54 AM, Cornelia Huck wrote:

Yeah, that is perfectly clear, but it ain't the complete story. E.g.
are subsequent commands blocking until the preceding command finishes
is part of the interface. And what is good implementation depends on the
answer. What I mean, I first need to understand how things are supposed
to work (together) so I can double check that against the
implementation. Otherwise all I can do is nitpicking.

To get more tangible: we are in the middle of processing an SSCH request
(e.g. doing the translation) when a HSCH comes in. What should happen?
Should we start processing HSCH after he instruction part of SSCH is
done -- which currently includes translation? Or should we -EBUSY? Or do
we abort START related activities and do the HALT stuff?

I think most of the sorting-out-the-operations stuff should be done by
the hardware itself, and we should not really try to enforce anything
special in our vfio code.

For your example, it might be best if a hsch is always accepted and
send on towards the hardware. Probably best to reflect back -EAGAIN if
we're currently processing another instruction from another vcpu, so
that the user space caller can retry. Same for ssch, if another ssch is
already being processed. We*could*  reflect cc 2 if the fctl bit is
already set, but that won't do for csch, so it is probably best to have
the hardware figure that out in any case.

If I read the code correctly, we currently reflect -EBUSY and not
-EAGAIN if we get a ssch request while already processing another one.
QEMU hands that back to the guest as a cc 2, which is not 100% correct.
In practice, we don't see this with Linux guests due to locking.



If we have a ssch and a csch immediately afterwards from userspace, will 
we end up issuing csch first and then ssch to the hardware?


If I understand correctly, the ccw translation as part of the ssch can 
be a slow operation so it might be possible we issue the csch first?
In that case we won't actually clear the original start function as 
intended.



Thanks
Farhan




[Qemu-devel] [PATCH for 3.1? or 4 v4 1/1] qemu-iotests: Don't run the test when user is root

2018-12-03 Thread Farhan Ali
Test 232 creates image files with read-only permission and
expects an error message when trying to access the image
files with read-only and auto-read-only turned off.

Don't run as root user, since root can open files with read/write
access for read-only files.

Signed-off-by: Farhan Ali 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/232 | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tests/qemu-iotests/232 b/tests/qemu-iotests/232
index 0708b8b..1d34242 100755
--- a/tests/qemu-iotests/232
+++ b/tests/qemu-iotests/232
@@ -92,6 +92,11 @@ echo
 
 chmod a-w $TEST_IMG
 
+if [ -w $TEST_IMG ]
+then
+_notrun "Cannot run this test as root user"
+fi
+
 run_qemu_info_block -drive 
driver=file,file="$TEST_IMG",if=none,read-only=on,auto-read-only=off
 run_qemu_info_block -drive 
driver=file,file="$TEST_IMG",if=none,read-only=on,auto-read-only=on
 run_qemu_info_block -drive driver=file,file="$TEST_IMG",if=none,read-only=on
-- 
2.7.4




Re: [Qemu-devel] [RFC for 3.1? or 4 v3 1/1] qemu-iotests: Don't run the test when user is root

2018-11-30 Thread Farhan Ali




On 11/30/2018 03:52 PM, Eric Blake wrote:

On 11/30/18 2:37 PM, Farhan Ali wrote:

Test 232 creates image files with read-only permission and
expects an error message when trying to access the image
files with read-only and auto-read-only turned off.

Don't run as root user, since root can open files with read/write
access for read-only files.

Signed-off-by: Farhan Ali 
---
  tests/qemu-iotests/232 | 5 +
  1 file changed, 5 insertions(+)


Did you mean to keep RFC in the subject line?  Will patchew even spot it 
without PATCH in the subject line?


Reviewed-by: Eric Blake 

Thanks for reviewing the patch. I am not sure if patchew will pick it 
up; I have no problem in spinning another patch but I will wait till 
Monday to see if anyone else has any feedback.


Thanks
Farhan




[Qemu-devel] [RFC for 3.1? or 4 v3 1/1] qemu-iotests: Don't run the test when user is root

2018-11-30 Thread Farhan Ali
Test 232 creates image files with read-only permission and
expects an error message when trying to access the image
files with read-only and auto-read-only turned off.

Don't run as root user, since root can open files with read/write
access for read-only files.

Signed-off-by: Farhan Ali 
---
 tests/qemu-iotests/232 | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tests/qemu-iotests/232 b/tests/qemu-iotests/232
index 0708b8b..1d34242 100755
--- a/tests/qemu-iotests/232
+++ b/tests/qemu-iotests/232
@@ -92,6 +92,11 @@ echo
 
 chmod a-w $TEST_IMG
 
+if [ -w $TEST_IMG ]
+then
+_notrun "Cannot run this test as root user"
+fi
+
 run_qemu_info_block -drive 
driver=file,file="$TEST_IMG",if=none,read-only=on,auto-read-only=off
 run_qemu_info_block -drive 
driver=file,file="$TEST_IMG",if=none,read-only=on,auto-read-only=on
 run_qemu_info_block -drive driver=file,file="$TEST_IMG",if=none,read-only=on
-- 
2.7.4




Re: [Qemu-devel] [RFC for 3.1? or 4 v2 1/1] qemu-iotests: Don't run the test when user is root

2018-11-30 Thread Farhan Ali




On 11/30/2018 12:50 PM, Eric Blake wrote:
Adding qemu-devel - all patches should go there, especially if you want 
to get Peter's attention that this might be a 3.1 candidate if we have 
other reasons to spin -rc4.


On 11/30/18 10:04 AM, Farhan Ali wrote:

Test 232 creates image files with read-only permission and
expects an error message when trying to access the image
files with read-only and auto-read-only turned off.

Don't run as root user, since root can open files with read/write
access for read-only files.

Signed-off-by: Farhan Ali 
---
  tests/qemu-iotests/232 | 8 
  1 file changed, 8 insertions(+)

diff --git a/tests/qemu-iotests/232 b/tests/qemu-iotests/232
index 0708b8b..05d5f2f 100755
--- a/tests/qemu-iotests/232
+++ b/tests/qemu-iotests/232
@@ -41,6 +41,14 @@ _supported_fmt generic
  _supported_proto file
  _supported_os Linux
+tmp='file'
+touch $tmp
+chmod a-w $tmp
+if [ -w $tmp ]
+then
+    _notrun "Cannot run this test as root user"
+fi
+


I know you just copied from my suggestion, but now looking at it, this 
leaves 'tmp' around in the directory for both success and skip. Better 
might be to just check whether $TEST_IMG is writable, immediately after 
the existing 'chmod a-w $TEST_IMG' line (Hmm - that line is already 
broken for not quoting "$TEST_IMG" in case it contains whitespace).


Ah yes, I forgot about the fact the 'file' will linger around. I will 
spin up a v3.




I don't see this being a reason for -rc4 on its own (most people don't 
run iotests as root); and the fact that we're still working on the final 
contents of what the patch should contain, as well as the fact that the 
patch doesn't affect the main binaries, means that if it were up to me, 
I'd defer it to 4.0. Kevin may have a different opinion, though, since 
it is his test, and new to 3.1.







Re: [Qemu-devel] qemu-iotests 232 fails when running the test as root user

2018-11-29 Thread Farhan Ali




On 11/29/2018 04:07 PM, Eric Blake wrote:

On 11/29/18 3:03 PM, Farhan Ali wrote:

Hi,

I am seeing a failure of the qemu-iotest number 232 when running the 
test as a root user. Is this the expected behavior?


Here is the output of the failure:



-QEMU_PROG: -drive 
driver=file,file=TEST_DIR/t.IMGFMT,if=none,read-only=off,auto-read-only=off: 
Could not open 'TEST_DIR/t.IMGFMT': Permission denied

-NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only)
-NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only)
-
-QEMU_PROG: -drive 
driver=file,file=TEST_DIR/t.IMGFMT,if=none,auto-read-only=off: Could 
not open 'TEST_DIR/t.IMGFMT': Permission denied

-NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only)
-NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only)
+NODE_NAME: TEST_DIR/t.IMGFMT (file)


Not entirely unexpected (since root can read what are otherwise 
read-only files), but obviously unexpected by the test. The test should 
probably refuse to run as root.




Yes, I believe it would be better if there was a check to prevent 
running this test as root.


Thanks
Farhan




[Qemu-devel] qemu-iotests 232 fails when running the test as root user

2018-11-29 Thread Farhan Ali

Hi,

I am seeing a failure of the qemu-iotest number 232 when running the 
test as a root user. Is this the expected behavior?


Here is the output of the failure:

sudo ./check -qcow2 232

QEMU  -- 
"/home/alifm/kvmdev/qemu/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64" 
-nodefaults -machine accel=qtest
QEMU_IMG  -- 
"/home/alifm/kvmdev/qemu/tests/qemu-iotests/../../qemu-img"
QEMU_IO   -- 
"/home/alifm/kvmdev/qemu/tests/qemu-iotests/../../qemu-io"  --cache 
writeback -f qcow2
QEMU_NBD  -- 
"/home/alifm/kvmdev/qemu/tests/qemu-iotests/../../qemu-nbd"

IMGFMT-- qcow2 (compat=1.1)
IMGPROTO  -- file
PLATFORM  -- Linux/x86_64 alifm-ThinkPad-T470p 4.10.0-40-generic
TEST_DIR  -- /home/alifm/kvmdev/qemu/tests/qemu-iotests/scratch
SOCKET_SCM_HELPER -- 
/home/alifm/kvmdev/qemu/tests/qemu-iotests/socket_scm_helper


232 1s ... - output mismatch (see 232.out.bad)
--- /home/alifm/kvmdev/qemu/tests/qemu-iotests/232.out	2018-11-26 
11:45:58.368023999 -0500
+++ /home/alifm/kvmdev/qemu/tests/qemu-iotests/232.out.bad	2018-11-29 
15:58:41.768561694 -0500

@@ -21,13 +21,13 @@
 NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only)
 NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only)

-QEMU_PROG: -drive 
driver=file,file=TEST_DIR/t.IMGFMT,if=none,read-only=off,auto-read-only=off: 
Could not open 'TEST_DIR/t.IMGFMT': Permission denied

-NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only)
-NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only)
-
-QEMU_PROG: -drive 
driver=file,file=TEST_DIR/t.IMGFMT,if=none,auto-read-only=off: Could not 
open 'TEST_DIR/t.IMGFMT': Permission denied

-NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only)
-NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only)
+NODE_NAME: TEST_DIR/t.IMGFMT (file)
+NODE_NAME: TEST_DIR/t.IMGFMT (file)
+NODE_NAME: TEST_DIR/t.IMGFMT (file)
+
+NODE_NAME: TEST_DIR/t.IMGFMT (file)
+NODE_NAME: TEST_DIR/t.IMGFMT (file)
+NODE_NAME: TEST_DIR/t.IMGFMT (file)

 === -blockdev with read-write image: read-only/auto-read-only 
combinations ===


@@ -49,11 +49,11 @@
 node0: TEST_DIR/t.IMGFMT (file, read-only)
 node0: TEST_DIR/t.IMGFMT (file, read-only)

-QEMU_PROG: -blockdev 
driver=file,filename=TEST_DIR/t.IMGFMT,node-name=node0,read-only=off,auto-read-only=off: 
Could not open 'TEST_DIR/t.IMGFMT': Permission denied

-node0: TEST_DIR/t.IMGFMT (file, read-only)
-QEMU_PROG: -blockdev 
driver=file,filename=TEST_DIR/t.IMGFMT,node-name=node0,read-only=off: 
Could not open 'TEST_DIR/t.IMGFMT': Permission denied

+node0: TEST_DIR/t.IMGFMT (file)
+node0: TEST_DIR/t.IMGFMT (file)
+node0: TEST_DIR/t.IMGFMT (file)

-QEMU_PROG: -blockdev 
driver=file,filename=TEST_DIR/t.IMGFMT,node-name=node0,auto-read-only=off: 
Could not open 'TEST_DIR/t.IMGFMT': Permission denied

-node0: TEST_DIR/t.IMGFMT (file, read-only)
-QEMU_PROG: -blockdev 
driver=file,filename=TEST_DIR/t.IMGFMT,node-name=node0: Could not open 
'TEST_DIR/t.IMGFMT': Permission denied

+node0: TEST_DIR/t.IMGFMT (file)
+node0: TEST_DIR/t.IMGFMT (file)
+node0: TEST_DIR/t.IMGFMT (file)
 *** done
Failures: 232
Failed 1 of 1 tests


Thank you
Farhan




Re: [Qemu-devel] [PATCH 3/3] vfio-ccw: add handling for asnyc channel instructions

2018-11-28 Thread Farhan Ali




On 11/28/2018 10:35 AM, Cornelia Huck wrote:

On Wed, 28 Nov 2018 10:00:59 -0500
Farhan Ali  wrote:


On 11/28/2018 09:52 AM, Cornelia Huck wrote:

On Wed, 28 Nov 2018 09:31:51 -0500
Farhan Ali  wrote:
   

On 11/28/2018 04:02 AM, Cornelia Huck wrote:

On Tue, 27 Nov 2018 14:09:49 -0500
Farhan Ali  wrote:
  

On 11/22/2018 11:54 AM, Cornelia Huck wrote:

Add a region to the vfio-ccw device that can be used to submit
asynchronous I/O instructions. ssch continues to be handled by the
existing I/O region; the new region handles hsch and csch.

Interrupt status continues to be reported through the same channels
as for ssch.

Signed-off-by: Cornelia Huck 
---
 drivers/s390/cio/Makefile   |   3 +-
 drivers/s390/cio/vfio_ccw_async.c   |  88 
 drivers/s390/cio/vfio_ccw_drv.c |  48 ++---
 drivers/s390/cio/vfio_ccw_fsm.c | 158 +++-
 drivers/s390/cio/vfio_ccw_ops.c |  13 ++-
 drivers/s390/cio/vfio_ccw_private.h |   6 ++
 include/uapi/linux/vfio.h   |   4 +
 include/uapi/linux/vfio_ccw.h   |  12 +++
 8 files changed, 313 insertions(+), 19 deletions(-)
 create mode 100644 drivers/s390/cio/vfio_ccw_async.c
 
  

@@ -76,7 +79,8 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
private = container_of(work, struct vfio_ccw_private, io_work);
irb = >irb;
 
-	if (scsw_is_solicited(>scsw)) {

+   if (scsw_is_solicited(>scsw) &&
+   (scsw_fctl(>scsw) & SCSW_FCTL_START_FUNC)) {
cp_update_scsw(>cp, >scsw);
cp_free(>cp);
}


I am a little confused about this. Why do we need to update the scsw.cpa
if we have the start function function control bit set? Is it an
optimization?


No, it's not an optimization. This is the work function that is
scheduled if we get an interrupt for the device. Previously, we only
got an interrupt if either the device presented us an unsolicited
status or if we got an interrupt as a response to the channel program
we submitted. Now, we can get an interrupt for halt/clear subchannel as
well, and in that case, we don't necessarily have a cp.

[Thinking some more about it, we need to verify if the start function
actually remains set if we try to terminate a running channel program
with halt/clear. A clear might scrub too much. If that's the case, we
also need to free the cp if the start function is not set.]

  


According to PoPs (Chapter 16: I/O interruptions, under function control):

The start-function indication is also cleared at
the subchannel during the execution of CLEAR SUB-
CHANNEL.

So maybe we do need to free the cp.


Hm... so we need to make sure that cp_update_scsw() and cp_free() only
do something when there's actually a valid cp around and call them
unconditionally.


Yes, I agree.


Maybe add a ->valid flag to struct channel_program?


We could do that. So we would set the flag once we have copied the
channel program to kernel memory? since that's when we should care about
freeing it.


I hacked up the following (still untested):

 From e771c8dc5abbfbd19688b452096bab9d032e0df5 Mon Sep 17 00:00:00 2001
From: Cornelia Huck 
Date: Wed, 28 Nov 2018 16:30:51 +0100
Subject: [PATCH] vfio-ccw: make it safe to access channel programs

When we get a solicited interrupt, the start function may have
been cleared by a csch, but we still have a channel program
structure allocated. Make it safe to call the cp accessors in
any case, so we can call them unconditionally.

Signed-off-by: Cornelia Huck 
---
  drivers/s390/cio/vfio_ccw_cp.c  | 9 -
  drivers/s390/cio/vfio_ccw_cp.h  | 2 ++
  drivers/s390/cio/vfio_ccw_drv.c | 3 +--
  3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 70a006ba4d05..35f87514276b 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -335,6 +335,7 @@ static void cp_unpin_free(struct channel_program *cp)
struct ccwchain *chain, *temp;
int i;
  
+	cp->initialized = false;

list_for_each_entry_safe(chain, temp, >ccwchain_list, next) {
for (i = 0; i < chain->ch_len; i++) {
pfn_array_table_unpin_free(chain->ch_pat + i,
@@ -701,6 +702,8 @@ int cp_init(struct channel_program *cp, struct device 
*mdev, union orb *orb)
 */
cp->orb.cmd.c64 = 1;
  
+	cp->initialized = true;

+
return ret;
  }
  
@@ -715,7 +718,8 @@ int cp_init(struct channel_program *cp, struct device *mdev, union orb *orb)

   */
  void cp_free(struct channel_program *cp)
  {
-   cp_unpin_free(cp);
+   if (cp->initialized)
+   cp_unpin_free(cp);
  }
  
  /**

@@ -831,6 +835,9 @@ void cp_update_scsw(struct channel_program *cp, union scsw 
*scsw)
u32 cpa = scsw->cmd.cpa;
u32 ccw_hea

Re: [Qemu-devel] [PATCH 3/3] vfio-ccw: add handling for asnyc channel instructions

2018-11-28 Thread Farhan Ali




On 11/28/2018 09:52 AM, Cornelia Huck wrote:

On Wed, 28 Nov 2018 09:31:51 -0500
Farhan Ali  wrote:


On 11/28/2018 04:02 AM, Cornelia Huck wrote:

On Tue, 27 Nov 2018 14:09:49 -0500
Farhan Ali  wrote:
   

On 11/22/2018 11:54 AM, Cornelia Huck wrote:

Add a region to the vfio-ccw device that can be used to submit
asynchronous I/O instructions. ssch continues to be handled by the
existing I/O region; the new region handles hsch and csch.

Interrupt status continues to be reported through the same channels
as for ssch.

Signed-off-by: Cornelia Huck 
---
drivers/s390/cio/Makefile   |   3 +-
drivers/s390/cio/vfio_ccw_async.c   |  88 
drivers/s390/cio/vfio_ccw_drv.c |  48 ++---
drivers/s390/cio/vfio_ccw_fsm.c | 158 +++-
drivers/s390/cio/vfio_ccw_ops.c |  13 ++-
drivers/s390/cio/vfio_ccw_private.h |   6 ++
include/uapi/linux/vfio.h   |   4 +
include/uapi/linux/vfio_ccw.h   |  12 +++
8 files changed, 313 insertions(+), 19 deletions(-)
create mode 100644 drivers/s390/cio/vfio_ccw_async.c
  
   

@@ -76,7 +79,8 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
private = container_of(work, struct vfio_ccw_private, io_work);
irb = >irb;

-	if (scsw_is_solicited(>scsw)) {

+   if (scsw_is_solicited(>scsw) &&
+   (scsw_fctl(>scsw) & SCSW_FCTL_START_FUNC)) {
cp_update_scsw(>cp, >scsw);
cp_free(>cp);
}


I am a little confused about this. Why do we need to update the scsw.cpa
if we have the start function function control bit set? Is it an
optimization?


No, it's not an optimization. This is the work function that is
scheduled if we get an interrupt for the device. Previously, we only
got an interrupt if either the device presented us an unsolicited
status or if we got an interrupt as a response to the channel program
we submitted. Now, we can get an interrupt for halt/clear subchannel as
well, and in that case, we don't necessarily have a cp.

[Thinking some more about it, we need to verify if the start function
actually remains set if we try to terminate a running channel program
with halt/clear. A clear might scrub too much. If that's the case, we
also need to free the cp if the start function is not set.]

   


According to PoPs (Chapter 16: I/O interruptions, under function control):

The start-function indication is also cleared at
the subchannel during the execution of CLEAR SUB-
CHANNEL.

So maybe we do need to free the cp.


Hm... so we need to make sure that cp_update_scsw() and cp_free() only
do something when there's actually a valid cp around and call them
unconditionally. 


Yes, I agree.

Maybe add a ->valid flag to struct channel_program?

We could do that. So we would set the flag once we have copied the 
channel program to kernel memory? since that's when we should care about 
freeing it.










Re: [Qemu-devel] [PATCH 3/3] vfio-ccw: add handling for asnyc channel instructions

2018-11-28 Thread Farhan Ali




On 11/28/2018 04:02 AM, Cornelia Huck wrote:

On Tue, 27 Nov 2018 14:09:49 -0500
Farhan Ali  wrote:


On 11/22/2018 11:54 AM, Cornelia Huck wrote:

Add a region to the vfio-ccw device that can be used to submit
asynchronous I/O instructions. ssch continues to be handled by the
existing I/O region; the new region handles hsch and csch.

Interrupt status continues to be reported through the same channels
as for ssch.

Signed-off-by: Cornelia Huck 
---
   drivers/s390/cio/Makefile   |   3 +-
   drivers/s390/cio/vfio_ccw_async.c   |  88 
   drivers/s390/cio/vfio_ccw_drv.c |  48 ++---
   drivers/s390/cio/vfio_ccw_fsm.c | 158 +++-
   drivers/s390/cio/vfio_ccw_ops.c |  13 ++-
   drivers/s390/cio/vfio_ccw_private.h |   6 ++
   include/uapi/linux/vfio.h   |   4 +
   include/uapi/linux/vfio_ccw.h   |  12 +++
   8 files changed, 313 insertions(+), 19 deletions(-)
   create mode 100644 drivers/s390/cio/vfio_ccw_async.c




@@ -76,7 +79,8 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
private = container_of(work, struct vfio_ccw_private, io_work);
irb = >irb;
   
-	if (scsw_is_solicited(>scsw)) {

+   if (scsw_is_solicited(>scsw) &&
+   (scsw_fctl(>scsw) & SCSW_FCTL_START_FUNC)) {
cp_update_scsw(>cp, >scsw);
cp_free(>cp);
}


I am a little confused about this. Why do we need to update the scsw.cpa
if we have the start function function control bit set? Is it an
optimization?


No, it's not an optimization. This is the work function that is
scheduled if we get an interrupt for the device. Previously, we only
got an interrupt if either the device presented us an unsolicited
status or if we got an interrupt as a response to the channel program
we submitted. Now, we can get an interrupt for halt/clear subchannel as
well, and in that case, we don't necessarily have a cp.

[Thinking some more about it, we need to verify if the start function
actually remains set if we try to terminate a running channel program
with halt/clear. A clear might scrub too much. If that's the case, we
also need to free the cp if the start function is not set.]




According to PoPs (Chapter 16: I/O interruptions, under function control):

The start-function indication is also cleared at
the subchannel during the execution of CLEAR SUB-
CHANNEL.

So maybe we do need to free the cp.




Re: [Qemu-devel] [PATCH 3/3] vfio-ccw: add handling for asnyc channel instructions

2018-11-27 Thread Farhan Ali




On 11/22/2018 11:54 AM, Cornelia Huck wrote:

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 565669f95534..c01472ec77ea 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -304,6 +304,7 @@ struct vfio_region_info_cap_type {
  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG(2)
  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG (3)
  
+


Whitespace error?


  #define VFIO_REGION_TYPE_GFX(1)
  #define VFIO_REGION_SUBTYPE_GFX_EDID(1)
  
@@ -354,6 +355,9 @@ struct vfio_region_gfx_edid {

  #define VFIO_DEVICE_GFX_LINK_STATE_DOWN  2
  };
  
+/* ccw sub-types */

+#define VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD  (1)
+





Re: [Qemu-devel] [PATCH 3/3] vfio-ccw: add handling for asnyc channel instructions

2018-11-27 Thread Farhan Ali




On 11/22/2018 11:54 AM, Cornelia Huck wrote:

Add a region to the vfio-ccw device that can be used to submit
asynchronous I/O instructions. ssch continues to be handled by the
existing I/O region; the new region handles hsch and csch.

Interrupt status continues to be reported through the same channels
as for ssch.

Signed-off-by: Cornelia Huck 
---
  drivers/s390/cio/Makefile   |   3 +-
  drivers/s390/cio/vfio_ccw_async.c   |  88 
  drivers/s390/cio/vfio_ccw_drv.c |  48 ++---
  drivers/s390/cio/vfio_ccw_fsm.c | 158 +++-
  drivers/s390/cio/vfio_ccw_ops.c |  13 ++-
  drivers/s390/cio/vfio_ccw_private.h |   6 ++
  include/uapi/linux/vfio.h   |   4 +
  include/uapi/linux/vfio_ccw.h   |  12 +++
  8 files changed, 313 insertions(+), 19 deletions(-)
  create mode 100644 drivers/s390/cio/vfio_ccw_async.c

diff --git a/drivers/s390/cio/Makefile b/drivers/s390/cio/Makefile
index f230516abb96..f6a8db04177c 100644
--- a/drivers/s390/cio/Makefile
+++ b/drivers/s390/cio/Makefile
@@ -20,5 +20,6 @@ obj-$(CONFIG_CCWGROUP) += ccwgroup.o
  qdio-objs := qdio_main.o qdio_thinint.o qdio_debug.o qdio_setup.o
  obj-$(CONFIG_QDIO) += qdio.o
  
-vfio_ccw-objs += vfio_ccw_drv.o vfio_ccw_cp.o vfio_ccw_ops.o vfio_ccw_fsm.o

+vfio_ccw-objs += vfio_ccw_drv.o vfio_ccw_cp.o vfio_ccw_ops.o vfio_ccw_fsm.o \
+   vfio_ccw_async.o
  obj-$(CONFIG_VFIO_CCW) += vfio_ccw.o
diff --git a/drivers/s390/cio/vfio_ccw_async.c 
b/drivers/s390/cio/vfio_ccw_async.c
new file mode 100644
index ..8c7f51d17d70
--- /dev/null
+++ b/drivers/s390/cio/vfio_ccw_async.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Async I/O region for vfio_ccw
+ *
+ * Copyright Red Hat, Inc. 2018
+ *
+ * Author(s): Cornelia Huck 
+ */
+
+#include 
+#include 
+
+#include "vfio_ccw_private.h"
+
+static size_t vfio_ccw_async_region_read(struct vfio_ccw_private *private,
+char __user *buf, size_t count,
+loff_t *ppos)
+{
+   unsigned int i = VFIO_CCW_OFFSET_TO_INDEX(*ppos) - VFIO_CCW_NUM_REGIONS;
+   loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK;
+   struct ccw_cmd_region *region;
+
+   if (pos + count > sizeof(*region))
+   return -EINVAL;
+
+   region = private->region[i].data;
+   if (copy_to_user(buf, (void *)region + pos, count))
+   return -EFAULT;
+
+   return count;
+
+}
+
+static size_t vfio_ccw_async_region_write(struct vfio_ccw_private *private,
+ const char __user *buf, size_t count,
+ loff_t *ppos)
+{
+   unsigned int i = VFIO_CCW_OFFSET_TO_INDEX(*ppos) - VFIO_CCW_NUM_REGIONS;
+   loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK;
+   struct ccw_cmd_region *region;
+
+   if (pos + count > sizeof(*region))
+   return -EINVAL;
+
+   if (private->state == VFIO_CCW_STATE_NOT_OPER ||
+   private->state == VFIO_CCW_STATE_STANDBY)
+   return -EACCES;
+
+   region = private->region[i].data;
+   if (copy_from_user((void *)region + pos, buf, count))
+   return -EFAULT;
+
+   switch (region->command) {
+   case VFIO_CCW_ASYNC_CMD_HSCH:
+   vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_HALT_REQ);
+   break;
+   case VFIO_CCW_ASYNC_CMD_CSCH:
+   vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_CLEAR_REQ);
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   return region->ret_code ? region->ret_code : count;
+}
+
+static void vfio_ccw_async_region_release(struct vfio_ccw_private *private,
+ struct vfio_ccw_region *region)
+{
+
+}
+
+const struct vfio_ccw_regops vfio_ccw_async_region_ops = {
+   .read = vfio_ccw_async_region_read,
+   .write = vfio_ccw_async_region_write,
+   .release = vfio_ccw_async_region_release,
+};
+
+int vfio_ccw_register_async_dev_regions(struct vfio_ccw_private *private)
+{
+   return vfio_ccw_register_dev_region(private,
+   VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD,
+   _ccw_async_region_ops,
+   sizeof(struct ccw_cmd_region),
+   VFIO_REGION_INFO_FLAG_READ |
+   VFIO_REGION_INFO_FLAG_WRITE,
+   private->cmd_region);
+}
diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index a10cec0e86eb..890c588a3a61 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -3,9 +3,11 @@
   * VFIO based Physical Subchannel device driver
   *
   * Copyright IBM Corp. 2017
+ * Copyright Red Hat, Inc. 2018
   *
   * Author(s): Dong Jia Shi 
   *Xiao Feng Ren 
+ *

Re: [Qemu-devel] [PATCH 1/3] vfio-ccw: add capabilities chain

2018-11-27 Thread Farhan Ali




On 11/22/2018 11:54 AM, Cornelia Huck wrote:

diff --git a/drivers/s390/cio/vfio_ccw_private.h 
b/drivers/s390/cio/vfio_ccw_private.h
index 078e46f9623d..a6f9f84526e2 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -3,9 +3,11 @@
   * Private stuff for vfio_ccw driver
   *
   * Copyright IBM Corp. 2017
+ * Copyright Red Hat, Inc. 2018
   *
   * Author(s): Dong Jia Shi
   *Xiao Feng Ren
+ *Cornelia Huck
   */
  
  #ifndef_VFIO_CCW_PRIVATE_H_

@@ -19,6 +21,38 @@
  #include "css.h"
  #include "vfio_ccw_cp.h"
  
+#define VFIO_CCW_OFFSET_SHIFT   40

+#define VFIO_CCW_OFFSET_TO_INDEX(off)  (off >> VFIO_CCW_OFFSET_SHIFT)
+#define VFIO_CCW_INDEX_TO_OFFSET(index)((u64)(index) << 
VFIO_CCW_OFFSET_SHIFT)
+#define VFIO_CCW_OFFSET_MASK   (((u64)(1) << VFIO_CCW_OFFSET_SHIFT) - 1)
+


Why is the offset shift 40? I know vfio-pci is also using the same 
offset shift, but I am curious about the reasoning behind why we are 
using this? :)





Re: [Qemu-devel] [PATCH 0/3] vfio-ccw: support hsch/csch (kernel part)

2018-11-26 Thread Farhan Ali




On 11/22/2018 11:54 AM, Cornelia Huck wrote:

[This is the Linux kernel part, git tree is available at
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw.git 
vfio-ccw-caps

The companion QEMU patches are available at
https://github.com/cohuck/qemu vfio-ccw-caps]

Currently, vfio-ccw only relays START SUBCHANNEL requests to the real
device. This tends to work well for the most common 'good path' scenarios;
however, as we emulate {HALT,CLEAR} SUBCHANNEL in QEMU, things like
clearing pending requests at the device is currently not supported.
This may be a problem for e.g. error recovery.

This patch series introduces capabilities (similar to what vfio-pci uses)
and exposes a new async region for handling hsch/csch.

Very lightly tested (I can interact with a dasd as before; I have not
found a reliable way to trigger hsch/csch in the Linux dasd guest driver.)



I was able to trigger the guest DASD driver to issue a csch instruction, 
and from my brief testing it seems to be working just like it would on 
the LPAR. (I basically tested with 2 threads trying to issue DASD device 
reserve and release ioctls, on the same DASD device, in a busy loop).


I am going to spend some time doing a deeper review.

Thanks
Farhan



Cornelia Huck (3):
   vfio-ccw: add capabilities chain
   s390/cio: export hsch to modules
   vfio-ccw: add handling for asnyc channel instructions

  drivers/s390/cio/Makefile   |   3 +-
  drivers/s390/cio/ioasm.c|   1 +
  drivers/s390/cio/vfio_ccw_async.c   |  88 +
  drivers/s390/cio/vfio_ccw_drv.c |  48 +--
  drivers/s390/cio/vfio_ccw_fsm.c | 158 +-
  drivers/s390/cio/vfio_ccw_ops.c | 195 
  drivers/s390/cio/vfio_ccw_private.h |  44 +++
  include/uapi/linux/vfio.h   |   5 +
  include/uapi/linux/vfio_ccw.h   |  12 ++
  9 files changed, 509 insertions(+), 45 deletions(-)
  create mode 100644 drivers/s390/cio/vfio_ccw_async.c






Re: [Qemu-devel] [PATCH 1/4] MAINTAINERS: s390: more maintainers for vfio-ccw

2018-10-29 Thread Farhan Ali




On 10/29/2018 11:42 AM, Christian Borntraeger wrote:

Eric and Farhan will help with maintaining vfio-ccw.

Signed-off-by: Christian Borntraeger 
---
  MAINTAINERS | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d794bd7..10045b6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1204,6 +1204,8 @@ F: include/hw/vfio/
  
  vfio-ccw

  M: Cornelia Huck 
+M: Eric Farman 
+M: Farhan Ali 
  S: Supported
  F: hw/vfio/ccw.c
  F: hw/s390x/s390-ccw.c



Acked-by: Farhan Ali 




Re: [Qemu-devel] [RFC v1 1/1] qemu-iotests: Fix output for testcase 082

2018-10-16 Thread Farhan Ali




On 10/15/2018 11:45 AM, Max Reitz wrote:

082 is failing for me on master (046936ed), and this fixes it.

I'm wondering if/why other people are not running into this, as 082 runs
with qcow2 and is included on `make check-block`.

+ Max,

FIY, I ran into this while reviewing/testing your Python 3 series.

Tested-by: Cleber Rosa

Thanks for CC-ing me, Cleber.

Farhan, I disagree with this fix, as I wrote here:

http://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg01488.html

I don't like the new output, so I think what we should do is change the
code that produces it.  Marc-André seemed to agree, so I just need to
write a patch (which I didn't get to so far).

Max



Hi Max,

That is fine with me. The reason I sent this patch out as an RFC was 
because I wasn't sure about the fix. So for the time being is it okay 
for the test to fail? And if you do submit a patch, would you kindly cc 
me as well.


Thank you
Farhan




[Qemu-devel] [RFC v1 1/1] qemu-iotests: Fix output for testcase 082

2018-10-08 Thread Farhan Ali
Commit 9cbef9d68ee: qemu-option: improve qemu_opts_print_help() output,
changed some of the output for qemu-img tool but did not
update the corresponding 082 test case file. This broke
qemu-iotests.

Signed-off-by: Farhan Ali 
---
 tests/qemu-iotests/082.out | 956 ++---
 1 file changed, 478 insertions(+), 478 deletions(-)

diff --git a/tests/qemu-iotests/082.out b/tests/qemu-iotests/082.out
index 19e9fb1..2672349 100644
--- a/tests/qemu-iotests/082.out
+++ b/tests/qemu-iotests/082.out
@@ -44,171 +44,171 @@ cluster_size: 8192
 
 Testing: create -f qcow2 -o help TEST_DIR/t.qcow2 128M
 Supported options:
-size Virtual disk size
-compat   Compatibility level (0.10 or 1.1)
-backing_file File name of a base image
-backing_fmt  Image format of the base image
-encryption   Encrypt the image with format 'aes'. (Deprecated in favor of 
encrypt.format=aes)
-encrypt.format   Encrypt the image, format choices: 'aes', 'luks'
-encrypt.key-secret ID of secret providing qcow AES key or LUKS passphrase
-encrypt.cipher-alg Name of encryption cipher algorithm
-encrypt.cipher-mode Name of encryption cipher mode
-encrypt.ivgen-alg Name of IV generator algorithm
-encrypt.ivgen-hash-alg Name of IV generator hash algorithm
-encrypt.hash-alg Name of encryption hash algorithm
-encrypt.iter-time Time to spend in PBKDF in milliseconds
-cluster_size qcow2 cluster size
-preallocationPreallocation mode (allowed values: off, metadata, falloc, 
full)
-lazy_refcounts   Postpone refcount updates
-refcount_bitsWidth of a reference count entry in bits
-nocowTurn off copy-on-write (valid only on btrfs)
+backing_file=str - File name of a base image
+backing_fmt=str - Image format of the base image
+cluster_size=size - qcow2 cluster size
+compat=str - Compatibility level (0.10 or 1.1)
+encrypt.cipher-alg=str - Name of encryption cipher algorithm
+encrypt.cipher-mode=str - Name of encryption cipher mode
+encrypt.format=str - Encrypt the image, format choices: 'aes', 'luks'
+encrypt.hash-alg=str - Name of encryption hash algorithm
+encrypt.iter-time=num - Time to spend in PBKDF in milliseconds
+encrypt.ivgen-alg=str - Name of IV generator algorithm
+encrypt.ivgen-hash-alg=str - Name of IV generator hash algorithm
+encrypt.key-secret=str - ID of secret providing qcow AES key or LUKS passphrase
+encryption=bool (on/off) - Encrypt the image with format 'aes'. (Deprecated in 
favor of encrypt.format=aes)
+lazy_refcounts=bool (on/off) - Postpone refcount updates
+nocow=bool (on/off) - Turn off copy-on-write (valid only on btrfs)
+preallocation=str - Preallocation mode (allowed values: off, metadata, falloc, 
full)
+refcount_bits=num - Width of a reference count entry in bits
+size=size - Virtual disk size
 
 Testing: create -f qcow2 -o ? TEST_DIR/t.qcow2 128M
 Supported options:
-size Virtual disk size
-compat   Compatibility level (0.10 or 1.1)
-backing_file File name of a base image
-backing_fmt  Image format of the base image
-encryption   Encrypt the image with format 'aes'. (Deprecated in favor of 
encrypt.format=aes)
-encrypt.format   Encrypt the image, format choices: 'aes', 'luks'
-encrypt.key-secret ID of secret providing qcow AES key or LUKS passphrase
-encrypt.cipher-alg Name of encryption cipher algorithm
-encrypt.cipher-mode Name of encryption cipher mode
-encrypt.ivgen-alg Name of IV generator algorithm
-encrypt.ivgen-hash-alg Name of IV generator hash algorithm
-encrypt.hash-alg Name of encryption hash algorithm
-encrypt.iter-time Time to spend in PBKDF in milliseconds
-cluster_size qcow2 cluster size
-preallocationPreallocation mode (allowed values: off, metadata, falloc, 
full)
-lazy_refcounts   Postpone refcount updates
-refcount_bitsWidth of a reference count entry in bits
-nocowTurn off copy-on-write (valid only on btrfs)
+backing_file=str - File name of a base image
+backing_fmt=str - Image format of the base image
+cluster_size=size - qcow2 cluster size
+compat=str - Compatibility level (0.10 or 1.1)
+encrypt.cipher-alg=str - Name of encryption cipher algorithm
+encrypt.cipher-mode=str - Name of encryption cipher mode
+encrypt.format=str - Encrypt the image, format choices: 'aes', 'luks'
+encrypt.hash-alg=str - Name of encryption hash algorithm
+encrypt.iter-time=num - Time to spend in PBKDF in milliseconds
+encrypt.ivgen-alg=str - Name of IV generator algorithm
+encrypt.ivgen-hash-alg=str - Name of IV generator hash algorithm
+encrypt.key-secret=str - ID of secret providing qcow AES key or LUKS passphrase
+encryption=bool (on/off) - Encrypt the image with format 'aes'. (Deprecated in 
favor of encrypt.format=aes)
+lazy_refcounts=bool (on/off) - Postpone refcount updates
+nocow=bool (on/off) - Turn off copy-on-write (valid only on btrfs)
+preallocation=str - Preallocation mode (allowed values: off, metadata, falloc, 
full)
+refcount_bits=num - Width of a reference count entry

Re: [Qemu-devel] [Qemu-block] [PATCH] block/file-posix: add bdrv_attach_aio_context callback for host dev and cdrom

2018-07-27 Thread Farhan Ali




On 07/27/2018 09:26 AM, Stefan Hajnoczi wrote:

On Mon, Jul 23, 2018 at 12:42:02PM -0400, Farhan Ali wrote:



On 07/23/2018 12:30 PM, Stefan Hajnoczi wrote:

On Fri, Jul 20, 2018 at 03:11:14PM -0400, Farhan Ali wrote:

I am seeing another issue pop up, in a different test. Even though it's a
different assertion, it might be related based on the call trace.


Which test case?


This test case involved one guest with 2 disks, with an iothread for each
disk. The guest was running a memory workload.


Please post a link to the test case.

Stefan


Hi Stefan,

Thanks for your response. The test case was run in our internal 
infrastructure, so unfortunately I cannot post any link to the test case.


I have been unable to reproduce this exact issue. On the other hand the 
same test case is throwing another assertion error:


qemu-kvm: /builddir/build/BUILD/qemu-2.12.91/exec.c:3695: 
address_space_unmap: Assertion `mr != NULL' failed.


Again this to me is very strange error. I have been able to reproduce 
this assertion couple of times, though I don't hit it on every run of 
the test case.



The qemu command line for the test case:

/usr/bin/qemu-kvm -name guest=sles,debug-threads=on -S -object 
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-6-sles/master-key.aes 
-machine s390-ccw-virtio-3.0,accel=kvm,usb=off,dump-guest-core=off -m 
4096 -realtime mlock=off -smp 8,sockets=8,cores=1,threads=1 -object 
iothread,id=iothread1 -object iothread,id=iothread2 -uuid 
094a20ff-e881-44db-a772-fb4029cf8f09 -display none -no-user-config 
-nodefaults -chardev socket,id=charmonitor,fd=28,server,nowait -mon 
chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown 
-boot strict=on -drive 
file=/dev/mapper/360050763998b08839800333a,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native 
-device 
virtio-blk-ccw,iothread=iothread1,scsi=off,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on 
-drive 
file=/dev/mapper/360050763998b088398001e25,format=raw,if=none,id=drive-virtio-disk1,cache=none,aio=native 
-device 
virtio-blk-ccw,iothread=iothread2,scsi=off,devno=fe.0.0002,drive=drive-virtio-disk1,id=virtio-disk1,write-cache=on 
-netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=31 -device 
virtio-net-ccw,netdev=hostnet0,id=net0,mac=02:f5:5d:7d:7d:ef,devno=fe.0. 
-chardev pty,id=charconsole0 -device 
sclpconsole,chardev=charconsole0,id=console0 -device 
virtio-balloon-ccw,id=balloon0,devno=fe.3.ffba -sandbox 
on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny 
-msg timestamp=on



The backtrace of the thread which trips the assertion:

#0  0x03ffb553e274 in raise () from /lib64/libc.so.6
#1  0x03ffb55239a8 in abort () from /lib64/libc.so.6
#2  0x03ffb55362ce in __assert_fail_base () from /lib64/libc.so.6
#3  0x03ffb553634c in __assert_fail () from /lib64/libc.so.6
#4  0x02aa39b487e6 in address_space_unmap (as=, 
buffer=, len=, is_write=, 
access_len=)
at 
/usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/exec.c:3695
#5  0x02aa39bfb550 in dma_memory_unmap (access_len=0, 
dir=DMA_DIRECTION_FROM_DEVICE, len=, buffer=out>, as=0x2aa3a085b80 )
at 
/usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/include/sysemu/dma.h:146
#6  virtqueue_unmap_sg (elem=elem@entry=0x3ff9c01c9d0, len=len@entry=1, 
vq=) at 
/usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/hw/virtio/virtio.c:401
#7  0x02aa39bfc840 in virtqueue_fill (vq=vq@entry=0x3ffb6c9e010, 
elem=0x3ff9c01c9d0, len=, idx=idx@entry=0) at 
/usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/hw/virtio/virtio.c:476
#8  0x02aa39bfcb80 in virtqueue_push (vq=0x3ffb6c9e010, 
elem=elem@entry=0x3ff9c01c9d0, len=) at 
/usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/hw/virtio/virtio.c:522
#9  0x02aa39bd78c2 in virtio_blk_req_complete (req=0x3ff9c01c9d0, 
status=) at 
/usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/hw/block/virtio-blk.c:55
#10 0x02aa39bd8540 in virtio_blk_rw_complete (opaque=out>, ret=) at 
/usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/hw/block/virtio-blk.c:121
#11 0x02aa39d9a15e in blk_aio_complete (acb=0x3ff9c04f670) at 
/usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/block/block-backend.c:1336
#12 0x02aa39e48e98 in coroutine_trampoline (i0=, 
i1=) at 
/usr/src/debug/qemu-2.12.91-20180726.0.c6cf862329.fc28.s390x/util/coroutine-ucontext.c:116

#13 0x03ffb5553b7a in __makecontext_ret () from /lib64/libc.so.6


I don't know if these issues are related to the same underlying problem.
Any help is really appreciated.

Thanks
Farhan




Re: [Qemu-devel] [PATCH] block/file-posix: add bdrv_attach_aio_context callback for host dev and cdrom

2018-07-23 Thread Farhan Ali




On 07/23/2018 12:30 PM, Stefan Hajnoczi wrote:

On Fri, Jul 20, 2018 at 03:11:14PM -0400, Farhan Ali wrote:

I am seeing another issue pop up, in a different test. Even though it's a
different assertion, it might be related based on the call trace.


Which test case?


This test case involved one guest with 2 disks, with an iothread for 
each disk. The guest was running a memory workload.





Stack trace of thread 276199:
#0  0x03ff8473e274 raise (libc.so.6)
#1  0x03ff847239a8 abort (libc.so.6)
#2  0x03ff847362ce __assert_fail_base (libc.so.6)
#3  0x03ff8473634c __assert_fail (libc.so.6)
#4  0x02aa30aba0c4 iov_memset (qemu-system-s390x)
#5  0x02aa30aba9a6 qemu_iovec_memset (qemu-system-s390x)
#6  0x02aa30a23e88 qemu_laio_process_completion (qemu-system-s390x)


What are the values of laiocb->qiov->size and laiocb->ret?


The laiocb->qiov->size was 4096 and laiocb->ret was 8192




#7  0x02aa30a23f68 qemu_laio_process_completions (qemu-system-s390x)
#8  0x02aa30a2418e qemu_laio_process_completions_and_submit
(qemu-system-s390x)
#9  0x02aa30a24220 qemu_laio_poll_cb (qemu-system-s390x)
#10 0x02aa30ab22c4 run_poll_handlers_once (qemu-system-s390x)
#11 0x02aa30ab2e78 aio_poll (qemu-system-s390x)
#12 0x02aa30a29f4e bdrv_do_drained_begin (qemu-system-s390x)
#13 0x02aa30a2a276 bdrv_drain (qemu-system-s390x)
#14 0x02aa309d45aa bdrv_set_aio_context (qemu-system-s390x)
#15 0x02aa3085acfe virtio_blk_data_plane_stop (qemu-system-s390x)
#16 0x02aa3096994c virtio_bus_stop_ioeventfd.part.1 (qemu-system-s390x)
#17 0x02aa3087d1d6 virtio_vmstate_change (qemu-system-s390x)
#18 0x02aa308e8a12 vm_state_notify (qemu-system-s390x)
#19 0x02aa3080ed54 do_vm_stop (qemu-system-s390x)
#20 0x02aa307bea04 main (qemu-system-s390x)
#21 0x03ff84723dd2 __libc_start_main (libc.so.6)
#22 0x02aa307c0414 _start (qemu-system-s390x)


The failing assertion is:

qemu-kvm: util/iov.c:78: iov_memset: Assertion `offset == 0' failed.


I wonder if the offset is beyond the end of the iovecs.

Thanks,
Stefan



Thanks
Farhan




Re: [Qemu-devel] [PATCH] block/file-posix: add bdrv_attach_aio_context callback for host dev and cdrom

2018-07-23 Thread Farhan Ali




On 07/20/2018 03:11 PM, Farhan Ali wrote:
I am seeing another issue pop up, in a different test. Even though it's 
a different assertion, it might be related based on the call trace.


Stack trace of thread 276199:
#0  0x03ff8473e274 raise (libc.so.6)
#1  0x03ff847239a8 abort (libc.so.6)
#2  0x03ff847362ce __assert_fail_base (libc.so.6)
#3  0x03ff8473634c __assert_fail (libc.so.6)
#4  0x02aa30aba0c4 iov_memset (qemu-system-s390x)
#5  0x02aa30aba9a6 qemu_iovec_memset (qemu-system-s390x)
#6  0x02aa30a23e88 qemu_laio_process_completion (qemu-system-s390x)
#7  0x02aa30a23f68 qemu_laio_process_completions (qemu-system-s390x)
#8  0x02aa30a2418e qemu_laio_process_completions_and_submit 
(qemu-system-s390x)

#9  0x02aa30a24220 qemu_laio_poll_cb (qemu-system-s390x)
#10 0x02aa30ab22c4 run_poll_handlers_once (qemu-system-s390x)
#11 0x02aa30ab2e78 aio_poll (qemu-system-s390x)
#12 0x02aa30a29f4e bdrv_do_drained_begin (qemu-system-s390x)
#13 0x02aa30a2a276 bdrv_drain (qemu-system-s390x)
#14 0x02aa309d45aa bdrv_set_aio_context (qemu-system-s390x)
#15 0x02aa3085acfe virtio_blk_data_plane_stop (qemu-system-s390x)
#16 0x02aa3096994c virtio_bus_stop_ioeventfd.part.1 (qemu-system-s390x)
#17 0x02aa3087d1d6 virtio_vmstate_change (qemu-system-s390x)
#18 0x02aa308e8a12 vm_state_notify (qemu-system-s390x)
#19 0x02aa3080ed54 do_vm_stop (qemu-system-s390x)
#20 0x02aa307bea04 main (qemu-system-s390x)
#21 0x03ff84723dd2 __libc_start_main (libc.so.6)
#22 0x02aa307c0414 _start (qemu-system-s390x)


The failing assertion is:

qemu-kvm: util/iov.c:78: iov_memset: Assertion `offset == 0' failed.



Just to give some context, this a guest with 2 disks with each assigned 
an iothread. The guest was running a memory intensive workload.


From the coredump of the qemu process, I see there were 2 threads that 
were trying to call aio_poll with the same AioContext on the same 
BlockDeviceDriver


Thread 1:

#0  0x03ff8473e274 in raise () from /lib64/libc.so.6
#1  0x03ff847239a8 in abort () from /lib64/libc.so.6
#2  0x03ff847362ce in __assert_fail_base () from /lib64/libc.so.6
#3  0x03ff8473634c in __assert_fail () from /lib64/libc.so.6
#4  0x02aa30aba0c4 in iov_memset (iov=, 
iov_cnt=, offset=, fillc=, 
bytes=18446744073709547520) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/util/iov.c:78
#5  0x02aa30aba9a6 in qemu_iovec_memset (qiov=, 
offset=offset@entry=8192, fillc=fillc@entry=0, 
bytes=18446744073709547520) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/util/iov.c:410
#6  0x02aa30a23e88 in qemu_laio_process_completion 
(laiocb=0x3fe36a6a3f0) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/linux-aio.c:88
#7  0x02aa30a23f68 in qemu_laio_process_completions 
(s=s@entry=0x3fe60001910) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/linux-aio.c:222
#8  0x02aa30a2418e in qemu_laio_process_completions_and_submit 
(s=0x3fe60001910) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/linux-aio.c:237
#9  0x02aa30a24220 in qemu_laio_poll_cb (opaque=) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/linux-aio.c:272
#10 0x02aa30ab22c4 in run_poll_handlers_once 
(ctx=ctx@entry=0x2aa4f35df50) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/util/aio-posix.c:494
#11 0x02aa30ab2e78 in try_poll_mode (blocking=, 
ctx=) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/util/aio-posix.c:573
>  #12 aio_poll (ctx=0x2aa4f35df50, blocking=blocking@entry=false) 
at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/util/aio-posix.c:602
#13 0x02aa30a29f4e in bdrv_drain_poll_top_level 
(ignore_parent=, recursive=, bs=out>) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/io.c:390
#14 bdrv_do_drained_begin (bs=0x2aa4f392510, recursive=, 
parent=0x0, ignore_bds_parents=, poll=) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/io.c:390
#15 0x02aa30a2a276 in bdrv_drained_begin (bs=0x2aa4f392510) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/io.c:396
#16 bdrv_drain (bs=bs@entry=0x2aa4f392510) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/io.c:478
#17 0x02aa309d45aa in bdrv_set_aio_context (bs=0x2aa4f392510, 
new_context=0x2aa4f3594f0) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block.c:4954
#18 0x02aa30a1c228 in blk_set_aio_context 
(blk=blk@entry=0x2aa4f38ed90, new_context=) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/block/block-backend.c:1894
#19 0x02aa3085acfe in virtio_blk_data_plane_stop (vdev=out>) at 
/usr/src/debug/qemu-2.12.91-20180720.0.677af45304.fc28.s390x/hw/block/dataplane/virtio-blk.c:285
#20 0x02aa3096994c in virtio_bus_stop_ioeventfd (bus=0x2aa4f4f61f0) 
at 
/usr/src/debug/qem

Re: [Qemu-devel] [PATCH] block/file-posix: add bdrv_attach_aio_context callback for host dev and cdrom

2018-07-20 Thread Farhan Ali




On 07/20/2018 03:32 PM, Nishanth Aravamudan wrote:

On 20.07.2018 [15:11:14 -0400], Farhan Ali wrote:

I am seeing another issue pop up, in a different test. Even though it's a
different assertion, it might be related based on the call trace.


Just to be clear, this does not happen if you revert the original patch
(i.e., the one you bisected to before)?

I'm digging into the code now.

-Nish


I had not seen this issue before. I just ran my regression tests with 
your fix and saw the failure in one of my test. The patch in itself 
fixes the original issue I reported. I am going to try and debug some more.


Thanks
Farhan




Re: [Qemu-devel] [PATCH] block/file-posix: add bdrv_attach_aio_context callback for host dev and cdrom

2018-07-20 Thread Farhan Ali
I am seeing another issue pop up, in a different test. Even though it's 
a different assertion, it might be related based on the call trace.


Stack trace of thread 276199:
#0  0x03ff8473e274 raise (libc.so.6)
#1  0x03ff847239a8 abort (libc.so.6)
#2  0x03ff847362ce __assert_fail_base (libc.so.6)
#3  0x03ff8473634c __assert_fail (libc.so.6)
#4  0x02aa30aba0c4 iov_memset (qemu-system-s390x)
#5  0x02aa30aba9a6 qemu_iovec_memset (qemu-system-s390x)
#6  0x02aa30a23e88 qemu_laio_process_completion (qemu-system-s390x)
#7  0x02aa30a23f68 qemu_laio_process_completions (qemu-system-s390x)
#8  0x02aa30a2418e qemu_laio_process_completions_and_submit 
(qemu-system-s390x)

#9  0x02aa30a24220 qemu_laio_poll_cb (qemu-system-s390x)
#10 0x02aa30ab22c4 run_poll_handlers_once (qemu-system-s390x)
#11 0x02aa30ab2e78 aio_poll (qemu-system-s390x)
#12 0x02aa30a29f4e bdrv_do_drained_begin (qemu-system-s390x)
#13 0x02aa30a2a276 bdrv_drain (qemu-system-s390x)
#14 0x02aa309d45aa bdrv_set_aio_context (qemu-system-s390x)
#15 0x02aa3085acfe virtio_blk_data_plane_stop (qemu-system-s390x)
#16 0x02aa3096994c virtio_bus_stop_ioeventfd.part.1 (qemu-system-s390x)
#17 0x02aa3087d1d6 virtio_vmstate_change (qemu-system-s390x)
#18 0x02aa308e8a12 vm_state_notify (qemu-system-s390x)
#19 0x02aa3080ed54 do_vm_stop (qemu-system-s390x)
#20 0x02aa307bea04 main (qemu-system-s390x)
#21 0x03ff84723dd2 __libc_start_main (libc.so.6)
#22 0x02aa307c0414 _start (qemu-system-s390x)


The failing assertion is:

qemu-kvm: util/iov.c:78: iov_memset: Assertion `offset == 0' failed.

On 07/18/2018 05:12 PM, Nishanth Aravamudan wrote:

In ed6e2161 ("linux-aio: properly bubble up errors from initialzation"),
I only added a bdrv_attach_aio_context callback for the bdrv_file
driver. There are several other drivers that use the shared
aio_plug callback, though, and they will trip the assertion added to
aio_get_linux_aio because they did not call aio_setup_linux_aio first.
Add the appropriate callback definition to the affected driver
definitions.

Fixes: ed6e2161 ("linux-aio: properly bubble up errors from initialization")
Reported-by: Farhan Ali 
Signed-off-by: Nishanth Aravamudan 
Cc: Eric Blake 
Cc: Kevin Wolf 
Cc: John Snow 
Cc: Max Reitz 
Cc: Stefan Hajnoczi 
Cc: Fam Zheng 
Cc: Paolo Bonzini 
Cc: qemu-bl...@nongnu.org
Cc: qemu-devel@nongnu.org
---
  block/file-posix.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index 60af4b3d51..ad299beb38 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3158,6 +3158,7 @@ static BlockDriver bdrv_host_device = {
  .bdrv_refresh_limits = raw_refresh_limits,
  .bdrv_io_plug = raw_aio_plug,
  .bdrv_io_unplug = raw_aio_unplug,
+.bdrv_attach_aio_context = raw_aio_attach_aio_context,
  
  .bdrv_co_truncate   = raw_co_truncate,

  .bdrv_getlength   = raw_getlength,
@@ -3280,6 +3281,7 @@ static BlockDriver bdrv_host_cdrom = {
  .bdrv_refresh_limits = raw_refresh_limits,
  .bdrv_io_plug = raw_aio_plug,
  .bdrv_io_unplug = raw_aio_unplug,
+.bdrv_attach_aio_context = raw_aio_attach_aio_context,
  
  .bdrv_co_truncate= raw_co_truncate,

  .bdrv_getlength  = raw_getlength,
@@ -3410,6 +3412,7 @@ static BlockDriver bdrv_host_cdrom = {
  .bdrv_refresh_limits = raw_refresh_limits,
  .bdrv_io_plug = raw_aio_plug,
  .bdrv_io_unplug = raw_aio_unplug,
+.bdrv_attach_aio_context = raw_aio_attach_aio_context,
  
  .bdrv_co_truncate= raw_co_truncate,

  .bdrv_getlength  = raw_getlength,






Re: [Qemu-devel] [BUG?] aio_get_linux_aio: Assertion `ctx->linux_aio' failed

2018-07-18 Thread Farhan Ali




On 07/18/2018 09:42 AM, Farhan Ali wrote:



On 07/17/2018 04:52 PM, Nishanth Aravamudan wrote:

iiuc, this possibly implies AIO was not actually used previously on this
guest (it might have silently been falling back to threaded IO?). I
don't have access to s390x, but would it be possible to run qemu under
gdb and see if aio_setup_linux_aio is being called at all (I think it
might not be, but I'm not sure why), and if so, if it's for the context
in question?

If it's not being called first, could you see what callpath is calling
aio_get_linux_aio when this assertion trips?

Thanks!
-Nish



Hi Nishant,

 From the coredump of the guest this is the call trace that calls 
aio_get_linux_aio:



Stack trace of thread 145158:
#0  0x03ff94dbe274 raise (libc.so.6)
#1  0x03ff94da39a8 abort (libc.so.6)
#2  0x03ff94db62ce __assert_fail_base (libc.so.6)
#3  0x03ff94db634c __assert_fail (libc.so.6)
#4  0x02aa20db067a aio_get_linux_aio (qemu-system-s390x)
#5  0x02aa20d229a8 raw_aio_plug (qemu-system-s390x)
#6  0x02aa20d309ee bdrv_io_plug (qemu-system-s390x)
#7  0x02aa20b5a8ea virtio_blk_handle_vq (qemu-system-s390x)
#8  0x02aa20db2f6e aio_dispatch_handlers (qemu-system-s390x)
#9  0x02aa20db3c34 aio_poll (qemu-system-s390x)
#10 0x02aa20be32a2 iothread_run (qemu-system-s390x)
#11 0x03ff94f879a8 start_thread (libpthread.so.0)
#12 0x03ff94e797ee thread_start (libc.so.6)


Thanks for taking a look and responding.

Thanks
Farhan





Trying to debug a little further, the block device in this case is a 
"host device". And looking at your commit carefully you use the 
bdrv_attach_aio_context callback to setup a Linux AioContext.


For some reason the "host device" struct (BlockDriver bdrv_host_device 
in block/file-posix.c) does not have a bdrv_attach_aio_context defined.
So a simple change of adding the callback to the struct solves the issue 
and the guest starts fine.



diff --git a/block/file-posix.c b/block/file-posix.c
index 28824aa..b8d59fb 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3135,6 +3135,7 @@ static BlockDriver bdrv_host_device = {
 .bdrv_refresh_limits = raw_refresh_limits,
 .bdrv_io_plug = raw_aio_plug,
 .bdrv_io_unplug = raw_aio_unplug,
+.bdrv_attach_aio_context = raw_aio_attach_aio_context,

 .bdrv_co_truncate   = raw_co_truncate,
 .bdrv_getlength= raw_getlength,



I am not too familiar with block device code in QEMU, so not sure if 
this is the right fix or if there are some underlying problems.


Thanks
Farhan




Re: [Qemu-devel] [BUG?] aio_get_linux_aio: Assertion `ctx->linux_aio' failed

2018-07-18 Thread Farhan Ali




On 07/17/2018 04:52 PM, Nishanth Aravamudan wrote:

iiuc, this possibly implies AIO was not actually used previously on this
guest (it might have silently been falling back to threaded IO?). I
don't have access to s390x, but would it be possible to run qemu under
gdb and see if aio_setup_linux_aio is being called at all (I think it
might not be, but I'm not sure why), and if so, if it's for the context
in question?

If it's not being called first, could you see what callpath is calling
aio_get_linux_aio when this assertion trips?

Thanks!
-Nish



Hi Nishant,

From the coredump of the guest this is the call trace that calls 
aio_get_linux_aio:



Stack trace of thread 145158:
#0  0x03ff94dbe274 raise (libc.so.6)
#1  0x03ff94da39a8 abort (libc.so.6)
#2  0x03ff94db62ce __assert_fail_base (libc.so.6)
#3  0x03ff94db634c __assert_fail (libc.so.6)
#4  0x02aa20db067a aio_get_linux_aio (qemu-system-s390x)
#5  0x02aa20d229a8 raw_aio_plug (qemu-system-s390x)
#6  0x02aa20d309ee bdrv_io_plug (qemu-system-s390x)
#7  0x02aa20b5a8ea virtio_blk_handle_vq (qemu-system-s390x)
#8  0x02aa20db2f6e aio_dispatch_handlers (qemu-system-s390x)
#9  0x02aa20db3c34 aio_poll (qemu-system-s390x)
#10 0x02aa20be32a2 iothread_run (qemu-system-s390x)
#11 0x03ff94f879a8 start_thread (libpthread.so.0)
#12 0x03ff94e797ee thread_start (libc.so.6)


Thanks for taking a look and responding.

Thanks
Farhan




[Qemu-devel] [BUG?] aio_get_linux_aio: Assertion `ctx->linux_aio' failed

2018-07-17 Thread Farhan Ali

Hi,

I am seeing some strange QEMU assertion failures for qemu on s390x,
which prevents a guest from starting.

Git bisecting points to the following commit as the source of the error.

commit ed6e2161715c527330f936d44af4c547f25f687e
Author: Nishanth Aravamudan 
Date:   Fri Jun 22 12:37:00 2018 -0700

linux-aio: properly bubble up errors from initialization

laio_init() can fail for a couple of reasons, which will lead to a NULL
pointer dereference in laio_attach_aio_context().

To solve this, add a aio_setup_linux_aio() function which is called
early in raw_open_common. If this fails, propagate the error up. The
signature of aio_get_linux_aio() was not modified, because it seems
preferable to return the actual errno from the possible failing
initialization calls.

Additionally, when the AioContext changes, we need to associate a
LinuxAioState with the new AioContext. Use the bdrv_attach_aio_context
callback and call the new aio_setup_linux_aio(), which will allocate a
new AioContext if needed, and return errors on failures. If it 
fails for

any reason, fallback to threaded AIO with an error message, as the
device is already in-use by the guest.

Add an assert that aio_get_linux_aio() cannot return NULL.

Signed-off-by: Nishanth Aravamudan 
Message-id: 20180622193700.6523-1-naravamu...@digitalocean.com
Signed-off-by: Stefan Hajnoczi 


Not sure what is causing this assertion to fail. Here is the qemu 
command line of the guest, from qemu log, which throws this error:



LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin 
QEMU_AUDIO_DRV=none /usr/local/bin/qemu-system-s390x -name 
guest=rt_vm1,debug-threads=on -S -object 
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-21-rt_vm1/master-key.aes 
-machine s390-ccw-virtio-2.12,accel=kvm,usb=off,dump-guest-core=off -m 
1024 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -object 
iothread,id=iothread1 -uuid 0cde16cd-091d-41bd-9ac2-5243df5c9a0d 
-display none -no-user-config -nodefaults -chardev 
socket,id=charmonitor,fd=28,server,nowait -mon 
chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown 
-boot strict=on -drive 
file=/dev/mapper/360050763998b088398002a31,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native 
-device 
virtio-blk-ccw,iothread=iothread1,scsi=off,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on 
-netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=31 -device 
virtio-net-ccw,netdev=hostnet0,id=net0,mac=02:3a:c8:67:95:84,devno=fe.0. 
-netdev tap,fd=32,id=hostnet1,vhost=on,vhostfd=33 -device 
virtio-net-ccw,netdev=hostnet1,id=net1,mac=52:54:00:2a:e5:08,devno=fe.0.0002 
-chardev pty,id=charconsole0 -device 
sclpconsole,chardev=charconsole0,id=console0 -device 
virtio-balloon-ccw,id=balloon0,devno=fe.3.ffba -sandbox 
on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny 
-msg timestamp=on




2018-07-17 15:48:42.252+: Domain id=21 is tainted: high-privileges
2018-07-17T15:48:42.279380Z qemu-system-s390x: -chardev 
pty,id=charconsole0: char device redirected to /dev/pts/3 (label 
charconsole0)
qemu-system-s390x: util/async.c:339: aio_get_linux_aio: Assertion 
`ctx->linux_aio' failed.

2018-07-17 15:48:43.309+: shutting down, reason=failed


Any help debugging this would be greatly appreciated.

Thank you
Farhan




Re: [Qemu-devel] [PATCH v3 0/2] Detect & register virtio-crypto algos only if it can be supported by backend

2018-06-27 Thread Farhan Ali



Polite ping :)

Would like to know how to merge these patches in the mainline kernel.

Thanks
Farhan

On 06/19/2018 11:41 AM, Farhan Ali wrote:

Hi,

Currently the Linux virtio-crypto driver registers the crypto
algorithm without verifying if the backend actually supports the
algorithm.

This kernel patch series adds support for registering algorithm
with Linux crypto layer, only if the algorithm is supported by
the backend device. This also makes the driver more compliant with
the virtio-crypto spec [1].

I would appreciate any feedback or comments on this.

Thank you
Farhan

Reference
-
[1] Virtio crypto spec proposal 
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg00816.html

ChangeLog
-
v2 -> v3
 - Add Christian Borntrager's acks for the patches.

v1 -> v2
- Modify comment as suggested by Arei (patch 1)
- Modify error message as suggested by Arei (patch 2)

Farhan Ali (2):
   crypto/virtio-crypto: Read crypto services and algorithm masks
   crypto/virtio-crypto: Register an algo only if it's supported

  drivers/crypto/virtio/virtio_crypto_algs.c   | 112 ++-
  drivers/crypto/virtio/virtio_crypto_common.h |  25 +-
  drivers/crypto/virtio/virtio_crypto_core.c   |  29 +++
  drivers/crypto/virtio/virtio_crypto_mgr.c|  81 +--
  4 files changed, 202 insertions(+), 45 deletions(-)






[Qemu-devel] [PATCH v3 2/2] crypto/virtio-crypto: Register an algo only if it's supported

2018-06-19 Thread Farhan Ali
From: Farhan Ali 

Register a crypto algo with the Linux crypto layer only if
the algorithm is supported by the backend virtio-crypto
device.

Also route crypto requests to a virtio-crypto
device, only if it can support the requested service and
algorithm.

Signed-off-by: Farhan Ali 
Acked-by: Gonglei 
Acked-by: Christian Borntraeger 
---
 drivers/crypto/virtio/virtio_crypto_algs.c   | 112 ++-
 drivers/crypto/virtio/virtio_crypto_common.h |  11 ++-
 drivers/crypto/virtio/virtio_crypto_mgr.c|  81 +--
 3 files changed, 159 insertions(+), 45 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c 
b/drivers/crypto/virtio/virtio_crypto_algs.c
index af6a908..7a104f6 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -49,12 +49,18 @@ struct virtio_crypto_sym_request {
bool encrypt;
 };
 
+struct virtio_crypto_algo {
+   uint32_t algonum;
+   uint32_t service;
+   unsigned int active_devs;
+   struct crypto_alg algo;
+};
+
 /*
  * The algs_lock protects the below global virtio_crypto_active_devs
  * and crypto algorithms registion.
  */
 static DEFINE_MUTEX(algs_lock);
-static unsigned int virtio_crypto_active_devs;
 static void virtio_crypto_ablkcipher_finalize_req(
struct virtio_crypto_sym_request *vc_sym_req,
struct ablkcipher_request *req,
@@ -312,15 +318,21 @@ static int virtio_crypto_ablkcipher_setkey(struct 
crypto_ablkcipher *tfm,
 unsigned int keylen)
 {
struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(tfm);
+   uint32_t alg;
int ret;
 
+   ret = virtio_crypto_alg_validate_key(keylen, );
+   if (ret)
+   return ret;
+
if (!ctx->vcrypto) {
/* New key */
int node = virtio_crypto_get_current_node();
struct virtio_crypto *vcrypto =
- virtcrypto_get_dev_node(node);
+ virtcrypto_get_dev_node(node,
+ VIRTIO_CRYPTO_SERVICE_CIPHER, alg);
if (!vcrypto) {
-   pr_err("virtio_crypto: Could not find a virtio device 
in the system\n");
+   pr_err("virtio_crypto: Could not find a virtio device 
in the system or unsupported algo\n");
return -ENODEV;
}
 
@@ -571,57 +583,85 @@ static void virtio_crypto_ablkcipher_finalize_req(
virtcrypto_clear_request(_sym_req->base);
 }
 
-static struct crypto_alg virtio_crypto_algs[] = { {
-   .cra_name = "cbc(aes)",
-   .cra_driver_name = "virtio_crypto_aes_cbc",
-   .cra_priority = 150,
-   .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
-   .cra_blocksize = AES_BLOCK_SIZE,
-   .cra_ctxsize  = sizeof(struct virtio_crypto_ablkcipher_ctx),
-   .cra_alignmask = 0,
-   .cra_module = THIS_MODULE,
-   .cra_type = _ablkcipher_type,
-   .cra_init = virtio_crypto_ablkcipher_init,
-   .cra_exit = virtio_crypto_ablkcipher_exit,
-   .cra_u = {
-  .ablkcipher = {
-   .setkey = virtio_crypto_ablkcipher_setkey,
-   .decrypt = virtio_crypto_ablkcipher_decrypt,
-   .encrypt = virtio_crypto_ablkcipher_encrypt,
-   .min_keysize = AES_MIN_KEY_SIZE,
-   .max_keysize = AES_MAX_KEY_SIZE,
-   .ivsize = AES_BLOCK_SIZE,
+static struct virtio_crypto_algo virtio_crypto_algs[] = { {
+   .algonum = VIRTIO_CRYPTO_CIPHER_AES_CBC,
+   .service = VIRTIO_CRYPTO_SERVICE_CIPHER,
+   .algo = {
+   .cra_name = "cbc(aes)",
+   .cra_driver_name = "virtio_crypto_aes_cbc",
+   .cra_priority = 150,
+   .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
+   .cra_blocksize = AES_BLOCK_SIZE,
+   .cra_ctxsize  = sizeof(struct virtio_crypto_ablkcipher_ctx),
+   .cra_alignmask = 0,
+   .cra_module = THIS_MODULE,
+   .cra_type = _ablkcipher_type,
+   .cra_init = virtio_crypto_ablkcipher_init,
+   .cra_exit = virtio_crypto_ablkcipher_exit,
+   .cra_u = {
+   .ablkcipher = {
+   .setkey = virtio_crypto_ablkcipher_setkey,
+   .decrypt = virtio_crypto_ablkcipher_decrypt,
+   .encrypt = virtio_crypto_ablkcipher_encrypt,
+   .min_keysize = AES_MIN_KEY_SIZE,
+   .max_keysize = AES_MAX_KEY_SIZE,
+   .ivsize = AES_BLOCK_SIZE,
+   },
},
},
 } };
 
-int virtio_crypto_algs_register(vo

[Qemu-devel] [PATCH v3 0/2] Detect & register virtio-crypto algos only if it can be supported by backend

2018-06-19 Thread Farhan Ali
Hi,

Currently the Linux virtio-crypto driver registers the crypto
algorithm without verifying if the backend actually supports the
algorithm.

This kernel patch series adds support for registering algorithm
with Linux crypto layer, only if the algorithm is supported by
the backend device. This also makes the driver more compliant with
the virtio-crypto spec [1].

I would appreciate any feedback or comments on this.

Thank you
Farhan

Reference
-
[1] Virtio crypto spec proposal 
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg00816.html

ChangeLog
-
v2 -> v3
- Add Christian Borntrager's acks for the patches.

v1 -> v2
   - Modify comment as suggested by Arei (patch 1)
   - Modify error message as suggested by Arei (patch 2)

Farhan Ali (2):
  crypto/virtio-crypto: Read crypto services and algorithm masks
  crypto/virtio-crypto: Register an algo only if it's supported

 drivers/crypto/virtio/virtio_crypto_algs.c   | 112 ++-
 drivers/crypto/virtio/virtio_crypto_common.h |  25 +-
 drivers/crypto/virtio/virtio_crypto_core.c   |  29 +++
 drivers/crypto/virtio/virtio_crypto_mgr.c|  81 +--
 4 files changed, 202 insertions(+), 45 deletions(-)

-- 
2.7.4




[Qemu-devel] [PATCH v3 1/2] crypto/virtio-crypto: Read crypto services and algorithm masks

2018-06-19 Thread Farhan Ali
Read the crypto services and algorithm masks which provides
information about the services and algorithms supported by
virtio-crypto backend.

Signed-off-by: Farhan Ali 
Acked-by: Gonglei 
Acked-by: Christian Borntraeger 
---
 drivers/crypto/virtio/virtio_crypto_common.h | 14 ++
 drivers/crypto/virtio/virtio_crypto_core.c   | 29 
 2 files changed, 43 insertions(+)

diff --git a/drivers/crypto/virtio/virtio_crypto_common.h 
b/drivers/crypto/virtio/virtio_crypto_common.h
index 66501a5..931a3bd 100644
--- a/drivers/crypto/virtio/virtio_crypto_common.h
+++ b/drivers/crypto/virtio/virtio_crypto_common.h
@@ -55,6 +55,20 @@ struct virtio_crypto {
/* Number of queue currently used by the driver */
u32 curr_queue;
 
+   /*
+* Specifies the services mask which the device support,
+* see VIRTIO_CRYPTO_SERVICE_*
+*/
+   u32 crypto_services;
+
+   /* Detailed algorithms mask */
+   u32 cipher_algo_l;
+   u32 cipher_algo_h;
+   u32 hash_algo;
+   u32 mac_algo_l;
+   u32 mac_algo_h;
+   u32 aead_algo;
+
/* Maximum length of cipher key */
u32 max_cipher_key_len;
/* Maximum length of authenticated key */
diff --git a/drivers/crypto/virtio/virtio_crypto_core.c 
b/drivers/crypto/virtio/virtio_crypto_core.c
index 8332698..8f745f2 100644
--- a/drivers/crypto/virtio/virtio_crypto_core.c
+++ b/drivers/crypto/virtio/virtio_crypto_core.c
@@ -303,6 +303,13 @@ static int virtcrypto_probe(struct virtio_device *vdev)
u32 max_data_queues = 0, max_cipher_key_len = 0;
u32 max_auth_key_len = 0;
u64 max_size = 0;
+   u32 cipher_algo_l = 0;
+   u32 cipher_algo_h = 0;
+   u32 hash_algo = 0;
+   u32 mac_algo_l = 0;
+   u32 mac_algo_h = 0;
+   u32 aead_algo = 0;
+   u32 crypto_services = 0;
 
if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
return -ENODEV;
@@ -339,6 +346,20 @@ static int virtcrypto_probe(struct virtio_device *vdev)
max_auth_key_len, _auth_key_len);
virtio_cread(vdev, struct virtio_crypto_config,
max_size, _size);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   crypto_services, _services);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   cipher_algo_l, _algo_l);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   cipher_algo_h, _algo_h);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   hash_algo, _algo);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   mac_algo_l, _algo_l);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   mac_algo_h, _algo_h);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   aead_algo, _algo);
 
/* Add virtio crypto device to global table */
err = virtcrypto_devmgr_add_dev(vcrypto);
@@ -358,6 +379,14 @@ static int virtcrypto_probe(struct virtio_device *vdev)
vcrypto->max_cipher_key_len = max_cipher_key_len;
vcrypto->max_auth_key_len = max_auth_key_len;
vcrypto->max_size = max_size;
+   vcrypto->crypto_services = crypto_services;
+   vcrypto->cipher_algo_l = cipher_algo_l;
+   vcrypto->cipher_algo_h = cipher_algo_h;
+   vcrypto->mac_algo_l = mac_algo_l;
+   vcrypto->mac_algo_h = mac_algo_h;
+   vcrypto->hash_algo = hash_algo;
+   vcrypto->aead_algo = aead_algo;
+
 
dev_info(>dev,
"max_queues: %u, max_cipher_key_len: %u, max_auth_key_len: %u, 
max_size 0x%llx\n",
-- 
2.7.4




Re: [Qemu-devel] [PATCH v2 0/2] Detect & register virtio-crypto algos only if it can be supported by backend

2018-06-19 Thread Farhan Ali




On 06/19/2018 10:32 AM, Herbert Xu wrote:

On Tue, Jun 19, 2018 at 04:08:05PM +0200, Christian Borntraeger wrote:

On 06/19/2018 03:26 PM, Farhan Ali wrote:

A polite ping :)

If no one has any more comments/feedback then I would like to how to merge 
these patches to the mainline kernel tree.


I think it should go via Michael Tsirkin or Jason Wang (Adding Jason).
Also adding Herbert Xu as FYI.


I have not looked into the details of the patch, but I like the idea.
Acked-by: Christian Borntraeger 


Please post crypto patches to linux-crypto.

Thanks,



I have cc'ed linux-crypto folks. Sorry I missed them

Thanks
Farhan




Re: [Qemu-devel] [PATCH v2 0/2] Detect & register virtio-crypto algos only if it can be supported by backend

2018-06-19 Thread Farhan Ali

A polite ping :)

If no one has any more comments/feedback then I would like to how to 
merge these patches to the mainline kernel tree.


Thanks
Farhan

On 06/13/2018 04:38 PM, Farhan Ali wrote:

Hi,

Currently the Linux virtio-crypto driver registers the crypto
algorithm without verifying if the backend actually supports the
algorithm.

This kernel patch series adds support for registering algorithm
with Linux crypto layer, only if the algorithm is supported by
the backend device. This also makes the driver more compliant with
the virtio-crypto spec [1].

I would appreciate any feedback or comments on this.

Thank you
Farhan

Reference
-
[1] Virtio crypto spec proposal 
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg00816.html

ChangeLog
-
v1 -> v2
- Modify comment as suggested by Arei (patch 1)
- Modify error message as suggested by Arei (patch 2)


Farhan Ali (2):
   crypto/virtio-crypto: Read crypto services and algorithm masks
   crypto/virtio-crypto: Register an algo only if it's supported

  drivers/crypto/virtio/virtio_crypto_algs.c   | 112 ++-
  drivers/crypto/virtio/virtio_crypto_common.h |  25 +-
  drivers/crypto/virtio/virtio_crypto_core.c   |  29 +++
  drivers/crypto/virtio/virtio_crypto_mgr.c|  81 +--
  4 files changed, 202 insertions(+), 45 deletions(-)






Re: [Qemu-devel] [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device

2018-06-15 Thread Farhan Ali




On 06/15/2018 09:17 AM, Viktor VM Mihajlovski wrote:

On 14.06.2018 18:12, Farhan Ali wrote:



On 06/14/2018 11:10 AM, Daniel P. Berrangé wrote:

On Thu, Jun 14, 2018 at 10:50:40AM -0400, Farhan Ali wrote:



On 06/14/2018 04:21 AM, Daniel P. Berrangé wrote:

On Wed, Jun 13, 2018 at 07:28:08PM +0200, Halil Pasic wrote:



On 06/13/2018 05:05 PM, Daniel P. Berrangé wrote:

On Wed, Jun 13, 2018 at 11:01:05AM -0400, Farhan Ali wrote:

Hi Daniel

On 06/13/2018 05:37 AM, Daniel P. Berrangé wrote:

On Tue, Jun 12, 2018 at 03:48:34PM -0400, Farhan Ali wrote:

The virtio-crypto driver currently propagates to the guest
all the cipher algorithms that the backend cryptodev can
support. But in certain cases where the guest has more
performant mechanism to handle some algorithms, it would be
useful to propagate only a subset of the algorithms.


I'm not really convinced by this.

The performance of crypto algorithms has many influencing
factors, making it pretty hard to decide which is best
without actively testing specific impls and comparing
them in a manner which matches the application usage
pattern. eg in theory the kernel crypto impl of an alg
is faster than a userspace impl, if the kernel uses
hardware accel and userspace does not. This, however,
ignores the overhead of the kernel/userspace switch.
The real world performance winner, thus depends on the
amount of data being processed in each operation. Some
times userspace can win & sometimes kernel space can
win. This is even more relevant to virtio-crypto as
it has more expensive context switches.


True. But what if the guest can perform some crypto algorithms
without a
incurring a VM exit? For example in s390 we have the cpacf
instructions to
perform crypto and this instruction is implemented for us by our
hardware
virtualization technology. In such a case it would be better not
to use
virtio-crypto's implementation of such a crypto algorithm.

At the same time we would like to take advantage of virtio-crypto's
acceleration capabilities for certain crypto algorithms for which
there is
no hardware assistance.


IIUC, the kernel's crypto layer can support multiple
implementations of
any algorithm. Providers can report a priority against
implementations
which influences which impl is used in practice. So if there's a
native
instruction for a partiuclar algorithm I would expect the impl
registered
for that to be designated higher priority than other impls, so
that it is
used in preference to other impls.



AFAIR the problem here is that in (the guest) kernel the virtio-crypto
driver has to register it's crypto algo implementations with a
priority
(single number), which dictates if it's going to be the preferred
(used)
implementation of the algorithm or not. The virtio-crypto driver
does this
without having information about the (comparative or absolute)
performance
of it's implementation (which depends on the backend among others).
I don't think
any dynamic re-prioritization of the algorithms takes place (e.g.
based on how these
perform in for the given configuration).

I think the strategy of the virtio-crypto is to rather overstate, than
understate the performance of it's implementation. If we were to 'be
conservative' and say, 'hey we don't know nothing about the
performance,
let's make it lowest priority implementation' the implementations
provided
by virtio-crypto would end up being used only if there is no other
implementation. And that does not sound like a good idea either.



This problem you describe, however, is something that applies to *any*
kerenl code that is registering a crypto algo impl for accelerator
hardware. A non-virtualized crypto cards in bare metal likewise cannot
assume that its AES impl is better then the host CPU's  aes-ni
instruction.


So the idea is to give the user the power to effectively not provide
an algorithm via virtio-crypto. That is, if the user observes a
performance
degradation because of virtio-crypto, he can turn off the bad
algorithms
at the device. That way overstatement becomes a much smaller problem.
The user can turn off the bad algorithms for reasons other than
performance
too.

Of course there are other ways to deal with the problem of
virtio-crypto
driver not knowing how good it's implementation of a given algo is. We
could make the in kernel crypto priorities dynamically adjustable
in general
or we could provide the user with means to specify the priorities
(e.g.
as module parameter) with which the virtio-crypto driver registers
each algo.
Both of these would be knobs in the guest. It's hard to tell if
these first
one would be useful in scenarios not involving virtualization. Same
goes
for some kind of dynamic priority management for crypto algorithm
implementations
in the Linux kernel. I assume the people involved with the respective
subsystem do not see the necessity for something like that.


It still feels like this is a problem for the guest OS to solve. If you
put a physical crypto acceler

Re: [Qemu-devel] [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device

2018-06-14 Thread Farhan Ali




On 06/14/2018 11:10 AM, Daniel P. Berrangé wrote:

On Thu, Jun 14, 2018 at 10:50:40AM -0400, Farhan Ali wrote:



On 06/14/2018 04:21 AM, Daniel P. Berrangé wrote:

On Wed, Jun 13, 2018 at 07:28:08PM +0200, Halil Pasic wrote:



On 06/13/2018 05:05 PM, Daniel P. Berrangé wrote:

On Wed, Jun 13, 2018 at 11:01:05AM -0400, Farhan Ali wrote:

Hi Daniel

On 06/13/2018 05:37 AM, Daniel P. Berrangé wrote:

On Tue, Jun 12, 2018 at 03:48:34PM -0400, Farhan Ali wrote:

The virtio-crypto driver currently propagates to the guest
all the cipher algorithms that the backend cryptodev can
support. But in certain cases where the guest has more
performant mechanism to handle some algorithms, it would be
useful to propagate only a subset of the algorithms.


I'm not really convinced by this.

The performance of crypto algorithms has many influencing
factors, making it pretty hard to decide which is best
without actively testing specific impls and comparing
them in a manner which matches the application usage
pattern. eg in theory the kernel crypto impl of an alg
is faster than a userspace impl, if the kernel uses
hardware accel and userspace does not. This, however,
ignores the overhead of the kernel/userspace switch.
The real world performance winner, thus depends on the
amount of data being processed in each operation. Some
times userspace can win & sometimes kernel space can
win. This is even more relevant to virtio-crypto as
it has more expensive context switches.


True. But what if the guest can perform some crypto algorithms without a
incurring a VM exit? For example in s390 we have the cpacf instructions to
perform crypto and this instruction is implemented for us by our hardware
virtualization technology. In such a case it would be better not to use
virtio-crypto's implementation of such a crypto algorithm.

At the same time we would like to take advantage of virtio-crypto's
acceleration capabilities for certain crypto algorithms for which there is
no hardware assistance.


IIUC, the kernel's crypto layer can support multiple implementations of
any algorithm. Providers can report a priority against implementations
which influences which impl is used in practice. So if there's a native
instruction for a partiuclar algorithm I would expect the impl registered
for that to be designated higher priority than other impls, so that it is
used in preference to other impls.



AFAIR the problem here is that in (the guest) kernel the virtio-crypto
driver has to register it's crypto algo implementations with a priority
(single number), which dictates if it's going to be the preferred (used)
implementation of the algorithm or not. The virtio-crypto driver does this
without having information about the (comparative or absolute) performance
of it's implementation (which depends on the backend among others). I don't 
think
any dynamic re-prioritization of the algorithms takes place (e.g. based on how 
these
perform in for the given configuration).

I think the strategy of the virtio-crypto is to rather overstate, than
understate the performance of it's implementation. If we were to 'be
conservative' and say, 'hey we don't know nothing about the performance,
let's make it lowest priority implementation' the implementations provided
by virtio-crypto would end up being used only if there is no other
implementation. And that does not sound like a good idea either.



This problem you describe, however, is something that applies to *any*
kerenl code that is registering a crypto algo impl for accelerator
hardware. A non-virtualized crypto cards in bare metal likewise cannot
assume that its AES impl is better then the host CPU's  aes-ni instruction.


So the idea is to give the user the power to effectively not provide
an algorithm via virtio-crypto. That is, if the user observes a performance
degradation because of virtio-crypto, he can turn off the bad algorithms
at the device. That way overstatement becomes a much smaller problem.
The user can turn off the bad algorithms for reasons other than performance
too.

Of course there are other ways to deal with the problem of virtio-crypto
driver not knowing how good it's implementation of a given algo is. We
could make the in kernel crypto priorities dynamically adjustable in general
or we could provide the user with means to specify the priorities (e.g.
as module parameter) with which the virtio-crypto driver registers each algo.
Both of these would be knobs in the guest. It's hard to tell if these first
one would be useful in scenarios not involving virtualization. Same goes
for some kind of dynamic priority management for crypto algorithm 
implementations
in the Linux kernel. I assume the people involved with the respective
subsystem do not see the necessity for something like that.


It still feels like this is a problem for the guest OS to solve. If you
put a physical crypto accelerator in a bare metal machine, that has the
same problem you describe here, so the kernel su

Re: [Qemu-devel] [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device

2018-06-14 Thread Farhan Ali




On 06/14/2018 04:21 AM, Daniel P. Berrangé wrote:

On Wed, Jun 13, 2018 at 07:28:08PM +0200, Halil Pasic wrote:



On 06/13/2018 05:05 PM, Daniel P. Berrangé wrote:

On Wed, Jun 13, 2018 at 11:01:05AM -0400, Farhan Ali wrote:

Hi Daniel

On 06/13/2018 05:37 AM, Daniel P. Berrangé wrote:

On Tue, Jun 12, 2018 at 03:48:34PM -0400, Farhan Ali wrote:

The virtio-crypto driver currently propagates to the guest
all the cipher algorithms that the backend cryptodev can
support. But in certain cases where the guest has more
performant mechanism to handle some algorithms, it would be
useful to propagate only a subset of the algorithms.


I'm not really convinced by this.

The performance of crypto algorithms has many influencing
factors, making it pretty hard to decide which is best
without actively testing specific impls and comparing
them in a manner which matches the application usage
pattern. eg in theory the kernel crypto impl of an alg
is faster than a userspace impl, if the kernel uses
hardware accel and userspace does not. This, however,
ignores the overhead of the kernel/userspace switch.
The real world performance winner, thus depends on the
amount of data being processed in each operation. Some
times userspace can win & sometimes kernel space can
win. This is even more relevant to virtio-crypto as
it has more expensive context switches.


True. But what if the guest can perform some crypto algorithms without a
incurring a VM exit? For example in s390 we have the cpacf instructions to
perform crypto and this instruction is implemented for us by our hardware
virtualization technology. In such a case it would be better not to use
virtio-crypto's implementation of such a crypto algorithm.

At the same time we would like to take advantage of virtio-crypto's
acceleration capabilities for certain crypto algorithms for which there is
no hardware assistance.


IIUC, the kernel's crypto layer can support multiple implementations of
any algorithm. Providers can report a priority against implementations
which influences which impl is used in practice. So if there's a native
instruction for a partiuclar algorithm I would expect the impl registered
for that to be designated higher priority than other impls, so that it is
used in preference to other impls.



AFAIR the problem here is that in (the guest) kernel the virtio-crypto
driver has to register it's crypto algo implementations with a priority
(single number), which dictates if it's going to be the preferred (used)
implementation of the algorithm or not. The virtio-crypto driver does this
without having information about the (comparative or absolute) performance
of it's implementation (which depends on the backend among others). I don't 
think
any dynamic re-prioritization of the algorithms takes place (e.g. based on how 
these
perform in for the given configuration).

I think the strategy of the virtio-crypto is to rather overstate, than
understate the performance of it's implementation. If we were to 'be
conservative' and say, 'hey we don't know nothing about the performance,
let's make it lowest priority implementation' the implementations provided
by virtio-crypto would end up being used only if there is no other
implementation. And that does not sound like a good idea either.



This problem you describe, however, is something that applies to *any*
kerenl code that is registering a crypto algo impl for accelerator
hardware. A non-virtualized crypto cards in bare metal likewise cannot
assume that its AES impl is better then the host CPU's  aes-ni instruction.


So the idea is to give the user the power to effectively not provide
an algorithm via virtio-crypto. That is, if the user observes a performance
degradation because of virtio-crypto, he can turn off the bad algorithms
at the device. That way overstatement becomes a much smaller problem.
The user can turn off the bad algorithms for reasons other than performance
too.

Of course there are other ways to deal with the problem of virtio-crypto
driver not knowing how good it's implementation of a given algo is. We
could make the in kernel crypto priorities dynamically adjustable in general
or we could provide the user with means to specify the priorities (e.g.
as module parameter) with which the virtio-crypto driver registers each algo.
Both of these would be knobs in the guest. It's hard to tell if these first
one would be useful in scenarios not involving virtualization. Same goes
for some kind of dynamic priority management for crypto algorithm 
implementations
in the Linux kernel. I assume the people involved with the respective
subsystem do not see the necessity for something like that.


It still feels like this is a problem for the guest OS to solve. If you
put a physical crypto accelerator in a bare metal machine, that has the
same problem you describe here, so the kernel surely already needs to find
a viable solution for this problem.



How would the guest OS know which algo is better? As

[Qemu-devel] [PATCH v2 1/2] crypto/virtio-crypto: Read crypto services and algorithm masks

2018-06-13 Thread Farhan Ali
Read the crypto services and algorithm masks which provides
information about the services and algorithms supported by
virtio-crypto backend.

Signed-off-by: Farhan Ali 
Acked-by: Gonglei 
---
 drivers/crypto/virtio/virtio_crypto_common.h | 14 ++
 drivers/crypto/virtio/virtio_crypto_core.c   | 29 
 2 files changed, 43 insertions(+)

diff --git a/drivers/crypto/virtio/virtio_crypto_common.h 
b/drivers/crypto/virtio/virtio_crypto_common.h
index 66501a5..931a3bd 100644
--- a/drivers/crypto/virtio/virtio_crypto_common.h
+++ b/drivers/crypto/virtio/virtio_crypto_common.h
@@ -55,6 +55,20 @@ struct virtio_crypto {
/* Number of queue currently used by the driver */
u32 curr_queue;
 
+   /*
+* Specifies the services mask which the device support,
+* see VIRTIO_CRYPTO_SERVICE_*
+*/
+   u32 crypto_services;
+
+   /* Detailed algorithms mask */
+   u32 cipher_algo_l;
+   u32 cipher_algo_h;
+   u32 hash_algo;
+   u32 mac_algo_l;
+   u32 mac_algo_h;
+   u32 aead_algo;
+
/* Maximum length of cipher key */
u32 max_cipher_key_len;
/* Maximum length of authenticated key */
diff --git a/drivers/crypto/virtio/virtio_crypto_core.c 
b/drivers/crypto/virtio/virtio_crypto_core.c
index 8332698..8f745f2 100644
--- a/drivers/crypto/virtio/virtio_crypto_core.c
+++ b/drivers/crypto/virtio/virtio_crypto_core.c
@@ -303,6 +303,13 @@ static int virtcrypto_probe(struct virtio_device *vdev)
u32 max_data_queues = 0, max_cipher_key_len = 0;
u32 max_auth_key_len = 0;
u64 max_size = 0;
+   u32 cipher_algo_l = 0;
+   u32 cipher_algo_h = 0;
+   u32 hash_algo = 0;
+   u32 mac_algo_l = 0;
+   u32 mac_algo_h = 0;
+   u32 aead_algo = 0;
+   u32 crypto_services = 0;
 
if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
return -ENODEV;
@@ -339,6 +346,20 @@ static int virtcrypto_probe(struct virtio_device *vdev)
max_auth_key_len, _auth_key_len);
virtio_cread(vdev, struct virtio_crypto_config,
max_size, _size);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   crypto_services, _services);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   cipher_algo_l, _algo_l);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   cipher_algo_h, _algo_h);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   hash_algo, _algo);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   mac_algo_l, _algo_l);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   mac_algo_h, _algo_h);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   aead_algo, _algo);
 
/* Add virtio crypto device to global table */
err = virtcrypto_devmgr_add_dev(vcrypto);
@@ -358,6 +379,14 @@ static int virtcrypto_probe(struct virtio_device *vdev)
vcrypto->max_cipher_key_len = max_cipher_key_len;
vcrypto->max_auth_key_len = max_auth_key_len;
vcrypto->max_size = max_size;
+   vcrypto->crypto_services = crypto_services;
+   vcrypto->cipher_algo_l = cipher_algo_l;
+   vcrypto->cipher_algo_h = cipher_algo_h;
+   vcrypto->mac_algo_l = mac_algo_l;
+   vcrypto->mac_algo_h = mac_algo_h;
+   vcrypto->hash_algo = hash_algo;
+   vcrypto->aead_algo = aead_algo;
+
 
dev_info(>dev,
"max_queues: %u, max_cipher_key_len: %u, max_auth_key_len: %u, 
max_size 0x%llx\n",
-- 
2.7.4




[Qemu-devel] [PATCH v2 0/2] Detect & register virtio-crypto algos only if it can be supported by backend

2018-06-13 Thread Farhan Ali
Hi,

Currently the Linux virtio-crypto driver registers the crypto
algorithm without verifying if the backend actually supports the
algorithm.

This kernel patch series adds support for registering algorithm
with Linux crypto layer, only if the algorithm is supported by
the backend device. This also makes the driver more compliant with
the virtio-crypto spec [1].

I would appreciate any feedback or comments on this.

Thank you
Farhan

Reference
-
[1] Virtio crypto spec proposal 
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg00816.html

ChangeLog
-
v1 -> v2
   - Modify comment as suggested by Arei (patch 1)
   - Modify error message as suggested by Arei (patch 2)


Farhan Ali (2):
  crypto/virtio-crypto: Read crypto services and algorithm masks
  crypto/virtio-crypto: Register an algo only if it's supported

 drivers/crypto/virtio/virtio_crypto_algs.c   | 112 ++-
 drivers/crypto/virtio/virtio_crypto_common.h |  25 +-
 drivers/crypto/virtio/virtio_crypto_core.c   |  29 +++
 drivers/crypto/virtio/virtio_crypto_mgr.c|  81 +--
 4 files changed, 202 insertions(+), 45 deletions(-)

-- 
2.7.4




[Qemu-devel] [PATCH v2 2/2] crypto/virtio-crypto: Register an algo only if it's supported

2018-06-13 Thread Farhan Ali
From: Farhan Ali 

Register a crypto algo with the Linux crypto layer only if
the algorithm is supported by the backend virtio-crypto
device.

Also route crypto requests to a virtio-crypto
device, only if it can support the requested service and
algorithm.

Signed-off-by: Farhan Ali 
Acked-by: Gonglei 
---
 drivers/crypto/virtio/virtio_crypto_algs.c   | 112 ++-
 drivers/crypto/virtio/virtio_crypto_common.h |  11 ++-
 drivers/crypto/virtio/virtio_crypto_mgr.c|  81 +--
 3 files changed, 159 insertions(+), 45 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c 
b/drivers/crypto/virtio/virtio_crypto_algs.c
index ba190cf..11db62f 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -49,12 +49,18 @@ struct virtio_crypto_sym_request {
bool encrypt;
 };
 
+struct virtio_crypto_algo {
+   uint32_t algonum;
+   uint32_t service;
+   unsigned int active_devs;
+   struct crypto_alg algo;
+};
+
 /*
  * The algs_lock protects the below global virtio_crypto_active_devs
  * and crypto algorithms registion.
  */
 static DEFINE_MUTEX(algs_lock);
-static unsigned int virtio_crypto_active_devs;
 static void virtio_crypto_ablkcipher_finalize_req(
struct virtio_crypto_sym_request *vc_sym_req,
struct ablkcipher_request *req,
@@ -312,15 +318,21 @@ static int virtio_crypto_ablkcipher_setkey(struct 
crypto_ablkcipher *tfm,
 unsigned int keylen)
 {
struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(tfm);
+   uint32_t alg;
int ret;
 
+   ret = virtio_crypto_alg_validate_key(keylen, );
+   if (ret)
+   return ret;
+
if (!ctx->vcrypto) {
/* New key */
int node = virtio_crypto_get_current_node();
struct virtio_crypto *vcrypto =
- virtcrypto_get_dev_node(node);
+ virtcrypto_get_dev_node(node,
+ VIRTIO_CRYPTO_SERVICE_CIPHER, alg);
if (!vcrypto) {
-   pr_err("virtio_crypto: Could not find a virtio device 
in the system\n");
+   pr_err("virtio_crypto: Could not find a virtio device 
in the system or unsupported algo\n");
return -ENODEV;
}
 
@@ -571,57 +583,85 @@ static void virtio_crypto_ablkcipher_finalize_req(
virtcrypto_clear_request(_sym_req->base);
 }
 
-static struct crypto_alg virtio_crypto_algs[] = { {
-   .cra_name = "cbc(aes)",
-   .cra_driver_name = "virtio_crypto_aes_cbc",
-   .cra_priority = 150,
-   .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
-   .cra_blocksize = AES_BLOCK_SIZE,
-   .cra_ctxsize  = sizeof(struct virtio_crypto_ablkcipher_ctx),
-   .cra_alignmask = 0,
-   .cra_module = THIS_MODULE,
-   .cra_type = _ablkcipher_type,
-   .cra_init = virtio_crypto_ablkcipher_init,
-   .cra_exit = virtio_crypto_ablkcipher_exit,
-   .cra_u = {
-  .ablkcipher = {
-   .setkey = virtio_crypto_ablkcipher_setkey,
-   .decrypt = virtio_crypto_ablkcipher_decrypt,
-   .encrypt = virtio_crypto_ablkcipher_encrypt,
-   .min_keysize = AES_MIN_KEY_SIZE,
-   .max_keysize = AES_MAX_KEY_SIZE,
-   .ivsize = AES_BLOCK_SIZE,
+static struct virtio_crypto_algo virtio_crypto_algs[] = { {
+   .algonum = VIRTIO_CRYPTO_CIPHER_AES_CBC,
+   .service = VIRTIO_CRYPTO_SERVICE_CIPHER,
+   .algo = {
+   .cra_name = "cbc(aes)",
+   .cra_driver_name = "virtio_crypto_aes_cbc",
+   .cra_priority = 150,
+   .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
+   .cra_blocksize = AES_BLOCK_SIZE,
+   .cra_ctxsize  = sizeof(struct virtio_crypto_ablkcipher_ctx),
+   .cra_alignmask = 0,
+   .cra_module = THIS_MODULE,
+   .cra_type = _ablkcipher_type,
+   .cra_init = virtio_crypto_ablkcipher_init,
+   .cra_exit = virtio_crypto_ablkcipher_exit,
+   .cra_u = {
+   .ablkcipher = {
+   .setkey = virtio_crypto_ablkcipher_setkey,
+   .decrypt = virtio_crypto_ablkcipher_decrypt,
+   .encrypt = virtio_crypto_ablkcipher_encrypt,
+   .min_keysize = AES_MIN_KEY_SIZE,
+   .max_keysize = AES_MAX_KEY_SIZE,
+   .ivsize = AES_BLOCK_SIZE,
+   },
},
},
 } };
 
-int virtio_crypto_algs_register(void)
+int virtio_crypto_algs_register(struct

Re: [Qemu-devel] [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device

2018-06-13 Thread Farhan Ali

Hi Arei

On 06/12/2018 08:57 PM, Gonglei (Arei) wrote:



-Original Message-
From: Farhan Ali [mailto:al...@linux.ibm.com]
Sent: Wednesday, June 13, 2018 3:49 AM
To: qemu-devel@nongnu.org
Cc: m...@redhat.com; Gonglei (Arei) ; longpeng
; pa...@linux.ibm.com; borntrae...@de.ibm.com;
fran...@linux.ibm.com; al...@linux.ibm.com
Subject: [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for
virtio-crypto device

The virtio-crypto driver currently propagates to the guest
all the cipher algorithms that the backend cryptodev can
support. But in certain cases where the guest has more
performant mechanism to handle some algorithms, it would be
useful to propagate only a subset of the algorithms.



It makes sense to me. E.g. current Intel CPU has the AES-NI instruction for 
accelerating
AES algo. We don't need to propagate AES algos.


This patch adds support for disabling the cipher
algorithms of the backend cryptodev.

eg:
  -object cryptodev-backend-builtin,id=cryptodev0
  -device virtio-crypto-ccw,id=crypto0,cryptodev=cryptodev0,cipher-aes-cbc=off

Signed-off-by: Farhan Ali 
---

Please note this patch is not complete, and there are TODOs to handle
for other types of algorithms such Hash, AEAD and MAC algorithms.

This is mainly intended to get some feedback on the design approach
from the community.


  hw/virtio/virtio-crypto.c | 46
---
  include/hw/virtio/virtio-crypto.h |  3 +++
  2 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index 9a9fa49..4aed9ca 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -754,12 +754,22 @@ static void virtio_crypto_reset(VirtIODevice *vdev)
  static void virtio_crypto_init_config(VirtIODevice *vdev)
  {
  VirtIOCrypto *vcrypto = VIRTIO_CRYPTO(vdev);
+uint32_t user_crypto_services = (1u <<
VIRTIO_CRYPTO_SERVICE_CIPHER) |
+(1u <<
VIRTIO_CRYPTO_SERVICE_HASH) |
+(1u <<
VIRTIO_CRYPTO_SERVICE_AEAD) |
+(1u <<
VIRTIO_CRYPTO_SERVICE_MAC);
+
+if (vcrypto->user_cipher_algo_l & (1u << VIRTIO_CRYPTO_NO_CIPHER)) {
+vcrypto->user_cipher_algo_l = 1u << VIRTIO_CRYPTO_NO_CIPHER;
+vcrypto->user_cipher_algo_h = 0;
+user_crypto_services &= ~(1u <<
VIRTIO_CRYPTO_SERVICE_CIPHER);
+}

-vcrypto->conf.crypto_services =
+vcrypto->conf.crypto_services = user_crypto_services &
   vcrypto->conf.cryptodev->conf.crypto_services;
-vcrypto->conf.cipher_algo_l =
+vcrypto->conf.cipher_algo_l = vcrypto->user_cipher_algo_l &
   vcrypto->conf.cryptodev->conf.cipher_algo_l;
-vcrypto->conf.cipher_algo_h =
+vcrypto->conf.cipher_algo_h = vcrypto->user_cipher_algo_h &
   vcrypto->conf.cryptodev->conf.cipher_algo_h;
  vcrypto->conf.hash_algo = vcrypto->conf.cryptodev->conf.hash_algo;
  vcrypto->conf.mac_algo_l = vcrypto->conf.cryptodev->conf.mac_algo_l;
@@ -853,6 +863,34 @@ static const VMStateDescription
vmstate_virtio_crypto = {
  static Property virtio_crypto_properties[] = {
  DEFINE_PROP_LINK("cryptodev", VirtIOCrypto, conf.cryptodev,
   TYPE_CRYPTODEV_BACKEND, CryptoDevBackend
*),
+DEFINE_PROP_BIT("no-cipher", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_ARC4, false),


s/ VIRTIO_CRYPTO_CIPHER_ARC4/VIRTIO_CRYPTO_NO_CIPHER/


Thanks for catching that. I will change.




+DEFINE_PROP_BIT("cipher-arc4", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_ARC4, false),
+DEFINE_PROP_BIT("cipher-aes-ecb", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_ECB, false),
+DEFINE_PROP_BIT("cipher-aes-cbc", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_CBC, false),
+DEFINE_PROP_BIT("cipher-aes-ctr", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_CTR, false),
+DEFINE_PROP_BIT("cipher-des-ecb", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_DES_ECB, false),
+DEFINE_PROP_BIT("cipher-3des-ecb", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_3DES_ECB, false),
+DEFINE_PROP_BIT("cipher-3des-cbc", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_3DES_CBC, false),
+DEFINE_PROP_BIT("cipher-3des-ctr", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_3DES_CTR, false),
+DEFINE_PROP_BIT("cipher-kasumi-f8", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CR

Re: [Qemu-devel] [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device

2018-06-13 Thread Farhan Ali

Hi Daniel

On 06/13/2018 05:37 AM, Daniel P. Berrangé wrote:

On Tue, Jun 12, 2018 at 03:48:34PM -0400, Farhan Ali wrote:

The virtio-crypto driver currently propagates to the guest
all the cipher algorithms that the backend cryptodev can
support. But in certain cases where the guest has more
performant mechanism to handle some algorithms, it would be
useful to propagate only a subset of the algorithms.


I'm not really convinced by this.

The performance of crypto algorithms has many influencing
factors, making it pretty hard to decide which is best
without actively testing specific impls and comparing
them in a manner which matches the application usage
pattern. eg in theory the kernel crypto impl of an alg
is faster than a userspace impl, if the kernel uses
hardware accel and userspace does not. This, however,
ignores the overhead of the kernel/userspace switch.
The real world performance winner, thus depends on the
amount of data being processed in each operation. Some
times userspace can win & sometimes kernel space can
win. This is even more relevant to virtio-crypto as
it has more expensive context switches.


True. But what if the guest can perform some crypto algorithms without a 
incurring a VM exit? For example in s390 we have the cpacf instructions 
to perform crypto and this instruction is implemented for us by our 
hardware virtualization technology. In such a case it would be better 
not to use virtio-crypto's implementation of such a crypto algorithm.


At the same time we would like to take advantage of virtio-crypto's 
acceleration capabilities for certain crypto algorithms for which there 
is no hardware assistance.





IOW, when we expose a virtio-crypto dev to a guest,
it is never reasonable for the guest to blindly assume
that anything it does is faster than a pure software
impl running in the guest. It will depend on the usage
pattern. This is no different to bare metal where you
should not assume kernel crypto is faster.

IMHO this is not a compelling reason to be able to turn
off algorithms in virtio-crypto, as any decision will
always be at best incomplete & inaccurate.


But shouldn't the user have the option to try and test by turning off 
certain algorithms? You are right the performance will depend on usage 
patterns, but I believe at least the user should have the option to test 
and see what works and does not work. It would be far easier to do so 
with the virtio-crypto dev than changing code in the kernel or userspace 
IMHO.






@@ -853,6 +863,34 @@ static const VMStateDescription vmstate_virtio_crypto = {
  static Property virtio_crypto_properties[] = {
  DEFINE_PROP_LINK("cryptodev", VirtIOCrypto, conf.cryptodev,
   TYPE_CRYPTODEV_BACKEND, CryptoDevBackend *),
+DEFINE_PROP_BIT("no-cipher", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_ARC4, false),
+DEFINE_PROP_BIT("cipher-arc4", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_ARC4, false),
+DEFINE_PROP_BIT("cipher-aes-ecb", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_ECB, false),
+DEFINE_PROP_BIT("cipher-aes-cbc", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_CBC, false),
+DEFINE_PROP_BIT("cipher-aes-ctr", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_CTR, false),
+DEFINE_PROP_BIT("cipher-des-ecb", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_DES_ECB, false),
+DEFINE_PROP_BIT("cipher-3des-ecb", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_3DES_ECB, false),
+DEFINE_PROP_BIT("cipher-3des-cbc", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_3DES_CBC, false),
+DEFINE_PROP_BIT("cipher-3des-ctr", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_3DES_CTR, false),
+DEFINE_PROP_BIT("cipher-kasumi-f8", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_KASUMI_F8, false),
+DEFINE_PROP_BIT("cipher-snow3g-uea2", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_SNOW3G_UEA2, false),
+DEFINE_PROP_BIT("cipher-aes-f8", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_F8, false),
+DEFINE_PROP_BIT("cipher-aes-xts", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_XTS, false),
+DEFINE_PROP_BIT("cipher-zuc-eea3", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_ZUC_EEA3, false),


This does not scale as an approach IMHO which just reinforces to me
that we shouldn't do this.


I am open suggestions on better implementation. I thought defining a 
property bit for the virtio-crypto dev would 

[Qemu-devel] [RFC v1 1/1] virtio-crypto: Allow disabling of cipher algorithms for virtio-crypto device

2018-06-12 Thread Farhan Ali
The virtio-crypto driver currently propagates to the guest
all the cipher algorithms that the backend cryptodev can
support. But in certain cases where the guest has more
performant mechanism to handle some algorithms, it would be
useful to propagate only a subset of the algorithms.

This patch adds support for disabling the cipher
algorithms of the backend cryptodev.

eg:
 -object cryptodev-backend-builtin,id=cryptodev0
 -device virtio-crypto-ccw,id=crypto0,cryptodev=cryptodev0,cipher-aes-cbc=off

Signed-off-by: Farhan Ali 
---

Please note this patch is not complete, and there are TODOs to handle
for other types of algorithms such Hash, AEAD and MAC algorithms.

This is mainly intended to get some feedback on the design approach
from the community.


 hw/virtio/virtio-crypto.c | 46 ---
 include/hw/virtio/virtio-crypto.h |  3 +++
 2 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index 9a9fa49..4aed9ca 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -754,12 +754,22 @@ static void virtio_crypto_reset(VirtIODevice *vdev)
 static void virtio_crypto_init_config(VirtIODevice *vdev)
 {
 VirtIOCrypto *vcrypto = VIRTIO_CRYPTO(vdev);
+uint32_t user_crypto_services = (1u << VIRTIO_CRYPTO_SERVICE_CIPHER) |
+(1u << VIRTIO_CRYPTO_SERVICE_HASH) |
+(1u << VIRTIO_CRYPTO_SERVICE_AEAD) |
+(1u << VIRTIO_CRYPTO_SERVICE_MAC);
+
+if (vcrypto->user_cipher_algo_l & (1u << VIRTIO_CRYPTO_NO_CIPHER)) {
+vcrypto->user_cipher_algo_l = 1u << VIRTIO_CRYPTO_NO_CIPHER;
+vcrypto->user_cipher_algo_h = 0;
+user_crypto_services &= ~(1u << VIRTIO_CRYPTO_SERVICE_CIPHER);
+}
 
-vcrypto->conf.crypto_services =
+vcrypto->conf.crypto_services = user_crypto_services &
  vcrypto->conf.cryptodev->conf.crypto_services;
-vcrypto->conf.cipher_algo_l =
+vcrypto->conf.cipher_algo_l = vcrypto->user_cipher_algo_l &
  vcrypto->conf.cryptodev->conf.cipher_algo_l;
-vcrypto->conf.cipher_algo_h =
+vcrypto->conf.cipher_algo_h = vcrypto->user_cipher_algo_h &
  vcrypto->conf.cryptodev->conf.cipher_algo_h;
 vcrypto->conf.hash_algo = vcrypto->conf.cryptodev->conf.hash_algo;
 vcrypto->conf.mac_algo_l = vcrypto->conf.cryptodev->conf.mac_algo_l;
@@ -853,6 +863,34 @@ static const VMStateDescription vmstate_virtio_crypto = {
 static Property virtio_crypto_properties[] = {
 DEFINE_PROP_LINK("cryptodev", VirtIOCrypto, conf.cryptodev,
  TYPE_CRYPTODEV_BACKEND, CryptoDevBackend *),
+DEFINE_PROP_BIT("no-cipher", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_ARC4, false),
+DEFINE_PROP_BIT("cipher-arc4", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_ARC4, false),
+DEFINE_PROP_BIT("cipher-aes-ecb", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_ECB, false),
+DEFINE_PROP_BIT("cipher-aes-cbc", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_CBC, false),
+DEFINE_PROP_BIT("cipher-aes-ctr", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_CTR, false),
+DEFINE_PROP_BIT("cipher-des-ecb", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_DES_ECB, false),
+DEFINE_PROP_BIT("cipher-3des-ecb", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_3DES_ECB, false),
+DEFINE_PROP_BIT("cipher-3des-cbc", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_3DES_CBC, false),
+DEFINE_PROP_BIT("cipher-3des-ctr", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_3DES_CTR, false),
+DEFINE_PROP_BIT("cipher-kasumi-f8", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_KASUMI_F8, false),
+DEFINE_PROP_BIT("cipher-snow3g-uea2", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_SNOW3G_UEA2, false),
+DEFINE_PROP_BIT("cipher-aes-f8", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_F8, false),
+DEFINE_PROP_BIT("cipher-aes-xts", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_AES_XTS, false),
+DEFINE_PROP_BIT("cipher-zuc-eea3", VirtIOCrypto, user_cipher_algo_l,
+VIRTIO_CRYPTO_CIPHER_ZUC_EEA3, false),
 DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -974,6 +1012,8 @@ static void virtio_crypto_instance_init(Objec

Re: [Qemu-devel] [RFC v1 2/2] crypto/virtio-crypto: Register an algo only if it's supported

2018-06-12 Thread Farhan Ali




On 06/11/2018 04:48 AM, Gonglei (Arei) wrote:




-Original Message-
From: Farhan Ali [mailto:al...@linux.ibm.com]
Sent: Saturday, June 09, 2018 3:09 AM
To: linux-ker...@vger.kernel.org; k...@vger.kernel.org
Cc: m...@redhat.com; qemu-devel@nongnu.org; Gonglei (Arei)
; longpeng ;
pa...@linux.ibm.com; fran...@linux.ibm.com; borntrae...@de.ibm.com;
al...@linux.ibm.com
Subject: [RFC v1 2/2] crypto/virtio-crypto: Register an algo only if it's 
supported

From: Farhan Ali 

Register a crypto algo with the Linux crypto layer only if
the algorithm is supported by the backend virtio-crypto
device.

Also route crypto requests to a virtio-crypto
device, only if it can support the requested service and
algorithm.

Signed-off-by: Farhan Ali 
---
  drivers/crypto/virtio/virtio_crypto_algs.c   | 110
++-
  drivers/crypto/virtio/virtio_crypto_common.h |  11 ++-
  drivers/crypto/virtio/virtio_crypto_mgr.c|  81 ++--
  3 files changed, 158 insertions(+), 44 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c
b/drivers/crypto/virtio/virtio_crypto_algs.c
index ba190cf..fef112a 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -49,12 +49,18 @@ struct virtio_crypto_sym_request {
bool encrypt;
  };

+struct virtio_crypto_algo {
+   uint32_t algonum;
+   uint32_t service;
+   unsigned int active_devs;
+   struct crypto_alg algo;
+};
+
  /*
   * The algs_lock protects the below global virtio_crypto_active_devs
   * and crypto algorithms registion.
   */
  static DEFINE_MUTEX(algs_lock);
-static unsigned int virtio_crypto_active_devs;
  static void virtio_crypto_ablkcipher_finalize_req(
struct virtio_crypto_sym_request *vc_sym_req,
struct ablkcipher_request *req,
@@ -312,13 +318,19 @@ static int virtio_crypto_ablkcipher_setkey(struct
crypto_ablkcipher *tfm,
 unsigned int keylen)
  {
struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(tfm);
+   uint32_t alg;
int ret;

+   ret = virtio_crypto_alg_validate_key(keylen, );
+   if (ret)
+   return ret;
+
if (!ctx->vcrypto) {
/* New key */
int node = virtio_crypto_get_current_node();
struct virtio_crypto *vcrypto =
- virtcrypto_get_dev_node(node);
+ virtcrypto_get_dev_node(node,
+ VIRTIO_CRYPTO_SERVICE_CIPHER, alg);
if (!vcrypto) {
pr_err("virtio_crypto: Could not find a virtio device 
in the
system\n");


We'd better change the above error message now. What about:
  " virtio_crypto: Could not find a virtio device in the system or unsupported 
algo" ?

Regards,
-Gonglei



Sure, I will update the error message. But other than that does the rest 
of the code looks good to you?


Thanks
Farhan











Re: [Qemu-devel] [RFC v1 1/2] crypto/virtio-crypto: Read crypto services and algorithm masks

2018-06-12 Thread Farhan Ali

Hi Arei

On 06/11/2018 02:43 AM, Gonglei (Arei) wrote:



-Original Message-
From: Farhan Ali [mailto:al...@linux.ibm.com]
Sent: Saturday, June 09, 2018 3:09 AM
To: linux-ker...@vger.kernel.org; k...@vger.kernel.org
Cc: m...@redhat.com; qemu-devel@nongnu.org; Gonglei (Arei)
; longpeng ;
pa...@linux.ibm.com; fran...@linux.ibm.com; borntrae...@de.ibm.com;
al...@linux.ibm.com
Subject: [RFC v1 1/2] crypto/virtio-crypto: Read crypto services and algorithm
masks

Read the crypto services and algorithm masks which provides
information about the services and algorithms supported by
virtio-crypto backend.

Signed-off-by: Farhan Ali 
---
  drivers/crypto/virtio/virtio_crypto_common.h | 14 ++
  drivers/crypto/virtio/virtio_crypto_core.c   | 29

  2 files changed, 43 insertions(+)

diff --git a/drivers/crypto/virtio/virtio_crypto_common.h
b/drivers/crypto/virtio/virtio_crypto_common.h
index 66501a5..05eca12e 100644
--- a/drivers/crypto/virtio/virtio_crypto_common.h
+++ b/drivers/crypto/virtio/virtio_crypto_common.h
@@ -55,6 +55,20 @@ struct virtio_crypto {
/* Number of queue currently used by the driver */
u32 curr_queue;

+   /*
+* Specifies the services mask which the device support,
+* see VIRTIO_CRYPTO_SERVICE_* above
+*/


Pls update the above comments. Except that:

Acked-by: Gonglei 



Sure will update the comment. How about " Specifies the services mask 
which the device support, * see VIRTIO_CRYPTO_SERVICE_*" ?


or should I specify the file where the VIRTIO_CRYPTO_SERVICE_* are defined?

Thanks
Farhan


+   u32 crypto_services;
+
+   /* Detailed algorithms mask */
+   u32 cipher_algo_l;
+   u32 cipher_algo_h;
+   u32 hash_algo;
+   u32 mac_algo_l;
+   u32 mac_algo_h;
+   u32 aead_algo;
+
/* Maximum length of cipher key */
u32 max_cipher_key_len;
/* Maximum length of authenticated key */
diff --git a/drivers/crypto/virtio/virtio_crypto_core.c
b/drivers/crypto/virtio/virtio_crypto_core.c
index 8332698..8f745f2 100644
--- a/drivers/crypto/virtio/virtio_crypto_core.c
+++ b/drivers/crypto/virtio/virtio_crypto_core.c
@@ -303,6 +303,13 @@ static int virtcrypto_probe(struct virtio_device *vdev)
u32 max_data_queues = 0, max_cipher_key_len = 0;
u32 max_auth_key_len = 0;
u64 max_size = 0;
+   u32 cipher_algo_l = 0;
+   u32 cipher_algo_h = 0;
+   u32 hash_algo = 0;
+   u32 mac_algo_l = 0;
+   u32 mac_algo_h = 0;
+   u32 aead_algo = 0;
+   u32 crypto_services = 0;

if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
return -ENODEV;
@@ -339,6 +346,20 @@ static int virtcrypto_probe(struct virtio_device *vdev)
max_auth_key_len, _auth_key_len);
virtio_cread(vdev, struct virtio_crypto_config,
max_size, _size);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   crypto_services, _services);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   cipher_algo_l, _algo_l);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   cipher_algo_h, _algo_h);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   hash_algo, _algo);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   mac_algo_l, _algo_l);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   mac_algo_h, _algo_h);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   aead_algo, _algo);

/* Add virtio crypto device to global table */
err = virtcrypto_devmgr_add_dev(vcrypto);
@@ -358,6 +379,14 @@ static int virtcrypto_probe(struct virtio_device *vdev)
vcrypto->max_cipher_key_len = max_cipher_key_len;
vcrypto->max_auth_key_len = max_auth_key_len;
vcrypto->max_size = max_size;
+   vcrypto->crypto_services = crypto_services;
+   vcrypto->cipher_algo_l = cipher_algo_l;
+   vcrypto->cipher_algo_h = cipher_algo_h;
+   vcrypto->mac_algo_l = mac_algo_l;
+   vcrypto->mac_algo_h = mac_algo_h;
+   vcrypto->hash_algo = hash_algo;
+   vcrypto->aead_algo = aead_algo;
+

dev_info(>dev,
"max_queues: %u, max_cipher_key_len: %u, max_auth_key_len: %u,
max_size 0x%llx\n",
--
2.7.4








[Qemu-devel] [RFC v1 2/2] crypto/virtio-crypto: Register an algo only if it's supported

2018-06-08 Thread Farhan Ali
From: Farhan Ali 

Register a crypto algo with the Linux crypto layer only if
the algorithm is supported by the backend virtio-crypto
device.

Also route crypto requests to a virtio-crypto
device, only if it can support the requested service and
algorithm.

Signed-off-by: Farhan Ali 
---
 drivers/crypto/virtio/virtio_crypto_algs.c   | 110 ++-
 drivers/crypto/virtio/virtio_crypto_common.h |  11 ++-
 drivers/crypto/virtio/virtio_crypto_mgr.c|  81 ++--
 3 files changed, 158 insertions(+), 44 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c 
b/drivers/crypto/virtio/virtio_crypto_algs.c
index ba190cf..fef112a 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -49,12 +49,18 @@ struct virtio_crypto_sym_request {
bool encrypt;
 };
 
+struct virtio_crypto_algo {
+   uint32_t algonum;
+   uint32_t service;
+   unsigned int active_devs;
+   struct crypto_alg algo;
+};
+
 /*
  * The algs_lock protects the below global virtio_crypto_active_devs
  * and crypto algorithms registion.
  */
 static DEFINE_MUTEX(algs_lock);
-static unsigned int virtio_crypto_active_devs;
 static void virtio_crypto_ablkcipher_finalize_req(
struct virtio_crypto_sym_request *vc_sym_req,
struct ablkcipher_request *req,
@@ -312,13 +318,19 @@ static int virtio_crypto_ablkcipher_setkey(struct 
crypto_ablkcipher *tfm,
 unsigned int keylen)
 {
struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(tfm);
+   uint32_t alg;
int ret;
 
+   ret = virtio_crypto_alg_validate_key(keylen, );
+   if (ret)
+   return ret;
+
if (!ctx->vcrypto) {
/* New key */
int node = virtio_crypto_get_current_node();
struct virtio_crypto *vcrypto =
- virtcrypto_get_dev_node(node);
+ virtcrypto_get_dev_node(node,
+ VIRTIO_CRYPTO_SERVICE_CIPHER, alg);
if (!vcrypto) {
pr_err("virtio_crypto: Could not find a virtio device 
in the system\n");
return -ENODEV;
@@ -571,57 +583,85 @@ static void virtio_crypto_ablkcipher_finalize_req(
virtcrypto_clear_request(_sym_req->base);
 }
 
-static struct crypto_alg virtio_crypto_algs[] = { {
-   .cra_name = "cbc(aes)",
-   .cra_driver_name = "virtio_crypto_aes_cbc",
-   .cra_priority = 150,
-   .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
-   .cra_blocksize = AES_BLOCK_SIZE,
-   .cra_ctxsize  = sizeof(struct virtio_crypto_ablkcipher_ctx),
-   .cra_alignmask = 0,
-   .cra_module = THIS_MODULE,
-   .cra_type = _ablkcipher_type,
-   .cra_init = virtio_crypto_ablkcipher_init,
-   .cra_exit = virtio_crypto_ablkcipher_exit,
-   .cra_u = {
-  .ablkcipher = {
-   .setkey = virtio_crypto_ablkcipher_setkey,
-   .decrypt = virtio_crypto_ablkcipher_decrypt,
-   .encrypt = virtio_crypto_ablkcipher_encrypt,
-   .min_keysize = AES_MIN_KEY_SIZE,
-   .max_keysize = AES_MAX_KEY_SIZE,
-   .ivsize = AES_BLOCK_SIZE,
+static struct virtio_crypto_algo virtio_crypto_algs[] = { {
+   .algonum = VIRTIO_CRYPTO_CIPHER_AES_CBC,
+   .service = VIRTIO_CRYPTO_SERVICE_CIPHER,
+   .algo = {
+   .cra_name = "cbc(aes)",
+   .cra_driver_name = "virtio_crypto_aes_cbc",
+   .cra_priority = 150,
+   .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
+   .cra_blocksize = AES_BLOCK_SIZE,
+   .cra_ctxsize  = sizeof(struct virtio_crypto_ablkcipher_ctx),
+   .cra_alignmask = 0,
+   .cra_module = THIS_MODULE,
+   .cra_type = _ablkcipher_type,
+   .cra_init = virtio_crypto_ablkcipher_init,
+   .cra_exit = virtio_crypto_ablkcipher_exit,
+   .cra_u = {
+   .ablkcipher = {
+   .setkey = virtio_crypto_ablkcipher_setkey,
+   .decrypt = virtio_crypto_ablkcipher_decrypt,
+   .encrypt = virtio_crypto_ablkcipher_encrypt,
+   .min_keysize = AES_MIN_KEY_SIZE,
+   .max_keysize = AES_MAX_KEY_SIZE,
+   .ivsize = AES_BLOCK_SIZE,
+   },
},
},
 } };
 
-int virtio_crypto_algs_register(void)
+int virtio_crypto_algs_register(struct virtio_crypto *vcrypto)
 {
int ret = 0;
+   int i = 0;
 
mutex_lock(_lock);
-   if (++virtio_crypto_active_devs != 1)
-   goto unlock;

[Qemu-devel] [RFC v1 1/2] crypto/virtio-crypto: Read crypto services and algorithm masks

2018-06-08 Thread Farhan Ali
Read the crypto services and algorithm masks which provides
information about the services and algorithms supported by
virtio-crypto backend.

Signed-off-by: Farhan Ali 
---
 drivers/crypto/virtio/virtio_crypto_common.h | 14 ++
 drivers/crypto/virtio/virtio_crypto_core.c   | 29 
 2 files changed, 43 insertions(+)

diff --git a/drivers/crypto/virtio/virtio_crypto_common.h 
b/drivers/crypto/virtio/virtio_crypto_common.h
index 66501a5..05eca12e 100644
--- a/drivers/crypto/virtio/virtio_crypto_common.h
+++ b/drivers/crypto/virtio/virtio_crypto_common.h
@@ -55,6 +55,20 @@ struct virtio_crypto {
/* Number of queue currently used by the driver */
u32 curr_queue;
 
+   /*
+* Specifies the services mask which the device support,
+* see VIRTIO_CRYPTO_SERVICE_* above
+*/
+   u32 crypto_services;
+
+   /* Detailed algorithms mask */
+   u32 cipher_algo_l;
+   u32 cipher_algo_h;
+   u32 hash_algo;
+   u32 mac_algo_l;
+   u32 mac_algo_h;
+   u32 aead_algo;
+
/* Maximum length of cipher key */
u32 max_cipher_key_len;
/* Maximum length of authenticated key */
diff --git a/drivers/crypto/virtio/virtio_crypto_core.c 
b/drivers/crypto/virtio/virtio_crypto_core.c
index 8332698..8f745f2 100644
--- a/drivers/crypto/virtio/virtio_crypto_core.c
+++ b/drivers/crypto/virtio/virtio_crypto_core.c
@@ -303,6 +303,13 @@ static int virtcrypto_probe(struct virtio_device *vdev)
u32 max_data_queues = 0, max_cipher_key_len = 0;
u32 max_auth_key_len = 0;
u64 max_size = 0;
+   u32 cipher_algo_l = 0;
+   u32 cipher_algo_h = 0;
+   u32 hash_algo = 0;
+   u32 mac_algo_l = 0;
+   u32 mac_algo_h = 0;
+   u32 aead_algo = 0;
+   u32 crypto_services = 0;
 
if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
return -ENODEV;
@@ -339,6 +346,20 @@ static int virtcrypto_probe(struct virtio_device *vdev)
max_auth_key_len, _auth_key_len);
virtio_cread(vdev, struct virtio_crypto_config,
max_size, _size);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   crypto_services, _services);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   cipher_algo_l, _algo_l);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   cipher_algo_h, _algo_h);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   hash_algo, _algo);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   mac_algo_l, _algo_l);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   mac_algo_h, _algo_h);
+   virtio_cread(vdev, struct virtio_crypto_config,
+   aead_algo, _algo);
 
/* Add virtio crypto device to global table */
err = virtcrypto_devmgr_add_dev(vcrypto);
@@ -358,6 +379,14 @@ static int virtcrypto_probe(struct virtio_device *vdev)
vcrypto->max_cipher_key_len = max_cipher_key_len;
vcrypto->max_auth_key_len = max_auth_key_len;
vcrypto->max_size = max_size;
+   vcrypto->crypto_services = crypto_services;
+   vcrypto->cipher_algo_l = cipher_algo_l;
+   vcrypto->cipher_algo_h = cipher_algo_h;
+   vcrypto->mac_algo_l = mac_algo_l;
+   vcrypto->mac_algo_h = mac_algo_h;
+   vcrypto->hash_algo = hash_algo;
+   vcrypto->aead_algo = aead_algo;
+
 
dev_info(>dev,
"max_queues: %u, max_cipher_key_len: %u, max_auth_key_len: %u, 
max_size 0x%llx\n",
-- 
2.7.4




[Qemu-devel] [RFC v1 0/2] Detect & register virtio-crypto algos only if it can be supported by backend

2018-06-08 Thread Farhan Ali
Hi,

Currently the Linux virtio-crypto driver registers the crypto
algorithm without verifying if the backend actually supports the
algorithm.

This kernel patch series adds support for registering algorithm
with Linux crypto layer, only if the algorithm is supported by
the backend device. This also makes the driver more compliant with
the virtio-crypto spec [1].

I would appreciate any feedback or comments on this.

Thank you
Farhan

Reference
-
[1] Virtio crypto spec proposal 
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg00816.html

Farhan Ali (2):
  crypto/virtio-crypto: Read crypto services and algorithm masks
  crypto/virtio-crypto: Register an algo only if it's supported

 drivers/crypto/virtio/virtio_crypto_algs.c   | 110 ++-
 drivers/crypto/virtio/virtio_crypto_common.h |  25 +-
 drivers/crypto/virtio/virtio_crypto_core.c   |  29 +++
 drivers/crypto/virtio/virtio_crypto_mgr.c|  81 ++--
 4 files changed, 201 insertions(+), 44 deletions(-)

-- 
2.7.4




Re: [Qemu-devel] [PATCH 2/3] pc-bios/s390-ccw/net: Add support for pxelinux-style config files

2018-06-01 Thread Farhan Ali




On 05/31/2018 11:21 PM, Thomas Huth wrote:

Just a question do we want to clear cfgbuf here, before calling
pxelinux_load_parse_cfg?

That's theoretically not necessary. The buffer either gets populated
with data, or the function errors out. The code also makes sure that
there is a final NUL-character in the buffer:

https://github.com/aik/SLOF/blob/64c526a/lib/libnet/pxelinux.c#L169

... but I think I've got to double check that there is also a
NUL-character immediately at the end of the downloaded data ... so
there's indeed a change required, but likely rather in the SLOF code
than here.

  Thomas




Can't we do that in net_try_direct_tftp_load, or it is better to put 
that in SLOF code?





Re: [Qemu-devel] [PATCH 3/3] pc-bios/s390-ccw/net: Try to load pxelinux.cfg file accoring to the UUID

2018-06-01 Thread Farhan Ali



Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH 2/3] pc-bios/s390-ccw/net: Add support for pxelinux-style config files

2018-05-31 Thread Farhan Ali




On 05/30/2018 05:16 AM, Thomas Huth wrote:

Since it is quite cumbersome to manually create a combined kernel with
initrd image for network booting, we now support loading via pxelinux
configuration files, too. In these files, the kernel, initrd and command
line parameters can be specified seperately, and the firmware then takes
care of glueing everything together in memory after the files have been
downloaded. See this URL for details about the config file layout:
https://www.syslinux.org/wiki/index.php?title=PXELINUX

The user can either specify a config file directly as bootfile via DHCP
(but in this case, the file has to start either with "default" or a "#"
comment so we can distinguish it from binary kernels), or a folder (i.e.
the bootfile name must end with "/") where the firmware should look for
the typical pxelinux.cfg file names, e.g. based on MAC or IP address.
We also support the pxelinux.cfg DHCP options 209 and 210 from RFC 5071.

Signed-off-by: Thomas Huth 
---
  pc-bios/s390-ccw/netboot.mak |  7 ++--
  pc-bios/s390-ccw/netmain.c   | 79 +++-
  2 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/pc-bios/s390-ccw/netboot.mak b/pc-bios/s390-ccw/netboot.mak
index a73be36..8af0cfd 100644
--- a/pc-bios/s390-ccw/netboot.mak
+++ b/pc-bios/s390-ccw/netboot.mak
@@ -25,8 +25,9 @@ CTYPE_OBJS = isdigit.o isxdigit.o toupper.o
  %.o : $(SLOF_DIR)/lib/libc/ctype/%.c
$(call quiet-command,$(CC) $(LIBC_CFLAGS) -c -o $@ 
$<,"CC","$(TARGET_DIR)$@")
  
-STRING_OBJS = strcat.o strchr.o strcmp.o strcpy.o strlen.o strncmp.o strncpy.o \

- strstr.o memset.o memcpy.o memmove.o memcmp.o
+STRING_OBJS = strcat.o strchr.o strrchr.o strcpy.o strlen.o strncpy.o \
+ strcmp.o strncmp.o strcasecmp.o strncasecmp.o strstr.o \
+ memset.o memcpy.o memmove.o memcmp.o
  %.o : $(SLOF_DIR)/lib/libc/string/%.c
$(call quiet-command,$(CC) $(LIBC_CFLAGS) -c -o $@ 
$<,"CC","$(TARGET_DIR)$@")
  
@@ -50,7 +51,7 @@ libc.a: $(LIBCOBJS)

  # libnet files:
  
  LIBNETOBJS := args.o dhcp.o dns.o icmpv6.o ipv6.o tcp.o udp.o bootp.o \

- dhcpv6.o ethernet.o ipv4.o ndp.o tftp.o
+ dhcpv6.o ethernet.o ipv4.o ndp.o tftp.o pxelinux.o
  LIBNETCFLAGS := $(QEMU_CFLAGS) -DDHCPARCH=0x1F $(LIBC_INC) $(LIBNET_INC)
  
  %.o : $(SLOF_DIR)/lib/libnet/%.c

diff --git a/pc-bios/s390-ccw/netmain.c b/pc-bios/s390-ccw/netmain.c
index 7533cf7..e84bb2b 100644
--- a/pc-bios/s390-ccw/netmain.c
+++ b/pc-bios/s390-ccw/netmain.c
@@ -30,6 +30,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include "s390-ccw.h"

  #include "virtio.h"
@@ -41,12 +42,14 @@ extern char _start[];
  
  #define KERNEL_ADDR ((void *)0L)

  #define KERNEL_MAX_SIZE ((long)_start)
+#define ARCH_COMMAND_LINE_SIZE  896  /* Taken from Linux kernel */
  
  char stack[PAGE_SIZE * 8] __attribute__((aligned(PAGE_SIZE)));

  IplParameterBlock iplb __attribute__((aligned(PAGE_SIZE)));
  static char cfgbuf[2048];
  
  static SubChannelId net_schid = { .one = 1 };

+static uint8_t mac[6];
  static uint64_t dest_timer;
  
  static uint64_t get_timer_ms(void)

@@ -158,7 +161,6 @@ static int tftp_load(filename_ip_t *fnip, void *buffer, int 
len)
  
  static int net_init(filename_ip_t *fn_ip)

  {
-uint8_t mac[6];
  int rc;
  
  memset(fn_ip, 0, sizeof(filename_ip_t));

@@ -233,6 +235,66 @@ static void net_release(filename_ip_t *fn_ip)
  }
  
  /**

+ * Load a kernel with initrd (i.e. with the information that we've got from
+ * a pxelinux.cfg config file)
+ */
+static int load_kernel_with_initrd(filename_ip_t *fn_ip,
+   struct pl_cfg_entry *entry)
+{
+int rc;
+
+printf("Loading pxelinux.cfg entry '%s'\n", entry->label);
+
+if (!entry->kernel) {
+printf("Kernel entry is missing!\n");
+return -1;
+}
+
+strncpy(fn_ip->filename, entry->kernel, sizeof(fn_ip->filename));
+rc = tftp_load(fn_ip, KERNEL_ADDR, KERNEL_MAX_SIZE);
+if (rc < 0) {
+return rc;
+}
+
+if (entry->initrd) {
+uint64_t iaddr = (rc + 0xfff) & ~0xfffUL;
+
+strncpy(fn_ip->filename, entry->initrd, sizeof(fn_ip->filename));
+rc = tftp_load(fn_ip, (void *)iaddr, KERNEL_MAX_SIZE - iaddr);
+if (rc < 0) {
+return rc;
+}
+/* Patch location and size: */
+*(uint64_t *)0x10408 = iaddr;
+*(uint64_t *)0x10410 = rc;
+rc += iaddr;
+}
+
+if (entry->append) {
+strncpy((char *)0x10480, entry->append, ARCH_COMMAND_LINE_SIZE);
+}
+
+return rc;
+}
+
+#define MAX_PXELINUX_ENTRIES 16
+
+static int net_try_pxelinux_cfg(filename_ip_t *fn_ip)
+{
+struct pl_cfg_entry entries[MAX_PXELINUX_ENTRIES];
+int num_ent, def_ent = 0;
+
+num_ent = pxelinux_load_parse_cfg(fn_ip, mac, NULL, DEFAULT_TFTP_RETRIES,
+  cfgbuf, sizeof(cfgbuf),
+   

Re: [Qemu-devel] [PATCH] pc-bios/s390-ccw: define loadparm length

2018-05-30 Thread Farhan Ali




On 05/29/2018 12:40 AM, Collin Walling wrote:

Loadparm is defined by the s390 architecture to be 8 bytes
in length. Let's define this size in the s390-ccw bios.

Suggested-by: Laszlo Ersek
Signed-off-by: Collin Walling
---
  pc-bios/s390-ccw/iplb.h | 4 +++-
  pc-bios/s390-ccw/main.c | 8 
  pc-bios/s390-ccw/sclp.c | 2 +-
  pc-bios/s390-ccw/sclp.h | 2 +-
  4 files changed, 9 insertions(+), 7 deletions(-)


Reviewed-by: Farhan Ali 




Re: [Qemu-devel] [PATCH] s390-ccw: force diag 308 subcode to unsigned long

2018-05-03 Thread Farhan Ali



On 05/03/2018 11:44 AM, Cornelia Huck wrote:

On Thu, 3 May 2018 11:25:08 -0400
Farhan Ali <al...@linux.ibm.com> wrote:


On 05/02/2018 08:52 AM, Cornelia Huck wrote:

We currently pass an integer as the subcode parameter. However,
the upper bits of the register containing the subcode need to
be 0, which is not guaranteed unless we explicitly specify the
subcode to be an unsigned long value.

Fixes: d046c51dad3 ("pc-bios/s390-ccw: Get device address via diag 308/6")
Cc:qemu-sta...@nongnu.org
Signed-off-by: Cornelia Huck<coh...@redhat.com>


Sorry for my ignorance, but is there a C standard that says upper bits
of an int is not guaranteed to be 0?


The value (5 resp. 6) is small enough to fit into a regular integer,
and the compiler generated a lhi for the load, which did not change any
upper values that might have been in the register previously. Telling
the compiler to treat the value as an unsigned long makes it generate a
lghi.



This makes sense now. Thanks for the explanation :)




Re: [Qemu-devel] [PATCH] s390-ccw: force diag 308 subcode to unsigned long

2018-05-03 Thread Farhan Ali



On 05/03/2018 11:48 AM, Eric Blake wrote:

On 05/03/2018 10:25 AM, Farhan Ali wrote:

On 05/02/2018 08:52 AM, Cornelia Huck wrote:

We currently pass an integer as the subcode parameter. However,
the upper bits of the register containing the subcode need to
be 0, which is not guaranteed unless we explicitly specify the
subcode to be an unsigned long value.

Fixes: d046c51dad3 ("pc-bios/s390-ccw: Get device address via diag 
308/6")

Cc:qemu-sta...@nongnu.org
Signed-off-by: Cornelia Huck<coh...@redhat.com>


Sorry for my ignorance, but is there a C standard that says upper bits 
of an int is not guaranteed to be 0?


We're outside the bounds of the C standard because of the use of asm(). 
The problem here is that the compiler assigning a 32-bit int into a 
64-bit register uses the shortest sequence possible (leaving the upper 
64 bits garbage), because the compiler assumes you correctly wrote the 
assembly to only use 32-bit operations on that register (which don't 
care about the upper bits).  By using an unsigned long (a 64-bit value), 
the compiler instead emits assembly to write the full 64-bit register 
value, rather than leaving the upper bits as garbage; and this matters 
because we are subsequently using all 64 bits of the register in a later 
operation.  We could also use a signed long, even long long, or written 
it as: (store ? 6ULL : 5ULL) instead of using a temporary variable.  The 
crux of the fix is that you have to tell asm() that you want a 64-bit 
value written (the unpatched (store ? 6 : 5) is only a 32-bit value), 
and not whether that value is signed or unsigned (since the 
representation of both 6 and 5 are the same regardless of whether the 
type being written into the register is signed or not).




Thank you so much for the detailed explanation :).

I did not think about the instruction that will be used by the compiler 
to handle the values. Definitely learned something new!





Re: [Qemu-devel] [PATCH] s390-ccw: force diag 308 subcode to unsigned long

2018-05-03 Thread Farhan Ali



On 05/02/2018 08:52 AM, Cornelia Huck wrote:

We currently pass an integer as the subcode parameter. However,
the upper bits of the register containing the subcode need to
be 0, which is not guaranteed unless we explicitly specify the
subcode to be an unsigned long value.

Fixes: d046c51dad3 ("pc-bios/s390-ccw: Get device address via diag 308/6")
Cc:qemu-sta...@nongnu.org
Signed-off-by: Cornelia Huck


Sorry for my ignorance, but is there a C standard that says upper bits 
of an int is not guaranteed to be 0?


Thanks
Farhan




Re: [Qemu-devel] [PATCH v3 1/3] pc-bios/s390-ccw/net: Split up net_load() into init, load and uninit parts

2018-04-25 Thread Farhan Ali



On 04/25/2018 05:08 AM, Thomas Huth wrote:

When we want to support pxelinux-style network booting later, we've got
to do several TFTP transfers - and we do not want to apply for a new IP
address via DHCP each time. So split up net_load into three parts:

1. net_init(), which initializes virtio-net, gets an IP address via DHCP
and prints out the related information.

2. The tftp_load call is now moved directly into the main() function

3. A new net_release() function which should tear down the network stack
before we are done in the firmware.

This will make it easier to extend the code in the next patches.

Signed-off-by: Thomas Huth


You might want to change the commit header "uninit parts" to "release 
parts" :)





Re: [Qemu-devel] [qemu-s390x] [PATCH v1 for-2.13 0/4] pc-bios/s390-ccw: Network boot improvements

2018-04-19 Thread Farhan Ali



On 04/19/2018 01:27 AM, Thomas Huth wrote:

Ah, that's not a memory leak, though it might look like one if you don't
know what the vring_send_buf function is doing: vring_send_buf adds a
buffer to a virtio ring for later use. So the buffer is not unused after
this function, but gets filled in later by the virtio-net device. Thus
we must not free the buffer here.

We could free it in the "uninit" function after shutting down the
device, but OTOH we are then leaving the firmware code anyway by jumping
into the OS kernel, so it does not really matter whether we free()ed the
buffers or not.

  Thomas


That makes a lot of sense now :) thanks a lot for that explanation and 
sorry for my ignorance.





Re: [Qemu-devel] [PATCH v1 for-2.13 0/4] pc-bios/s390-ccw: Network boot improvements

2018-04-18 Thread Farhan Ali



On 04/18/2018 08:31 AM, Thomas Huth wrote:

Some patches to improve the network boot experience on s390x:

First, make sure that we shut down the virtio-net device before jumping
into the kernel. Otherwise some incoming packets might destroy some of
the kernel's data if it has not taken over the device yet.

Then the last two patches add support for loading kernels via
configuration files - pxelinux-style and .INS-file style. This way
you don't have to manually glue your ramdisk to your kernel anymore,
so this should be quite a relieve for all users who want to boot
Linux via the network.

The config file parsers have been completely written by myself from
scratch and only tested with some config files that I came up with
on my own. So if anybody has some pre-existing pxelinux config files
already for booting a s390x, I'd appreciate some testing to see whether
this works as expected for you, too!

Thomas Huth (4):
   pc-bios/s390-ccw/net: Split up net_load() into init, load and uninit
 parts
   pc-bios/s390-ccw/net: Stop virtio-net device before jumping into the
 OS
   pc-bios/s390-ccw/net: Add support for pxelinux-style config files
   pc-bios/s390-ccw/net: Add support for .INS config files

  pc-bios/s390-ccw/netboot.mak  |   5 +-
  pc-bios/s390-ccw/netmain.c| 312 ++
  pc-bios/s390-ccw/virtio-net.c |   8 ++
  pc-bios/s390-ccw/virtio.c |  19 ++-
  pc-bios/s390-ccw/virtio.h |   3 +
  5 files changed, 312 insertions(+), 35 deletions(-)




I have tried with pxelinux default config file I had and it worked fine. 
I will try a couple more tests.


Also while going through the code again, I noticed a memory leak in the 
function virtio_net_init, and fixed it. I could send a formal patch for 
it, if you want to queue it as part of this series as I think we might 
have missed 2.12 window?


diff --git a/pc-bios/s390-ccw/virtio-net.c b/pc-bios/s390-ccw/virtio-net.c
index ff7f4da..e83ba08 100644
--- a/pc-bios/s390-ccw/virtio-net.c
+++ b/pc-bios/s390-ccw/virtio-net.c
@@ -61,6 +61,7 @@ int virtio_net_init(void *mac_addr)
 IPL_assert(buf != NULL, "Can not allocate memory for receive 
buffers");

 vring_send_buf(rxvq, buf, ETH_MTU_SIZE + sizeof(VirtioNetHdr),
VRING_DESC_F_WRITE);
+free(buf);
 }
 vring_notify(rxvq);




Re: [Qemu-devel] [PATCH v1 for-2.13 1/4] pc-bios/s390-ccw/net: Split up net_load() into init, load and uninit parts

2018-04-18 Thread Farhan Ali



On 04/18/2018 08:31 AM, Thomas Huth wrote:

When we want to support pxelinux-style network booting later, we've got
to do several TFTP transfers - and we do not want to apply for a new IP
address via DHCP each time. So split up net_load into three parts:

1. net_init(), which initializes virtio-net, gets an IP address via DHCP
and prints out the related information.

2. The tftp_load call is now moved directly into the main() function

3. A new net_uninit() function which should tear down the network stack
before we are done in the firmware.

This will make it easier to extend the code in the next patches.

Signed-off-by: Thomas Huth



Just a minor nit, if we could rename *_uninit functions to 
destroy/release? I think it's just easier to read





Re: [Qemu-devel] [PATCH 2/4] pc-bios/s390-ccw: fix loadparm initialization and int conversion

2018-04-12 Thread Farhan Ali



On 04/12/2018 04:57 PM, Collin Walling wrote:

On 04/12/2018 02:57 PM, Thomas Huth wrote:

On 10.04.2018 17:01, Collin Walling wrote:

Rename the loadparm char array in main.c to loadparm_str and
increase the size by one byte to account for a null termination
when converting the loadparm string to an int via atoui. Also
allow the boot menu to be enabled when loadparm is set to an
empty string or a series of spaces.

Signed-off-by: Collin Walling 
Reported-by: Vasily Gorbik 
---
  hw/s390x/ipl.c  |  2 ++
  pc-bios/s390-ccw/main.c | 14 +++---
  2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index fdeaec3..23b5b54 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -352,6 +352,8 @@ int s390_ipl_set_loadparm(uint8_t *loadparm)
  loadparm[i] = ascii2ebcdic[(uint8_t) lp[i]];
  }
  
+memset(loadparm + i, 0x40, 8 - i); /* fill with EBCDIC spaces */

+
  g_free(lp);
  return 0;
  }
diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c
index 9d9f8cf..26f9adf 100644
--- a/pc-bios/s390-ccw/main.c
+++ b/pc-bios/s390-ccw/main.c
@@ -15,11 +15,11 @@
  char stack[PAGE_SIZE * 8] __attribute__((__aligned__(PAGE_SIZE)));
  static SubChannelId blk_schid = { .one = 1 };
  IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE)));
-static char loadparm[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
+static char loadparm_str[9] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  QemuIplParameters qipl;
  
  #define LOADPARM_PROMPT "PROMPT  "

-#define LOADPARM_EMPTY  ""
+#define LOADPARM_EMPTY  ""


Sorry for my ignorance, but why was the old string containing dots?

  Thomas



No need for apologies :)

If -machine loadparm is *not* present on the command line, then the loadparm in 
the sclp
will be a series of nulls. For whatever reason, that gets translated into a 
series of dots.



It's because of the ebc2asc table we use for conversion, which results 
in the dots when converting from ebcdic_to_ascii.



If -machine loadparm="" is present, then loadparm will in the sclp will be a 
series of
spaces. >
We want to enable the boot menu for both of these cases and, to make things 
easier, this
patch replaces any nulls (dots) to spaces when setting loadparm so that we only 
have to
handle one case instead of two within the bios.







Re: [Qemu-devel] [PATCH 2/3] s390: Ensure IPL from SCSI works as expected

2018-04-05 Thread Farhan Ali



On 04/05/2018 11:07 AM, Viktor Mihajlovski wrote:

Operating systems may request an IPL from a virtio-scsi device
by specifying an IPL parameter type of CCW. In this case QEMU
won't set up the IPLB correctly. The BIOS will still detect
it's a SCSI device to boot from, but it will now have to search
for the first LUN and attempt to boot from there.
However this may not be the original boot LUN if there's more than
one SCSI disk attached to the HBA.

With this change QEMU will detect that the request is for a
SCSI device and will rebuild the initial IPL parameter info
if it's the SCSI device used for the first boot. In consequence
the BIOS can use the boot LUN from the IPL information block.

In case a different SCSI device has been set, the BIOS will find
and use the first available LUN.

Signed-off-by: Viktor Mihajlovski <mihaj...@linux.vnet.ibm.com>
---
  hw/s390x/ipl.c | 31 +--
  1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index 58e33c5..fb554ab 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -427,7 +427,8 @@ unref_mr:
  return img_size;
  }

-static bool is_virtio_net_device(IplParameterBlock *iplb)
+static bool is_virtio_ccw_device_of_type(IplParameterBlock *iplb,
+ int virtio_id)
  {
  uint8_t cssid;
  uint8_t ssid;
@@ -447,13 +448,23 @@ static bool is_virtio_net_device(IplParameterBlock *iplb)
  sch = css_find_subch(1, cssid, ssid, schid);

  if (sch && sch->devno == devno) {
-return sch->id.cu_model == VIRTIO_ID_NET;
+return sch->id.cu_model == virtio_id;
  }
  }
  }
  return false;
  }

+static bool is_virtio_net_device(IplParameterBlock *iplb)
+{
+return is_virtio_ccw_device_of_type(iplb, VIRTIO_ID_NET);
+}
+
+static bool is_virtio_scsi_device(IplParameterBlock *iplb)
+{
+return is_virtio_ccw_device_of_type(iplb, VIRTIO_ID_SCSI);
+}
+
  void s390_ipl_update_diag308(IplParameterBlock *iplb)
  {
  S390IPLState *ipl = get_ipl_device();
@@ -478,6 +489,22 @@ void s390_reipl_request(void)
  S390IPLState *ipl = get_ipl_device();

  ipl->reipl_requested = true;
+if (ipl->iplb_valid &&
+!ipl->netboot &&
+ipl->iplb.pbt == S390_IPL_TYPE_CCW &&
+is_virtio_scsi_device(>iplb)) {
+CcwDevice *ccw_dev = s390_get_ccw_device(get_boot_device(0));
+
+if (ccw_dev &&
+cpu_to_be16(ccw_dev->sch->devno) == ipl->iplb.ccw.devno &&
+(ccw_dev->sch->ssid & 3) == ipl->iplb.ccw.ssid) {
+/*
+ * this is the original boot device's SCSI
+ * so restore IPL parameter info from it
+ */
+ipl->iplb_valid = s390_gen_initial_iplb(ipl);
+}
+}
  qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
  }

This does feel a little hackish, but I can't think of a better way to 
ensure the correct SCSI device is used for booting.


Reviewed-by: Farhan Ali <al...@linux.vnet.ibm.com>







Re: [Qemu-devel] [PATCH 1/3] s390: Refactor IPL parameter block generation

2018-04-05 Thread Farhan Ali



On 04/05/2018 11:07 AM, Viktor Mihajlovski wrote:

Splitting out the the CCW device extraction allows reuse.

Signed-off-by: Viktor Mihajlovski<mihaj...@linux.vnet.ibm.com>
---
  hw/s390x/ipl.c | 81 --
  1 file changed, 51 insertions(+), 30 deletions(-)


Reviewed-by: Farhan Ali <al...@linux.vnet.ibm.com>




Re: [Qemu-devel] [BUG] I/O thread segfault for QEMU on s390x

2018-03-07 Thread Farhan Ali



On 03/06/2018 01:34 AM, Martin Schwidefsky wrote:

On Mon, 5 Mar 2018 20:08:45 +0100
Christian Borntraeger  wrote:


Do you happen to run with a recent host kernel that has

commit 7041d28115e91f2144f811ffe8a195c696b1e1d0
 s390: scrub registers on kernel entry and KVM exit

Can you run with this on top
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 13a133a6015c..d6dc0e5e8f74 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -426,13 +426,13 @@ ENTRY(system_call)
 UPDATE_VTIME %r8,%r9,__LC_SYNC_ENTER_TIMER
 BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP
 stmg%r0,%r7,__PT_R0(%r11)
-   # clear user controlled register to prevent speculative use
-   xgr %r0,%r0
 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC
 mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW
 mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC
 stg %r14,__PT_FLAGS(%r11)
  .Lsysc_do_svc:
+   # clear user controlled register to prevent speculative use
+   xgr %r0,%r0
 # load address of system call table
 lg  %r10,__THREAD_sysc_table(%r13,%r12)
 llgh%r8,__PT_INT_CODE+2(%r11)


To me it looks like that the critical section cleanup (interrupt during system 
call entry) might
save the registers again into ptregs but we have already zeroed out r0.
This patch moves the clearing of r0 after sysc_do_svc, which should fix the 
critical
section cleanup.

Adding Martin and Heiko. Will spin a patch.


Argh, yes. Thanks Chrisitan, this is it. I have been searching for the bug
for days now. The point is that if the system call handler is interrupted
after the xgr but before .Lsysc_do_svc the code at .Lcleanup_system_call
repeats the stmg for %r0-%r7 but now %r0 is already zero.

Please commit a patch for this and I'll will queue it up immediately.



This patch does fix the QEMU crash. I haven't seen the crash after 
running the test case for more than a day. Thanks to everyone for taking 
a look at this problem :)


Thanks
Farhan




Re: [Qemu-devel] [BUG] I/O thread segfault for QEMU on s390x

2018-03-05 Thread Farhan Ali



On 03/05/2018 02:08 PM, Christian Borntraeger wrote:

Do you happen to run with a recent host kernel that has

commit 7041d28115e91f2144f811ffe8a195c696b1e1d0
 s390: scrub registers on kernel entry and KVM exit





Yes.




Can you run with this on top
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 13a133a6015c..d6dc0e5e8f74 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -426,13 +426,13 @@ ENTRY(system_call)
 UPDATE_VTIME %r8,%r9,__LC_SYNC_ENTER_TIMER
 BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP
 stmg%r0,%r7,__PT_R0(%r11)
-   # clear user controlled register to prevent speculative use
-   xgr %r0,%r0
 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC
 mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW
 mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC
 stg %r14,__PT_FLAGS(%r11)
  .Lsysc_do_svc:
+   # clear user controlled register to prevent speculative use
+   xgr %r0,%r0
 # load address of system call table
 lg  %r10,__THREAD_sysc_table(%r13,%r12)
 llgh%r8,__PT_INT_CODE+2(%r11)


To me it looks like that the critical section cleanup (interrupt during system 
call entry) might
save the registers again into ptregs but we have already zeroed out r0.
This patch moves the clearing of r0 after sysc_do_svc, which should fix the 
critical
section cleanup.



Okay I will run with this.


Adding Martin and Heiko. Will spin a patch.


On 03/05/2018 07:54 PM, Christian Borntraeger wrote:



On 03/05/2018 07:45 PM, Farhan Ali wrote:



On 03/05/2018 06:03 AM, Stefan Hajnoczi wrote:

Please include the following gdb output:

    (gdb) disas swapcontext
    (gdb) i r

That way it's possible to see which instruction faulted and which
registers were being accessed.



here is the disas out for swapcontext, this is on a coredump with debugging 
symbols enabled for qemu. So the addresses from the previous dump is a little 
different.


(gdb) disas swapcontext
Dump of assembler code for function swapcontext:
    0x03ff90751fb8 <+0>:    lgr    %r1,%r2
    0x03ff90751fbc <+4>:    lgr    %r0,%r3
    0x03ff90751fc0 <+8>:    stfpc    248(%r1)
    0x03ff90751fc4 <+12>:    std    %f0,256(%r1)
    0x03ff90751fc8 <+16>:    std    %f1,264(%r1)
    0x03ff90751fcc <+20>:    std    %f2,272(%r1)
    0x03ff90751fd0 <+24>:    std    %f3,280(%r1)
    0x03ff90751fd4 <+28>:    std    %f4,288(%r1)
    0x03ff90751fd8 <+32>:    std    %f5,296(%r1)
    0x03ff90751fdc <+36>:    std    %f6,304(%r1)
    0x03ff90751fe0 <+40>:    std    %f7,312(%r1)
    0x03ff90751fe4 <+44>:    std    %f8,320(%r1)
    0x03ff90751fe8 <+48>:    std    %f9,328(%r1)
    0x03ff90751fec <+52>:    std    %f10,336(%r1)
    0x03ff90751ff0 <+56>:    std    %f11,344(%r1)
    0x03ff90751ff4 <+60>:    std    %f12,352(%r1)
    0x03ff90751ff8 <+64>:    std    %f13,360(%r1)
    0x03ff90751ffc <+68>:    std    %f14,368(%r1)
    0x03ff90752000 <+72>:    std    %f15,376(%r1)
    0x03ff90752004 <+76>:    slgr    %r2,%r2
    0x03ff90752008 <+80>:    stam    %a0,%a15,184(%r1)
    0x03ff9075200c <+84>:    stmg    %r0,%r15,56(%r1)
    0x03ff90752012 <+90>:    la    %r2,2
    0x03ff90752016 <+94>:    lgr    %r5,%r0
    0x03ff9075201a <+98>:    la    %r3,384(%r5)
    0x03ff9075201e <+102>:    la    %r4,384(%r1)
    0x03ff90752022 <+106>:    lghi    %r5,8
    0x03ff90752026 <+110>:    svc    175


sys_rt_sigprocmask. r0 should not be changed by the system call.


    0x03ff90752028 <+112>:    lgr    %r5,%r0
=> 0x03ff9075202c <+116>:    lfpc    248(%r5)


so r5 is zero and it was loaded from r0. r0 was loaded from r3 (which is the 
2nd parameter to this
function). Now this is odd.


    0x03ff90752030 <+120>:    ld    %f0,256(%r5)
    0x03ff90752034 <+124>:    ld    %f1,264(%r5)
    0x03ff90752038 <+128>:    ld    %f2,272(%r5)
    0x03ff9075203c <+132>:    ld    %f3,280(%r5)
    0x03ff90752040 <+136>:    ld    %f4,288(%r5)
    0x03ff90752044 <+140>:    ld    %f5,296(%r5)
    0x03ff90752048 <+144>:    ld    %f6,304(%r5)
    0x03ff9075204c <+148>:    ld    %f7,312(%r5)
    0x03ff90752050 <+152>:    ld    %f8,320(%r5)
    0x03ff90752054 <+156>:    ld    %f9,328(%r5)
    0x03ff90752058 <+160>:    ld    %f10,336(%r5)
    0x03ff9075205c <+164>:    ld    %f11,344(%r5)
    0x03ff90752060 <+168>:    ld    %f12,352(%r5)
    0x03ff90752064 <+172>:    ld    %f13,360(%r5)
    0x03ff90752068 <+176>:    ld    %f14,368(%r5)
    0x03ff9075206c <+180>:    ld    %f15,376(%r5)
    0x03f

  1   2   >