Re: [PATCH v2 2/2] x86/xen: switch initial pvops IRQ functions to dummy ones

2021-09-22 Thread Juergen Gross via Virtualization

On 22.09.21 23:49, Boris Ostrovsky wrote:


On 9/22/21 6:31 AM, Juergen Gross wrote:

The initial pvops functions handling irq flags will only ever be called
before interrupts are being enabled.

So make the __init and switch them to be dummy functions:



What are you making __init?


Oh, sorry, that's a leftover from a previous version.


+/* stub always return ing 0. */



"always returning"


Yes.


Juergen


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2 2/2] x86/xen: switch initial pvops IRQ functions to dummy ones

2021-09-22 Thread Boris Ostrovsky


On 9/22/21 6:31 AM, Juergen Gross wrote:
> The initial pvops functions handling irq flags will only ever be called
> before interrupts are being enabled.
>
> So make the __init and switch them to be dummy functions:


What are you making __init?


>  
> +/* stub always return ing 0. */


"always returning"


-boris



Re: [PATCH 1/4] hwrng: virtio - add an internal buffer

2021-09-22 Thread Michael S. Tsirkin
On Wed, Sep 22, 2021 at 07:09:00PM +0200, Laurent Vivier wrote:
> hwrng core uses two buffers that can be mixed up in the
> virtio-rng queue.
> 
> If a buffer is provided with wait=0 it is enqueued in the
> virtio-rng queue but unused by the caller.
> On the next call, the core provides another buffer, but the
> first one is filled instead and the new one merely queued.
> The caller then reads data from the new buffer, which has not
> been updated, and the data in the first one is lost.
> 
> To avoid this mix-up, virtio-rng needs to use its own
> internal buffer, at the cost of a data copy to the caller's buffer.
> 
> Signed-off-by: Laurent Vivier 
> ---
>  drivers/char/hw_random/virtio-rng.c | 43 ++---
>  1 file changed, 33 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/char/hw_random/virtio-rng.c 
> b/drivers/char/hw_random/virtio-rng.c
> index a90001e02bf7..208c547dcac1 100644
> --- a/drivers/char/hw_random/virtio-rng.c
> +++ b/drivers/char/hw_random/virtio-rng.c
> @@ -18,13 +18,20 @@ static DEFINE_IDA(rng_index_ida);
>  struct virtrng_info {
>   struct hwrng hwrng;
>   struct virtqueue *vq;
> - struct completion have_data;
>   char name[25];
> - unsigned int data_avail;
>   int index;
>   bool busy;
>   bool hwrng_register_done;
>   bool hwrng_removed;
> + /* data transfer */
> + struct completion have_data;
> + unsigned int data_avail;
> + /* minimal size returned by rng_buffer_size() */
> +#if SMP_CACHE_BYTES < 32
> + u8 data[32];
> +#else
> + u8 data[SMP_CACHE_BYTES];
> +#endif

Let's move this logic to a macro in hw_random.h ?

>  };
>  
>  static void random_recv_done(struct virtqueue *vq)
> @@ -39,14 +46,14 @@ static void random_recv_done(struct virtqueue *vq)
>  }
>  
>  /* The host will fill any buffer we give it with sweet, sweet randomness. */
> -static void register_buffer(struct virtrng_info *vi, u8 *buf, size_t size)
> +static void register_buffer(struct virtrng_info *vi)
>  {
>   struct scatterlist sg;
>  
> - sg_init_one(&sg, buf, size);
> + sg_init_one(&sg, vi->data, sizeof(vi->data));

Note that add_early_randomness requests less:
size_t size = min_t(size_t, 16, rng_buffer_size());

maybe track how much was requested and grow up to sizeof(data)?

>  
>   /* There should always be room for one buffer. */
> - virtqueue_add_inbuf(vi->vq, &sg, 1, buf, GFP_KERNEL);
> + virtqueue_add_inbuf(vi->vq, &sg, 1, vi->data, GFP_KERNEL);


BTW no longer true if DMA API is in use ... not easy to fix,
I think some changes to virtio API to allow pre-mapping
s/g for DMA might be needed ...

>  
>   virtqueue_kick(vi->vq);
>  }
> @@ -55,6 +62,8 @@ static int virtio_read(struct hwrng *rng, void *buf, size_t 
> size, bool wait)
>  {
>   int ret;
>   struct virtrng_info *vi = (struct virtrng_info *)rng->priv;
> + unsigned int chunk;
> + size_t read;
>  
>   if (vi->hwrng_removed)
>   return -ENODEV;
> @@ -62,19 +71,33 @@ static int virtio_read(struct hwrng *rng, void *buf, 
> size_t size, bool wait)
>   if (!vi->busy) {
>   vi->busy = true;
>   reinit_completion(&vi->have_data);
> - register_buffer(vi, buf, size);
> + register_buffer(vi);
>   }
>  
>   if (!wait)
>   return 0;
>  
> - ret = wait_for_completion_killable(&vi->have_data);
> - if (ret < 0)
> - return ret;
> + read = 0;
> + while (size != 0) {
> + ret = wait_for_completion_killable(&vi->have_data);
> + if (ret < 0)
> + return ret;
> +
> + chunk = min_t(unsigned int, size, vi->data_avail);
> + memcpy(buf + read, vi->data, chunk);
> + read += chunk;
> + size -= chunk;
> + vi->data_avail = 0;
> +
> + if (size != 0) {
> + reinit_completion(&vi->have_data);
> + register_buffer(vi);
> + }
> + }
>  
>   vi->busy = false;
>  
> - return vi->data_avail;
> + return read;
>  }
>  
>  static void virtio_cleanup(struct hwrng *rng)
> -- 
> 2.31.1



Re: [PATCH 0/4] hwrng: virtio - add an internal buffer

2021-09-22 Thread Alexander Potapenko via Virtualization
On Wed, Sep 22, 2021 at 7:09 PM Laurent Vivier  wrote:
>
> hwrng core uses two buffers that can be mixed up in the virtio-rng queue.
>
> This series fixes the problem by adding an internal buffer in virtio-rng.
>
> Once the internal buffer is added, we can fix two other problems:
>
> - to be able to release the driver without waiting for the device to
>   release the buffer
>
> - actually return some data when wait=0, as some data may already be
>   available
>
> It also tries to improve the performance by always having a buffer in
> the queue of the device.

Tested-by: Alexander Potapenko 
for the series

With these four patches applied, KMSAN stops reporting boot-time
errors in _mix_pool_bytes reported here:
https://www.spinics.net/lists/linux-virtualization/msg46310.html

> Laurent Vivier (4):
>   hwrng: virtio - add an internal buffer
>   hwrng: virtio - don't wait on cleanup
>   hwrng: virtio - don't waste entropy
>   hwrng: virtio - always add a pending request
>
>  drivers/char/hw_random/virtio-rng.c | 84 +
>  1 file changed, 63 insertions(+), 21 deletions(-)
>
> --
> 2.31.1
>
>


-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

[PATCH 4/4] hwrng: virtio - always add a pending request

2021-09-22 Thread Laurent Vivier
If we ensure some data is already available, by enqueuing the buffer
again as soon as its data is exhausted, we can return what we have
without waiting for the device's answer.

Signed-off-by: Laurent Vivier 
---
 drivers/char/hw_random/virtio-rng.c | 24 ++--
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/drivers/char/hw_random/virtio-rng.c 
b/drivers/char/hw_random/virtio-rng.c
index 8ba97cf4ca8f..0b3c9b643495 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -20,7 +20,6 @@ struct virtrng_info {
struct virtqueue *vq;
char name[25];
int index;
-   bool busy;
bool hwrng_register_done;
bool hwrng_removed;
/* data transfer */
@@ -44,16 +43,16 @@ static void random_recv_done(struct virtqueue *vq)
return;
 
vi->data_idx = 0;
-   vi->busy = false;
 
complete(&vi->have_data);
 }
 
-/* The host will fill any buffer we give it with sweet, sweet randomness. */
-static void register_buffer(struct virtrng_info *vi)
+static void request_entropy(struct virtrng_info *vi)
 {
struct scatterlist sg;
 
+   reinit_completion(&vi->have_data);
+
sg_init_one(&sg, vi->data, sizeof(vi->data));
 
/* There should always be room for one buffer. */
@@ -69,6 +68,8 @@ static unsigned int copy_data(struct virtrng_info *vi, void 
*buf,
memcpy(buf, vi->data + vi->data_idx, size);
vi->data_idx += size;
vi->data_avail -= size;
+   if (vi->data_avail == 0)
+   request_entropy(vi);
return size;
 }
 
@@ -98,13 +99,7 @@ static int virtio_read(struct hwrng *rng, void *buf, size_t 
size, bool wait)
 * so either size is 0 or data_avail is 0
 */
while (size != 0) {
-   /* data_avail is 0 */
-   if (!vi->busy) {
-   /* no pending request, ask for more */
-   vi->busy = true;
-   reinit_completion(>have_data);
-   register_buffer(vi);
-   }
+   /* data_avail is 0 but a request is pending */
ret = wait_for_completion_killable(&vi->have_data);
if (ret < 0)
return ret;
@@ -126,8 +121,7 @@ static void virtio_cleanup(struct hwrng *rng)
 {
struct virtrng_info *vi = (struct virtrng_info *)rng->priv;
 
-   if (vi->busy)
-   complete(>have_data);
+   complete(&vi->have_data);
 }
 
 static int probe_common(struct virtio_device *vdev)
@@ -163,6 +157,9 @@ static int probe_common(struct virtio_device *vdev)
goto err_find;
}
 
+   /* we always have a pending entropy request */
+   request_entropy(vi);
+
return 0;
 
 err_find:
@@ -181,7 +178,6 @@ static void remove_common(struct virtio_device *vdev)
vi->data_idx = 0;
complete(&vi->have_data);
vdev->config->reset(vdev);
-   vi->busy = false;
if (vi->hwrng_register_done)
hwrng_unregister(>hwrng);
vdev->config->del_vqs(vdev);
-- 
2.31.1



[PATCH 3/4] hwrng: virtio - don't waste entropy

2021-09-22 Thread Laurent Vivier
If we don't use all the entropy available in the buffer, keep it
and use it later.

Signed-off-by: Laurent Vivier 
---
 drivers/char/hw_random/virtio-rng.c | 52 +++--
 1 file changed, 35 insertions(+), 17 deletions(-)

diff --git a/drivers/char/hw_random/virtio-rng.c 
b/drivers/char/hw_random/virtio-rng.c
index 173aeea835bb..8ba97cf4ca8f 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -26,6 +26,7 @@ struct virtrng_info {
/* data transfer */
struct completion have_data;
unsigned int data_avail;
+   unsigned int data_idx;
/* minimal size returned by rng_buffer_size() */
 #if SMP_CACHE_BYTES < 32
u8 data[32];
@@ -42,6 +43,9 @@ static void random_recv_done(struct virtqueue *vq)
if (!virtqueue_get_buf(vi->vq, &vi->data_avail))
return;
 
+   vi->data_idx = 0;
+   vi->busy = false;
+
complete(&vi->have_data);
 }
 
@@ -58,6 +62,16 @@ static void register_buffer(struct virtrng_info *vi)
virtqueue_kick(vi->vq);
 }
 
+static unsigned int copy_data(struct virtrng_info *vi, void *buf,
+ unsigned int size)
+{
+   size = min_t(unsigned int, size, vi->data_avail);
+   memcpy(buf, vi->data + vi->data_idx, size);
+   vi->data_idx += size;
+   vi->data_avail -= size;
+   return size;
+}
+
 static int virtio_read(struct hwrng *rng, void *buf, size_t size, bool wait)
 {
int ret;
@@ -68,17 +82,29 @@ static int virtio_read(struct hwrng *rng, void *buf, size_t 
size, bool wait)
if (vi->hwrng_removed)
return -ENODEV;
 
-   if (!vi->busy) {
-   vi->busy = true;
-   reinit_completion(&vi->have_data);
-   register_buffer(vi);
+   read = 0;
+
+   /* copy available data */
+   if (vi->data_avail) {
+   chunk = copy_data(vi, buf, size);
+   size -= chunk;
+   read += chunk;
}
 
if (!wait)
-   return 0;
+   return read;
 
-   read = 0;
+   /* We have already copied available entropy,
+* so either size is 0 or data_avail is 0
+*/
while (size != 0) {
+   /* data_avail is 0 */
+   if (!vi->busy) {
+   /* no pending request, ask for more */
+   vi->busy = true;
+   reinit_completion(&vi->have_data);
+   register_buffer(vi);
+   }
ret = wait_for_completion_killable(&vi->have_data);
if (ret < 0)
return ret;
@@ -88,20 +114,11 @@ static int virtio_read(struct hwrng *rng, void *buf, 
size_t size, bool wait)
if (vi->data_avail == 0)
return read;
 
-   chunk = min_t(unsigned int, size, vi->data_avail);
-   memcpy(buf + read, vi->data, chunk);
-   read += chunk;
+   chunk = copy_data(vi, buf + read, size);
size -= chunk;
-   vi->data_avail = 0;
-
-   if (size != 0) {
-   reinit_completion(&vi->have_data);
-   register_buffer(vi);
-   }
+   read += chunk;
}
 
-   vi->busy = false;
-
return read;
 }
 
@@ -161,6 +178,7 @@ static void remove_common(struct virtio_device *vdev)
 
vi->hwrng_removed = true;
vi->data_avail = 0;
+   vi->data_idx = 0;
complete(&vi->have_data);
vdev->config->reset(vdev);
vi->busy = false;
-- 
2.31.1



[PATCH 2/4] hwrng: virtio - don't wait on cleanup

2021-09-22 Thread Laurent Vivier
When the virtio-rng device was dropped by the hwrng core we were forced
to wait for the buffer to come back from the device, so that no
remaining ongoing operation could spoil the buffer.

But now, as the buffer is internal to virtio-rng, we can leave
the waiting loop immediately; the buffer will be retrieved and used
when the virtio-rng driver is selected again.

This avoids hanging on an rng_current write command if the virtio-rng
device is blocked by a lack of entropy, and allows selecting
another entropy source if the current one is empty.

Signed-off-by: Laurent Vivier 
---
 drivers/char/hw_random/virtio-rng.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/char/hw_random/virtio-rng.c 
b/drivers/char/hw_random/virtio-rng.c
index 208c547dcac1..173aeea835bb 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -82,6 +82,11 @@ static int virtio_read(struct hwrng *rng, void *buf, size_t 
size, bool wait)
ret = wait_for_completion_killable(&vi->have_data);
if (ret < 0)
return ret;
+   /* if vi->data_avail is 0, we have been interrupted
+* by a cleanup, but buffer stays in the queue
+*/
+   if (vi->data_avail == 0)
+   return read;
 
chunk = min_t(unsigned int, size, vi->data_avail);
memcpy(buf + read, vi->data, chunk);
@@ -105,7 +110,7 @@ static void virtio_cleanup(struct hwrng *rng)
struct virtrng_info *vi = (struct virtrng_info *)rng->priv;
 
if (vi->busy)
-   wait_for_completion(&vi->have_data);
+   complete(&vi->have_data);
 }
 
 static int probe_common(struct virtio_device *vdev)
-- 
2.31.1



[PATCH 1/4] hwrng: virtio - add an internal buffer

2021-09-22 Thread Laurent Vivier
hwrng core uses two buffers that can be mixed up in the
virtio-rng queue.

If a buffer is provided with wait=0 it is enqueued in the
virtio-rng queue but unused by the caller.
On the next call, the core provides another buffer, but the
first one is filled instead and the new one merely queued.
The caller then reads data from the new buffer, which has not
been updated, and the data in the first one is lost.

To avoid this mix-up, virtio-rng needs to use its own
internal buffer, at the cost of a data copy to the caller's buffer.

Signed-off-by: Laurent Vivier 
---
 drivers/char/hw_random/virtio-rng.c | 43 ++---
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/drivers/char/hw_random/virtio-rng.c 
b/drivers/char/hw_random/virtio-rng.c
index a90001e02bf7..208c547dcac1 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -18,13 +18,20 @@ static DEFINE_IDA(rng_index_ida);
 struct virtrng_info {
struct hwrng hwrng;
struct virtqueue *vq;
-   struct completion have_data;
char name[25];
-   unsigned int data_avail;
int index;
bool busy;
bool hwrng_register_done;
bool hwrng_removed;
+   /* data transfer */
+   struct completion have_data;
+   unsigned int data_avail;
+   /* minimal size returned by rng_buffer_size() */
+#if SMP_CACHE_BYTES < 32
+   u8 data[32];
+#else
+   u8 data[SMP_CACHE_BYTES];
+#endif
 };
 
 static void random_recv_done(struct virtqueue *vq)
@@ -39,14 +46,14 @@ static void random_recv_done(struct virtqueue *vq)
 }
 
 /* The host will fill any buffer we give it with sweet, sweet randomness. */
-static void register_buffer(struct virtrng_info *vi, u8 *buf, size_t size)
+static void register_buffer(struct virtrng_info *vi)
 {
struct scatterlist sg;
 
-   sg_init_one(&sg, buf, size);
+   sg_init_one(&sg, vi->data, sizeof(vi->data));
 
/* There should always be room for one buffer. */
-   virtqueue_add_inbuf(vi->vq, &sg, 1, buf, GFP_KERNEL);
+   virtqueue_add_inbuf(vi->vq, &sg, 1, vi->data, GFP_KERNEL);
 
virtqueue_kick(vi->vq);
 }
@@ -55,6 +62,8 @@ static int virtio_read(struct hwrng *rng, void *buf, size_t 
size, bool wait)
 {
int ret;
struct virtrng_info *vi = (struct virtrng_info *)rng->priv;
+   unsigned int chunk;
+   size_t read;
 
if (vi->hwrng_removed)
return -ENODEV;
@@ -62,19 +71,33 @@ static int virtio_read(struct hwrng *rng, void *buf, size_t 
size, bool wait)
if (!vi->busy) {
vi->busy = true;
reinit_completion(&vi->have_data);
-   register_buffer(vi, buf, size);
+   register_buffer(vi);
}
 
if (!wait)
return 0;
 
-   ret = wait_for_completion_killable(&vi->have_data);
-   if (ret < 0)
-   return ret;
+   read = 0;
+   while (size != 0) {
+   ret = wait_for_completion_killable(&vi->have_data);
+   if (ret < 0)
+   return ret;
+
+   chunk = min_t(unsigned int, size, vi->data_avail);
+   memcpy(buf + read, vi->data, chunk);
+   read += chunk;
+   size -= chunk;
+   vi->data_avail = 0;
+
+   if (size != 0) {
+   reinit_completion(&vi->have_data);
+   register_buffer(vi);
+   }
+   }
 
vi->busy = false;
 
-   return vi->data_avail;
+   return read;
 }
 
 static void virtio_cleanup(struct hwrng *rng)
-- 
2.31.1



[PATCH 0/4] hwrng: virtio - add an internal buffer

2021-09-22 Thread Laurent Vivier
hwrng core uses two buffers that can be mixed up in the virtio-rng queue.

This series fixes the problem by adding an internal buffer in virtio-rng.

Once the internal buffer is added, we can fix two other problems:

- to be able to release the driver without waiting for the device to
  release the buffer

- actually return some data when wait=0, as some data may already be
  available

It also tries to improve the performance by always having a buffer in
the queue of the device.

Laurent Vivier (4):
  hwrng: virtio - add an internal buffer
  hwrng: virtio - don't wait on cleanup
  hwrng: virtio - don't waste entropy
  hwrng: virtio - always add a pending request

 drivers/char/hw_random/virtio-rng.c | 84 +
 1 file changed, 63 insertions(+), 21 deletions(-)

-- 
2.31.1




Re: [PATCH V2 2/9] fork: pass worker_flags to copy_thread

2021-09-22 Thread Geert Uytterhoeven
On Tue, Sep 21, 2021 at 11:55 PM Mike Christie
 wrote:
> We need to break up PF_IO_WORKER into the parts that are used for
> scheduling and signal handling and the part that tells copy_thread to
> treat it as a special type of thread during setup. This patch passes the
> worker_flags to copy_thread, so in the next patch we can add new worker
> flags that function can see.
>
> Signed-off-by: Mike Christie 

>  arch/m68k/kernel/process.c   | 2 +-

Acked-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: Re: [RFC 0/5] VirtIO RDMA

2021-09-22 Thread Leon Romanovsky
On Wed, Sep 22, 2021 at 09:37:37PM +0800, 魏俊吉 (Junji Wei) wrote:
> On Wed, Sep 22, 2021 at 9:06 PM Leon Romanovsky  wrote:
> >
> > On Wed, Sep 22, 2021 at 08:08:44PM +0800, Junji Wei wrote:
> > > > On Sep 15, 2021, at 9:43 PM, Jason Gunthorpe  wrote:
> >
> > <...>
> >
> > > >> 4. The FRMR API needs to set the key of an MR through IB_WR_REG_MR.
> > > >>   But it is impossible to change the key of an MR using uverbs.
> > > >
> > > > FRMR is more like memory windows in user space, you can't support it
> > > > using just regular MRs.
> > >
> > > It is hard to support this using uverbs, but it is easy to support
> > > with uRDMA, where we can get full control of MRs.
> >
> > What is uRDMA?
> 
> uRDMA is a software implementation of the RoCEv2 protocol like rxe.
> We will implement it in QEMU with VFIO or DPDK.

ok, thanks

> 
> Thanks.
> Junji

Re: [RFC 0/5] VirtIO RDMA

2021-09-22 Thread Leon Romanovsky
On Wed, Sep 22, 2021 at 08:08:44PM +0800, Junji Wei wrote:
> > On Sep 15, 2021, at 9:43 PM, Jason Gunthorpe  wrote:

<...>

> >> 4. The FRMR API needs to set the key of an MR through IB_WR_REG_MR.
> >>   But it is impossible to change the key of an MR using uverbs.
> > 
> > FRMR is more like memory windows in user space, you can't support it
> > using just regular MRs.
> 
> It is hard to support this using uverbs, but it is easy to support
> with uRDMA, where we can get full control of MRs.

What is uRDMA?

Thanks


Re: [PATCH v2 0/2] x86/xen: simplify irq pvops

2021-09-22 Thread Peter Zijlstra
On Wed, Sep 22, 2021 at 12:31:00PM +0200, Juergen Gross wrote:
> The pvops function for Xen PV guests handling the interrupt flag are
> much more complex than needed.
> 
> With the supported Xen hypervisor versions they can be simplified a
> lot, especially by removing the need for disabling preemption.
> 
> Juergen Gross (2):
>   x86/xen: remove xen_have_vcpu_info_placement flag
>   x86/xen: switch initial pvops IRQ functions to dummy ones
> 
>  arch/x86/include/asm/paravirt_types.h |   2 +
>  arch/x86/kernel/paravirt.c|  13 ++-
>  arch/x86/xen/enlighten.c  | 116 ++
>  arch/x86/xen/enlighten_hvm.c  |   6 +-
>  arch/x86/xen/enlighten_pv.c   |  28 ++-
>  arch/x86/xen/irq.c|  61 +-
>  arch/x86/xen/smp.c|  24 --
>  arch/x86/xen/xen-ops.h|   4 +-
>  8 files changed, 53 insertions(+), 201 deletions(-)

That looks awesome, I'm totally in favour of deleting code :-)

Acked-by: Peter Zijlstra (Intel) 


Re: [PATCH V2 3/9] fork: move PF_IO_WORKER's kernel frame setup to new flag

2021-09-22 Thread Geert Uytterhoeven
Hi Mike,

On Tue, Sep 21, 2021 at 11:55 PM Mike Christie
 wrote:
> The vhost worker threads need the same frame setup as io_uring's worker
> threads, but handle signals differently and do not need the same
> scheduling behavior. This patch separates the frame setup parts of
> PF_IO_WORKER into a kernel_clone_args flag, KERN_WORKER_USER.
>
> Signed-off-by: Mike Christie 

Thanks for your patch!

> --- a/arch/m68k/kernel/process.c
> +++ b/arch/m68k/kernel/process.c
> @@ -157,7 +157,8 @@ int copy_thread(unsigned long clone_flags, unsigned long 
> usp, unsigned long arg,
>  */
> p->thread.fs = get_fs().seg;
>
> -   if (unlikely(p->flags & (PF_KTHREAD | PF_IO_WORKER))) {
> +   if (unlikely(p->flags & (PF_KTHREAD) ||
> +worker_flags & KERN_WORKER_USER)) {

I guess it wouldn't hurt to add parentheses to improve
readability:

if (unlikely((p->flags & (PF_KTHREAD)) ||
 (worker_flags & KERN_WORKER_USER))) {

> /* kernel thread */
> memset(frame, 0, sizeof(struct fork_frame));
> frame->regs.sr = PS_S;

With the above fixed, for m68k:
Acked-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


[PATCH v2 2/2] x86/xen: switch initial pvops IRQ functions to dummy ones

2021-09-22 Thread Juergen Gross via Virtualization
The initial pvops functions handling irq flags will only ever be called
before interrupts are being enabled.

So make the __init and switch them to be dummy functions:
- xen_save_fl() can always return 0
- xen_irq_disable() is a nop
- xen_irq_enable() can BUG()

Add some generic paravirt functions for that purpose.

Signed-off-by: Juergen Gross 
---
 arch/x86/include/asm/paravirt_types.h |  2 +
 arch/x86/kernel/paravirt.c| 13 +-
 arch/x86/xen/enlighten.c  | 19 +
 arch/x86/xen/irq.c| 61 ++-
 4 files changed, 20 insertions(+), 75 deletions(-)

diff --git a/arch/x86/include/asm/paravirt_types.h 
b/arch/x86/include/asm/paravirt_types.h
index d9d6b0203ec4..fc1151e77569 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -577,7 +577,9 @@ void paravirt_leave_lazy_mmu(void);
 void paravirt_flush_lazy_mmu(void);
 
 void _paravirt_nop(void);
+void paravirt_BUG(void);
 u64 _paravirt_ident_64(u64);
+unsigned long paravirt_ret0(void);
 
 #define paravirt_nop   ((void *)_paravirt_nop)
 
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 04cafc057bed..4f0ebc46a15d 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -46,6 +46,17 @@ asm (".pushsection .entry.text, \"ax\"\n"
  ".type _paravirt_nop, @function\n\t"
  ".popsection");
 
+/* stub always return ing 0. */
+asm (".pushsection .entry.text, \"ax\"\n"
+ ".global paravirt_ret0\n"
+ "paravirt_ret0:\n\t"
+ "xor %" _ASM_AX ", %" _ASM_AX ";\n\t"
+ "ret\n\t"
+ ".size paravirt_ret0, . - paravirt_ret0\n\t"
+ ".type paravirt_ret0, @function\n\t"
+ ".popsection");
+
+
 void __init default_banner(void)
 {
printk(KERN_INFO "Booting paravirtualized kernel on %s\n",
@@ -53,7 +64,7 @@ void __init default_banner(void)
 }
 
 /* Undefined instruction for dealing with missing ops pointers. */
-static void paravirt_BUG(void)
+void paravirt_BUG(void)
 {
BUG();
 }
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 8a7bd3f4591e..d96cabe34a01 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -27,25 +27,10 @@ EXPORT_SYMBOL_GPL(hypercall_page);
  * Pointer to the xen_vcpu_info structure or
 * &HYPERVISOR_shared_info->vcpu_info[cpu]. See xen_hvm_init_shared_info
  * and xen_vcpu_setup for details. By default it points to 
share_info->vcpu_info
- * but if the hypervisor supports VCPUOP_register_vcpu_info then it can point
- * to xen_vcpu_info. The pointer is used in __xen_evtchn_do_upcall to
- * acknowledge pending events.
- * Also more subtly it is used by the patched version of irq enable/disable
- * e.g. xen_irq_enable_direct and xen_iret in PV mode.
- *
- * The desire to be able to do those mask/unmask operations as a single
- * instruction by using the per-cpu offset held in %gs is the real reason
- * vcpu info is in a per-cpu pointer and the original reason for this
- * hypercall.
- *
+ * but during boot it is switched to point to xen_vcpu_info.
+ * The pointer is used in __xen_evtchn_do_upcall to acknowledge pending events.
  */
 DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
-
-/*
- * Per CPU pages used if hypervisor supports VCPUOP_register_vcpu_info
- * hypercall. This can be used both in PV and PVHVM mode. The structure
- * overrides the default per_cpu(xen_vcpu, cpu) value.
- */
 DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
 
 /* Linux <-> Xen vCPU id mapping */
diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index dfa091d79c2e..ae8537583102 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -24,60 +24,6 @@ void xen_force_evtchn_callback(void)
(void)HYPERVISOR_xen_version(0, NULL);
 }
 
-asmlinkage __visible unsigned long xen_save_fl(void)
-{
-   struct vcpu_info *vcpu;
-   unsigned long flags;
-
-   vcpu = this_cpu_read(xen_vcpu);
-
-   /* flag has opposite sense of mask */
-   flags = !vcpu->evtchn_upcall_mask;
-
-   /* convert to IF type flag
-  -0 -> 0x
-  -1 -> 0x
-   */
-   return (-flags) & X86_EFLAGS_IF;
-}
-PV_CALLEE_SAVE_REGS_THUNK(xen_save_fl);
-
-asmlinkage __visible void xen_irq_disable(void)
-{
-   /* There's a one instruction preempt window here.  We need to
-  make sure we're don't switch CPUs between getting the vcpu
-  pointer and updating the mask. */
-   preempt_disable();
-   this_cpu_read(xen_vcpu)->evtchn_upcall_mask = 1;
-   preempt_enable_no_resched();
-}
-PV_CALLEE_SAVE_REGS_THUNK(xen_irq_disable);
-
-asmlinkage __visible void xen_irq_enable(void)
-{
-   struct vcpu_info *vcpu;
-
-   /*
-* We may be preempted as soon as vcpu->evtchn_upcall_mask is
-* cleared, so disable preemption to ensure we check for
-* events on the VCPU we are still running on.
-*/
-   preempt_disable();
-
-   vcpu = this_cpu_read(xen_vcpu);
-   

[PATCH v2 0/2] x86/xen: simplify irq pvops

2021-09-22 Thread Juergen Gross via Virtualization
The pvops function for Xen PV guests handling the interrupt flag are
much more complex than needed.

With the supported Xen hypervisor versions they can be simplified a
lot, especially by removing the need for disabling preemption.

Juergen Gross (2):
  x86/xen: remove xen_have_vcpu_info_placement flag
  x86/xen: switch initial pvops IRQ functions to dummy ones

 arch/x86/include/asm/paravirt_types.h |   2 +
 arch/x86/kernel/paravirt.c|  13 ++-
 arch/x86/xen/enlighten.c  | 116 ++
 arch/x86/xen/enlighten_hvm.c  |   6 +-
 arch/x86/xen/enlighten_pv.c   |  28 ++-
 arch/x86/xen/irq.c|  61 +-
 arch/x86/xen/smp.c|  24 --
 arch/x86/xen/xen-ops.h|   4 +-
 8 files changed, 53 insertions(+), 201 deletions(-)

-- 
2.26.2



Re: [Virtio-fs] [PATCH v4 0/8] fuse,virtiofs: support per-file DAX

2021-09-22 Thread JeffleXu
Thanks for the reply and the suggestions. ;)


On 9/20/21 3:45 AM, Vivek Goyal wrote:
> On Thu, Sep 16, 2021 at 04:21:59PM +0800, JeffleXu wrote:
>> Hi, I add some performance statistics below.
>>
>>
>> On 8/17/21 8:40 PM, Vivek Goyal wrote:
>>> On Tue, Aug 17, 2021 at 10:32:14AM +0100, Dr. David Alan Gilbert wrote:
 * Miklos Szeredi (mik...@szeredi.hu) wrote:
> On Tue, 17 Aug 2021 at 04:22, Jeffle Xu  
> wrote:
>>
>> This patchset adds support of per-file DAX for virtiofs, which is
>> inspired by Ira Weiny's work on ext4[1] and xfs[2].
>
> Can you please explain the background of this change in detail?
>
> Why would an admin want to enable DAX for a particular virtiofs file
> and not for others?

 Where we're contending on virtiofs dax cache size it makes a lot of
 sense; it's quite expensive for us to map something into the cache
 (especially if we push something else out), so selectively DAXing files
 that are expected to be hot could help reduce cache churn.
>>
>> Yes, the performance of dax can suffer when the DAX window is
>> limited and contended by multiple files.
>>
>> I tested kernel compilation in virtiofs, emulating the scenario where a
>> lot of files contend for the dax window and trigger dax window reclaiming.
>>
>> Environment setup:
>> - guest vCPU: 16
>> - time make vmlinux -j128
>>
>> type    | cache  | cache-size | time
>> ------- | ------ | ---------- | --------------
>> non-dax | always |     --     | real 2m48.119s
>> dax     | always |    64M     | real 4m49.563s
>> dax     | always |     1G     | real 3m14.200s
>> dax     | always |     4G     | real 2m41.141s
>>
>>
>> It can be seen that there's a performance drop, compared to normal
>> buffered IO, when the dax window resource is restricted and dax window
>> reclaiming is triggered. The smaller the cache size, the worse the
>> performance. The performance drop is alleviated and eventually
>> eliminated as the cache size increases.
>>
>> Though we may not compile kernel in virtiofs, indeed we may access a lot
>> of small files in virtiofs and suffer this performance drop.
> 
> Hi Jeffle,
> 
> If you access a lot of big files, or a file bigger than the dax window,
> you will still face a performance drop due to reclaim. IOW, if the data
> being accessed is bigger than the dax window, then reclaim will trigger
> and a performance drop will be observed. So I think it's not fair to
> associate the performance drop specifically with small files as such.

Yes, it is. Actually what I mean is that small files (smaller than the
dax window chunk size) are more likely to consume more dax windows than
large files, for the same total file size.


> 
> What makes more sense is the memory usage argument you have used
> later in the email. That is, we have a fixed chunk size of 2MB. And
> that means we use 512 * 64 = 32K of memory per chunk. So if a file
> is smaller than 32K in size, it might be better to just access it
> without DAX and incur the cost of page cache in guest instead. Even this
> argument also works only if dax window is being utilized fully.

Yes, agreed. In this case, the point of per-file dax is that the admin
could keep the overall dax window within a limited size while still
sustaining reasonable performance. And at least users are now capable of
tuning it.

> 
> Anyway, I think Miklos already asked you to send patches so that
> virtiofs daemon specifies which file to use dax on. So are you
> planning to post patches again for that. (And drop patches to
> read dax attr from per inode from filesystem in guest).

OK. I will send a new version, disabling dax based on the file size on
the host daemon side. Besides, I'm afraid the negotiation phase is also
not needed anymore, since the hint whether dax shall be enabled or not
is now fed entirely from the host daemon, and the guest side no longer
needs to set/clear the per-inode dax attr.

-- 
Thanks,
Jeffle


Re: Use of uninitialized memory with CONFIG_HW_RANDOM_VIRTIO

2021-09-22 Thread Michael S. Tsirkin
On Fri, Sep 17, 2021 at 02:57:02PM +0200, Laurent Vivier wrote:
> On 17/09/2021 00:58, Michael S. Tsirkin wrote:
> > On Thu, Sep 16, 2021 at 10:52:59AM +0200, Laurent Vivier wrote:
> > > On 13/09/2021 10:25, Alexander Potapenko wrote:
> > > > Hi Laurent,
> > > > 
> > > > I took the latest kernel (5.15-rc1,
> > > > 6880fa6c56601bb8ed59df6c30fd390cc5f6dd8f) and a slightly modified
> > > > config from syzbot (see attached)
> > > > The latter has a lot of unnecessary debug checks, but those should not
> > > > affect the RNG.
> > > > 
> > > > You then need to apply the following patch to the kernel:
> > > > 
> > > > 
> > > > diff --git a/drivers/char/hw_random/core.c 
> > > > b/drivers/char/hw_random/core.c
> > > > index a3db27916256d..a4cba9f0ff8cb 100644
> > > > --- a/drivers/char/hw_random/core.c
> > > > +++ b/drivers/char/hw_random/core.c
> > > > @@ -433,8 +433,11 @@ static int hwrng_fillfn(void *unused)
> > > >   if (IS_ERR(rng) || !rng)
> > > >   break;
> > > >   mutex_lock(&reading_mutex);
> > > > +   memset(rng_fillbuf, 'A', rng_buffer_size());
> > > > +   rng_fillbuf[rng_buffer_size()-1] = 0;
> > > >   rc = rng_get_data(rng, rng_fillbuf,
> > > > rng_buffer_size(), 1);
> > > > +   pr_err("rng_fillbuf: %s\n", rng_fillbuf);
> > > >   mutex_unlock(&reading_mutex);
> > > >   put_rng(rng);
> > > >   if (rc <= 0) {
> > > > 
> > > > 
> > > > and run the kernel under QEMU.
> > > > 
> > > > On my machine I'm seeing the following output:
> > > > 
> > > > $ cat log | strings | grep rng_fillbuf
> > > > [4.901931][  T897] rng_fillbuf:
> > > > AAA
> > > > [4.903104][  T897] rng_fillbuf: >
> > > > [4.903641][  T897] rng_fillbuf:
> > > > [4.904846][  T897] rng_fillbuf: ?
> > > > [4.913442][  T897] rng_fillbuf: [
> > > > 
> > > > , which denotes that the first call to rng_get_data() leaves
> > > > rng_fillbuf uninitialized.
> > > 
> > > 
> > > Thank you for the detailed steps.
> > > 
> > > The problem happens because we mix two different buffers:
> > > - in add_early_randomness() we provide rng_buffer but don't wait for it
> > > to be filled (see [1])
> > > - in hwrng_fillfn() we provide rng_fillbuf and wait for data here; we
> > > then receive the signal from QEMU that data are available, but these
> > > data are in rng_buffer while we expect them in rng_fillbuf.
> > > 
> > > There are several ways to fix/workaround that:
> > > 
> > > 1- ignore the read when wait=0 :
> > > 
> > > diff --git a/drivers/char/hw_random/virtio-rng.c 
> > > b/drivers/char/hw_random/virtio-rng.c
> > > index a90001e02bf7..8466d76566fd 100644
> > > --- a/drivers/char/hw_random/virtio-rng.c
> > > +++ b/drivers/char/hw_random/virtio-rng.c
> > > @@ -59,15 +59,15 @@ static int virtio_read(struct hwrng *rng, void *buf, size_t size, bool wait)
> > >  if (vi->hwrng_removed)
> > >  return -ENODEV;
> > > 
> > > +   if (!wait)
> > > +   return 0;
> > > +
> > >  if (!vi->busy) {
> > >  vi->busy = true;
> > >  reinit_completion(&vi->have_data);
> > >  register_buffer(vi, buf, size);
> > >  }
> > > 
> > > -   if (!wait)
> > > -   return 0;
> > > -
> > >  ret = wait_for_completion_killable(&vi->have_data);
> > >  if (ret < 0)
> > >  return ret;
> > > 
> > > 
> > > 2- Use an internal intermediate buffer in virtio-rng, at a cost of a copy,
> > > I have some patches (somewhere) I can refresh to do that.
> > > 
> > > 3- modify hw_random/core.c to use only one buffer
> > > 
> > > Thanks,
> > > Laurent
> > > 
> > > [1] 78887832e765 ("hwrng: core - don't wait on add_early_randomness()")
> > 
> > 4. actually differentiate between the two
> > using the pointer returned by get_buf.
> 
> Even if it can help, I think we should stop mixing the two buffers.
> 
> For instance, if we submit a buffer with wait=0, the caller can re-use or
> release the memory while it is still queued in the device's virtqueue.
> 
> Moreover, what to do if buffers differ?
> 
> Wait and use the data in the previous buffer (that can be corrupted by the
> submitter in-between)?
> 
> Or wait and drop, and wait again with the new buffer?
> 
> BTW, I found my patches that introduce an internal buffer in virtio-rng 
> (solution 2):
> 
> https://github.com/vivier/linux/commits/virtio-rng
> 
> Thanks,
> Laurent

Pls go ahead and post them!

-- 
MST
