Re: [PATCH net-next v3] virtio-net: page_to_skb() use build_skb when there's sufficient tailroom

2021-04-20 Thread Jason Wang


在 2021/4/20 下午8:35, Xuan Zhuo 写道:

I realize this has been merged to net-next already, but I'm getting a
use-after-free with KASAN in page_to_skb() with this patch. Reverting this
change fixes the UAF. I've included the KASAN dump below, and a couple of
comments inline.

I think something went wrong, this was merged before it was acked. Based on the
Jason Wang's comments, there are still some tests that have not been done?

If it has been merged, what should I do now, can I make up the test?



I think so, please test net-next which should contains the fixes from Eric.

Thanks





Thanks.



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[RFC PATCH 7/7] virtio-ring: store DMA metadata in desc_extra for split virtqueue

2021-04-20 Thread Jason Wang
For split virtqueue, we used to depend on the address, length and
flags stored in the descriptor ring for DMA unmapping. This is unsafe
for the case when we don't trust the device since the device can tries
to manipulate the behavior of virtio driver and swiotlb.

For safety, maintain the DMA address, DMA length, descriptor flags and
next filed of the non indirect descriptors in vring_desc_state_extra
when DMA API is used for virtio as we did for packed virtqueue and use
those metadata for performing DMA operations. Indirect descriptors
should be safe since they are using streaming mappings.

For the device that doesn't use DMA API, the behavior which use
descriptor table is unchanged to minimize the performance impact.

Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_ring.c | 100 +--
 1 file changed, 84 insertions(+), 16 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 9800f1c9ce4c..b53ceb65f9cf 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -130,6 +130,7 @@ struct vring_virtqueue {
 
/* Per-descriptor state. */
struct vring_desc_state_split *desc_state;
+   struct vring_desc_extra *desc_extra;
 
/* DMA address and size information */
dma_addr_t queue_dma_addr;
@@ -364,8 +365,8 @@ static int vring_mapping_error(const struct vring_virtqueue 
*vq,
  * Split ring specific functions - *_split().
  */
 
-static void vring_unmap_one_split(const struct vring_virtqueue *vq,
- struct vring_desc *desc)
+static void vring_unmap_one_split_indirect(const struct vring_virtqueue *vq,
+  struct vring_desc *desc)
 {
u16 flags;
 
@@ -389,6 +390,35 @@ static void vring_unmap_one_split(const struct 
vring_virtqueue *vq,
}
 }
 
+static unsigned int vring_unmap_one_split(const struct vring_virtqueue *vq,
+ unsigned int i)
+{
+   struct vring_desc_extra *extra = vq->split.desc_extra;
+   u16 flags;
+
+   if (!vq->use_dma_api)
+   goto out;
+
+   flags = extra[i].flags;
+
+   if (flags & VRING_DESC_F_INDIRECT) {
+   dma_unmap_single(vring_dma_dev(vq),
+extra[i].addr,
+extra[i].len,
+(flags & VRING_DESC_F_WRITE) ?
+DMA_FROM_DEVICE : DMA_TO_DEVICE);
+   } else {
+   dma_unmap_page(vring_dma_dev(vq),
+  extra[i].addr,
+  extra[i].len,
+  (flags & VRING_DESC_F_WRITE) ?
+  DMA_FROM_DEVICE : DMA_TO_DEVICE);
+   }
+
+out:
+   return extra[i].next;
+}
+
 static struct vring_desc *alloc_indirect_split(struct virtqueue *_vq,
   unsigned int total_sg,
   gfp_t gfp)
@@ -417,13 +447,28 @@ static inline unsigned int 
virtqueue_add_desc_split(struct virtqueue *vq,
unsigned int i,
dma_addr_t addr,
unsigned int len,
-   u16 flags)
+   u16 flags,
+   bool trust)
 {
+   struct vring_virtqueue *vring = to_vvq(vq);
+   struct vring_desc_extra *extra = vring->split.desc_extra;
+   u16 next;
+
desc[i].flags = cpu_to_virtio16(vq->vdev, flags);
desc[i].addr = cpu_to_virtio64(vq->vdev, addr);
desc[i].len = cpu_to_virtio32(vq->vdev, len);
 
-   return virtio16_to_cpu(vq->vdev, desc[i].next);
+   if (!trust) {
+   next = extra[i].next;
+   desc[i].next = cpu_to_virtio16(vq->vdev, next);
+
+   extra[i].addr = addr;
+   extra[i].len = len;
+   extra[i].flags = flags;
+   } else
+   next = virtio16_to_cpu(vq->vdev, desc[i].next);
+
+   return next;
 }
 
 static inline int virtqueue_add_split(struct virtqueue *_vq,
@@ -499,8 +544,12 @@ static inline int virtqueue_add_split(struct virtqueue 
*_vq,
goto unmap_release;
 
prev = i;
+   /* Note that we trust indirect descriptor
+* table since it use stream DMA mapping.
+*/
i = virtqueue_add_desc_split(_vq, desc, i, addr, 
sg->length,
-VRING_DESC_F_NEXT);
+VRING_DESC_F_NEXT,
+ 

[RFC PATCH 6/7] virtio: use err label in __vring_new_virtqueue()

2021-04-20 Thread Jason Wang
Using error label for unwind in __vring_new_virtqueue. This is useful
for future refacotring.

Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_ring.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 11dfa0dc8ec1..9800f1c9ce4c 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2137,10 +2137,8 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
index,
 
vq->split.desc_state = kmalloc_array(vring.num,
sizeof(struct vring_desc_state_split), GFP_KERNEL);
-   if (!vq->split.desc_state) {
-   kfree(vq);
-   return NULL;
-   }
+   if (!vq->split.desc_state)
+   goto err_state;
 
/* Put everything in free lists. */
vq->free_head = 0;
@@ -2151,6 +2149,10 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
index,
 
list_add_tail(>vq.list, >vqs);
return >vq;
+
+err_state:
+   kfree(vq);
+   return NULL;
 }
 EXPORT_SYMBOL_GPL(__vring_new_virtqueue);
 
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[RFC PATCH 5/7] virtio_ring: introduce virtqueue_desc_add_split()

2021-04-20 Thread Jason Wang
This patch introduces a helper for storing descriptor in the
descriptor table for split virtqueue.

Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_ring.c | 39 ++--
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 5509c2643fb1..11dfa0dc8ec1 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -412,6 +412,20 @@ static struct vring_desc *alloc_indirect_split(struct 
virtqueue *_vq,
return desc;
 }
 
+static inline unsigned int virtqueue_add_desc_split(struct virtqueue *vq,
+   struct vring_desc *desc,
+   unsigned int i,
+   dma_addr_t addr,
+   unsigned int len,
+   u16 flags)
+{
+   desc[i].flags = cpu_to_virtio16(vq->vdev, flags);
+   desc[i].addr = cpu_to_virtio64(vq->vdev, addr);
+   desc[i].len = cpu_to_virtio32(vq->vdev, len);
+
+   return virtio16_to_cpu(vq->vdev, desc[i].next);
+}
+
 static inline int virtqueue_add_split(struct virtqueue *_vq,
  struct scatterlist *sgs[],
  unsigned int total_sg,
@@ -484,11 +498,9 @@ static inline int virtqueue_add_split(struct virtqueue 
*_vq,
if (vring_mapping_error(vq, addr))
goto unmap_release;
 
-   desc[i].flags = cpu_to_virtio16(_vq->vdev, 
VRING_DESC_F_NEXT);
-   desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
-   desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
prev = i;
-   i = virtio16_to_cpu(_vq->vdev, desc[i].next);
+   i = virtqueue_add_desc_split(_vq, desc, i, addr, 
sg->length,
+VRING_DESC_F_NEXT);
}
}
for (; n < (out_sgs + in_sgs); n++) {
@@ -497,11 +509,11 @@ static inline int virtqueue_add_split(struct virtqueue 
*_vq,
if (vring_mapping_error(vq, addr))
goto unmap_release;
 
-   desc[i].flags = cpu_to_virtio16(_vq->vdev, 
VRING_DESC_F_NEXT | VRING_DESC_F_WRITE);
-   desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
-   desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
prev = i;
-   i = virtio16_to_cpu(_vq->vdev, desc[i].next);
+   i = virtqueue_add_desc_split(_vq, desc, i, addr,
+sg->length,
+VRING_DESC_F_NEXT |
+VRING_DESC_F_WRITE);
}
}
/* Last one doesn't continue. */
@@ -515,13 +527,10 @@ static inline int virtqueue_add_split(struct virtqueue 
*_vq,
if (vring_mapping_error(vq, addr))
goto unmap_release;
 
-   vq->split.vring.desc[head].flags = cpu_to_virtio16(_vq->vdev,
-   VRING_DESC_F_INDIRECT);
-   vq->split.vring.desc[head].addr = cpu_to_virtio64(_vq->vdev,
-   addr);
-
-   vq->split.vring.desc[head].len = cpu_to_virtio32(_vq->vdev,
-   total_sg * sizeof(struct vring_desc));
+   virtqueue_add_desc_split(_vq, vq->split.vring.desc,
+head, addr,
+total_sg * sizeof(struct vring_desc),
+VRING_DESC_F_INDIRECT);
}
 
/* We're using some buffers from the free list. */
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[RFC PATCH 4/7] virtio_ring: secure handling of mapping errors

2021-04-20 Thread Jason Wang
We should not depend on the DMA address, length and flag of descriptor
table since they could be wrote with arbitrary value by the device. So
this patch switches to use the stored one in desc_extra.

Note that the indirect descriptors are fine since they are read-only
streaming mappings.

Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_ring.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 0cdd965dba58..5509c2643fb1 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1213,13 +1213,16 @@ static inline int virtqueue_add_packed(struct virtqueue 
*_vq,
 unmap_release:
err_idx = i;
i = head;
+   curr = vq->free_head;
 
vq->packed.avail_used_flags = avail_used_flags;
 
for (n = 0; n < total_sg; n++) {
if (i == err_idx)
break;
-   vring_unmap_desc_packed(vq, [i]);
+   vring_unmap_state_packed(vq,
+>packed.desc_extra[curr]);
+   curr = vq->packed.desc_extra[curr].next;
i++;
if (i >= vq->packed.vring.num)
i = 0;
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[RFC PATCH 3/7] virtio-ring: factor out desc_extra allocation

2021-04-20 Thread Jason Wang
A helper is introduced for the logic of allocating the descriptor
extra data. This will be reused by split virtqueue.

Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_ring.c | 30 --
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index c25ea5776687..0cdd965dba58 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1550,6 +1550,25 @@ static void *virtqueue_detach_unused_buf_packed(struct 
virtqueue *_vq)
return NULL;
 }
 
+static struct vring_desc_extra *vring_alloc_desc_extra(struct vring_virtqueue 
*vq,
+  unsigned int num)
+{
+   struct vring_desc_extra *desc_extra;
+   unsigned int i;
+
+   desc_extra = kmalloc_array(num, sizeof(struct vring_desc_extra),
+  GFP_KERNEL);
+   if (!desc_extra)
+   return NULL;
+
+   memset(desc_extra, 0, num * sizeof(struct vring_desc_extra));
+
+   for (i = 0; i < num - 1; i++)
+   desc_extra[i].next = i + 1;
+
+   return desc_extra;
+}
+
 static struct virtqueue *vring_create_virtqueue_packed(
unsigned int index,
unsigned int num,
@@ -1567,7 +1586,6 @@ static struct virtqueue *vring_create_virtqueue_packed(
struct vring_packed_desc_event *driver, *device;
dma_addr_t ring_dma_addr, driver_event_dma_addr, device_event_dma_addr;
size_t ring_size_in_bytes, event_size_in_bytes;
-   unsigned int i;
 
ring_size_in_bytes = num * sizeof(struct vring_packed_desc);
 
@@ -1650,18 +1668,10 @@ static struct virtqueue *vring_create_virtqueue_packed(
/* Put everything in free lists. */
vq->free_head = 0;
 
-   vq->packed.desc_extra = kmalloc_array(num,
-   sizeof(struct vring_desc_extra),
-   GFP_KERNEL);
+   vq->packed.desc_extra = vring_alloc_desc_extra(vq, num);
if (!vq->packed.desc_extra)
goto err_desc_extra;
 
-   memset(vq->packed.desc_extra, 0,
-   num * sizeof(struct vring_desc_extra));
-
-   for (i = 0; i < num - 1; i++)
-   vq->packed.desc_extra[i].next = i + 1;
-
/* No callback?  Tell other side not to bother us. */
if (!callback) {
vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[RFC PATCH 2/7] virtio_ring: rename vring_desc_extra_packed

2021-04-20 Thread Jason Wang
Rename vring_desc_extra_packed to vring_desc_extra since the structure
are pretty generic which could be reused by split virtqueue as well.

Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_ring.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index e1e9ed42e637..c25ea5776687 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -77,7 +77,7 @@ struct vring_desc_state_packed {
u16 last;   /* The last desc state in a list. */
 };
 
-struct vring_desc_extra_packed {
+struct vring_desc_extra {
dma_addr_t addr;/* Buffer DMA addr. */
u32 len;/* Buffer length. */
u16 flags;  /* Descriptor flags. */
@@ -166,7 +166,7 @@ struct vring_virtqueue {
 
/* Per-descriptor state. */
struct vring_desc_state_packed *desc_state;
-   struct vring_desc_extra_packed *desc_extra;
+   struct vring_desc_extra *desc_extra;
 
/* DMA address and size information */
dma_addr_t ring_dma_addr;
@@ -912,7 +912,7 @@ static struct virtqueue *vring_create_virtqueue_split(
  */
 
 static void vring_unmap_state_packed(const struct vring_virtqueue *vq,
-struct vring_desc_extra_packed *state)
+struct vring_desc_extra *state)
 {
u16 flags;
 
@@ -1651,13 +1651,13 @@ static struct virtqueue *vring_create_virtqueue_packed(
vq->free_head = 0;
 
vq->packed.desc_extra = kmalloc_array(num,
-   sizeof(struct vring_desc_extra_packed),
+   sizeof(struct vring_desc_extra),
GFP_KERNEL);
if (!vq->packed.desc_extra)
goto err_desc_extra;
 
memset(vq->packed.desc_extra, 0,
-   num * sizeof(struct vring_desc_extra_packed));
+   num * sizeof(struct vring_desc_extra));
 
for (i = 0; i < num - 1; i++)
vq->packed.desc_extra[i].next = i + 1;
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[RFC PATCH 1/7] virtio-ring: maintain next in extra state for packed virtqueue

2021-04-20 Thread Jason Wang
This patch moves next from vring_desc_state_packed to
vring_desc_desc_extra_packed. This makes it simpler to let extra state
to be reused by split virtqueue.

Signed-off-by: Jason Wang 
---
 drivers/virtio/virtio_ring.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 71e16b53e9c1..e1e9ed42e637 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -74,7 +74,6 @@ struct vring_desc_state_packed {
void *data; /* Data for callback. */
struct vring_packed_desc *indir_desc; /* Indirect descriptor, if any. */
u16 num;/* Descriptor list length. */
-   u16 next;   /* The next desc state in a list. */
u16 last;   /* The last desc state in a list. */
 };
 
@@ -82,6 +81,7 @@ struct vring_desc_extra_packed {
dma_addr_t addr;/* Buffer DMA addr. */
u32 len;/* Buffer length. */
u16 flags;  /* Descriptor flags. */
+   u16 next;   /* The next desc state in a list. */
 };
 
 struct vring_virtqueue {
@@ -1061,7 +1061,7 @@ static int virtqueue_add_indirect_packed(struct 
vring_virtqueue *vq,
1 << VRING_PACKED_DESC_F_USED;
}
vq->packed.next_avail_idx = n;
-   vq->free_head = vq->packed.desc_state[id].next;
+   vq->free_head = vq->packed.desc_extra[id].next;
 
/* Store token and indirect buffer state. */
vq->packed.desc_state[id].num = 1;
@@ -1169,7 +1169,7 @@ static inline int virtqueue_add_packed(struct virtqueue 
*_vq,
le16_to_cpu(flags);
}
prev = curr;
-   curr = vq->packed.desc_state[curr].next;
+   curr = vq->packed.desc_extra[curr].next;
 
if ((unlikely(++i >= vq->packed.vring.num))) {
i = 0;
@@ -1290,7 +1290,7 @@ static void detach_buf_packed(struct vring_virtqueue *vq,
/* Clear data ptr. */
state->data = NULL;
 
-   vq->packed.desc_state[state->last].next = vq->free_head;
+   vq->packed.desc_extra[state->last].next = vq->free_head;
vq->free_head = id;
vq->vq.num_free += state->num;
 
@@ -1299,7 +1299,7 @@ static void detach_buf_packed(struct vring_virtqueue *vq,
for (i = 0; i < state->num; i++) {
vring_unmap_state_packed(vq,
>packed.desc_extra[curr]);
-   curr = vq->packed.desc_state[curr].next;
+   curr = vq->packed.desc_extra[curr].next;
}
}
 
@@ -1649,8 +1649,6 @@ static struct virtqueue *vring_create_virtqueue_packed(
 
/* Put everything in free lists. */
vq->free_head = 0;
-   for (i = 0; i < num-1; i++)
-   vq->packed.desc_state[i].next = i + 1;
 
vq->packed.desc_extra = kmalloc_array(num,
sizeof(struct vring_desc_extra_packed),
@@ -1661,6 +1659,9 @@ static struct virtqueue *vring_create_virtqueue_packed(
memset(vq->packed.desc_extra, 0,
num * sizeof(struct vring_desc_extra_packed));
 
+   for (i = 0; i < num - 1; i++)
+   vq->packed.desc_extra[i].next = i + 1;
+
/* No callback?  Tell other side not to bother us. */
if (!callback) {
vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[RFC PATCH 0/7] Untrusted device support for virtio

2021-04-20 Thread Jason Wang
Hi All:

Sometimes, the driver doesn't trust the device. This is usually
happens for the encrtpyed VM or VDUSE[1]. In both cases, technology
like swiotlb is used to prevent the poking/mangling of memory from the
device. But this is not sufficient since current virtio driver may
trust what is stored in the descriptor table (coherent mapping) for
performing the DMA operations like unmap and bounce so the device may
choose to utilize the behaviour of swiotlb to perform attacks[2].

For double insurance, to protect from a malicous device, when DMA API
is used for the device, this series store and use the descriptor
metadata in an auxiliay structure which can not be accessed via
swiotlb instead of the ones in the descriptor table. Actually, we've
almost achieved that through packed virtqueue and we just need to fix
a corner case of handling mapping errors. For split virtqueue we just
follow what's done in the packed.

Note that we don't duplicate descriptor medata for indirect
descriptors since it uses stream mapping which is read only so it's
safe if the metadata of non-indirect descriptors are correct.

The behaivor for non DMA API is kept for minimizing the performance
impact.

Slightly tested with packed on/off, iommu on/of, swiotlb force/off in
the guest.

Please review.

[1] 
https://lore.kernel.org/netdev/fab615ce-5e13-a3b3-3715-a4203b4ab...@redhat.com/T/
[2] 
https://yhbt.net/lore/all/c3629a27-3590-1d9f-211b-c0b7be152...@redhat.com/T/#mc6b6e2343cbeffca68ca7a97e0f473aaa871c95b

Jason Wang (7):
  virtio-ring: maintain next in extra state for packed virtqueue
  virtio_ring: rename vring_desc_extra_packed
  virtio-ring: factor out desc_extra allocation
  virtio_ring: secure handling of mapping errors
  virtio_ring: introduce virtqueue_desc_add_split()
  virtio: use err label in __vring_new_virtqueue()
  virtio-ring: store DMA metadata in desc_extra for split virtqueue

 drivers/virtio/virtio_ring.c | 189 ++-
 1 file changed, 141 insertions(+), 48 deletions(-)

-- 
2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net-next] virtio-net: fix use-after-free in page_to_skb()

2021-04-20 Thread Jason Wang


在 2021/4/20 下午5:43, Eric Dumazet 写道:

From: Eric Dumazet 

KASAN/syzbot had 4 reports, one of them being:

BUG: KASAN: slab-out-of-bounds in memcpy include/linux/fortify-string.h:191 
[inline]
BUG: KASAN: slab-out-of-bounds in page_to_skb+0x5cf/0xb70 
drivers/net/virtio_net.c:480
Read of size 12 at addr 888014a5f800 by task systemd-udevd/8445

CPU: 0 PID: 8445 Comm: systemd-udevd Not tainted 
5.12.0-rc8-next-20210419-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
  
  __dump_stack lib/dump_stack.c:79 [inline]
  dump_stack+0x141/0x1d7 lib/dump_stack.c:120
  print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:233
  __kasan_report mm/kasan/report.c:419 [inline]
  kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:436
  check_region_inline mm/kasan/generic.c:180 [inline]
  kasan_check_range+0x13d/0x180 mm/kasan/generic.c:186
  memcpy+0x20/0x60 mm/kasan/shadow.c:65
  memcpy include/linux/fortify-string.h:191 [inline]
  page_to_skb+0x5cf/0xb70 drivers/net/virtio_net.c:480
  receive_mergeable drivers/net/virtio_net.c:1009 [inline]
  receive_buf+0x2bc0/0x6250 drivers/net/virtio_net.c:1119
  virtnet_receive drivers/net/virtio_net.c:1411 [inline]
  virtnet_poll+0x568/0x10b0 drivers/net/virtio_net.c:1516
  __napi_poll+0xaf/0x440 net/core/dev.c:6962
  napi_poll net/core/dev.c:7029 [inline]
  net_rx_action+0x801/0xb40 net/core/dev.c:7116
  __do_softirq+0x29b/0x9fe kernel/softirq.c:559
  invoke_softirq kernel/softirq.c:433 [inline]
  __irq_exit_rcu+0x136/0x200 kernel/softirq.c:637
  irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
  common_interrupt+0xa4/0xd0 arch/x86/kernel/irq.c:240

Fixes: fb32856b16ad ("virtio-net: page_to_skb() use build_skb when there's 
sufficient tailroom")
Signed-off-by: Eric Dumazet 
Reported-by: syzbot 
Reported-by: Guenter Roeck 
Reported-by: Mat Martineau 
Cc: Xuan Zhuo 
Cc: Jason Wang 
Cc: "Michael S. Tsirkin" 
Cc: virtualization@lists.linux-foundation.org
---



Acked-by: Jason Wang 



  drivers/net/virtio_net.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 
8cd76037c72481200ea3e8429e9fdfec005dad85..2e28c04aa6351d2b4016f7d277ce104c4970069d
 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -385,6 +385,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
struct sk_buff *skb;
struct virtio_net_hdr_mrg_rxbuf *hdr;
unsigned int copy, hdr_len, hdr_padded_len;
+   struct page *page_to_free = NULL;
int tailroom, shinfo_size;
char *p, *hdr_p;
  
@@ -445,7 +446,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,

if (len)
skb_add_rx_frag(skb, 0, page, offset, len, truesize);
else
-   put_page(page);
+   page_to_free = page;
goto ok;
}
  
@@ -479,6 +480,8 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,

hdr = skb_vnet_hdr(skb);
memcpy(hdr, hdr_p, hdr_len);
}
+   if (page_to_free)
+   put_page(page_to_free);
  
  	if (metasize) {

__skb_pull(skb, metasize);


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH net-next] virtio-net: restrict build_skb() use to some arches

2021-04-20 Thread Eric Dumazet
From: Eric Dumazet 

build_skb() is supposed to be followed by
skb_reserve(skb, NET_IP_ALIGN), so that IP headers are word-aligned.
(Best practice is to reserve NET_IP_ALIGN+NET_SKB_PAD, but the NET_SKB_PAD
part is only a performance optimization if tunnel encaps are added.)

Unfortunately virtio_net has not provisioned this reserve.
We can only use build_skb() for arches where NET_IP_ALIGN == 0

We might refine this later, with enough testing.

Fixes: fb32856b16ad ("virtio-net: page_to_skb() use build_skb when there's 
sufficient tailroom")
Signed-off-by: Eric Dumazet 
Reported-by: Guenter Roeck 
Cc: Xuan Zhuo 
Cc: Jason Wang 
Cc: "Michael S. Tsirkin" 
Cc: virtualization@lists.linux-foundation.org
---
 drivers/net/virtio_net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 
2e28c04aa6351d2b4016f7d277ce104c4970069d..74d2d49264f3f3b7039be70331d4a44c53b8cc28
 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -416,7 +416,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 
shinfo_size = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 
-   if (len > GOOD_COPY_LEN && tailroom >= shinfo_size) {
+   if (!NET_IP_ALIGN && len > GOOD_COPY_LEN && tailroom >= shinfo_size) {
skb = build_skb(p, truesize);
if (unlikely(!skb))
return NULL;
-- 
2.31.1.368.gbe11c130af-goog

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [Virtio-fs] [PATCH] virtiofs: propagate sync() to file server

2021-04-20 Thread Vivek Goyal
On Mon, Apr 19, 2021 at 05:08:48PM +0200, Greg Kurz wrote:
> Even if POSIX doesn't mandate it, linux users legitimately expect
> sync() to flush all data and metadata to physical storage when it
> is located on the same system. This isn't happening with virtiofs
> though : sync() inside the guest returns right away even though
> data still needs to be flushed from the host page cache.
> 
> This is easily demonstrated by doing the following in the guest:
> 
> $ dd if=/dev/zero of=/mnt/foo bs=1M count=5K ; strace -T -e sync sync
> 5120+0 records in
> 5120+0 records out
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 5.4 s, 1.0 GB/s
> sync()  = 0 <0.024068>
> +++ exited with 0 +++
> 
> and start the following in the host when the 'dd' command completes
> in the guest:
> 
> $ strace -T -e fsync sync virtiofs/foo
   
That "sync" is not /usr/bin/sync and its your own binary to call fsync()?

> fsync(3)= 0 <10.371640>
> +++ exited with 0 +++
> 
> There are no good reasons not to honor the expected behavior of
> sync() actually : it gives an unrealistic impression that virtiofs
> is super fast and that data has safely landed on HW, which isn't
> the case obviously.
> 
> Implement a ->sync_fs() superblock operation that sends a new
> FUSE_SYNC request type for this purpose. The FUSE_SYNC request
> conveys the 'wait' argument of ->sync_fs() in case the file
> server has a use for it. Like with FUSE_FSYNC and FUSE_FSYNCDIR,
> lack of support for FUSE_SYNC in the file server is treated as
> permanent success.
> 
> Note that such an operation allows the file server to DoS sync().
> Since a typical FUSE file server is an untrusted piece of software
> running in userspace, this is disabled by default.  Only enable it
> with virtiofs for now since virtiofsd is supposedly trusted by the
> guest kernel.
> 
> Reported-by: Robert Krawitz 
> Signed-off-by: Greg Kurz 
> ---
> 
> Can be tested using the following custom QEMU with FUSE_SYNCFS support:
> 
> https://gitlab.com/gkurz/qemu/-/tree/fuse-sync
> 
> ---
>  fs/fuse/fuse_i.h  |  3 +++
>  fs/fuse/inode.c   | 29 +
>  fs/fuse/virtio_fs.c   |  1 +
>  include/uapi/linux/fuse.h | 11 ++-
>  4 files changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index 63d97a15ffde..68e9ae96cbd4 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -755,6 +755,9 @@ struct fuse_conn {
>   /* Auto-mount submounts announced by the server */
>   unsigned int auto_submounts:1;
>  
> + /* Propagate syncfs() to server */
> + unsigned int sync_fs:1;
> +
>   /** The number of requests waiting for completion */
>   atomic_t num_waiting;
>  
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index b0e18b470e91..425d567a06c5 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -506,6 +506,34 @@ static int fuse_statfs(struct dentry *dentry, struct 
> kstatfs *buf)
>   return err;
>  }
>  
> +static int fuse_sync_fs(struct super_block *sb, int wait)
> +{
> + struct fuse_mount *fm = get_fuse_mount_super(sb);
> + struct fuse_conn *fc = fm->fc;
> + struct fuse_syncfs_in inarg;
> + FUSE_ARGS(args);
> + int err;
> +
> + if (!fc->sync_fs)
> + return 0;
> +
> + memset(, 0, sizeof(inarg));
> + inarg.wait = wait;
> + args.in_numargs = 1;
> + args.in_args[0].size = sizeof(inarg);
> + args.in_args[0].value = 
> + args.opcode = FUSE_SYNCFS;
> + args.out_numargs = 0;
> +
> + err = fuse_simple_request(fm, );
> + if (err == -ENOSYS) {
> + fc->sync_fs = 0;
> + err = 0;
> + }

I was wondering what will happen if older file server does not support
FUSE_SYNCFS. So we will get -ENOSYS and future syncfs commmands will not
be sent.

> +
> + return err;

Right now we don't propagate this error code all the way to user space.
I think I should post my patch to fix it again.

https://lore.kernel.org/linux-fsdevel/20201221195055.35295-2-vgo...@redhat.com/

> +}
> +
>  enum {
>   OPT_SOURCE,
>   OPT_SUBTYPE,
> @@ -909,6 +937,7 @@ static const struct super_operations 
> fuse_super_operations = {
>   .put_super  = fuse_put_super,
>   .umount_begin   = fuse_umount_begin,
>   .statfs = fuse_statfs,
> + .sync_fs= fuse_sync_fs,
>   .show_options   = fuse_show_options,
>  };
>  
> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> index 4ee6f734ba83..a3c025308743 100644
> --- a/fs/fuse/virtio_fs.c
> +++ b/fs/fuse/virtio_fs.c
> @@ -1441,6 +1441,7 @@ static int virtio_fs_get_tree(struct fs_context *fsc)
>   fc->release = fuse_free_conn;
>   fc->delete_stale = true;
>   fc->auto_submounts = true;
> + fc->sync_fs = true;
>  
>   fsc->s_fs_info = fm;
>   sb = sget_fc(fsc, virtio_fs_test_super, set_anon_super_fc);
> diff --git 

Re: [PATCH net-next] virtio-net: fix use-after-free in page_to_skb()

2021-04-20 Thread Eric Dumazet



On 4/20/21 7:51 PM, Guenter Roeck wrote:

> 
> sh does indeed fail, with the same symptoms as before, but so far I was not
> able to track it down to a specific commit. The alpha failure is different,
> though. It is a NULL pointer access.
> 
> Anyway, testing ...
> 
> The patch below does indeed fix the problem I am seeing on sh.
> 
> ... and it does fix the alpha problem as well. Neat, though I don't really 
> understand
> what a NULL pointer access and an unaligned access have to do with each other.
> 
> Great catch, thanks!
> 
> Guenter
> 

Note that build_skb(), without an additional skb_reserve(skb, NET_IP_ALIGN)
can not possibly work on arches that care about alignments.

That is because we can not both align skb->data and skb_shinfo(skb)

So unless we change build_skb() to make sure to align skb_shinfo(),
a fix could be simply :

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 
8cd76037c72481200ea3e8429e9fdfec005dad85..9cbe9c1737649450e451e3c65f59f794d1bf34b0
 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -415,7 +415,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 
shinfo_size = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 
-   if (len > GOOD_COPY_LEN && tailroom >= shinfo_size) {
+   if (!_NET_IP_ALIGN && len > GOOD_COPY_LEN && tailroom >= shinfo_size) {
skb = build_skb(p, truesize);
if (unlikely(!skb))
return NULL;

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net-next] virtio-net: fix use-after-free in page_to_skb()

2021-04-20 Thread Guenter Roeck
On 4/20/21 9:31 AM, Eric Dumazet wrote:
> On Tue, Apr 20, 2021 at 5:42 PM Guenter Roeck  wrote:
>>
>> On Tue, Apr 20, 2021 at 04:00:07PM +0200, Eric Dumazet wrote:
>>> On Tue, Apr 20, 2021 at 3:48 PM Guenter Roeck  wrote:

 On 4/20/21 2:43 AM, Eric Dumazet wrote:
>>>
>

 Unfortunately that doesn't fix the problem for me. With this patch applied
 on top of next-20210419, I still get the same crash as before:

 udhcpc: sending discover^M
 Unable to handle kernel paging request at virtual address 
 0004^M
 udhcpc(169): Oops -1^M
 pc = [<0004>]  ra = []  ps = Not 
 tainted^M
 pc is at 0x4^M
 ra is at napi_gro_receive+0x68/0x150^M
 v0 =   t0 = 0008  t1 = ^M
 t2 =   t3 = 000e  t4 = 0038^M
 t5 =   t6 = fc2f298a  t7 = fc0002c78000^M
 s0 = fc00010b3ca0  s1 =   s2 = fc00011267e0^M
 s3 =   s4 = fc00025f2008  s5 = fc2f2940^M
 s6 = fc00025f2040^M
 a0 = fc00025f2008  a1 = fc2f2940  a2 = fc0002ca000c^M
 a3 = fc0250d0  a4 = 000e0008  a5 = ^M
 t8 = fc00010b3c80  t9 = fc0002ca04cc  t10= ^M
 t11= 04c0  pv = fcb8bc40  at = ^M
 gp = fc00010f9fb8  sp = df74db09^M
 Disabling lock debugging due to kernel taint^M
 Trace:^M
 [] napi_gro_receive+0x68/0x150^M
 [] receive_buf+0x50c/0x1b80^M
 [] virtnet_poll+0x1a8/0x5b0^M
 [] virtnet_poll+0x1dc/0x5b0^M
 [] __napi_poll+0x4c/0x270^M
 [] net_rx_action+0x130/0x2c0^M
 [] sch_direct_xmit+0x170/0x360^M
 [] __qdisc_run+0x160/0x6c0^M
 [] do_softirq+0xa4/0xd0^M
 [] __local_bh_enable_ip+0x114/0x120^M
 [] __dev_queue_xmit+0x484/0xa60^M
 [] packet_sendmsg+0xe7c/0x1ba0^M
 [] __sys_sendto+0xf8/0x170^M
 [] _raw_spin_unlock+0x18/0x30^M
 [] ehci_irq+0x2cc/0x5c0^M
 [] usb_hcd_irq+0x34/0x50^M
 [] move_addr_to_kernel+0x3c/0x60^M
 [] __sys_sendto+0xa4/0x170^M
 [] sys_sendto+0x24/0x40^M
 [] _raw_spin_lock+0x18/0x30^M
 [] _raw_spin_unlock+0x18/0x30^M
 [] clipper_enable_irq+0x98/0x100^M
 [] _raw_spin_unlock+0x18/0x30^M
 [] entSys+0xa4/0xc0^M
>>>
>>> OK, it would be nice if you could get line number from this stack trace.
>>>
>>
>> Here you are:
>>
>> napi_gro_receive (net/core/dev.c:6196)
>> receive_buf (drivers/net/virtio_net.c:1150)
>> virtnet_poll (drivers/net/virtio_net.c:1414 drivers/net/virtio_net.c:1519)
>> clipper_srm_device_interrupt (arch/alpha/kernel/sys_dp264.c:256)
>> virtnet_poll (drivers/net/virtio_net.c:1413 drivers/net/virtio_net.c:1519)
>> __napi_poll (net/core/dev.c:6962)
>> net_rx_action (net/core/dev.c:7029 net/core/dev.c:7116)
>> __qdisc_run (net/sched/sch_generic.c:376 net/sched/sch_generic.c:384)
>> do_softirq (./include/asm-generic/softirq_stack.h:10 kernel/softirq.c:460 
>> kernel/softirq.c:447)
>> __local_bh_enable_ip (kernel/softirq.c:384)
>> __dev_queue_xmit (./include/linux/bottom_half.h:32 
>> ./include/linux/rcupdate.h:746 net/core/dev.c:4272)
>> packet_sendmsg (net/packet/af_packet.c:3009 net/packet/af_packet.c:3034)
>> __sys_sendto (net/socket.c:654 net/socket.c:674 net/socket.c:1977)
>> __d_alloc (fs/dcache.c:1744)
>> packet_create (net/packet/af_packet.c:1192 net/packet/af_packet.c:3296)
>> move_addr_to_kernel (./include/linux/uaccess.h:192 net/socket.c:198 
>> net/socket.c:192)
>> __sys_sendto (net/socket.c:1968)
>> sys_sendto (net/socket.c:1989 net/socket.c:1985)
>> sys_bind (net/socket.c:1648 net/socket.c:1646)
>> entSys (arch/alpha/kernel/entry.S:477)
>>
>> Guenter
> 
> OK, I guess we are back to unaligned access, right ?
> I guess sh arch should have failed as well ?
> 

sh does indeed fail, with the same symptoms as before, but so far I was not
able to track it down to a specific commit. The alpha failure is different,
though. It is a NULL pointer access.

Anyway, testing ...

The patch below does indeed fix the problem I am seeing on sh.

... and it does fix the alpha problem as well. Neat, though I don't really 
understand
what a NULL pointer access and an unaligned access have to do with each other.

Great catch, thanks!

Guenter


> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 
> 8cd76037c72481200ea3e8429e9fdfec005dad85..0579914d3dd84c24982c1ff85314cc7b8d0f8d2d
> 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -415,7 +415,8 @@ static struct sk_buff *page_to_skb(struct virtnet_info 
> *vi,
> 
> shinfo_size = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> 
> -   if (len > GOOD_COPY_LEN && tailroom >= shinfo_size) {
> +   if (len > GOOD_COPY_LEN && tailroom >= shinfo_size &&
> +   (!NET_IP_ALIGN || ((unsigned long)p & 3) == 2)) {
> skb = build_skb(p, 

Re: [PATCH] drm/bochs: Add screen blanking support

2021-04-20 Thread Thomas Zimmermann

Hi

Am 20.04.21 um 18:56 schrieb Takashi Iwai:

On bochs DRM driver, the execution of "setterm --blank force" results
in a frozen screen instead of a blank screen.  It's due to the lack of
the screen blanking support in its code.

Actually, the QEMU bochs vga side can switch to the blanking mode when
the bit 0x20 is cleared on VGA_ATT_IW register (0x3c0), which updates
ar_index in QEMU side.  So, essentially, we'd just need to clear the
bit at pipe disable callback; that's what this patch does essentially.

However, a tricky part is that the QEMU vga code does treat VGA_ATT_IW
register always as "flip-flop"; the first write is for index and the
second write is for the data like palette.  Meanwhile, in the current
bochs DRM driver, the flip-flop wasn't considered, and it calls only
the register update once with the value 0x20.



Unless bochs does things very different, the index should first be reset 
by reading 0x3da. Then write the index, then the data.


https://web.stanford.edu/class/cs140/projects/pintos/specs/freevga/vga/vgareg.htm#attribute

Best regards
Thomas


So, in this patch, we fix the behavior by simply writing the
VGA_ATT_IW register value twice at each place as well.

Signed-off-by: Takashi Iwai 
---
  drivers/gpu/drm/bochs/bochs_hw.c  | 13 -
  drivers/gpu/drm/bochs/bochs_kms.c |  8 
  2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bochs/bochs_hw.c b/drivers/gpu/drm/bochs/bochs_hw.c
index 2d7380a9890e..9a6f90216d6c 100644
--- a/drivers/gpu/drm/bochs/bochs_hw.c
+++ b/drivers/gpu/drm/bochs/bochs_hw.c
@@ -213,6 +213,14 @@ void bochs_hw_setmode(struct bochs_device *bochs,
if (!drm_dev_enter(bochs->dev, ))
return;
  
+	if (!mode) {

+   DRM_DEBUG_DRIVER("crtc disabled\n");
+   /* set to blank mode; send twice for ar_flip_flop */
+   bochs_vga_writeb(bochs, 0x3c0, 0);
+   bochs_vga_writeb(bochs, 0x3c0, 0);
+   goto exit;
+   }
+
bochs->xres = mode->hdisplay;
bochs->yres = mode->vdisplay;
bochs->bpp = 32;
@@ -223,7 +231,9 @@ void bochs_hw_setmode(struct bochs_device *bochs,
 bochs->xres, bochs->yres, bochs->bpp,
 bochs->yres_virtual);
  
-	bochs_vga_writeb(bochs, 0x3c0, 0x20); /* unblank */

+   /* unblank; send twice for ar_flip_flop */
+   bochs_vga_writeb(bochs, 0x3c0, 0x20);
+   bochs_vga_writeb(bochs, 0x3c0, 0x20);
  
  	bochs_dispi_write(bochs, VBE_DISPI_INDEX_ENABLE,  0);

bochs_dispi_write(bochs, VBE_DISPI_INDEX_BPP, bochs->bpp);
@@ -239,6 +249,7 @@ void bochs_hw_setmode(struct bochs_device *bochs,
bochs_dispi_write(bochs, VBE_DISPI_INDEX_ENABLE,
  VBE_DISPI_ENABLED | VBE_DISPI_LFB_ENABLED);
  
+ exit:

drm_dev_exit(idx);
  }
  
diff --git a/drivers/gpu/drm/bochs/bochs_kms.c b/drivers/gpu/drm/bochs/bochs_kms.c

index 853081d186d5..b0d77d6d3ae4 100644
--- a/drivers/gpu/drm/bochs/bochs_kms.c
+++ b/drivers/gpu/drm/bochs/bochs_kms.c
@@ -57,6 +57,13 @@ static void bochs_pipe_enable(struct drm_simple_display_pipe 
*pipe,
bochs_plane_update(bochs, plane_state);
  }
  
+static void bochs_pipe_disable(struct drm_simple_display_pipe *pipe)

+{
+   struct bochs_device *bochs = pipe->crtc.dev->dev_private;
+
+   bochs_hw_setmode(bochs, NULL);
+}
+
  static void bochs_pipe_update(struct drm_simple_display_pipe *pipe,
  struct drm_plane_state *old_state)
  {
@@ -67,6 +74,7 @@ static void bochs_pipe_update(struct drm_simple_display_pipe 
*pipe,
  
  static const struct drm_simple_display_pipe_funcs bochs_pipe_funcs = 

{

.enable = bochs_pipe_enable,
+   .disable= bochs_pipe_disable,
.update = bochs_pipe_update,
.prepare_fb = drm_gem_vram_simple_display_pipe_prepare_fb,
.cleanup_fb = drm_gem_vram_simple_display_pipe_cleanup_fb,



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2] virtio_blk: Add support for lifetime feature

2021-04-20 Thread Cornelia Huck
On Tue, 20 Apr 2021 06:08:29 -0400
"Michael S. Tsirkin"  wrote:

> On Tue, Apr 20, 2021 at 08:01:29AM +0100, Christoph Hellwig wrote:
> > Just to despit my 2 cents again:  I think the way this is specified
> > in the virtio spec is actively harmful and we should not suport it in
> > Linux.
> > 
> > If others override me we at least need to require a detailed
> > documentation of these fields as the virto spec does not provide it.
> > 
> > Please also do not add pointless over 80 character lines, and follow
> > the one value per sysfs file rule.  
> 
> Enrico would you like to raise the issues with the virtio TC
> for resolution?
> 

FWIW, I've opened https://github.com/oasis-tcs/virtio-spec/issues/106
to track this.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net-next] virtio-net: fix use-after-free in page_to_skb()

2021-04-20 Thread Guenter Roeck
On Tue, Apr 20, 2021 at 04:00:07PM +0200, Eric Dumazet wrote:
> On Tue, Apr 20, 2021 at 3:48 PM Guenter Roeck  wrote:
> >
> > On 4/20/21 2:43 AM, Eric Dumazet wrote:
> 
> > >
> >
> > Unfortunately that doesn't fix the problem for me. With this patch applied
> > on top of next-20210419, I still get the same crash as before:
> >
> > udhcpc: sending discover^M
> > Unable to handle kernel paging request at virtual address 0004^M
> > udhcpc(169): Oops -1^M
> > pc = [<0004>]  ra = []  ps = Not 
> > tainted^M
> > pc is at 0x4^M
> > ra is at napi_gro_receive+0x68/0x150^M
> > v0 =   t0 = 0008  t1 = ^M
> > t2 =   t3 = 000e  t4 = 0038^M
> > t5 =   t6 = fc2f298a  t7 = fc0002c78000^M
> > s0 = fc00010b3ca0  s1 =   s2 = fc00011267e0^M
> > s3 =   s4 = fc00025f2008  s5 = fc2f2940^M
> > s6 = fc00025f2040^M
> > a0 = fc00025f2008  a1 = fc2f2940  a2 = fc0002ca000c^M
> > a3 = fc0250d0  a4 = 000e0008  a5 = ^M
> > t8 = fc00010b3c80  t9 = fc0002ca04cc  t10= ^M
> > t11= 04c0  pv = fcb8bc40  at = ^M
> > gp = fc00010f9fb8  sp = df74db09^M
> > Disabling lock debugging due to kernel taint^M
> > Trace:^M
> > [] napi_gro_receive+0x68/0x150^M
> > [] receive_buf+0x50c/0x1b80^M
> > [] virtnet_poll+0x1a8/0x5b0^M
> > [] virtnet_poll+0x1dc/0x5b0^M
> > [] __napi_poll+0x4c/0x270^M
> > [] net_rx_action+0x130/0x2c0^M
> > [] sch_direct_xmit+0x170/0x360^M
> > [] __qdisc_run+0x160/0x6c0^M
> > [] do_softirq+0xa4/0xd0^M
> > [] __local_bh_enable_ip+0x114/0x120^M
> > [] __dev_queue_xmit+0x484/0xa60^M
> > [] packet_sendmsg+0xe7c/0x1ba0^M
> > [] __sys_sendto+0xf8/0x170^M
> > [] _raw_spin_unlock+0x18/0x30^M
> > [] ehci_irq+0x2cc/0x5c0^M
> > [] usb_hcd_irq+0x34/0x50^M
> > [] move_addr_to_kernel+0x3c/0x60^M
> > [] __sys_sendto+0xa4/0x170^M
> > [] sys_sendto+0x24/0x40^M
> > [] _raw_spin_lock+0x18/0x30^M
> > [] _raw_spin_unlock+0x18/0x30^M
> > [] clipper_enable_irq+0x98/0x100^M
> > [] _raw_spin_unlock+0x18/0x30^M
> > [] entSys+0xa4/0xc0^M
> 
> OK, it would be nice if you could get line number from this stack trace.
> 

Here you are:

napi_gro_receive (net/core/dev.c:6196)
receive_buf (drivers/net/virtio_net.c:1150)
virtnet_poll (drivers/net/virtio_net.c:1414 drivers/net/virtio_net.c:1519)
clipper_srm_device_interrupt (arch/alpha/kernel/sys_dp264.c:256)
virtnet_poll (drivers/net/virtio_net.c:1413 drivers/net/virtio_net.c:1519)
__napi_poll (net/core/dev.c:6962)
net_rx_action (net/core/dev.c:7029 net/core/dev.c:7116)
__qdisc_run (net/sched/sch_generic.c:376 net/sched/sch_generic.c:384)
do_softirq (./include/asm-generic/softirq_stack.h:10 kernel/softirq.c:460 
kernel/softirq.c:447)
__local_bh_enable_ip (kernel/softirq.c:384)
__dev_queue_xmit (./include/linux/bottom_half.h:32 
./include/linux/rcupdate.h:746 net/core/dev.c:4272)
packet_sendmsg (net/packet/af_packet.c:3009 net/packet/af_packet.c:3034)
__sys_sendto (net/socket.c:654 net/socket.c:674 net/socket.c:1977)
__d_alloc (fs/dcache.c:1744)
packet_create (net/packet/af_packet.c:1192 net/packet/af_packet.c:3296)
move_addr_to_kernel (./include/linux/uaccess.h:192 net/socket.c:198 
net/socket.c:192)
__sys_sendto (net/socket.c:1968)
sys_sendto (net/socket.c:1989 net/socket.c:1985)
sys_bind (net/socket.c:1648 net/socket.c:1646)
entSys (arch/alpha/kernel/entry.S:477)

Guenter
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 0/9] drm: Support simple-framebuffer devices and firmware fbs

2021-04-20 Thread Daniel Vetter
On Tue, Apr 20, 2021 at 11:16:09AM +0200, Geert Uytterhoeven wrote:
> Hi Daniel,
> 
> On Tue, Apr 20, 2021 at 10:46 AM Daniel Vetter  wrote:
> > On Mon, Apr 19, 2021 at 10:00:56AM +0200, Geert Uytterhoeven wrote:
> > > On Fri, Apr 16, 2021 at 11:00 AM Thomas Zimmermann  
> > > wrote:
> > > > This patchset adds support for simple-framebuffer platform devices and
> > > > a handover mechanism for native drivers to take-over control of the
> > > > hardware.
> > > >
> > > > The new driver, called simpledrm, binds to a simple-frambuffer platform
> > > > device. The kernel's boot code creates such devices for 
> > > > firmware-provided
> > > > framebuffers, such as EFI-GOP or VESA. Typically the BIOS, UEFI or boot
> > > > loader sets up the framebuffers. Description via device tree is also an
> > > > option.
> > >
> > > I guess this can be used as a replacement for offb, too...
> > >
> > > > Patches 4 to 8 add the simpledrm driver. It's build on simple DRM 
> > > > helpers
> > > > and SHMEM. It supports 16-bit, 24-bit and 32-bit RGB framebuffers. 
> > > > During
> > >
> > >  if support for 8-bit frame buffers would be added?
> >
> > Is that 8-bit greyscale or 8-bit indexed with 256 entry palette? Former
> 
> 8-bit indexed with 256 entry palette.
> 
> > shouldn't be a big thing, but the latter is only really supported by the
> > overall drm ecosystem in theory. Most userspace assumes that xrgb
> > works, and we keep that illusion up by emulating it in kernel for hw which
> > just doesn't support it. But reformatting xrgb to c8 is tricky at
> > best. The uapis are all there for setting the palette, and C8 is a defined
> > format even with atomic kms interface, but really there's not much
> > userspace for it. In other words, it would work as well as current offb
> > would, but that's at least that.
> 
> Oh, that's good to know!
> Does this mean fbdev is not deprecated for anything <= C8? ;-)

Nope. It just means you wont be able to use drm-only userspace with it
most likely, without also investing a ton of effort into porting those
over.

> A while ago, I was looking into converting an fbdev driver to drm, and
> one of the things I ran into is lack of C4, C2, C1, Y8, Y4, Y2, and
> monochrome support.  On top of that, lots of internal code seems to
> assume pixels are never smaller than a byte (thus ignoring
> char_per_block[]/block_w).  The lack of support for planar modes isn't
> that bad, combined with the need for copying, as c2p conversion can be
> done while copying, thus even making life easier for userspace
> applications that can just always work on chunky data.
> Then real work kicked in, before I got anything working...

We support drm_fourcc, so adding more pixel formats is not problem at all.
Anything indexed/paletted will simply not work great with unchanged drm
userspace, because you can't really convert it over from the common
denominator of xrgb888. But if it's just about adding support, adding more
fourcc codes isn't a big deal. The fbdev layer hasn't been taught about
fourcc codes yet, but that's also just for lack of need by anyone.

Also we support abitrary uneven pixel packing too, with some generic
support for anything that's at least somewhat regular. That's been the
case for a while now. It was added for fancy tiling and compression
formats, but works equally well for anything else that's aligned different
than what can be described with simplistic bytes-per-pixel only.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net] vsock/virtio: free queued packets when closing socket

2021-04-20 Thread Stefano Garzarella
As reported by syzbot [1], there is a memory leak while closing the
socket. We partially solved this issue with commit ac03046ece2b
("vsock/virtio: free packets during the socket release"), but we
forgot to drain the RX queue when the socket is definitely closed by
the scheduled work.

To avoid future issues, let's use the new virtio_transport_remove_sock()
to drain the RX queue before removing the socket from the af_vsock lists
calling vsock_remove_sock().

[1] https://syzkaller.appspot.com/bug?extid=24452624fc4c571eedd9

Fixes: ac03046ece2b ("vsock/virtio: free packets during the socket release")
Reported-and-tested-by: syzbot+24452624fc4c571ee...@syzkaller.appspotmail.com
Signed-off-by: Stefano Garzarella 
---
 net/vmw_vsock/virtio_transport_common.c | 28 +
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport_common.c 
b/net/vmw_vsock/virtio_transport_common.c
index e4370b1b7494..902cb6dd710b 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -733,6 +733,23 @@ static int virtio_transport_reset_no_sock(const struct 
virtio_transport *t,
return t->send_pkt(reply);
 }
 
+/* This function should be called with sk_lock held and SOCK_DONE set */
+static void virtio_transport_remove_sock(struct vsock_sock *vsk)
+{
+   struct virtio_vsock_sock *vvs = vsk->trans;
+   struct virtio_vsock_pkt *pkt, *tmp;
+
+   /* We don't need to take rx_lock, as the socket is closing and we are
+* removing it.
+*/
+   list_for_each_entry_safe(pkt, tmp, >rx_queue, list) {
+   list_del(>list);
+   virtio_transport_free_pkt(pkt);
+   }
+
+   vsock_remove_sock(vsk);
+}
+
 static void virtio_transport_wait_close(struct sock *sk, long timeout)
 {
if (timeout) {
@@ -765,7 +782,7 @@ static void virtio_transport_do_close(struct vsock_sock 
*vsk,
(!cancel_timeout || cancel_delayed_work(>close_work))) {
vsk->close_work_scheduled = false;
 
-   vsock_remove_sock(vsk);
+   virtio_transport_remove_sock(vsk);
 
/* Release refcnt obtained when we scheduled the timeout */
sock_put(sk);
@@ -828,22 +845,15 @@ static bool virtio_transport_close(struct vsock_sock *vsk)
 
 void virtio_transport_release(struct vsock_sock *vsk)
 {
-   struct virtio_vsock_sock *vvs = vsk->trans;
-   struct virtio_vsock_pkt *pkt, *tmp;
struct sock *sk = >sk;
bool remove_sock = true;
 
if (sk->sk_type == SOCK_STREAM)
remove_sock = virtio_transport_close(vsk);
 
-   list_for_each_entry_safe(pkt, tmp, >rx_queue, list) {
-   list_del(>list);
-   virtio_transport_free_pkt(pkt);
-   }
-
if (remove_sock) {
sock_set_flag(sk, SOCK_DONE);
-   vsock_remove_sock(vsk);
+   virtio_transport_remove_sock(vsk);
}
 }
 EXPORT_SYMBOL_GPL(virtio_transport_release);
-- 
2.30.2

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2] virtio_blk: Add support for lifetime feature

2021-04-20 Thread Michael S. Tsirkin
On Tue, Apr 20, 2021 at 08:01:29AM +0100, Christoph Hellwig wrote:
> Just to despit my 2 cents again:  I think the way this is specified
> in the virtio spec is actively harmful and we should not suport it in
> Linux.
> 
> If others override me we at least need to require a detailed
> documentation of these fields as the virto spec does not provide it.
> 
> Please also do not add pointless over 80 character lines, and follow
> the one value per sysfs file rule.

Enrico would you like to raise the issues with the virtio TC
for resolution?

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net-next] virtio-net: fix use-after-free in page_to_skb()

2021-04-20 Thread Michael S. Tsirkin
On Tue, Apr 20, 2021 at 02:43:41AM -0700, Eric Dumazet wrote:
> From: Eric Dumazet 
> 
> KASAN/syzbot had 4 reports, one of them being:
> 
> BUG: KASAN: slab-out-of-bounds in memcpy include/linux/fortify-string.h:191 
> [inline]
> BUG: KASAN: slab-out-of-bounds in page_to_skb+0x5cf/0xb70 
> drivers/net/virtio_net.c:480
> Read of size 12 at addr 888014a5f800 by task systemd-udevd/8445
> 
> CPU: 0 PID: 8445 Comm: systemd-udevd Not tainted 
> 5.12.0-rc8-next-20210419-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Call Trace:
>  
>  __dump_stack lib/dump_stack.c:79 [inline]
>  dump_stack+0x141/0x1d7 lib/dump_stack.c:120
>  print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:233
>  __kasan_report mm/kasan/report.c:419 [inline]
>  kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:436
>  check_region_inline mm/kasan/generic.c:180 [inline]
>  kasan_check_range+0x13d/0x180 mm/kasan/generic.c:186
>  memcpy+0x20/0x60 mm/kasan/shadow.c:65
>  memcpy include/linux/fortify-string.h:191 [inline]
>  page_to_skb+0x5cf/0xb70 drivers/net/virtio_net.c:480
>  receive_mergeable drivers/net/virtio_net.c:1009 [inline]
>  receive_buf+0x2bc0/0x6250 drivers/net/virtio_net.c:1119
>  virtnet_receive drivers/net/virtio_net.c:1411 [inline]
>  virtnet_poll+0x568/0x10b0 drivers/net/virtio_net.c:1516
>  __napi_poll+0xaf/0x440 net/core/dev.c:6962
>  napi_poll net/core/dev.c:7029 [inline]
>  net_rx_action+0x801/0xb40 net/core/dev.c:7116
>  __do_softirq+0x29b/0x9fe kernel/softirq.c:559
>  invoke_softirq kernel/softirq.c:433 [inline]
>  __irq_exit_rcu+0x136/0x200 kernel/softirq.c:637
>  irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
>  common_interrupt+0xa4/0xd0 arch/x86/kernel/irq.c:240
> 
> Fixes: fb32856b16ad ("virtio-net: page_to_skb() use build_skb when there's 
> sufficient tailroom")
> Signed-off-by: Eric Dumazet 
> Reported-by: syzbot 
> Reported-by: Guenter Roeck 
> Reported-by: Mat Martineau 
> Cc: Xuan Zhuo 
> Cc: Jason Wang 
> Cc: "Michael S. Tsirkin" 
> Cc: virtualization@lists.linux-foundation.org

Acked-by: Michael S. Tsirkin 

> ---
>  drivers/net/virtio_net.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 
> 8cd76037c72481200ea3e8429e9fdfec005dad85..2e28c04aa6351d2b4016f7d277ce104c4970069d
>  100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -385,6 +385,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info 
> *vi,
>   struct sk_buff *skb;
>   struct virtio_net_hdr_mrg_rxbuf *hdr;
>   unsigned int copy, hdr_len, hdr_padded_len;
> + struct page *page_to_free = NULL;
>   int tailroom, shinfo_size;
>   char *p, *hdr_p;
>  
> @@ -445,7 +446,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info 
> *vi,
>   if (len)
>   skb_add_rx_frag(skb, 0, page, offset, len, truesize);
>   else
> - put_page(page);
> + page_to_free = page;
>   goto ok;
>   }
>  
> @@ -479,6 +480,8 @@ static struct sk_buff *page_to_skb(struct virtnet_info 
> *vi,
>   hdr = skb_vnet_hdr(skb);
>   memcpy(hdr, hdr_p, hdr_len);
>   }
> + if (page_to_free)
> + put_page(page_to_free);
>  
>   if (metasize) {
>   __skb_pull(skb, metasize);
> -- 
> 2.31.1.368.gbe11c130af-goog

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net-next] virtio-net: fix use-after-free in page_to_skb()

2021-04-20 Thread Eric Dumazet
From: Eric Dumazet 

KASAN/syzbot had 4 reports, one of them being:

BUG: KASAN: slab-out-of-bounds in memcpy include/linux/fortify-string.h:191 
[inline]
BUG: KASAN: slab-out-of-bounds in page_to_skb+0x5cf/0xb70 
drivers/net/virtio_net.c:480
Read of size 12 at addr 888014a5f800 by task systemd-udevd/8445

CPU: 0 PID: 8445 Comm: systemd-udevd Not tainted 
5.12.0-rc8-next-20210419-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x141/0x1d7 lib/dump_stack.c:120
 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:233
 __kasan_report mm/kasan/report.c:419 [inline]
 kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:436
 check_region_inline mm/kasan/generic.c:180 [inline]
 kasan_check_range+0x13d/0x180 mm/kasan/generic.c:186
 memcpy+0x20/0x60 mm/kasan/shadow.c:65
 memcpy include/linux/fortify-string.h:191 [inline]
 page_to_skb+0x5cf/0xb70 drivers/net/virtio_net.c:480
 receive_mergeable drivers/net/virtio_net.c:1009 [inline]
 receive_buf+0x2bc0/0x6250 drivers/net/virtio_net.c:1119
 virtnet_receive drivers/net/virtio_net.c:1411 [inline]
 virtnet_poll+0x568/0x10b0 drivers/net/virtio_net.c:1516
 __napi_poll+0xaf/0x440 net/core/dev.c:6962
 napi_poll net/core/dev.c:7029 [inline]
 net_rx_action+0x801/0xb40 net/core/dev.c:7116
 __do_softirq+0x29b/0x9fe kernel/softirq.c:559
 invoke_softirq kernel/softirq.c:433 [inline]
 __irq_exit_rcu+0x136/0x200 kernel/softirq.c:637
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
 common_interrupt+0xa4/0xd0 arch/x86/kernel/irq.c:240

Fixes: fb32856b16ad ("virtio-net: page_to_skb() use build_skb when there's 
sufficient tailroom")
Signed-off-by: Eric Dumazet 
Reported-by: syzbot 
Reported-by: Guenter Roeck 
Reported-by: Mat Martineau 
Cc: Xuan Zhuo 
Cc: Jason Wang 
Cc: "Michael S. Tsirkin" 
Cc: virtualization@lists.linux-foundation.org
---
 drivers/net/virtio_net.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 
8cd76037c72481200ea3e8429e9fdfec005dad85..2e28c04aa6351d2b4016f7d277ce104c4970069d
 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -385,6 +385,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
struct sk_buff *skb;
struct virtio_net_hdr_mrg_rxbuf *hdr;
unsigned int copy, hdr_len, hdr_padded_len;
+   struct page *page_to_free = NULL;
int tailroom, shinfo_size;
char *p, *hdr_p;
 
@@ -445,7 +446,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
if (len)
skb_add_rx_frag(skb, 0, page, offset, len, truesize);
else
-   put_page(page);
+   page_to_free = page;
goto ok;
}
 
@@ -479,6 +480,8 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
hdr = skb_vnet_hdr(skb);
memcpy(hdr, hdr_p, hdr_len);
}
+   if (page_to_free)
+   put_page(page_to_free);
 
if (metasize) {
__skb_pull(skb, metasize);
-- 
2.31.1.368.gbe11c130af-goog

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [net-next, v2] virtio-net: page_to_skb() use build_skb when there's sufficient tailroom

2021-04-20 Thread Eric Dumazet



On 4/20/21 6:46 AM, Guenter Roeck wrote:
> On Wed, Apr 14, 2021 at 09:52:21AM +0800, Xuan Zhuo wrote:
>> In page_to_skb(), if we have enough tailroom to save skb_shared_info, we
>> can use build_skb to create skb directly. No need to alloc for
>> additional space. And it can save a 'frags slot', which is very friendly
>> to GRO.
>>
>> Here, if the payload of the received package is too small (less than
>> GOOD_COPY_LEN), we still choose to copy it directly to the space got by
>> napi_alloc_skb. So we can reuse these pages.
>>
>> Testing Machine:
>> The four queues of the network card are bound to the cpu1.
>>
>> Test command:
>> for ((i=0;i<5;++i)); do sockperf tp --ip 192.168.122.64 -m 1000 -t 150& 
>> done
>>
>> The size of the udp package is 1000, so in the case of this patch, there
>> will always be enough tailroom to use build_skb. The sent udp packet
>> will be discarded because there is no port to receive it. The irqsoftd
>> of the machine is 100%, we observe the received quantity displayed by
>> sar -n DEV 1:
>>
>> no build_skb:  956864.00 rxpck/s
>> build_skb:1158465.00 rxpck/s
>>
>> Signed-off-by: Xuan Zhuo 
>> Suggested-by: Jason Wang 
> 
> Booting qemu-system-alpha with virtio-net interface instantiated results in:
> 
> udhcpc: sending discover
> Unable to handle kernel paging request at virtual address 0004
> udhcpc(169): Oops -1
> pc = [<0004>]  ra = []  ps = Not tainted
> pc is at 0x4
> ra is at napi_gro_receive+0x68/0x150
> v0 =   t0 = 0008  t1 = 
> t2 =   t3 = 000e  t4 = 0038
> t5 =   t6 = fc2f220a  t7 = fc0002cd
> s0 = fc00010b3ca0  s1 =   s2 = fc00011267e0
> s3 =   s4 = fc00025f2008  s5 = fc2f21c0
> s6 = fc00025f2040
> a0 = fc00025f2008  a1 = fc2f21c0  a2 = fc0002cc800c
> a3 = fc0250d0  a4 = 000e0008  a5 = 
> t8 = fc00010b3c80  t9 = fc0002cc84cc  t10= 
> t11= 04c0  pv = fcb8bc10  at = 
> gp = fc00010f9fb8  sp = aefe3f8a
> Disabling lock debugging due to kernel taint
> Trace:
> [] napi_gro_receive+0x68/0x150
> [] receive_buf+0x50c/0x1b80
> [] virtnet_poll+0x1a8/0x5b0
> [] virtnet_poll+0x1dc/0x5b0
> [] __napi_poll+0x4c/0x270
> [] net_rx_action+0x130/0x2c0
> [] __qdisc_run+0x90/0x6c0
> [] do_softirq+0xa4/0xd0
> [] __local_bh_enable_ip+0x114/0x120
> [] __dev_queue_xmit+0x484/0xa60
> [] packet_sendmsg+0xe7c/0x1ba0
> [] __sys_sendto+0xf8/0x170
> [] __d_alloc+0x40/0x270
> [] packet_create+0x17c/0x3c0
> [] move_addr_to_kernel+0x3c/0x60
> [] __sys_sendto+0xa4/0x170
> [] sys_sendto+0x24/0x40
> [] sys_bind+0x20/0x40
> [] entSys+0xa4/0xc0
> 
> Bisect log attached.
> 
> Guenter
> 
> ---
> # bad: [50b8b1d699ac313c0a07a3c185ffb23aecab8abb] Add linux-next specific 
> files for 20210419
> # good: [bf05bf16c76bb44ab5156223e1e58e26dfe30a88] Linux 5.12-rc8
> git bisect start 'HEAD' 'v5.12-rc8'
> # bad: [c4bb91fc07e59241cde97f913d7a2fbedc248f0d] Merge remote-tracking 
> branch 'crypto/master'
> git bisect bad c4bb91fc07e59241cde97f913d7a2fbedc248f0d
> # good: [499f739ad70f2a58aac985dceb25ca7666da88be] Merge remote-tracking 
> branch 'jc_docs/docs-next'
> git bisect good 499f739ad70f2a58aac985dceb25ca7666da88be
> # good: [17e1be342d46eb0b7c3df4c7e623493483080b63] bnxt_en: Treat health 
> register value 0 as valid in bnxt_try_reover_fw().
> git bisect good 17e1be342d46eb0b7c3df4c7e623493483080b63
> # good: [cf6d6925625755029cdf4bb0d0028f0b6e713242] Merge remote-tracking 
> branch 'rdma/for-next'
> git bisect good cf6d6925625755029cdf4bb0d0028f0b6e713242
> # good: [fb8517f4fade44fa5e42e29ca4d6e4a7ed50b512] rtw88: 8822c: add CFO 
> tracking
> git bisect good fb8517f4fade44fa5e42e29ca4d6e4a7ed50b512
> # bad: [d168b61fb769d10306b6118ec7623d2911d45690] Merge remote-tracking 
> branch 'gfs2/for-next'
> git bisect bad d168b61fb769d10306b6118ec7623d2911d45690
> # bad: [ee3e875f10fca68fb7478c23c75b553e56da319c] net: enetc: increase TX 
> ring size
> git bisect bad ee3e875f10fca68fb7478c23c75b553e56da319c
> # good: [4a51b0e8a0143b0e83d51d9c58c6416c3818a9f2] r8152: support PHY 
> firmware for RTL8156 series
> git bisect good 4a51b0e8a0143b0e83d51d9c58c6416c3818a9f2
> # bad: [03e481e88b194296defdff3600b2fcebb04bd6cf] Merge tag 
> 'mlx5-updates-2021-04-16' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
> git bisect bad 03e481e88b194296defdff3600b2fcebb04bd6cf
> # bad: [70c183759b2cece2f9ba82e63e38fa32bebc9db2] Merge branch 
> 'gianfar-mq-polling'
> git bisect bad 70c183759b2cece2f9ba82e63e38fa32bebc9db2
> # bad: [d8604b209e9b3762280b8321162f0f64219d51c9] dt-bindings: net: qcom,ipa: 
> add firmware-name property
> git bisect bad d8604b209e9b3762280b8321162f0f64219d51c9
> # good: [4ad29b1a484e0c58acfffdcd87172ed17f35c1dd] net: mvpp2: Add parsing 
> support for different IPv4 

Re: [PATCH net-next v3] virtio-net: page_to_skb() use build_skb when there's sufficient tailroom

2021-04-20 Thread Eric Dumazet



On 4/16/21 11:16 AM, Xuan Zhuo wrote:
> In page_to_skb(), if we have enough tailroom to save skb_shared_info, we
> can use build_skb to create skb directly. No need to alloc for
> additional space. And it can save a 'frags slot', which is very friendly
> to GRO.
> 
> Here, if the payload of the received package is too small (less than
> GOOD_COPY_LEN), we still choose to copy it directly to the space got by
> napi_alloc_skb. So we can reuse these pages.
> 
> Testing Machine:
> The four queues of the network card are bound to the cpu1.
> 
> Test command:
> for ((i=0;i<5;++i)); do sockperf tp --ip 192.168.122.64 -m 1000 -t 150& 
> done
> 
> The size of the udp package is 1000, so in the case of this patch, there
> will always be enough tailroom to use build_skb. The sent udp packet
> will be discarded because there is no port to receive it. The irqsoftd
> of the machine is 100%, we observe the received quantity displayed by
> sar -n DEV 1:
> 
> no build_skb:  956864.00 rxpck/s
> build_skb:1158465.00 rxpck/s
> 
> Signed-off-by: Xuan Zhuo 
> Suggested-by: Jason Wang 
> ---
> 
> v3: fix the truesize when headroom > 0
> 
> v2: conflict resolution
> 
>  drivers/net/virtio_net.c | 69 
>  1 file changed, 48 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 101659cd4b87..8cd76037c724 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -379,21 +379,17 @@ static struct sk_buff *page_to_skb(struct virtnet_info 
> *vi,
>  struct receive_queue *rq,
>  struct page *page, unsigned int offset,
>  unsigned int len, unsigned int truesize,
> -bool hdr_valid, unsigned int metasize)
> +bool hdr_valid, unsigned int metasize,
> +unsigned int headroom)
>  {
>   struct sk_buff *skb;
>   struct virtio_net_hdr_mrg_rxbuf *hdr;
>   unsigned int copy, hdr_len, hdr_padded_len;
> - char *p;
> + int tailroom, shinfo_size;
> + char *p, *hdr_p;
> 
>   p = page_address(page) + offset;
> -
> - /* copy small packet so we can reuse these pages for small data */
> - skb = napi_alloc_skb(>napi, GOOD_COPY_LEN);
> - if (unlikely(!skb))
> - return NULL;
> -
> - hdr = skb_vnet_hdr(skb);
> + hdr_p = p;
> 
>   hdr_len = vi->hdr_len;
>   if (vi->mergeable_rx_bufs)
> @@ -401,14 +397,38 @@ static struct sk_buff *page_to_skb(struct virtnet_info 
> *vi,
>   else
>   hdr_padded_len = sizeof(struct padded_vnet_hdr);
> 
> - /* hdr_valid means no XDP, so we can copy the vnet header */
> - if (hdr_valid)
> - memcpy(hdr, p, hdr_len);
> + /* If headroom is not 0, there is an offset between the beginning of the
> +  * data and the allocated space, otherwise the data and the allocated
> +  * space are aligned.
> +  */
> + if (headroom) {
> + /* The actual allocated space size is PAGE_SIZE. */
> + truesize = PAGE_SIZE;
> + tailroom = truesize - len - offset;
> + } else {
> + tailroom = truesize - len;
> + }
> 
>   len -= hdr_len;
>   offset += hdr_padded_len;
>   p += hdr_padded_len;
> 
> + shinfo_size = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> +
> + if (len > GOOD_COPY_LEN && tailroom >= shinfo_size) {
> + skb = build_skb(p, truesize);
> + if (unlikely(!skb))
> + return NULL;
> +
> + skb_put(skb, len);
> + goto ok;
> + }
> +
> + /* copy small packet so we can reuse these pages for small data */
> + skb = napi_alloc_skb(>napi, GOOD_COPY_LEN);
> + if (unlikely(!skb))
> + return NULL;
> +
>   /* Copy all frame if it fits skb->head, otherwise
>* we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
>*/
> @@ -418,11 +438,6 @@ static struct sk_buff *page_to_skb(struct virtnet_info 
> *vi,
>   copy = ETH_HLEN + metasize;
>   skb_put_data(skb, p, copy);
> 
> - if (metasize) {
> - __skb_pull(skb, metasize);
> - skb_metadata_set(skb, metasize);
> - }
> -
>   len -= copy;
>   offset += copy;
> 
> @@ -431,7 +446,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info 
> *vi,
>   skb_add_rx_frag(skb, 0, page, offset, len, truesize);
>   else
>   put_page(page);

Here the struct page has been freed..

> - return skb;
> + goto ok;
>   }
> 
>   /*
> @@ -458,6 +473,18 @@ static struct sk_buff *page_to_skb(struct virtnet_info 
> *vi,
>   if (page)
>   give_pages(rq, page);
> 
> +ok:
> + /* hdr_valid means no XDP, so we can copy the vnet header */
> + if (hdr_valid) {

Re: [PATCH v4 0/9] drm: Support simple-framebuffer devices and firmware fbs

2021-04-20 Thread Geert Uytterhoeven
Hi Gerd,

On Tue, Apr 20, 2021 at 11:22 AM Gerd Hoffmann  wrote:
> > > > Patches 4 to 8 add the simpledrm driver. It's build on simple DRM 
> > > > helpers
> > > > and SHMEM. It supports 16-bit, 24-bit and 32-bit RGB framebuffers. 
> > > > During
> > >
> > >  if support for 8-bit frame buffers would be added?
> >
> > Is that 8-bit greyscale or 8-bit indexed with 256 entry palette? Former
> > shouldn't be a big thing, but the latter is only really supported by the
> > overall drm ecosystem in theory. Most userspace assumes that xrgb
> > works, and we keep that illusion up by emulating it in kernel for hw which
> > just doesn't support it. But reformatting xrgb to c8 is tricky at
> > best.
>
> Well.  cirrus converts xrgb on the fly to rgb888 or rgb565
> (depending on display resolution).  We could pull off the same trick
> here and convert to rgb332 (assuming we can program the palette with the
> color cube needed for that).  Wouldn't look pretty, but would probably
> work better than expecting userspace know what color palettes are in
> 2021 ...

Yeah, I already had a similar idea for Amiga HAM ;-)

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 0/9] drm: Support simple-framebuffer devices and firmware fbs

2021-04-20 Thread Gerd Hoffmann
  Hi,

> > > Patches 4 to 8 add the simpledrm driver. It's build on simple DRM helpers
> > > and SHMEM. It supports 16-bit, 24-bit and 32-bit RGB framebuffers. During
> > 
> >  if support for 8-bit frame buffers would be added?
> 
> Is that 8-bit greyscale or 8-bit indexed with 256 entry palette? Former
> shouldn't be a big thing, but the latter is only really supported by the
> overall drm ecosystem in theory. Most userspace assumes that xrgb
> works, and we keep that illusion up by emulating it in kernel for hw which
> just doesn't support it. But reformatting xrgb to c8 is tricky at
> best.

Well.  cirrus converts xrgb on the fly to rgb888 or rgb565
(depending on display resolution).  We could pull off the same trick
here and convert to rgb332 (assuming we can program the palette with the
color cube needed for that).  Wouldn't look pretty, but would probably
work better than expecting userspace know what color palettes are in
2021 ...

take care,
  Gerd

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 0/9] drm: Support simple-framebuffer devices and firmware fbs

2021-04-20 Thread Geert Uytterhoeven
Hi Daniel,

On Tue, Apr 20, 2021 at 10:46 AM Daniel Vetter  wrote:
> On Mon, Apr 19, 2021 at 10:00:56AM +0200, Geert Uytterhoeven wrote:
> > On Fri, Apr 16, 2021 at 11:00 AM Thomas Zimmermann  
> > wrote:
> > > This patchset adds support for simple-framebuffer platform devices and
> > > a handover mechanism for native drivers to take-over control of the
> > > hardware.
> > >
> > > The new driver, called simpledrm, binds to a simple-frambuffer platform
> > > device. The kernel's boot code creates such devices for firmware-provided
> > > framebuffers, such as EFI-GOP or VESA. Typically the BIOS, UEFI or boot
> > > loader sets up the framebuffers. Description via device tree is also an
> > > option.
> >
> > I guess this can be used as a replacement for offb, too...
> >
> > > Patches 4 to 8 add the simpledrm driver. It's build on simple DRM helpers
> > > and SHMEM. It supports 16-bit, 24-bit and 32-bit RGB framebuffers. During
> >
> >  if support for 8-bit frame buffers would be added?
>
> Is that 8-bit greyscale or 8-bit indexed with 256 entry palette? Former

8-bit indexed with 256 entry palette.

> shouldn't be a big thing, but the latter is only really supported by the
> overall drm ecosystem in theory. Most userspace assumes that xrgb
> works, and we keep that illusion up by emulating it in kernel for hw which
> just doesn't support it. But reformatting xrgb to c8 is tricky at
> best. The uapis are all there for setting the palette, and C8 is a defined
> format even with atomic kms interface, but really there's not much
> userspace for it. In other words, it would work as well as current offb
> would, but that's at least that.

Oh, that's good to know!
Does this mean fbdev is not deprecated for anything <= C8? ;-)

A while ago, I was looking into converting an fbdev driver to drm, and
one of the things I ran into is lack of C4, C2, C1, Y8, Y4, Y2, and
monochrome support.  On top of that, lots of internal code seems to
assume pixels are never smaller than a byte (thus ignoring
char_per_block[]/block_w).  The lack of support for planar modes isn't
that bad, combined with the need for copying, as c2p conversion can be
done while copying, thus even making life easier for userspace
applications that can just always work on chunky data.
Then real work kicked in, before I got anything working...

Thanks!

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 0/9] drm: Support simple-framebuffer devices and firmware fbs

2021-04-20 Thread Daniel Vetter
On Mon, Apr 19, 2021 at 10:00:56AM +0200, Geert Uytterhoeven wrote:
> Hi Thomas,
> 
> On Fri, Apr 16, 2021 at 11:00 AM Thomas Zimmermann  
> wrote:
> > This patchset adds support for simple-framebuffer platform devices and
> > a handover mechanism for native drivers to take-over control of the
> > hardware.
> >
> > The new driver, called simpledrm, binds to a simple-frambuffer platform
> > device. The kernel's boot code creates such devices for firmware-provided
> > framebuffers, such as EFI-GOP or VESA. Typically the BIOS, UEFI or boot
> > loader sets up the framebuffers. Description via device tree is also an
> > option.
> 
> I guess this can be used as a replacement for offb, too...
> 
> > Patches 4 to 8 add the simpledrm driver. It's build on simple DRM helpers
> > and SHMEM. It supports 16-bit, 24-bit and 32-bit RGB framebuffers. During
> 
>  if support for 8-bit frame buffers would be added?

Is that 8-bit greyscale or 8-bit indexed with 256 entry palette? Former
shouldn't be a big thing, but the latter is only really supported by the
overall drm ecosystem in theory. Most userspace assumes that xrgb
works, and we keep that illusion up by emulating it in kernel for hw which
just doesn't support it. But reformatting xrgb to c8 is tricky at
best. The uapis are all there for setting the palette, and C8 is a defined
format even with atomic kms interface, but really there's not much
userspace for it. In other words, it would work as well as current offb
would, but that's at least that.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2] virtio_blk: Add support for lifetime feature

2021-04-20 Thread Christoph Hellwig
Just to despit my 2 cents again:  I think the way this is specified
in the virtio spec is actively harmful and we should not suport it in
Linux.

If others override me we at least need to require a detailed
documentation of these fields as the virto spec does not provide it.

Please also do not add pointless over 80 character lines, and follow
the one value per sysfs file rule.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization