Re: Security hole in vhost-vdpa?

2021-06-09 Thread Jason Wang


On 2021/6/10 12:30 PM, Michael S. Tsirkin wrote:

On Mon, Jun 07, 2021 at 10:10:03AM +0800, Jason Wang wrote:

On 2021/6/7 5:38 AM, Michael S. Tsirkin wrote:

On Sun, Jun 06, 2021 at 02:39:48PM +, Gautam Dawar wrote:

Hi All,


This is in continuation of my findings noted in Bug 213179 and the discussions we
have had over email in the last couple of weeks.


Today, I published the first patch for this issue, which adds a timeout-based wait
for the completion event and also logs a warning message to alert the user/
administrator to the problem.
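
(A minimal sketch of that approach, for readers of the archive; the completion
field name and the 5-second interval are illustrative, not necessarily what the
posted patch uses:)

/* Illustrative: wait for userspace to release the char device, but warn
 * periodically instead of blocking silently forever.
 */
static void vhost_vdpa_wait_for_close(struct vhost_vdpa *v, const char *name)
{
	/* Assumes a struct completion that is signalled from the char
	 * device's release() path.
	 */
	while (!wait_for_completion_timeout(&v->release_done,
					    msecs_to_jiffies(5000)))
		pr_warn("%s: vhost_vdpa_remove waiting for /dev/%s to be closed\n",
			name, name);
}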

Can't close() just finish without waiting for userspace?


It works as long as we don't use mmap(). When we mmap() the kick (doorbell) pages,
it looks to me like there's no way to "revoke" the mapping from userspace?

Thanks

Can't we track these mappings and map some other page there?
Likely no more than one is needed ...



I think we can, but if I understand the mm folks correctly, such
"revoking" is expected to be done only via munmap(). Doing that in the
kernel might be tricky and hard to get right.


If I'm wrong, we can try to do that and post an RFC to see if it works.
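
For what it's worth, a minimal sketch of what such an RFC might do, assuming the
doorbell pages are mapped through the char device's address_space and that handing
out a dummy page after revocation is acceptable (the function and variable names
below are illustrative, not existing vhost-vdpa code):

/* Illustrative only: revoke userspace mappings of the doorbell page so that
 * device removal does not have to wait for close().
 */
static struct page *vdpa_dummy_page;	/* hypothetical, allocated at init */

static vm_fault_t vhost_vdpa_revoked_fault(struct vm_fault *vmf)
{
	/* After revocation, serve faults from a harmless dummy page instead
	 * of the real doorbell.
	 */
	get_page(vdpa_dummy_page);
	vmf->page = vdpa_dummy_page;
	return 0;
}

static void vhost_vdpa_revoke_mmaps(struct address_space *mapping)
{
	/* Zap every PTE that still points at the doorbell; the next access
	 * faults and is resolved by the handler above.
	 */
	unmap_mapping_range(mapping, 0, 0, 1);
}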

Thanks

(Note that this "security hole" is not vDPA specific; VFIO and other
userspace driver subsystems may have a similar "issue".)


Thanks







Then notify userspace about any buffers that did not complete ...



As a next step, the idea is to implement a mechanism to allow the vhost-vdpa module
to notify the userspace app (QEMU) to close the fd corresponding to the vhost-vdpa
character device while it is waiting for the completion event in
vhost_vdpa_remove(). Jason confirmed this by saying that we need a new eventfd/
ioctl to receive the hot-remove request from the kernel.
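
(As an editor's illustration only: one shape such a mechanism could take is an
eventfd registered through a new ioctl and signalled by the kernel at remove time.
Neither the ioctl nor the helper below exists today; both are made up for the
sketch.)

/* Hypothetical uapi addition: userspace passes an eventfd that the kernel
 * signals when the underlying vDPA device is being hot-removed.
 */
#define VHOST_VDPA_SET_REMOVE_EVENTFD	_IOW(VHOST_VIRTIO, 0x7f, int)

/* Hypothetical kernel side: called from vhost_vdpa_remove() so that QEMU
 * knows it must close the device fd.
 */
static void vhost_vdpa_notify_remove(struct eventfd_ctx *remove_ctx)
{
	if (remove_ctx)
		eventfd_signal(remove_ctx, 1);
}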


Although we can proceed to implement the changes described above, I feel that
the problem is deeper than that. This mechanism will just request that userspace
close the fd and let vhost-vdpa proceed with the clean-up. However, IMHO things
should be under more control of kernel space than of user space.


The problem I am trying to highlight is that a malicious user-space application
can cause any module that registers a vDPA device to hang in its
de-initialization sequence. This will typically surface when
vdpa_device_unregister() is called from the function responsible for module
unload, causing the rmmod command to never return.


To prove my point, I created a simple C program (test_vdpa.c) that opens the
vhost-vdpa character device and never exits. The logs (test_logs.txt) show that
after the vDPA device is registered by the sfc driver, the vhost-vdpa module
creates the char device /dev/vhost-vdpa-0 for it. As this is available to all
apps in userspace, the malicious app (./block_vdpa_unload) opens this device and
goes into an infinite sleep. When module unload (rmmod sfc) is then invoked, it
hangs, and the following print informs the user/admin of this state:

[ 8180.053647]  vhost-vdpa-0: vhost_vdpa_remove waiting for /dev/vhost-vdpa-0
to be closed


Finally, when block_vdpa_unload is killed, vhost_vdpa_remove() unblocks and the sfc
module is unloaded.


With such an application running in userspace, a kernel module (the one that
registered the corresponding vDPA device) will hang during its unload sequence.
A userspace application having this kind of control over system resources should
certainly be prevented.

To me, this seems to be a serious issue and requires changes to the way
it is currently handled in vhost-vdpa (and in other modules (VFIO?) with a similar
implementation).

Let me know what you think.


Regards,

Gautam Dawar

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
 unsigned int index;
 char dev_path[30];
 int fd;

 if (argc != 2) {
   printf("Usage: %s \n", argv[0]);
   return -1;
 }

 index = strtoul(argv[1], NULL, 10);

 snprintf(dev_path, sizeof(dev_path), "/dev/vhost-vdpa-%u", index);
 fd = open(dev_path, O_RDWR);
 if(fd < 0)
 {
 printf("Failed to open %s, errno: %d!\n", dev_path, errno);
 return 1;
 }

 printf("Blocking unload of driver that registered vDPA device"
  " corresponding to cdev %s created by vhost-vdpa\n", dev_path);
 while (1)
   sleep(1);

 close(fd);
 return 0;
}
[root@ndr730p ~]# ~/create_vdpa_device.sh

[root@ndr730p ~]# ll /dev/vhost-vdpa-0
crw--- 1 root root 240, 0 Jun  6 19:59 /dev/vhost-vdpa-0

[root@ndr730p ~]# ./block_vdpa_unload 0 &
[1] 10930
Blocking unload of driver that registered vDPA device corresponding to cdev 
/dev/vhost-vdpa-0 created by vhost-vdpa

[root@ndr730p ~]# rmmod sfc
[ 8179.010520] sfc_ef100 :06:00.4: ef100_vdpa_delete: Calling vdpa 
unregister device
[ 8180.053647]  vhost-vdpa-0: vhost_vdpa_remove waiting for /dev/vhost-vdpa-0 
to be closed

[root@ndr730p ~]# kill -9 10930
[ 8218.392897] sfc_ef100 :06:00.0: shutdown successful




Re: Security hole in vhost-vdpa?

2021-06-09 Thread Michael S. Tsirkin
On Mon, Jun 07, 2021 at 10:10:03AM +0800, Jason Wang wrote:
> 
> On 2021/6/7 5:38 AM, Michael S. Tsirkin wrote:
> > On Sun, Jun 06, 2021 at 02:39:48PM +, Gautam Dawar wrote:
> > > Hi All,
> > > 
> > > 
> > > This is in continuation to my findings noted in Bug 213179 and 
> > > discussions we
> > > have had in the last couple of weeks over emails.
> > > 
> > > 
> > > Today, I published the first patch for this issue which adds timeout 
> > > based wait
> > > for completion event and also logs a warning message to alert the user/
> > > administrator of the problem.
> > Can't close just finish without waiting for userspace?
> 
> 
> It works as long as we don't use mmap(). When we map kicks, it looks to me
> there's no way to "revoke" the mapping from userspace?
> 
> Thanks

Can't we track these mappings and map some other page there?
Likely no more than one is needed ...



> 
> > Then notify userspace about any buffers that did not complete ...
> > 
> > 
> > > As a next step, the idea is to implement a mechanism to allow vhost-vdpa 
> > > module
> > > notify userspace app (QEMU) to close the fd corresponding to the 
> > > vhost-vdpa
> > > character device when it is waiting for the completion event in
> > > vhost_vdpa_remove(). Jason confirmed this by saying that we need a new 
> > > eventfd/
> > > ioctl to receive hot remove request from kernel.
> > > 
> > > 
> > > Although, we can proceed to implement changes for the part described 
> > > above but
> > > I feel that that the problem is much deeper than that. This mechanism 
> > > will just
> > > request the userspace to close the fd and let vhost-vdpa proceed with the
> > > clean-up. However, IMHO things should be under more control of kernel 
> > > space
> > > than the user space.
> > > 
> > > 
> > > The problem I am trying to highlight is that a malicious user-space 
> > > application
> > > can render any module registering a vDPA device to hang in their
> > > de-initialization sequence. This will typically surface when
> > > vdpa_device_unregister() is called from the function responsible for 
> > > module
> > > unload leading rmmod commands to not return, forever.
> > > 
> > > 
> > > To prove my point, I created a simple C program (test_vdpa.c) that opens 
> > > the
> > > vhost-vdpa character device and never exits. The logs (test_logs.txt) 
> > > show that
> > > after registering the vDPA device from sfc driver, vhost-vdpa module 
> > > creates
> > > the char device /dev/vhost-vdpa-0  for it. As this is available to all 
> > > apps in
> > > the userspace, the malicious app (./block_vdpa_unload) opens this device 
> > > and
> > > goes to infinite sleep. At this time, when module unload (rmmod sfc) is 
> > > called,
> > > it hangs and the following print informs the user/admin of this state with
> > > following message:
> > > 
> > > [ 8180.053647]  vhost-vdpa-0: vhost_vdpa_remove waiting for 
> > > /dev/vhost-vdpa-0
> > > to be closed
> > > 
> > > 
> > > Finally, when block_vdpa_unload is killed, vhost_vdpa_remove() unblocks 
> > > and sfc
> > > module is unloaded.
> > > 
> > > 
> > > With such application running in userspace, a kernel module (that 
> > > registered
> > > corresponding vDPA device) will hang during unload sequence. Such control 
> > > of
> > > the userspace application on the system resources should certainly be
> > > prevented.
> > > 
> > > To me, this seems to be a serious issue and requires modifications in the 
> > > way
> > > it is currently handled in vhost-vdpa (and other modules (VFIO?) with 
> > > similar
> > > implementation).
> > > 
> > > Let me know what you think.
> > > 
> > > 
> > > Regards,
> > > 
> > > Gautam Dawar
> > > 
> > > #include 
> > > #include 
> > > #include 
> > > #include 
> > > #include 
> > > #include 
> > > 
> > > int main(int argc, char **argv)
> > > {
> > > unsigned int index;
> > > char dev_path[30];
> > > int fd;
> > > 
> > > if (argc != 2) {
> > >  printf("Usage: %s \n", argv[0]);
> > >  return -1;
> > > }
> > > 
> > > index = strtoul(argv[1], NULL, 10);
> > > 
> > > snprintf(dev_path, sizeof(dev_path), "/dev/vhost-vdpa-%u", index);
> > > fd = open(dev_path, O_RDWR);
> > > if(fd < 0)
> > > {
> > > printf("Failed to open %s, errno: %d!\n", dev_path, errno);
> > > return 1;
> > > }
> > > 
> > > printf("Blocking unload of driver that registered vDPA device"
> > > " corresponding to cdev %s created by vhost-vdpa\n", dev_path);
> > > while (1)
> > >  sleep(1);
> > > 
> > > close(fd);
> > > return 0;
> > > }
> > > [root@ndr730p ~]# ~/create_vdpa_device.sh
> > > 
> > > [root@ndr730p ~]# ll /dev/vhost-vdpa-0
> > > crw--- 1 root root 240, 0 Jun  6 19:59 /dev/vhost-vdpa-0
> > > 
> > > [root@ndr730p ~]# ./block_vdpa_unload 0 &
> > > [1] 10930
> > > Blocking unload of driver that registered vDPA device corresponding to 
> > > cdev /dev/vhost-vdpa-0 created by vhost-vdpa
> > > 
> > > [root@ndr730p ~]# rmmod sfc
> > > [ 

Re: [RFC v1 0/6] virtio/vsock: introduce SOCK_DGRAM support

2021-06-09 Thread Jason Wang


On 2021/6/10 11:43 AM, Jiang Wang wrote:

On Wed, Jun 9, 2021 at 6:51 PM Jason Wang  wrote:


On 2021/6/10 7:24 AM, Jiang Wang wrote:

This patchset implements support of SOCK_DGRAM for virtio
transport.

Datagram sockets are connectionless and unreliable. To avoid unfair contention
with stream and other sockets, add two more virtqueues and
a new feature bit to indicate if those two new queues exist or not.

Dgram does not use the existing credit update mechanism for
stream sockets. When sending from the guest/driver, packets are sent
synchronously, so the sender will get an error when the virtqueue is full.
When sending from the host/device, packets are sent asynchronously
because the descriptor memory belongs to the corresponding QEMU
process.


What's the use case for the datagram vsock?


One use case is for non critical info logging from the guest
to the host, such as the performance data of some applications.



Anything that prevents you from using the stream socket?




It can also be used to replace UDP communications between
the guest and the host.



Any advantage for VSOCK in this case? Is it for performance? (I guess not,
since I don't expect vsock to be faster.)


An obvious drawback is that it breaks migration. With UDP you get very
rich feature support from the kernel, which vsock can't match.






The virtio spec patch is here:
https://www.spinics.net/lists/linux-virtualization/msg50027.html


Having had a quick glance, I suggest splitting the mergeable rx buffer
support into a separate patch.

Sure.


But I think it's time to revisit the idea of unifying the virtio-net and
virtio-vsock. Otherwise we're duplicating features and bugs.

For the mergeable rxbuf related code, I think a set of common helper
functions can be used by both virtio-net and virtio-vsock. For other
parts, that may not be very beneficial. I will think about it more.

If there is a previous email discussion about this topic, could you send me
some links? I did a quick web search but did not find any related
info. Thanks.



We had a lot:

[1] 
https://patchwork.kernel.org/project/kvm/patch/5bdff537.3050...@huawei.com/
[2] 
https://lists.linuxfoundation.org/pipermail/virtualization/2018-November/039798.html

[3] https://www.lkml.org/lkml/2020/1/16/2043

Thanks




Thanks



For those who prefer git repo, here is the link for the linux kernel:
https://github.com/Jiang1155/linux/tree/vsock-dgram-v1

qemu patch link:
https://github.com/Jiang1155/qemu/tree/vsock-dgram-v1


To do:
1. use skb when receiving packets
2. support multiple transport
3. support mergeable rx buffer


Jiang Wang (6):
virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
virtio/vsock: add support for virtio datagram
vhost/vsock: add support for vhost dgram.
vsock_test: add tests for vsock dgram
vhost/vsock: add kconfig for vhost dgram support
virtio/vsock: add sysfs for rx buf len for dgram

   drivers/vhost/Kconfig  |   8 +
   drivers/vhost/vsock.c  | 207 --
   include/linux/virtio_vsock.h   |   9 +
   include/net/af_vsock.h |   1 +
   .../trace/events/vsock_virtio_transport_common.h   |   5 +-
   include/uapi/linux/virtio_vsock.h  |   4 +
   net/vmw_vsock/af_vsock.c   |  12 +
   net/vmw_vsock/virtio_transport.c   | 433 
++---
   net/vmw_vsock/virtio_transport_common.c| 184 -
   tools/testing/vsock/util.c | 105 +
   tools/testing/vsock/util.h |   4 +
   tools/testing/vsock/vsock_test.c   | 195 ++
   12 files changed, 1070 insertions(+), 97 deletions(-)




Re: Re: [RFC v1 0/6] virtio/vsock: introduce SOCK_DGRAM support

2021-06-09 Thread Jiang Wang .
On Wed, Jun 9, 2021 at 6:51 PM Jason Wang  wrote:
>
>
> On 2021/6/10 7:24 AM, Jiang Wang wrote:
> > This patchset implements support of SOCK_DGRAM for virtio
> > transport.
> >
> > Datagram sockets are connectionless and unreliable. To avoid unfair 
> > contention
> > with stream and other sockets, add two more virtqueues and
> > a new feature bit to indicate if those two new queues exist or not.
> >
> > Dgram does not use the existing credit update mechanism for
> > stream sockets. When sending from the guest/driver, sending packets
> > synchronously, so the sender will get an error when the virtqueue is full.
> > When sending from the host/device, send packets asynchronously
> > because the descriptor memory belongs to the corresponding QEMU
> > process.
>
>
> What's the use case for the datagram vsock?
>
One use case is for non critical info logging from the guest
to the host, such as the performance data of some applications.

It can also be used to replace UDP communications between
the guest and the host.

> >
> > The virtio spec patch is here:
> > https://www.spinics.net/lists/linux-virtualization/msg50027.html
>
>
> Have a quick glance, I suggest to split mergeable rx buffer into an
> separate patch.

Sure.

> But I think it's time to revisit the idea of unifying the virtio-net and
> virtio-vsock. Otherwise we're duplicating features and bugs.

For the mergeable rxbuf related code, I think a set of common helper
functions can be used by both virtio-net and virtio-vsock. For other
parts, that may not be very beneficial. I will think about it more.

If there is a previous email discussion about this topic, could you send me
some links? I did a quick web search but did not find any related
info. Thanks.

> Thanks
>
>
> >
> > For those who prefer git repo, here is the link for the linux kernel:
> > https://github.com/Jiang1155/linux/tree/vsock-dgram-v1
> >
> > qemu patch link:
> > https://github.com/Jiang1155/qemu/tree/vsock-dgram-v1
> >
> >
> > To do:
> > 1. use skb when receiving packets
> > 2. support multiple transport
> > 3. support mergeable rx buffer
> >
> >
> > Jiang Wang (6):
> >virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
> >virtio/vsock: add support for virtio datagram
> >vhost/vsock: add support for vhost dgram.
> >vsock_test: add tests for vsock dgram
> >vhost/vsock: add kconfig for vhost dgram support
> >virtio/vsock: add sysfs for rx buf len for dgram
> >
> >   drivers/vhost/Kconfig  |   8 +
> >   drivers/vhost/vsock.c  | 207 --
> >   include/linux/virtio_vsock.h   |   9 +
> >   include/net/af_vsock.h |   1 +
> >   .../trace/events/vsock_virtio_transport_common.h   |   5 +-
> >   include/uapi/linux/virtio_vsock.h  |   4 +
> >   net/vmw_vsock/af_vsock.c   |  12 +
> >   net/vmw_vsock/virtio_transport.c   | 433 
> > ++---
> >   net/vmw_vsock/virtio_transport_common.c| 184 -
> >   tools/testing/vsock/util.c | 105 +
> >   tools/testing/vsock/util.h |   4 +
> >   tools/testing/vsock/vsock_test.c   | 195 ++
> >   12 files changed, 1070 insertions(+), 97 deletions(-)
> >
>

Re: [External] Re: [RFC v4] virtio-vsock: add description for datagram type

2021-06-09 Thread Jiang Wang .
On Wed, Jun 9, 2021 at 12:17 AM Stefano Garzarella  wrote:
>
> On Tue, Jun 08, 2021 at 09:22:26PM -0700, Jiang Wang . wrote:
> >On Tue, Jun 8, 2021 at 6:46 AM Stefano Garzarella  
> >wrote:
> >>
> >> On Fri, May 28, 2021 at 04:01:18AM +, Jiang Wang wrote:
> >> >From: "jiang.wang" 
> >> >
> >> >Add supports for datagram type for virtio-vsock. Datagram
> >> >sockets are connectionless and unreliable. To avoid contention
> >> >with stream and other sockets, add two more virtqueues and
> >> >a new feature bit to identify if those two new queues exist or not.
> >> >
> >> >Also add descriptions for resource management of datagram, which
> >> >does not use the existing credit update mechanism associated with
> >> >stream sockets.
> >> >
> >> >Signed-off-by: Jiang Wang 
> >> >---
> >> >
> >> >V2: addressed the comments for the previous version.
> >> >V3: add description for the mergeable receive buffer.
> >> >V4: add a feature bit for stream and reserver a bit for seqpacket.
> >> >Fix mrg_rxbuf related sentences.
> >> >
> >> > virtio-vsock.tex | 155 
> >> > ++-
> >> > 1 file changed, 142 insertions(+), 13 deletions(-)
> >> >
> >> >diff --git a/virtio-vsock.tex b/virtio-vsock.tex
> >> >index da7e641..bacac3c 100644
> >> >--- a/virtio-vsock.tex
> >> >+++ b/virtio-vsock.tex
> >> >@@ -9,14 +9,41 @@ \subsection{Device ID}\label{sec:Device Types / Socket 
> >> >Device / Device ID}
> >> >
> >> > \subsection{Virtqueues}\label{sec:Device Types / Socket Device / 
> >> > Virtqueues}
> >> > \begin{description}
> >> >-\item[0] rx
> >> >-\item[1] tx
> >> >+\item[0] stream rx
> >> >+\item[1] stream tx
> >> >+\item[2] datagram rx
> >> >+\item[3] datagram tx
> >> >+\item[4] event
> >>
> >> Is there a particular reason to always have the event queue as the last
> >> one?
> >>
> >> Maybe it's better to add the datagram queues at the bottom, so the first
> >> 3 queues are always the same.
> >>
> >I am not sure. I think Linux kernel should be fine with what you described.
> >But I am not sure about QEMU. From the code, I see virtqueue is allocated
> >as an array, like following,
> >
> >+ #ifdef CONFIG_VHOST_VSOCK_DGRAM
> >+struct vhost_virtqueue vhost_vqs[4];
> >+ #else
> >struct vhost_virtqueue vhost_vqs[2];
> >+ #endi
>
> I see, also vhost_dev_init() requires an array, so I agree that this is
> the best approach, sorry for the noise.
>
> Just to be sure to check that anything is working if
> CONFIG_VHOST_VSOCK_DGRAM is defined, but the guest has an old driver
> that doesn't support DGRAM, and viceversa.

Sure. I just want to mention that QEMU should be consistent
with the device (host). If QEMU enables CONFIG_VHOST_VSOCK_DGRAM,
the device also needs to enable a similar option. Then the driver can
be either an old or a new version.

> >
> >so I assume the virtqueues for tx/rx should be
> >continuous? I can try to put the new queues at the end and see if it
> >works or not.
> >
> >btw, my qemu change is here:
> >https://github.com/Jiang1155/qemu/commit/6307aa7a0c347905a31f3ca6577923e2f6dd9d84
> >
> >> >+\end{description}
> >> >+The virtio socket device uses 5 queues if feature bit 
> >> >VIRTIO_VSOCK_F_DRGAM is set. Otherwise, it
> >> >+only uses 3 queues, as the following.
> >> >+
> >> >+\begin{description}
> >> >+\item[0] stream rx
> >> >+\item[1] stream tx
> >> > \item[2] event
> >> > \end{description}
> >> >
> >> >+When behavior differs between stream and datagram rx/tx virtqueues
> >> >+their full names are used. Common behavior is simply described in
> >> >+terms of rx/tx virtqueues and applies to both stream and datagram
> >> >+virtqueues.
> >> >+
> >> > \subsection{Feature bits}\label{sec:Device Types / Socket Device / 
> >> > Feature bits}
> >> >
> >> >-There are currently no feature bits defined for this device.
> >> >+\begin{description}
> >> >+\item[VIRTIO_VSOCK_F_STREAM (0)] Device has support for stream socket 
> >> >type.
> >> >+\end{description}
> >> >+
> >> >+\begin{description}
> >> >+\item[VIRTIO_VSOCK_F_DGRAM (2)] Device has support for datagram socket
> >> >type.
> >> >+\end{description}
> >> >+
> >> >+\begin{description}
> >> >+\item[VIRTIO_VSOCK_F_MRG_RXBUF (3)] Driver can merge receive buffers.
> >> >+\end{description}
> >> >+
> >> >+If no feature bits are defined, then assume only VIRTIO_VSOCK_F_STREAM
> >> >is set.
> >>
> >> I'd say more like socket streams are supported, without reference to the
> >> feature bit, something like: "If no feature bits are defined, then
> >> assume device only supports stream socket type."
> >>
> >OK.
> >
> >> >
> >> > \subsection{Device configuration layout}\label{sec:Device Types / Socket 
> >> > Device / Device configuration layout}
> >> >
> >> >@@ -64,6 +91,8 @@ \subsection{Device Operation}\label{sec:Device Types / 
> >> >Socket Device / Device Op
> >> >
> >> > Packets transmitted or received contain a header before the payload:
> >> >
> >> >+If feature VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, use the following 

Re: [PATCH 0/7] Do not read from descriptor ring

2021-06-09 Thread Jason Wang


On 2021/6/9 12:24 AM, Andy Lutomirski wrote:

On 6/3/21 10:53 PM, Jason Wang wrote:

Hi:

The virtio driver should not trust the device. This became more urgent
for the case of encrypted VMs or VDUSE[1]. In both cases, technology
like swiotlb/IOMMU is used to prevent the device from poking at or
mangling memory. But this is not sufficient, since the current virtio
driver may trust what is stored in the descriptor table (coherent
mapping) when performing DMA operations like unmap and bounce, so
the device may choose to exploit the behaviour of swiotlb to perform
attacks[2].

Based on a quick skim, this looks entirely reasonable to me.

(I'm not a virtio maintainer or expert.  I got my hands very dirty with
virtio once dealing with the DMA mess, but that's about it.)

--Andy



Good to know that :)

Thanks
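
(Editor's note for readers of the archive: a hedged illustration of the hardening
pattern the cover letter describes. The struct and function below are illustrative,
not the actual vring implementation.)

/* Keep a driver-private shadow of what was programmed into each descriptor,
 * and consult only that shadow when unmapping, never the device-writable
 * descriptor ring itself.
 */
struct desc_shadow {
	dma_addr_t addr;	/* as originally mapped by the driver */
	u32 len;
	bool to_device;
};

static void unmap_from_shadow(struct device *dev, const struct desc_shadow *s)
{
	/* A malicious device may have rewritten the in-ring descriptor to
	 * redirect a swiotlb bounce; the shadow cannot be tampered with.
	 */
	dma_unmap_single(dev, s->addr, s->len,
			 s->to_device ? DMA_TO_DEVICE : DMA_FROM_DEVICE);
}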


Re: [RFC v1 0/6] virtio/vsock: introduce SOCK_DGRAM support

2021-06-09 Thread Jason Wang


On 2021/6/10 7:24 AM, Jiang Wang wrote:

This patchset implements support of SOCK_DGRAM for virtio
transport.

Datagram sockets are connectionless and unreliable. To avoid unfair contention
with stream and other sockets, add two more virtqueues and
a new feature bit to indicate if those two new queues exist or not.

Dgram does not use the existing credit update mechanism for
stream sockets. When sending from the guest/driver, packets are sent
synchronously, so the sender will get an error when the virtqueue is full.
When sending from the host/device, packets are sent asynchronously
because the descriptor memory belongs to the corresponding QEMU
process.



What's the use case for the datagram vsock?




The virtio spec patch is here:
https://www.spinics.net/lists/linux-virtualization/msg50027.html



Having had a quick glance, I suggest splitting the mergeable rx buffer
support into a separate patch.


But I think it's time to revisit the idea of unifying the virtio-net and 
virtio-vsock. Otherwise we're duplicating features and bugs.


Thanks




For those who prefer git repo, here is the link for the linux kernel:
https://github.com/Jiang1155/linux/tree/vsock-dgram-v1

qemu patch link:
https://github.com/Jiang1155/qemu/tree/vsock-dgram-v1


To do:
1. use skb when receiving packets
2. support multiple transport
3. support mergeable rx buffer


Jiang Wang (6):
   virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
   virtio/vsock: add support for virtio datagram
   vhost/vsock: add support for vhost dgram.
   vsock_test: add tests for vsock dgram
   vhost/vsock: add kconfig for vhost dgram support
   virtio/vsock: add sysfs for rx buf len for dgram

  drivers/vhost/Kconfig  |   8 +
  drivers/vhost/vsock.c  | 207 --
  include/linux/virtio_vsock.h   |   9 +
  include/net/af_vsock.h |   1 +
  .../trace/events/vsock_virtio_transport_common.h   |   5 +-
  include/uapi/linux/virtio_vsock.h  |   4 +
  net/vmw_vsock/af_vsock.c   |  12 +
  net/vmw_vsock/virtio_transport.c   | 433 ++---
  net/vmw_vsock/virtio_transport_common.c| 184 -
  tools/testing/vsock/util.c | 105 +
  tools/testing/vsock/util.h |   4 +
  tools/testing/vsock/vsock_test.c   | 195 ++
  12 files changed, 1070 insertions(+), 97 deletions(-)




[RFC v1 6/6] virtio/vsock: add sysfs for rx buf len for dgram

2021-06-09 Thread Jiang Wang
Make rx buf len configurable via sysfs

Signed-off-by: Jiang Wang 
---
 net/vmw_vsock/virtio_transport.c | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index cf47aadb0c34..2e4dd9c48472 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -29,6 +29,14 @@ static struct virtio_vsock __rcu *the_virtio_vsock;
 static struct virtio_vsock *the_virtio_vsock_dgram;
 static DEFINE_MUTEX(the_virtio_vsock_mutex); /* protects the_virtio_vsock */
 
+static int rx_buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
+static struct kobject *kobj_ref;
+static ssize_t  sysfs_show(struct kobject *kobj,
+   struct kobj_attribute *attr, char *buf);
+static ssize_t  sysfs_store(struct kobject *kobj,
+   struct kobj_attribute *attr, const char *buf, size_t 
count);
+static struct kobj_attribute rxbuf_attr = __ATTR(rx_buf_value, 0660, 
sysfs_show, sysfs_store);
+
 struct virtio_vsock {
struct virtio_device *vdev;
struct virtqueue **vqs;
@@ -360,7 +368,7 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk)
 
 static void virtio_vsock_rx_fill(struct virtio_vsock *vsock, bool is_dgram)
 {
-   int buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
+   int buf_len = rx_buf_len;
struct virtio_vsock_pkt *pkt;
struct scatterlist hdr, buf, *sgs[2];
struct virtqueue *vq;
@@ -1003,6 +1011,22 @@ static struct virtio_driver virtio_vsock_driver = {
.remove = virtio_vsock_remove,
 };
 
+static ssize_t sysfs_show(struct kobject *kobj,
+   struct kobj_attribute *attr, char *buf)
+{
+   return sprintf(buf, "%d", rx_buf_len);
+}
+
+static ssize_t sysfs_store(struct kobject *kobj,
+   struct kobj_attribute *attr, const char *buf, size_t count)
+{
+   if (kstrtou32(buf, 0, &rx_buf_len) < 0)
+   return -EINVAL;
+   if (rx_buf_len < 1024)
+   rx_buf_len = 1024;
+   return count;
+}
+
 static int __init virtio_vsock_init(void)
 {
int ret;
@@ -1020,8 +1044,17 @@ static int __init virtio_vsock_init(void)
if (ret)
goto out_vci;
 
-   return 0;
+   kobj_ref = kobject_create_and_add("vsock", kernel_kobj);
 
+   /*Creating sysfs file for etx_value*/
+   ret = sysfs_create_file(kobj_ref, &rxbuf_attr.attr);
+   if (ret)
+   goto out_sysfs;
+
+   return 0;
+out_sysfs:
+   kobject_put(kobj_ref);
+   sysfs_remove_file(kernel_kobj, &rxbuf_attr.attr);
 out_vci:
vsock_core_unregister(&virtio_transport.transport);
 out_wq:
-- 
2.11.0



[RFC v1 5/6] vhost/vsock: add kconfig for vhost dgram support

2021-06-09 Thread Jiang Wang
Also change number of vqs according to the config

Signed-off-by: Jiang Wang 
---
 drivers/vhost/Kconfig |  8 
 drivers/vhost/vsock.c | 11 ---
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 587fbae06182..d63fffee6007 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -61,6 +61,14 @@ config VHOST_VSOCK
To compile this driver as a module, choose M here: the module will be 
called
vhost_vsock.
 
+config VHOST_VSOCK_DGRAM
+   bool "vhost vsock datagram sockets support"
+   depends on VHOST_VSOCK
+   default n
+   help
+   Enable vhost-vsock to support datagram types vsock.  The QEMU
+   and the guest must support datagram types too to use it.
+
 config VHOST_VDPA
tristate "Vhost driver for vDPA-based backend"
depends on EVENTFD
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index d366463be6d4..12ca1dc0268f 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -48,7 +48,11 @@ static DEFINE_READ_MOSTLY_HASHTABLE(vhost_vsock_hash, 8);
 
 struct vhost_vsock {
struct vhost_dev dev;
+#ifdef CONFIG_VHOST_VSOCK_DGRAM
struct vhost_virtqueue vqs[4];
+#else
+   struct vhost_virtqueue vqs[2];
+#endif
 
/* Link to global vhost_vsock_hash, writes use vhost_vsock_mutex */
struct hlist_node hash;
@@ -763,15 +767,16 @@ static int vhost_vsock_dev_open(struct inode *inode, 
struct file *file)
 
vqs[VSOCK_VQ_TX] = &vsock->vqs[VSOCK_VQ_TX];
vqs[VSOCK_VQ_RX] = &vsock->vqs[VSOCK_VQ_RX];
-   vqs[VSOCK_VQ_DGRAM_TX] = &vsock->vqs[VSOCK_VQ_DGRAM_TX];
-   vqs[VSOCK_VQ_DGRAM_RX] = &vsock->vqs[VSOCK_VQ_DGRAM_RX];
vsock->vqs[VSOCK_VQ_TX].handle_kick = vhost_vsock_handle_tx_kick;
vsock->vqs[VSOCK_VQ_RX].handle_kick = vhost_vsock_handle_rx_kick;
+#ifdef CONFIG_VHOST_VSOCK_DGRAM
+   vqs[VSOCK_VQ_DGRAM_TX] = &vsock->vqs[VSOCK_VQ_DGRAM_TX];
+   vqs[VSOCK_VQ_DGRAM_RX] = &vsock->vqs[VSOCK_VQ_DGRAM_RX];
vsock->vqs[VSOCK_VQ_DGRAM_TX].handle_kick =
vhost_vsock_handle_tx_kick;
vsock->vqs[VSOCK_VQ_DGRAM_RX].handle_kick =
vhost_vsock_handle_rx_kick;
-
+#endif
vhost_dev_init(&vsock->dev, vqs, ARRAY_SIZE(vsock->vqs),
   UIO_MAXIOV, VHOST_VSOCK_PKT_WEIGHT,
   VHOST_VSOCK_WEIGHT, true, NULL);
-- 
2.11.0



[RFC v1 4/6] vsock_test: add tests for vsock dgram

2021-06-09 Thread Jiang Wang
Added test cases for vsock dgram types.

Signed-off-by: Jiang Wang 
---
 tools/testing/vsock/util.c   | 105 +
 tools/testing/vsock/util.h   |   4 +
 tools/testing/vsock/vsock_test.c | 195 +++
 3 files changed, 304 insertions(+)

diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
index 93cbd6f603f9..59e5301b5380 100644
--- a/tools/testing/vsock/util.c
+++ b/tools/testing/vsock/util.c
@@ -238,6 +238,57 @@ void send_byte(int fd, int expected_ret, int flags)
}
 }
 
+/* Transmit one byte and check the return value.
+ *
+ * expected_ret:
+ *  <0 Negative errno (for testing errors)
+ *   0 End-of-file
+ *   1 Success
+ */
+void sendto_byte(int fd, const struct sockaddr *dest_addr, int len, int 
expected_ret,
+   int flags)
+{
+   const uint8_t byte = 'A';
+   ssize_t nwritten;
+
+   timeout_begin(TIMEOUT);
+   do {
+   nwritten = sendto(fd, &byte, sizeof(byte), flags, dest_addr,
+   len);
+   timeout_check("write");
+   } while (nwritten < 0 && errno == EINTR);
+   timeout_end();
+
+   if (expected_ret < 0) {
+   if (nwritten != -1) {
+   fprintf(stderr, "bogus sendto(2) return value %zd\n",
+   nwritten);
+   exit(EXIT_FAILURE);
+   }
+   if (errno != -expected_ret) {
+   perror("write");
+   exit(EXIT_FAILURE);
+   }
+   return;
+   }
+
+   if (nwritten < 0) {
+   perror("write");
+   exit(EXIT_FAILURE);
+   }
+   if (nwritten == 0) {
+   if (expected_ret == 0)
+   return;
+
+   fprintf(stderr, "unexpected EOF while sending byte\n");
+   exit(EXIT_FAILURE);
+   }
+   if (nwritten != sizeof(byte)) {
+   fprintf(stderr, "bogus sendto(2) return value %zd\n", nwritten);
+   exit(EXIT_FAILURE);
+   }
+}
+
 /* Receive one byte and check the return value.
  *
  * expected_ret:
@@ -291,6 +342,60 @@ void recv_byte(int fd, int expected_ret, int flags)
}
 }
 
+/* Receive one byte and check the return value.
+ *
+ * expected_ret:
+ *  <0 Negative errno (for testing errors)
+ *   0 End-of-file
+ *   1 Success
+ */
+void recvfrom_byte(int fd, struct sockaddr *src_addr, socklen_t *addrlen,
+   int expected_ret, int flags)
+{
+   uint8_t byte;
+   ssize_t nread;
+
+   timeout_begin(TIMEOUT);
+   do {
+   nread = recvfrom(fd, &byte, sizeof(byte), flags, src_addr, addrlen);
+   timeout_check("read");
+   } while (nread < 0 && errno == EINTR);
+   timeout_end();
+
+   if (expected_ret < 0) {
+   if (nread != -1) {
+   fprintf(stderr, "bogus recvfrom(2) return value %zd\n",
+   nread);
+   exit(EXIT_FAILURE);
+   }
+   if (errno != -expected_ret) {
+   perror("read");
+   exit(EXIT_FAILURE);
+   }
+   return;
+   }
+
+   if (nread < 0) {
+   perror("read");
+   exit(EXIT_FAILURE);
+   }
+   if (nread == 0) {
+   if (expected_ret == 0)
+   return;
+
+   fprintf(stderr, "unexpected EOF while receiving byte\n");
+   exit(EXIT_FAILURE);
+   }
+   if (nread != sizeof(byte)) {
+   fprintf(stderr, "bogus recvfrom(2) return value %zd\n", nread);
+   exit(EXIT_FAILURE);
+   }
+   if (byte != 'A') {
+   fprintf(stderr, "unexpected byte read %c\n", byte);
+   exit(EXIT_FAILURE);
+   }
+}
+
 /* Run test cases.  The program terminates if a failure occurs. */
 void run_tests(const struct test_case *test_cases,
   const struct test_opts *opts)
diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
index e53dd09d26d9..cea1acd094c6 100644
--- a/tools/testing/vsock/util.h
+++ b/tools/testing/vsock/util.h
@@ -40,7 +40,11 @@ int vsock_stream_accept(unsigned int cid, unsigned int port,
struct sockaddr_vm *clientaddrp);
 void vsock_wait_remote_close(int fd);
 void send_byte(int fd, int expected_ret, int flags);
+void sendto_byte(int fd, const struct sockaddr *dest_addr, int len, int 
expected_ret,
+   int flags);
 void recv_byte(int fd, int expected_ret, int flags);
+void recvfrom_byte(int fd, struct sockaddr *src_addr, socklen_t *addrlen,
+   int expected_ret, int flags);
 void run_tests(const struct test_case *test_cases,
   const struct test_opts *opts);
 void list_tests(const struct test_case *test_cases);
diff --git 

[RFC v1 3/6] vhost/vsock: add support for vhost dgram.

2021-06-09 Thread Jiang Wang
This patch supports dgram on the vhost side, including
tx and rx. The vhost side sends packets asynchronously.

Signed-off-by: Jiang Wang 
---
 drivers/vhost/vsock.c | 199 +++---
 1 file changed, 173 insertions(+), 26 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 81d064601093..d366463be6d4 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -28,7 +28,10 @@
  * small pkts.
  */
 #define VHOST_VSOCK_PKT_WEIGHT 256
+#define VHOST_VSOCK_DGRM_MAX_PENDING_PKT 128
 
+/* Max wait time in busy poll in microseconds */
+#define VHOST_VSOCK_BUSY_POLL_TIMEOUT 20
 enum {
VHOST_VSOCK_FEATURES = VHOST_FEATURES |
   (1ULL << VIRTIO_F_ACCESS_PLATFORM) |
@@ -45,7 +48,7 @@ static DEFINE_READ_MOSTLY_HASHTABLE(vhost_vsock_hash, 8);
 
 struct vhost_vsock {
struct vhost_dev dev;
-   struct vhost_virtqueue vqs[2];
+   struct vhost_virtqueue vqs[4];
 
/* Link to global vhost_vsock_hash, writes use vhost_vsock_mutex */
struct hlist_node hash;
@@ -54,6 +57,11 @@ struct vhost_vsock {
spinlock_t send_pkt_list_lock;
struct list_head send_pkt_list; /* host->guest pending packets */
 
+   spinlock_t dgram_send_pkt_list_lock;
+   struct list_head dgram_send_pkt_list;   /* host->guest pending packets 
*/
+   struct vhost_work dgram_send_pkt_work;
+   int  dgram_used; /*pending packets to be send */
+
atomic_t queued_replies;
 
u32 guest_cid;
@@ -90,10 +98,22 @@ static void
 vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
struct vhost_virtqueue *vq)
 {
-   struct vhost_virtqueue *tx_vq = &vsock->vqs[VSOCK_VQ_TX];
+   struct vhost_virtqueue *tx_vq;
int pkts = 0, total_len = 0;
bool added = false;
bool restart_tx = false;
+   spinlock_t *lock;
+   struct list_head *send_pkt_list;
+
+   if (vq == &vsock->vqs[VSOCK_VQ_RX]) {
+   tx_vq = &vsock->vqs[VSOCK_VQ_TX];
+   lock = &vsock->send_pkt_list_lock;
+   send_pkt_list = &vsock->send_pkt_list;
+   } else {
+   tx_vq = &vsock->vqs[VSOCK_VQ_DGRAM_TX];
+   lock = &vsock->dgram_send_pkt_list_lock;
+   send_pkt_list = &vsock->dgram_send_pkt_list;
+   }

mutex_lock(&vq->mutex);
 
@@ -113,36 +133,48 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
size_t nbytes;
size_t iov_len, payload_len;
int head;
+   bool is_dgram = false;
 
-   spin_lock_bh(&vsock->send_pkt_list_lock);
-   if (list_empty(&vsock->send_pkt_list)) {
-   spin_unlock_bh(&vsock->send_pkt_list_lock);
+   spin_lock_bh(lock);
+   if (list_empty(send_pkt_list)) {
+   spin_unlock_bh(lock);
vhost_enable_notify(&vsock->dev, vq);
break;
}
 
-   pkt = list_first_entry(&vsock->send_pkt_list,
+   pkt = list_first_entry(send_pkt_list,
   struct virtio_vsock_pkt, list);
list_del_init(&pkt->list);
-   spin_unlock_bh(&vsock->send_pkt_list_lock);
+   spin_unlock_bh(lock);
+
+   if (pkt->hdr.type == VIRTIO_VSOCK_TYPE_DGRAM)
+   is_dgram = true;
 
head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
 &out, &in, NULL, NULL);
if (head < 0) {
-   spin_lock_bh(&vsock->send_pkt_list_lock);
-   list_add(&pkt->list, &vsock->send_pkt_list);
-   spin_unlock_bh(&vsock->send_pkt_list_lock);
+   spin_lock_bh(lock);
+   list_add(&pkt->list, send_pkt_list);
+   spin_unlock_bh(lock);
break;
}
 
if (head == vq->num) {
-   spin_lock_bh(&vsock->send_pkt_list_lock);
-   list_add(&pkt->list, &vsock->send_pkt_list);
-   spin_unlock_bh(&vsock->send_pkt_list_lock);
+   if (is_dgram) {
+   virtio_transport_free_pkt(pkt);
+   vq_err(vq, "Dgram virtqueue is full!");
+   spin_lock_bh(lock);
+   vsock->dgram_used--;
+   spin_unlock_bh(lock);
+   break;
+   }
+   spin_lock_bh(lock);
+   list_add(&pkt->list, send_pkt_list);
+   spin_unlock_bh(lock);
 
/* We cannot finish yet if more buffers snuck in while
-* re-enabling notify.
-*/
+   * re-enabling notify.
+   */
if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {

[RFC v1 2/6] virtio/vsock: add support for virtio datagram

2021-06-09 Thread Jiang Wang
This patch adds support for virtio dgram to the driver.
Implemented the related functions for tx and rx, enqueue
and dequeue. Packets are sent synchronously to give the sender
an indication when the virtqueue is full.
Refactored virtio_transport_send_pkt_work() a little bit, but
with no functional changes to it.

Support for the host/device side is in another
patch.

Signed-off-by: Jiang Wang 
---
 include/net/af_vsock.h |   1 +
 .../trace/events/vsock_virtio_transport_common.h   |   5 +-
 include/uapi/linux/virtio_vsock.h  |   1 +
 net/vmw_vsock/af_vsock.c   |  12 +
 net/vmw_vsock/virtio_transport.c   | 325 ++---
 net/vmw_vsock/virtio_transport_common.c| 184 ++--
 6 files changed, 466 insertions(+), 62 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index b1c717286993..fcae7bca9609 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -200,6 +200,7 @@ void vsock_remove_sock(struct vsock_sock *vsk);
 void vsock_for_each_connected_socket(void (*fn)(struct sock *sk));
 int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk);
 bool vsock_find_cid(unsigned int cid);
+int vsock_bind_stream(struct vsock_sock *vsk, struct sockaddr_vm *addr);
 
 / TAP /
 
diff --git a/include/trace/events/vsock_virtio_transport_common.h 
b/include/trace/events/vsock_virtio_transport_common.h
index 6782213778be..b1be25b327a1 100644
--- a/include/trace/events/vsock_virtio_transport_common.h
+++ b/include/trace/events/vsock_virtio_transport_common.h
@@ -9,9 +9,12 @@
 #include 
 
 TRACE_DEFINE_ENUM(VIRTIO_VSOCK_TYPE_STREAM);
+TRACE_DEFINE_ENUM(VIRTIO_VSOCK_TYPE_DGRAM);
 
 #define show_type(val) \
-   __print_symbolic(val, { VIRTIO_VSOCK_TYPE_STREAM, "STREAM" })
+__print_symbolic(val, \
+   { VIRTIO_VSOCK_TYPE_STREAM, "STREAM" }, \
+   { VIRTIO_VSOCK_TYPE_DGRAM, "DGRAM" })
 
 TRACE_DEFINE_ENUM(VIRTIO_VSOCK_OP_INVALID);
 TRACE_DEFINE_ENUM(VIRTIO_VSOCK_OP_REQUEST);
diff --git a/include/uapi/linux/virtio_vsock.h 
b/include/uapi/linux/virtio_vsock.h
index b56614dff1c9..5503585b26e8 100644
--- a/include/uapi/linux/virtio_vsock.h
+++ b/include/uapi/linux/virtio_vsock.h
@@ -68,6 +68,7 @@ struct virtio_vsock_hdr {
 
 enum virtio_vsock_type {
VIRTIO_VSOCK_TYPE_STREAM = 1,
+   VIRTIO_VSOCK_TYPE_DGRAM = 3,
 };
 
 enum virtio_vsock_op {
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 92a72f0e0d94..c1f512291b94 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -659,6 +659,18 @@ static int __vsock_bind_stream(struct vsock_sock *vsk,
return 0;
 }
 
+int vsock_bind_stream(struct vsock_sock *vsk,
+  struct sockaddr_vm *addr)
+{
+   int retval;
+
+   spin_lock_bh(&vsock_table_lock);
+   retval = __vsock_bind_stream(vsk, addr);
+   spin_unlock_bh(&vsock_table_lock);
+   return retval;
+}
+EXPORT_SYMBOL(vsock_bind_stream);
+
 static int __vsock_bind_dgram(struct vsock_sock *vsk,
  struct sockaddr_vm *addr)
 {
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 7dcb8db23305..cf47aadb0c34 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -20,21 +20,29 @@
 #include 
 #include 
 #include 
+#include
+#include
+#include 
 
 static struct workqueue_struct *virtio_vsock_workqueue;
 static struct virtio_vsock __rcu *the_virtio_vsock;
+static struct virtio_vsock *the_virtio_vsock_dgram;
 static DEFINE_MUTEX(the_virtio_vsock_mutex); /* protects the_virtio_vsock */
 
 struct virtio_vsock {
struct virtio_device *vdev;
struct virtqueue **vqs;
bool has_dgram;
+   refcount_t active;
 
/* Virtqueue processing is deferred to a workqueue */
struct work_struct tx_work;
struct work_struct rx_work;
struct work_struct event_work;
 
+   struct work_struct dgram_tx_work;
+   struct work_struct dgram_rx_work;
+
/* The following fields are protected by tx_lock.  vqs[VSOCK_VQ_TX]
 * must be accessed with tx_lock held.
 */
@@ -55,6 +63,22 @@ struct virtio_vsock {
int rx_buf_nr;
int rx_buf_max_nr;
 
+   /* The following fields are protected by dgram_tx_lock.  
vqs[VSOCK_VQ_DGRAM_TX]
+* must be accessed with dgram_tx_lock held.
+*/
+   struct mutex dgram_tx_lock;
+   bool dgram_tx_run;
+
+   atomic_t dgram_queued_replies;
+
+   /* The following fields are protected by dgram_rx_lock.  
vqs[VSOCK_VQ_DGRAM_RX]
+* must be accessed with dgram_rx_lock held.
+*/
+   struct mutex dgram_rx_lock;
+   bool dgram_rx_run;
+   int dgram_rx_buf_nr;
+   int dgram_rx_buf_max_nr;
+
/* The following fields are protected by event_lock.
 * vqs[VSOCK_VQ_EVENT] must 

[RFC v1 1/6] virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit

2021-06-09 Thread Jiang Wang
When this feature is enabled, allocate 5 queues,
otherwise, allocate 3 queues to be compatible with
old QEMU versions.

Signed-off-by: Jiang Wang 
---
 drivers/vhost/vsock.c |  3 +-
 include/linux/virtio_vsock.h  |  9 +
 include/uapi/linux/virtio_vsock.h |  3 ++
 net/vmw_vsock/virtio_transport.c  | 73 +++
 4 files changed, 80 insertions(+), 8 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 5e78fb719602..81d064601093 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -31,7 +31,8 @@
 
 enum {
VHOST_VSOCK_FEATURES = VHOST_FEATURES |
-  (1ULL << VIRTIO_F_ACCESS_PLATFORM)
+  (1ULL << VIRTIO_F_ACCESS_PLATFORM) |
+  (1ULL << VIRTIO_VSOCK_F_DGRAM)
 };
 
 enum {
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index dc636b727179..ba3189ed9345 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -18,6 +18,15 @@ enum {
VSOCK_VQ_MAX= 3,
 };
 
+enum {
+   VSOCK_VQ_STREAM_RX = 0, /* for host to guest data */
+   VSOCK_VQ_STREAM_TX = 1, /* for guest to host data */
+   VSOCK_VQ_DGRAM_RX   = 2,
+   VSOCK_VQ_DGRAM_TX   = 3,
+   VSOCK_VQ_EX_EVENT   = 4,
+   VSOCK_VQ_EX_MAX = 5,
+};
+
 /* Per-socket state (accessed via vsk->trans) */
 struct virtio_vsock_sock {
struct vsock_sock *vsk;
diff --git a/include/uapi/linux/virtio_vsock.h 
b/include/uapi/linux/virtio_vsock.h
index 1d57ed3d84d2..b56614dff1c9 100644
--- a/include/uapi/linux/virtio_vsock.h
+++ b/include/uapi/linux/virtio_vsock.h
@@ -38,6 +38,9 @@
 #include 
 #include 
 
+/* The feature bitmap for virtio net */
+#define VIRTIO_VSOCK_F_DGRAM   0   /* Host support dgram vsock */
+
 struct virtio_vsock_config {
__le64 guest_cid;
 } __attribute__((packed));
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 2700a63ab095..7dcb8db23305 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -27,7 +27,8 @@ static DEFINE_MUTEX(the_virtio_vsock_mutex); /* protects 
the_virtio_vsock */
 
 struct virtio_vsock {
struct virtio_device *vdev;
-   struct virtqueue *vqs[VSOCK_VQ_MAX];
+   struct virtqueue **vqs;
+   bool has_dgram;
 
/* Virtqueue processing is deferred to a workqueue */
struct work_struct tx_work;
@@ -333,7 +334,10 @@ static int virtio_vsock_event_fill_one(struct virtio_vsock 
*vsock,
struct scatterlist sg;
struct virtqueue *vq;
 
-   vq = vsock->vqs[VSOCK_VQ_EVENT];
+   if (vsock->has_dgram)
+   vq = vsock->vqs[VSOCK_VQ_EX_EVENT];
+   else
+   vq = vsock->vqs[VSOCK_VQ_EVENT];
 
sg_init_one(&sg, event, sizeof(*event));
 
@@ -351,7 +355,10 @@ static void virtio_vsock_event_fill(struct virtio_vsock 
*vsock)
virtio_vsock_event_fill_one(vsock, event);
}
 
-   virtqueue_kick(vsock->vqs[VSOCK_VQ_EVENT]);
+   if (vsock->has_dgram)
+   virtqueue_kick(vsock->vqs[VSOCK_VQ_EX_EVENT]);
+   else
+   virtqueue_kick(vsock->vqs[VSOCK_VQ_EVENT]);
 }
 
 static void virtio_vsock_reset_sock(struct sock *sk)
@@ -391,7 +398,10 @@ static void virtio_transport_event_work(struct work_struct 
*work)
container_of(work, struct virtio_vsock, event_work);
struct virtqueue *vq;
 
-   vq = vsock->vqs[VSOCK_VQ_EVENT];
+   if (vsock->has_dgram)
+   vq = vsock->vqs[VSOCK_VQ_EX_EVENT];
+   else
+   vq = vsock->vqs[VSOCK_VQ_EVENT];
 
mutex_lock(&vsock->event_lock);
 
@@ -411,7 +421,10 @@ static void virtio_transport_event_work(struct work_struct 
*work)
}
} while (!virtqueue_enable_cb(vq));
 
-   virtqueue_kick(vsock->vqs[VSOCK_VQ_EVENT]);
+   if (vsock->has_dgram)
+   virtqueue_kick(vsock->vqs[VSOCK_VQ_EX_EVENT]);
+   else
+   virtqueue_kick(vsock->vqs[VSOCK_VQ_EVENT]);
 out:
mutex_unlock(&vsock->event_lock);
 }
@@ -434,6 +447,10 @@ static void virtio_vsock_tx_done(struct virtqueue *vq)
queue_work(virtio_vsock_workqueue, >tx_work);
 }
 
+static void virtio_vsock_dgram_tx_done(struct virtqueue *vq)
+{
+}
+
 static void virtio_vsock_rx_done(struct virtqueue *vq)
 {
struct virtio_vsock *vsock = vq->vdev->priv;
@@ -443,6 +460,10 @@ static void virtio_vsock_rx_done(struct virtqueue *vq)
queue_work(virtio_vsock_workqueue, >rx_work);
 }
 
+static void virtio_vsock_dgram_rx_done(struct virtqueue *vq)
+{
+}
+
 static struct virtio_transport virtio_transport = {
.transport = {
.module   = THIS_MODULE,
@@ -545,13 +566,29 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
virtio_vsock_tx_done,
virtio_vsock_event_done,
};
+   vq_callback_t 

[RFC v1 0/6] virtio/vsock: introduce SOCK_DGRAM support

2021-06-09 Thread Jiang Wang
This patchset implements support of SOCK_DGRAM for virtio
transport.

Datagram sockets are connectionless and unreliable. To avoid unfair contention
with stream and other sockets, add two more virtqueues and
a new feature bit to indicate if those two new queues exist or not.

Dgram does not use the existing credit update mechanism for
stream sockets. When sending from the guest/driver, packets are sent
synchronously, so the sender will get an error when the virtqueue is full.
When sending from the host/device, packets are sent asynchronously
because the descriptor memory belongs to the corresponding QEMU
process.
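
(For context, a sketch of what a guest application using the proposed datagram
support might look like. AF_VSOCK, struct sockaddr_vm and VMADDR_CID_HOST are
existing uAPI; only the SOCK_DGRAM socket type is what this series enables. The
port number and payload are arbitrary examples.)

#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

int main(void)
{
	struct sockaddr_vm addr = {
		.svm_family = AF_VSOCK,
		.svm_cid = VMADDR_CID_HOST,	/* send to the host */
		.svm_port = 1234,		/* arbitrary example port */
	};
	const char msg[] = "perf sample";
	int fd = socket(AF_VSOCK, SOCK_DGRAM, 0);

	if (fd < 0) {
		perror("socket");
		return 1;
	}
	/* Connectionless, unreliable send; no credit accounting involved. */
	if (sendto(fd, msg, sizeof(msg), 0,
		   (struct sockaddr *)&addr, sizeof(addr)) < 0)
		perror("sendto");
	close(fd);
	return 0;
}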

The virtio spec patch is here: 
https://www.spinics.net/lists/linux-virtualization/msg50027.html

For those who prefer git repo, here is the link for the linux kernel:
https://github.com/Jiang1155/linux/tree/vsock-dgram-v1

qemu patch link:
https://github.com/Jiang1155/qemu/tree/vsock-dgram-v1


To do:
1. use skb when receiving packets
2. support multiple transport
3. support mergeable rx buffer


Jiang Wang (6):
  virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
  virtio/vsock: add support for virtio datagram
  vhost/vsock: add support for vhost dgram.
  vsock_test: add tests for vsock dgram
  vhost/vsock: add kconfig for vhost dgram support
  virtio/vsock: add sysfs for rx buf len for dgram

 drivers/vhost/Kconfig  |   8 +
 drivers/vhost/vsock.c  | 207 --
 include/linux/virtio_vsock.h   |   9 +
 include/net/af_vsock.h |   1 +
 .../trace/events/vsock_virtio_transport_common.h   |   5 +-
 include/uapi/linux/virtio_vsock.h  |   4 +
 net/vmw_vsock/af_vsock.c   |  12 +
 net/vmw_vsock/virtio_transport.c   | 433 ++---
 net/vmw_vsock/virtio_transport_common.c| 184 -
 tools/testing/vsock/util.c | 105 +
 tools/testing/vsock/util.h |   4 +
 tools/testing/vsock/vsock_test.c   | 195 ++
 12 files changed, 1070 insertions(+), 97 deletions(-)

-- 
2.11.0


Re: [PATCH v3 0/4] virtio net: spurious interrupt related fixes

2021-06-09 Thread Willem de Bruijn
On Wed, Jun 9, 2021 at 5:36 PM Willem de Bruijn
 wrote:
>
> On Mon, May 31, 2021 at 10:53 PM Willem de Bruijn
>  wrote:
> >
> > On Wed, May 26, 2021 at 11:34 AM Willem de Bruijn
> >  wrote:
> > >
> > > On Wed, May 26, 2021 at 4:24 AM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > >
> > > > With the implementation of napi-tx in virtio driver, we clean tx
> > > > descriptors from rx napi handler, for the purpose of reducing tx
> > > > complete interrupts. But this introduces a race where tx complete
> > > > interrupt has been raised, but the handler finds there is no work to do
> > > > because we have done the work in the previous rx interrupt handler.
> > > > A similar issue exists with polling from start_xmit, it is however
> > > > less common because of the delayed cb optimization of the split ring -
> > > > but will likely affect the packed ring once that is more common.
> > > >
> > > > In particular, this was reported to lead to the following warning msg:
> > > > [ 3588.010778] irq 38: nobody cared (try booting with the
> > > > "irqpoll" option)
> > > > [ 3588.017938] CPU: 4 PID: 0 Comm: swapper/4 Not tainted
> > > > 5.3.0-19-generic #20~18.04.2-Ubuntu
> > > > [ 3588.017940] Call Trace:
> > > > [ 3588.017942]  
> > > > [ 3588.017951]  dump_stack+0x63/0x85
> > > > [ 3588.017953]  __report_bad_irq+0x35/0xc0
> > > > [ 3588.017955]  note_interrupt+0x24b/0x2a0
> > > > [ 3588.017956]  handle_irq_event_percpu+0x54/0x80
> > > > [ 3588.017957]  handle_irq_event+0x3b/0x60
> > > > [ 3588.017958]  handle_edge_irq+0x83/0x1a0
> > > > [ 3588.017961]  handle_irq+0x20/0x30
> > > > [ 3588.017964]  do_IRQ+0x50/0xe0
> > > > [ 3588.017966]  common_interrupt+0xf/0xf
> > > > [ 3588.017966]  
> > > > [ 3588.017989] handlers:
> > > > [ 3588.020374] [<1b9f1da8>] vring_interrupt
> > > > [ 3588.025099] Disabling IRQ #38
> > > >
> > > > This patchset attempts to fix this by cleaning up a bunch of races
> > > > related to the handling of sq callbacks (aka tx interrupts).
> > > > Somewhat tested but I couldn't reproduce the original issues
> > > > reported, sending out for help with testing.
> > > >
> > > > Wei, does this address the spurious interrupt issue you are
> > > > observing? Could you confirm please?
> > >
> > > Thanks for working on this, Michael. Wei is on leave. I'll try to 
> > > reproduce.
> >
> > The original report was generated with five GCE virtual machines
> > sharing a sole-tenant node, together sending up to 160 netperf
> > tcp_stream connections to 16 other instances. Running Ubuntu 20.04-LTS
> > with kernel 5.4.0-1034-gcp.
> >
> > But the issue can also be reproduced with just two n2-standard-16
> > instances, running neper tcp_stream with high parallelism (-T 16 -F
> > 240).
> >
> > It's a bit faster to trigger by reducing the interrupt count threshold
> > from 99.9K/100K to 9.9K/10K. And I added additional logging to report
> > the unhandled rate even if lower.
> >
> > Unhandled interrupt rate scales with the number of queue pairs
> > (`ethtool -L $DEV combined $NUM`). It is essentially absent at 8
> > queues, at around 90% at 14 queues. By default these GCE instances
> > have one rx and tx interrupt per core, so 16 each. With the rx and tx
> > interrupts for a given virtio-queue pinned to the same core.
> >
> > Unfortunately, commit 3/4 did not have a significant impact on these
> > numbers. Have to think a bit more about possible mitigations. At least
> > I'll be able to test the more easily now.
>
> Continuing to experiment with approaches to avoid this interrupt disable.
>
> I think it's good to remember that the real bug is the disabling of
> interrupts, which may cause stalls in absence of receive events.
>
> The spurious tx interrupts themselves are no worse than the processing
> the tx and rx interrupts strictly separately without the optimization.
> The clean-from-rx optimization just reduces latency. The spurious
> interrupts indicate a cycle optimization opportunity for sure. I
> support Jason's suggestion for a single combined interrupt for both tx
> and rx. That is not feasible as a bugfix for stable, so we need something
> to mitigate the impact in the short term.
>
> For that, I suggest just an approach to maintain most benefit
> from the opportunistic cleaning, while keeping spurious rate below the
> threshold. A few variants:
>
> 1. In virtnet_poll_cleantx, a uniformly random draw on whether or not
> to attemp to clean. Not trivial to get a good random source that is
> essentially free. One example perhaps is sq->vq->num_free & 0x7, but
> not sure how randomized those bits are. Pro: this can be implemented
> strictly in virtio_net. Con: a probabilistic method will reduce the
> incidence rate, but it may still occur at the tail.
>
> 2. If also changing virtio_ring, in vring_interrupt count spurious
> interrupts and report this count through a new interface. Modify
> virtio_net to query and skip the optimization if above a threshold.
>
> 2a. slight variant: in virtio_net count 
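
(Editor's sketch of option 1 above, using the cheap num_free-based draw mentioned
in the thread; the helper name and the one-in-eight policy are illustrative, not
actual virtio_net code.)

/* Probabilistically skip the opportunistic tx clean done from the rx napi
 * handler, so spurious tx interrupts stay below the "nobody cared" threshold
 * while keeping most of the latency benefit.
 */
static bool virtnet_should_clean_tx(struct send_queue *sq)
{
	/* num_free as a cheap (if weak) random source: attempt the clean
	 * roughly one time in eight.
	 */
	return (sq->vq->num_free & 0x7) == 0;
}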

Re: [PATCH v3 1/4] virtio_net: move tx vq operation under tx queue lock

2021-06-09 Thread Michael S. Tsirkin
On Fri, May 28, 2021 at 06:25:11PM -0400, Willem de Bruijn wrote:
> On Wed, May 26, 2021 at 11:41 PM Jason Wang  wrote:
> >
> >
> > On 2021/5/26 4:24 PM, Michael S. Tsirkin wrote:
> > > It's unsafe to operate a vq from multiple threads.
> > > Unfortunately this is exactly what we do when invoking
> > > clean tx poll from rx napi.
> > > Same happens with napi-tx even without the
> > > opportunistic cleaning from the receive interrupt: that races
> > > with processing the vq in start_xmit.
> > >
> > > As a fix move everything that deals with the vq to under tx lock.
> 
> This patch also disables callbacks during free_old_xmit_skbs
> processing on tx interrupt. Should that be a separate commit, with its
> own explanation?
> > >
> > > Fixes: b92f1e6751a6 ("virtio-net: transmit napi")
> > > Signed-off-by: Michael S. Tsirkin 
> > > ---
> > >   drivers/net/virtio_net.c | 22 +-
> > >   1 file changed, 21 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index ac0c143f97b4..12512d1002ec 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -1508,6 +1508,8 @@ static int virtnet_poll_tx(struct napi_struct 
> > > *napi, int budget)
> > >   struct virtnet_info *vi = sq->vq->vdev->priv;
> > >   unsigned int index = vq2txq(sq->vq);
> > >   struct netdev_queue *txq;
> > > + int opaque;
> > > + bool done;
> > >
> > >   if (unlikely(is_xdp_raw_buffer_queue(vi, index))) {
> > >   /* We don't need to enable cb for XDP */
> > > @@ -1517,10 +1519,28 @@ static int virtnet_poll_tx(struct napi_struct 
> > > *napi, int budget)
> > >
> > >   txq = netdev_get_tx_queue(vi->dev, index);
> > >   __netif_tx_lock(txq, raw_smp_processor_id());
> > > + virtqueue_disable_cb(sq->vq);
> > >   free_old_xmit_skbs(sq, true);
> > > +
> > > + opaque = virtqueue_enable_cb_prepare(sq->vq);
> > > +
> > > + done = napi_complete_done(napi, 0);
> > > +
> > > + if (!done)
> > > + virtqueue_disable_cb(sq->vq);
> > > +
> > >   __netif_tx_unlock(txq);
> > >
> > > - virtqueue_napi_complete(napi, sq->vq, 0);
> > > + if (done) {
> > > + if (unlikely(virtqueue_poll(sq->vq, opaque))) {
> 
> Should this also be inside the lock, as it operates on vq?

No, vq poll is OK outside of locks; it's atomic.

> Is there anything that is not allowed to run with the lock held?
> > > + if (napi_schedule_prep(napi)) {
> > > + __netif_tx_lock(txq, 
> > > raw_smp_processor_id());
> > > + virtqueue_disable_cb(sq->vq);
> > > + __netif_tx_unlock(txq);
> > > + __napi_schedule(napi);
> > > + }
> > > + }
> > > + }
> >
> >
> > Interesting, this looks like somehow an open-coded version of
> > virtqueue_napi_complete(). I wonder if we can simply keep using
> > virtqueue_napi_complete() by simply moving the __netif_tx_unlock() after
> > that:
> >
> > netif_tx_lock(txq);
> > free_old_xmit_skbs(sq, true);
> > virtqueue_napi_complete(napi, sq->vq, 0);
> > __netif_tx_unlock(txq);
> 
> Agreed. And subsequent block
> 
>if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
>netif_tx_wake_queue(txq);
> 
> as well

Yes I thought I saw something here that can't be called with tx lock
held but I no longer see it. Will do.

> >
> > Thanks
> >
> >
> > >
> > >   if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > >   netif_tx_wake_queue(txq);
> >

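For illustration only, a minimal sketch of how virtnet_poll_tx() could look with the
simplification discussed above applied -- free_old_xmit_skbs(), virtqueue_napi_complete()
and the queue wake all done under the tx lock. This is not the posted patch; it only
assumes the existing virtio_net helpers quoted above:

static int virtnet_poll_tx(struct napi_struct *napi, int budget)
{
	struct send_queue *sq = container_of(napi, struct send_queue, napi);
	struct virtnet_info *vi = sq->vq->vdev->priv;
	unsigned int index = vq2txq(sq->vq);
	struct netdev_queue *txq;

	if (unlikely(is_xdp_raw_buffer_queue(vi, index))) {
		/* We don't need to enable cb for XDP */
		napi_complete_done(napi, 0);
		return 0;
	}

	txq = netdev_get_tx_queue(vi->dev, index);
	__netif_tx_lock(txq, raw_smp_processor_id());

	/* Keep callbacks off while cleaning, then re-enable (or reschedule)
	 * via virtqueue_napi_complete() before dropping the lock.
	 */
	virtqueue_disable_cb(sq->vq);
	free_old_xmit_skbs(sq, true);

	virtqueue_napi_complete(napi, sq->vq, 0);

	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
		netif_tx_wake_queue(txq);
	__netif_tx_unlock(txq);

	return 0;
}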

Re: [PATCH v3 0/4] virtio net: spurious interrupt related fixes

2021-06-09 Thread Willem de Bruijn
On Mon, May 31, 2021 at 10:53 PM Willem de Bruijn
 wrote:
>
> On Wed, May 26, 2021 at 11:34 AM Willem de Bruijn
>  wrote:
> >
> > On Wed, May 26, 2021 at 4:24 AM Michael S. Tsirkin  wrote:
> > >
> > >
> > > With the implementation of napi-tx in virtio driver, we clean tx
> > > descriptors from rx napi handler, for the purpose of reducing tx
> > > complete interrupts. But this introduces a race where tx complete
> > > interrupt has been raised, but the handler finds there is no work to do
> > > because we have done the work in the previous rx interrupt handler.
> > > A similar issue exists with polling from start_xmit, it is however
> > > less common because of the delayed cb optimization of the split ring -
> > > but will likely affect the packed ring once that is more common.
> > >
> > > In particular, this was reported to lead to the following warning msg:
> > > [ 3588.010778] irq 38: nobody cared (try booting with the
> > > "irqpoll" option)
> > > [ 3588.017938] CPU: 4 PID: 0 Comm: swapper/4 Not tainted
> > > 5.3.0-19-generic #20~18.04.2-Ubuntu
> > > [ 3588.017940] Call Trace:
> > > [ 3588.017942]  
> > > [ 3588.017951]  dump_stack+0x63/0x85
> > > [ 3588.017953]  __report_bad_irq+0x35/0xc0
> > > [ 3588.017955]  note_interrupt+0x24b/0x2a0
> > > [ 3588.017956]  handle_irq_event_percpu+0x54/0x80
> > > [ 3588.017957]  handle_irq_event+0x3b/0x60
> > > [ 3588.017958]  handle_edge_irq+0x83/0x1a0
> > > [ 3588.017961]  handle_irq+0x20/0x30
> > > [ 3588.017964]  do_IRQ+0x50/0xe0
> > > [ 3588.017966]  common_interrupt+0xf/0xf
> > > [ 3588.017966]  
> > > [ 3588.017989] handlers:
> > > [ 3588.020374] [<1b9f1da8>] vring_interrupt
> > > [ 3588.025099] Disabling IRQ #38
> > >
> > > This patchset attempts to fix this by cleaning up a bunch of races
> > > related to the handling of sq callbacks (aka tx interrupts).
> > > Somewhat tested but I couldn't reproduce the original issues
> > > reported, sending out for help with testing.
> > >
> > > Wei, does this address the spurious interrupt issue you are
> > > observing? Could you confirm please?
> >
> > Thanks for working on this, Michael. Wei is on leave. I'll try to reproduce.
>
> The original report was generated with five GCE virtual machines
> sharing a sole-tenant node, together sending up to 160 netperf
> tcp_stream connections to 16 other instances. Running Ubuntu 20.04-LTS
> with kernel 5.4.0-1034-gcp.
>
> But the issue can also be reproduced with just two n2-standard-16
> instances, running neper tcp_stream with high parallelism (-T 16 -F
> 240).
>
> It's a bit faster to trigger by reducing the interrupt count threshold
> from 99.9K/100K to 9.9K/10K. And I added additional logging to report
> the unhandled rate even if lower.
>
> Unhandled interrupt rate scales with the number of queue pairs
> (`ethtool -L $DEV combined $NUM`). It is essentially absent at 8
> queues and around 90% at 14 queues. By default these GCE instances
> have one rx and tx interrupt per core, so 16 each. With the rx and tx
> interrupts for a given virtio-queue pinned to the same core.
>
> Unfortunately, commit 3/4 did not have a significant impact on these
> numbers. Have to think a bit more about possible mitigations. At least
> I'll be able to test them more easily now.

Continuing to experiment with approaches to avoid this interrupt disable.

I think it's good to remember that the real bug is the disabling of
interrupts, which may cause stalls in absence of receive events.

The spurious tx interrupts themselves are no worse than processing
the tx and rx interrupts strictly separately without the optimization.
The clean-from-rx optimization just reduces latency. The spurious
interrupts indicate a cycle optimization opportunity for sure. I
support Jason's suggestion for a single combined interrupt for both tx
and rx. That is not feasible as a bugfix for stable, so we need something
to mitigate the impact in the short term.

For that, I suggest just an approach to maintain most benefit
from the opportunistic cleaning, while keeping spurious rate below the
threshold. A few variants:

1. In virtnet_poll_cleantx, a uniformly random draw on whether or not
to attempt to clean. Not trivial to get a good random source that is
essentially free. One example perhaps is sq->vq->num_free & 0x7, but
not sure how randomized those bits are. Pro: this can be implemented
strictly in virtio_net. Con: a probabilistic method will reduce the
incidence rate, but it may still occur at the tail.

2. If also changing virtio_ring, in vring_interrupt count spurious
interrupts and report this count through a new interface. Modify
virtio_net to query and skip the optimization if above a threshold.

2a. slight variant: in virtio_net count consecutive successful
opportunistic cleaning operations. If 100% hit rate, then probably
the tx interrupts are all spurious. Temporarily back off. (virtio_net
is not called for interrupts if there is no work on the ring, so cannot
count these events independently itself).
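
For illustration only, a rough sketch of what variant 2a could look like in
virtnet_poll_cleantx(). The per-queue counter (tx_clean_hits) and the backoff
threshold are made up for the example and are not part of any posted patch:

/* Hypothetical sketch of variant 2a: back off the opportunistic tx cleaning
 * from the rx path once it has been doing all the tx work for a while, so
 * that the next tx interrupts find work again and are not counted as
 * spurious. "tx_clean_hits" would be a new field in struct send_queue.
 */
#define VIRTNET_TX_CLEAN_BACKOFF	64

static void virtnet_poll_cleantx(struct receive_queue *rq)
{
	struct virtnet_info *vi = rq->vq->vdev->priv;
	unsigned int index = vq2rxq(rq->vq);
	struct send_queue *sq = &vi->sq[index];
	struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, index);

	if (!sq->napi.weight)
		return;

	if (sq->tx_clean_hits >= VIRTNET_TX_CLEAN_BACKOFF) {
		sq->tx_clean_hits = 0;	/* let a few tx interrupts do real work */
		return;
	}

	if (__netif_tx_trylock(txq)) {
		unsigned int free_before = sq->vq->num_free;

		free_old_xmit_skbs(sq, true);
		if (sq->vq->num_free != free_before)
			sq->tx_clean_hits++;	/* rx path did the tx work again */
		else
			sq->tx_clean_hits = 0;
		__netif_tx_unlock(txq);
	}

	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
		netif_tx_wake_queue(txq);
}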


Re: [PATCH 7/9] vhost: allow userspace to create workers

2021-06-09 Thread Mike Christie
On 6/7/21 10:19 AM, Stefan Hajnoczi wrote:
> My concern is that threads should probably be accounted against
> RLIMIT_NPROC and max_threads rather than something indirect like 128 *
> RLIMIT_NOFILE (a userspace process can only have RLIMIT_NOFILE
> vhost-user file descriptors open).
> 

Ah ok, I see what you want I think.

Ok, I think the options are:

0. Nothing. Just use existing indirect/RLIMIT_NOFILE.

1. Do something like io_uring's create_io_thread/copy_process. If we call
copy_process from the vhost ioctl context, then the userspace process that
did the ioctl will have its process count incremented and checked against
its rlimit.

The drawbacks:
- This gets a little more complicated than just calling copy_process though.
We end up duplicating a lot of the kthread API.
- We have to deal with new error cases like the parent exiting early.
- I think all devs sharing a worker have to have the same owner. kthread_use_mm
and kthread_unuse_mm to switch between mm's for different owners' devs seem to
be causing lots of errors. I'm still looking into this one though.

2.  It's not really what you want, but for unbound work io_uring has a check for
RLIMIT_NPROC in the io_uring code. It does:

wqe->acct[IO_WQ_ACCT_UNBOUND].max_workers =
task_rlimit(current, RLIMIT_NPROC);

then does:

if (!ret && acct->nr_workers < acct->max_workers) {

Drawbacks:
In vhost.c, we could do something similar. It would make sure that vhost.c does
not create more worker threads than the rlimit value, but we wouldn't be
incrementing the userspace process's process count. The userspace process could
then create RLIMIT_NPROC threads and vhost.c could also create RLIMIT_NPROC
threads, so we end up with 2 * RLIMIT_NPROC threads.

3. Change the kthread and copy_process code so we can pass in the thread
(or its creds or some struct that has the values that need to be checked) that
needs to be checked and updated.

Drawback:
This might be considered too ugly given how much of a special case vhost is. For
example, we would need checks/code like the io_thread/PF_IO_WORKER code in
copy_process for io_uring. I can see how adding that was justified for io_uring
because it affects so many users, but I can also see how vhost is not special enough.
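
To make option 2 concrete, a rough sketch of what the cap could look like in
vhost.c. The nr_workers counter and the helper name are invented for the
example; this is not a posted patch:

/* Hypothetical sketch of option 2: refuse to create more vhost worker
 * threads than the owner's RLIMIT_NPROC, similar to the io-wq check in
 * io_uring. Note this only caps vhost's own threads; it does not charge
 * them against the userspace process, so the combined total can still
 * reach 2 * RLIMIT_NPROC as described above.
 */
static int vhost_worker_try_create(struct vhost_dev *dev)
{
	unsigned int max_workers = task_rlimit(current, RLIMIT_NPROC);
	struct task_struct *worker;

	if (dev->nr_workers >= max_workers)
		return -EMFILE;

	worker = kthread_create(vhost_worker, dev, "vhost-%d", current->pid);
	if (IS_ERR(worker))
		return PTR_ERR(worker);

	dev->nr_workers++;
	dev->worker = worker;
	wake_up_process(worker);
	return 0;
}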









[PATCH 7/9] drm/xen: Implement mmap as GEM object function

2021-06-09 Thread Thomas Zimmermann
Moving the driver-specific mmap code into a GEM object function allows
for using DRM helpers for various mmap callbacks.

The respective xen functions are being removed. The file_operations
structure fops is now being created by the helper macro
DEFINE_DRM_GEM_FOPS().

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/xen/xen_drm_front.c |  16 +---
 drivers/gpu/drm/xen/xen_drm_front_gem.c | 108 +---
 drivers/gpu/drm/xen/xen_drm_front_gem.h |   7 --
 3 files changed, 44 insertions(+), 87 deletions(-)

diff --git a/drivers/gpu/drm/xen/xen_drm_front.c 
b/drivers/gpu/drm/xen/xen_drm_front.c
index 9f14d99c763c..434064c820e8 100644
--- a/drivers/gpu/drm/xen/xen_drm_front.c
+++ b/drivers/gpu/drm/xen/xen_drm_front.c
@@ -469,19 +469,7 @@ static void xen_drm_drv_release(struct drm_device *dev)
kfree(drm_info);
 }
 
-static const struct file_operations xen_drm_dev_fops = {
-   .owner  = THIS_MODULE,
-   .open   = drm_open,
-   .release= drm_release,
-   .unlocked_ioctl = drm_ioctl,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl   = drm_compat_ioctl,
-#endif
-   .poll   = drm_poll,
-   .read   = drm_read,
-   .llseek = no_llseek,
-   .mmap   = xen_drm_front_gem_mmap,
-};
+DEFINE_DRM_GEM_FOPS(xen_drm_dev_fops);
 
 static const struct drm_driver xen_drm_driver = {
.driver_features   = DRIVER_GEM | DRIVER_MODESET | 
DRIVER_ATOMIC,
@@ -489,7 +477,7 @@ static const struct drm_driver xen_drm_driver = {
.prime_handle_to_fd= drm_gem_prime_handle_to_fd,
.prime_fd_to_handle= drm_gem_prime_fd_to_handle,
.gem_prime_import_sg_table = xen_drm_front_gem_import_sg_table,
-   .gem_prime_mmap= xen_drm_front_gem_prime_mmap,
+   .gem_prime_mmap= drm_gem_prime_mmap,
.dumb_create   = xen_drm_drv_dumb_create,
.fops  = &xen_drm_dev_fops,
.name  = "xendrm-du",
diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c 
b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index b293c67230ef..dd358ba2bf8e 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -57,6 +57,47 @@ static void gem_free_pages_array(struct xen_gem_object 
*xen_obj)
xen_obj->pages = NULL;
 }
 
+static int xen_drm_front_gem_object_mmap(struct drm_gem_object *gem_obj,
+struct vm_area_struct *vma)
+{
+   struct xen_gem_object *xen_obj = to_xen_gem_obj(gem_obj);
+   int ret;
+
+   vma->vm_ops = gem_obj->funcs->vm_ops;
+
+   /*
+* Clear the VM_PFNMAP flag that was set by drm_gem_mmap(), and set the
+* vm_pgoff (used as a fake buffer offset by DRM) to 0 as we want to map
+* the whole buffer.
+*/
+   vma->vm_flags &= ~VM_PFNMAP;
+   vma->vm_flags |= VM_MIXEDMAP;
+   vma->vm_pgoff = 0;
+
+   /*
+* According to Xen on ARM ABI (xen/include/public/arch-arm.h):
+* all memory which is shared with other entities in the system
+* (including the hypervisor and other guests) must reside in memory
+* which is mapped as Normal Inner Write-Back Outer Write-Back
+* Inner-Shareable.
+*/
+   vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
+
+   /*
+* vm_operations_struct.fault handler will be called if CPU access
+* to VM is here. For GPUs this isn't the case, because CPU  doesn't
+* touch the memory. Insert pages now, so both CPU and GPU are happy.
+*
+* FIXME: as we insert all the pages now then no .fault handler must
+* be called, so don't provide one
+*/
+   ret = vm_map_pages(vma, xen_obj->pages, xen_obj->num_pages);
+   if (ret < 0)
+   DRM_ERROR("Failed to map pages into vma: %d\n", ret);
+
+   return ret;
+}
+
 static const struct vm_operations_struct xen_drm_drv_vm_ops = {
.open   = drm_gem_vm_open,
.close  = drm_gem_vm_close,
@@ -67,6 +108,7 @@ static const struct drm_gem_object_funcs 
xen_drm_front_gem_object_funcs = {
.get_sg_table = xen_drm_front_gem_get_sg_table,
.vmap = xen_drm_front_gem_prime_vmap,
.vunmap = xen_drm_front_gem_prime_vunmap,
+   .mmap = xen_drm_front_gem_object_mmap,
.vm_ops = &xen_drm_drv_vm_ops,
 };
 
@@ -238,58 +280,6 @@ xen_drm_front_gem_import_sg_table(struct drm_device *dev,
return &xen_obj->base;
 }
 
-static int gem_mmap_obj(struct xen_gem_object *xen_obj,
-   struct vm_area_struct *vma)
-{
-   int ret;
-
-   /*
-* clear the VM_PFNMAP flag that was set by drm_gem_mmap(), and set the
-* vm_pgoff (used as a fake buffer offset by DRM) to 0 as we want to map
-* the whole buffer.
-*/
-   vma->vm_flags &= ~VM_PFNMAP;
-   vma->vm_flags |= VM_MIXEDMAP;
-   vma->vm_pgoff 

[PATCH 8/9] drm/rockchip: Implement mmap as GEM object function

2021-06-09 Thread Thomas Zimmermann
Moving the driver-specific mmap code into a GEM object function allows
for using DRM helpers for various mmap callbacks.

The respective rockchip functions are being removed. The file_operations
structure fops is now being created by the helper macro
DEFINE_DRM_GEM_FOPS().

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/rockchip/rockchip_drm_drv.c   | 13 +-
 drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c |  3 +-
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c   | 44 +--
 drivers/gpu/drm/rockchip/rockchip_drm_gem.h   |  7 ---
 4 files changed, 15 insertions(+), 52 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c 
b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
index b730b8d5d949..2e3ab573a817 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
@@ -208,16 +208,7 @@ static void rockchip_drm_unbind(struct device *dev)
drm_dev_put(drm_dev);
 }
 
-static const struct file_operations rockchip_drm_driver_fops = {
-   .owner = THIS_MODULE,
-   .open = drm_open,
-   .mmap = rockchip_gem_mmap,
-   .poll = drm_poll,
-   .read = drm_read,
-   .unlocked_ioctl = drm_ioctl,
-   .compat_ioctl = drm_compat_ioctl,
-   .release = drm_release,
-};
+DEFINE_DRM_GEM_FOPS(rockchip_drm_driver_fops);
 
 static const struct drm_driver rockchip_drm_driver = {
.driver_features= DRIVER_MODESET | DRIVER_GEM | DRIVER_ATOMIC,
@@ -226,7 +217,7 @@ static const struct drm_driver rockchip_drm_driver = {
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import_sg_table  = rockchip_gem_prime_import_sg_table,
-   .gem_prime_mmap = rockchip_gem_mmap_buf,
+   .gem_prime_mmap = drm_gem_prime_mmap,
.fops   = &rockchip_drm_driver_fops,
.name   = DRIVER_NAME,
.desc   = DRIVER_DESC,
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c 
b/drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c
index 2fdc455c4ad7..d8418dd39d0e 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "rockchip_drm_drv.h"
@@ -24,7 +25,7 @@ static int rockchip_fbdev_mmap(struct fb_info *info,
struct drm_fb_helper *helper = info->par;
struct rockchip_drm_private *private = to_drm_private(helper);
 
-   return rockchip_gem_mmap_buf(private->fbdev_bo, vma);
+   return drm_gem_prime_mmap(private->fbdev_bo, vma);
 }
 
 static const struct fb_ops rockchip_drm_fbdev_ops = {
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c 
b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
index 7971f57436dd..63eb73b624aa 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
@@ -240,12 +240,22 @@ static int rockchip_drm_gem_object_mmap(struct 
drm_gem_object *obj,
int ret;
struct rockchip_gem_object *rk_obj = to_rockchip_obj(obj);
 
+   /*
+* Set vm_pgoff (used as a fake buffer offset by DRM) to 0 and map the
+* whole buffer from the start.
+*/
+   vma->vm_pgoff = 0;
+
/*
 * We allocated a struct page table for rk_obj, so clear
 * VM_PFNMAP flag that was set by drm_gem_mmap_obj()/drm_gem_mmap().
 */
+   vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
vma->vm_flags &= ~VM_PFNMAP;
 
+   vma->vm_page_prot = 
pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+   vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
+
if (rk_obj->pages)
ret = rockchip_drm_gem_object_mmap_iommu(obj, vma);
else
@@ -257,39 +267,6 @@ static int rockchip_drm_gem_object_mmap(struct 
drm_gem_object *obj,
return ret;
 }
 
-int rockchip_gem_mmap_buf(struct drm_gem_object *obj,
- struct vm_area_struct *vma)
-{
-   int ret;
-
-   ret = drm_gem_mmap_obj(obj, obj->size, vma);
-   if (ret)
-   return ret;
-
-   return rockchip_drm_gem_object_mmap(obj, vma);
-}
-
-/* drm driver mmap file operations */
-int rockchip_gem_mmap(struct file *filp, struct vm_area_struct *vma)
-{
-   struct drm_gem_object *obj;
-   int ret;
-
-   ret = drm_gem_mmap(filp, vma);
-   if (ret)
-   return ret;
-
-   /*
-* Set vm_pgoff (used as a fake buffer offset by DRM) to 0 and map the
-* whole buffer from the start.
-*/
-   vma->vm_pgoff = 0;
-
-   obj = vma->vm_private_data;
-
-   return rockchip_drm_gem_object_mmap(obj, vma);
-}
-
 static void rockchip_gem_release_object(struct rockchip_gem_object *rk_obj)
 {
drm_gem_object_release(&rk_obj->base);
@@ -301,6 +278,7 @@ static const struct drm_gem_object_funcs 
rockchip_gem_object_funcs = {
.get_sg_table = 

[PATCH 5/9] drm/qxl: Remove empty qxl_gem_prime_mmap()

2021-06-09 Thread Thomas Zimmermann
The function qxl_gem_prime_mmap() returns an error. The two callers
of gem_prime_mmap are drm_fbdev_fb_mmap() and drm_gem_dmabuf_mmap(),
which both already handle NULL-callbacks with an error code. So clear
gem_prime_mmap in qxl and remove qxl_gem_prime_mmap().

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/qxl/qxl_drv.c   | 1 -
 drivers/gpu/drm/qxl/qxl_drv.h   | 2 --
 drivers/gpu/drm/qxl/qxl_prime.c | 6 --
 3 files changed, 9 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
index 854e6c5a563f..b3d75ea7e6b3 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.c
+++ b/drivers/gpu/drm/qxl/qxl_drv.c
@@ -281,7 +281,6 @@ static struct drm_driver qxl_driver = {
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import_sg_table = qxl_gem_prime_import_sg_table,
-   .gem_prime_mmap = qxl_gem_prime_mmap,
.fops = &qxl_fops,
.ioctls = qxl_ioctls,
.irq_handler = qxl_irq_handler,
diff --git a/drivers/gpu/drm/qxl/qxl_drv.h b/drivers/gpu/drm/qxl/qxl_drv.h
index dd6abee55f56..f95885a8bd2b 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.h
+++ b/drivers/gpu/drm/qxl/qxl_drv.h
@@ -434,8 +434,6 @@ struct drm_gem_object *qxl_gem_prime_import_sg_table(
 int qxl_gem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map);
 void qxl_gem_prime_vunmap(struct drm_gem_object *obj,
  struct dma_buf_map *map);
-int qxl_gem_prime_mmap(struct drm_gem_object *obj,
-   struct vm_area_struct *vma);
 
 /* qxl_irq.c */
 int qxl_irq_init(struct qxl_device *qdev);
diff --git a/drivers/gpu/drm/qxl/qxl_prime.c b/drivers/gpu/drm/qxl/qxl_prime.c
index 0628d1cc91fe..4a10cb0a413b 100644
--- a/drivers/gpu/drm/qxl/qxl_prime.c
+++ b/drivers/gpu/drm/qxl/qxl_prime.c
@@ -73,9 +73,3 @@ void qxl_gem_prime_vunmap(struct drm_gem_object *obj,
 
qxl_bo_vunmap(bo);
 }
-
-int qxl_gem_prime_mmap(struct drm_gem_object *obj,
-  struct vm_area_struct *area)
-{
-   return -ENOSYS;
-}
-- 
2.31.1



[PATCH 6/9] drm/vgem: Implement mmap as GEM object function

2021-06-09 Thread Thomas Zimmermann
Moving the driver-specific mmap code into a GEM object function allows
for using DRM helpers for various mmap callbacks.

The respective vgem functions are being removed. The file_operations
structure vgem_driver_fops is now being created by the helper macro
DEFINE_DRM_GEM_FOPS().

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/vgem/vgem_drv.c | 46 -
 1 file changed, 5 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index bf38a7e319d1..df634aa52638 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -239,32 +239,7 @@ static struct drm_ioctl_desc vgem_ioctls[] = {
DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, 
DRM_RENDER_ALLOW),
 };
 
-static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
-{
-   unsigned long flags = vma->vm_flags;
-   int ret;
-
-   ret = drm_gem_mmap(filp, vma);
-   if (ret)
-   return ret;
-
-   /* Keep the WC mmaping set by drm_gem_mmap() but our pages
-* are ordinary and not special.
-*/
-   vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
-   return 0;
-}
-
-static const struct file_operations vgem_driver_fops = {
-   .owner  = THIS_MODULE,
-   .open   = drm_open,
-   .mmap   = vgem_mmap,
-   .poll   = drm_poll,
-   .read   = drm_read,
-   .unlocked_ioctl = drm_ioctl,
-   .compat_ioctl   = drm_compat_ioctl,
-   .release= drm_release,
-};
+DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
 
 static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
 {
@@ -387,24 +362,12 @@ static void vgem_prime_vunmap(struct drm_gem_object *obj, 
struct dma_buf_map *ma
vgem_unpin_pages(bo);
 }
 
-static int vgem_prime_mmap(struct drm_gem_object *obj,
-  struct vm_area_struct *vma)
+static int vgem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct 
*vma)
 {
-   int ret;
-
-   if (obj->size < vma->vm_end - vma->vm_start)
-   return -EINVAL;
-
-   if (!obj->filp)
-   return -ENODEV;
-
-   ret = call_mmap(obj->filp, vma);
-   if (ret)
-   return ret;
-
vma_set_file(vma, obj->filp);
vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
vma->vm_page_prot = 
pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+   vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
 
return 0;
 }
@@ -416,6 +379,7 @@ static const struct drm_gem_object_funcs 
vgem_gem_object_funcs = {
.get_sg_table = vgem_prime_get_sg_table,
.vmap = vgem_prime_vmap,
.vunmap = vgem_prime_vunmap,
+   .mmap = vgem_prime_mmap,
.vm_ops = &vgem_gem_vm_ops,
 };
 
@@ -433,7 +397,7 @@ static const struct drm_driver vgem_driver = {
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import = vgem_prime_import,
.gem_prime_import_sg_table = vgem_prime_import_sg_table,
-   .gem_prime_mmap = vgem_prime_mmap,
+   .gem_prime_mmap = drm_gem_prime_mmap,
 
.name   = DRIVER_NAME,
.desc   = DRIVER_DESC,
-- 
2.31.1



[PATCH 9/9] drm: Update documentation and TODO of gem_prime_mmap hook

2021-06-09 Thread Thomas Zimmermann
The hook gem_prime_mmap in struct drm_driver is deprecated. Document
the new requirements.

Signed-off-by: Thomas Zimmermann 
---
 Documentation/gpu/todo.rst | 11 ---
 include/drm/drm_drv.h  | 11 +++
 2 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
index 12e61869939e..50ad731d579b 100644
--- a/Documentation/gpu/todo.rst
+++ b/Documentation/gpu/todo.rst
@@ -268,17 +268,6 @@ Contact: Daniel Vetter
 
 Level: Intermediate
 
-Clean up mmap forwarding
-
-
-A lot of drivers forward gem mmap calls to dma-buf mmap for imported buffers.
-And also a lot of them forward dma-buf mmap to the gem mmap implementations.
-There's drm_gem_prime_mmap() for this now, but still needs to be rolled out.
-
-Contact: Daniel Vetter
-
-Level: Intermediate
-
 Generic fbdev defio support
 ---
 
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index b439ae1921b8..40d93a52cf7a 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -385,11 +385,14 @@ struct drm_driver {
 * mmap hook for GEM drivers, used to implement dma-buf mmap in the
 * PRIME helpers.
 *
-* FIXME: There's way too much duplication going on here, and also moved
-* to &drm_gem_object_funcs.
+* This hook only exists for historical reasons. Drivers must use
+* drm_gem_prime_mmap() to implement it.
+*
+* FIXME: Convert all drivers to implement mmap in struct
+* &drm_gem_object_funcs and inline drm_gem_prime_mmap() into
+* its callers. This hook should be removed afterwards.
 */
-   int (*gem_prime_mmap)(struct drm_gem_object *obj,
-   struct vm_area_struct *vma);
+   int (*gem_prime_mmap)(struct drm_gem_object *obj, struct vm_area_struct 
*vma);
 
/**
 * @dumb_create:
-- 
2.31.1



[PATCH 4/9] drm/msm: Implement mmap as GEM object function

2021-06-09 Thread Thomas Zimmermann
Moving the driver-specific mmap code into a GEM object function allows
for using DRM helpers for various mmap callbacks.

The respective msm functions are being removed. The file_operations
structure fops is now being created by the helper macro
DEFINE_DRM_GEM_FOPS().

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/msm/msm_drv.c   | 14 +-
 drivers/gpu/drm/msm/msm_drv.h   |  1 -
 drivers/gpu/drm/msm/msm_fbdev.c | 10 +
 drivers/gpu/drm/msm/msm_gem.c   | 67 -
 drivers/gpu/drm/msm/msm_gem.h   |  3 --
 drivers/gpu/drm/msm/msm_gem_prime.c | 11 -
 6 files changed, 31 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index fe7d17cd35ec..f62eaedfc0d7 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -985,17 +985,7 @@ static const struct drm_ioctl_desc msm_ioctls[] = {
DRM_IOCTL_DEF_DRV(MSM_SUBMITQUEUE_QUERY, msm_ioctl_submitqueue_query, 
DRM_RENDER_ALLOW),
 };
 
-static const struct file_operations fops = {
-   .owner  = THIS_MODULE,
-   .open   = drm_open,
-   .release= drm_release,
-   .unlocked_ioctl = drm_ioctl,
-   .compat_ioctl   = drm_compat_ioctl,
-   .poll   = drm_poll,
-   .read   = drm_read,
-   .llseek = no_llseek,
-   .mmap   = msm_gem_mmap,
-};
+DEFINE_DRM_GEM_FOPS(fops);
 
 static const struct drm_driver msm_driver = {
.driver_features= DRIVER_GEM |
@@ -1015,7 +1005,7 @@ static const struct drm_driver msm_driver = {
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import_sg_table = msm_gem_prime_import_sg_table,
-   .gem_prime_mmap = msm_gem_prime_mmap,
+   .gem_prime_mmap = drm_gem_prime_mmap,
 #ifdef CONFIG_DEBUG_FS
.debugfs_init   = msm_debugfs_init,
 #endif
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 2668941df529..8f1e0d7c8bbb 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -300,7 +300,6 @@ void msm_gem_shrinker_cleanup(struct drm_device *dev);
 struct sg_table *msm_gem_prime_get_sg_table(struct drm_gem_object *obj);
 int msm_gem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map);
 void msm_gem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map);
-int msm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma);
 struct drm_gem_object *msm_gem_prime_import_sg_table(struct drm_device *dev,
struct dma_buf_attachment *attach, struct sg_table *sg);
 int msm_gem_prime_pin(struct drm_gem_object *obj);
diff --git a/drivers/gpu/drm/msm/msm_fbdev.c b/drivers/gpu/drm/msm/msm_fbdev.c
index 227404077e39..07225907fd2d 100644
--- a/drivers/gpu/drm/msm/msm_fbdev.c
+++ b/drivers/gpu/drm/msm/msm_fbdev.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "msm_drv.h"
 #include "msm_gem.h"
@@ -48,15 +49,8 @@ static int msm_fbdev_mmap(struct fb_info *info, struct 
vm_area_struct *vma)
struct drm_fb_helper *helper = (struct drm_fb_helper *)info->par;
struct msm_fbdev *fbdev = to_msm_fbdev(helper);
struct drm_gem_object *bo = msm_framebuffer_bo(fbdev->fb, 0);
-   int ret = 0;
 
-   ret = drm_gem_mmap_obj(bo, bo->size, vma);
-   if (ret) {
-   pr_err("%s:drm_gem_mmap_obj fail\n", __func__);
-   return ret;
-   }
-
-   return msm_gem_mmap_obj(bo, vma);
+   return drm_gem_prime_mmap(bo, vma);
 }
 
 static int msm_fbdev_create(struct drm_fb_helper *helper,
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index a94a43de95ef..09fd1a990b3c 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -211,46 +211,6 @@ void msm_gem_put_pages(struct drm_gem_object *obj)
msm_gem_unlock(obj);
 }
 
-int msm_gem_mmap_obj(struct drm_gem_object *obj,
-   struct vm_area_struct *vma)
-{
-   struct msm_gem_object *msm_obj = to_msm_bo(obj);
-
-   vma->vm_flags &= ~VM_PFNMAP;
-   vma->vm_flags |= VM_MIXEDMAP;
-
-   if (msm_obj->flags & MSM_BO_WC) {
-   vma->vm_page_prot = 
pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
-   } else if (msm_obj->flags & MSM_BO_UNCACHED) {
-   vma->vm_page_prot = 
pgprot_noncached(vm_get_page_prot(vma->vm_flags));
-   } else {
-   /*
-* Shunt off cached objs to shmem file so they have their own
-* address_space (so unmap_mapping_range does what we want,
-* in particular in the case of mmap'd dmabufs)
-*/
-   vma->vm_pgoff = 0;
-   vma_set_file(vma, obj->filp);
-
-   vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
-   }
-
-   return 0;
-}
-

[PATCH 2/9] drm/exynox: Implement mmap as GEM object function

2021-06-09 Thread Thomas Zimmermann
Moving the driver-specific mmap code into a GEM object function allows
for using DRM helpers for various mmap callbacks.

The respective exynos functions are being removed. The file_operations
structure exynos_drm_driver_fops is now being created by the helper macro
DEFINE_DRM_GEM_FOPS().

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/exynos/exynos_drm_drv.c   | 13 ++-
 drivers/gpu/drm/exynos/exynos_drm_fbdev.c | 20 ++-
 drivers/gpu/drm/exynos/exynos_drm_gem.c   | 43 +--
 drivers/gpu/drm/exynos/exynos_drm_gem.h   |  5 ---
 4 files changed, 13 insertions(+), 68 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c 
b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index e60257f1f24b..1d46751cad02 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -102,16 +102,7 @@ static const struct drm_ioctl_desc exynos_ioctls[] = {
DRM_RENDER_ALLOW),
 };
 
-static const struct file_operations exynos_drm_driver_fops = {
-   .owner  = THIS_MODULE,
-   .open   = drm_open,
-   .mmap   = exynos_drm_gem_mmap,
-   .poll   = drm_poll,
-   .read   = drm_read,
-   .unlocked_ioctl = drm_ioctl,
-   .compat_ioctl = drm_compat_ioctl,
-   .release= drm_release,
-};
+DEFINE_DRM_GEM_FOPS(exynos_drm_driver_fops);
 
 static const struct drm_driver exynos_drm_driver = {
.driver_features= DRIVER_MODESET | DRIVER_GEM
@@ -124,7 +115,7 @@ static const struct drm_driver exynos_drm_driver = {
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import   = exynos_drm_gem_prime_import,
.gem_prime_import_sg_table  = exynos_drm_gem_prime_import_sg_table,
-   .gem_prime_mmap = exynos_drm_gem_prime_mmap,
+   .gem_prime_mmap = drm_gem_prime_mmap,
.ioctls = exynos_ioctls,
.num_ioctls = ARRAY_SIZE(exynos_ioctls),
.fops   = &exynos_drm_driver_fops,
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c 
b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
index 5147f5929be7..02c97b9ca926 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -39,25 +40,8 @@ static int exynos_drm_fb_mmap(struct fb_info *info,
struct drm_fb_helper *helper = info->par;
struct exynos_drm_fbdev *exynos_fbd = to_exynos_fbdev(helper);
struct exynos_drm_gem *exynos_gem = exynos_fbd->exynos_gem;
-   unsigned long vm_size;
-   int ret;
-
-   vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
-
-   vm_size = vma->vm_end - vma->vm_start;
-
-   if (vm_size > exynos_gem->size)
-   return -EINVAL;
 
-   ret = dma_mmap_attrs(to_dma_dev(helper->dev), vma, exynos_gem->cookie,
-exynos_gem->dma_addr, exynos_gem->size,
-exynos_gem->dma_attrs);
-   if (ret < 0) {
-   DRM_DEV_ERROR(to_dma_dev(helper->dev), "failed to mmap.\n");
-   return ret;
-   }
-
-   return 0;
+   return drm_gem_prime_mmap(&exynos_gem->base, vma);
 }
 
 static const struct fb_ops exynos_drm_fb_ops = {
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c 
b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index 4396224227d1..c4b63902ee7a 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -17,6 +17,8 @@
 #include "exynos_drm_drv.h"
 #include "exynos_drm_gem.h"
 
+static int exynos_drm_gem_mmap(struct drm_gem_object *obj, struct 
vm_area_struct *vma);
+
 static int exynos_drm_alloc_buf(struct exynos_drm_gem *exynos_gem, bool kvmap)
 {
struct drm_device *dev = exynos_gem->base.dev;
@@ -135,6 +137,7 @@ static const struct vm_operations_struct 
exynos_drm_gem_vm_ops = {
 static const struct drm_gem_object_funcs exynos_drm_gem_object_funcs = {
.free = exynos_drm_gem_free_object,
.get_sg_table = exynos_drm_gem_prime_get_sg_table,
+   .mmap = exynos_drm_gem_mmap,
+   .vm_ops = &exynos_drm_gem_vm_ops,
 };
 
@@ -354,12 +357,16 @@ int exynos_drm_gem_dumb_create(struct drm_file *file_priv,
return 0;
 }
 
-static int exynos_drm_gem_mmap_obj(struct drm_gem_object *obj,
-  struct vm_area_struct *vma)
+static int exynos_drm_gem_mmap(struct drm_gem_object *obj, struct 
vm_area_struct *vma)
 {
struct exynos_drm_gem *exynos_gem = to_exynos_gem(obj);
int ret;
 
+   if (obj->import_attach)
+   return dma_buf_mmap(obj->dma_buf, vma, 0);
+
+   vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
+
DRM_DEV_DEBUG_KMS(to_dma_dev(obj->dev), "flags = 0x%x\n",
  exynos_gem->flags);
 
@@ -385,26 +392,6 @@ static int exynos_drm_gem_mmap_obj(struct drm_gem_object 

[PATCH 3/9] drm/mediatek: Implement mmap as GEM object function

2021-06-09 Thread Thomas Zimmermann
Moving the driver-specific mmap code into a GEM object function allows
for using DRM helpers for various mmap callbacks.

The respective mediatek functions are being removed. The file_operations
structure fops is now being created by the helper macro
DEFINE_DRM_GEM_FOPS().

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mediatek/mtk_drm_drv.c | 13 ++--
 drivers/gpu/drm/mediatek/mtk_drm_gem.c | 44 +++---
 drivers/gpu/drm/mediatek/mtk_drm_gem.h |  3 --
 3 files changed, 14 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c 
b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index b46bdb8985da..bbfefb29c211 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -300,16 +300,7 @@ static void mtk_drm_kms_deinit(struct drm_device *drm)
component_unbind_all(drm->dev, drm);
 }
 
-static const struct file_operations mtk_drm_fops = {
-   .owner = THIS_MODULE,
-   .open = drm_open,
-   .release = drm_release,
-   .unlocked_ioctl = drm_ioctl,
-   .mmap = mtk_drm_gem_mmap,
-   .poll = drm_poll,
-   .read = drm_read,
-   .compat_ioctl = drm_compat_ioctl,
-};
+DEFINE_DRM_GEM_FOPS(mtk_drm_fops);
 
 /*
  * We need to override this because the device used to import the memory is
@@ -332,7 +323,7 @@ static const struct drm_driver mtk_drm_driver = {
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import = mtk_drm_gem_prime_import,
.gem_prime_import_sg_table = mtk_gem_prime_import_sg_table,
-   .gem_prime_mmap = mtk_drm_gem_mmap_buf,
+   .gem_prime_mmap = drm_gem_prime_mmap,
.fops = &mtk_drm_fops,
 
.name = DRIVER_NAME,
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c 
b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
index 280ea0d5e840..d0544962cfc1 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
@@ -14,11 +14,14 @@
 #include "mtk_drm_drv.h"
 #include "mtk_drm_gem.h"
 
+static int mtk_drm_gem_object_mmap(struct drm_gem_object *obj, struct 
vm_area_struct *vma);
+
 static const struct drm_gem_object_funcs mtk_drm_gem_object_funcs = {
.free = mtk_drm_gem_free_object,
.get_sg_table = mtk_gem_prime_get_sg_table,
.vmap = mtk_drm_gem_prime_vmap,
.vunmap = mtk_drm_gem_prime_vunmap,
+   .mmap = mtk_drm_gem_object_mmap,
.vm_ops = &drm_gem_cma_vm_ops,
 };
 
@@ -145,11 +148,19 @@ static int mtk_drm_gem_object_mmap(struct drm_gem_object 
*obj,
struct mtk_drm_gem_obj *mtk_gem = to_mtk_gem_obj(obj);
struct mtk_drm_private *priv = obj->dev->dev_private;
 
+   /*
+* Set vm_pgoff (used as a fake buffer offset by DRM) to 0 and map the
+* whole buffer from the start.
+*/
+   vma->vm_pgoff = 0;
+
/*
 * dma_alloc_attrs() allocated a struct page table for mtk_gem, so clear
 * VM_PFNMAP flag that was set by drm_gem_mmap_obj()/drm_gem_mmap().
 */
-   vma->vm_flags &= ~VM_PFNMAP;
+   vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
+   vma->vm_page_prot = 
pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+   vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
 
ret = dma_mmap_attrs(priv->dma_dev, vma, mtk_gem->cookie,
 mtk_gem->dma_addr, obj->size, mtk_gem->dma_attrs);
@@ -159,37 +170,6 @@ static int mtk_drm_gem_object_mmap(struct drm_gem_object 
*obj,
return ret;
 }
 
-int mtk_drm_gem_mmap_buf(struct drm_gem_object *obj, struct vm_area_struct 
*vma)
-{
-   int ret;
-
-   ret = drm_gem_mmap_obj(obj, obj->size, vma);
-   if (ret)
-   return ret;
-
-   return mtk_drm_gem_object_mmap(obj, vma);
-}
-
-int mtk_drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
-{
-   struct drm_gem_object *obj;
-   int ret;
-
-   ret = drm_gem_mmap(filp, vma);
-   if (ret)
-   return ret;
-
-   obj = vma->vm_private_data;
-
-   /*
-* Set vm_pgoff (used as a fake buffer offset by DRM) to 0 and map the
-* whole buffer from the start.
-*/
-   vma->vm_pgoff = 0;
-
-   return mtk_drm_gem_object_mmap(obj, vma);
-}
-
 /*
  * Allocate a sg_table for this GEM object.
  * Note: Both the table's contents, and the sg_table itself must be freed by
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.h 
b/drivers/gpu/drm/mediatek/mtk_drm_gem.h
index 6da5ccb4b933..9a359a06cb73 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.h
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.h
@@ -39,9 +39,6 @@ struct mtk_drm_gem_obj *mtk_drm_gem_create(struct drm_device 
*dev, size_t size,
   bool alloc_kmap);
 int mtk_drm_gem_dumb_create(struct drm_file *file_priv, struct drm_device *dev,
struct drm_mode_create_dumb *args);
-int mtk_drm_gem_mmap(struct file *filp, struct vm_area_struct *vma);
-int 

[PATCH 1/9] drm/etnaviv: Implement mmap as GEM object function

2021-06-09 Thread Thomas Zimmermann
Moving the driver-specific mmap code into a GEM object function allows
for using DRM helpers for various mmap callbacks.

The respective etnaviv functions are being removed. The file_operations
structure fops is now being created by the helper macro
DEFINE_DRM_GEM_FOPS().

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/etnaviv/etnaviv_drv.c   | 14 ++
 drivers/gpu/drm/etnaviv/etnaviv_drv.h   |  3 ---
 drivers/gpu/drm/etnaviv/etnaviv_gem.c   | 18 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 13 -
 4 files changed, 7 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c 
b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
index f0a07278ad04..7dcc6392792d 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
@@ -468,17 +468,7 @@ static const struct drm_ioctl_desc etnaviv_ioctls[] = {
ETNA_IOCTL(PM_QUERY_SIG, pm_query_sig, DRM_RENDER_ALLOW),
 };
 
-static const struct file_operations fops = {
-   .owner  = THIS_MODULE,
-   .open   = drm_open,
-   .release= drm_release,
-   .unlocked_ioctl = drm_ioctl,
-   .compat_ioctl   = drm_compat_ioctl,
-   .poll   = drm_poll,
-   .read   = drm_read,
-   .llseek = no_llseek,
-   .mmap   = etnaviv_gem_mmap,
-};
+DEFINE_DRM_GEM_FOPS(fops);
 
 static const struct drm_driver etnaviv_drm_driver = {
.driver_features= DRIVER_GEM | DRIVER_RENDER,
@@ -487,7 +477,7 @@ static const struct drm_driver etnaviv_drm_driver = {
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import_sg_table = etnaviv_gem_prime_import_sg_table,
-   .gem_prime_mmap = etnaviv_gem_prime_mmap,
+   .gem_prime_mmap = drm_gem_prime_mmap,
 #ifdef CONFIG_DEBUG_FS
.debugfs_init   = etnaviv_debugfs_init,
 #endif
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.h 
b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
index 003288ebd896..049ae87de9be 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
@@ -47,12 +47,9 @@ struct etnaviv_drm_private {
 int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
struct drm_file *file);
 
-int etnaviv_gem_mmap(struct file *filp, struct vm_area_struct *vma);
 int etnaviv_gem_mmap_offset(struct drm_gem_object *obj, u64 *offset);
 struct sg_table *etnaviv_gem_prime_get_sg_table(struct drm_gem_object *obj);
 int etnaviv_gem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map 
*map);
-int etnaviv_gem_prime_mmap(struct drm_gem_object *obj,
-  struct vm_area_struct *vma);
 struct drm_gem_object *etnaviv_gem_prime_import_sg_table(struct drm_device 
*dev,
struct dma_buf_attachment *attach, struct sg_table *sg);
 int etnaviv_gem_prime_pin(struct drm_gem_object *obj);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index b8fa6ed3dd73..8f1b5af47dd6 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -130,8 +130,7 @@ static int etnaviv_gem_mmap_obj(struct etnaviv_gem_object 
*etnaviv_obj,
 {
pgprot_t vm_page_prot;
 
-   vma->vm_flags &= ~VM_PFNMAP;
-   vma->vm_flags |= VM_MIXEDMAP;
+   vma->vm_flags |= VM_IO | VM_MIXEDMAP | VM_DONTEXPAND | VM_DONTDUMP;
 
vm_page_prot = vm_get_page_prot(vma->vm_flags);
 
@@ -154,19 +153,11 @@ static int etnaviv_gem_mmap_obj(struct etnaviv_gem_object 
*etnaviv_obj,
return 0;
 }
 
-int etnaviv_gem_mmap(struct file *filp, struct vm_area_struct *vma)
+static int etnaviv_gem_mmap(struct drm_gem_object *obj, struct vm_area_struct 
*vma)
 {
-   struct etnaviv_gem_object *obj;
-   int ret;
-
-   ret = drm_gem_mmap(filp, vma);
-   if (ret) {
-   DBG("mmap failed: %d", ret);
-   return ret;
-   }
+   struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
 
-   obj = to_etnaviv_bo(vma->vm_private_data);
-   return obj->ops->mmap(obj, vma);
+   return etnaviv_obj->ops->mmap(etnaviv_obj, vma);
 }
 
 static vm_fault_t etnaviv_gem_fault(struct vm_fault *vmf)
@@ -567,6 +558,7 @@ static const struct drm_gem_object_funcs 
etnaviv_gem_object_funcs = {
.unpin = etnaviv_gem_prime_unpin,
.get_sg_table = etnaviv_gem_prime_get_sg_table,
.vmap = etnaviv_gem_prime_vmap,
+   .mmap = etnaviv_gem_mmap,
+   .vm_ops = &vm_ops,
 };
 
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
index b390dd4d60b7..4d9e8e9b6191 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
@@ -34,19 +34,6 @@ int etnaviv_gem_prime_vmap(struct drm_gem_object *obj, 
struct dma_buf_map *map)
return 0;
 }
 
-int 

[PATCH 0/9] drm: Implement gem_prime_mmap with drm_gem_prime_mmap()

2021-06-09 Thread Thomas Zimmermann
Replace all remaining implementations of struct drm_driver.gem_prime_mmap
with drm_gem_prime_mmap(). For each affected driver, put the mmap code
into struct drm_gem_object_funcs.mmap. With the latter change in place,
create struct file_operations via DEFINE_DRM_GEM_FOPS().

As next steps, remaining drivers can be converted to use drm_gem_prime_mmap()
and drm_gem_mmap() (e.g., Tegra). The default mmap code in drm_gem_prime_mmap()
can be pushed into affected drivers or a helper function. The gem_prime_mmap
hook can probably be removed at some point.

Testing is welcome. I don't have all the necessary hardware.

Thomas Zimmermann (9):
  drm/etnaviv: Implement mmap as GEM object function
  drm/exynox: Implement mmap as GEM object function
  drm/mediatek: Implement mmap as GEM object function
  drm/msm: Implement mmap as GEM object function
  drm/qxl: Remove empty qxl_gem_prime_mmap()
  drm/vgem: Implement mmap as GEM object function
  drm/xen: Implement mmap as GEM object function
  drm/rockchip: Implement mmap as GEM object function
  drm: Update documentation and TODO of gem_prime_mmap hook

 Documentation/gpu/todo.rst|  11 --
 drivers/gpu/drm/etnaviv/etnaviv_drv.c |  14 +--
 drivers/gpu/drm/etnaviv/etnaviv_drv.h |   3 -
 drivers/gpu/drm/etnaviv/etnaviv_gem.c |  18 +--
 drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c   |  13 ---
 drivers/gpu/drm/exynos/exynos_drm_drv.c   |  13 +--
 drivers/gpu/drm/exynos/exynos_drm_fbdev.c |  20 +---
 drivers/gpu/drm/exynos/exynos_drm_gem.c   |  43 ++-
 drivers/gpu/drm/exynos/exynos_drm_gem.h   |   5 -
 drivers/gpu/drm/mediatek/mtk_drm_drv.c|  13 +--
 drivers/gpu/drm/mediatek/mtk_drm_gem.c|  44 ++-
 drivers/gpu/drm/mediatek/mtk_drm_gem.h|   3 -
 drivers/gpu/drm/msm/msm_drv.c |  14 +--
 drivers/gpu/drm/msm/msm_drv.h |   1 -
 drivers/gpu/drm/msm/msm_fbdev.c   |  10 +-
 drivers/gpu/drm/msm/msm_gem.c |  67 +--
 drivers/gpu/drm/msm/msm_gem.h |   3 -
 drivers/gpu/drm/msm/msm_gem_prime.c   |  11 --
 drivers/gpu/drm/qxl/qxl_drv.c |   1 -
 drivers/gpu/drm/qxl/qxl_drv.h |   2 -
 drivers/gpu/drm/qxl/qxl_prime.c   |   6 -
 drivers/gpu/drm/rockchip/rockchip_drm_drv.c   |  13 +--
 drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c |   3 +-
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c   |  44 ++-
 drivers/gpu/drm/rockchip/rockchip_drm_gem.h   |   7 --
 drivers/gpu/drm/vgem/vgem_drv.c   |  46 +---
 drivers/gpu/drm/xen/xen_drm_front.c   |  16 +--
 drivers/gpu/drm/xen/xen_drm_front_gem.c   | 108 +++---
 drivers/gpu/drm/xen/xen_drm_front_gem.h   |   7 --
 include/drm/drm_drv.h |  11 +-
 30 files changed, 136 insertions(+), 434 deletions(-)


base-commit: 70e4d80795934312a3853a4f4f49445ce6db1271
prerequisite-patch-id: c2b2f08f0eccc9f5df0c0da49fa1d36267deb11d
prerequisite-patch-id: c67e5d886a47b7d0266d81100837557fda34cb24
--
2.31.1



Re: [PATCH v1 05/12] mm/memory_hotplug: remove nid parameter from remove_memory() and friends

2021-06-09 Thread David Hildenbrand

On 08.06.21 13:18, David Hildenbrand wrote:

On 08.06.21 13:11, Michael Ellerman wrote:

David Hildenbrand  writes:

There is only a single user remaining. We can simply try to offline all
online nodes - which is fast, because we usually span pages and can skip
such nodes right away.


That makes me slightly nervous, because our big powerpc boxes tend to
trip on these scaling issues before others.

But the spanned pages check is just:

void try_offline_node(int nid)
{
pg_data_t *pgdat = NODE_DATA(nid);
  ...
if (pgdat->node_spanned_pages)
return;

So I guess that's pretty cheap, and it's only O(nodes), which should
never get that big.


Exactly. And if it does turn out to be a problem, we can walk all memory
blocks before removing them, collecting the nid(s).



I might just do the following on top:

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 61bff8f3bfb1..bbc26fdac364 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -2176,7 +2176,9 @@ int __ref offline_pages(unsigned long start_pfn, unsigned 
long nr_pages,
 static int check_memblock_offlined_cb(struct memory_block *mem, void *arg)
 {
int ret = !is_memblock_offlined(mem);
+   int *nid = arg;
 
+   *nid = mem->nid;

if (unlikely(ret)) {
phys_addr_t beginpa, endpa;
 
@@ -2271,10 +2273,10 @@ EXPORT_SYMBOL(try_offline_node);
 
 static int __ref try_remove_memory(u64 start, u64 size)

 {
-   int rc = 0, nid;
struct vmem_altmap mhp_altmap = {};
struct vmem_altmap *altmap = NULL;
unsigned long nr_vmemmap_pages;
+   int rc = 0, nid = NUMA_NO_NODE;
 
BUG_ON(check_hotplug_memory_range(start, size));
 
@@ -2282,8 +2284,12 @@ static int __ref try_remove_memory(u64 start, u64 size)

 * All memory blocks must be offlined before removing memory.  Check
 * whether all memory blocks in question are offline and return error
 * if this is not the case.
+*
+* While at it, determine the nid. Note that if we'd have mixed nodes,
+* we'd only try to offline the last determined one -- which is good
+* enough for the cases we care about.
 */
-   rc = walk_memory_blocks(start, size, NULL, check_memblock_offlined_cb);
+   rc = walk_memory_blocks(start, size, &nid, check_memblock_offlined_cb);
if (rc)
return rc;
 
@@ -2332,7 +2338,7 @@ static int __ref try_remove_memory(u64 start, u64 size)
 
release_mem_region_adjustable(start, size);
 
-   for_each_online_node(nid)

+   if (nid != NUMA_NO_NODE)
try_offline_node(nid);
 
mem_hotplug_done();




--
Thanks,

David / dhildenb



Re: virtio-net: kernel panic in virtio_net.c

2021-06-09 Thread Xuan Zhuo
On Wed, 9 Jun 2021 10:03:53 +0200, Greg KH  wrote:
> On Wed, Jun 09, 2021 at 03:51:20PM +0800, Xuan Zhuo wrote:
> > On Wed, 9 Jun 2021 08:24:20 +0200, Greg KH  
> > wrote:
> > > On Wed, Jun 09, 2021 at 02:08:17PM +0800, Xuan Zhuo wrote:
> > > > On Wed, 9 Jun 2021 06:50:10 +0200, Greg KH  
> > > > wrote:
> > > > > On Wed, Jun 09, 2021 at 09:48:33AM +0800, Xuan Zhuo wrote:
> > > > > > > > With this patch and the latest net branch I no longer get 
> > > > > > > > crashes.
> > > > > > >
> > > > > > > Did this ever get properly submitted to the networking tree to 
> > > > > > > get into
> > > > > > > 5.13-final?
> > > > > >
> > > > > > The patch has been submitted.
> > > > > >
> > > > > > [PATCH net] virtio-net: fix for skb_over_panic inside big mode
> > > > >
> > > > > Submitted where?  Do you have a lore.kernel.org link somewhere?
> > > >
> > > >
> > > > https://lore.kernel.org/netdev/20210603170901.66504-1-xuanz...@linux.alibaba.com/
> > >
> > > So this is commit 1a8024239dac ("virtio-net: fix for skb_over_panic
> > > inside big mode") in Linus's tree, right?
> >
> > YES.
> >
> > >
> > > But why is that referencing:
> > >   Fixes: fb32856b16ad ("virtio-net: page_to_skb() use build_skb when 
> > > there's sufficient tailroom")
> >
> > This problem was indeed introduced in fb32856b16ad.
> >
> > I confirmed that this commit fb32856b16ad was first entered in 5.13-rc1, 
> > and the
> > previous 5.12 did not have this commit fb32856b16ad.
> >
> > I'm not sure if it helped you.
>
> Hm, then what resolves the reported problem that people were having with
> the 5.12.y kernel release?  Is that a separate issue?

Has anyone reported a problem with 5.12.y? I don’t seem to see it. Corentin
only reported a problem with 5.13? Did I miss something?

I confirm that 5.12.9 has no modification of fb32856b16ad.

Thanks.

>
> thanks,
>
> greg k-h

Re: virtio-net: kernel panic in virtio_net.c

2021-06-09 Thread Greg KH
On Wed, Jun 09, 2021 at 03:51:20PM +0800, Xuan Zhuo wrote:
> On Wed, 9 Jun 2021 08:24:20 +0200, Greg KH  wrote:
> > On Wed, Jun 09, 2021 at 02:08:17PM +0800, Xuan Zhuo wrote:
> > > On Wed, 9 Jun 2021 06:50:10 +0200, Greg KH  
> > > wrote:
> > > > On Wed, Jun 09, 2021 at 09:48:33AM +0800, Xuan Zhuo wrote:
> > > > > > > With this patch and the latest net branch I no longer get crashes.
> > > > > >
> > > > > > Did this ever get properly submitted to the networking tree to get 
> > > > > > into
> > > > > > 5.13-final?
> > > > >
> > > > > The patch has been submitted.
> > > > >
> > > > >   [PATCH net] virtio-net: fix for skb_over_panic inside big mode
> > > >
> > > > Submitted where?  Do you have a lore.kernel.org link somewhere?
> > >
> > >
> > > https://lore.kernel.org/netdev/20210603170901.66504-1-xuanz...@linux.alibaba.com/
> >
> > So this is commit 1a8024239dac ("virtio-net: fix for skb_over_panic
> > inside big mode") in Linus's tree, right?
> 
> YES.
> 
> >
> > But why is that referencing:
> > Fixes: fb32856b16ad ("virtio-net: page_to_skb() use build_skb when 
> > there's sufficient tailroom")
> 
> This problem was indeed introduced in fb32856b16ad.
> 
> I confirmed that this commit fb32856b16ad was first entered in 5.13-rc1, and 
> the
> previous 5.12 did not have this commit fb32856b16ad.
> 
> I'm not sure if it helped you.

Hm, then what resolves the reported problem that people were having with
the 5.12.y kernel release?  Is that a separate issue?

thanks,

greg k-h


Re: virtio-net: kernel panic in virtio_net.c

2021-06-09 Thread Xuan Zhuo
On Wed, 9 Jun 2021 08:24:20 +0200, Greg KH  wrote:
> On Wed, Jun 09, 2021 at 02:08:17PM +0800, Xuan Zhuo wrote:
> > On Wed, 9 Jun 2021 06:50:10 +0200, Greg KH  
> > wrote:
> > > On Wed, Jun 09, 2021 at 09:48:33AM +0800, Xuan Zhuo wrote:
> > > > > > With this patch and the latest net branch I no longer get crashes.
> > > > >
> > > > > Did this ever get properly submitted to the networking tree to get 
> > > > > into
> > > > > 5.13-final?
> > > >
> > > > The patch has been submitted.
> > > >
> > > > [PATCH net] virtio-net: fix for skb_over_panic inside big mode
> > >
> > > Submitted where?  Do you have a lore.kernel.org link somewhere?
> >
> >
> > https://lore.kernel.org/netdev/20210603170901.66504-1-xuanz...@linux.alibaba.com/
>
> So this is commit 1a8024239dac ("virtio-net: fix for skb_over_panic
> inside big mode") in Linus's tree, right?

YES.

>
> But why is that referencing:
>   Fixes: fb32856b16ad ("virtio-net: page_to_skb() use build_skb when 
> there's sufficient tailroom")

This problem was indeed introduced in fb32856b16ad.

I confirmed that this commit fb32856b16ad was first entered in 5.13-rc1, and the
previous 5.12 did not have this commit fb32856b16ad.

I'm not sure if it helped you.

Thanks.

>
> when this problem was seen in stable kernels that had a different commit
> backported to it?
>
> Is there nothing needed to be done for the stable kernel trees?
>
> confused,
>
> greg k-h


Re: [RFC v2] virtio-vsock: add description for datagram type

2021-06-09 Thread Stefano Garzarella

On Tue, Jun 08, 2021 at 09:31:28PM -0700, Jiang Wang . wrote:

On Tue, May 18, 2021 at 9:59 PM Jiang Wang .  wrote:


On Tue, May 18, 2021 at 6:02 AM Stefano Garzarella  wrote:
>
> On Mon, May 17, 2021 at 11:33:06PM -0700, Jiang Wang . wrote:
> >On Mon, May 17, 2021 at 4:02 AM Stefano Garzarella  
wrote:
> >>
> >> On Fri, May 14, 2021 at 11:55:29AM -0700, Jiang Wang . wrote:
> >> >On Fri, May 14, 2021 at 8:17 AM Stefano Garzarella  
wrote:
> >> >> On Thu, May 13, 2021 at 04:26:03PM -0700, Jiang Wang . wrote:
> >>
> >> [...]
> >>
> >> >> >I see. I will add some limit to dgram packets. Also, when the
> >> >> >virtqueues
> >> >> >are shared between stream and dgram, both of them need to grab a lock
> >> >> >before using the virtqueue, so one will not completely block another 
one.
> >> >>
> >> >> I'm not worried about the concurrent access that we definitely need to
> >> >> handle with a lock, but more about the uncontrolled packet sending that
> >> >> dgram might have, flooding the queues and preventing others from
> >> >> communicating.
> >> >
> >> >That is a valid concern. Let me explain how I would handle that if we
> >> >don't add two new virtqueues. For dgram, I also add a dgram_send_pkt_list,
> >> >which is similar to send_pkt_list for stream (and seqpacket). But there
> >> >is one difference. The dgram_send_pkt_list has a maximum size setting,
> >> >and keep tracking how many pkts are in the list. The track number
> >> >(dgram_send_pkt_list_size) is  increased when a packet is added
> >> >to the list and is decreased when a packet
> >> >is removed from the list and added to the virtqueue. In
> >> >virtio_transport_send_pkt, if the current
> >> >dgram_send_pkt_list_size is equal
> >> >to the maximum ( let's say 128), then it will not add to the
> >> >dgram_send_pkt_list and return an error to the application.
> >>
> >> For stream socket, we have the send_pkt_list and the send worker because
> >> the virtqueue can be full and the transmitter needs to wait available
> >> slots, because we can't discard packets.
> >> For dgram I think we don't need this, so we can avoid the
> >> dgram_send_pkt_list and directly enqueue packets in the virtqueue.
> >>


On the question of whether we need dgram_send_pkt_list: I tried removing
it, and that works fine for virtio vsock in the guest. On the host, however, we
still need to keep dgram_send_pkt_list, because we cannot reliably access
virtqueue memory from the syscall context of an arbitrary process: the
virtqueue memory lives in the QEMU process's virtual address space and may be
paged out.


I see. In that case I think we can use the virtqueue size as the limit for 
the dgram_send_pkt_list.


I mean, for example, if the virtqueue has 128 elements, we can queue at 
least 128 packets in the dgram_send_pkt_list.
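
Just to make the idea concrete, here is a minimal userspace sketch of such a
bounded list (the names, the pthread locking and the 128 limit are purely
illustrative and not taken from the actual series):

#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Illustrative only: bound the list to the virtqueue size so a dgram
 * sender cannot queue packets without limit. */
#define DGRAM_SEND_PKT_LIST_MAX 128	/* e.g. the virtqueue size */

struct vsock_pkt {
	struct vsock_pkt *next;
	/* header and payload omitted */
};

struct dgram_send_pkt_list {
	pthread_mutex_t lock;
	struct vsock_pkt *head, *tail;
	unsigned int size;		/* dgram_send_pkt_list_size */
};

/* Send path: instead of blocking or queueing without bound, fail and
 * let the application see the error. */
int dgram_send_pkt_list_add(struct dgram_send_pkt_list *l, struct vsock_pkt *p)
{
	int ret = 0;

	pthread_mutex_lock(&l->lock);
	if (l->size >= DGRAM_SEND_PKT_LIST_MAX) {
		ret = -ENOBUFS;
	} else {
		p->next = NULL;
		if (l->tail)
			l->tail->next = p;
		else
			l->head = p;
		l->tail = p;
		l->size++;
	}
	pthread_mutex_unlock(&l->lock);
	return ret;
}

/* Worker path: a packet leaves the list when it is copied into the
 * virtqueue, which frees one slot for the next sender. */
struct vsock_pkt *dgram_send_pkt_list_pop(struct dgram_send_pkt_list *l)
{
	struct vsock_pkt *p;

	pthread_mutex_lock(&l->lock);
	p = l->head;
	if (p) {
		l->head = p->next;
		if (!l->head)
			l->tail = NULL;
		l->size--;
	}
	pthread_mutex_unlock(&l->lock);
	return p;
}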


If you have a better idea go ahead, we can discuss this implementation 
detail in the RFC linux series :-)


Thanks,
Stefano



Re: [External] Re: [RFC v4] virtio-vsock: add description for datagram type

2021-06-09 Thread Stefano Garzarella

On Tue, Jun 08, 2021 at 09:22:26PM -0700, Jiang Wang . wrote:

On Tue, Jun 8, 2021 at 6:46 AM Stefano Garzarella  wrote:


On Fri, May 28, 2021 at 04:01:18AM +, Jiang Wang wrote:
>From: "jiang.wang" 
>
>Add supports for datagram type for virtio-vsock. Datagram
>sockets are connectionless and unreliable. To avoid contention
>with stream and other sockets, add two more virtqueues and
>a new feature bit to identify if those two new queues exist or not.
>
>Also add descriptions for resource management of datagram, which
>does not use the existing credit update mechanism associated with
>stream sockets.
>
>Signed-off-by: Jiang Wang 
>---
>
>V2: addressed the comments for the previous version.
>V3: add description for the mergeable receive buffer.
>V4: add a feature bit for stream and reserve a bit for seqpacket.
>Fix mrg_rxbuf related sentences.
>
> virtio-vsock.tex | 155 ++-
> 1 file changed, 142 insertions(+), 13 deletions(-)
>
>diff --git a/virtio-vsock.tex b/virtio-vsock.tex
>index da7e641..bacac3c 100644
>--- a/virtio-vsock.tex
>+++ b/virtio-vsock.tex
>@@ -9,14 +9,41 @@ \subsection{Device ID}\label{sec:Device Types / Socket 
Device / Device ID}
>
> \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues}
> \begin{description}
>-\item[0] rx
>-\item[1] tx
>+\item[0] stream rx
>+\item[1] stream tx
>+\item[2] datagram rx
>+\item[3] datagram tx
>+\item[4] event

Is there a particular reason to always have the event queue as the last
one?

Maybe it's better to add the datagram queues at the bottom, so the first
3 queues are always the same.


I am not sure. I think the Linux kernel should be fine with what you described,
but I am not sure about QEMU. From the code, I see the virtqueues are allocated
as an array, like the following:

+ #ifdef CONFIG_VHOST_VSOCK_DGRAM
+	struct vhost_virtqueue vhost_vqs[4];	/* stream rx/tx + dgram rx/tx */
+ #else
   	struct vhost_virtqueue vhost_vqs[2];
+ #endif


I see, also vhost_dev_init() requires an array, so I agree that this is 
the best approach, sorry for the noise.


Just be sure to check that everything still works if 
CONFIG_VHOST_VSOCK_DGRAM is defined but the guest has an old driver 
that doesn't support DGRAM, and vice versa.




so I assume the virtqueues for tx/rx need to be contiguous? I can try
putting the new queues at the end and see whether it works.

btw, my qemu change is here:
https://github.com/Jiang1155/qemu/commit/6307aa7a0c347905a31f3ca6577923e2f6dd9d84
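
To make the two layouts under discussion explicit, here is a small sketch
(made-up names, not code from the spec or from the QEMU change above) of how a
driver could derive its queue indices from the negotiated feature set:

#include <stdbool.h>

/* Illustrative only: queue numbering for the two layouts discussed above. */
struct vsock_vq_layout {
	int stream_rx;
	int stream_tx;
	int dgram_rx;	/* -1 when VIRTIO_VSOCK_F_DGRAM was not negotiated */
	int dgram_tx;
	int event;
	int num_vqs;
};

struct vsock_vq_layout vsock_get_vq_layout(bool has_dgram)
{
	struct vsock_vq_layout l = { .stream_rx = 0, .stream_tx = 1 };

	if (has_dgram) {
		/* 5 queues: the datagram queues sit before the event queue */
		l.dgram_rx = 2;
		l.dgram_tx = 3;
		l.event = 4;
		l.num_vqs = 5;
	} else {
		/* 3 queues: same first two queues, the event queue moves up */
		l.dgram_rx = -1;
		l.dgram_tx = -1;
		l.event = 2;
		l.num_vqs = 3;
	}
	return l;
}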


>+\end{description}
>+The virtio socket device uses 5 queues if feature bit VIRTIO_VSOCK_F_DGRAM is
>+set. Otherwise, it only uses 3 queues, as follows.
>+
>+\begin{description}
>+\item[0] stream rx
>+\item[1] stream tx
> \item[2] event
> \end{description}
>
>+When behavior differs between stream and datagram rx/tx virtqueues
>+their full names are used. Common behavior is simply described in
>+terms of rx/tx virtqueues and applies to both stream and datagram
>+virtqueues.
>+
> \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature 
bits}
>
>-There are currently no feature bits defined for this device.
>+\begin{description}
>+\item[VIRTIO_VSOCK_F_STREAM (0)] Device has support for stream socket type.
>+\end{description}
>+
>+\begin{description}
>+\item[VIRTIO_VSOCK_F_DGRAM (2)] Device has support for datagram socket
>type.
>+\end{description}
>+
>+\begin{description}
>+\item[VIRTIO_VSOCK_F_MRG_RXBUF (3)] Driver can merge receive buffers.
>+\end{description}
>+
>+If no feature bits are defined, then assume only VIRTIO_VSOCK_F_STREAM
>is set.

I'd phrase it as "stream sockets are supported", without referencing the
feature bit; something like: "If no feature bits are defined, then
assume the device only supports the stream socket type."


OK.


>
> \subsection{Device configuration layout}\label{sec:Device Types / Socket 
Device / Device configuration layout}
>
>@@ -64,6 +91,8 @@ \subsection{Device Operation}\label{sec:Device Types / 
Socket Device / Device Op
>
> Packets transmitted or received contain a header before the payload:
>
>+If feature VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, use the following 
header.
>+
> \begin{lstlisting}
> struct virtio_vsock_hdr {
>   le64 src_cid;
>@@ -79,6 +108,15 @@ \subsection{Device Operation}\label{sec:Device Types / 
Socket Device / Device Op
> };
> \end{lstlisting}
>
>+If feature VIRTIO_VSOCK_F_MRG_RXBUF is negotiated, use the following header.
>+\begin{lstlisting}
>+struct virtio_vsock_hdr_mrg_rxbuf {
>+  struct virtio_vsock_hdr hdr;
>+  le16 num_buffers;
>+};
>+\end{lstlisting}
>+
>+
> The upper 32 bits of src_cid and dst_cid are reserved and zeroed.
>
> Most packets simply transfer data but control packets are also used for
>@@ -107,6 +145,9 @@ \subsection{Device Operation}\label{sec:Device Types / 
Socket Device / Device Op
>
> \subsubsection{Virtqueue Flow Control}\label{sec:Device Types / Socket Device 
/ Device Operation / Virtqueue Flow Control}
>
>+Flow control applies to stream sockets; 

Re: [PATCH] drm: qxl: ensure surf.data is ininitialized

2021-06-09 Thread Gerd Hoffmann
On Tue, Jun 08, 2021 at 05:13:13PM +0100, Colin King wrote:
> From: Colin Ian King 
> 
> The object surf is not fully initialized and the uninitialized
> field surf.data is being copied by the call to qxl_bo_create
> via the call to qxl_gem_object_create. Set surf.data to zero
> to ensure garbage data from the stack is not being copied.
> 
> Addresses-Coverity: ("Uninitialized scalar variable")
> Fixes: f64122c1f6ad ("drm: add new QXL driver. (v1.4)")
> Signed-off-by: Colin Ian King 

Pushed to drm-misc-next.

thanks,
  Gerd



Re: virtio-net: kernel panic in virtio_net.c

2021-06-09 Thread Greg KH
On Wed, Jun 09, 2021 at 02:08:17PM +0800, Xuan Zhuo wrote:
> On Wed, 9 Jun 2021 06:50:10 +0200, Greg KH  wrote:
> > On Wed, Jun 09, 2021 at 09:48:33AM +0800, Xuan Zhuo wrote:
> > > > > With this patch and the latest net branch I no longer get crashes.
> > > >
> > > > Did this ever get properly submitted to the networking tree to get into
> > > > 5.13-final?
> > >
> > > The patch has been submitted.
> > >
> > >   [PATCH net] virtio-net: fix for skb_over_panic inside big mode
> >
> > Submitted where?  Do you have a lore.kernel.org link somewhere?
> 
> 
> https://lore.kernel.org/netdev/20210603170901.66504-1-xuanz...@linux.alibaba.com/

So this is commit 1a8024239dac ("virtio-net: fix for skb_over_panic
inside big mode") in Linus's tree, right?

But why is that referencing:
Fixes: fb32856b16ad ("virtio-net: page_to_skb() use build_skb when 
there's sufficient tailroom")

when this problem was seen in stable kernels that had a different commit
backported to it?

Is there nothing needed to be done for the stable kernel trees?

confused,

greg k-h


Re: virtio-net: kernel panic in virtio_net.c

2021-06-09 Thread Xuan Zhuo
On Wed, 9 Jun 2021 06:50:10 +0200, Greg KH  wrote:
> On Wed, Jun 09, 2021 at 09:48:33AM +0800, Xuan Zhuo wrote:
> > > > With this patch and the latest net branch I no longer get crashes.
> > >
> > > Did this ever get properly submitted to the networking tree to get into
> > > 5.13-final?
> >
> > The patch has been submitted.
> >
> > [PATCH net] virtio-net: fix for skb_over_panic inside big mode
>
> Submitted where?  Do you have a lore.kernel.org link somewhere?


https://lore.kernel.org/netdev/20210603170901.66504-1-xuanz...@linux.alibaba.com/

Thanks.

>
> thanks,
>
> greg k-h