Re: [PATCH net-next 11/14] vsock: add multi-transports support

2019-11-13 Thread Stefano Garzarella
On Wed, Nov 13, 2019 at 02:30:24PM +, Jorgen Hansen wrote:
> > From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> > Sent: Tuesday, November 12, 2019 11:37 AM
> 
> > > > > You already mentioned that you are working on a fix for loopback
> > > > > here for the guest, but presumably a host could also do loopback.
> > > >
> > > > IIUC we don't support loopback in the host, because in this case the
> > > > application will use the CID_HOST as address, but if we are in a nested
> > > > VM environment we are in trouble.
> > >
> > > If both src and dst CID are CID_HOST, we should be fairly sure that this
> > > Is host loopback, no? If src is anything else, we would do G2H.
> > >
> > 
> > The problem is that we don't know the src until we assign a transport
> > looking at the dst. (or if the user bound the socket to CID_HOST before
> > the connect(), but it is not very common)
> > 
> > So if we are in a L1 and the user uses the local guest CID, it works, but if
> > it uses the HOST_CID, the packet will go to the L0.
> > 
> > If we are in L0, it could be simple, because we can simply check if G2H
> > is not loaded, so any packet to CID_HOST, is host loopback.
> > 
> > I think that if the user uses the IOCTL_VM_SOCKETS_GET_LOCAL_CID, to set
> > the dest CID for the loopback, it works in both cases because we return the
> > HOST_CID in L0, and always the guest CID in L1, also if a H2G is loaded to
> > handle the L2.
> > 
> > Maybe we should document this in the man page.
> 
> Yeah, it seems like a good idea to flesh out the routing behavior for nested
> VMs in the man page.

I'll do it.

> 
> > 
> > But I have a question: Does vmci support the host loopback?
> > I've tried, and it seems not.
> 
> Only for datagrams - not for stream sockets.
>  

Ok, I'll leave the datagram loopback as before.

> > Also vhost-vsock doesn't support it, but virtio-vsock does.
> > 
> > > >
> > > > Since several people asked about this feature at the KVM Forum, I would
> > like
> > > > to add a new VMADDR_CID_LOCAL (i.e. using the reserved 1) and
> > implement
> > > > loopback in the core.
> > > >
> > > > What do you think?
> > >
> > > What kind of use cases are mentioned in the KVM forum for loopback?
> > One concern
> > > is that we have to maintain yet another interprocess communication
> > mechanism,
> > > even though other choices exist already  (and those are likely to be more
> > efficient
> > > given the development time and specific focus that went into those). To
> > me, the
> > > local connections are mainly useful as a way to sanity test the protocol 
> > > and
> > transports.
> > > However, if loopback is compelling, it would make sense have it in the 
> > > core,
> > since it
> > > shouldn't need a specific transport.
> > 
> > The common use cases is for developer point of view, and to test the
> > protocol and transports as you said.
> > 
> > People that are introducing VSOCK support in their projects, would like to
> > test them on their own PC without starting a VM.
> > 
> > The idea is to move the code to handle loopback from the virtio-vsock,
> > in the core, but in another series :-)
> 
> OK, that makes sense.

Thanks,
Stefano

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


RE: [PATCH net-next 11/14] vsock: add multi-transports support

2019-11-13 Thread Jorgen Hansen via Virtualization
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Tuesday, November 12, 2019 11:37 AM

> > > > You already mentioned that you are working on a fix for loopback
> > > > here for the guest, but presumably a host could also do loopback.
> > >
> > > IIUC we don't support loopback in the host, because in this case the
> > > application will use the CID_HOST as address, but if we are in a nested
> > > VM environment we are in trouble.
> >
> > If both src and dst CID are CID_HOST, we should be fairly sure that this
> > Is host loopback, no? If src is anything else, we would do G2H.
> >
> 
> The problem is that we don't know the src until we assign a transport
> looking at the dst. (or if the user bound the socket to CID_HOST before
> the connect(), but it is not very common)
> 
> So if we are in a L1 and the user uses the local guest CID, it works, but if
> it uses the HOST_CID, the packet will go to the L0.
> 
> If we are in L0, it could be simple, because we can simply check if G2H
> is not loaded, so any packet to CID_HOST, is host loopback.
> 
> I think that if the user uses the IOCTL_VM_SOCKETS_GET_LOCAL_CID, to set
> the dest CID for the loopback, it works in both cases because we return the
> HOST_CID in L0, and always the guest CID in L1, also if a H2G is loaded to
> handle the L2.
> 
> Maybe we should document this in the man page.

Yeah, it seems like a good idea to flesh out the routing behavior for nested
VMs in the man page.

> 
> But I have a question: Does vmci support the host loopback?
> I've tried, and it seems not.

Only for datagrams - not for stream sockets.
 
> Also vhost-vsock doesn't support it, but virtio-vsock does.
> 
> > >
> > > Since several people asked about this feature at the KVM Forum, I would
> like
> > > to add a new VMADDR_CID_LOCAL (i.e. using the reserved 1) and
> implement
> > > loopback in the core.
> > >
> > > What do you think?
> >
> > What kind of use cases are mentioned in the KVM forum for loopback?
> One concern
> > is that we have to maintain yet another interprocess communication
> mechanism,
> > even though other choices exist already  (and those are likely to be more
> efficient
> > given the development time and specific focus that went into those). To
> me, the
> > local connections are mainly useful as a way to sanity test the protocol and
> transports.
> > However, if loopback is compelling, it would make sense have it in the core,
> since it
> > shouldn't need a specific transport.
> 
> The common use cases is for developer point of view, and to test the
> protocol and transports as you said.
> 
> People that are introducing VSOCK support in their projects, would like to
> test them on their own PC without starting a VM.
> 
> The idea is to move the code to handle loopback from the virtio-vsock,
> in the core, but in another series :-)

OK, that makes sense.

Thanks,
Jorgen
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net-next 11/14] vsock: add multi-transports support

2019-11-12 Thread Stefano Garzarella
On Tue, Nov 12, 2019 at 09:59:12AM +, Jorgen Hansen wrote:
> > From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> > Sent: Monday, November 11, 2019 6:18 PM
> > To: Jorgen Hansen 
> > Subject: Re: [PATCH net-next 11/14] vsock: add multi-transports support
> > 
> > On Mon, Nov 11, 2019 at 01:53:39PM +, Jorgen Hansen wrote:
> > > > From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> > > > Sent: Wednesday, October 23, 2019 11:56 AM
> > >
> > > Thanks a lot for working on this!
> > >
> > 
> > Thanks to you for the reviews!
> > 
> > > > With the multi-transports support, we can use vsock with nested VMs
> > (using
> > > > also different hypervisors) loading both guest->host and
> > > > host->guest transports at the same time.
> > > >
> > > > Major changes:
> > > > - vsock core module can be loaded regardless of the transports
> > > > - vsock_core_init() and vsock_core_exit() are renamed to
> > > >   vsock_core_register() and vsock_core_unregister()
> > > > - vsock_core_register() has a feature parameter (H2G, G2H, DGRAM)
> > > >   to identify which directions the transport can handle and if it's
> > > >   support DGRAM (only vmci)
> > > > - each stream socket is assigned to a transport when the remote CID
> > > >   is set (during the connect() or when we receive a connection request
> > > >   on a listener socket).
> > >
> > > How about allowing the transport to be set during bind as well? That
> > > would allow an application to ensure that it is using a specific 
> > > transport,
> > > i.e., if it binds to the host CID, it will use H2G, and if it binds to 
> > > something
> > > else it will use G2H? You can still use VMADDR_CID_ANY if you want to
> > > initially listen to both transports.
> > 
> > Do you mean for socket that will call the connect()?
> 
> I was just thinking that in general we know the transport at that point, so we
> could ensure that the socket would only see traffic from the relevant 
> transport,
> but as you mention below -  the updated bind lookup, and the added checks
> when selecting transport should also take care of this, so that is fine.
>  
> > For listener socket the "[PATCH net-next 14/14] vsock: fix bind() behaviour
> > taking care of CID" provides this behaviour.
> > Since the listener sockets don't use any transport specific callback
> > (they don't send any data to the remote peer), but they are used as
> > placeholder,
> > we don't need to assign them to a transport.
> > 
> > >
> > >
> > > >   The remote CID is used to decide which transport to use:
> > > >   - remote CID > VMADDR_CID_HOST will use host->guest transport
> > > >   - remote CID <= VMADDR_CID_HOST will use guest->host transport
> > > > - listener sockets are not bound to any transports since no transport
> > > >   operations are done on it. In this way we can create a listener
> > > >   socket, also if the transports are not loaded or with VMADDR_CID_ANY
> > > >   to listen on all transports.
> > > > - DGRAM sockets are handled as before, since only the vmci_transport
> > > >   provides this feature.
> > > >
> > > > Signed-off-by: Stefano Garzarella 
> > > > ---
> > > > RFC -> v1:
> > > > - documented VSOCK_TRANSPORT_F_* flags
> > > > - fixed vsock_assign_transport() when the socket is already assigned
> > > >   (e.g connection failed)
> > > > - moved features outside of struct vsock_transport, and used as
> > > >   parameter of vsock_core_register()
> > > > ---
> > > >  drivers/vhost/vsock.c   |   5 +-
> > > >  include/net/af_vsock.h  |  17 +-
> > > >  net/vmw_vsock/af_vsock.c| 237 ++--
> > > >  net/vmw_vsock/hyperv_transport.c|  26 ++-
> > > >  net/vmw_vsock/virtio_transport.c|   7 +-
> > > >  net/vmw_vsock/virtio_transport_common.c |  28 ++-
> > > >  net/vmw_vsock/vmci_transport.c  |  31 +++-
> > > >  7 files changed, 270 insertions(+), 81 deletions(-)
> > > >
> > >
> > >
> > > > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> > index
> > > > d89381166028..85d9a147 100644
> > > > --- a/net/vmw_vso

RE: [PATCH net-next 11/14] vsock: add multi-transports support

2019-11-12 Thread Jorgen Hansen via Virtualization
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Monday, November 11, 2019 6:18 PM
> To: Jorgen Hansen 
> Subject: Re: [PATCH net-next 11/14] vsock: add multi-transports support
> 
> On Mon, Nov 11, 2019 at 01:53:39PM +, Jorgen Hansen wrote:
> > > From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> > > Sent: Wednesday, October 23, 2019 11:56 AM
> >
> > Thanks a lot for working on this!
> >
> 
> Thanks to you for the reviews!
> 
> > > With the multi-transports support, we can use vsock with nested VMs
> (using
> > > also different hypervisors) loading both guest->host and
> > > host->guest transports at the same time.
> > >
> > > Major changes:
> > > - vsock core module can be loaded regardless of the transports
> > > - vsock_core_init() and vsock_core_exit() are renamed to
> > >   vsock_core_register() and vsock_core_unregister()
> > > - vsock_core_register() has a feature parameter (H2G, G2H, DGRAM)
> > >   to identify which directions the transport can handle and if it's
> > >   support DGRAM (only vmci)
> > > - each stream socket is assigned to a transport when the remote CID
> > >   is set (during the connect() or when we receive a connection request
> > >   on a listener socket).
> >
> > How about allowing the transport to be set during bind as well? That
> > would allow an application to ensure that it is using a specific transport,
> > i.e., if it binds to the host CID, it will use H2G, and if it binds to 
> > something
> > else it will use G2H? You can still use VMADDR_CID_ANY if you want to
> > initially listen to both transports.
> 
> Do you mean for socket that will call the connect()?

I was just thinking that in general we know the transport at that point, so we
could ensure that the socket would only see traffic from the relevant transport,
but as you mention below -  the updated bind lookup, and the added checks
when selecting transport should also take care of this, so that is fine.
 
> For listener socket the "[PATCH net-next 14/14] vsock: fix bind() behaviour
> taking care of CID" provides this behaviour.
> Since the listener sockets don't use any transport specific callback
> (they don't send any data to the remote peer), but they are used as
> placeholder,
> we don't need to assign them to a transport.
> 
> >
> >
> > >   The remote CID is used to decide which transport to use:
> > >   - remote CID > VMADDR_CID_HOST will use host->guest transport
> > >   - remote CID <= VMADDR_CID_HOST will use guest->host transport
> > > - listener sockets are not bound to any transports since no transport
> > >   operations are done on it. In this way we can create a listener
> > >   socket, also if the transports are not loaded or with VMADDR_CID_ANY
> > >   to listen on all transports.
> > > - DGRAM sockets are handled as before, since only the vmci_transport
> > >   provides this feature.
> > >
> > > Signed-off-by: Stefano Garzarella 
> > > ---
> > > RFC -> v1:
> > > - documented VSOCK_TRANSPORT_F_* flags
> > > - fixed vsock_assign_transport() when the socket is already assigned
> > >   (e.g connection failed)
> > > - moved features outside of struct vsock_transport, and used as
> > >   parameter of vsock_core_register()
> > > ---
> > >  drivers/vhost/vsock.c   |   5 +-
> > >  include/net/af_vsock.h  |  17 +-
> > >  net/vmw_vsock/af_vsock.c| 237 ++--
> > >  net/vmw_vsock/hyperv_transport.c|  26 ++-
> > >  net/vmw_vsock/virtio_transport.c|   7 +-
> > >  net/vmw_vsock/virtio_transport_common.c |  28 ++-
> > >  net/vmw_vsock/vmci_transport.c  |  31 +++-
> > >  7 files changed, 270 insertions(+), 81 deletions(-)
> > >
> >
> >
> > > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> index
> > > d89381166028..85d9a147 100644
> > > --- a/net/vmw_vsock/af_vsock.c
> > > +++ b/net/vmw_vsock/af_vsock.c
> > > @@ -130,7 +130,12 @@ static struct proto vsock_proto = {  #define
> > > VSOCK_DEFAULT_BUFFER_MAX_SIZE (1024 * 256)  #define
> > > VSOCK_DEFAULT_BUFFER_MIN_SIZE 128
> > >
> > > -static const struct vsock_transport *transport_single;
> > > +/* Transport used for host->guest communication */ static const struct
> > > +vsock_transport *transport_h2g;
> > > +/* Transport used for guest->host communi

Re: [PATCH net-next 11/14] vsock: add multi-transports support

2019-11-11 Thread Stefano Garzarella
On Mon, Nov 11, 2019 at 01:53:39PM +, Jorgen Hansen wrote:
> > From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> > Sent: Wednesday, October 23, 2019 11:56 AM
> 
> Thanks a lot for working on this!
> 

Thanks to you for the reviews!

> > With the multi-transports support, we can use vsock with nested VMs (using
> > also different hypervisors) loading both guest->host and
> > host->guest transports at the same time.
> > 
> > Major changes:
> > - vsock core module can be loaded regardless of the transports
> > - vsock_core_init() and vsock_core_exit() are renamed to
> >   vsock_core_register() and vsock_core_unregister()
> > - vsock_core_register() has a feature parameter (H2G, G2H, DGRAM)
> >   to identify which directions the transport can handle and if it's
> >   support DGRAM (only vmci)
> > - each stream socket is assigned to a transport when the remote CID
> >   is set (during the connect() or when we receive a connection request
> >   on a listener socket).
> 
> How about allowing the transport to be set during bind as well? That
> would allow an application to ensure that it is using a specific transport,
> i.e., if it binds to the host CID, it will use H2G, and if it binds to 
> something
> else it will use G2H? You can still use VMADDR_CID_ANY if you want to
> initially listen to both transports.

Do you mean for socket that will call the connect()?

For listener socket the "[PATCH net-next 14/14] vsock: fix bind() behaviour
taking care of CID" provides this behaviour.
Since the listener sockets don't use any transport specific callback
(they don't send any data to the remote peer), but they are used as placeholder,
we don't need to assign them to a transport.

> 
> 
> >   The remote CID is used to decide which transport to use:
> >   - remote CID > VMADDR_CID_HOST will use host->guest transport
> >   - remote CID <= VMADDR_CID_HOST will use guest->host transport
> > - listener sockets are not bound to any transports since no transport
> >   operations are done on it. In this way we can create a listener
> >   socket, also if the transports are not loaded or with VMADDR_CID_ANY
> >   to listen on all transports.
> > - DGRAM sockets are handled as before, since only the vmci_transport
> >   provides this feature.
> > 
> > Signed-off-by: Stefano Garzarella 
> > ---
> > RFC -> v1:
> > - documented VSOCK_TRANSPORT_F_* flags
> > - fixed vsock_assign_transport() when the socket is already assigned
> >   (e.g connection failed)
> > - moved features outside of struct vsock_transport, and used as
> >   parameter of vsock_core_register()
> > ---
> >  drivers/vhost/vsock.c   |   5 +-
> >  include/net/af_vsock.h  |  17 +-
> >  net/vmw_vsock/af_vsock.c| 237 ++--
> >  net/vmw_vsock/hyperv_transport.c|  26 ++-
> >  net/vmw_vsock/virtio_transport.c|   7 +-
> >  net/vmw_vsock/virtio_transport_common.c |  28 ++-
> >  net/vmw_vsock/vmci_transport.c  |  31 +++-
> >  7 files changed, 270 insertions(+), 81 deletions(-)
> > 
> 
> 
> > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index
> > d89381166028..85d9a147 100644
> > --- a/net/vmw_vsock/af_vsock.c
> > +++ b/net/vmw_vsock/af_vsock.c
> > @@ -130,7 +130,12 @@ static struct proto vsock_proto = {  #define
> > VSOCK_DEFAULT_BUFFER_MAX_SIZE (1024 * 256)  #define
> > VSOCK_DEFAULT_BUFFER_MIN_SIZE 128
> > 
> > -static const struct vsock_transport *transport_single;
> > +/* Transport used for host->guest communication */ static const struct
> > +vsock_transport *transport_h2g;
> > +/* Transport used for guest->host communication */ static const struct
> > +vsock_transport *transport_g2h;
> > +/* Transport used for DGRAM communication */ static const struct
> > +vsock_transport *transport_dgram;
> >  static DEFINE_MUTEX(vsock_register_mutex);
> > 
> >  / UTILS /
> > @@ -182,7 +187,7 @@ static int vsock_auto_bind(struct vsock_sock *vsk)
> > return __vsock_bind(sk, _addr);
> >  }
> > 
> > -static int __init vsock_init_tables(void)
> > +static void vsock_init_tables(void)
> >  {
> > int i;
> > 
> > @@ -191,7 +196,6 @@ static int __init vsock_init_tables(void)
> > 
> > for (i = 0; i < ARRAY_SIZE(vsock_connected_table); i++)
> > INIT_LIST_HEAD(_connected_table[i]);
> > -   return 0;
> >  }
> > 
> >  static void __vsock_insert_bound(struct list_head *list, @@ -376,6 +380,62
> > @@ void vsock_enqueue_accept(struct sock *listener, struct sock
> > *connected)  }  EXPORT_SYMBOL_GPL(vsock_enqueue_accept);
> > 
> > +/* Assign a transport to a socket and call the .init transport callback.
> > + *
> > + * Note: for stream socket this must be called when vsk->remote_addr is
> > +set
> > + * (e.g. during the connect() or when a connection request on a
> > +listener
> > + * socket is received).
> > + * The vsk->remote_addr is used to decide which transport to use:
> > + *  - remote CID > VMADDR_CID_HOST will use host->guest transport
> > 

RE: [PATCH net-next 11/14] vsock: add multi-transports support

2019-11-11 Thread Jorgen Hansen via Virtualization
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Wednesday, October 23, 2019 11:56 AM

Thanks a lot for working on this!

> With the multi-transports support, we can use vsock with nested VMs (using
> also different hypervisors) loading both guest->host and
> host->guest transports at the same time.
> 
> Major changes:
> - vsock core module can be loaded regardless of the transports
> - vsock_core_init() and vsock_core_exit() are renamed to
>   vsock_core_register() and vsock_core_unregister()
> - vsock_core_register() has a feature parameter (H2G, G2H, DGRAM)
>   to identify which directions the transport can handle and if it's
>   support DGRAM (only vmci)
> - each stream socket is assigned to a transport when the remote CID
>   is set (during the connect() or when we receive a connection request
>   on a listener socket).

How about allowing the transport to be set during bind as well? That
would allow an application to ensure that it is using a specific transport,
i.e., if it binds to the host CID, it will use H2G, and if it binds to something
else it will use G2H? You can still use VMADDR_CID_ANY if you want to
initially listen to both transports.


>   The remote CID is used to decide which transport to use:
>   - remote CID > VMADDR_CID_HOST will use host->guest transport
>   - remote CID <= VMADDR_CID_HOST will use guest->host transport
> - listener sockets are not bound to any transports since no transport
>   operations are done on it. In this way we can create a listener
>   socket, also if the transports are not loaded or with VMADDR_CID_ANY
>   to listen on all transports.
> - DGRAM sockets are handled as before, since only the vmci_transport
>   provides this feature.
> 
> Signed-off-by: Stefano Garzarella 
> ---
> RFC -> v1:
> - documented VSOCK_TRANSPORT_F_* flags
> - fixed vsock_assign_transport() when the socket is already assigned
>   (e.g connection failed)
> - moved features outside of struct vsock_transport, and used as
>   parameter of vsock_core_register()
> ---
>  drivers/vhost/vsock.c   |   5 +-
>  include/net/af_vsock.h  |  17 +-
>  net/vmw_vsock/af_vsock.c| 237 ++--
>  net/vmw_vsock/hyperv_transport.c|  26 ++-
>  net/vmw_vsock/virtio_transport.c|   7 +-
>  net/vmw_vsock/virtio_transport_common.c |  28 ++-
>  net/vmw_vsock/vmci_transport.c  |  31 +++-
>  7 files changed, 270 insertions(+), 81 deletions(-)
> 


> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index
> d89381166028..85d9a147 100644
> --- a/net/vmw_vsock/af_vsock.c
> +++ b/net/vmw_vsock/af_vsock.c
> @@ -130,7 +130,12 @@ static struct proto vsock_proto = {  #define
> VSOCK_DEFAULT_BUFFER_MAX_SIZE (1024 * 256)  #define
> VSOCK_DEFAULT_BUFFER_MIN_SIZE 128
> 
> -static const struct vsock_transport *transport_single;
> +/* Transport used for host->guest communication */ static const struct
> +vsock_transport *transport_h2g;
> +/* Transport used for guest->host communication */ static const struct
> +vsock_transport *transport_g2h;
> +/* Transport used for DGRAM communication */ static const struct
> +vsock_transport *transport_dgram;
>  static DEFINE_MUTEX(vsock_register_mutex);
> 
>  / UTILS /
> @@ -182,7 +187,7 @@ static int vsock_auto_bind(struct vsock_sock *vsk)
>   return __vsock_bind(sk, _addr);
>  }
> 
> -static int __init vsock_init_tables(void)
> +static void vsock_init_tables(void)
>  {
>   int i;
> 
> @@ -191,7 +196,6 @@ static int __init vsock_init_tables(void)
> 
>   for (i = 0; i < ARRAY_SIZE(vsock_connected_table); i++)
>   INIT_LIST_HEAD(_connected_table[i]);
> - return 0;
>  }
> 
>  static void __vsock_insert_bound(struct list_head *list, @@ -376,6 +380,62
> @@ void vsock_enqueue_accept(struct sock *listener, struct sock
> *connected)  }  EXPORT_SYMBOL_GPL(vsock_enqueue_accept);
> 
> +/* Assign a transport to a socket and call the .init transport callback.
> + *
> + * Note: for stream socket this must be called when vsk->remote_addr is
> +set
> + * (e.g. during the connect() or when a connection request on a
> +listener
> + * socket is received).
> + * The vsk->remote_addr is used to decide which transport to use:
> + *  - remote CID > VMADDR_CID_HOST will use host->guest transport
> + *  - remote CID <= VMADDR_CID_HOST will use guest->host transport  */
> +int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock
> +*psk) {
> + const struct vsock_transport *new_transport;
> + struct sock *sk = sk_vsock(vsk);
> +
> + switch (sk->sk_type) {
> + case SOCK_DGRAM:
> + new_transport = transport_dgram;
> + break;
> + case SOCK_STREAM:
> + if (vsk->remote_addr.svm_cid > VMADDR_CID_HOST)
> + new_transport = transport_h2g;
> + else
> + new_transport = transport_g2h;
> + break;

You already mentioned that you are working on 

Re: [PATCH net-next 11/14] vsock: add multi-transports support

2019-10-31 Thread Stefano Garzarella
On Wed, Oct 30, 2019 at 03:40:05PM +, Jorgen Hansen wrote:
> > From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> > > +/* Assign a transport to a socket and call the .init transport callback.
> > > + *
> > > + * Note: for stream socket this must be called when vsk->remote_addr
> > > +is set
> > > + * (e.g. during the connect() or when a connection request on a
> > > +listener
> > > + * socket is received).
> > > + * The vsk->remote_addr is used to decide which transport to use:
> > > + *  - remote CID > VMADDR_CID_HOST will use host->guest transport
> > > + *  - remote CID <= VMADDR_CID_HOST will use guest->host transport
> > > +*/ int vsock_assign_transport(struct vsock_sock *vsk, struct
> > > +vsock_sock *psk) {
> > > +   const struct vsock_transport *new_transport;
> > > +   struct sock *sk = sk_vsock(vsk);
> > > +
> > > +   switch (sk->sk_type) {
> > > +   case SOCK_DGRAM:
> > > +   new_transport = transport_dgram;
> > > +   break;
> > > +   case SOCK_STREAM:
> > > +   if (vsk->remote_addr.svm_cid > VMADDR_CID_HOST)
> > > +   new_transport = transport_h2g;
> > > +   else
> > > +   new_transport = transport_g2h;
> > 
> > I just noticed that this break the loopback in the guest.
> > As a fix, we should use 'transport_g2h' when remote_cid <=
> > VMADDR_CID_HOST or remote_cid is the id of 'transport_g2h'.
> > 
> > To do that we also need to avoid that L2 guests can have the same CID of L1.
> > For vhost_vsock I can call vsock_find_cid() in vhost_vsock_set_cid()
> > 
> > @Jorgen: for vmci we need to do the same? or it is guaranteed, since it's
> > already support nested VMs, that a L2 guests cannot have the same CID as
> > the L1.
> 
> As far as I can tell, we have the same issue with the current support for 
> nested VMs in
> VMCI. If we have an L2 guest with the same CID as the L1 guest, we will 
> always send to
> the L2 guest, and we may assign an L2 guest the same CID as L1. It should be 
> straight
> forward to avoid this, though.
> 

Yes, I think so.

For the v2 I'm exposing the vsock_find_cid() to the transports, in this
way I can reject requests to set the same L1 CID for L2 guests.

Thanks,
Stefano
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


RE: [PATCH net-next 11/14] vsock: add multi-transports support

2019-10-30 Thread Jorgen Hansen via Virtualization
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> > +/* Assign a transport to a socket and call the .init transport callback.
> > + *
> > + * Note: for stream socket this must be called when vsk->remote_addr
> > +is set
> > + * (e.g. during the connect() or when a connection request on a
> > +listener
> > + * socket is received).
> > + * The vsk->remote_addr is used to decide which transport to use:
> > + *  - remote CID > VMADDR_CID_HOST will use host->guest transport
> > + *  - remote CID <= VMADDR_CID_HOST will use guest->host transport
> > +*/ int vsock_assign_transport(struct vsock_sock *vsk, struct
> > +vsock_sock *psk) {
> > +   const struct vsock_transport *new_transport;
> > +   struct sock *sk = sk_vsock(vsk);
> > +
> > +   switch (sk->sk_type) {
> > +   case SOCK_DGRAM:
> > +   new_transport = transport_dgram;
> > +   break;
> > +   case SOCK_STREAM:
> > +   if (vsk->remote_addr.svm_cid > VMADDR_CID_HOST)
> > +   new_transport = transport_h2g;
> > +   else
> > +   new_transport = transport_g2h;
> 
> I just noticed that this break the loopback in the guest.
> As a fix, we should use 'transport_g2h' when remote_cid <=
> VMADDR_CID_HOST or remote_cid is the id of 'transport_g2h'.
> 
> To do that we also need to avoid that L2 guests can have the same CID of L1.
> For vhost_vsock I can call vsock_find_cid() in vhost_vsock_set_cid()
> 
> @Jorgen: for vmci we need to do the same? or it is guaranteed, since it's
> already support nested VMs, that a L2 guests cannot have the same CID as
> the L1.

As far as I can tell, we have the same issue with the current support for 
nested VMs in
VMCI. If we have an L2 guest with the same CID as the L1 guest, we will always 
send to
the L2 guest, and we may assign an L2 guest the same CID as L1. It should be 
straight
forward to avoid this, though.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net-next 11/14] vsock: add multi-transports support

2019-10-23 Thread Stefano Garzarella
On Wed, Oct 23, 2019 at 11:59 AM Stefano Garzarella  wrote:
>
> This patch adds the support of multiple transports in the
> VSOCK core.
>
> With the multi-transports support, we can use vsock with nested VMs
> (using also different hypervisors) loading both guest->host and
> host->guest transports at the same time.
>
> Major changes:
> - vsock core module can be loaded regardless of the transports
> - vsock_core_init() and vsock_core_exit() are renamed to
>   vsock_core_register() and vsock_core_unregister()
> - vsock_core_register() has a feature parameter (H2G, G2H, DGRAM)
>   to identify which directions the transport can handle and if it's
>   support DGRAM (only vmci)
> - each stream socket is assigned to a transport when the remote CID
>   is set (during the connect() or when we receive a connection request
>   on a listener socket).
>   The remote CID is used to decide which transport to use:
>   - remote CID > VMADDR_CID_HOST will use host->guest transport
>   - remote CID <= VMADDR_CID_HOST will use guest->host transport
> - listener sockets are not bound to any transports since no transport
>   operations are done on it. In this way we can create a listener
>   socket, also if the transports are not loaded or with VMADDR_CID_ANY
>   to listen on all transports.
> - DGRAM sockets are handled as before, since only the vmci_transport
>   provides this feature.
>
> Signed-off-by: Stefano Garzarella 
> ---
> RFC -> v1:
> - documented VSOCK_TRANSPORT_F_* flags
> - fixed vsock_assign_transport() when the socket is already assigned
>   (e.g connection failed)
> - moved features outside of struct vsock_transport, and used as
>   parameter of vsock_core_register()
> ---
>  drivers/vhost/vsock.c   |   5 +-
>  include/net/af_vsock.h  |  17 +-
>  net/vmw_vsock/af_vsock.c| 237 ++--
>  net/vmw_vsock/hyperv_transport.c|  26 ++-
>  net/vmw_vsock/virtio_transport.c|   7 +-
>  net/vmw_vsock/virtio_transport_common.c |  28 ++-
>  net/vmw_vsock/vmci_transport.c  |  31 +++-
>  7 files changed, 270 insertions(+), 81 deletions(-)
>
> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> index 6d7e4f022748..b235f4bbe8ea 100644
> --- a/drivers/vhost/vsock.c
> +++ b/drivers/vhost/vsock.c
> @@ -831,7 +831,8 @@ static int __init vhost_vsock_init(void)
>  {
> int ret;
>
> -   ret = vsock_core_init(_transport.transport);
> +   ret = vsock_core_register(_transport.transport,
> + VSOCK_TRANSPORT_F_H2G);
> if (ret < 0)
> return ret;
> return misc_register(_vsock_misc);
> @@ -840,7 +841,7 @@ static int __init vhost_vsock_init(void)
>  static void __exit vhost_vsock_exit(void)
>  {
> misc_deregister(_vsock_misc);
> -   vsock_core_exit();
> +   vsock_core_unregister(_transport.transport);
>  };
>
>  module_init(vhost_vsock_init);
> diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
> index fa1570dc9f5c..27a3463e4892 100644
> --- a/include/net/af_vsock.h
> +++ b/include/net/af_vsock.h
> @@ -91,6 +91,14 @@ struct vsock_transport_send_notify_data {
> u64 data2; /* Transport-defined. */
>  };
>
> +/* Transport features flags */
> +/* Transport provides host->guest communication */
> +#define VSOCK_TRANSPORT_F_H2G  0x0001
> +/* Transport provides guest->host communication */
> +#define VSOCK_TRANSPORT_F_G2H  0x0002
> +/* Transport provides DGRAM communication */
> +#define VSOCK_TRANSPORT_F_DGRAM0x0004
> +
>  struct vsock_transport {
> /* Initialize/tear-down socket. */
> int (*init)(struct vsock_sock *, struct vsock_sock *);
> @@ -154,12 +162,8 @@ struct vsock_transport {
>
>  / CORE /
>
> -int __vsock_core_init(const struct vsock_transport *t, struct module *owner);
> -static inline int vsock_core_init(const struct vsock_transport *t)
> -{
> -   return __vsock_core_init(t, THIS_MODULE);
> -}
> -void vsock_core_exit(void);
> +int vsock_core_register(const struct vsock_transport *t, int features);
> +void vsock_core_unregister(const struct vsock_transport *t);
>
>  /* The transport may downcast this to access transport-specific functions */
>  const struct vsock_transport *vsock_core_get_transport(struct vsock_sock 
> *vsk);
> @@ -190,6 +194,7 @@ struct sock *vsock_find_connected_socket(struct 
> sockaddr_vm *src,
>  struct sockaddr_vm *dst);
>  void vsock_remove_sock(struct vsock_sock *vsk);
>  void vsock_for_each_connected_socket(void (*fn)(struct sock *sk));
> +int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk);
>
>  / TAP /
>
> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> index d89381166028..85d9a147 100644
> --- a/net/vmw_vsock/af_vsock.c
> +++ b/net/vmw_vsock/af_vsock.c
> @@ -130,7 +130,12 @@ static struct proto vsock_proto = {
>  #define