Re: [Qemu-devel] [virtio-comment] Re: [PATCH v1] docs/vhost-user: extend the vhost-user protocol to support the vhost-pci based inter-vm communication

Wei Wang Tue, 08 Nov 2016 03:37:52 -0800

On 11/08/2016 03:47 PM, Marc-André Lureau wrote:

Hi


I suggest you split this patch for the various "features" you propose.


OK. I'll make it several small patches.   <*v1-AR1*>


     Master and slave can be either a client (i.e. connecting) or
    server (listening)
     in the socket communication.

"Client" and "Server" have already been used in the doc here.

    +The current vhost-user protocol is extended to support the
    vhost-pci based inter-VM
    +communication. In this case, Slave is a QEMU which runs a
    vhost-pci server, and
    +Master is another QEMU which runs a vhost-pci client.
    +
Why introduce new terminology "server" and "client"? What does itchange? This is confusing with socket client/server configuration.

OK. I will try with "Slave" and "Master" in this doc when it's possible.<*v1-AR2*>


     Message Specification
     ---------------------

     Note that all numbers are in the machine native byte order. A
    vhost-user message
    -consists of 3 header fields and a payload:
    +consists of 4 header fields and a payload:

    -------------------------------------
    -| request | flags | size | payload |
    -------------------------------------
    +----------------------------------------------
    +| request | flags | conn_id | size | payload |
    +----------------------------------------------

      * Request: 32-bit type of the request
      * Flags: 32-bit bit field:
        - Lower 2 bits are the version (currently 0x01)
    -   - Bit 2 is the reply flag - needs to be sent on each reply
    from the slave
    +   - Bit 2 is the reply flag - needs to be sent on each reply
        - Bit 3 is the need_reply flag - see
    VHOST_USER_PROTOCOL_F_REPLY_ACK for
          details.
    + * Conn_id: 64-bit connection id to indentify a client socket
    connection. It is
    +            introduced in version 0x02 to support the
    "1-server-N-client" model
    +            and an asynchronous client read implementation. The
    connection id,
    +            0xFFFFFFFFFFFFFFFF, is used by an anonymous client
    (e.g. a client who
    +            has not got its connection id from the server in the
    initial talk)

I don't understand why you need a connection id, on each message.What's the purpose? Since the communication is unicast, a singlemessage should be enough.


Sure, please let me explain more:

The QEMU socket is going to be upgraded to support 1 server socket beingconnected by multiple client sockets (I've made patches to achievethis). In other words, here, multiple masters will connect to one slave,and the slave creates a vhost-pci device for each master after receivingthe necessary message info. The slave needs to know which master it istalking to when receiving a message, as it maintains multipleconnections at the same time.

Also shed some light on the implementation:

The slave maintains a table for those masters. Each master has an entryin the table (indexed by a "conn_id"). When the slave receives amessage, the payload is added to the corresponding table entry. Whenthings are ready (i.e. it has received enough info to create a vhost-pcidevice for the master), the device creation code creates and initializesa vhost-pci device (e.g. initialize VirtioPCIProxy in virtio-pci.c) fromthe corresponding table entry.


      * Size - 32-bit size of the payload


    @@ -97,6 +106,13 @@ Depending on the request type, payload can be:
        log offset: offset from start of supplied file descriptor
            where logging starts (i.e. where guest address 0 would be
    logged)

    +* Device info
    +   --------------------
    +   | virito id | uuid |
    +   --------------------
    +   Virtio id: 16-bit virtio id of the device
    +   UUID: 128-bit UUID to identify the QEMU instance that creates
    the device
    +


I wonder if UUID should be a different message.

We can make uuid another message if it has other usages.
Do you see any other usages of uuid?


     In QEMU the vhost-user message is implemented with the following
    struct:

     typedef struct VhostUserMsg {
    @@ -109,6 +125,7 @@ typedef struct VhostUserMsg {
             struct vhost_vring_addr addr;
             VhostUserMemory memory;
             VhostUserLog log;
    +        DeviceInfo dev_info;
         };
     } QEMU_PACKED VhostUserMsg;

    @@ -119,17 +136,25 @@ The protocol for vhost-user is based on the
    existing implementation of vhost
     for the Linux Kernel. Most messages that can be sent via the Unix
    domain socket
     implementing vhost-user have an equivalent ioctl to the kernel
    implementation.

    -The communication consists of master sending message requests and
    slave sending
    -message replies. Most of the requests don't require replies. Here
    is a list of
    -the ones that do:
    +Traditionally, the communication consists of master sending
    message requests
    +and slave sending message replies. Most of the requests don't
    require replies.
    +Here is a list of the ones that do:

      * VHOST_GET_FEATURES
      * VHOST_GET_PROTOCOL_FEATURES
      * VHOST_GET_VRING_BASE
      * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
    + * VHOST_USER_GET_CONN_ID
    + * VHOST_USER_SET_PEER_CONNECTION

Let's also fix  the VHOST_USER prefix of the above requests.

Sure, will do. <*v1-AR3*>

     [ Also see the section on REPLY_ACK protocol extension. ]

    +Currently, the communication also supports the Slave (server)
    sending messages
    +to the Master (client). Here is a list of them:
    + * VHOST_USER_SET_FEATURES

    + * VHOST_USER_SET_PEER_CONNECTION (the serve may actively request
    to disconnect
    +   with the client)
Oh, you are making the communication bidirectional? This is afundamental change in the protocol. This may be difficult to implementin qemu, since the communication in synchronous, a request expects animmediate reply, if it gets back a request (from the slave) in themiddle, it will fail.


Not really.

Adding the above two doesn't affect the existing synchronous read()messages (basically, those VHOST_USER_GET_xx messages). LikeVHOST_USER_SET_FEATURES, the _SET_ messages don't need a reply. Here, wejust make the slave capable of actively sending messages to the master.

Currently all requests (including VHOST_USER_SET_FEATURES) are comingfrom the Master. I don't understand yet the purpose ofVHOST_USER_SET_PEER_CONNECTION to propose an alternative, but I wouldrather keep the unidirectional communication if possible.


     There are several messages that the master sends with file
    descriptors passed
     in the ancillary data:

    @@ -259,6 +284,7 @@ Protocol features
     #define VHOST_USER_PROTOCOL_F_LOG_SHMFD      1
     #define VHOST_USER_PROTOCOL_F_RARP           2
     #define VHOST_USER_PROTOCOL_F_REPLY_ACK      3
    +#define VHOST_USER_PROTOCOL_F_VHOST_PCI      4

     Message types
     -------------
    @@ -470,6 +496,43 @@ Message types
           The first 6 bytes of the payload contain the mac address of
    the guest to
           allow the vhost user backend to construct and broadcast the
    fake RARP.

    + * VHOST_USER_GET_CONN_ID
    +
    +      Id: 20
    +      Equivalent ioctl: N/A
    +      Master payload: u64
    +
    +      The client sends this message to the server to ask for its
    connection id.


Confusing, please keep the Master/Slave terminology

OK.


    + The connection id is then put into the message header (the
    conn_id field),
    +      so that the server can always know who it is talking with.
    +


Could you explain what the connection id is for?


Explained above. Please let me know if I didn't make it clear.

+This request should be sent only when VHOST_USER_PROTOCOL_F_VHOST_PCIhas...


    +* VHOST_USER_SET_DEV_INFO
    +
    +      Id: 21
    +      Equivalent ioctl: N/A
    +      Master payload: dev info
    +
    +      The client sends the producer device info to the server.


"Master sends producer device info to the Slave" works, no?

Yes, it works. The current dev info only contains a "virtio id" field(assume we'll take uuid out as a separate message), which tells theslave if it is a net, scsi, console or else. do you see any issue?


Could we guarantee this message is sent before SET_VRING*?


Why do we need to guarantee this?

    + This request should be sent only when
    VHOST_USER_PROTOCOL_F_VHOST_PCI has
    +      been negotiated.
    +
I think this message could be useful for other purposes thanvhost-pci, thus I would give it its own flag.


Could you please give an example of other usage? Thanks.


    +* VHOST_USER_SET_PEER_CONNECTION
    +
    +      Id: 22
    +      Equivalent ioctl: N/A
    +      Master payload: u64
    +
    +      The producer device requests to connect or disconnect to
    the consumer device.


producer->Master, consummer->Slave

How does it interact with SET_VRING_ENABLE?


It's independent of SET_VRING_ENABLE:
SET_VRING_ENABLE enables a virtq to be in "active".

SET_PEER_CONNECTION enables the peer (slave or master) device to be in"active". The driver shouldn't send packets if the device is inactive.


    + The consumer device may request to disconnect to the producer
    device. This
    +      request should be sent only when
    VHOST_USER_PROTOCOL_F_VHOST_PCI has been
    +      negotiated.
    +      Connection request: If the reply message indicates
    "success", the vhost-pci based
    +      inter-VM communication channel has been established.
    +      Disconnection request: If the reply message indicates
    "success", the vhost-pci based
    +      inter-VM communication channel has been destroyed.
    +      #define VHOST_USER_SET_PEER_CONNECTION_F_OFF 0
    +      #define VHOST_USER_SET_PEER_CONNECTION_F_ON 1
    +

I think it would be better to add one more command here:
#define VHOST_USER_SET_PEER_CONNECTION_F_INIT 2

The master uses this command to tell the slave it's ready to create thevhost-pci device. Regarding the implementation, it is put at the bottomof vhost_net_start() function (when all the vring info have been sentand enabled).


Best,
Wei

Re: [Qemu-devel] [virtio-comment] Re: [PATCH v1] docs/vhost-user: extend the vhost-user protocol to support the vhost-pci based inter-vm communication

Reply via email to