RE: [PATCH v15 net-next 1/1] hv_sock: introduce Hyper-V Sockets

2016-07-11 Thread Dexuan Cui
> From: Vitaly Kuznetsov [mailto:vkuzn...@redhat.com]
>  ...
> Some comments below. The vast majority of them are really minor, the
> only thing which bothers me a little bit is WARN() in hvsock_sendmsg()
> which I think shouldn't be there. But I may have missed something.
 
Thank you for the very detailed comments, Vitaly!

Now I see I shouldn't put pr_err() in hvsock_sendmsg() and hvsock_recvmsg(),
because IMO a malicious app can use this to generate lots of messages to slow
down the system.  I'll remove them.

I'll reply your other comments bellow.

> > +#define guid_t uuid_le
> > +struct sockaddr_hv {
> > +   __kernel_sa_family_tshv_family;  /* Address family  */
> > +   u16 reserved;/* Must be Zero*/
> > +   guid_t  shv_vm_id;   /* VM ID   */
> > +   guid_t  shv_service_id;  /* Service ID  */
> > +};
> 
> I'm not sure it is worth it to introduce a new 'guid_t' type here, we
> may want to rename
> 
> shv_vm_id -> shv_vm_guid
> shv_service_id -> shv_service_guid
> 
> and use uuid_le type.

Ok. I'll make the change.

> > +config HYPERV_SOCK
> > +   tristate "Hyper-V Sockets"
> > +   depends on HYPERV
> > +   default m if HYPERV
> > +   help
> > + Hyper-V Sockets is somewhat like TCP over VMBus, allowing
> > + communication between Linux guest and Hyper-V host without TCP/IP.
> > +
> 
> I know it's hard to come up with a simple description but I'd rather
> describe is as "Socket interface for high speed communication between
> Linux guest and Hyper-V host over VMBus."

OK.

> > +static bool uuid_equals(uuid_le u1, uuid_le u2)
> > +{
> > +   return !uuid_le_cmp(u1, u2);
> > +}
> 
> why not use uuid_le_cmp directly?
OK. I will change to it.

> > +static unsigned int hvsock_poll(struct file *file, struct socket *sock,
> > +   poll_table *wait)
>> ...
> > +   if (channel) {
> > +   /* If there is something in the queue then we can read */
> > +   get_ringbuffer_rw_status(channel, _read, _write);
> > +
> > +   if (!can_read && hvsk->recv)
> > +   can_read = true;
> > +
> > +   if (!(sk->sk_shutdown & RCV_SHUTDOWN) && can_read)
> > +   mask |= POLLIN | POLLRDNORM;
> > +   } else {
> > +   can_read = false;
> 
> we don't use can_read below

I'll remove the can_read assignment.

> > +   channel = hvsk->channel;
> > +   if (!channel) {
> > +   WARN_ONCE(1, "NULL channel! There is a programming
> bug.\n");
> 
> BUG() then

OK.

> > +static int hvsock_open_connection(struct vmbus_channel *channel)
> > +{
> > + ..
> > +   if (conn_from_host) {
> > +   if (sk->sk_ack_backlog >= sk->sk_max_ack_backlog) {
> > +   ret = -EMFILE;
> 
> I'm not sure -EMFILE is appropriate, we don't really have "too many open
> files".
Here the ret value doesn't really matter, because the return value of the
function is not really used at present.

However, I agree with you that EMFILE is unsuitable.
Let me change to ECONNREFUSED, which seems better to me.

> > +static int hvsock_connect_wait(struct socket *sock,
> > +  int flags, int current_ret)
> > +{
> > +   struct sock *sk = sock->sk;
> > +   struct hvsock_sock *hvsk;
> > +   int ret = current_ret;
> > +   DEFINE_WAIT(wait);
> > +   long timeout;
> > +
> > +   hvsk = sk_to_hvsock(sk);
> > +   timeout = 30 * HZ;
> 
> We may want to introduce a define for this timeout. Does it actually
> match host's timeout?

I'll add HVSOCK_CONNECT_TIMEOUT for this.
Yes, the value is from Hyper-V team.
 
> > +static int hvsock_accept_wait(struct sock *listener,
> > + ..
> > +
> > +   if (ret) {
> > +   release_sock(connected);
> > +   sock_put(connected);
> > +   } else {
> > +   newsock->state = SS_CONNECTED;
> > +   sock_graft(connected, newsock);
> > +   release_sock(connected);
> > +   sock_put(connected);
> 
> so we do release_sock()/sock_put() unconditionally and this piece could
> be rewritten as
> 
> if (!ret) {
> newsock->state = SS_CONNECTED;
> sock_graft(connected, newsock);
> }
> release_sock(connected);
> sock_put(connected);

Will do.


> > +static int hvsock_listen(struct socket *sock, int backlog)
> > +{
> > + ..
> > +   /* This is an artificial limit */
> > +   if (backlog > 128)
> > +   backlog = 128;
> 
> Let's do a define for it.
Ok.
 
> > +static int hvsock_sendmsg(struct socket *sock, struct msghdr *msg,
> > + size_t len)
> > +{
> > +   struct hvsock_sock *hvsk;
> > +   struct sock *sk;
> > +   int ret;
> > +
> > +   if (len == 0)
> > +   return -EINVAL;
> > +
> > +   if (msg->msg_flags & ~MSG_DONTWAIT) {
> > +   pr_err("%s: unsupported flags=0x%x\n", __func__,
> > +  msg->msg_flags);
> 
> I don't think we 

RE: [PATCH v15 net-next 1/1] hv_sock: introduce Hyper-V Sockets

2016-07-11 Thread Dexuan Cui
> From: Vitaly Kuznetsov [mailto:vkuzn...@redhat.com]
>  ...
> Some comments below. The vast majority of them are really minor, the
> only thing which bothers me a little bit is WARN() in hvsock_sendmsg()
> which I think shouldn't be there. But I may have missed something.
 
Thank you for the very detailed comments, Vitaly!

Now I see I shouldn't put pr_err() in hvsock_sendmsg() and hvsock_recvmsg(),
because IMO a malicious app can use this to generate lots of messages to slow
down the system.  I'll remove them.

I'll reply your other comments bellow.

> > +#define guid_t uuid_le
> > +struct sockaddr_hv {
> > +   __kernel_sa_family_tshv_family;  /* Address family  */
> > +   u16 reserved;/* Must be Zero*/
> > +   guid_t  shv_vm_id;   /* VM ID   */
> > +   guid_t  shv_service_id;  /* Service ID  */
> > +};
> 
> I'm not sure it is worth it to introduce a new 'guid_t' type here, we
> may want to rename
> 
> shv_vm_id -> shv_vm_guid
> shv_service_id -> shv_service_guid
> 
> and use uuid_le type.

Ok. I'll make the change.

> > +config HYPERV_SOCK
> > +   tristate "Hyper-V Sockets"
> > +   depends on HYPERV
> > +   default m if HYPERV
> > +   help
> > + Hyper-V Sockets is somewhat like TCP over VMBus, allowing
> > + communication between Linux guest and Hyper-V host without TCP/IP.
> > +
> 
> I know it's hard to come up with a simple description but I'd rather
> describe is as "Socket interface for high speed communication between
> Linux guest and Hyper-V host over VMBus."

OK.

> > +static bool uuid_equals(uuid_le u1, uuid_le u2)
> > +{
> > +   return !uuid_le_cmp(u1, u2);
> > +}
> 
> why not use uuid_le_cmp directly?
OK. I will change to it.

> > +static unsigned int hvsock_poll(struct file *file, struct socket *sock,
> > +   poll_table *wait)
>> ...
> > +   if (channel) {
> > +   /* If there is something in the queue then we can read */
> > +   get_ringbuffer_rw_status(channel, _read, _write);
> > +
> > +   if (!can_read && hvsk->recv)
> > +   can_read = true;
> > +
> > +   if (!(sk->sk_shutdown & RCV_SHUTDOWN) && can_read)
> > +   mask |= POLLIN | POLLRDNORM;
> > +   } else {
> > +   can_read = false;
> 
> we don't use can_read below

I'll remove the can_read assignment.

> > +   channel = hvsk->channel;
> > +   if (!channel) {
> > +   WARN_ONCE(1, "NULL channel! There is a programming
> bug.\n");
> 
> BUG() then

OK.

> > +static int hvsock_open_connection(struct vmbus_channel *channel)
> > +{
> > + ..
> > +   if (conn_from_host) {
> > +   if (sk->sk_ack_backlog >= sk->sk_max_ack_backlog) {
> > +   ret = -EMFILE;
> 
> I'm not sure -EMFILE is appropriate, we don't really have "too many open
> files".
Here the ret value doesn't really matter, because the return value of the
function is not really used at present.

However, I agree with you that EMFILE is unsuitable.
Let me change to ECONNREFUSED, which seems better to me.

> > +static int hvsock_connect_wait(struct socket *sock,
> > +  int flags, int current_ret)
> > +{
> > +   struct sock *sk = sock->sk;
> > +   struct hvsock_sock *hvsk;
> > +   int ret = current_ret;
> > +   DEFINE_WAIT(wait);
> > +   long timeout;
> > +
> > +   hvsk = sk_to_hvsock(sk);
> > +   timeout = 30 * HZ;
> 
> We may want to introduce a define for this timeout. Does it actually
> match host's timeout?

I'll add HVSOCK_CONNECT_TIMEOUT for this.
Yes, the value is from Hyper-V team.
 
> > +static int hvsock_accept_wait(struct sock *listener,
> > + ..
> > +
> > +   if (ret) {
> > +   release_sock(connected);
> > +   sock_put(connected);
> > +   } else {
> > +   newsock->state = SS_CONNECTED;
> > +   sock_graft(connected, newsock);
> > +   release_sock(connected);
> > +   sock_put(connected);
> 
> so we do release_sock()/sock_put() unconditionally and this piece could
> be rewritten as
> 
> if (!ret) {
> newsock->state = SS_CONNECTED;
> sock_graft(connected, newsock);
> }
> release_sock(connected);
> sock_put(connected);

Will do.


> > +static int hvsock_listen(struct socket *sock, int backlog)
> > +{
> > + ..
> > +   /* This is an artificial limit */
> > +   if (backlog > 128)
> > +   backlog = 128;
> 
> Let's do a define for it.
Ok.
 
> > +static int hvsock_sendmsg(struct socket *sock, struct msghdr *msg,
> > + size_t len)
> > +{
> > +   struct hvsock_sock *hvsk;
> > +   struct sock *sk;
> > +   int ret;
> > +
> > +   if (len == 0)
> > +   return -EINVAL;
> > +
> > +   if (msg->msg_flags & ~MSG_DONTWAIT) {
> > +   pr_err("%s: unsupported flags=0x%x\n", __func__,
> > +  msg->msg_flags);
> 
> I don't think we 

Re: [PATCH v15 net-next 1/1] hv_sock: introduce Hyper-V Sockets

2016-07-08 Thread Vitaly Kuznetsov
Dexuan Cui  writes:

> Hyper-V Sockets (hv_sock) supplies a byte-stream based communication
> mechanism between the host and the guest. It's somewhat like TCP over
> VMBus, but the transportation layer (VMBus) is much simpler than IP.
>
> With Hyper-V Sockets, applications between the host and the guest can talk
> to each other directly by the traditional BSD-style socket APIs.
>
> Hyper-V Sockets is only available on new Windows hosts, like Windows Server
> 2016. More info is in this article "Make your own integration services":
> https://msdn.microsoft.com/en-us/virtualization/hyperv_on_windows/develop/make_mgmt_service
>
> The patch implements the necessary support in the guest side by introducing
> a new socket address family AF_HYPERV.
>
> Signed-off-by: Dexuan Cui 
> Cc: "K. Y. Srinivasan" 
> Cc: Haiyang Zhang 
> Cc: Vitaly Kuznetsov 

Some comments below. The vast majority of them are really minor, the
only thing which bothers me a little bit is WARN() in hvsock_sendmsg()
which I think shouldn't be there. But I may have missed something.

I didn't do any tests for the code.

> Cc: Cathy Avery 
> ---
>
> You can also get the patch here (2764221d):
> https://github.com/dcui/linux/commits/decui/hv_sock/net-next/20160708_v15
>
> For the change log before v12, please see https://lkml.org/lkml/2016/5/15/31
>
> In v12, the changes are mainly the following:
>
> 1) remove the module params as David suggested.
>
> 2) use 5 exact pages for VMBus send/recv rings, respectively.
> The host side's design of the feature requires 5 exact pages for recv/send
> rings respectively -- this is suboptimal considering memory consumption,
> however unluckily we have to live with it, before the host comes up with
> a new design in the future. :-(
>
> 3) remove the per-connection static send/recv buffers
> Instead, we allocate and free the buffers dynamically only when we recv/send
> data. This means: when a connection is idle, no memory is consumed as
> recv/send buffers at all.
>
> In v13:
> I return ENOMEM on buffer alllocation failure
>
>Actually "man read/write" says "Other errors may occur, depending on the
> object connected to fd". "man send/recv" indeed lists ENOMEM.
>Considering AF_HYPERV is a new socket type, ENOMEM seems OK here.
>In the long run, I think we should add a new API in the VMBus driver,
> allowing data copy from VMBus ringbuffer into user mode buffer directly.
> This way, we can even eliminate this temporary buffer.
>
> In v14:
> fix some coding style issues pointed out by David.
>
> In v15:
> Just some stylistic changes addressing comments from Joe Perches and
> Olaf Hering -- thank you!
> - add a GPL blurb.
> - define a new macro PAGE_SIZE_4K and use it to replace PAGE_SIZE
> - change sk_to_hvsock/hvsock_to_sk() from macros to inline functions
> - remove a not-very-useful pr_err()
> - fix some typos in comment and coding style issues.
>
> Looking forward to your comments!
>
>  MAINTAINERS |2 +
>  include/linux/hyperv.h  |   13 +
>  include/linux/socket.h  |4 +-
>  include/net/af_hvsock.h |   78 +++
>  include/uapi/linux/hyperv.h |   24 +
>  net/Kconfig |1 +
>  net/Makefile|1 +
>  net/hv_sock/Kconfig |   10 +
>  net/hv_sock/Makefile|3 +
>  net/hv_sock/af_hvsock.c | 1523 
> +++
>  10 files changed, 1658 insertions(+), 1 deletion(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 50f69ba..6eaa26f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -5514,7 +5514,9 @@ F:  drivers/pci/host/pci-hyperv.c
>  F:   drivers/net/hyperv/
>  F:   drivers/scsi/storvsc_drv.c
>  F:   drivers/video/fbdev/hyperv_fb.c
> +F:   net/hv_sock/
>  F:   include/linux/hyperv.h
> +F:   include/net/af_hvsock.h
>  F:   tools/hv/
>  F:   Documentation/ABI/stable/sysfs-bus-vmbus
>
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index 50f493e..1cda6ea5 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -1508,5 +1508,18 @@ static inline void commit_rd_index(struct 
> vmbus_channel *channel)
>   vmbus_set_event(channel);
>  }
>
> +struct vmpipe_proto_header {
> + u32 pkt_type;
> + u32 data_size;
> +};
> +
> +#define HVSOCK_HEADER_LEN(sizeof(struct vmpacket_descriptor) + \
> +  sizeof(struct vmpipe_proto_header))
> +
> +/* See 'prev_indices' in hv_ringbuffer_read(), hv_ringbuffer_write() */
> +#define PREV_INDICES_LEN (sizeof(u64))
>
> +#define HVSOCK_PKT_LEN(payload_len)  (HVSOCK_HEADER_LEN + \
> + ALIGN((payload_len), 8) + \
> + PREV_INDICES_LEN)
>  #endif /* _HYPERV_H */
> diff --git a/include/linux/socket.h b/include/linux/socket.h
> index b5cc5a6..0b68b58 100644
> --- 

Re: [PATCH v15 net-next 1/1] hv_sock: introduce Hyper-V Sockets

2016-07-08 Thread Vitaly Kuznetsov
Dexuan Cui  writes:

> Hyper-V Sockets (hv_sock) supplies a byte-stream based communication
> mechanism between the host and the guest. It's somewhat like TCP over
> VMBus, but the transportation layer (VMBus) is much simpler than IP.
>
> With Hyper-V Sockets, applications between the host and the guest can talk
> to each other directly by the traditional BSD-style socket APIs.
>
> Hyper-V Sockets is only available on new Windows hosts, like Windows Server
> 2016. More info is in this article "Make your own integration services":
> https://msdn.microsoft.com/en-us/virtualization/hyperv_on_windows/develop/make_mgmt_service
>
> The patch implements the necessary support in the guest side by introducing
> a new socket address family AF_HYPERV.
>
> Signed-off-by: Dexuan Cui 
> Cc: "K. Y. Srinivasan" 
> Cc: Haiyang Zhang 
> Cc: Vitaly Kuznetsov 

Some comments below. The vast majority of them are really minor, the
only thing which bothers me a little bit is WARN() in hvsock_sendmsg()
which I think shouldn't be there. But I may have missed something.

I didn't do any tests for the code.

> Cc: Cathy Avery 
> ---
>
> You can also get the patch here (2764221d):
> https://github.com/dcui/linux/commits/decui/hv_sock/net-next/20160708_v15
>
> For the change log before v12, please see https://lkml.org/lkml/2016/5/15/31
>
> In v12, the changes are mainly the following:
>
> 1) remove the module params as David suggested.
>
> 2) use 5 exact pages for VMBus send/recv rings, respectively.
> The host side's design of the feature requires 5 exact pages for recv/send
> rings respectively -- this is suboptimal considering memory consumption,
> however unluckily we have to live with it, before the host comes up with
> a new design in the future. :-(
>
> 3) remove the per-connection static send/recv buffers
> Instead, we allocate and free the buffers dynamically only when we recv/send
> data. This means: when a connection is idle, no memory is consumed as
> recv/send buffers at all.
>
> In v13:
> I return ENOMEM on buffer alllocation failure
>
>Actually "man read/write" says "Other errors may occur, depending on the
> object connected to fd". "man send/recv" indeed lists ENOMEM.
>Considering AF_HYPERV is a new socket type, ENOMEM seems OK here.
>In the long run, I think we should add a new API in the VMBus driver,
> allowing data copy from VMBus ringbuffer into user mode buffer directly.
> This way, we can even eliminate this temporary buffer.
>
> In v14:
> fix some coding style issues pointed out by David.
>
> In v15:
> Just some stylistic changes addressing comments from Joe Perches and
> Olaf Hering -- thank you!
> - add a GPL blurb.
> - define a new macro PAGE_SIZE_4K and use it to replace PAGE_SIZE
> - change sk_to_hvsock/hvsock_to_sk() from macros to inline functions
> - remove a not-very-useful pr_err()
> - fix some typos in comment and coding style issues.
>
> Looking forward to your comments!
>
>  MAINTAINERS |2 +
>  include/linux/hyperv.h  |   13 +
>  include/linux/socket.h  |4 +-
>  include/net/af_hvsock.h |   78 +++
>  include/uapi/linux/hyperv.h |   24 +
>  net/Kconfig |1 +
>  net/Makefile|1 +
>  net/hv_sock/Kconfig |   10 +
>  net/hv_sock/Makefile|3 +
>  net/hv_sock/af_hvsock.c | 1523 
> +++
>  10 files changed, 1658 insertions(+), 1 deletion(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 50f69ba..6eaa26f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -5514,7 +5514,9 @@ F:  drivers/pci/host/pci-hyperv.c
>  F:   drivers/net/hyperv/
>  F:   drivers/scsi/storvsc_drv.c
>  F:   drivers/video/fbdev/hyperv_fb.c
> +F:   net/hv_sock/
>  F:   include/linux/hyperv.h
> +F:   include/net/af_hvsock.h
>  F:   tools/hv/
>  F:   Documentation/ABI/stable/sysfs-bus-vmbus
>
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index 50f493e..1cda6ea5 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -1508,5 +1508,18 @@ static inline void commit_rd_index(struct 
> vmbus_channel *channel)
>   vmbus_set_event(channel);
>  }
>
> +struct vmpipe_proto_header {
> + u32 pkt_type;
> + u32 data_size;
> +};
> +
> +#define HVSOCK_HEADER_LEN(sizeof(struct vmpacket_descriptor) + \
> +  sizeof(struct vmpipe_proto_header))
> +
> +/* See 'prev_indices' in hv_ringbuffer_read(), hv_ringbuffer_write() */
> +#define PREV_INDICES_LEN (sizeof(u64))
>
> +#define HVSOCK_PKT_LEN(payload_len)  (HVSOCK_HEADER_LEN + \
> + ALIGN((payload_len), 8) + \
> + PREV_INDICES_LEN)
>  #endif /* _HYPERV_H */
> diff --git a/include/linux/socket.h b/include/linux/socket.h
> index b5cc5a6..0b68b58 100644
> --- a/include/linux/socket.h
> +++ b/include/linux/socket.h
> @@ -202,8 +202,9 @@ struct ucred {
>  #define AF_VSOCK 40  /* vSockets 

Re: [PATCH v15 net-next 1/1] hv_sock: introduce Hyper-V Sockets

2016-07-08 Thread Cathy Avery
It's too bad about having to allocate the send/receive rings on a per 
socket basis. Hopefully this will change in the future.


I have built kernel 4.7.0-rc6 with this patch and did a quick test on 
the driver using rhel7 over hyperv Windows Server 2016 TP5. These were 
basic send and receive tests ( host to vm and vm to host ) using apps 
provided by Dexuan.


Reviewed-by: Cathy Avery 
Tested-by: Cathy Avery 

On 07/08/2016 03:47 AM, Dexuan Cui wrote:

Hyper-V Sockets (hv_sock) supplies a byte-stream based communication
mechanism between the host and the guest. It's somewhat like TCP over
VMBus, but the transportation layer (VMBus) is much simpler than IP.

With Hyper-V Sockets, applications between the host and the guest can talk
to each other directly by the traditional BSD-style socket APIs.

Hyper-V Sockets is only available on new Windows hosts, like Windows Server
2016. More info is in this article "Make your own integration services":
https://msdn.microsoft.com/en-us/virtualization/hyperv_on_windows/develop/make_mgmt_service

The patch implements the necessary support in the guest side by introducing
a new socket address family AF_HYPERV.

Signed-off-by: Dexuan Cui 
Cc: "K. Y. Srinivasan" 
Cc: Haiyang Zhang 
Cc: Vitaly Kuznetsov 
Cc: Cathy Avery 
---

You can also get the patch here (2764221d):
https://github.com/dcui/linux/commits/decui/hv_sock/net-next/20160708_v15

For the change log before v12, please see https://lkml.org/lkml/2016/5/15/31

In v12, the changes are mainly the following:

1) remove the module params as David suggested.

2) use 5 exact pages for VMBus send/recv rings, respectively.
The host side's design of the feature requires 5 exact pages for recv/send
rings respectively -- this is suboptimal considering memory consumption,
however unluckily we have to live with it, before the host comes up with
a new design in the future. :-(

3) remove the per-connection static send/recv buffers
Instead, we allocate and free the buffers dynamically only when we recv/send
data. This means: when a connection is idle, no memory is consumed as
recv/send buffers at all.

In v13:
I return ENOMEM on buffer alllocation failure

Actually "man read/write" says "Other errors may occur, depending on the
object connected to fd". "man send/recv" indeed lists ENOMEM.
Considering AF_HYPERV is a new socket type, ENOMEM seems OK here.
In the long run, I think we should add a new API in the VMBus driver,
allowing data copy from VMBus ringbuffer into user mode buffer directly.
This way, we can even eliminate this temporary buffer.

In v14:
fix some coding style issues pointed out by David.

In v15:
Just some stylistic changes addressing comments from Joe Perches and
Olaf Hering -- thank you!
- add a GPL blurb.
- define a new macro PAGE_SIZE_4K and use it to replace PAGE_SIZE
- change sk_to_hvsock/hvsock_to_sk() from macros to inline functions
- remove a not-very-useful pr_err()
- fix some typos in comment and coding style issues.

Looking forward to your comments!

  MAINTAINERS |2 +
  include/linux/hyperv.h  |   13 +
  include/linux/socket.h  |4 +-
  include/net/af_hvsock.h |   78 +++
  include/uapi/linux/hyperv.h |   24 +
  net/Kconfig |1 +
  net/Makefile|1 +
  net/hv_sock/Kconfig |   10 +
  net/hv_sock/Makefile|3 +
  net/hv_sock/af_hvsock.c | 1523 +++
  10 files changed, 1658 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 50f69ba..6eaa26f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5514,7 +5514,9 @@ F:drivers/pci/host/pci-hyperv.c
  F:drivers/net/hyperv/
  F:drivers/scsi/storvsc_drv.c
  F:drivers/video/fbdev/hyperv_fb.c
+F: net/hv_sock/
  F:include/linux/hyperv.h
+F: include/net/af_hvsock.h
  F:tools/hv/
  F:Documentation/ABI/stable/sysfs-bus-vmbus
  
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h

index 50f493e..1cda6ea5 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1508,5 +1508,18 @@ static inline void commit_rd_index(struct vmbus_channel 
*channel)
vmbus_set_event(channel);
  }
  
+struct vmpipe_proto_header {

+   u32 pkt_type;
+   u32 data_size;
+};
+
+#define HVSOCK_HEADER_LEN  (sizeof(struct vmpacket_descriptor) + \
+sizeof(struct vmpipe_proto_header))
+
+/* See 'prev_indices' in hv_ringbuffer_read(), hv_ringbuffer_write() */
+#define PREV_INDICES_LEN   (sizeof(u64))
  
+#define HVSOCK_PKT_LEN(payload_len)	(HVSOCK_HEADER_LEN + \

+   ALIGN((payload_len), 8) + \
+   PREV_INDICES_LEN)
  #endif /* _HYPERV_H */
diff --git a/include/linux/socket.h b/include/linux/socket.h

Re: [PATCH v15 net-next 1/1] hv_sock: introduce Hyper-V Sockets

2016-07-08 Thread Cathy Avery
It's too bad about having to allocate the send/receive rings on a per 
socket basis. Hopefully this will change in the future.


I have built kernel 4.7.0-rc6 with this patch and did a quick test on 
the driver using rhel7 over hyperv Windows Server 2016 TP5. These were 
basic send and receive tests ( host to vm and vm to host ) using apps 
provided by Dexuan.


Reviewed-by: Cathy Avery 
Tested-by: Cathy Avery 

On 07/08/2016 03:47 AM, Dexuan Cui wrote:

Hyper-V Sockets (hv_sock) supplies a byte-stream based communication
mechanism between the host and the guest. It's somewhat like TCP over
VMBus, but the transportation layer (VMBus) is much simpler than IP.

With Hyper-V Sockets, applications between the host and the guest can talk
to each other directly by the traditional BSD-style socket APIs.

Hyper-V Sockets is only available on new Windows hosts, like Windows Server
2016. More info is in this article "Make your own integration services":
https://msdn.microsoft.com/en-us/virtualization/hyperv_on_windows/develop/make_mgmt_service

The patch implements the necessary support in the guest side by introducing
a new socket address family AF_HYPERV.

Signed-off-by: Dexuan Cui 
Cc: "K. Y. Srinivasan" 
Cc: Haiyang Zhang 
Cc: Vitaly Kuznetsov 
Cc: Cathy Avery 
---

You can also get the patch here (2764221d):
https://github.com/dcui/linux/commits/decui/hv_sock/net-next/20160708_v15

For the change log before v12, please see https://lkml.org/lkml/2016/5/15/31

In v12, the changes are mainly the following:

1) remove the module params as David suggested.

2) use 5 exact pages for VMBus send/recv rings, respectively.
The host side's design of the feature requires 5 exact pages for recv/send
rings respectively -- this is suboptimal considering memory consumption,
however unluckily we have to live with it, before the host comes up with
a new design in the future. :-(

3) remove the per-connection static send/recv buffers
Instead, we allocate and free the buffers dynamically only when we recv/send
data. This means: when a connection is idle, no memory is consumed as
recv/send buffers at all.

In v13:
I return ENOMEM on buffer alllocation failure

Actually "man read/write" says "Other errors may occur, depending on the
object connected to fd". "man send/recv" indeed lists ENOMEM.
Considering AF_HYPERV is a new socket type, ENOMEM seems OK here.
In the long run, I think we should add a new API in the VMBus driver,
allowing data copy from VMBus ringbuffer into user mode buffer directly.
This way, we can even eliminate this temporary buffer.

In v14:
fix some coding style issues pointed out by David.

In v15:
Just some stylistic changes addressing comments from Joe Perches and
Olaf Hering -- thank you!
- add a GPL blurb.
- define a new macro PAGE_SIZE_4K and use it to replace PAGE_SIZE
- change sk_to_hvsock/hvsock_to_sk() from macros to inline functions
- remove a not-very-useful pr_err()
- fix some typos in comment and coding style issues.

Looking forward to your comments!

  MAINTAINERS |2 +
  include/linux/hyperv.h  |   13 +
  include/linux/socket.h  |4 +-
  include/net/af_hvsock.h |   78 +++
  include/uapi/linux/hyperv.h |   24 +
  net/Kconfig |1 +
  net/Makefile|1 +
  net/hv_sock/Kconfig |   10 +
  net/hv_sock/Makefile|3 +
  net/hv_sock/af_hvsock.c | 1523 +++
  10 files changed, 1658 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 50f69ba..6eaa26f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5514,7 +5514,9 @@ F:drivers/pci/host/pci-hyperv.c
  F:drivers/net/hyperv/
  F:drivers/scsi/storvsc_drv.c
  F:drivers/video/fbdev/hyperv_fb.c
+F: net/hv_sock/
  F:include/linux/hyperv.h
+F: include/net/af_hvsock.h
  F:tools/hv/
  F:Documentation/ABI/stable/sysfs-bus-vmbus
  
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h

index 50f493e..1cda6ea5 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1508,5 +1508,18 @@ static inline void commit_rd_index(struct vmbus_channel 
*channel)
vmbus_set_event(channel);
  }
  
+struct vmpipe_proto_header {

+   u32 pkt_type;
+   u32 data_size;
+};
+
+#define HVSOCK_HEADER_LEN  (sizeof(struct vmpacket_descriptor) + \
+sizeof(struct vmpipe_proto_header))
+
+/* See 'prev_indices' in hv_ringbuffer_read(), hv_ringbuffer_write() */
+#define PREV_INDICES_LEN   (sizeof(u64))
  
+#define HVSOCK_PKT_LEN(payload_len)	(HVSOCK_HEADER_LEN + \

+   ALIGN((payload_len), 8) + \
+   PREV_INDICES_LEN)
  #endif /* _HYPERV_H */
diff --git a/include/linux/socket.h b/include/linux/socket.h
index b5cc5a6..0b68b58 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -202,8 +202,9 @@ struct ucred {
  #define AF_VSOCK  

RE: [PATCH v15 net-next 1/1] hv_sock: introduce Hyper-V Sockets

2016-07-08 Thread Dexuan Cui
> From: Dexuan Cui
> Sent: Friday, July 8, 2016 15:47
> 
> You can also get the patch here (2764221d):
> https://github.com/dcui/linux/commits/decui/hv_sock/net-next/20160708_v15
> 
> In v14:
> fix some coding style issues pointed out by David.
> 
> In v15:
> Just some stylistic changes addressing comments from Joe Perches and
> Olaf Hering -- thank you!
> - add a GPL blurb.
> - define a new macro PAGE_SIZE_4K and use it to replace PAGE_SIZE
> - change sk_to_hvsock/hvsock_to_sk() from macros to inline functions
> - remove a not-very-useful pr_err()
> - fix some typos in comment and coding style issues.

FYI: the diff between v14 and v15 is attached: the diff is generated by 
git-diff-ing the 2 branches decui/hv_sock/net-next/20160629_v14 and 
decui/hv_sock/net-next/20160708_v15 in the above github repo.
 
Thanks,
-- Dexuan


delta_v14_vs.v15.patch
Description: delta_v14_vs.v15.patch


RE: [PATCH v15 net-next 1/1] hv_sock: introduce Hyper-V Sockets

2016-07-08 Thread Dexuan Cui
> From: Dexuan Cui
> Sent: Friday, July 8, 2016 15:47
> 
> You can also get the patch here (2764221d):
> https://github.com/dcui/linux/commits/decui/hv_sock/net-next/20160708_v15
> 
> In v14:
> fix some coding style issues pointed out by David.
> 
> In v15:
> Just some stylistic changes addressing comments from Joe Perches and
> Olaf Hering -- thank you!
> - add a GPL blurb.
> - define a new macro PAGE_SIZE_4K and use it to replace PAGE_SIZE
> - change sk_to_hvsock/hvsock_to_sk() from macros to inline functions
> - remove a not-very-useful pr_err()
> - fix some typos in comment and coding style issues.

FYI: the diff between v14 and v15 is attached: the diff is generated by 
git-diff-ing the 2 branches decui/hv_sock/net-next/20160629_v14 and 
decui/hv_sock/net-next/20160708_v15 in the above github repo.
 
Thanks,
-- Dexuan


delta_v14_vs.v15.patch
Description: delta_v14_vs.v15.patch