Re: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling

2018-02-14 Thread Stephen Hemminger
What about the following simplification. If on older hosts you can avoid more 
code
and the additional barrier. Also cleanup comment wording somewhat.



From de2d566d9092cbbc8e5974dea581617ef787ff69 Mon Sep 17 00:00:00 2001
From: Michael Kelley 
Date: Sat, 10 Feb 2018 20:48:49 +
Subject: [PATCH 1/9] Drivers: hv: vmbus: Fix ring buffer signaling

Fix bugs in signaling the Hyper-V host when freeing space in the
host->guest ring buffer:

1. The interrupt_mask must not be used to determine whether to signal
   on the host->guest ring buffer
2. The ring buffer write_index must be read (via hv_get_bytes_to_write)
   *after* pending_send_sz is read in order to avoid a race condition
3. Comparisons with pending_send_sz must treat the "equals" case as
   not-enough-space
4. Don't signal if the pending_send_sz feature is not present. Older
   versions of Hyper-V that don't implement this feature will poll.

Fixes: 03bad714a161 ("vmbus: more host signalling avoidance")
Signed-off-by: Michael Kelley 
---
 drivers/hv/ring_buffer.c | 67 +---
 1 file changed, 40 insertions(+), 27 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 50e071444a5c..d27cbb66279c 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -417,13 +417,25 @@ __hv_pkt_iter_next(struct vmbus_channel *channel,
 }
 EXPORT_SYMBOL_GPL(__hv_pkt_iter_next);
 
+/* How many bytes were read in this iterator cycle */
+static u32 hv_pkt_iter_bytes_read(const struct hv_ring_buffer_info *rbi)
+{
+   if (rbi->ring_buffer->read_index < rbi->priv_read_index)
+   return rbi->priv_read_index - rbi->ring_buffer->read_index;
+   else
+   return rbi->ring_datasize -
+   rbi->ring_buffer->read_index + rbi->priv_read_index;
+}
+
 /*
  * Update host ring buffer after iterating over packets.
  */
 void hv_pkt_iter_close(struct vmbus_channel *channel)
 {
struct hv_ring_buffer_info *rbi = >inbound;
-   u32 orig_write_sz = hv_get_bytes_to_write(rbi);
+   u32 curr_write_sz;
+   u32 delta = hv_pkt_iter_bytes_read(rbi);
+   u32 pending_sz;
 
/*
 * Make sure all reads are done before we update the read index since
@@ -433,39 +445,40 @@ void hv_pkt_iter_close(struct vmbus_channel *channel)
virt_rmb();
rbi->ring_buffer->read_index = rbi->priv_read_index;
 
+   /* Older versions do not require signalling */
+   if (!rbi->ring_buffer->feature_bits.feat_pending_send_sz)
+   return;
+
/*
-* Issue a full memory barrier before making the signaling decision.
-* Here is the reason for having this barrier:
-* If the reading of the pend_sz (in this function)
-* were to be reordered and read before we commit the new read
-* index (in the calling function)  we could
-* have a problem. If the host were to set the pending_sz after we
-* have sampled pending_sz and go to sleep before we commit the
-* read index, we could miss sending the interrupt. Issue a full
-* memory barrier to address this.
+* If the reading of the pend_sz were to be reordered and read
+* before we commit the new read index then we would have a
+* problem. If the host were to set the pending_sz after we
+* have sampled pending_sz and go to sleep before we commit
+* the read index, we could miss sending the interrupt.
 */
virt_mb();
 
-   /* If host has disabled notifications then skip */
-   if (rbi->ring_buffer->interrupt_mask)
-   return;
+   pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
 
-   if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
-   u32 pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
+   /*
+* Ensure the read of write_index in hv_get_bytes_to_write()
+* happens after the read of pending_send_sz.
+*/
+   virt_rmb();
+   curr_write_sz = hv_get_bytes_to_write(rbi);
 
-   /*
-* If there was space before we began iteration,
-* then host was not blocked. Also handles case where
-* pending_sz is zero then host has nothing pending
-* and does not need to be signaled.
-*/
-   if (orig_write_sz > pending_sz)
-   return;
+   /*
+* If there was space before we began iteration,
+* then host was not blocked. Also handles case where
+* pending_sz is zero then host has nothing pending and does
+* not need to be  signaled.
+*/
+   if (curr_write_sz - delta > pending_sz)
+   return;
 
-   /* If pending write will not fit, don't give false hope. */
-   if (hv_get_bytes_to_write(rbi) < pending_sz)
-   

Re: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling

2018-02-14 Thread Stephen Hemminger
What about the following simplification. If on older hosts you can avoid more 
code
and the additional barrier. Also cleanup comment wording somewhat.



From de2d566d9092cbbc8e5974dea581617ef787ff69 Mon Sep 17 00:00:00 2001
From: Michael Kelley 
Date: Sat, 10 Feb 2018 20:48:49 +
Subject: [PATCH 1/9] Drivers: hv: vmbus: Fix ring buffer signaling

Fix bugs in signaling the Hyper-V host when freeing space in the
host->guest ring buffer:

1. The interrupt_mask must not be used to determine whether to signal
   on the host->guest ring buffer
2. The ring buffer write_index must be read (via hv_get_bytes_to_write)
   *after* pending_send_sz is read in order to avoid a race condition
3. Comparisons with pending_send_sz must treat the "equals" case as
   not-enough-space
4. Don't signal if the pending_send_sz feature is not present. Older
   versions of Hyper-V that don't implement this feature will poll.

Fixes: 03bad714a161 ("vmbus: more host signalling avoidance")
Signed-off-by: Michael Kelley 
---
 drivers/hv/ring_buffer.c | 67 +---
 1 file changed, 40 insertions(+), 27 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 50e071444a5c..d27cbb66279c 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -417,13 +417,25 @@ __hv_pkt_iter_next(struct vmbus_channel *channel,
 }
 EXPORT_SYMBOL_GPL(__hv_pkt_iter_next);
 
+/* How many bytes were read in this iterator cycle */
+static u32 hv_pkt_iter_bytes_read(const struct hv_ring_buffer_info *rbi)
+{
+   if (rbi->ring_buffer->read_index < rbi->priv_read_index)
+   return rbi->priv_read_index - rbi->ring_buffer->read_index;
+   else
+   return rbi->ring_datasize -
+   rbi->ring_buffer->read_index + rbi->priv_read_index;
+}
+
 /*
  * Update host ring buffer after iterating over packets.
  */
 void hv_pkt_iter_close(struct vmbus_channel *channel)
 {
struct hv_ring_buffer_info *rbi = >inbound;
-   u32 orig_write_sz = hv_get_bytes_to_write(rbi);
+   u32 curr_write_sz;
+   u32 delta = hv_pkt_iter_bytes_read(rbi);
+   u32 pending_sz;
 
/*
 * Make sure all reads are done before we update the read index since
@@ -433,39 +445,40 @@ void hv_pkt_iter_close(struct vmbus_channel *channel)
virt_rmb();
rbi->ring_buffer->read_index = rbi->priv_read_index;
 
+   /* Older versions do not require signalling */
+   if (!rbi->ring_buffer->feature_bits.feat_pending_send_sz)
+   return;
+
/*
-* Issue a full memory barrier before making the signaling decision.
-* Here is the reason for having this barrier:
-* If the reading of the pend_sz (in this function)
-* were to be reordered and read before we commit the new read
-* index (in the calling function)  we could
-* have a problem. If the host were to set the pending_sz after we
-* have sampled pending_sz and go to sleep before we commit the
-* read index, we could miss sending the interrupt. Issue a full
-* memory barrier to address this.
+* If the reading of the pend_sz were to be reordered and read
+* before we commit the new read index then we would have a
+* problem. If the host were to set the pending_sz after we
+* have sampled pending_sz and go to sleep before we commit
+* the read index, we could miss sending the interrupt.
 */
virt_mb();
 
-   /* If host has disabled notifications then skip */
-   if (rbi->ring_buffer->interrupt_mask)
-   return;
+   pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
 
-   if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
-   u32 pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
+   /*
+* Ensure the read of write_index in hv_get_bytes_to_write()
+* happens after the read of pending_send_sz.
+*/
+   virt_rmb();
+   curr_write_sz = hv_get_bytes_to_write(rbi);
 
-   /*
-* If there was space before we began iteration,
-* then host was not blocked. Also handles case where
-* pending_sz is zero then host has nothing pending
-* and does not need to be signaled.
-*/
-   if (orig_write_sz > pending_sz)
-   return;
+   /*
+* If there was space before we began iteration,
+* then host was not blocked. Also handles case where
+* pending_sz is zero then host has nothing pending and does
+* not need to be  signaled.
+*/
+   if (curr_write_sz - delta > pending_sz)
+   return;
 
-   /* If pending write will not fit, don't give false hope. */
-   if (hv_get_bytes_to_write(rbi) < pending_sz)
-   return;
-   }
+   /* If pending 

RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling

2018-02-13 Thread Michael Kelley (EOSG)
> -Original Message-
> From: Stephen Hemminger <step...@networkplumber.org>
> Sent: Tuesday, February 13, 2018 9:35 AM
> To: Michael Kelley <mhkel...@outlook.com>
> Cc: Michael Kelley (EOSG) <michael.h.kel...@microsoft.com>; 
> gre...@linuxfoundation.org;
> linux-kernel@vger.kernel.org; de...@linuxdriverproject.org; o...@aepfle.de;
> a...@canonical.com; vkuzn...@redhat.com; jasow...@redhat.com;
> leann.ogasaw...@canonical.com; marcelo.ce...@canonical.com; Stephen Hemminger
> <sthem...@microsoft.com>; KY Srinivasan <k...@microsoft.com>
> Subject: Re: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer 
> signaling
> 
> 
> > if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
> > u32 pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
> >
> > /*
> > +* Ensure the read of write_index in hv_get_bytes_to_write()
> > +* happens after the read of pending_send_sz.
> > +*/
> > +   virt_rmb();
> > +   curr_write_sz = hv_get_bytes_to_write(rbi);
> > +
> > +   /*
> >  * If there was space before we began iteration,
> >  * then host was not blocked. Also handles case where
> >  * pending_sz is zero then host has nothing pending
> >  * and does not need to be signaled.
> >  */
> > -   if (orig_write_sz > pending_sz)
> > +   if (curr_write_sz - delta > pending_sz)
> > return;
> >
> > /* If pending write will not fit, don't give false hope. */
> > -   if (hv_get_bytes_to_write(rbi) < pending_sz)
> > +   if (curr_write_sz <= pending_sz)
> > return;
> > +
> > +   vmbus_setevent(channel);
> > }
> >
> > -   vmbus_setevent(channel);
> >  }
> 
> I think this won't work on older versions of Windows where feat_pending_sz
> is never set.  On those versions vmbus_setevent needs to always be called.

The older Windows/Hyper-V hosts poll if they can't write to the
host->guest ring buffer because it is full.   We don't want to call
vmbus_setevent() every time through this routine because then
the guest would be signaling the host every time the host interrupts
the guest.

Michael


RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling

2018-02-13 Thread Michael Kelley (EOSG)
> -Original Message-
> From: Stephen Hemminger 
> Sent: Tuesday, February 13, 2018 9:35 AM
> To: Michael Kelley 
> Cc: Michael Kelley (EOSG) ; 
> gre...@linuxfoundation.org;
> linux-kernel@vger.kernel.org; de...@linuxdriverproject.org; o...@aepfle.de;
> a...@canonical.com; vkuzn...@redhat.com; jasow...@redhat.com;
> leann.ogasaw...@canonical.com; marcelo.ce...@canonical.com; Stephen Hemminger
> ; KY Srinivasan 
> Subject: Re: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer 
> signaling
> 
> 
> > if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
> > u32 pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
> >
> > /*
> > +* Ensure the read of write_index in hv_get_bytes_to_write()
> > +* happens after the read of pending_send_sz.
> > +*/
> > +   virt_rmb();
> > +   curr_write_sz = hv_get_bytes_to_write(rbi);
> > +
> > +   /*
> >  * If there was space before we began iteration,
> >  * then host was not blocked. Also handles case where
> >  * pending_sz is zero then host has nothing pending
> >  * and does not need to be signaled.
> >  */
> > -   if (orig_write_sz > pending_sz)
> > +   if (curr_write_sz - delta > pending_sz)
> > return;
> >
> > /* If pending write will not fit, don't give false hope. */
> > -   if (hv_get_bytes_to_write(rbi) < pending_sz)
> > +   if (curr_write_sz <= pending_sz)
> > return;
> > +
> > +   vmbus_setevent(channel);
> > }
> >
> > -   vmbus_setevent(channel);
> >  }
> 
> I think this won't work on older versions of Windows where feat_pending_sz
> is never set.  On those versions vmbus_setevent needs to always be called.

The older Windows/Hyper-V hosts poll if they can't write to the
host->guest ring buffer because it is full.   We don't want to call
vmbus_setevent() every time through this routine because then
the guest would be signaling the host every time the host interrupts
the guest.

Michael


Re: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling

2018-02-13 Thread Stephen Hemminger

>   if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
>   u32 pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
>  
>   /*
> +  * Ensure the read of write_index in hv_get_bytes_to_write()
> +  * happens after the read of pending_send_sz.
> +  */
> + virt_rmb();
> + curr_write_sz = hv_get_bytes_to_write(rbi);
> +
> + /*
>* If there was space before we began iteration,
>* then host was not blocked. Also handles case where
>* pending_sz is zero then host has nothing pending
>* and does not need to be signaled.
>*/
> - if (orig_write_sz > pending_sz)
> + if (curr_write_sz - delta > pending_sz)
>   return;
>  
>   /* If pending write will not fit, don't give false hope. */
> - if (hv_get_bytes_to_write(rbi) < pending_sz)
> + if (curr_write_sz <= pending_sz)
>   return;
> +
> + vmbus_setevent(channel);
>   }
>  
> - vmbus_setevent(channel);
>  }

I think this won't work on older versions of Windows where feat_pending_sz
is never set.  On those versions vmbus_setevent needs to always be called.


Re: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling

2018-02-13 Thread Stephen Hemminger

>   if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
>   u32 pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
>  
>   /*
> +  * Ensure the read of write_index in hv_get_bytes_to_write()
> +  * happens after the read of pending_send_sz.
> +  */
> + virt_rmb();
> + curr_write_sz = hv_get_bytes_to_write(rbi);
> +
> + /*
>* If there was space before we began iteration,
>* then host was not blocked. Also handles case where
>* pending_sz is zero then host has nothing pending
>* and does not need to be signaled.
>*/
> - if (orig_write_sz > pending_sz)
> + if (curr_write_sz - delta > pending_sz)
>   return;
>  
>   /* If pending write will not fit, don't give false hope. */
> - if (hv_get_bytes_to_write(rbi) < pending_sz)
> + if (curr_write_sz <= pending_sz)
>   return;
> +
> + vmbus_setevent(channel);
>   }
>  
> - vmbus_setevent(channel);
>  }

I think this won't work on older versions of Windows where feat_pending_sz
is never set.  On those versions vmbus_setevent needs to always be called.


RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling

2018-02-11 Thread Michael Kelley (EOSG)
> -Original Message-
> From: KY Srinivasan
> Sent: Sunday, February 11, 2018 5:14 PM

--- snip ---

> > if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
> > u32 pending_sz = READ_ONCE(rbi->ring_buffer-
> > >pending_send_sz);
> >
> > /*
> > +* Ensure the read of write_index in
> > hv_get_bytes_to_write()
> > +* happens after the read of pending_send_sz.
> > +*/
> > +   virt_rmb();
> We can avoid the read barrier by making the initialization of curr_write_sz 
> conditional
> on pending_send_sz being non-zero. Indeed you can make all the signaling code 
> conditional
> on pending_send_sz being non-zero.
> 
> 

I agree that we can immediately test pending_send_sz for zero, and exit
if zero.  A  zero value will be by far the most common path, and would
save executing the read barrier and hv_get_bytes_to_write().  Can also
move the calculation of "delta" to after the test for zero.

But I believe the read barrier is still needed on the path where
pending_send_sz is non-zero.  Just the testing the value of pending_send_sz
doesn't guarantee that the write_index wouldn't be speculatively read,
and potentially out-of-order.   And we have to consider the out-of-order
behaviors on ARM64 as well.

> 
> > +   curr_write_sz = hv_get_bytes_to_write(rbi);
> > +
> > +   /*
> >  * If there was space before we began iteration,
> >  * then host was not blocked. Also handles case where
> >  * pending_sz is zero then host has nothing pending
> >  * and does not need to be signaled.
> >  */
> > -   if (orig_write_sz > pending_sz)
> > +   if (curr_write_sz - delta > pending_sz)
> > return;
> >
> > /* If pending write will not fit, don't give false hope. */
> > -   if (hv_get_bytes_to_write(rbi) < pending_sz)
> > +   if (curr_write_sz <= pending_sz)
> > return;
> > +
> > +   vmbus_setevent(channel);
> > }
> >
> > -   vmbus_setevent(channel);
> >  }
> >  EXPORT_SYMBOL_GPL(hv_pkt_iter_close);
> > --
> > 1.8.3.1



RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling

2018-02-11 Thread Michael Kelley (EOSG)
> -Original Message-
> From: KY Srinivasan
> Sent: Sunday, February 11, 2018 5:14 PM

--- snip ---

> > if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
> > u32 pending_sz = READ_ONCE(rbi->ring_buffer-
> > >pending_send_sz);
> >
> > /*
> > +* Ensure the read of write_index in
> > hv_get_bytes_to_write()
> > +* happens after the read of pending_send_sz.
> > +*/
> > +   virt_rmb();
> We can avoid the read barrier by making the initialization of curr_write_sz 
> conditional
> on pending_send_sz being non-zero. Indeed you can make all the signaling code 
> conditional
> on pending_send_sz being non-zero.
> 
> 

I agree that we can immediately test pending_send_sz for zero, and exit
if zero.  A  zero value will be by far the most common path, and would
save executing the read barrier and hv_get_bytes_to_write().  Can also
move the calculation of "delta" to after the test for zero.

But I believe the read barrier is still needed on the path where
pending_send_sz is non-zero.  Just the testing the value of pending_send_sz
doesn't guarantee that the write_index wouldn't be speculatively read,
and potentially out-of-order.   And we have to consider the out-of-order
behaviors on ARM64 as well.

> 
> > +   curr_write_sz = hv_get_bytes_to_write(rbi);
> > +
> > +   /*
> >  * If there was space before we began iteration,
> >  * then host was not blocked. Also handles case where
> >  * pending_sz is zero then host has nothing pending
> >  * and does not need to be signaled.
> >  */
> > -   if (orig_write_sz > pending_sz)
> > +   if (curr_write_sz - delta > pending_sz)
> > return;
> >
> > /* If pending write will not fit, don't give false hope. */
> > -   if (hv_get_bytes_to_write(rbi) < pending_sz)
> > +   if (curr_write_sz <= pending_sz)
> > return;
> > +
> > +   vmbus_setevent(channel);
> > }
> >
> > -   vmbus_setevent(channel);
> >  }
> >  EXPORT_SYMBOL_GPL(hv_pkt_iter_close);
> > --
> > 1.8.3.1



RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling

2018-02-11 Thread KY Srinivasan


> -Original Message-
> From: Michael Kelley [mailto:mhkel...@outlook.com]
> Sent: Saturday, February 10, 2018 12:49 PM
> To: gre...@linuxfoundation.org; linux-kernel@vger.kernel.org;
> de...@linuxdriverproject.org; o...@aepfle.de; a...@canonical.com;
> vkuzn...@redhat.com; jasow...@redhat.com;
> leann.ogasaw...@canonical.com; marcelo.ce...@canonical.com; Stephen
> Hemminger ; KY Srinivasan
> 
> Cc: Michael Kelley (EOSG) 
> Subject: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling
> 
> Fix bugs in signaling the Hyper-V host when freeing space in the
> host->guest ring buffer:
> 
> 1. The interrupt_mask must not be used to determine whether to signal
>on the host->guest ring buffer
> 2. The ring buffer write_index must be read (via hv_get_bytes_to_write)
>*after* pending_send_sz is read in order to avoid a race condition
> 3. Comparisons with pending_send_sz must treat the "equals" case as
>not-enough-space
> 4. Don't signal if the pending_send_sz feature is not present. Older
>versions of Hyper-V that don't implement this feature will poll.
> 
> Fixes: 03bad714a161 ("vmbus: more host signalling avoidance")
> Signed-off-by: Michael Kelley 
> ---
>  drivers/hv/ring_buffer.c | 24 
>  1 file changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
> index 50e0714..b64be18 100644
> --- a/drivers/hv/ring_buffer.c
> +++ b/drivers/hv/ring_buffer.c
> @@ -423,7 +423,11 @@ struct vmpacket_descriptor *
>  void hv_pkt_iter_close(struct vmbus_channel *channel)
>  {
>   struct hv_ring_buffer_info *rbi = >inbound;
> - u32 orig_write_sz = hv_get_bytes_to_write(rbi);
> + u32 curr_write_sz;
> + u32 delta = rbi->ring_buffer->read_index < rbi->priv_read_index ?
> + (rbi->priv_read_index - rbi->ring_buffer-
> >read_index) :
> + (rbi->ring_datasize - rbi->ring_buffer->read_index +
> + rbi->priv_read_index);
> 
>   /*
>* Make sure all reads are done before we update the read index
> since
> @@ -446,27 +450,31 @@ void hv_pkt_iter_close(struct vmbus_channel
> *channel)
>*/
>   virt_mb();
> 
> - /* If host has disabled notifications then skip */
> - if (rbi->ring_buffer->interrupt_mask)
> - return;
> -
>   if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
>   u32 pending_sz = READ_ONCE(rbi->ring_buffer-
> >pending_send_sz);
> 
>   /*
> +  * Ensure the read of write_index in
> hv_get_bytes_to_write()
> +  * happens after the read of pending_send_sz.
> +  */
> + virt_rmb();
We can avoid the read barrier by making the initialization of curr_write_sz 
conditional
on pending_send_sz being non-zero. Indeed you can make all the signaling code 
conditional
on pending_send_sz being non-zero.



> + curr_write_sz = hv_get_bytes_to_write(rbi);
> +
> + /*
>* If there was space before we began iteration,
>* then host was not blocked. Also handles case where
>* pending_sz is zero then host has nothing pending
>* and does not need to be signaled.
>*/
> - if (orig_write_sz > pending_sz)
> + if (curr_write_sz - delta > pending_sz)
>   return;
> 
>   /* If pending write will not fit, don't give false hope. */
> - if (hv_get_bytes_to_write(rbi) < pending_sz)
> + if (curr_write_sz <= pending_sz)
>   return;
> +
> + vmbus_setevent(channel);
>   }
> 
> - vmbus_setevent(channel);
>  }
>  EXPORT_SYMBOL_GPL(hv_pkt_iter_close);
> --
> 1.8.3.1



RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling

2018-02-11 Thread KY Srinivasan


> -Original Message-
> From: Michael Kelley [mailto:mhkel...@outlook.com]
> Sent: Saturday, February 10, 2018 12:49 PM
> To: gre...@linuxfoundation.org; linux-kernel@vger.kernel.org;
> de...@linuxdriverproject.org; o...@aepfle.de; a...@canonical.com;
> vkuzn...@redhat.com; jasow...@redhat.com;
> leann.ogasaw...@canonical.com; marcelo.ce...@canonical.com; Stephen
> Hemminger ; KY Srinivasan
> 
> Cc: Michael Kelley (EOSG) 
> Subject: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling
> 
> Fix bugs in signaling the Hyper-V host when freeing space in the
> host->guest ring buffer:
> 
> 1. The interrupt_mask must not be used to determine whether to signal
>on the host->guest ring buffer
> 2. The ring buffer write_index must be read (via hv_get_bytes_to_write)
>*after* pending_send_sz is read in order to avoid a race condition
> 3. Comparisons with pending_send_sz must treat the "equals" case as
>not-enough-space
> 4. Don't signal if the pending_send_sz feature is not present. Older
>versions of Hyper-V that don't implement this feature will poll.
> 
> Fixes: 03bad714a161 ("vmbus: more host signalling avoidance")
> Signed-off-by: Michael Kelley 
> ---
>  drivers/hv/ring_buffer.c | 24 
>  1 file changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
> index 50e0714..b64be18 100644
> --- a/drivers/hv/ring_buffer.c
> +++ b/drivers/hv/ring_buffer.c
> @@ -423,7 +423,11 @@ struct vmpacket_descriptor *
>  void hv_pkt_iter_close(struct vmbus_channel *channel)
>  {
>   struct hv_ring_buffer_info *rbi = >inbound;
> - u32 orig_write_sz = hv_get_bytes_to_write(rbi);
> + u32 curr_write_sz;
> + u32 delta = rbi->ring_buffer->read_index < rbi->priv_read_index ?
> + (rbi->priv_read_index - rbi->ring_buffer-
> >read_index) :
> + (rbi->ring_datasize - rbi->ring_buffer->read_index +
> + rbi->priv_read_index);
> 
>   /*
>* Make sure all reads are done before we update the read index
> since
> @@ -446,27 +450,31 @@ void hv_pkt_iter_close(struct vmbus_channel
> *channel)
>*/
>   virt_mb();
> 
> - /* If host has disabled notifications then skip */
> - if (rbi->ring_buffer->interrupt_mask)
> - return;
> -
>   if (rbi->ring_buffer->feature_bits.feat_pending_send_sz) {
>   u32 pending_sz = READ_ONCE(rbi->ring_buffer-
> >pending_send_sz);
> 
>   /*
> +  * Ensure the read of write_index in
> hv_get_bytes_to_write()
> +  * happens after the read of pending_send_sz.
> +  */
> + virt_rmb();
We can avoid the read barrier by making the initialization of curr_write_sz 
conditional
on pending_send_sz being non-zero. Indeed you can make all the signaling code 
conditional
on pending_send_sz being non-zero.



> + curr_write_sz = hv_get_bytes_to_write(rbi);
> +
> + /*
>* If there was space before we began iteration,
>* then host was not blocked. Also handles case where
>* pending_sz is zero then host has nothing pending
>* and does not need to be signaled.
>*/
> - if (orig_write_sz > pending_sz)
> + if (curr_write_sz - delta > pending_sz)
>   return;
> 
>   /* If pending write will not fit, don't give false hope. */
> - if (hv_get_bytes_to_write(rbi) < pending_sz)
> + if (curr_write_sz <= pending_sz)
>   return;
> +
> + vmbus_setevent(channel);
>   }
> 
> - vmbus_setevent(channel);
>  }
>  EXPORT_SYMBOL_GPL(hv_pkt_iter_close);
> --
> 1.8.3.1



Re: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling

2018-02-11 Thread Stephen Hemminger
On Sat, 10 Feb 2018 20:48:49 +
Michael Kelley  wrote:

> + u32 delta = rbi->ring_buffer->read_index < rbi->priv_read_index ?
> + (rbi->priv_read_index - rbi->ring_buffer->read_index) :
> + (rbi->ring_datasize - rbi->ring_buffer->read_index +
> + rbi->priv_read_index);
>  

Would it would make sense for this to be a wrapper function like other ring 
operations?


Re: [PATCH char-misc 1/1] Drivers: hv: vmbus: Fix ring buffer signaling

2018-02-11 Thread Stephen Hemminger
On Sat, 10 Feb 2018 20:48:49 +
Michael Kelley  wrote:

> + u32 delta = rbi->ring_buffer->read_index < rbi->priv_read_index ?
> + (rbi->priv_read_index - rbi->ring_buffer->read_index) :
> + (rbi->ring_datasize - rbi->ring_buffer->read_index +
> + rbi->priv_read_index);
>  

Would it would make sense for this to be a wrapper function like other ring 
operations?