Re: [PATCH] isdn: hysdn: fix code style error from checkpatch

2019-08-02 Thread Stephen Hemminger
On Fri,  2 Aug 2019 19:50:17 +
Ricardo Bruno Lopes da Silva  wrote:

> Fix error bellow from checkpatch.
> 
> WARNING: Block comments use * on subsequent lines
> +/***
> +
> 
> Signed-off-by: Ricardo Bruno Lopes da Silva 

Read the TODO, these drivers are scheduled for removal, so changes
are not helpful at this time.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH] isdn: hysdn: Fix error spaces around '*'

2019-08-02 Thread Stephen Hemminger
On Fri,  2 Aug 2019 19:56:02 +
Jose Carlos Cazarin Filho  wrote:

> Fix checkpath error:
> CHECK: spaces preferred around that '*' (ctx:WxV)
> +extern hysdn_card *card_root;/* pointer to first card */
> 
> Signed-off-by: Jose Carlos Cazarin Filho 


Read the TODO, these drivers are scheduled for removal, so changes
are not helpful at this time.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH 0/3] pci-hyperv: fix memory leak and add pci_destroy_slot()

2019-03-05 Thread Stephen Hemminger
On Mon, 4 Mar 2019 21:34:47 +
Dexuan Cui  wrote:

> Patch #1 fixes a memory leak caused by incorrectly-maintained hpdev->refs.
> 
> Patch #2 and #3 make sure the "slot" is removed in all the scenarios.
> Without them, in the quick hot-add/hot-remove test, systemd-dev may easily
> crash when trying to access a dangling sys file in /sys/bus/pci/slots/:
> "BUG: unable to handle kernel paging request".
> 
> BTW, Patch #2 was posted on Feb 7, 2019, and this is the v2: the change
> to hv_eject_device_work() in v1 is removed, as the change is only needed
> when we hot-remove the device and remove the pci-hyperv driver at the 
> same time. It looks more work is required to make this scenaro work
> correctly, and since removing the driver is not really a "usual" usage,
> we can address this scenario in the future.
> 
> Please review the patchset.
> 
> Dexuan Cui (3):
>   PCI: hv: Fix a memory leak in hv_eject_device_work()
>   PCI: hv: Add hv_pci_remove_slots() when we unload the driver
>   PCI: hv: Add pci_destroy_slot() in pci_devices_present_work(), if
> necessary
> 
>  drivers/pci/controller/pci-hyperv.c | 23 +++
>  1 file changed, 23 insertions(+)


Thanks for fixing this.

Reviewed-by: Stephen Hemminger 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


RE: [PATCH V3 1/10] X86/Hyper-V: Add parameter offset for hyperv_fill_flush_guest_mapping_list()

2019-02-22 Thread Stephen Hemminger
int hyperv_fill_flush_guest_mapping_list(
struct hv_guest_mapping_flush_list *flush,
-   u64 start_gfn, u64 pages)
+   int offset, u64 start_gfn, u64 pages)
 {
u64 cur = start_gfn;
u64 additional_pages;
-   int gpa_n = 0;
+   int gpa_n = offset;
 
do {
/*

Do you mean to support negative offsets here? Maybe unsigned would be better?
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH 2/2] Drivers: hv: vmbus: Display nothing in sysfs if monitor_allocated not set

2019-02-14 Thread Stephen Hemminger
On Thu, 14 Feb 2019 01:11:03 -0500
Kimberly Brown  wrote:

> On Mon, Feb 11, 2019 at 10:02:47AM -0800, Stephen Hemminger wrote:
> > On Mon, 11 Feb 2019 02:01:18 -0500
> > Kimberly Brown  wrote:
> >   
> > > On Fri, Feb 08, 2019 at 02:32:09PM -0800, Stephen Hemminger wrote:  
> > > > On Fri, 8 Feb 2019 05:01:12 -0500
> > > > Kimberly Brown  wrote:
> > > > 
> > > > You are right, the current behavior is broken.
> > > > It would be good to add a description of under what conditions
> > > > monitor is not used. Is this some part of a project emulating
> > > > Hyper-V?
> > > > 
> > > 
> > > I'm not sure which conditions determine whether the monitor mechanism is
> > > used. I've searched the Hypervisor TLFS, and I couldn't find any
> > > information. If you have any suggestions for where I can find this
> > > information, please let me know.  
> > 
> > The monitor page stuff pre-dates my involvement with Hyper-V. KY might know.
> > But based on comments it looks like it was added to avoid hypercalls
> > for each message. It probably showed up in Windows Server 2012 timeframe.
> > 
> > To test you might want to dig up Windows Server 2008.
> >
> 
> It looks like the monitor mechanism has always been used. It's present in the
> earliest commit that I can find: 3e7ee4902fe6 ("add the Hyper-V virtual bus")
> from 2009.
> 
> I propose that the following sentences be added to the sysfs documentation for
> the affected attributes:
> 
> "The monitor page mechanism is used for performance critical channels 
> (storage,
> network, etc.). Channels that do not use the monitor page mechanism will 
> return
> EINVAL."
> 
> I think that this provides sufficient information for a user to understand why
> opening an affected file can return EINVAL. What do you think?

Thanks for following up. I agree with you EINVAL works as a solution.
My understanding is that their are two ways a channel can work. The first one is
for the guest to send a hyper call to the host to indicate when data is 
available.
The other is for the guest to indicate by setting a bit in shared memory with 
host.

The shared memory approach reduces host/guest overhead and allows for more 
opportunities
for batching in the ring. The host checks the shared memory on a polling 
interval
defined in the latency field.

The hypercall method does not use the monitor page. It has lower latency (no 
polling).
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH 2/2] Drivers: hv: vmbus: Display nothing in sysfs if monitor_allocated not set

2019-02-11 Thread Stephen Hemminger
On Mon, 11 Feb 2019 02:01:18 -0500
Kimberly Brown  wrote:

> On Fri, Feb 08, 2019 at 02:32:09PM -0800, Stephen Hemminger wrote:
> > On Fri, 8 Feb 2019 05:01:12 -0500
> > Kimberly Brown  wrote:
> > 
> > You are right, the current behavior is broken.
> > It would be good to add a description of under what conditions
> > monitor is not used. Is this some part of a project emulating
> > Hyper-V?
> >   
> 
> I'm not sure which conditions determine whether the monitor mechanism is
> used. I've searched the Hypervisor TLFS, and I couldn't find any
> information. If you have any suggestions for where I can find this
> information, please let me know.

The monitor page stuff pre-dates my involvement with Hyper-V. KY might know.
But based on comments it looks like it was added to avoid hypercalls
for each message. It probably showed up in Windows Server 2012 timeframe.

To test you might want to dig up Windows Server 2008.
 
> No, I'm not working on a project emulating Hyper-V.

OK, I had heard that KVM project was doing something with QEMU.

> >   
> > > +
> > > + if (!hv_dev->channel->offermsg.monitor_allocated)
> > > + return sprintf(buf, "\n");  
> > 
> > If monitor is not used, why not return an error instead of empty
> > data. Any program (or user) would have to handle that already.  
> 
> I think that returning an error instead is fine. I'll make this change
> in the next version of the patch.

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH 2/2] Drivers: hv: vmbus: Display nothing in sysfs if monitor_allocated not set

2019-02-08 Thread Stephen Hemminger
On Fri, 8 Feb 2019 05:01:12 -0500
Kimberly Brown  wrote:

You are right, the current behavior is broken.
It would be good to add a description of under what conditions
monitor is not used. Is this some part of a project emulating
Hyper-V?


> +
> + if (!hv_dev->channel->offermsg.monitor_allocated)
> + return sprintf(buf, "\n");

If monitor is not used, why not return an error instead of empty
data. Any program (or user) would have to handle that already.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH 1/2] Drivers: hv: vmbus: Change server monitor_pages index to 0

2019-02-08 Thread Stephen Hemminger
On Fri, 8 Feb 2019 04:58:52 -0500
Kimberly Brown  wrote:

> Change the monitor_pages index in server_monitor_pending_show() to '0'.
> '0' is the correct monitor_pages index for the server. A comment for the
> monitor_pages field in the vmbus_connection struct definition indicates
> that the 1st page is for parent->child notifications. In addition, the
> server_monitor_latency_show() and server_monitor_conn_id_show()
> functions use monitor_pages index '0'.
> 
> Signed-off-by: Kimberly Brown 
> ---
>  drivers/hv/vmbus_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 403fee01572c..f2a79f5129d7 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -234,7 +234,7 @@ static ssize_t server_monitor_pending_show(struct device 
> *dev,
>   return -ENODEV;
>   return sprintf(buf, "%d\n",
>  channel_pending(hv_dev->channel,
> -vmbus_connection.monitor_pages[1]));
> +vmbus_connection.monitor_pages[0]));
>  }
>  static DEVICE_ATTR_RO(server_monitor_pending);


Looks good.

I wonder if ever gets used though since it returned incorrect data...

Acked-by: Stephen Hemminger 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH] PCI: hv: Add hv_pci_remove_slots() when we unload the driver

2019-02-07 Thread Stephen Hemminger
On Thu, 7 Feb 2019 20:36:32 +
Dexuan Cui  wrote:

> When we unload pci-hyperv, the host doesn't send us a PCI_EJECT message.
> In this case we also need to make sure the sysfs pci slot directory
> is removed, otherwise "cat /sys/bus/pci/slots/2/address" will trigger
> "BUG: unable to handle kernel paging request". And, if we unload/reload
> the driver several times, we'll have multiple pci slot directories in
> /sys/bus/pci/slots/ like this:
> 
> root@localhost:~# ls -rtl  /sys/bus/pci/slots/
> total 0
> drwxr-xr-x 2 root root 0 Feb  7 10:49 2
> drwxr-xr-x 2 root root 0 Feb  7 10:49 2-1
> drwxr-xr-x 2 root root 0 Feb  7 10:51 2-2
> 
> The patch adds the missing code, and in hv_eject_device_work() it also
> moves pci_destroy_slot() to an earlier place where we hold the pci lock.
> 
> Fixes: a15f2c08c708 ("PCI: hv: support reporting serial number as slot 
> information")
> Signed-off-by: Dexuan Cui 
> Cc: sta...@vger.kernel.org
> Cc: Stephen Hemminger 

Acked-by: Stephen Hemminger 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v3] Drivers: hv: vmbus: Expose counters for interrupts and full conditions

2019-01-17 Thread Stephen Hemminger



> +static ssize_t channel_intr_in_full_show(const struct vmbus_channel
> *channel,
> +  char *buf)
> +{
> + return sprintf(buf, "%llu\n", channel->intr_in_full);
> +}


intr_in_full is u64, which is not the same as unsigned long long.
to be correct you need a cast here.

> > diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> > index dcb6977afce9..7e5239123276 100644
> > --- a/include/linux/hyperv.h
> > +++ b/include/linux/hyperv.h
> > @@ -751,6 +751,27 @@ struct vmbus_channel {
> > u64 interrupts; /* Host to Guest interrupts */
> > u64 sig_events; /* Guest to Host events */
> > 
> > +   /* Interrupt counts for 2 types of Guest to Host interrupts */
> > +   u64 intr_in_full;   /* in ring buffer, full to not full */
> > +   u64 intr_out_empty; /* out ring buffer, empty to not empty */
> > +
> > +   /*
> > +* The total number of write operations that encountered a full
> > +* outbound ring buffer.
> > +*/
> > +   u64 out_full_total;
> > +   /*
> > +* The number of write operations that were the first to encounter a
> > +* full outbound ring buffer.
> > +*/
> > +   u64 out_full_first;

Adding more fields changes cache layout which can cause
additional cache miss in the hot path.  

> > +   /*
> > +* Indicates that a full outbound ring buffer was encountered. The flag
> > +* is set to true when a full outbound ring buffer is encountered and
> > +* set to false when a write to the outbound ring buffer is completed.
> > +*/
> > +   bool out_full_flag;

Discussion on kernel mailing list. Recommends against putting bool
in structures since that pads to full sizeof(int).  Could this be
part of a bitfield?

> > /* Channel callback's invoked in softirq context */
> > struct tasklet_struct callback_event;
> > void (*onchannel_callback)(void *context);
> > @@ -936,6 +957,23 @@ static inline void *get_per_channel_state(struct
> > vmbus_channel *c)
> >  static inline void set_channel_pending_send_size(struct vmbus_channel *c,
> >  u32 size)
> >  {
> > +   unsigned long flags;
> > +
> > +   spin_lock_irqsave(>outbound.ring_lock, flags);
> > +
> > +   if (size) {
> > +   ++c->out_full_total;
> > +
> > +   if (!c->out_full_flag) {
> > +   ++c->out_full_first;
> > +   c->out_full_flag = true;
> > +   }
> > +   } else {
> > +   c->out_full_flag = false;
> > +   }
> > +
> > +   spin_unlock_irqrestore(>outbound.ring_lock, flags);

If this is called often, the additional locking will impact performance.

> > c->outbound.ring_buffer->pending_send_sz = size;
> >  }
> > 

Could I propose another alternative.

It might be more useful to count the guest to host interaction events
rather than the ring buffer.

For example the number of calls to:
vmbus_set_event which means host exit call
vmbus_setevent fastpath using sync_set_bit
calls to rinbuffer_write that returned -EAGAIN

These would require less locking, reuse existing code paths
and not require additional state.

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH] Drivers: hv: vmbus: Return -EINVAL for the sys files for unopened channels

2018-12-17 Thread Stephen Hemminger
On Mon, 17 Dec 2018 18:44:12 +
Dexuan Cui  wrote:

> > From: devel  On Behalf Of
> > Dexuan Cui
> > Sent: Monday, December 17, 2018 10:31 AM  
> > > From: Stephen Hemminger 
> > >
> > > The old code was risky because it would silently return stack garbage.
> > > Having an error check in get_debuginfo would eliminate that.  
> > 
> > OK, then let me make another patch based on the latest char-misc-linus.
> > 
> > -- Dexuan  
> 
> Hi Stephen, your patch can apply cleanly. Let me rebase your patch to
> char-misc-linus, do a test, and then post it with your Signed-off-by and 
> mine: 
> I assume you're Ok with this. Please let me know in case it's not. :-)
> 
> Thanks,
> -- Dexuan

Sure.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH] Drivers: hv: vmbus: Return -EINVAL for the sys files for unopened channels

2018-12-17 Thread Stephen Hemminger
On Mon, 17 Dec 2018 18:00:29 +
Dexuan Cui  wrote:

> > From: Stephen Hemminger 
> > On Thu, 13 Dec 2018 16:35:43 +
> > Dexuan Cui  wrote:
> >   
> > > Before 98f4c651762c, we returned zeros for unopened channels.
> > > With 98f4c651762c, we started to return random on-stack values.
> > >
> > > We'd better return -EINVAL instead.  
> > 
> > The concept looks fine, but maybe it would be simpler to move it into
> > hv_ringbuffer_get_debuginfo and have it return an error code.
> > 
> > Since so much of the code is repeated, I would probably make a
> > macro which generates the code as well.
> > 
> > Something like this:  
> 
> Thanks, Stephen! Now the patch has been in char-misc's char-misc-linus
> branch, so IMO we may as well leave it as is (considering the code here is
> unlikely to be frqeuencly changed), and we have a smaller patch this way. :-)
> 
> But, yes, I agree with you that generally we should make a common
> function to avoid duplicate code.
> 
> Thanks,
> -- Dexuan

The old code was risky because it would silently return stack garbage.
Having an error check in get_debuginfo would eliminate that.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH] Drivers: hv: vmbus: Return -EINVAL for the sys files for unopened channels

2018-12-17 Thread Stephen Hemminger
On Thu, 13 Dec 2018 16:35:43 +
Dexuan Cui  wrote:

> Before 98f4c651762c, we returned zeros for unopened channels.
> With 98f4c651762c, we started to return random on-stack values.
> 
> We'd better return -EINVAL instead.
> 
> Fixes: 98f4c651762c ("hv: move ringbuffer bus attributes to dev_groups")
> Cc: sta...@vger.kernel.org
> Cc: K. Y. Srinivasan 
> Cc: Haiyang Zhang 
> Cc: Stephen Hemminger 
> Signed-off-by: Dexuan Cui 

The concept looks fine, but maybe it would be simpler to move it into
hv_ringbuffer_get_debuginfo and have it return an error code.

Since so much of the code is repeated, I would probably make a
macro which generates the code as well.

Something like this:

>From c6bbdbcde933c85098f7b3e71650a8479d52810c Mon Sep 17 00:00:00 2001
From: Stephen Hemminger 
Date: Mon, 17 Dec 2018 09:13:24 -0800
Subject: [PATCH] hv: vmbus: check for ring in debug info

---
 drivers/hv/ring_buffer.c | 31 +-
 drivers/hv/vmbus_drv.c   | 71 ++--
 include/linux/hyperv.h   |  5 +--
 3 files changed, 79 insertions(+), 28 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 64d0c85d5161..1f1a55e07733 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -164,26 +164,25 @@ hv_get_ringbuffer_availbytes(const struct 
hv_ring_buffer_info *rbi,
 }
 
 /* Get various debug metrics for the specified ring buffer. */
-void hv_ringbuffer_get_debuginfo(const struct hv_ring_buffer_info *ring_info,
-struct hv_ring_buffer_debug_info *debug_info)
+int hv_ringbuffer_get_debuginfo(const struct hv_ring_buffer_info *ring_info,
+   struct hv_ring_buffer_debug_info *debug_info)
 {
u32 bytes_avail_towrite;
u32 bytes_avail_toread;
 
-   if (ring_info->ring_buffer) {
-   hv_get_ringbuffer_availbytes(ring_info,
-   _avail_toread,
-   _avail_towrite);
-
-   debug_info->bytes_avail_toread = bytes_avail_toread;
-   debug_info->bytes_avail_towrite = bytes_avail_towrite;
-   debug_info->current_read_index =
-   ring_info->ring_buffer->read_index;
-   debug_info->current_write_index =
-   ring_info->ring_buffer->write_index;
-   debug_info->current_interrupt_mask =
-   ring_info->ring_buffer->interrupt_mask;
-   }
+   if (!ring_info->ring_buffer)
+   return -EINVAL;
+
+   hv_get_ringbuffer_availbytes(ring_info,
+_avail_toread,
+_avail_towrite);
+   debug_info->bytes_avail_toread = bytes_avail_toread;
+   debug_info->bytes_avail_towrite = bytes_avail_towrite;
+   debug_info->current_read_index = ring_info->ring_buffer->read_index;
+   debug_info->current_write_index = ring_info->ring_buffer->write_index;
+   debug_info->current_interrupt_mask
+   = ring_info->ring_buffer->interrupt_mask;
+   return 0;
 }
 EXPORT_SYMBOL_GPL(hv_ringbuffer_get_debuginfo);
 
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 283d184280af..403fee01572c 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -313,10 +313,16 @@ static ssize_t out_intr_mask_show(struct device *dev,
 {
struct hv_device *hv_dev = device_to_hv_device(dev);
struct hv_ring_buffer_debug_info outbound;
+   int ret;
 
if (!hv_dev->channel)
return -ENODEV;
-   hv_ringbuffer_get_debuginfo(_dev->channel->outbound, );
+
+   ret = hv_ringbuffer_get_debuginfo(_dev->channel->outbound,
+ );
+   if (ret < 0)
+   return ret;
+
return sprintf(buf, "%d\n", outbound.current_interrupt_mask);
 }
 static DEVICE_ATTR_RO(out_intr_mask);
@@ -326,10 +332,15 @@ static ssize_t out_read_index_show(struct device *dev,
 {
struct hv_device *hv_dev = device_to_hv_device(dev);
struct hv_ring_buffer_debug_info outbound;
+   int ret;
 
if (!hv_dev->channel)
return -ENODEV;
-   hv_ringbuffer_get_debuginfo(_dev->channel->outbound, );
+
+   ret = hv_ringbuffer_get_debuginfo(_dev->channel->outbound,
+ );
+   if (ret < 0)
+   return ret;
return sprintf(buf, "%d\n", outbound.current_read_index);
 }
 static DEVICE_ATTR_RO(out_read_index);
@@ -340,10 +351,15 @@ static ssize_t out_write_index_show(struct device *dev,
 {
struct hv_device *hv_dev = device_to_hv_device(dev);
struct hv_ring_buffer_debug_info outbound;
+   int ret;
 
if (!hv_dev

[PATCH] uio_hv_generic: set callbacks on open

2018-12-10 Thread Stephen Hemminger
This fixes the problem where uio application was unable to
use multple queues on restart. The root cause is that the callbacks
are cleared on disconnect. Change to setting up callbacks
everytime in open.

Fixes: cdfa835c6e5e ("uio_hv_generic: defer opening vmbus until first use")
Reported-by: Mohammed Gamal 
Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio_hv_generic.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index c2493d011225..3c5169eb23f5 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -204,9 +204,11 @@ hv_uio_open(struct uio_info *info, struct inode *inode)
if (atomic_inc_return(>refcnt) != 1)
return 0;
 
+   vmbus_set_chn_rescind_callback(dev->channel, hv_uio_rescind);
+   vmbus_set_sc_create_callback(dev->channel, hv_uio_new_channel);
+
ret = vmbus_connect_ring(dev->channel,
 hv_uio_channel_cb, dev->channel);
-
if (ret == 0)
dev->channel->inbound.ring_buffer->interrupt_mask = 1;
else
@@ -334,9 +336,6 @@ hv_uio_probe(struct hv_device *dev,
goto fail_close;
}
 
-   vmbus_set_chn_rescind_callback(channel, hv_uio_rescind);
-   vmbus_set_sc_create_callback(channel, hv_uio_new_channel);
-
ret = sysfs_create_bin_file(>kobj, _buffer_bin_attr);
if (ret)
dev_notice(>device,
-- 
2.19.2

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH] vmbus: fix subchannel removal

2018-12-07 Thread Stephen Hemminger
The changes to split ring allocation from open/close, broke
the cleanup of subchannels. This resulted in problems using
uio on network devices because the subchannel was left behind
when the network device was unbound.

The cause was in the disconnect logic which used list splice
to move the subchannel list into a local variable. This won't
work because the subchannel list is needed later during the
process of the rescind messages (relid2channel).

The fix is to just leave the subchannel list in place
which is what the original code did. The list is cleaned
up later when the host rescind is processed.

Fixes: ae6935ed7d42 ("vmbus: split ring buffer allocation from open")
Signed-off-by: Stephen Hemminger 
---
 drivers/hv/channel.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index fe00b12e4417..bea4c9850247 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -701,20 +701,12 @@ static int vmbus_close_internal(struct vmbus_channel 
*channel)
 int vmbus_disconnect_ring(struct vmbus_channel *channel)
 {
struct vmbus_channel *cur_channel, *tmp;
-   unsigned long flags;
-   LIST_HEAD(list);
int ret;
 
if (channel->primary_channel != NULL)
return -EINVAL;
 
-   /* Snapshot the list of subchannels */
-   spin_lock_irqsave(>lock, flags);
-   list_splice_init(>sc_list, );
-   channel->num_sc = 0;
-   spin_unlock_irqrestore(>lock, flags);
-
-   list_for_each_entry_safe(cur_channel, tmp, , sc_list) {
+   list_for_each_entry_safe(cur_channel, tmp, >sc_list, sc_list) {
if (cur_channel->rescind)
wait_for_completion(_channel->rescind_event);
 
-- 
2.19.2

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH] hyperv: replace mutex_is_locked with lockdep

2018-12-07 Thread Stephen Hemminger
lockdep_assert_held is better at checking for locking requirements
since it doesn't get confused if someone else is holding the mutex.

Inspired by changes in network drivers by Lance Roy.

Signed-off-by: Stephen Hemminger 
---
 drivers/hv/channel_mgmt.c | 2 +-
 drivers/hv/connection.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 6277597d3d58..abdaf8ac0002 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -385,7 +385,7 @@ void hv_process_channel_removal(struct vmbus_channel 
*channel)
struct vmbus_channel *primary_channel;
unsigned long flags;
 
-   BUG_ON(!mutex_is_locked(_connection.channel_mutex));
+   lockdep_assert_held(_connection.channel_mutex);
BUG_ON(!channel->rescind);
 
if (channel->target_cpu != get_cpu()) {
diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index f4d08c8ac7f8..0adaec0db85a 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -307,7 +307,7 @@ struct vmbus_channel *relid2channel(u32 relid)
struct list_head *cur, *tmp;
struct vmbus_channel *cur_sc;
 
-   BUG_ON(!mutex_is_locked(_connection.channel_mutex));
+   lockdep_assert_held(_connection.channel_mutex);
 
list_for_each_entry(channel, _connection.chn_list, listentry) {
if (channel->offermsg.child_relid == relid) {
-- 
2.19.2

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH V2 3/4] vmbus: add per-channel sysfs info

2018-10-18 Thread Stephen Hemminger
On Thu, 18 Oct 2018 15:32:35 +
Michael Kelley  wrote:

> From Olaf Hering  Sent: Thursday, October 18, 2018 8:20 AM
> >  
> > > This extends existing vmbus related sysfs structure to provide per-channel
> > > state information. This is useful when diagnosing issues with multiple
> > > queues in networking and storage.  
> >   
> > > +++ b/drivers/hv/vmbus_drv.c
> > > +static ssize_t write_avail_show(const struct vmbus_channel *channel, 
> > > char *buf)
> > > +{
> > > + const struct hv_ring_buffer_info *rbi = >outbound;
> > > +
> > > + return sprintf(buf, "%u\n", hv_get_bytes_to_write(rbi));
> > > +}
> > > +VMBUS_CHAN_ATTR_RO(write_avail);  
> > 
> > This is upstream since a year.
> > 
> > But I wonder how this can work if vmbus_device_register is called,
> > and then something reads the populated sysfs files before vmbus_open 
> > returns.
> > Nothing protects rbi->ring_buffer in this case, which remains NULL
> > until vmbus_open populates it.
> > 
> > A simple reproduce, with a modular kernel, is to boot with init=/bin/bash
> > head /sys/bus/vmbus/devices/*/channels/*/*
> >   
> 
> There are multiple race conditions with this and other VMbus sysfs 
> information.
> There's a race on the close path as well.  I've got an action on my list to 
> get it
> cleaned up.
> 
> Michael
> 

There is also a bunch of issues with code like:

static ssize_t id_show(struct device *dev, struct device_attribute *dev_attr,
   char *buf)
{
struct hv_device *hv_dev = device_to_hv_device(dev);

if (!hv_dev->channel)
return -ENODEV;
return sprintf(buf, "%d\n", hv_dev->channel->offermsg.child_relid);
}

Which should be using ACCESS_ONCE on hv_dev->channel or doing proper RCU.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH V2 3/4] vmbus: add per-channel sysfs info

2018-10-18 Thread Stephen Hemminger
On Thu, 18 Oct 2018 17:19:53 +0200
Olaf Hering  wrote:

> Am Sun, 17 Sep 2017 20:54:18 -0700
> schrieb k...@exchange.microsoft.com:
> 
> > This extends existing vmbus related sysfs structure to provide per-channel
> > state information. This is useful when diagnosing issues with multiple
> > queues in networking and storage.  
> 
> > +++ b/drivers/hv/vmbus_drv.c
> > +static ssize_t write_avail_show(const struct vmbus_channel *channel, char 
> > *buf)
> > +{
> > +   const struct hv_ring_buffer_info *rbi = >outbound;
> > +
> > +   return sprintf(buf, "%u\n", hv_get_bytes_to_write(rbi));
> > +}
> > +VMBUS_CHAN_ATTR_RO(write_avail);  
> 
> This is upstream since a year.
> 
> But I wonder how this can work if vmbus_device_register is called,
> and then something reads the populated sysfs files before vmbus_open returns.
> Nothing protects rbi->ring_buffer in this case, which remains NULL
> until vmbus_open populates it.
> 
> A simple reproduce, with a modular kernel, is to boot with init=/bin/bash
> head /sys/bus/vmbus/devices/*/channels/*/*
> 
> Olaf


Good catch, actually the problem goes across all of the ring buffer sysfs files
so it existed long before that.

The channel ring buffer could be missing.

I am less worried about the open from init case, and more worried about issues
when channels are closed (as happens when changing number of channels on a net 
device).

As Al has pointed out for years, sysfs is riddled with dangling reference 
issues.

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH net-next, v3] hv_netvsc: fix vf serial matching with pci slot info

2018-10-15 Thread Stephen Hemminger
On Mon, 15 Oct 2018 19:06:15 +
Haiyang Zhang  wrote:

> From: Haiyang Zhang 
> 
> The VF device's serial number is saved as a string in PCI slot's
> kobj name, not the slot->number. This patch corrects the netvsc
> driver, so the VF device can be successfully paired with synthetic
> NIC.
> 
> Fixes: 00d7ddba1143 ("hv_netvsc: pair VF based on serial number")
> Reported-by: Vitaly Kuznetsov 
> Signed-off-by: Haiyang Zhang 

Reviewed-by: Stephen Hemminger 

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH net-next, v2] hv_netvsc: fix vf serial matching with pci slot info

2018-10-12 Thread Stephen Hemminger
On Fri, 12 Oct 2018 20:55:15 +
Haiyang Zhang  wrote:

Thanks for fixing this.

  
> + if (kstrtou32(kobject_name(>slot->kobj), 10, )) {
> + netdev_notice(vf_netdev, "Invalid vf serial:%s\n",
> +   pdev->slot->kobj.name);
> + return NULL;
> + }

Shouldn't this use kobject_name() in the message as well.

Looking at the pci.h code there is already an API to get name from
slot (it uses kobject_name()). So please use that one.

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH] hv_netvsc: Make sure out channel is fully opened on send

2018-09-27 Thread Stephen Hemminger
On Thu, 27 Sep 2018 10:57:05 +0200
Mohammed Gamal  wrote:

> On Wed, 2018-09-26 at 17:13 +, Haiyang Zhang wrote:
> > > -Original Message-
> > > From: Mohammed Gamal 
> > > Sent: Wednesday, September 26, 2018 12:34 PM
> > > To: Stephen Hemminger ; netdev@vger.kernel.
> > > org
> > > Cc: KY Srinivasan ; Haiyang Zhang
> > > ; vkuznets ;
> > > ot...@redhat.com; cavery ; linux-
> > > ker...@vger.kernel.org; de...@linuxdriverproject.org; Mohammed
> > > Gamal
> > > 
> > > Subject: [PATCH] hv_netvsc: Make sure out channel is fully opened
> > > on send
> > > 
> > > Dring high network traffic changes to network interface parameters
> > > such as
> > > number of channels or MTU can cause a kernel panic with a NULL
> > > pointer
> > > dereference. This is due to netvsc_device_remove() being called and
> > > deallocating the channel ring buffers, which can then be accessed
> > > by
> > > netvsc_send_pkt() before they're allocated on calling
> > > netvsc_device_add()
> > > 
> > > The patch fixes this problem by checking the channel state and
> > > returning
> > > ENODEV if not yet opened. We also move the call to
> > > hv_ringbuf_avail_percent()
> > > which may access the uninitialized ring buffer.
> > > 
> > > Signed-off-by: Mohammed Gamal 
> > > ---
> > >  drivers/net/hyperv/netvsc.c | 7 ++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/net/hyperv/netvsc.c
> > > b/drivers/net/hyperv/netvsc.c index
> > > fe01e14..75f1b31 100644
> > > --- a/drivers/net/hyperv/netvsc.c
> > > +++ b/drivers/net/hyperv/netvsc.c
> > > @@ -825,7 +825,12 @@ static inline int netvsc_send_pkt(
> > >   struct netdev_queue *txq = netdev_get_tx_queue(ndev,
> > > packet->q_idx);
> > >   u64 req_id;
> > >   int ret;
> > > - u32 ring_avail =
> > > hv_get_avail_to_write_percent(_channel-  
> > > > outbound);  
> > > 
> > > + u32 ring_avail;
> > > +
> > > + if (out_channel->state != CHANNEL_OPENED_STATE)
> > > + return -ENODEV;
> > > +
> > > + ring_avail = hv_get_avail_to_write_percent(_channel-  
> > > >outbound);  
> > 
> > When you reproducing the NULL ptr panic, does your kernel include the
> > following patch?
> > hv_netvsc: common detach logic
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/c
> > ommit/?id=7b2ee50c0cd513a176a26a71f2989facdd75bfea
> >   
> Yes it is included. And the commit did reduce the occurrence of this
> race condition, but it still nevertheless occurs albeit rarely.
> 
> > We call netif_tx_disable(ndev) and netif_device_detach(ndev) before
> > doing the changes 
> > on MTU or #channels. So there should be no call to start_xmit() when
> > channel is not ready.
> > 
> > If you see the check for CHANNEL_OPENED_STATE is still necessary on
> > upstream kernel (including 
> > the patch " common detach logic "), we should debug further on the
> > code and find out the 
> > root cause.
> > 
> > Thanks,
> > - Haiyang
> >   
> ___
> devel mailing list
> de...@linuxdriverproject.org
> http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

Is there some workload, that can be used to reproduce this?
The stress test from Vitaly with changing parameters while running network 
traffic
passes now.

Can you reproduce this with the upstream current kernel?

Adding the check in start xmit is still racy, and won't cure the problem.

Another solution would be to add a grace period in the netvsc detach logic.

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH net-next] hv_netvsc: fix return type of ndo_start_xmit function

2018-09-26 Thread Stephen Hemminger
On Wed, 26 Sep 2018 18:25:10 +0800
YueHaibing  wrote:

> The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
> which is a typedef for an enum type, so make sure the implementation in
> this driver has returns 'netdev_tx_t' value, and change the function
> return type to netdev_tx_t.
> 
> As suggestion from Haiyang Zhang , if netvsc_vf_xmit
> fails, We are not sure if the error can go away after retrying, returning
> NETDEV_TX_BUSY may cause infinite retry from the upper layer.
> so just return NETDEV_TX_OK at there.
> 
> Found by coccinelle.
> 
> Signed-off-by: YueHaibing 
> ---
>  drivers/net/hyperv/netvsc_drv.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index ec69974..a1d44b4 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -511,7 +511,8 @@ static int netvsc_vf_xmit(struct net_device *net, struct 
> net_device *vf_netdev,
>   return rc;
>  }
>  
> -static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
> +static netdev_tx_t
> +netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
>  {
>   struct net_device_context *net_device_ctx = netdev_priv(net);
>   struct hv_netvsc_packet *packet = NULL;
> @@ -528,8 +529,10 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct 
> net_device *net)
>*/
>   vf_netdev = rcu_dereference_bh(net_device_ctx->vf_netdev);
>   if (vf_netdev && netif_running(vf_netdev) &&
> - !netpoll_tx_running(net))
> - return netvsc_vf_xmit(net, vf_netdev, skb);
> + !netpoll_tx_running(net)) {
> + netvsc_vf_xmit(net, vf_netdev, skb);
> + return NETDEV_TX_OK;
> + }
>  
>   /* We will atmost need two pages to describe the rndis
>* header. We can only transmit MAX_PAGE_BUFFER_COUNT number

Your patch loses the possible return values of netvsc_vf_xmit.
A suggested better fix would be to make netvsc_vf_xmit return netdev_tx_t type.
And this means the return value of dev_queue_xmit needs to be netdev_tx_t.

Please don't merge this as is.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH net-next, v2, 1/3] hv_netvsc: Add support for LRO/RSC in the vSwitch

2018-09-21 Thread Stephen Hemminger
On Fri, 21 Sep 2018 18:51:54 +
Haiyang Zhang  wrote:

> > -Original Message-
> > From: Stephen Hemminger 
> > Sent: Friday, September 21, 2018 2:37 PM
> > To: Haiyang Zhang 
> > Cc: Haiyang Zhang ; da...@davemloft.net;
> > net...@vger.kernel.org; o...@aepfle.de; linux-ker...@vger.kernel.org;
> > de...@linuxdriverproject.org; vkuznets 
> > Subject: Re: [PATCH net-next, v2, 1/3] hv_netvsc: Add support for LRO/RSC in
> > the vSwitch
> > 
> > On Fri, 21 Sep 2018 18:20:35 +
> > Haiyang Zhang  wrote:
> > 
> > Overall, this looks good. Some minor suggestions.
> >   
> > > +struct nvsc_rsc {
> > > + const struct ndis_pkt_8021q_info *vlan;
> > > + const struct ndis_tcp_ip_checksum_info *csum_info;
> > > + u8 is_last; /* last RNDIS msg in a vmtransfer_page */
> > > + u32 cnt; /* #fragments in an RSC packet */
> > > + u32 pktlen; /* Full packet length */
> > > + void *data[NVSP_RSC_MAX];
> > > + u32 len[NVSP_RSC_MAX];
> > > +};
> > > +  
> > 
> > This new state structure is state on a per-channel basis.
> > Do you really need this to be persistent across packets?
> > 
> > Could this be on stack or do you need it to handle split packets arriving in
> > different polls? Or is the stack space a problem?
> > 
> > Also, maybe data and length could be in one structure since they are 
> > related.  
> 
> The stack space is a problem. NVSP_RSC_MAX is 562, which is defined by host.
> It will be too large for limited stack space. 
> 
> struct nvsc_rsc includes the data, len, cnt, chksum, vlan for one RSC packet. 
> They
> are all related to construction of one SKB and its meta data. So I put them in
> one structure.
> 
> Thanks,
> - Haiyang
> 

That makes sense. How big is sizeof(struct net_device) + netdev_priv now?
Need to make sure it doesn't become an order 2 (ie keep it less than 4K).
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH net-next, v2, 1/3] hv_netvsc: Add support for LRO/RSC in the vSwitch

2018-09-21 Thread Stephen Hemminger
On Fri, 21 Sep 2018 18:20:35 +
Haiyang Zhang  wrote:

Overall, this looks good. Some minor suggestions.

> +struct nvsc_rsc {
> + const struct ndis_pkt_8021q_info *vlan;
> + const struct ndis_tcp_ip_checksum_info *csum_info;
> + u8 is_last; /* last RNDIS msg in a vmtransfer_page */
> + u32 cnt; /* #fragments in an RSC packet */
> + u32 pktlen; /* Full packet length */
> + void *data[NVSP_RSC_MAX];
> + u32 len[NVSP_RSC_MAX];
> +};
> +

This new state structure is state on a per-channel basis.
Do you really need this to be persistent across packets?

Could this be on stack or do you need it to handle split packets
arriving in different polls? Or is the stack space a problem?

Also, maybe data and length could be in one structure since they
are related.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH net-next, 1/3] hv_netvsc: Add support for LRO/RSC in the vSwitch

2018-09-20 Thread Stephen Hemminger
On Thu, 20 Sep 2018 20:56:46 +
Haiyang Zhang  wrote:

> > -Original Message-
> > From: Stephen Hemminger 
> > Sent: Thursday, September 20, 2018 4:48 PM
> > To: Haiyang Zhang 
> > Cc: Haiyang Zhang ; da...@davemloft.net;
> > net...@vger.kernel.org; o...@aepfle.de; linux-ker...@vger.kernel.org;
> > de...@linuxdriverproject.org; vkuznets 
> > Subject: Re: [PATCH net-next, 1/3] hv_netvsc: Add support for LRO/RSC in the
> > vSwitch
> > 
> > On Thu, 20 Sep 2018 17:06:59 +
> > Haiyang Zhang  wrote:
> >   
> > > +static inline void rsc_add_data
> > > + (struct netvsc_channel *nvchan,
> > > +  const struct ndis_pkt_8021q_info *vlan,
> > > +  const struct ndis_tcp_ip_checksum_info *csum_info,
> > > +  void *data, u32 len)
> > > +{  
> > 
> > Could this be changed to look more like a function and skip the inline.
> > The compiler will end up inlining it anyway.
> > 
> > static void rsc_add_data(struct netvsc_channel *nvchan,  
> 
> How about this?
> static inline
> void rsc_add_data(struct netvsc_channel *nvchan,
> 

Sure that matches other code in that file
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH net-next, 1/3] hv_netvsc: Add support for LRO/RSC in the vSwitch

2018-09-20 Thread Stephen Hemminger
On Thu, 20 Sep 2018 17:06:59 +
Haiyang Zhang  wrote:

> +static inline void rsc_add_data
> + (struct netvsc_channel *nvchan,
> +  const struct ndis_pkt_8021q_info *vlan,
> +  const struct ndis_tcp_ip_checksum_info *csum_info,
> +  void *data, u32 len)
> +{

Could this be changed to look more like a function and skip the inline.
The compiler will end up inlining it anyway.

static void rsc_add_data(struct netvsc_channel *nvchan,

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v2 0/2] hv_netvsc: associate VF and PV device by serial number

2018-09-20 Thread Stephen Hemminger
On Thu, 20 Sep 2018 15:18:20 +0100
Lorenzo Pieralisi  wrote:

> On Fri, Sep 14, 2018 at 12:54:55PM -0700, Stephen Hemminger wrote:
> > The Hyper-V implementation of PCI controller has concept of 32 bit serial 
> > number
> > (not to be confused with PCI-E serial number).  This value is sent in the 
> > protocol
> > from the host to indicate SR-IOV VF device is attached to a synthetic NIC.
> > 
> > Using the serial number (instead of MAC address) to associate the two 
> > devices
> > avoids lots of potential problems when there are duplicate MAC addresses 
> > from
> > tunnels or layered devices.
> > 
> > The patch set is broken into two parts, one is for the PCI controller
> > and the other is for the netvsc device. Normally, these go through different
> > trees but sending them together here for better review. The PCI changes
> > were submitted previously, but the main review comment was "why do you
> > need this?". This is why.  
> 
> The question was more whether we should convert this serial number into
> a PCI slot number (that has user space visibility and that is what you are
> after) to improve the current matching, I do not question why you need
> it, just for the records.

The name slot is way overloaded in this context.
There is 
windows slot number which comes from Hyperv
pci address slot which pci-hyperv sets from windows slot
pci slot api value which for normal devices comes from ACPI
this patch gets it from serial number


The netvsc driver needed to be able to find a PCI device based on the serial
number. The serial number was not visible in any current PCI-hyperv controller
values.  The windows slot (wslot) is not the same the serial number.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH net-next 17/22] hv_netvsc: fix return type of ndo_start_xmit function

2018-09-20 Thread Stephen Hemminger
On Thu, 20 Sep 2018 20:33:01 +0800
YueHaibing  wrote:

> The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
> which is a typedef for an enum type, so make sure the implementation in
> this driver has returns 'netdev_tx_t' value, and change the function
> return type to netdev_tx_t.
> 
> Found by coccinelle.
> 
> Signed-off-by: YueHaibing 
> ---
>  drivers/net/hyperv/netvsc_drv.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index 3af6d8d..056c472 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -511,7 +511,8 @@ static int netvsc_vf_xmit(struct net_device *net, struct 
> net_device *vf_netdev,
>   return rc;
>  }
>  
> -static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
> +static netdev_tx_t
> +netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
>  {
>   struct net_device_context *net_device_ctx = netdev_priv(net);
>   struct hv_netvsc_packet *packet = NULL;
> @@ -528,8 +529,11 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct 
> net_device *net)
>*/
>   vf_netdev = rcu_dereference_bh(net_device_ctx->vf_netdev);
>   if (vf_netdev && netif_running(vf_netdev) &&
> - !netpoll_tx_running(net))
> - return netvsc_vf_xmit(net, vf_netdev, skb);
> + !netpoll_tx_running(net)) {
> + ret = netvsc_vf_xmit(net, vf_netdev, skb);
> + if (ret)
> + return NETDEV_TX_BUSY;
> + }

Sorry, the new code is wrong. It will fall through if ret == 0 (NETDEV_TX_OK)
Please review and test your patches.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v2 1/2] PCI: hv: support reporting serial number as slot information

2018-09-14 Thread Stephen Hemminger
The Hyper-V host API for PCI provides a unique "serial number" which
can be used as basis for sysfs PCI slot table. This can be useful
for cases where userspace wants to find the PCI device based on
serial number.

When an SR-IOV NIC is added, the host sends an attach message
with serial number. The kernel doesn't use the serial number, but
it is useful when doing the same thing in a userspace driver such
as the DPDK. By having /sys/bus/pci/slots/N it provides a direct
way to find the matching PCI device.

There maybe some cases where serial number is not unique such
as when using GPU's. But the PCI slot infrastructure will handle
that.

This has a side effect which may also be useful. The common udev
network device naming policy uses the slot information (rather
than PCI address).

Signed-off-by: Stephen Hemminger 
---
 drivers/pci/controller/pci-hyperv.c | 37 +
 1 file changed, 37 insertions(+)

diff --git a/drivers/pci/controller/pci-hyperv.c 
b/drivers/pci/controller/pci-hyperv.c
index c00f82cc54aa..ee80e79db21a 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -89,6 +89,9 @@ static enum pci_protocol_version_t pci_protocol_version;
 
 #define STATUS_REVISION_MISMATCH 0xC059
 
+/* space for 32bit serial number as string */
+#define SLOT_NAME_SIZE 11
+
 /*
  * Message Types
  */
@@ -494,6 +497,7 @@ struct hv_pci_dev {
struct list_head list_entry;
refcount_t refs;
enum hv_pcichild_state state;
+   struct pci_slot *pci_slot;
struct pci_function_description desc;
bool reported_missing;
struct hv_pcibus_device *hbus;
@@ -1457,6 +1461,34 @@ static void prepopulate_bars(struct hv_pcibus_device 
*hbus)
spin_unlock_irqrestore(>device_list_lock, flags);
 }
 
+/*
+ * Assign entries in sysfs pci slot directory.
+ *
+ * Note that this function does not need to lock the children list
+ * because it is called from pci_devices_present_work which
+ * is serialized with hv_eject_device_work because they are on the
+ * same ordered workqueue. Therefore hbus->children list will not change
+ * even when pci_create_slot sleeps.
+ */
+static void hv_pci_assign_slots(struct hv_pcibus_device *hbus)
+{
+   struct hv_pci_dev *hpdev;
+   char name[SLOT_NAME_SIZE];
+   int slot_nr;
+
+   list_for_each_entry(hpdev, >children, list_entry) {
+   if (hpdev->pci_slot)
+   continue;
+
+   slot_nr = PCI_SLOT(wslot_to_devfn(hpdev->desc.win_slot.slot));
+   snprintf(name, SLOT_NAME_SIZE, "%u", hpdev->desc.ser);
+   hpdev->pci_slot = pci_create_slot(hbus->pci_bus, slot_nr,
+ name, NULL);
+   if (!hpdev->pci_slot)
+   pr_warn("pci_create slot %s failed\n", name);
+   }
+}
+
 /**
  * create_root_hv_pci_bus() - Expose a new root PCI bus
  * @hbus:  Root PCI bus, as understood by this driver
@@ -1480,6 +1512,7 @@ static int create_root_hv_pci_bus(struct hv_pcibus_device 
*hbus)
pci_lock_rescan_remove();
pci_scan_child_bus(hbus->pci_bus);
pci_bus_assign_resources(hbus->pci_bus);
+   hv_pci_assign_slots(hbus);
pci_bus_add_devices(hbus->pci_bus);
pci_unlock_rescan_remove();
hbus->state = hv_pcibus_installed;
@@ -1742,6 +1775,7 @@ static void pci_devices_present_work(struct work_struct 
*work)
 */
pci_lock_rescan_remove();
pci_scan_child_bus(hbus->pci_bus);
+   hv_pci_assign_slots(hbus);
pci_unlock_rescan_remove();
break;
 
@@ -1858,6 +1892,9 @@ static void hv_eject_device_work(struct work_struct *work)
list_del(>list_entry);
spin_unlock_irqrestore(>hbus->device_list_lock, flags);
 
+   if (hpdev->pci_slot)
+   pci_destroy_slot(hpdev->pci_slot);
+
memset(, 0, sizeof(ctxt));
ejct_pkt = (struct pci_eject_response *)
ejct_pkt->message_type.type = PCI_EJECTION_COMPLETE;
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v2 2/2] hv_netvsc: pair VF based on serial number

2018-09-14 Thread Stephen Hemminger
Matching network device based on MAC address is problematic
since a non VF network device can be creted with a duplicate MAC
address causing confusion and problems.  The VMBus API does provide
a serial number that is a better matching method.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/hyperv/netvsc.c |  3 ++
 drivers/net/hyperv/netvsc_drv.c | 58 +++--
 2 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 31c3d77b4733..fe01e141c8f8 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -1203,6 +1203,9 @@ static void netvsc_send_vf(struct net_device *ndev,
 
net_device_ctx->vf_alloc = nvmsg->msg.v4_msg.vf_assoc.allocated;
net_device_ctx->vf_serial = nvmsg->msg.v4_msg.vf_assoc.serial;
+   netdev_info(ndev, "VF slot %u %s\n",
+   net_device_ctx->vf_serial,
+   net_device_ctx->vf_alloc ? "added" : "removed");
 }
 
 static  void netvsc_receive_inband(struct net_device *ndev,
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 1121a1ec407c..9dedc1463e88 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1894,20 +1894,6 @@ static void netvsc_link_change(struct work_struct *w)
rtnl_unlock();
 }
 
-static struct net_device *get_netvsc_bymac(const u8 *mac)
-{
-   struct net_device_context *ndev_ctx;
-
-   list_for_each_entry(ndev_ctx, _dev_list, list) {
-   struct net_device *dev = hv_get_drvdata(ndev_ctx->device_ctx);
-
-   if (ether_addr_equal(mac, dev->perm_addr))
-   return dev;
-   }
-
-   return NULL;
-}
-
 static struct net_device *get_netvsc_byref(struct net_device *vf_netdev)
 {
struct net_device_context *net_device_ctx;
@@ -2036,26 +2022,48 @@ static void netvsc_vf_setup(struct work_struct *w)
rtnl_unlock();
 }
 
+/* Find netvsc by VMBus serial number.
+ * The PCI hyperv controller records the serial number as the slot.
+ */
+static struct net_device *get_netvsc_byslot(const struct net_device *vf_netdev)
+{
+   struct device *parent = vf_netdev->dev.parent;
+   struct net_device_context *ndev_ctx;
+   struct pci_dev *pdev;
+
+   if (!parent || !dev_is_pci(parent))
+   return NULL; /* not a PCI device */
+
+   pdev = to_pci_dev(parent);
+   if (!pdev->slot) {
+   netdev_notice(vf_netdev, "no PCI slot information\n");
+   return NULL;
+   }
+
+   list_for_each_entry(ndev_ctx, _dev_list, list) {
+   if (!ndev_ctx->vf_alloc)
+   continue;
+
+   if (ndev_ctx->vf_serial == pdev->slot->number)
+   return hv_get_drvdata(ndev_ctx->device_ctx);
+   }
+
+   netdev_notice(vf_netdev,
+ "no netdev found for slot %u\n", pdev->slot->number);
+   return NULL;
+}
+
 static int netvsc_register_vf(struct net_device *vf_netdev)
 {
-   struct net_device *ndev;
struct net_device_context *net_device_ctx;
-   struct device *pdev = vf_netdev->dev.parent;
struct netvsc_device *netvsc_dev;
+   struct net_device *ndev;
int ret;
 
if (vf_netdev->addr_len != ETH_ALEN)
return NOTIFY_DONE;
 
-   if (!pdev || !dev_is_pci(pdev) || dev_is_pf(pdev))
-   return NOTIFY_DONE;
-
-   /*
-* We will use the MAC address to locate the synthetic interface to
-* associate with the VF interface. If we don't find a matching
-* synthetic interface, move on.
-*/
-   ndev = get_netvsc_bymac(vf_netdev->perm_addr);
+   ndev = get_netvsc_byslot(vf_netdev);
if (!ndev)
return NOTIFY_DONE;
 
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v2 0/2] hv_netvsc: associate VF and PV device by serial number

2018-09-14 Thread Stephen Hemminger
The Hyper-V implementation of PCI controller has concept of 32 bit serial number
(not to be confused with PCI-E serial number).  This value is sent in the 
protocol
from the host to indicate SR-IOV VF device is attached to a synthetic NIC.

Using the serial number (instead of MAC address) to associate the two devices
avoids lots of potential problems when there are duplicate MAC addresses from
tunnels or layered devices.

The patch set is broken into two parts, one is for the PCI controller
and the other is for the netvsc device. Normally, these go through different
trees but sending them together here for better review. The PCI changes
were submitted previously, but the main review comment was "why do you
need this?". This is why.

v2 - slot name can be shorter.
 remove locking when creating pci_slots; see comment for explaination

Stephen Hemminger (2):
  PCI: hv: support reporting serial number as slot information
  hv_netvsc: pair VF based on serial number

 drivers/net/hyperv/netvsc.c |  3 ++
 drivers/net/hyperv/netvsc_drv.c | 58 -
 drivers/pci/controller/pci-hyperv.c | 37 ++
 3 files changed, 73 insertions(+), 25 deletions(-)

-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v3 2/6] vmbus: keep pointer to ring buffer page

2018-09-14 Thread Stephen Hemminger
Avoid going from struct page to virt address (and back) by just
keeping pointer to the allocated pages instead of virt address.

Signed-off-by: Stephen Hemminger 
---
 drivers/hv/channel.c | 20 +---
 drivers/uio/uio_hv_generic.c |  5 +++--
 include/linux/hyperv.h   |  2 +-
 3 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 33e6db02dbab..56ec0d96d876 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -91,11 +91,14 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 
send_ringbuffer_size,
unsigned long flags;
int ret, err = 0;
struct page *page;
+   unsigned int order;
 
if (send_ringbuffer_size % PAGE_SIZE ||
recv_ringbuffer_size % PAGE_SIZE)
return -EINVAL;
 
+   order = get_order(send_ringbuffer_size + recv_ringbuffer_size);
+
spin_lock_irqsave(>lock, flags);
if (newchannel->state == CHANNEL_OPEN_STATE) {
newchannel->state = CHANNEL_OPENING_STATE;
@@ -110,21 +113,17 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 
send_ringbuffer_size,
 
/* Allocate the ring buffer */
page = alloc_pages_node(cpu_to_node(newchannel->target_cpu),
-   GFP_KERNEL|__GFP_ZERO,
-   get_order(send_ringbuffer_size +
-   recv_ringbuffer_size));
+   GFP_KERNEL|__GFP_ZERO, order);
 
if (!page)
-   page = alloc_pages(GFP_KERNEL|__GFP_ZERO,
-  get_order(send_ringbuffer_size +
-recv_ringbuffer_size));
+   page = alloc_pages(GFP_KERNEL|__GFP_ZERO, order);
 
if (!page) {
err = -ENOMEM;
goto error_set_chnstate;
}
 
-   newchannel->ringbuffer_pages = page_address(page);
+   newchannel->ringbuffer_page = page;
newchannel->ringbuffer_pagecount = (send_ringbuffer_size +
   recv_ringbuffer_size) >> PAGE_SHIFT;
 
@@ -239,8 +238,7 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 
send_ringbuffer_size,
 error_free_pages:
hv_ringbuffer_cleanup(>outbound);
hv_ringbuffer_cleanup(>inbound);
-   __free_pages(page,
-get_order(send_ringbuffer_size + recv_ringbuffer_size));
+   __free_pages(page, order);
 error_set_chnstate:
newchannel->state = CHANNEL_OPEN_STATE;
return err;
@@ -658,8 +656,8 @@ static int vmbus_close_internal(struct vmbus_channel 
*channel)
hv_ringbuffer_cleanup(>outbound);
hv_ringbuffer_cleanup(>inbound);
 
-   free_pages((unsigned long)channel->ringbuffer_pages,
-   get_order(channel->ringbuffer_pagecount * PAGE_SIZE));
+   __free_pages(channel->ringbuffer_page,
+get_order(channel->ringbuffer_pagecount << PAGE_SHIFT));
 
 out:
return ret;
diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index a08860260f55..ba67a5267557 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -130,11 +130,12 @@ static int hv_uio_ring_mmap(struct file *filp, struct 
kobject *kobj,
= container_of(kobj, struct vmbus_channel, kobj);
struct hv_device *dev = channel->primary_channel->device_obj;
u16 q_idx = channel->offermsg.offer.sub_channel_index;
+   void *ring_buffer = page_address(channel->ringbuffer_page);
 
dev_dbg(>device, "mmap channel %u pages %#lx at %#lx\n",
q_idx, vma_pages(vma), vma->vm_pgoff);
 
-   return vm_iomap_memory(vma, virt_to_phys(channel->ringbuffer_pages),
+   return vm_iomap_memory(vma, virt_to_phys(ring_buffer),
   channel->ringbuffer_pagecount << PAGE_SHIFT);
 }
 
@@ -223,7 +224,7 @@ hv_uio_probe(struct hv_device *dev,
/* mem resources */
pdata->info.mem[TXRX_RING_MAP].name = "txrx_rings";
pdata->info.mem[TXRX_RING_MAP].addr
-   = (uintptr_t)dev->channel->ringbuffer_pages;
+   = (uintptr_t)page_address(dev->channel->ringbuffer_page);
pdata->info.mem[TXRX_RING_MAP].size
= dev->channel->ringbuffer_pagecount << PAGE_SHIFT;
pdata->info.mem[TXRX_RING_MAP].memtype = UIO_MEM_LOGICAL;
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 6c4575c7f46b..a6c32d2d090b 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -739,7 +739,7 @@ struct vmbus_channel {
u32 ringbuffer_gpadlhandle;
 
/* Allocated memory for ring buffer */
-   void *ringbuffer_pages;
+   struct page *ringbuffer_page;
u32 ringbuffer_pagecount;
   

[PATCH v3 1/6] vmbus: pass channel to hv_process_channel_removal

2018-09-14 Thread Stephen Hemminger
Rather than passing relid and then looking up the channel.
Pass the channel directly, since caller already knows it.

Signed-off-by: Stephen Hemminger 
---
 drivers/hv/channel.c  |  3 +--
 drivers/hv/channel_mgmt.c | 17 +
 drivers/hv/vmbus_drv.c|  3 +--
 include/linux/hyperv.h|  2 +-
 4 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 741857d80da1..33e6db02dbab 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -690,8 +690,7 @@ void vmbus_close(struct vmbus_channel *channel)
wait_for_completion(_channel->rescind_event);
mutex_lock(_connection.channel_mutex);
vmbus_close_internal(cur_channel);
-   hv_process_channel_removal(
-  cur_channel->offermsg.child_relid);
+   hv_process_channel_removal(cur_channel);
} else {
mutex_lock(_connection.channel_mutex);
vmbus_close_internal(cur_channel);
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 0f0e091c117c..b7c48ebdf6a1 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -385,21 +385,14 @@ static void vmbus_release_relid(u32 relid)
trace_vmbus_release_relid(, ret);
 }
 
-void hv_process_channel_removal(u32 relid)
+void hv_process_channel_removal(struct vmbus_channel *channel)
 {
+   struct vmbus_channel *primary_channel;
unsigned long flags;
-   struct vmbus_channel *primary_channel, *channel;
 
BUG_ON(!mutex_is_locked(_connection.channel_mutex));
-
-   /*
-* Make sure channel is valid as we may have raced.
-*/
-   channel = relid2channel(relid);
-   if (!channel)
-   return;
-
BUG_ON(!channel->rescind);
+
if (channel->target_cpu != get_cpu()) {
put_cpu();
smp_call_function_single(channel->target_cpu,
@@ -429,7 +422,7 @@ void hv_process_channel_removal(u32 relid)
cpumask_clear_cpu(channel->target_cpu,
  _channel->alloced_cpus_in_node);
 
-   vmbus_release_relid(relid);
+   vmbus_release_relid(channel->offermsg.child_relid);
 
free_channel(channel);
 }
@@ -943,7 +936,7 @@ static void vmbus_onoffer_rescind(struct 
vmbus_channel_message_header *hdr)
 * The channel is currently not open;
 * it is safe for us to cleanup the channel.
 */
-   hv_process_channel_removal(rescind->child_relid);
+   hv_process_channel_removal(channel);
} else {
complete(>rescind_event);
}
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index e6d8fdac6d8b..007ee8e5986a 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -864,10 +864,9 @@ static void vmbus_device_release(struct device *device)
struct vmbus_channel *channel = hv_dev->channel;
 
mutex_lock(_connection.channel_mutex);
-   hv_process_channel_removal(channel->offermsg.child_relid);
+   hv_process_channel_removal(channel);
mutex_unlock(_connection.channel_mutex);
kfree(hv_dev);
-
 }
 
 /* The one and only one */
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 2c3798bcb01c..6c4575c7f46b 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1443,7 +1443,7 @@ extern bool vmbus_prep_negotiate_resp(struct icmsg_hdr 
*icmsghdrp, u8 *buf,
const int *srv_version, int srv_vercnt,
int *nego_fw_version, int *nego_srv_version);
 
-void hv_process_channel_removal(u32 relid);
+void hv_process_channel_removal(struct vmbus_channel *channel);
 
 void vmbus_setevent(struct vmbus_channel *channel);
 /*
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v3 0/6] fix hv_uio_generic open/close

2018-09-14 Thread Stephen Hemminger
This set of patches fixes the problem where DPDK applications
using hv_uio_generic driver can not be successfully restarted.

In order to get this working it required small change to uio
to allow for mapping without no-cache. And refactoring of how
ring buffer is setup in vmbus code.

It could be backported as a fix, to 4.19 but that is not
an LTS so probably not worth it.

v3 - fix typo (sent wrong version for v2)

v2 - fix refcount when hv_uio_open fails


Stephen Hemminger (6):
  vmbus: pass channel to hv_process_channel_removal
  vmbus: keep pointer to ring buffer page
  vmbus: split ring buffer allocation from open
  uio: introduce UIO_MEM_IOVA
  hv_uio_generic: map ringbuffer phys addr
  uio_hv_generic: defer opening vmbus until first use

 drivers/hv/channel.c | 276 ---
 drivers/hv/channel_mgmt.c|  17 +--
 drivers/hv/ring_buffer.c |   1 +
 drivers/hv/vmbus_drv.c   |   3 +-
 drivers/uio/uio.c|  24 +--
 drivers/uio/uio_hv_generic.c | 109 ++
 include/linux/hyperv.h   |  13 +-
 include/linux/uio_driver.h   |   1 +
 8 files changed, 264 insertions(+), 180 deletions(-)

-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v3 3/6] vmbus: split ring buffer allocation from open

2018-09-14 Thread Stephen Hemminger
The UIO driver needs the ring buffer to be persistent(reused)
across open/close. Split the allocation and setup of ring buffer
out of vmbus_open. For normal usage vmbus_open/vmbus_close there
are no changes; only impacts uio_hv_generic which needs to keep
ring buffer memory and reuse when application restarts.

Signed-off-by: Stephen Hemminger 
---
 drivers/hv/channel.c | 267 ++-
 drivers/hv/ring_buffer.c |   1 +
 include/linux/hyperv.h   |   9 ++
 3 files changed, 162 insertions(+), 115 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 56ec0d96d876..ddadb7efd1cc 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -79,84 +79,96 @@ void vmbus_setevent(struct vmbus_channel *channel)
 }
 EXPORT_SYMBOL_GPL(vmbus_setevent);
 
-/*
- * vmbus_open - Open the specified channel.
- */
-int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size,
-u32 recv_ringbuffer_size, void *userdata, u32 userdatalen,
-void (*onchannelcallback)(void *context), void *context)
+/* vmbus_free_ring - drop mapping of ring buffer */
+void vmbus_free_ring(struct vmbus_channel *channel)
 {
-   struct vmbus_channel_open_channel *open_msg;
-   struct vmbus_channel_msginfo *open_info = NULL;
-   unsigned long flags;
-   int ret, err = 0;
-   struct page *page;
-   unsigned int order;
+   hv_ringbuffer_cleanup(>outbound);
+   hv_ringbuffer_cleanup(>inbound);
 
-   if (send_ringbuffer_size % PAGE_SIZE ||
-   recv_ringbuffer_size % PAGE_SIZE)
-   return -EINVAL;
+   if (channel->ringbuffer_page) {
+   __free_pages(channel->ringbuffer_page,
+get_order(channel->ringbuffer_pagecount
+  << PAGE_SHIFT));
+   channel->ringbuffer_page = NULL;
+   }
+}
+EXPORT_SYMBOL_GPL(vmbus_free_ring);
 
-   order = get_order(send_ringbuffer_size + recv_ringbuffer_size);
+/* vmbus_alloc_ring - allocate and map pages for ring buffer */
+int vmbus_alloc_ring(struct vmbus_channel *newchannel,
+u32 send_size, u32 recv_size)
+{
+   struct page *page;
+   int order;
 
-   spin_lock_irqsave(>lock, flags);
-   if (newchannel->state == CHANNEL_OPEN_STATE) {
-   newchannel->state = CHANNEL_OPENING_STATE;
-   } else {
-   spin_unlock_irqrestore(>lock, flags);
+   if (send_size % PAGE_SIZE || recv_size % PAGE_SIZE)
return -EINVAL;
-   }
-   spin_unlock_irqrestore(>lock, flags);
-
-   newchannel->onchannel_callback = onchannelcallback;
-   newchannel->channel_callback_context = context;
 
/* Allocate the ring buffer */
+   order = get_order(send_size + recv_size);
page = alloc_pages_node(cpu_to_node(newchannel->target_cpu),
GFP_KERNEL|__GFP_ZERO, order);
 
if (!page)
page = alloc_pages(GFP_KERNEL|__GFP_ZERO, order);
 
-   if (!page) {
-   err = -ENOMEM;
-   goto error_set_chnstate;
-   }
+   if (!page)
+   return -ENOMEM;
 
newchannel->ringbuffer_page = page;
-   newchannel->ringbuffer_pagecount = (send_ringbuffer_size +
-  recv_ringbuffer_size) >> PAGE_SHIFT;
+   newchannel->ringbuffer_pagecount = (send_size + recv_size) >> 
PAGE_SHIFT;
+   newchannel->ringbuffer_send_offset = send_size >> PAGE_SHIFT;
 
-   ret = hv_ringbuffer_init(>outbound, page,
-send_ringbuffer_size >> PAGE_SHIFT);
+   return 0;
+}
+EXPORT_SYMBOL_GPL(vmbus_alloc_ring);
 
-   if (ret != 0) {
-   err = ret;
-   goto error_free_pages;
-   }
+static int __vmbus_open(struct vmbus_channel *newchannel,
+  void *userdata, u32 userdatalen,
+  void (*onchannelcallback)(void *context), void *context)
+{
+   struct vmbus_channel_open_channel *open_msg;
+   struct vmbus_channel_msginfo *open_info = NULL;
+   struct page *page = newchannel->ringbuffer_page;
+   u32 send_pages, recv_pages;
+   unsigned long flags;
+   int err;
 
-   ret = hv_ringbuffer_init(>inbound,
-[send_ringbuffer_size >> PAGE_SHIFT],
-recv_ringbuffer_size >> PAGE_SHIFT);
-   if (ret != 0) {
-   err = ret;
-   goto error_free_pages;
+   if (userdatalen > MAX_USER_DEFINED_BYTES)
+   return -EINVAL;
+
+   send_pages = newchannel->ringbuffer_send_offset;
+   recv_pages = newchannel->ringbuffer_pagecount - send_pages;
+
+   spin_lock_irqsave(>lock, flags);
+   if (newchannel->state != CHANNEL_OPEN_STATE) {
+ 

[PATCH v3 5/6] hv_uio_generic: map ringbuffer phys addr

2018-09-14 Thread Stephen Hemminger
The ring buffer is contiguous IOVA and is mapped via phys addr
for sysfs file. Use same method for the UIO mapping.

Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio_hv_generic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index ba67a5267557..53f5610c6065 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -224,10 +224,10 @@ hv_uio_probe(struct hv_device *dev,
/* mem resources */
pdata->info.mem[TXRX_RING_MAP].name = "txrx_rings";
pdata->info.mem[TXRX_RING_MAP].addr
-   = (uintptr_t)page_address(dev->channel->ringbuffer_page);
+   = 
(uintptr_t)virt_to_phys(page_address(dev->channel->ringbuffer_page));
pdata->info.mem[TXRX_RING_MAP].size
= dev->channel->ringbuffer_pagecount << PAGE_SHIFT;
-   pdata->info.mem[TXRX_RING_MAP].memtype = UIO_MEM_LOGICAL;
+   pdata->info.mem[TXRX_RING_MAP].memtype = UIO_MEM_IOVA;
 
pdata->info.mem[INT_PAGE_MAP].name = "int_page";
pdata->info.mem[INT_PAGE_MAP].addr
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v3 6/6] uio_hv_generic: defer opening vmbus until first use

2018-09-14 Thread Stephen Hemminger
This fixes two design flaws in hv_uio_generic.

Since hv_uio_probe is called from vmbus_probe with lock held
it potentially can cause sleep in an atomic section because
vmbus_open will wait for response from host.

The hv_uio_generic driver could not handle applications
exiting and restarting because the vmbus channel was
persistent.  Change the semantics so that the buffers are
allocated on probe, but not attached to host until
device is opened.

Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio_hv_generic.c | 104 +--
 1 file changed, 74 insertions(+), 30 deletions(-)

diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index 53f5610c6065..c2493d011225 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -55,6 +55,7 @@ enum hv_uio_map {
 struct hv_uio_private_data {
struct uio_info info;
struct hv_device *device;
+   atomic_t refcnt;
 
void*recv_buf;
u32 recv_gpadl;
@@ -128,12 +129,10 @@ static int hv_uio_ring_mmap(struct file *filp, struct 
kobject *kobj,
 {
struct vmbus_channel *channel
= container_of(kobj, struct vmbus_channel, kobj);
-   struct hv_device *dev = channel->primary_channel->device_obj;
-   u16 q_idx = channel->offermsg.offer.sub_channel_index;
void *ring_buffer = page_address(channel->ringbuffer_page);
 
-   dev_dbg(>device, "mmap channel %u pages %#lx at %#lx\n",
-   q_idx, vma_pages(vma), vma->vm_pgoff);
+   if (channel->state != CHANNEL_OPENED_STATE)
+   return -ENODEV;
 
return vm_iomap_memory(vma, virt_to_phys(ring_buffer),
   channel->ringbuffer_pagecount << PAGE_SHIFT);
@@ -176,57 +175,103 @@ hv_uio_new_channel(struct vmbus_channel *new_sc)
}
 }
 
+/* free the reserved buffers for send and receive */
 static void
 hv_uio_cleanup(struct hv_device *dev, struct hv_uio_private_data *pdata)
 {
-   if (pdata->send_gpadl)
+   if (pdata->send_gpadl) {
vmbus_teardown_gpadl(dev->channel, pdata->send_gpadl);
-   vfree(pdata->send_buf);
+   pdata->send_gpadl = 0;
+   vfree(pdata->send_buf);
+   }
 
-   if (pdata->recv_gpadl)
+   if (pdata->recv_gpadl) {
vmbus_teardown_gpadl(dev->channel, pdata->recv_gpadl);
-   vfree(pdata->recv_buf);
+   pdata->recv_gpadl = 0;
+   vfree(pdata->recv_buf);
+   }
+}
+
+/* VMBus primary channel is opened on first use */
+static int
+hv_uio_open(struct uio_info *info, struct inode *inode)
+{
+   struct hv_uio_private_data *pdata
+   = container_of(info, struct hv_uio_private_data, info);
+   struct hv_device *dev = pdata->device;
+   int ret;
+
+   if (atomic_inc_return(>refcnt) != 1)
+   return 0;
+
+   ret = vmbus_connect_ring(dev->channel,
+hv_uio_channel_cb, dev->channel);
+
+   if (ret == 0)
+   dev->channel->inbound.ring_buffer->interrupt_mask = 1;
+   else
+   atomic_dec(>refcnt);
+
+   return ret;
+}
+
+/* VMBus primary channel is closed on last close */
+static int
+hv_uio_release(struct uio_info *info, struct inode *inode)
+{
+   struct hv_uio_private_data *pdata
+   = container_of(info, struct hv_uio_private_data, info);
+   struct hv_device *dev = pdata->device;
+   int ret = 0;
+
+   if (atomic_dec_and_test(>refcnt))
+   ret = vmbus_disconnect_ring(dev->channel);
+
+   return ret;
 }
 
 static int
 hv_uio_probe(struct hv_device *dev,
 const struct hv_vmbus_device_id *dev_id)
 {
+   struct vmbus_channel *channel = dev->channel;
struct hv_uio_private_data *pdata;
+   void *ring_buffer;
int ret;
 
+   /* Communicating with host has to be via shared memory not hypercall */
+   if (!channel->offermsg.monitor_allocated) {
+   dev_err(>device, "vmbus channel requires hypercall\n");
+   return -ENOTSUPP;
+   }
+
pdata = kzalloc(sizeof(*pdata), GFP_KERNEL);
if (!pdata)
return -ENOMEM;
 
-   ret = vmbus_open(dev->channel, HV_RING_SIZE * PAGE_SIZE,
-HV_RING_SIZE * PAGE_SIZE, NULL, 0,
-hv_uio_channel_cb, dev->channel);
+   ret = vmbus_alloc_ring(channel, HV_RING_SIZE * PAGE_SIZE,
+  HV_RING_SIZE * PAGE_SIZE);
if (ret)
goto fail;
 
-   /* Communicating with host has to be via shared memory not hypercall */
-   if (!dev->channel->offermsg.monitor_allocated) {
-   dev_err(>device, "vmbus channel requires hypercall\n");
-   ret = -ENOTSUPP;
-   

[PATCH v3 4/6] uio: introduce UIO_MEM_IOVA

2018-09-14 Thread Stephen Hemminger
Introduce the concept of mapping physical memory locations that
are normal memory. The new type UIO_MEM_IOVA are similar to
existing UIO_MEM_PHYS but the backing memory is not marked as uncached.

Also, indent related switch to the currently used style.

Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio.c  | 24 +---
 include/linux/uio_driver.h |  1 +
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index 0ffb324aa038..e601bd3fbae1 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -738,7 +738,8 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
return -EINVAL;
 
vma->vm_ops = _physical_vm_ops;
-   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+   if (idev->info->mem[mi].memtype == UIO_MEM_PHYS)
+   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
/*
 * We cannot use the vm_iomap_memory() helper here,
@@ -795,18 +796,19 @@ static int uio_mmap(struct file *filep, struct 
vm_area_struct *vma)
}
 
switch (idev->info->mem[mi].memtype) {
-   case UIO_MEM_PHYS:
-   ret = uio_mmap_physical(vma);
-   break;
-   case UIO_MEM_LOGICAL:
-   case UIO_MEM_VIRTUAL:
-   ret = uio_mmap_logical(vma);
-   break;
-   default:
-   ret = -EINVAL;
+   case UIO_MEM_IOVA:
+   case UIO_MEM_PHYS:
+   ret = uio_mmap_physical(vma);
+   break;
+   case UIO_MEM_LOGICAL:
+   case UIO_MEM_VIRTUAL:
+   ret = uio_mmap_logical(vma);
+   break;
+   default:
+   ret = -EINVAL;
}
 
-out:
+ out:
mutex_unlock(>info_lock);
return ret;
 }
diff --git a/include/linux/uio_driver.h b/include/linux/uio_driver.h
index 6f8b68cd460f..a3cd7cb67a69 100644
--- a/include/linux/uio_driver.h
+++ b/include/linux/uio_driver.h
@@ -133,6 +133,7 @@ extern void uio_event_notify(struct uio_info *info);
 #define UIO_MEM_PHYS   1
 #define UIO_MEM_LOGICAL2
 #define UIO_MEM_VIRTUAL 3
+#define UIO_MEM_IOVA   4
 
 /* defines for uio_port->porttype */
 #define UIO_PORT_NONE  0
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH 5/6] hv_uio_generic: map ringbuffer phys addr

2018-09-14 Thread Stephen Hemminger
The ring buffer is contiguous IOVA and is mapped via phys addr
for sysfs file. Use same method for the UIO mapping.

Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio_hv_generic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index ba67a5267557..53f5610c6065 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -224,10 +224,10 @@ hv_uio_probe(struct hv_device *dev,
/* mem resources */
pdata->info.mem[TXRX_RING_MAP].name = "txrx_rings";
pdata->info.mem[TXRX_RING_MAP].addr
-   = (uintptr_t)page_address(dev->channel->ringbuffer_page);
+   = 
(uintptr_t)virt_to_phys(page_address(dev->channel->ringbuffer_page));
pdata->info.mem[TXRX_RING_MAP].size
= dev->channel->ringbuffer_pagecount << PAGE_SHIFT;
-   pdata->info.mem[TXRX_RING_MAP].memtype = UIO_MEM_LOGICAL;
+   pdata->info.mem[TXRX_RING_MAP].memtype = UIO_MEM_IOVA;
 
pdata->info.mem[INT_PAGE_MAP].name = "int_page";
pdata->info.mem[INT_PAGE_MAP].addr
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH 2/6] vmbus: keep pointer to ring buffer page

2018-09-14 Thread Stephen Hemminger
Avoid going from struct page to virt address (and back) by just
keeping pointer to the allocated pages instead of virt address.

Signed-off-by: Stephen Hemminger 
---
 drivers/hv/channel.c | 20 +---
 drivers/uio/uio_hv_generic.c |  5 +++--
 include/linux/hyperv.h   |  2 +-
 3 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 33e6db02dbab..56ec0d96d876 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -91,11 +91,14 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 
send_ringbuffer_size,
unsigned long flags;
int ret, err = 0;
struct page *page;
+   unsigned int order;
 
if (send_ringbuffer_size % PAGE_SIZE ||
recv_ringbuffer_size % PAGE_SIZE)
return -EINVAL;
 
+   order = get_order(send_ringbuffer_size + recv_ringbuffer_size);
+
spin_lock_irqsave(>lock, flags);
if (newchannel->state == CHANNEL_OPEN_STATE) {
newchannel->state = CHANNEL_OPENING_STATE;
@@ -110,21 +113,17 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 
send_ringbuffer_size,
 
/* Allocate the ring buffer */
page = alloc_pages_node(cpu_to_node(newchannel->target_cpu),
-   GFP_KERNEL|__GFP_ZERO,
-   get_order(send_ringbuffer_size +
-   recv_ringbuffer_size));
+   GFP_KERNEL|__GFP_ZERO, order);
 
if (!page)
-   page = alloc_pages(GFP_KERNEL|__GFP_ZERO,
-  get_order(send_ringbuffer_size +
-recv_ringbuffer_size));
+   page = alloc_pages(GFP_KERNEL|__GFP_ZERO, order);
 
if (!page) {
err = -ENOMEM;
goto error_set_chnstate;
}
 
-   newchannel->ringbuffer_pages = page_address(page);
+   newchannel->ringbuffer_page = page;
newchannel->ringbuffer_pagecount = (send_ringbuffer_size +
   recv_ringbuffer_size) >> PAGE_SHIFT;
 
@@ -239,8 +238,7 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 
send_ringbuffer_size,
 error_free_pages:
hv_ringbuffer_cleanup(>outbound);
hv_ringbuffer_cleanup(>inbound);
-   __free_pages(page,
-get_order(send_ringbuffer_size + recv_ringbuffer_size));
+   __free_pages(page, order);
 error_set_chnstate:
newchannel->state = CHANNEL_OPEN_STATE;
return err;
@@ -658,8 +656,8 @@ static int vmbus_close_internal(struct vmbus_channel 
*channel)
hv_ringbuffer_cleanup(>outbound);
hv_ringbuffer_cleanup(>inbound);
 
-   free_pages((unsigned long)channel->ringbuffer_pages,
-   get_order(channel->ringbuffer_pagecount * PAGE_SIZE));
+   __free_pages(channel->ringbuffer_page,
+get_order(channel->ringbuffer_pagecount << PAGE_SHIFT));
 
 out:
return ret;
diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index a08860260f55..ba67a5267557 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -130,11 +130,12 @@ static int hv_uio_ring_mmap(struct file *filp, struct 
kobject *kobj,
= container_of(kobj, struct vmbus_channel, kobj);
struct hv_device *dev = channel->primary_channel->device_obj;
u16 q_idx = channel->offermsg.offer.sub_channel_index;
+   void *ring_buffer = page_address(channel->ringbuffer_page);
 
dev_dbg(>device, "mmap channel %u pages %#lx at %#lx\n",
q_idx, vma_pages(vma), vma->vm_pgoff);
 
-   return vm_iomap_memory(vma, virt_to_phys(channel->ringbuffer_pages),
+   return vm_iomap_memory(vma, virt_to_phys(ring_buffer),
   channel->ringbuffer_pagecount << PAGE_SHIFT);
 }
 
@@ -223,7 +224,7 @@ hv_uio_probe(struct hv_device *dev,
/* mem resources */
pdata->info.mem[TXRX_RING_MAP].name = "txrx_rings";
pdata->info.mem[TXRX_RING_MAP].addr
-   = (uintptr_t)dev->channel->ringbuffer_pages;
+   = (uintptr_t)page_address(dev->channel->ringbuffer_page);
pdata->info.mem[TXRX_RING_MAP].size
= dev->channel->ringbuffer_pagecount << PAGE_SHIFT;
pdata->info.mem[TXRX_RING_MAP].memtype = UIO_MEM_LOGICAL;
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 6c4575c7f46b..a6c32d2d090b 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -739,7 +739,7 @@ struct vmbus_channel {
u32 ringbuffer_gpadlhandle;
 
/* Allocated memory for ring buffer */
-   void *ringbuffer_pages;
+   struct page *ringbuffer_page;
u32 ringbuffer_pagecount;
   

[PATCH 1/6] vmbus: pass channel to hv_process_channel_removal

2018-09-14 Thread Stephen Hemminger
Rather than passing relid and then looking up the channel.
Pass the channel directly, since caller already knows it.

Signed-off-by: Stephen Hemminger 
---
 drivers/hv/channel.c  |  3 +--
 drivers/hv/channel_mgmt.c | 17 +
 drivers/hv/vmbus_drv.c|  3 +--
 include/linux/hyperv.h|  2 +-
 4 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 741857d80da1..33e6db02dbab 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -690,8 +690,7 @@ void vmbus_close(struct vmbus_channel *channel)
wait_for_completion(_channel->rescind_event);
mutex_lock(_connection.channel_mutex);
vmbus_close_internal(cur_channel);
-   hv_process_channel_removal(
-  cur_channel->offermsg.child_relid);
+   hv_process_channel_removal(cur_channel);
} else {
mutex_lock(_connection.channel_mutex);
vmbus_close_internal(cur_channel);
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 0f0e091c117c..b7c48ebdf6a1 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -385,21 +385,14 @@ static void vmbus_release_relid(u32 relid)
trace_vmbus_release_relid(, ret);
 }
 
-void hv_process_channel_removal(u32 relid)
+void hv_process_channel_removal(struct vmbus_channel *channel)
 {
+   struct vmbus_channel *primary_channel;
unsigned long flags;
-   struct vmbus_channel *primary_channel, *channel;
 
BUG_ON(!mutex_is_locked(_connection.channel_mutex));
-
-   /*
-* Make sure channel is valid as we may have raced.
-*/
-   channel = relid2channel(relid);
-   if (!channel)
-   return;
-
BUG_ON(!channel->rescind);
+
if (channel->target_cpu != get_cpu()) {
put_cpu();
smp_call_function_single(channel->target_cpu,
@@ -429,7 +422,7 @@ void hv_process_channel_removal(u32 relid)
cpumask_clear_cpu(channel->target_cpu,
  _channel->alloced_cpus_in_node);
 
-   vmbus_release_relid(relid);
+   vmbus_release_relid(channel->offermsg.child_relid);
 
free_channel(channel);
 }
@@ -943,7 +936,7 @@ static void vmbus_onoffer_rescind(struct 
vmbus_channel_message_header *hdr)
 * The channel is currently not open;
 * it is safe for us to cleanup the channel.
 */
-   hv_process_channel_removal(rescind->child_relid);
+   hv_process_channel_removal(channel);
} else {
complete(>rescind_event);
}
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index e6d8fdac6d8b..007ee8e5986a 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -864,10 +864,9 @@ static void vmbus_device_release(struct device *device)
struct vmbus_channel *channel = hv_dev->channel;
 
mutex_lock(_connection.channel_mutex);
-   hv_process_channel_removal(channel->offermsg.child_relid);
+   hv_process_channel_removal(channel);
mutex_unlock(_connection.channel_mutex);
kfree(hv_dev);
-
 }
 
 /* The one and only one */
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 2c3798bcb01c..6c4575c7f46b 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1443,7 +1443,7 @@ extern bool vmbus_prep_negotiate_resp(struct icmsg_hdr 
*icmsghdrp, u8 *buf,
const int *srv_version, int srv_vercnt,
int *nego_fw_version, int *nego_srv_version);
 
-void hv_process_channel_removal(u32 relid);
+void hv_process_channel_removal(struct vmbus_channel *channel);
 
 void vmbus_setevent(struct vmbus_channel *channel);
 /*
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH 6/6] uio_hv_generic: defer opening vmbus until first use

2018-09-14 Thread Stephen Hemminger
This fixes two design flaws in hv_uio_generic.

Since hv_uio_probe is called from vmbus_probe with lock held
it potentially can cause sleep in an atomic section because
vmbus_open will wait for response from host.

The hv_uio_generic driver could not handle applications
exiting and restarting because the vmbus channel was
persistent.  Change the semantics so that the buffers are
allocated on probe, but not attached to host until
device is opened.

Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio_hv_generic.c | 104 +--
 1 file changed, 74 insertions(+), 30 deletions(-)

diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index 53f5610c6065..f2ec981d66cb 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -55,6 +55,7 @@ enum hv_uio_map {
 struct hv_uio_private_data {
struct uio_info info;
struct hv_device *device;
+   atomic_t refcnt;
 
void*recv_buf;
u32 recv_gpadl;
@@ -128,12 +129,10 @@ static int hv_uio_ring_mmap(struct file *filp, struct 
kobject *kobj,
 {
struct vmbus_channel *channel
= container_of(kobj, struct vmbus_channel, kobj);
-   struct hv_device *dev = channel->primary_channel->device_obj;
-   u16 q_idx = channel->offermsg.offer.sub_channel_index;
void *ring_buffer = page_address(channel->ringbuffer_page);
 
-   dev_dbg(>device, "mmap channel %u pages %#lx at %#lx\n",
-   q_idx, vma_pages(vma), vma->vm_pgoff);
+   if (channel->state != CHANNEL_OPENED_STATE)
+   return -ENODEV;
 
return vm_iomap_memory(vma, virt_to_phys(ring_buffer),
   channel->ringbuffer_pagecount << PAGE_SHIFT);
@@ -176,57 +175,103 @@ hv_uio_new_channel(struct vmbus_channel *new_sc)
}
 }
 
+/* free the reserved buffers for send and receive */
 static void
 hv_uio_cleanup(struct hv_device *dev, struct hv_uio_private_data *pdata)
 {
-   if (pdata->send_gpadl)
+   if (pdata->send_gpadl) {
vmbus_teardown_gpadl(dev->channel, pdata->send_gpadl);
-   vfree(pdata->send_buf);
+   pdata->send_gpadl = 0;
+   vfree(pdata->send_buf);
+   }
 
-   if (pdata->recv_gpadl)
+   if (pdata->recv_gpadl) {
vmbus_teardown_gpadl(dev->channel, pdata->recv_gpadl);
-   vfree(pdata->recv_buf);
+   pdata->recv_gpadl = 0;
+   vfree(pdata->recv_buf);
+   }
+}
+
+/* VMBus primary channel is opened on first use */
+static int
+hv_uio_open(struct uio_info *info, struct inode *inode)
+{
+   struct hv_uio_private_data *pdata
+   = container_of(info, struct hv_uio_private_data, info);
+   struct hv_device *dev = pdata->device;
+   int ret;
+
+   if (atomic_inc_return(>refcnt) != 1)
+   return 0;
+
+   ret = vmbus_connect_ring(dev->channel,
+hv_uio_channel_cb, dev->channel);
+
+   if (ret == 0)
+   dev->channel->inbound.ring_buffer->interrupt_mask = 1;
+   else
+   atomic_dec(>refcount);
+
+   return ret;
+}
+
+/* VMBus primary channel is closed on last close */
+static int
+hv_uio_release(struct uio_info *info, struct inode *inode)
+{
+   struct hv_uio_private_data *pdata
+   = container_of(info, struct hv_uio_private_data, info);
+   struct hv_device *dev = pdata->device;
+   int ret = 0;
+
+   if (atomic_dec_and_test(>refcnt))
+   ret = vmbus_disconnect_ring(dev->channel);
+
+   return ret;
 }
 
 static int
 hv_uio_probe(struct hv_device *dev,
 const struct hv_vmbus_device_id *dev_id)
 {
+   struct vmbus_channel *channel = dev->channel;
struct hv_uio_private_data *pdata;
+   void *ring_buffer;
int ret;
 
+   /* Communicating with host has to be via shared memory not hypercall */
+   if (!channel->offermsg.monitor_allocated) {
+   dev_err(>device, "vmbus channel requires hypercall\n");
+   return -ENOTSUPP;
+   }
+
pdata = kzalloc(sizeof(*pdata), GFP_KERNEL);
if (!pdata)
return -ENOMEM;
 
-   ret = vmbus_open(dev->channel, HV_RING_SIZE * PAGE_SIZE,
-HV_RING_SIZE * PAGE_SIZE, NULL, 0,
-hv_uio_channel_cb, dev->channel);
+   ret = vmbus_alloc_ring(channel, HV_RING_SIZE * PAGE_SIZE,
+  HV_RING_SIZE * PAGE_SIZE);
if (ret)
goto fail;
 
-   /* Communicating with host has to be via shared memory not hypercall */
-   if (!dev->channel->offermsg.monitor_allocated) {
-   dev_err(>device, "vmbus channel requires hypercall\n");
-   ret = -ENOTSUPP;
-   

[PATCH 4/6] uio: introduce UIO_MEM_IOVA

2018-09-14 Thread Stephen Hemminger
Introduce the concept of mapping physical memory locations that
are normal memory. The new type UIO_MEM_IOVA are similar to
existing UIO_MEM_PHYS but the backing memory is not marked as uncached.

Also, indent related switch to the currently used style.

Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio.c  | 24 +---
 include/linux/uio_driver.h |  1 +
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index 0ffb324aa038..e601bd3fbae1 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -738,7 +738,8 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
return -EINVAL;
 
vma->vm_ops = _physical_vm_ops;
-   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+   if (idev->info->mem[mi].memtype == UIO_MEM_PHYS)
+   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
/*
 * We cannot use the vm_iomap_memory() helper here,
@@ -795,18 +796,19 @@ static int uio_mmap(struct file *filep, struct 
vm_area_struct *vma)
}
 
switch (idev->info->mem[mi].memtype) {
-   case UIO_MEM_PHYS:
-   ret = uio_mmap_physical(vma);
-   break;
-   case UIO_MEM_LOGICAL:
-   case UIO_MEM_VIRTUAL:
-   ret = uio_mmap_logical(vma);
-   break;
-   default:
-   ret = -EINVAL;
+   case UIO_MEM_IOVA:
+   case UIO_MEM_PHYS:
+   ret = uio_mmap_physical(vma);
+   break;
+   case UIO_MEM_LOGICAL:
+   case UIO_MEM_VIRTUAL:
+   ret = uio_mmap_logical(vma);
+   break;
+   default:
+   ret = -EINVAL;
}
 
-out:
+ out:
mutex_unlock(>info_lock);
return ret;
 }
diff --git a/include/linux/uio_driver.h b/include/linux/uio_driver.h
index 6f8b68cd460f..a3cd7cb67a69 100644
--- a/include/linux/uio_driver.h
+++ b/include/linux/uio_driver.h
@@ -133,6 +133,7 @@ extern void uio_event_notify(struct uio_info *info);
 #define UIO_MEM_PHYS   1
 #define UIO_MEM_LOGICAL2
 #define UIO_MEM_VIRTUAL 3
+#define UIO_MEM_IOVA   4
 
 /* defines for uio_port->porttype */
 #define UIO_PORT_NONE  0
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH 3/6] vmbus: split ring buffer allocation from open

2018-09-14 Thread Stephen Hemminger
The UIO driver needs the ring buffer to be persistent(reused)
across open/close. Split the allocation and setup of ring buffer
out of vmbus_open. For normal usage vmbus_open/vmbus_close there
are no changes; only impacts uio_hv_generic which needs to keep
ring buffer memory and reuse when application restarts.

Signed-off-by: Stephen Hemminger 
---
 drivers/hv/channel.c | 267 ++-
 drivers/hv/ring_buffer.c |   1 +
 include/linux/hyperv.h   |   9 ++
 3 files changed, 162 insertions(+), 115 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 56ec0d96d876..ddadb7efd1cc 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -79,84 +79,96 @@ void vmbus_setevent(struct vmbus_channel *channel)
 }
 EXPORT_SYMBOL_GPL(vmbus_setevent);
 
-/*
- * vmbus_open - Open the specified channel.
- */
-int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size,
-u32 recv_ringbuffer_size, void *userdata, u32 userdatalen,
-void (*onchannelcallback)(void *context), void *context)
+/* vmbus_free_ring - drop mapping of ring buffer */
+void vmbus_free_ring(struct vmbus_channel *channel)
 {
-   struct vmbus_channel_open_channel *open_msg;
-   struct vmbus_channel_msginfo *open_info = NULL;
-   unsigned long flags;
-   int ret, err = 0;
-   struct page *page;
-   unsigned int order;
+   hv_ringbuffer_cleanup(>outbound);
+   hv_ringbuffer_cleanup(>inbound);
 
-   if (send_ringbuffer_size % PAGE_SIZE ||
-   recv_ringbuffer_size % PAGE_SIZE)
-   return -EINVAL;
+   if (channel->ringbuffer_page) {
+   __free_pages(channel->ringbuffer_page,
+get_order(channel->ringbuffer_pagecount
+  << PAGE_SHIFT));
+   channel->ringbuffer_page = NULL;
+   }
+}
+EXPORT_SYMBOL_GPL(vmbus_free_ring);
 
-   order = get_order(send_ringbuffer_size + recv_ringbuffer_size);
+/* vmbus_alloc_ring - allocate and map pages for ring buffer */
+int vmbus_alloc_ring(struct vmbus_channel *newchannel,
+u32 send_size, u32 recv_size)
+{
+   struct page *page;
+   int order;
 
-   spin_lock_irqsave(>lock, flags);
-   if (newchannel->state == CHANNEL_OPEN_STATE) {
-   newchannel->state = CHANNEL_OPENING_STATE;
-   } else {
-   spin_unlock_irqrestore(>lock, flags);
+   if (send_size % PAGE_SIZE || recv_size % PAGE_SIZE)
return -EINVAL;
-   }
-   spin_unlock_irqrestore(>lock, flags);
-
-   newchannel->onchannel_callback = onchannelcallback;
-   newchannel->channel_callback_context = context;
 
/* Allocate the ring buffer */
+   order = get_order(send_size + recv_size);
page = alloc_pages_node(cpu_to_node(newchannel->target_cpu),
GFP_KERNEL|__GFP_ZERO, order);
 
if (!page)
page = alloc_pages(GFP_KERNEL|__GFP_ZERO, order);
 
-   if (!page) {
-   err = -ENOMEM;
-   goto error_set_chnstate;
-   }
+   if (!page)
+   return -ENOMEM;
 
newchannel->ringbuffer_page = page;
-   newchannel->ringbuffer_pagecount = (send_ringbuffer_size +
-  recv_ringbuffer_size) >> PAGE_SHIFT;
+   newchannel->ringbuffer_pagecount = (send_size + recv_size) >> 
PAGE_SHIFT;
+   newchannel->ringbuffer_send_offset = send_size >> PAGE_SHIFT;
 
-   ret = hv_ringbuffer_init(>outbound, page,
-send_ringbuffer_size >> PAGE_SHIFT);
+   return 0;
+}
+EXPORT_SYMBOL_GPL(vmbus_alloc_ring);
 
-   if (ret != 0) {
-   err = ret;
-   goto error_free_pages;
-   }
+static int __vmbus_open(struct vmbus_channel *newchannel,
+  void *userdata, u32 userdatalen,
+  void (*onchannelcallback)(void *context), void *context)
+{
+   struct vmbus_channel_open_channel *open_msg;
+   struct vmbus_channel_msginfo *open_info = NULL;
+   struct page *page = newchannel->ringbuffer_page;
+   u32 send_pages, recv_pages;
+   unsigned long flags;
+   int err;
 
-   ret = hv_ringbuffer_init(>inbound,
-[send_ringbuffer_size >> PAGE_SHIFT],
-recv_ringbuffer_size >> PAGE_SHIFT);
-   if (ret != 0) {
-   err = ret;
-   goto error_free_pages;
+   if (userdatalen > MAX_USER_DEFINED_BYTES)
+   return -EINVAL;
+
+   send_pages = newchannel->ringbuffer_send_offset;
+   recv_pages = newchannel->ringbuffer_pagecount - send_pages;
+
+   spin_lock_irqsave(>lock, flags);
+   if (newchannel->state != CHANNEL_OPEN_STATE) {
+ 

[PATCH 4/6] uio: introduce UIO_MEM_IOVA

2018-09-13 Thread Stephen Hemminger
Introduce the concept of mapping physical memory locations that
are normal memory. The new type UIO_MEM_IOVA are similar to
existing UIO_MEM_PHYS but the backing memory is not marked as uncached.

Also, indent related switch to the currently used style.

Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio.c  | 24 +---
 include/linux/uio_driver.h |  1 +
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index 0ffb324aa038..e601bd3fbae1 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -738,7 +738,8 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
return -EINVAL;
 
vma->vm_ops = _physical_vm_ops;
-   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+   if (idev->info->mem[mi].memtype == UIO_MEM_PHYS)
+   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
/*
 * We cannot use the vm_iomap_memory() helper here,
@@ -795,18 +796,19 @@ static int uio_mmap(struct file *filep, struct 
vm_area_struct *vma)
}
 
switch (idev->info->mem[mi].memtype) {
-   case UIO_MEM_PHYS:
-   ret = uio_mmap_physical(vma);
-   break;
-   case UIO_MEM_LOGICAL:
-   case UIO_MEM_VIRTUAL:
-   ret = uio_mmap_logical(vma);
-   break;
-   default:
-   ret = -EINVAL;
+   case UIO_MEM_IOVA:
+   case UIO_MEM_PHYS:
+   ret = uio_mmap_physical(vma);
+   break;
+   case UIO_MEM_LOGICAL:
+   case UIO_MEM_VIRTUAL:
+   ret = uio_mmap_logical(vma);
+   break;
+   default:
+   ret = -EINVAL;
}
 
-out:
+ out:
mutex_unlock(>info_lock);
return ret;
 }
diff --git a/include/linux/uio_driver.h b/include/linux/uio_driver.h
index 6f8b68cd460f..a3cd7cb67a69 100644
--- a/include/linux/uio_driver.h
+++ b/include/linux/uio_driver.h
@@ -133,6 +133,7 @@ extern void uio_event_notify(struct uio_info *info);
 #define UIO_MEM_PHYS   1
 #define UIO_MEM_LOGICAL2
 #define UIO_MEM_VIRTUAL 3
+#define UIO_MEM_IOVA   4
 
 /* defines for uio_port->porttype */
 #define UIO_PORT_NONE  0
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH 6/6] uio_hv_generic: defer opening vmbus until first use

2018-09-13 Thread Stephen Hemminger
This fixes two design flaws in hv_uio_generic.

Since hv_uio_probe is called from vmbus_probe with lock held
it potentially can cause sleep in an atomic section because
vmbus_open will wait for response from host.

The hv_uio_generic driver could not handle applications
exiting and restarting because the vmbus channel was
persistent.  Change the semantics so that the buffers are
allocated on probe, but not attached to host until
device is opened.

Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio_hv_generic.c | 102 ---
 1 file changed, 72 insertions(+), 30 deletions(-)

diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index 53f5610c6065..9bd837accdb5 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -55,6 +55,7 @@ enum hv_uio_map {
 struct hv_uio_private_data {
struct uio_info info;
struct hv_device *device;
+   atomic_t refcnt;
 
void*recv_buf;
u32 recv_gpadl;
@@ -128,12 +129,10 @@ static int hv_uio_ring_mmap(struct file *filp, struct 
kobject *kobj,
 {
struct vmbus_channel *channel
= container_of(kobj, struct vmbus_channel, kobj);
-   struct hv_device *dev = channel->primary_channel->device_obj;
-   u16 q_idx = channel->offermsg.offer.sub_channel_index;
void *ring_buffer = page_address(channel->ringbuffer_page);
 
-   dev_dbg(>device, "mmap channel %u pages %#lx at %#lx\n",
-   q_idx, vma_pages(vma), vma->vm_pgoff);
+   if (channel->state != CHANNEL_OPENED_STATE)
+   return -ENODEV;
 
return vm_iomap_memory(vma, virt_to_phys(ring_buffer),
   channel->ringbuffer_pagecount << PAGE_SHIFT);
@@ -176,57 +175,101 @@ hv_uio_new_channel(struct vmbus_channel *new_sc)
}
 }
 
+/* free the reserved buffers for send and receive */
 static void
 hv_uio_cleanup(struct hv_device *dev, struct hv_uio_private_data *pdata)
 {
-   if (pdata->send_gpadl)
+   if (pdata->send_gpadl) {
vmbus_teardown_gpadl(dev->channel, pdata->send_gpadl);
-   vfree(pdata->send_buf);
+   pdata->send_gpadl = 0;
+   vfree(pdata->send_buf);
+   }
 
-   if (pdata->recv_gpadl)
+   if (pdata->recv_gpadl) {
vmbus_teardown_gpadl(dev->channel, pdata->recv_gpadl);
-   vfree(pdata->recv_buf);
+   pdata->recv_gpadl = 0;
+   vfree(pdata->recv_buf);
+   }
+}
+
+/* VMBus primary channel is opened on first use */
+static int
+hv_uio_open(struct uio_info *info, struct inode *inode)
+{
+   struct hv_uio_private_data *pdata
+   = container_of(info, struct hv_uio_private_data, info);
+   struct hv_device *dev = pdata->device;
+   int ret;
+
+   if (atomic_inc_return(>refcnt) != 1)
+   return 0;
+
+   ret = vmbus_connect_ring(dev->channel,
+hv_uio_channel_cb, dev->channel);
+
+   if (ret == 0)
+   dev->channel->inbound.ring_buffer->interrupt_mask = 1;
+
+   return ret;
+}
+
+/* VMBus primary channel is closed on last close */
+static int
+hv_uio_release(struct uio_info *info, struct inode *inode)
+{
+   struct hv_uio_private_data *pdata
+   = container_of(info, struct hv_uio_private_data, info);
+   struct hv_device *dev = pdata->device;
+   int ret = 0;
+
+   if (atomic_dec_and_test(>refcnt))
+   ret = vmbus_disconnect_ring(dev->channel);
+
+   return ret;
 }
 
 static int
 hv_uio_probe(struct hv_device *dev,
 const struct hv_vmbus_device_id *dev_id)
 {
+   struct vmbus_channel *channel = dev->channel;
struct hv_uio_private_data *pdata;
+   void *ring_buffer;
int ret;
 
+   /* Communicating with host has to be via shared memory not hypercall */
+   if (!channel->offermsg.monitor_allocated) {
+   dev_err(>device, "vmbus channel requires hypercall\n");
+   return -ENOTSUPP;
+   }
+
pdata = kzalloc(sizeof(*pdata), GFP_KERNEL);
if (!pdata)
return -ENOMEM;
 
-   ret = vmbus_open(dev->channel, HV_RING_SIZE * PAGE_SIZE,
-HV_RING_SIZE * PAGE_SIZE, NULL, 0,
-hv_uio_channel_cb, dev->channel);
+   ret = vmbus_alloc_ring(channel, HV_RING_SIZE * PAGE_SIZE,
+  HV_RING_SIZE * PAGE_SIZE);
if (ret)
goto fail;
 
-   /* Communicating with host has to be via shared memory not hypercall */
-   if (!dev->channel->offermsg.monitor_allocated) {
-   dev_err(>device, "vmbus channel requires hypercall\n");
-   ret = -ENOTSUPP;
-   goto fail_close;
-   }
-
-   dev->channel->inbound

[PATCH 0/6] fix Hyper-V uio restart

2018-09-13 Thread Stephen Hemminger
This set of patches fixes the problem where DPDK applications
using hv_uio_generic driver can not be successfully restarted.

In order to get this working it required small change to uio
to allow for mapping without no-cache. And refactoring of how
ring buffer is setup in vmbus code.

It could be backported as a fix, to 4.19 but that is not
an LTS so probably not worth it.

Stephen Hemminger (6):
  vmbus: pass channel to hv_process_channel_removal
  vmbus: keep pointer to ring buffer page
  vmbus: split ring buffer allocation from open
  uio: introduce UIO_MEM_IOVA
  hv_uio_generic: map ringbuffer phys addr
  uio_hv_generic: defer opening vmbus until first use

 drivers/hv/channel.c | 276 ---
 drivers/hv/channel_mgmt.c|  17 +--
 drivers/hv/ring_buffer.c |   1 +
 drivers/hv/vmbus_drv.c   |   3 +-
 drivers/uio/uio.c|  24 +--
 drivers/uio/uio_hv_generic.c | 107 ++
 include/linux/hyperv.h   |  13 +-
 include/linux/uio_driver.h   |   1 +
 8 files changed, 262 insertions(+), 180 deletions(-)

-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH 3/6] vmbus: split ring buffer allocation from open

2018-09-13 Thread Stephen Hemminger
The UIO driver needs the ring buffer to be persistent(reused)
across open/close. Split the allocation and setup of ring buffer
out of vmbus_open. For normal usage vmbus_open/vmbus_close there
are no changes; only impacts uio_hv_generic which needs to keep
ring buffer memory and reuse when application restarts.

Signed-off-by: Stephen Hemminger 
---
 drivers/hv/channel.c | 267 ++-
 drivers/hv/ring_buffer.c |   1 +
 include/linux/hyperv.h   |   9 ++
 3 files changed, 162 insertions(+), 115 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 56ec0d96d876..ddadb7efd1cc 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -79,84 +79,96 @@ void vmbus_setevent(struct vmbus_channel *channel)
 }
 EXPORT_SYMBOL_GPL(vmbus_setevent);
 
-/*
- * vmbus_open - Open the specified channel.
- */
-int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size,
-u32 recv_ringbuffer_size, void *userdata, u32 userdatalen,
-void (*onchannelcallback)(void *context), void *context)
+/* vmbus_free_ring - drop mapping of ring buffer */
+void vmbus_free_ring(struct vmbus_channel *channel)
 {
-   struct vmbus_channel_open_channel *open_msg;
-   struct vmbus_channel_msginfo *open_info = NULL;
-   unsigned long flags;
-   int ret, err = 0;
-   struct page *page;
-   unsigned int order;
+   hv_ringbuffer_cleanup(>outbound);
+   hv_ringbuffer_cleanup(>inbound);
 
-   if (send_ringbuffer_size % PAGE_SIZE ||
-   recv_ringbuffer_size % PAGE_SIZE)
-   return -EINVAL;
+   if (channel->ringbuffer_page) {
+   __free_pages(channel->ringbuffer_page,
+get_order(channel->ringbuffer_pagecount
+  << PAGE_SHIFT));
+   channel->ringbuffer_page = NULL;
+   }
+}
+EXPORT_SYMBOL_GPL(vmbus_free_ring);
 
-   order = get_order(send_ringbuffer_size + recv_ringbuffer_size);
+/* vmbus_alloc_ring - allocate and map pages for ring buffer */
+int vmbus_alloc_ring(struct vmbus_channel *newchannel,
+u32 send_size, u32 recv_size)
+{
+   struct page *page;
+   int order;
 
-   spin_lock_irqsave(>lock, flags);
-   if (newchannel->state == CHANNEL_OPEN_STATE) {
-   newchannel->state = CHANNEL_OPENING_STATE;
-   } else {
-   spin_unlock_irqrestore(>lock, flags);
+   if (send_size % PAGE_SIZE || recv_size % PAGE_SIZE)
return -EINVAL;
-   }
-   spin_unlock_irqrestore(>lock, flags);
-
-   newchannel->onchannel_callback = onchannelcallback;
-   newchannel->channel_callback_context = context;
 
/* Allocate the ring buffer */
+   order = get_order(send_size + recv_size);
page = alloc_pages_node(cpu_to_node(newchannel->target_cpu),
GFP_KERNEL|__GFP_ZERO, order);
 
if (!page)
page = alloc_pages(GFP_KERNEL|__GFP_ZERO, order);
 
-   if (!page) {
-   err = -ENOMEM;
-   goto error_set_chnstate;
-   }
+   if (!page)
+   return -ENOMEM;
 
newchannel->ringbuffer_page = page;
-   newchannel->ringbuffer_pagecount = (send_ringbuffer_size +
-  recv_ringbuffer_size) >> PAGE_SHIFT;
+   newchannel->ringbuffer_pagecount = (send_size + recv_size) >> 
PAGE_SHIFT;
+   newchannel->ringbuffer_send_offset = send_size >> PAGE_SHIFT;
 
-   ret = hv_ringbuffer_init(>outbound, page,
-send_ringbuffer_size >> PAGE_SHIFT);
+   return 0;
+}
+EXPORT_SYMBOL_GPL(vmbus_alloc_ring);
 
-   if (ret != 0) {
-   err = ret;
-   goto error_free_pages;
-   }
+static int __vmbus_open(struct vmbus_channel *newchannel,
+  void *userdata, u32 userdatalen,
+  void (*onchannelcallback)(void *context), void *context)
+{
+   struct vmbus_channel_open_channel *open_msg;
+   struct vmbus_channel_msginfo *open_info = NULL;
+   struct page *page = newchannel->ringbuffer_page;
+   u32 send_pages, recv_pages;
+   unsigned long flags;
+   int err;
 
-   ret = hv_ringbuffer_init(>inbound,
-[send_ringbuffer_size >> PAGE_SHIFT],
-recv_ringbuffer_size >> PAGE_SHIFT);
-   if (ret != 0) {
-   err = ret;
-   goto error_free_pages;
+   if (userdatalen > MAX_USER_DEFINED_BYTES)
+   return -EINVAL;
+
+   send_pages = newchannel->ringbuffer_send_offset;
+   recv_pages = newchannel->ringbuffer_pagecount - send_pages;
+
+   spin_lock_irqsave(>lock, flags);
+   if (newchannel->state != CHANNEL_OPEN_STATE) {
+ 

[PATCH 2/6] vmbus: keep pointer to ring buffer page

2018-09-13 Thread Stephen Hemminger
Avoid going from struct page to virt address (and back) by just
keeping pointer to the allocated pages instead of virt address.

Signed-off-by: Stephen Hemminger 
---
 drivers/hv/channel.c | 20 +---
 drivers/uio/uio_hv_generic.c |  5 +++--
 include/linux/hyperv.h   |  2 +-
 3 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 33e6db02dbab..56ec0d96d876 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -91,11 +91,14 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 
send_ringbuffer_size,
unsigned long flags;
int ret, err = 0;
struct page *page;
+   unsigned int order;
 
if (send_ringbuffer_size % PAGE_SIZE ||
recv_ringbuffer_size % PAGE_SIZE)
return -EINVAL;
 
+   order = get_order(send_ringbuffer_size + recv_ringbuffer_size);
+
spin_lock_irqsave(>lock, flags);
if (newchannel->state == CHANNEL_OPEN_STATE) {
newchannel->state = CHANNEL_OPENING_STATE;
@@ -110,21 +113,17 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 
send_ringbuffer_size,
 
/* Allocate the ring buffer */
page = alloc_pages_node(cpu_to_node(newchannel->target_cpu),
-   GFP_KERNEL|__GFP_ZERO,
-   get_order(send_ringbuffer_size +
-   recv_ringbuffer_size));
+   GFP_KERNEL|__GFP_ZERO, order);
 
if (!page)
-   page = alloc_pages(GFP_KERNEL|__GFP_ZERO,
-  get_order(send_ringbuffer_size +
-recv_ringbuffer_size));
+   page = alloc_pages(GFP_KERNEL|__GFP_ZERO, order);
 
if (!page) {
err = -ENOMEM;
goto error_set_chnstate;
}
 
-   newchannel->ringbuffer_pages = page_address(page);
+   newchannel->ringbuffer_page = page;
newchannel->ringbuffer_pagecount = (send_ringbuffer_size +
   recv_ringbuffer_size) >> PAGE_SHIFT;
 
@@ -239,8 +238,7 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 
send_ringbuffer_size,
 error_free_pages:
hv_ringbuffer_cleanup(>outbound);
hv_ringbuffer_cleanup(>inbound);
-   __free_pages(page,
-get_order(send_ringbuffer_size + recv_ringbuffer_size));
+   __free_pages(page, order);
 error_set_chnstate:
newchannel->state = CHANNEL_OPEN_STATE;
return err;
@@ -658,8 +656,8 @@ static int vmbus_close_internal(struct vmbus_channel 
*channel)
hv_ringbuffer_cleanup(>outbound);
hv_ringbuffer_cleanup(>inbound);
 
-   free_pages((unsigned long)channel->ringbuffer_pages,
-   get_order(channel->ringbuffer_pagecount * PAGE_SIZE));
+   __free_pages(channel->ringbuffer_page,
+get_order(channel->ringbuffer_pagecount << PAGE_SHIFT));
 
 out:
return ret;
diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index a08860260f55..ba67a5267557 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -130,11 +130,12 @@ static int hv_uio_ring_mmap(struct file *filp, struct 
kobject *kobj,
= container_of(kobj, struct vmbus_channel, kobj);
struct hv_device *dev = channel->primary_channel->device_obj;
u16 q_idx = channel->offermsg.offer.sub_channel_index;
+   void *ring_buffer = page_address(channel->ringbuffer_page);
 
dev_dbg(>device, "mmap channel %u pages %#lx at %#lx\n",
q_idx, vma_pages(vma), vma->vm_pgoff);
 
-   return vm_iomap_memory(vma, virt_to_phys(channel->ringbuffer_pages),
+   return vm_iomap_memory(vma, virt_to_phys(ring_buffer),
   channel->ringbuffer_pagecount << PAGE_SHIFT);
 }
 
@@ -223,7 +224,7 @@ hv_uio_probe(struct hv_device *dev,
/* mem resources */
pdata->info.mem[TXRX_RING_MAP].name = "txrx_rings";
pdata->info.mem[TXRX_RING_MAP].addr
-   = (uintptr_t)dev->channel->ringbuffer_pages;
+   = (uintptr_t)page_address(dev->channel->ringbuffer_page);
pdata->info.mem[TXRX_RING_MAP].size
= dev->channel->ringbuffer_pagecount << PAGE_SHIFT;
pdata->info.mem[TXRX_RING_MAP].memtype = UIO_MEM_LOGICAL;
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 6c4575c7f46b..a6c32d2d090b 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -739,7 +739,7 @@ struct vmbus_channel {
u32 ringbuffer_gpadlhandle;
 
/* Allocated memory for ring buffer */
-   void *ringbuffer_pages;
+   struct page *ringbuffer_page;
u32 ringbuffer_pagecount;
   

[PATCH 1/6] vmbus: pass channel to hv_process_channel_removal

2018-09-13 Thread Stephen Hemminger
Rather than passing relid and then looking up the channel.
Pass the channel directly, since caller already knows it.

Signed-off-by: Stephen Hemminger 
---
 drivers/hv/channel.c  |  3 +--
 drivers/hv/channel_mgmt.c | 17 +
 drivers/hv/vmbus_drv.c|  3 +--
 include/linux/hyperv.h|  2 +-
 4 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 741857d80da1..33e6db02dbab 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -690,8 +690,7 @@ void vmbus_close(struct vmbus_channel *channel)
wait_for_completion(_channel->rescind_event);
mutex_lock(_connection.channel_mutex);
vmbus_close_internal(cur_channel);
-   hv_process_channel_removal(
-  cur_channel->offermsg.child_relid);
+   hv_process_channel_removal(cur_channel);
} else {
mutex_lock(_connection.channel_mutex);
vmbus_close_internal(cur_channel);
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 0f0e091c117c..b7c48ebdf6a1 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -385,21 +385,14 @@ static void vmbus_release_relid(u32 relid)
trace_vmbus_release_relid(, ret);
 }
 
-void hv_process_channel_removal(u32 relid)
+void hv_process_channel_removal(struct vmbus_channel *channel)
 {
+   struct vmbus_channel *primary_channel;
unsigned long flags;
-   struct vmbus_channel *primary_channel, *channel;
 
BUG_ON(!mutex_is_locked(_connection.channel_mutex));
-
-   /*
-* Make sure channel is valid as we may have raced.
-*/
-   channel = relid2channel(relid);
-   if (!channel)
-   return;
-
BUG_ON(!channel->rescind);
+
if (channel->target_cpu != get_cpu()) {
put_cpu();
smp_call_function_single(channel->target_cpu,
@@ -429,7 +422,7 @@ void hv_process_channel_removal(u32 relid)
cpumask_clear_cpu(channel->target_cpu,
  _channel->alloced_cpus_in_node);
 
-   vmbus_release_relid(relid);
+   vmbus_release_relid(channel->offermsg.child_relid);
 
free_channel(channel);
 }
@@ -943,7 +936,7 @@ static void vmbus_onoffer_rescind(struct 
vmbus_channel_message_header *hdr)
 * The channel is currently not open;
 * it is safe for us to cleanup the channel.
 */
-   hv_process_channel_removal(rescind->child_relid);
+   hv_process_channel_removal(channel);
} else {
complete(>rescind_event);
}
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index e6d8fdac6d8b..007ee8e5986a 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -864,10 +864,9 @@ static void vmbus_device_release(struct device *device)
struct vmbus_channel *channel = hv_dev->channel;
 
mutex_lock(_connection.channel_mutex);
-   hv_process_channel_removal(channel->offermsg.child_relid);
+   hv_process_channel_removal(channel);
mutex_unlock(_connection.channel_mutex);
kfree(hv_dev);
-
 }
 
 /* The one and only one */
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 2c3798bcb01c..6c4575c7f46b 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1443,7 +1443,7 @@ extern bool vmbus_prep_negotiate_resp(struct icmsg_hdr 
*icmsghdrp, u8 *buf,
const int *srv_version, int srv_vercnt,
int *nego_fw_version, int *nego_srv_version);
 
-void hv_process_channel_removal(u32 relid);
+void hv_process_channel_removal(struct vmbus_channel *channel);
 
 void vmbus_setevent(struct vmbus_channel *channel);
 /*
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH 5/6] hv_uio_generic: map ringbuffer phys addr

2018-09-13 Thread Stephen Hemminger
The ring buffer is contiguous IOVA and is mapped via phys addr
for sysfs file. Use same method for the UIO mapping.

Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio_hv_generic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index ba67a5267557..53f5610c6065 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -224,10 +224,10 @@ hv_uio_probe(struct hv_device *dev,
/* mem resources */
pdata->info.mem[TXRX_RING_MAP].name = "txrx_rings";
pdata->info.mem[TXRX_RING_MAP].addr
-   = (uintptr_t)page_address(dev->channel->ringbuffer_page);
+   = 
(uintptr_t)virt_to_phys(page_address(dev->channel->ringbuffer_page));
pdata->info.mem[TXRX_RING_MAP].size
= dev->channel->ringbuffer_pagecount << PAGE_SHIFT;
-   pdata->info.mem[TXRX_RING_MAP].memtype = UIO_MEM_LOGICAL;
+   pdata->info.mem[TXRX_RING_MAP].memtype = UIO_MEM_IOVA;
 
pdata->info.mem[INT_PAGE_MAP].name = "int_page";
pdata->info.mem[INT_PAGE_MAP].addr
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: yield() and cond_resched() do not yield to another thread

2018-09-05 Thread Stephen Hemminger
On Tue, 4 Sep 2018 21:16:07 +0800
fei phung  wrote:

> Hi everyone,
> 
> I am working on https://github.com/promach/riffa/tree/full_duplex
> 
> While I modifying the linux driver, I faced some issue using yield().
> 
> Why riffa_driver.c yield() function
> https://gist.github.com/promach/7716ee8addcaa33fda140d74d1ad94d6#file-riffa_driver-c-L746
> 
> does not yield to chnl_send() thread
> https://gist.github.com/promach/7716ee8addcaa33fda140d74d1ad94d6#file-riffa_driver-c-L883
> ?
> 
> See 
> https://stackoverflow.com/questions/52166325/why-yield-and-cond-resched-do-not-yield-to-another-thread
> for more details
> ___
> devel mailing list
> de...@linuxdriverproject.org
> http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: yield() and cond_resched() do not yield to another thread

2018-09-05 Thread Stephen Hemminger
On Tue, 4 Sep 2018 21:16:07 +0800
fei phung  wrote:

> Hi everyone,
> 
> I am working on https://github.com/promach/riffa/tree/full_duplex
> 
> While I modifying the linux driver, I faced some issue using yield().
> 
> Why riffa_driver.c yield() function
> https://gist.github.com/promach/7716ee8addcaa33fda140d74d1ad94d6#file-riffa_driver-c-L746
> 
> does not yield to chnl_send() thread
> https://gist.github.com/promach/7716ee8addcaa33fda140d74d1ad94d6#file-riffa_driver-c-L883
> ?
> 
> See 
> https://stackoverflow.com/questions/52166325/why-yield-and-cond-resched-do-not-yield-to-another-thread
> for more details
> ___
> devel mailing list
> de...@linuxdriverproject.org
> http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

Using yield in modern code is almost always wrong. It has priority inversion 
and latency
issues. You are much better off using a real syncronization primitive (like 
completion or wait).
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH net-next 1/2] PCI: hv: support reporting serial number as slot information

2018-08-29 Thread Stephen Hemminger
The Hyper-V host API for PCI provides a unique "serial number" which
can be used as basis for sysfs PCI slot table. This can be useful
for cases where userspace wants to find the PCI device based on
serial number.

When an SR-IOV NIC is added, the host sends an attach message
with serial number. The kernel doesn't use the serial number, but
it is useful when doing the same thing in a userspace driver such
as the DPDK. By having /sys/bus/pci/slots/N it provides a direct
way to find the matching PCI device.

There may be some cases where serial number is not unique such
as when using GPU's. But the PCI slot infrastructure will handle
that by adding suffix "2-1" etc.

This has a side effect which may also be useful. The common udev
network device naming policy uses the slot information (rather
than PCI address). This causes udev to give shorter network device
names for VF devices.  It does not break applications or startup
because the VF device must never be configured directly.

Signed-off-by: Stephen Hemminger 
---
 drivers/pci/controller/pci-hyperv.c | 30 +
 1 file changed, 30 insertions(+)

diff --git a/drivers/pci/controller/pci-hyperv.c 
b/drivers/pci/controller/pci-hyperv.c
index c00f82cc54aa..e6a6c1146a41 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -89,6 +89,8 @@ static enum pci_protocol_version_t pci_protocol_version;
 
 #define STATUS_REVISION_MISMATCH 0xC059
 
+#define SLOT_NAME_SIZE 21
+
 /*
  * Message Types
  */
@@ -494,6 +496,7 @@ struct hv_pci_dev {
struct list_head list_entry;
refcount_t refs;
enum hv_pcichild_state state;
+   struct pci_slot *pci_slot;
struct pci_function_description desc;
bool reported_missing;
struct hv_pcibus_device *hbus;
@@ -1457,6 +1460,28 @@ static void prepopulate_bars(struct hv_pcibus_device 
*hbus)
spin_unlock_irqrestore(>device_list_lock, flags);
 }
 
+static void hv_pci_assign_slots(struct hv_pcibus_device *hbus)
+{
+   struct hv_pci_dev *hpdev;
+   char name[SLOT_NAME_SIZE];
+   unsigned long flags;
+   int slot_nr;
+
+   spin_lock_irqsave(>device_list_lock, flags);
+   list_for_each_entry(hpdev, >children, list_entry) {
+   if (hpdev->pci_slot)
+   continue;
+
+   slot_nr = PCI_SLOT(wslot_to_devfn(hpdev->desc.win_slot.slot));
+   snprintf(name, SLOT_NAME_SIZE, "%u", hpdev->desc.ser);
+   hpdev->pci_slot = pci_create_slot(hbus->pci_bus, slot_nr,
+ name, NULL);
+   if (!hpdev->pci_slot)
+   pr_warn("pci_create slot %s failed\n", name);
+   }
+   spin_unlock_irqrestore(>device_list_lock, flags);
+}
+
 /**
  * create_root_hv_pci_bus() - Expose a new root PCI bus
  * @hbus:  Root PCI bus, as understood by this driver
@@ -1480,6 +1505,7 @@ static int create_root_hv_pci_bus(struct hv_pcibus_device 
*hbus)
pci_lock_rescan_remove();
pci_scan_child_bus(hbus->pci_bus);
pci_bus_assign_resources(hbus->pci_bus);
+   hv_pci_assign_slots(hbus);
pci_bus_add_devices(hbus->pci_bus);
pci_unlock_rescan_remove();
hbus->state = hv_pcibus_installed;
@@ -1742,6 +1768,7 @@ static void pci_devices_present_work(struct work_struct 
*work)
 */
pci_lock_rescan_remove();
pci_scan_child_bus(hbus->pci_bus);
+   hv_pci_assign_slots(hbus);
pci_unlock_rescan_remove();
break;
 
@@ -1858,6 +1885,9 @@ static void hv_eject_device_work(struct work_struct *work)
list_del(>list_entry);
spin_unlock_irqrestore(>hbus->device_list_lock, flags);
 
+   if (hpdev->pci_slot)
+   pci_destroy_slot(hpdev->pci_slot);
+
memset(, 0, sizeof(ctxt));
ejct_pkt = (struct pci_eject_response *)
ejct_pkt->message_type.type = PCI_EJECTION_COMPLETE;
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH net-next 2/2] hv_netvsc: pair VF based on serial number

2018-08-29 Thread Stephen Hemminger
Matching network device based on MAC address is problematic
since a non-VF network device can be created with a duplicate MAC
address causing confusion and problems.  The VMBus API provides
a serial number that is a better matching method.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/hyperv/netvsc.c |  3 ++
 drivers/net/hyperv/netvsc_drv.c | 58 +++--
 2 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 31c3d77b4733..fe01e141c8f8 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -1203,6 +1203,9 @@ static void netvsc_send_vf(struct net_device *ndev,
 
net_device_ctx->vf_alloc = nvmsg->msg.v4_msg.vf_assoc.allocated;
net_device_ctx->vf_serial = nvmsg->msg.v4_msg.vf_assoc.serial;
+   netdev_info(ndev, "VF slot %u %s\n",
+   net_device_ctx->vf_serial,
+   net_device_ctx->vf_alloc ? "added" : "removed");
 }
 
 static  void netvsc_receive_inband(struct net_device *ndev,
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 1121a1ec407c..9dedc1463e88 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1894,20 +1894,6 @@ static void netvsc_link_change(struct work_struct *w)
rtnl_unlock();
 }
 
-static struct net_device *get_netvsc_bymac(const u8 *mac)
-{
-   struct net_device_context *ndev_ctx;
-
-   list_for_each_entry(ndev_ctx, _dev_list, list) {
-   struct net_device *dev = hv_get_drvdata(ndev_ctx->device_ctx);
-
-   if (ether_addr_equal(mac, dev->perm_addr))
-   return dev;
-   }
-
-   return NULL;
-}
-
 static struct net_device *get_netvsc_byref(struct net_device *vf_netdev)
 {
struct net_device_context *net_device_ctx;
@@ -2036,26 +2022,48 @@ static void netvsc_vf_setup(struct work_struct *w)
rtnl_unlock();
 }
 
+/* Find netvsc by VMBus serial number.
+ * The PCI hyperv controller records the serial number as the slot.
+ */
+static struct net_device *get_netvsc_byslot(const struct net_device *vf_netdev)
+{
+   struct device *parent = vf_netdev->dev.parent;
+   struct net_device_context *ndev_ctx;
+   struct pci_dev *pdev;
+
+   if (!parent || !dev_is_pci(parent))
+   return NULL; /* not a PCI device */
+
+   pdev = to_pci_dev(parent);
+   if (!pdev->slot) {
+   netdev_notice(vf_netdev, "no PCI slot information\n");
+   return NULL;
+   }
+
+   list_for_each_entry(ndev_ctx, _dev_list, list) {
+   if (!ndev_ctx->vf_alloc)
+   continue;
+
+   if (ndev_ctx->vf_serial == pdev->slot->number)
+   return hv_get_drvdata(ndev_ctx->device_ctx);
+   }
+
+   netdev_notice(vf_netdev,
+ "no netdev found for slot %u\n", pdev->slot->number);
+   return NULL;
+}
+
 static int netvsc_register_vf(struct net_device *vf_netdev)
 {
-   struct net_device *ndev;
struct net_device_context *net_device_ctx;
-   struct device *pdev = vf_netdev->dev.parent;
struct netvsc_device *netvsc_dev;
+   struct net_device *ndev;
int ret;
 
if (vf_netdev->addr_len != ETH_ALEN)
return NOTIFY_DONE;
 
-   if (!pdev || !dev_is_pci(pdev) || dev_is_pf(pdev))
-   return NOTIFY_DONE;
-
-   /*
-* We will use the MAC address to locate the synthetic interface to
-* associate with the VF interface. If we don't find a matching
-* synthetic interface, move on.
-*/
-   ndev = get_netvsc_bymac(vf_netdev->perm_addr);
+   ndev = get_netvsc_byslot(vf_netdev);
if (!ndev)
return NOTIFY_DONE;
 
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH net-next 0/2] hv_netvsc: associate VF and PV device by serial number

2018-08-29 Thread Stephen Hemminger
The Hyper-V implementation of PCI controller has concept of 32 bit serial number
(not to be confused with PCI-E serial number).  This value is sent in the 
protocol
from the host to indicate SR-IOV VF device is attached to a synthetic NIC.

Using the serial number (instead of MAC address) to associate the two devices
avoids lots of potential problems when there are duplicate MAC addresses from
tunnels or layered devices.

The patch set is broken into two parts, one is for the PCI controller
and the other is for the netvsc device. Normally, these go through different
trees but sending them together here for better review. The PCI changes
were submitted previously, but the main review comment was "why do you
need this?". This is why.

Stephen Hemminger (2):
  PCI: hv: support reporting serial number as slot information
  hv_netvsc: pair VF based on serial number

 drivers/net/hyperv/netvsc.c |  3 ++
 drivers/net/hyperv/netvsc_drv.c | 58 -
 drivers/pci/controller/pci-hyperv.c | 30 +++
 3 files changed, 66 insertions(+), 25 deletions(-)

-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH] hv/netvsc: Fix NULL dereference at single queue mode fallback

2018-08-14 Thread Stephen Hemminger
On Tue, 14 Aug 2018 19:10:50 +0200
Takashi Iwai  wrote:

> The recent commit 916c5e1413be ("hv/netvsc: fix handling of fallback
> to single queue mode") tried to fix the fallback behavior to a single
> queue mode, but it changed the function to return zero incorrectly,
> while the function should return an object pointer.  Eventually this
> leads to a NULL dereference at the callers that expect non-NULL
> value.
> 
> Fix it by returning the proper net_device object.
> 
> Fixes: 916c5e1413be ("hv/netvsc: fix handling of fallback to single queue 
> mode")
> Cc: 
> Signed-off-by: Takashi Iwai 

Reviewed-by: Stephen Hemminger 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH 2/5] vmbus: add driver_override support

2018-08-14 Thread Stephen Hemminger
On Mon, 13 Aug 2018 19:30:50 +
"Michael Kelley (EOSG)"  wrote:

> > +/*
> > + * Return a matching hv_vmbus_device_id pointer.
> > + * If there is no match, return NULL.
> > + */
> > +static const struct hv_vmbus_device_id *hv_vmbus_get_id(struct hv_driver 
> > *drv,
> > +   struct hv_device *dev)
> > +{
> > +   const uuid_le *guid = >dev_type;
> > +   const struct hv_vmbus_device_id *id;
> > 
> > -   return NULL;
> > +   /* When driver_override is set, only bind to the matching driver */
> > +   if (dev->driver_override && strcmp(dev->driver_override, drv->name))
> > +   return NULL;  
> 
> This function needs to be covered by the device lock, so that
> dev->driver_override can't be set to NULL and the memory freed
> during the above 'if' statement.  When called from vmbus_probe(),
> the device lock is held, so it's good. But when called from
> vmbus_match(), the device lock may not be held: consider the path
> __driver_attach() -> driver_match_device() -> vmbus_match().

The function hv_vmbus_get_id is called from that path.
i.e. __device_attach -> driver-match_device -> vmbus_match.
and __device_attach always does:
device_lock(dev);

The code in driver _override_store uses the same device_lock 
when storing the new value.

This is same locking as is done in pci-sysfs.c

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH 2/5] vmbus: add driver_override support

2018-08-13 Thread Stephen Hemminger
On Mon, 13 Aug 2018 19:30:50 +
"Michael Kelley (EOSG)"  wrote:

> From: k...@linuxonhyperv.com   Sent: Friday, August 
> 10, 2018 4:06 PM
> 
> > From: Stephen Hemminger 
> > 
> > Add support for overriding the default driver for a VMBus device
> > in the same way that it can be done for PCI devices. This patch
> > adds the /sys/bus/vmbus/devices/.../driver_override file
> > and the logic for matching.
> > 
> > This is used by driverctl tool to do driver override.
> > https://gitlab.com/driverctl/driverctl
> > 
> > Signed-off-by: Stephen Hemminger 
> > Signed-off-by: K. Y. Srinivasan 
> > ---
> > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> > index b1b548a21f91..e6d8fdac6d8b 100644
> > --- a/drivers/hv/vmbus_drv.c
> > +++ b/drivers/hv/vmbus_drv.c
> > @@ -498,6 +498,54 @@ static ssize_t device_show(struct device *dev,
> >  }
> >  static DEVICE_ATTR_RO(device);
> > 
> > +static ssize_t driver_override_store(struct device *dev,
> > +struct device_attribute *attr,
> > +const char *buf, size_t count)
> > +{
> > +   struct hv_device *hv_dev = device_to_hv_device(dev);
> > +   char *driver_override, *old, *cp;
> > +
> > +   /* We need to keep extra room for a newline */
> > +   if (count >= (PAGE_SIZE - 1))
> > +   return -EINVAL;  
> 
> Does 'count' actually have a relationship to PAGE_SIZE, or
> is PAGE_SIZE just used as an arbitrary size limit?  I'm
> wondering what happens on ARM64 with a 64K page size,
> for example.  If it's just arbitrary, coding such a constant
> would be better.

This comes from original code how sysfs works.
Sysfs uses PAGE_SIZE for string buffers on store. This code
snippet was cloned from PCI version of same thing.

> > +/*
> > + * Return a matching hv_vmbus_device_id pointer.
> > + * If there is no match, return NULL.
> > + */
> > +static const struct hv_vmbus_device_id *hv_vmbus_get_id(struct hv_driver 
> > *drv,
> > +   struct hv_device *dev)
> > +{
> > +   const uuid_le *guid = >dev_type;
> > +   const struct hv_vmbus_device_id *id;
> > 
> > -   return NULL;
> > +   /* When driver_override is set, only bind to the matching driver */
> > +   if (dev->driver_override && strcmp(dev->driver_override, drv->name))
> > +   return NULL;  
> 
> This function needs to be covered by the device lock, so that
> dev->driver_override can't be set to NULL and the memory freed
> during the above 'if' statement.  When called from vmbus_probe(),
> the device lock is held, so it's good. But when called from
> vmbus_match(), the device lock may not be held: consider the path
> __driver_attach() -> driver_match_device() -> vmbus_match().

The patch clones existing locking model from PCI.
So either both are broken, or somehow vmbus is behaving differently.
I will investigate.





___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH] vmbus: don't return values for uninitalized channels

2018-08-13 Thread Stephen Hemminger
For unsupported device types, the vmbus channel ringbuffer is never
initialized, and therefore reading the sysfs files will return garbage
or cause a kernel OOPS.

Fixes: c2e5df616e1a ("vmbus: add per-channel sysfs info")
Signed-off-by: Stephen Hemminger 
---
 drivers/hv/vmbus_drv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b10fe26c4891..c9a466be7709 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1178,6 +1178,9 @@ static ssize_t vmbus_chan_attr_show(struct kobject *kobj,
if (!attribute->show)
return -EIO;
 
+   if (chan->state != CHANNEL_OPENED_STATE)
+   return -EINVAL;
+
return attribute->show(chan, buf);
 }
 
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH] uio_hv_generic: drop #ifdef DEBUG

2018-08-09 Thread Stephen Hemminger
DEBUG is leftover from the development phase, remove it.

Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio_hv_generic.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index 9e7d622b4326..a08860260f55 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -17,7 +17,6 @@
  * # echo -n "ed963694-e847-4b2a-85af-bc9cfc11d6f3" \
  *> /sys/bus/vmbus/drivers/uio_hv_generic/bind
  */
-#define DEBUG 1
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include 
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH] uio_hv_generic: increase size of receive and send buffers

2018-08-09 Thread Stephen Hemminger
When using DPDK there is significant performance boost by using
the largest possible send and receive buffer area.

Unfortunately, with UIO model there is not a good way to configure
this at run time. But it is okay to have a bigger buffer available
even if application only decides to use a smaller piece of it.

Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio_hv_generic.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index e401be8321ab..9e7d622b4326 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -33,13 +33,13 @@
 
 #include "../hv/hyperv_vmbus.h"
 
-#define DRIVER_VERSION "0.02.0"
+#define DRIVER_VERSION "0.02.1"
 #define DRIVER_AUTHOR  "Stephen Hemminger "
 #define DRIVER_DESC"Generic UIO driver for VMBus devices"
 
 #define HV_RING_SIZE512/* pages */
-#define SEND_BUFFER_SIZE (15 * 1024 * 1024)
-#define RECV_BUFFER_SIZE (15 * 1024 * 1024)
+#define SEND_BUFFER_SIZE (16 * 1024 * 1024)
+#define RECV_BUFFER_SIZE (31 * 1024 * 1024)
 
 /*
  * List of resources to be mapped to user space
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v2] vmbus: add driver_override support

2018-08-08 Thread Stephen Hemminger
Add support for overriding the default driver for a VMBus device
in the same way that it can be done for PCI devices. This patch
adds the /sys/bus/vmbus/devices/.../driver_override file
and the logic for matching.

This is used by driverctl tool to do driver override.
https://gitlab.com/driverctl/driverctl

Signed-off-by: Stephen Hemminger 
---
v2 - no changes since last version.
 this patch seems to have gotten lost.
 driver development list really needs to start using patchwork!

 Documentation/ABI/testing/sysfs-bus-vmbus |  21 
 drivers/hv/vmbus_drv.c| 115 ++
 include/linux/hyperv.h|   1 +
 3 files changed, 118 insertions(+), 19 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-vmbus

diff --git a/Documentation/ABI/testing/sysfs-bus-vmbus 
b/Documentation/ABI/testing/sysfs-bus-vmbus
new file mode 100644
index ..91e6c065973c
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-vmbus
@@ -0,0 +1,21 @@
+What:  /sys/bus/vmbus/devices/.../driver_override
+Date:  August 2019
+Contact:   Stephen Hemminger 
+Description:
+   This file allows the driver for a device to be specified which
+   will override standard static and dynamic ID matching.  When
+   specified, only a driver with a name matching the value written
+   to driver_override will have an opportunity to bind to the
+   device.  The override is specified by writing a string to the
+   driver_override file (echo uio_hv_generic > driver_override) and
+   may be cleared with an empty string (echo > driver_override).
+   This returns the device to standard matching rules binding.
+   Writing to driver_override does not automatically unbind the
+   device from its current driver or make any attempt to
+   automatically load the specified driver.  If no driver with a
+   matching name is currently loaded in the kernel, the device
+   will not bind to any driver.  This also allows devices to
+   opt-out of driver binding using a driver_override name such as
+   "none".  Only a single driver may be specified in the override,
+   there is no support for parsing delimiters.
+
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b1b548a21f91..e6d8fdac6d8b 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -498,6 +498,54 @@ static ssize_t device_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(device);
 
+static ssize_t driver_override_store(struct device *dev,
+struct device_attribute *attr,
+const char *buf, size_t count)
+{
+   struct hv_device *hv_dev = device_to_hv_device(dev);
+   char *driver_override, *old, *cp;
+
+   /* We need to keep extra room for a newline */
+   if (count >= (PAGE_SIZE - 1))
+   return -EINVAL;
+
+   driver_override = kstrndup(buf, count, GFP_KERNEL);
+   if (!driver_override)
+   return -ENOMEM;
+
+   cp = strchr(driver_override, '\n');
+   if (cp)
+   *cp = '\0';
+
+   device_lock(dev);
+   old = hv_dev->driver_override;
+   if (strlen(driver_override)) {
+   hv_dev->driver_override = driver_override;
+   } else {
+   kfree(driver_override);
+   hv_dev->driver_override = NULL;
+   }
+   device_unlock(dev);
+
+   kfree(old);
+
+   return count;
+}
+
+static ssize_t driver_override_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct hv_device *hv_dev = device_to_hv_device(dev);
+   ssize_t len;
+
+   device_lock(dev);
+   len = snprintf(buf, PAGE_SIZE, "%s\n", hv_dev->driver_override);
+   device_unlock(dev);
+
+   return len;
+}
+static DEVICE_ATTR_RW(driver_override);
+
 /* Set up per device attributes in /sys/bus/vmbus/devices/ */
 static struct attribute *vmbus_dev_attrs[] = {
_attr_id.attr,
@@ -528,6 +576,7 @@ static struct attribute *vmbus_dev_attrs[] = {
_attr_channel_vp_mapping.attr,
_attr_vendor.attr,
_attr_device.attr,
+   _attr_driver_override.attr,
NULL,
 };
 ATTRIBUTE_GROUPS(vmbus_dev);
@@ -563,17 +612,26 @@ static inline bool is_null_guid(const uuid_le *guid)
return true;
 }
 
-/*
- * Return a matching hv_vmbus_device_id pointer.
- * If there is no match, return NULL.
- */
-static const struct hv_vmbus_device_id *hv_vmbus_get_id(struct hv_driver *drv,
-   const uuid_le *guid)
+static const struct hv_vmbus_device_id *
+hv_vmbus_dev_match(const struct hv_vmbus_device_id *id, const uuid_le *guid)
+
+{
+   if (id == NULL)
+

Re: [PATCH v4 net-next] hv_netvsc: Add per-cpu ethtool stats for netvsc

2018-07-30 Thread Stephen Hemminger
On Mon, 30 Jul 2018 17:09:45 +
Yidong Ren  wrote:

> From: Yidong Ren 
> 
> This patch implements following ethtool stats fields for netvsc:
> cpu_tx/rx_packets/bytes
> cpu_vf_tx/rx_packets/bytes
> 
> Corresponding per-cpu counters already exist in current code. Exposing
> these counters will help troubleshooting performance issues.
> 
> for_each_present_cpu() was used instead of for_each_possible_cpu().
> for_each_possible_cpu() would create very long and useless output.
> It is still being used for internal buffer, but not for ethtool
> output.
> 
> There could be an overflow if cpu was added between ethtool
> call netvsc_get_sset_count() and netvsc_get_ethtool_stats() and
> netvsc_get_strings(). (still safe if cpu was removed)
> ethtool makes these three function calls separately.
> As long as we use ethtool, I can't see any clean solution.
> 
> Currently and in foreseeable short term, Hyper-V doesn't support
> cpu hot-plug. Plus, ethtool is for admin use. Unlikely the admin
> would perform such combo operations.
> 
> Signed-off-by: Yidong Ren 

Reviewed-by: Stephen Hemminger 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v4] Xilinx AXI-Stream FIFO v4.1 IP core driver

2018-07-22 Thread Stephen Hemminger
On Sat, 21 Jul 2018 20:14:55 -0400
Jacob Feder  wrote:

> diff --git a/drivers/staging/axis-fifo/Kconfig 
> b/drivers/staging/axis-fifo/Kconfig
> new file mode 100644
> index 000..77d5701
> --- /dev/null
> +++ b/drivers/staging/axis-fifo/Kconfig
> @@ -0,0 +1,9 @@
> +#
> +# "Xilinx AXI-Stream FIFO IP core driver"
> +#
> +config XIL_AXIS_FIFO
> + tristate "Xilinx AXI-Stream FIFO IP core driver"
> + default n
> + help
> +   This adds support for the Xilinx AXI-Stream
> +   FIFO IP core driver.
> \ No newline at end of file

You used an editor like emacs that doesn't always add a newline.
Please fix.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH] uio: add SPDX license tags

2018-07-21 Thread Stephen Hemminger
For those without any license text present or short reference
to GPL, add SPDX tag.

Signed-off-by: Stephen Hemminger 
---
 drivers/uio/uio_cif.c   | 4 +---
 drivers/uio/uio_fsl_elbc_gpcm.c | 1 +
 drivers/uio/uio_hv_generic.c| 4 +---
 drivers/uio/uio_netx.c  | 3 +--
 drivers/uio/uio_pci_generic.c   | 3 +--
 drivers/uio/uio_sercos3.c   | 1 +
 6 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/uio/uio_cif.c b/drivers/uio/uio_cif.c
index 30f533ce3758..ab60186f9759 100644
--- a/drivers/uio/uio_cif.c
+++ b/drivers/uio/uio_cif.c
@@ -1,11 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * UIO Hilscher CIF card driver
  *
  * (C) 2007 Hans J. Koch 
  * Original code (C) 2005 Benedikt Spranger 
- *
- * Licensed under GPL version 2 only.
- *
  */
 
 #include 
diff --git a/drivers/uio/uio_fsl_elbc_gpcm.c b/drivers/uio/uio_fsl_elbc_gpcm.c
index b55191335d90..bbc17effae5e 100644
--- a/drivers/uio/uio_fsl_elbc_gpcm.c
+++ b/drivers/uio/uio_fsl_elbc_gpcm.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /* uio_fsl_elbc_gpcm: UIO driver for eLBC/GPCM peripherals
 
Copyright (C) 2014 Linutronix GmbH
diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index c690d100adcd..e401be8321ab 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -1,12 +1,10 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * uio_hv_generic - generic UIO driver for VMBus
  *
  * Copyright (c) 2013-2016 Brocade Communications Systems, Inc.
  * Copyright (c) 2016, Microsoft Corporation.
  *
- *
- * This work is licensed under the terms of the GNU GPL, version 2.
- *
  * Since the driver does not declare any device ids, you must allocate
  * id and bind the device to the driver yourself.  For example:
  *
diff --git a/drivers/uio/uio_netx.c b/drivers/uio/uio_netx.c
index 4c345db8b016..9ae29ffde410 100644
--- a/drivers/uio/uio_netx.c
+++ b/drivers/uio/uio_netx.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * UIO driver for Hilscher NetX based fieldbus cards (cifX, comX).
  * See http://www.hilscher.com for details.
@@ -5,8 +6,6 @@
  * (C) 2007 Hans J. Koch 
  * (C) 2008 Manuel Traut 
  *
- * Licensed under GPL version 2 only.
- *
  */
 
 #include 
diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
index a56fdf972dbe..8773e373ffe5 100644
--- a/drivers/uio/uio_pci_generic.c
+++ b/drivers/uio/uio_pci_generic.c
@@ -1,10 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /* uio_pci_generic - generic UIO driver for PCI 2.3 devices
  *
  * Copyright (C) 2009 Red Hat, Inc.
  * Author: Michael S. Tsirkin 
  *
- * This work is licensed under the terms of the GNU GPL, version 2.
- *
  * Since the driver does not declare any device ids, you must allocate
  * id and bind the device to the driver yourself.  For example:
  *
diff --git a/drivers/uio/uio_sercos3.c b/drivers/uio/uio_sercos3.c
index 9cfdfcafa262..9658a0887fee 100644
--- a/drivers/uio/uio_sercos3.c
+++ b/drivers/uio/uio_sercos3.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /* sercos3: UIO driver for the Automata Sercos III PCI card
 
Copyright (C) 2008 Linutronix GmbH
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v2] vmbus: add numa_node to sysfs

2018-07-20 Thread Stephen Hemminger
Being able to find the numa_node for a device is useful for userspace
drivers (DPDK) and also for diagnosing performance issues.  This makes
vmbus similar to pci.

Signed-off-by: Stephen Hemminger 
---
v2 - add sysfs doc

 Documentation/ABI/stable/sysfs-bus-vmbus |  7 +++
 drivers/hv/vmbus_drv.c   | 17 +
 2 files changed, 24 insertions(+)

diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus 
b/Documentation/ABI/stable/sysfs-bus-vmbus
index 3eaffbb2d468..3fed8fdb873d 100644
--- a/Documentation/ABI/stable/sysfs-bus-vmbus
+++ b/Documentation/ABI/stable/sysfs-bus-vmbus
@@ -42,6 +42,13 @@ Contact: K. Y. Srinivasan 
 Description:   The 16 bit vendor ID of the device
 Users: tools/hv/lsvmbus and user level RDMA libraries
 
+What:  /sys/bus/vmbus/devices//numa_node
+Date:  Jul 2018
+KernelVersion: 4.19
+Contact:   Stephen Hemminger 
+Description:   This NUMA node to which the VMBUS device is
+   attached, or -1 if the node is unknown.
+
 What:  /sys/bus/vmbus/devices//channels/
 Date:  September. 2017
 KernelVersion: 4.14
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b10fe26c4891..be754b9ad33a 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -208,6 +208,20 @@ static ssize_t modalias_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(modalias);
 
+#ifdef CONFIG_NUMA
+static ssize_t numa_node_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+   struct hv_device *hv_dev = device_to_hv_device(dev);
+
+   if (!hv_dev->channel)
+   return -ENODEV;
+
+   return sprintf(buf, "%d\n", hv_dev->channel->numa_node);
+}
+static DEVICE_ATTR_RO(numa_node);
+#endif
+
 static ssize_t server_monitor_pending_show(struct device *dev,
   struct device_attribute *dev_attr,
   char *buf)
@@ -490,6 +504,9 @@ static struct attribute *vmbus_dev_attrs[] = {
_attr_class_id.attr,
_attr_device_id.attr,
_attr_modalias.attr,
+#ifdef CONFIG_NUMA
+   _attr_numa_node.attr,
+#endif
_attr_server_monitor_pending.attr,
_attr_client_monitor_pending.attr,
_attr_server_monitor_latency.attr,
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH] vmbus: add numa_node to sysfs

2018-07-20 Thread Stephen Hemminger
Having the numa_node for a device is useful for userspace drivers
(ie DPDK) and also for diagnosing performance issues.  This makes
vmbus similar to pci.

Signed-off-by: Stephen Hemminger 
---
 drivers/hv/vmbus_drv.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b10fe26c4891..be754b9ad33a 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -208,6 +208,20 @@ static ssize_t modalias_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(modalias);
 
+#ifdef CONFIG_NUMA
+static ssize_t numa_node_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+   struct hv_device *hv_dev = device_to_hv_device(dev);
+
+   if (!hv_dev->channel)
+   return -ENODEV;
+
+   return sprintf(buf, "%d\n", hv_dev->channel->numa_node);
+}
+static DEVICE_ATTR_RO(numa_node);
+#endif
+
 static ssize_t server_monitor_pending_show(struct device *dev,
   struct device_attribute *dev_attr,
   char *buf)
@@ -490,6 +504,9 @@ static struct attribute *vmbus_dev_attrs[] = {
_attr_class_id.attr,
_attr_device_id.attr,
_attr_modalias.attr,
+#ifdef CONFIG_NUMA
+   _attr_numa_node.attr,
+#endif
_attr_server_monitor_pending.attr,
_attr_client_monitor_pending.attr,
_attr_server_monitor_latency.attr,
-- 
2.18.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v3] Xilinx AXI-Stream FIFO v4.1 IP core driver

2018-07-20 Thread Stephen Hemminger
On Thu, 19 Jul 2018 21:45:31 -0400
Jacob Feder  wrote:

> First I run "make menuconfig" and select my driver in "device drivers" >
> "staging". If I run "make" or "make all" or
> "make drivers/staging/axis-fifo" everything compiles without errors or 
> warnings even if I put blatant syntax errors in my code.
> What am I missing here?
> 
> Thanks,
> Jacob

Try building with gcc-8 and also with W=1 to see a more complete
set of warnings.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v3] Xilinx AXI-Stream FIFO v4.1 IP core driver

2018-07-19 Thread Stephen Hemminger
On Thu, 19 Jul 2018 13:16:57 +0300
Dan Carpenter  wrote:

> I'm getting some compile warnings where we us %u instead of %lu for
> size_t.
> 
> We also need a README explaining what else needs to be done before this
> can be moved out of staging into the normal part of the kernel.

Use %zu to print size_t variables.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v3] Xilinx AXI-Stream FIFO v4.1 IP core driver

2018-07-19 Thread Stephen Hemminger
On Wed, 18 Jul 2018 22:46:34 -0400
Jacob Feder  wrote:

> +MODULE_DESCRIPTION("Xilinx AXI-Stream FIFO v4.1 IP core driver\n\n"
> +"This IP core has read and write AXI-Stream FIFOs, the contents of which 
> can\n"
> +"be accessed from the AXI4 memory-mapped interface. This is useful for\n"
> +"transferring data from a processor into the FPGA fabric. The driver 
> creates\n"
> +"a character device that can be read/written to with standard\n"
> +"open/read/write/close.\n\n"
> +"See Xilinx PG080 document for IP details.\n"
> +"https://www.xilinx.com/support/documentation/ip_documentation/axi_fifo_mm_s/v4_1/pg080-axi-fifo-mm-s.pdf\n\n;
> +"The driver currently supports only store-forward mode with a 32-bit\n"
> +"AXI4 Lite interface. DOES NOT support:\n"
> +"- cut-through mode\n"
> +"- AXI4 (non-lite)");

One line for MODULE_DESCRIPTION it is for the modinfo tool, not marketing or 
documentation.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [bug report] netvsc: use ERR_PTR to avoid dereference issues

2018-07-10 Thread Stephen Hemminger
On Tue, 3 Jul 2018 15:01:42 +0300
Dan Carpenter  wrote:

> Hello stephen hemminger,
> 
> The patch 9749fed5d43d: "netvsc: use ERR_PTR to avoid dereference
> issues" from Jul 19, 2017, leads to the following static checker
> warning:
> 
>   drivers/net/hyperv/rndis_filter.c:1344 rndis_filter_device_add()
>   warn: passing zero to 'ERR_PTR'
> 
> drivers/net/hyperv/rndis_filter.c
>   1300  ret = rndis_filter_query_device(rndis_device, net_device,
>   1301  
> OID_GEN_RECEIVE_SCALE_CAPABILITIES,
>   1302  , _size);
>   1303  if (ret || rsscap.num_recv_que < 2)
>^^^
> I think this used to be a success path, but now it leads to a NULL
> dereference in the caller.  I guess we should set "ret = -ENOMEM;" here?
> 
>   1304  goto out;
>   1305  
>   1306  /* This guarantees that num_possible_rss_qs <= 
> num_online_cpus */
>   1307  num_possible_rss_qs = min_t(u32, num_online_cpus(),
>   1308  rsscap.num_recv_que);
>   1309  
>   1310  net_device->max_chn = min_t(u32, VRSS_CHANNEL_MAX, 
> num_possible_rss_qs);
>   1311  
>   1312  /* We will use the given number of channels if available. */
>   1313  net_device->num_chn = min(net_device->max_chn, 
> device_info->num_chn);
>   1314  
>   1315  for (i = 0; i < ITAB_NUM; i++)
>   1316  rndis_device->rx_table[i] = 
> ethtool_rxfh_indir_default(
>   1317  i, 
> net_device->num_chn);
>   1318  
>   1319  atomic_set(_device->open_chn, 1);
>   1320  vmbus_set_sc_create_callback(dev->channel, netvsc_sc_open);
>   1321  
>   1322  for (i = 1; i < net_device->num_chn; i++) {
>   1323  ret = netvsc_alloc_recv_comp_ring(net_device, i);
>   1324  if (ret) {
>   1325  while (--i != 0)
>   1326  
> vfree(net_device->chan_table[i].mrc.slots);
>   1327  goto out;
>   1328  }
>   1329  }
>   1330  
>   1331  for (i = 1; i < net_device->num_chn; i++)
>   1332  netif_napi_add(net, _device->chan_table[i].napi,
>   1333 netvsc_poll, NAPI_POLL_WEIGHT);
>   1334  
>   1335  return net_device;
>   1336  
>   1337  out:
>   1338  /* setting up multiple channels failed */
>   1339  net_device->max_chn = 1;
>   1340  net_device->num_chn = 1;
>   1341  
>   1342  err_dev_remv:
>   1343  rndis_filter_device_remove(dev, net_device);
S
>   1344  return ERR_PTR(ret);
> 
> regards,
> dan carpenter

It looks like there is a missing return after out:
all the fall throughs in that path should just work with single queue.
The current code would fail on old versions of Hyper-V.


This should be enough to fix the problem:

diff --git a/drivers/net/hyperv/rndis_filter.c 
b/drivers/net/hyperv/rndis_filter.c
index 9b4e3c3787e5..2a5209f23f29 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -1338,6 +1338,7 @@ struct netvsc_device *rndis_filter_device_add(struct 
hv_device *dev,
/* setting up multiple channels failed */
net_device->max_chn = 1;
net_device->num_chn = 1;
+   return net_device;
 
 err_dev_remv:
rndis_filter_device_remove(dev, net_device);



___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH net] hv_netvsc: Fix napi reschedule while receive completion is busy

2018-07-09 Thread Stephen Hemminger
On Mon,  9 Jul 2018 16:43:19 +
Haiyang Zhang  wrote:

> From: Haiyang Zhang 
> 
> If out ring is full temporarily and receive completion cannot go out,
> we may still need to reschedule napi if other conditions are met.
> Otherwise the napi poll might be stopped forever, and cause network
> disconnect.
> 
> Fixes: 7426b1a51803 ("netvsc: optimize receive completions")
> Signed-off-by: Haiyang Zhang 

Maybe call reschedule?

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 8e9d0ee1572b..b5e92342e422 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -1285,14 +1285,20 @@ int netvsc_poll(struct napi_struct *napi, int budget)
nvchan->desc = hv_pkt_iter_next(channel, nvchan->desc);
}
 
-   /* If send of pending receive completions suceeded
-*   and did not exhaust NAPI budget this time
+   /* If all the receive completions could not be sent,
+* then we want to be scheduled again.
+*/
+   if (unlikely(send_recv_completions(ndev, net_device, nvchan))) {
+   napi_reschedule(napi);
+   goto out;
+   }
+
+   /* if did not exhaust NAPI budget this time
 *   and not doing busy poll
 * then re-enable host interrupts
 * and reschedule if ring is not empty.
 */
-   if (send_recv_completions(ndev, net_device, nvchan) == 0 &&
-   work_done < budget &&
+   if (work_done < budget &&
napi_complete_done(napi, work_done) &&
hv_end_read(>inbound) &&
napi_schedule_prep(napi)) {
@@ -1300,6 +1306,7 @@ int netvsc_poll(struct napi_struct *napi, int budget)
__napi_schedule(napi);
}
 
+out:
/* Driver may overshoot since multiple packets per descriptor */
return min(work_done, budget);
 }


___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH net] hv_netvsc: Fix napi reschedule while receive completion is busy

2018-07-09 Thread Stephen Hemminger
On Mon,  9 Jul 2018 16:43:19 +
Haiyang Zhang  wrote:

> From: Haiyang Zhang 
> 
> If out ring is full temporarily and receive completion cannot go out,
> we may still need to reschedule napi if other conditions are met.
> Otherwise the napi poll might be stopped forever, and cause network
> disconnect.
> 
> Fixes: 7426b1a51803 ("netvsc: optimize receive completions")
> Signed-off-by: Haiyang Zhang 
> ---
>  drivers/net/hyperv/netvsc.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
> index 8e9d0ee1572b..caaf5054f446 100644
> --- a/drivers/net/hyperv/netvsc.c
> +++ b/drivers/net/hyperv/netvsc.c
> @@ -1285,14 +1285,14 @@ int netvsc_poll(struct napi_struct *napi, int budget)
>   nvchan->desc = hv_pkt_iter_next(channel, nvchan->desc);
>   }
>  
> - /* If send of pending receive completions suceeded
> -  *   and did not exhaust NAPI budget this time
> + send_recv_completions(ndev, net_device, nvchan);
> +
> + /* If it did not exhaust NAPI budget this time
>*   and not doing busy poll
>* then re-enable host interrupts
>* and reschedule if ring is not empty.
>*/
> - if (send_recv_completions(ndev, net_device, nvchan) == 0 &&
> - work_done < budget &&
> + if (work_done < budget &&
>   napi_complete_done(napi, work_done) &&
>   hv_end_read(>inbound) &&
>   napi_schedule_prep(napi)) {

This patch doesn't look right. I think the existing code works
as written.

If send_receive_completions is unable to send because ring is full
then vmbus_sendpacket will return -EBUSY which gets returns
from send_receive_completions.  Because the return is non-zero,
the driver will not call napi_complete_done.
Since napi_complete_done was not called, NAPI will reschedule
the napi poll routine.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: Incomplete fix for be9c798 / 7b2ee50 hv_netvsc: common detach logic?

2018-06-19 Thread Stephen Hemminger
On Wed, 20 Jun 2018 08:01:50 +0900
Greg KH  wrote:

> On Tue, Jun 19, 2018 at 03:19:41PM -0400, Thomas Walker wrote:
> > Upon updating some internal kernels from 4.14.18 to 4.14.49, our hyper-v 
> > guests failed to bring up network interfaces on boot, logging "A link 
> > change request failed with some changes committed already. Interface eth0 
> > may have been left with an inconsistent configuration, please check."  
> > Running 'ifconfig eth0 up' appears to fix the problem temporarily so I went 
> > about bisecting which landed on:
> > 
> > commit be9c798d0d13ae609a91177323ac816545c39d28
> > Author: Stephen Hemminger 
> > Date:   Mon May 14 15:32:18 2018 -0700
> > 
> > hv_netvsc: common detach logic
> > 
> > [ Commit 7b2ee50c0cd513a176a26a71f2989facdd75bfea upstream. ]
> > 
> > Make common function for detaching internals of device
> > during changes to MTU and RSS. Make sure no more packets
> > are transmitted and all packets have been received before
> > doing device teardown.
> > 
> > Change the wait logic to be common and use usleep_range().
> > 
> > Changes transmit enabling logic so that transmit queues are disabled
> > during the period when lower device is being changed. And enabled
> > only after sub channels are setup. This avoids issue where it could
> > be that a packet was being sent while subchannel was not initialized.
> > 
> > Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
> > Signed-off-by: Stephen Hemminger 
> > Signed-off-by: David S. Miller 
> > Signed-off-by: Greg Kroah-Hartman 
> > 
> > 
> > The removal of which does indeed fix the problem.  That led me to:
> > 
> > commit 52acf73b6e9a6962045feb2ba5a8921da2201915
> > Author: Dexuan Cui 
> > Date:   Wed Jun 6 21:32:51 2018 +
> > 
> > hv_netvsc: Fix a network regression after ifdown/ifup
> > 
> > Recently people reported the NIC stops working after
> > "ifdown eth0; ifup eth0". It turns out in this case the TX queues are 
> > not
> > enabled, after the refactoring of the common detach logic: when the NIC
> > has sub-channels, usually we enable all the TX queues after all
> > sub-channels are set up: see rndis_set_subchannel() ->
> > netif_device_attach(), but in the case of "ifdown eth0; ifup eth0" where
> > the number of channels doesn't change, we also must make sure the TX 
> > queues
> > are enabled. The patch fixes the regression.
> > 
> > Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic")
> > Signed-off-by: Dexuan Cui 
> > Cc: Stephen Hemminger 
> > Cc: K. Y. Srinivasan 
> > Cc: Haiyang Zhang 
> > Signed-off-by: David S. Miller 
> > 
> > Which sounded very promising, but does not seem to fully fix the issue.  
> > Doing some more digging, I was able to determine that the message coincides 
> > with 'ip link set dev eth0 mtu 1300 up' very shortly (>~1 second) after the 
> > hv_netvsc driver loads.  If I delay the mtu change until well after the 
> > driver loads, everything works fine.  If I unload hv_netvsc and then reload 
> > it and apply the mtu change immediately, the failure re-occurs.  So 
> > something is racy here, and the above doesn't entirely address it.
> > 
> > I'm happy to test out any suggested patches and/or do additional debugging 
> > if anyone has any suggestions.
> > 
> > (oh, and I did also try 4.18-rc1 and the problem still persists)  
> 
> Always cc: the authors of the patch you are having problems with, along
> with the mailing list for the networking subsystem, so that the right
> people know to look at this.

How are you changing the MTU? and starting the device.
When MTU changes the device has to reconnect with the host. This takes a small 
amount of time
and no changes to device state are possible then.

If MTU change happens and at the same time some other script tries to bring up 
the connection,
then that script will get an error.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: Incomplete fix for be9c798 / 7b2ee50 hv_netvsc: common detach logic?

2018-06-19 Thread Stephen Hemminger
On Wed, 20 Jun 2018 08:01:50 +0900
Greg KH  wrote:

> On Tue, Jun 19, 2018 at 03:19:41PM -0400, Thomas Walker wrote:
> > Upon updating some internal kernels from 4.14.18 to 4.14.49, our hyper-v 
> > guests failed to bring up network interfaces on boot, logging "A link 
> > change request failed with some changes committed already. Interface eth0 
> > may have been left with an inconsistent configuration, please check."  
> > Running 'ifconfig eth0 up' appears to fix the problem temporarily so I went 
> > about bisecting which landed on:
> > 
> > commit be9c798d0d13ae609a91177323ac816545c39d28
> > Author: Stephen Hemminger 
> > Date:   Mon May 14 15:32:18 2018 -0700
> > 
> > hv_netvsc: common detach logic
> > 
> > [ Commit 7b2ee50c0cd513a176a26a71f2989facdd75bfea upstream. ]
> > 
> > Make common function for detaching internals of device
> > during changes to MTU and RSS. Make sure no more packets
> > are transmitted and all packets have been received before
> > doing device teardown.
> > 
> > Change the wait logic to be common and use usleep_range().
> > 
> > Changes transmit enabling logic so that transmit queues are disabled
> > during the period when lower device is being changed. And enabled
> > only after sub channels are setup. This avoids issue where it could
> > be that a packet was being sent while subchannel was not initialized.
> > 
> > Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
> > Signed-off-by: Stephen Hemminger 
> > Signed-off-by: David S. Miller 
> > Signed-off-by: Greg Kroah-Hartman 
> > 
> > 
> > The removal of which does indeed fix the problem.  That led me to:
> > 
> > commit 52acf73b6e9a6962045feb2ba5a8921da2201915
> > Author: Dexuan Cui 
> > Date:   Wed Jun 6 21:32:51 2018 +
> > 
> > hv_netvsc: Fix a network regression after ifdown/ifup
> > 
> > Recently people reported the NIC stops working after
> > "ifdown eth0; ifup eth0". It turns out in this case the TX queues are 
> > not
> > enabled, after the refactoring of the common detach logic: when the NIC
> > has sub-channels, usually we enable all the TX queues after all
> > sub-channels are set up: see rndis_set_subchannel() ->
> > netif_device_attach(), but in the case of "ifdown eth0; ifup eth0" where
> > the number of channels doesn't change, we also must make sure the TX 
> > queues
> > are enabled. The patch fixes the regression.
> > 
> > Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic")
> > Signed-off-by: Dexuan Cui 
> > Cc: Stephen Hemminger 
> > Cc: K. Y. Srinivasan 
> > Cc: Haiyang Zhang 
> > Signed-off-by: David S. Miller 
> > 
> > Which sounded very promising, but does not seem to fully fix the issue.  
> > Doing some more digging, I was able to determine that the message coincides 
> > with 'ip link set dev eth0 mtu 1300 up' very shortly (>~1 second) after the 
> > hv_netvsc driver loads.  If I delay the mtu change until well after the 
> > driver loads, everything works fine.  If I unload hv_netvsc and then reload 
> > it and apply the mtu change immediately, the failure re-occurs.  So 
> > something is racy here, and the above doesn't entirely address it.
> > 
> > I'm happy to test out any suggested patches and/or do additional debugging 
> > if anyone has any suggestions.
> > 
> > (oh, and I did also try 4.18-rc1 and the problem still persists)  
> 
> Always cc: the authors of the patch you are having problems with, along
> with the mailing list for the networking subsystem, so that the right
> people know to look at this.

Let me take a look at it, and log it into the bug system.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v2] hv_netvsc: Add per-cpu ethtool stats for netvsc

2018-06-13 Thread Stephen Hemminger
On Wed, 13 Jun 2018 22:03:34 +
Yidong Ren  wrote:

> > From: devel  On Behalf
> > Of Stephen Hemminger  
> > > +/* statistics per queue (rx/tx packets/bytes) */ #define
> > > +NETVSC_PCPU_STATS_LEN (num_present_cpus() *  
> > ARRAY_SIZE(pcpu_stats))
> > 
> > Even though Hyper-V/Azure does not support hot plug cpu's it might be
> > better to num_cpu_possible to avoid any possible future surprises.  
> 
> That will create a very long output (num_cpu_possible = 128 on my machine) 
> for ethtool,
> While doesn't provide additional info.
> num_present_cpus() would cause problem only if someone removed cpu 
> between netvsc_get_sset_count() and netvsc_get_strings() and 
> netvsc_get_ethtool_stats(). 
> 
> An alternative way could be: Check all stats, and only output if not zero. 
> This need to be done in two pass. First pass to get the correct count, second 
> pass to print the number.
> Is there an elegant way to do this? 

Ok, but there is a race between getting names and getting statistics.
If a cpu was added/removed then statistics would not match.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v2] hv_netvsc: Add per-cpu ethtool stats for netvsc

2018-06-13 Thread Stephen Hemminger
On Wed, 13 Jun 2018 12:36:08 -0700
Yidong Ren  wrote:

> From: Yidong Ren 
> 
> This patch implements following ethtool stats fields for netvsc:
> cpu_tx/rx_packets/bytes
> cpu_vf_tx/rx_packets/bytes
> 
> Corresponding per-cpu counters exist in current code. Exposing these
> counters will help troubleshooting performance issues.
> 
> Signed-off-by: Yidong Ren 
> ---
> Changes in v2:
>   - Remove cpp style comment
>   - Resubmit after freeze
> 
>  drivers/net/hyperv/hyperv_net.h |  11 +
>  drivers/net/hyperv/netvsc_drv.c | 104 
> +++-
>  2 files changed, 113 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
> index 23304ac..c825353 100644
> --- a/drivers/net/hyperv/hyperv_net.h
> +++ b/drivers/net/hyperv/hyperv_net.h
> @@ -873,6 +873,17 @@ struct netvsc_ethtool_stats {
>   unsigned long wake_queue;
>  };
>  
> +struct netvsc_ethtool_pcpu_stats {
> + u64 rx_packets;
> + u64 rx_bytes;
> + u64 tx_packets;
> + u64 tx_bytes;
> + u64 vf_rx_packets;
> + u64 vf_rx_bytes;
> + u64 vf_tx_packets;
> + u64 vf_tx_bytes;
> +};
> +
>  struct netvsc_vf_pcpu_stats {
>   u64 rx_packets;
>   u64 rx_bytes;
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index 7b18a8c..6803aae 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -1105,6 +1105,66 @@ static void netvsc_get_vf_stats(struct net_device *net,
>   }
>  }
>  
> +static void netvsc_get_pcpu_stats(struct net_device *net,
> +   struct netvsc_ethtool_pcpu_stats
> + __percpu *pcpu_tot)
> +{
> + struct net_device_context *ndev_ctx = netdev_priv(net);
> + struct netvsc_device *nvdev = rcu_dereference_rtnl(ndev_ctx->nvdev);
> + int i;
> +
> + /* fetch percpu stats of vf */
> + for_each_possible_cpu(i) {
> + const struct netvsc_vf_pcpu_stats *stats =
> + per_cpu_ptr(ndev_ctx->vf_stats, i);
> + struct netvsc_ethtool_pcpu_stats *this_tot =
> + per_cpu_ptr(pcpu_tot, i);
> + unsigned int start;
> +
> + do {
> + start = u64_stats_fetch_begin_irq(>syncp);
> + this_tot->vf_rx_packets = stats->rx_packets;
> + this_tot->vf_tx_packets = stats->tx_packets;
> + this_tot->vf_rx_bytes = stats->rx_bytes;
> + this_tot->vf_tx_bytes = stats->tx_bytes;
> + } while (u64_stats_fetch_retry_irq(>syncp, start));
> + this_tot->rx_packets = this_tot->vf_rx_packets;
> + this_tot->tx_packets = this_tot->vf_tx_packets;
> + this_tot->rx_bytes   = this_tot->vf_rx_bytes;
> + this_tot->tx_bytes   = this_tot->vf_tx_bytes;
> + }
> +
> + /* fetch percpu stats of netvsc */
> + for (i = 0; i < nvdev->num_chn; i++) {
> + const struct netvsc_channel *nvchan = >chan_table[i];
> + const struct netvsc_stats *stats;
> + struct netvsc_ethtool_pcpu_stats *this_tot =
> + per_cpu_ptr(pcpu_tot, nvchan->channel->target_cpu);
> + u64 packets, bytes;
> + unsigned int start;
> +
> + stats = >tx_stats;
> + do {
> + start = u64_stats_fetch_begin_irq(>syncp);
> + packets = stats->packets;
> + bytes = stats->bytes;
> + } while (u64_stats_fetch_retry_irq(>syncp, start));
> +
> + this_tot->tx_bytes  += bytes;
> + this_tot->tx_packets+= packets;
> +
> + stats = >rx_stats;
> + do {
> + start = u64_stats_fetch_begin_irq(>syncp);
> + packets = stats->packets;
> + bytes = stats->bytes;
> + } while (u64_stats_fetch_retry_irq(>syncp, start));
> +
> + this_tot->rx_bytes  += bytes;
> + this_tot->rx_packets+= packets;
> + }
> +}
> +
>  static void netvsc_get_stats64(struct net_device *net,
>  struct rtnl_link_stats64 *t)
>  {
> @@ -1202,6 +1262,23 @@ static const struct {
>   { "rx_no_memory", offsetof(struct netvsc_ethtool_stats, rx_no_memory) },
>   { "stop_queue", offsetof(struct netvsc_ethtool_stats, stop_queue) },
>   { "wake_queue", offsetof(struct netvsc_ethtool_stats, wake_queue) },
> +}, pcpu_stats[] = {
> + { "cpu%u_rx_packets",
> + offsetof(struct netvsc_ethtool_pcpu_stats, rx_packets) },
> + { "cpu%u_rx_bytes",
> + offsetof(struct netvsc_ethtool_pcpu_stats, rx_bytes) },
> + { "cpu%u_tx_packets",
> + offsetof(struct netvsc_ethtool_pcpu_stats, tx_packets) },
> + { "cpu%u_tx_bytes",
> + offsetof(struct 

[PATCH] hv: add driver_override support

2018-06-12 Thread Stephen Hemminger
Add support for overriding the default driver for a VMBus device
in the same way that it can be done for PCI devices. This patch
adds the /sys/bus/vmbus/devices/.../driver_override file
and the logic for matching.

This is used by driverctl tool to do driver override.
 https://gitlab.com/driverctl/driverctl

When using devices with DPDK it is useful to be able to use
this tool instead of manual bind/unbind.

Signed-off-by: Stephen Hemminger 
---
Resend of patch, that seems to have gotten lost.

 Documentation/ABI/testing/sysfs-bus-vmbus |   6 ++
 drivers/hv/vmbus_drv.c| 115 ++
 include/linux/hyperv.h|   1 +
 3 files changed, 103 insertions(+), 19 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-vmbus

diff --git a/Documentation/ABI/testing/sysfs-bus-vmbus 
b/Documentation/ABI/testing/sysfs-bus-vmbus
new file mode 100644
index ..4f57b8f8b767
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-vmbus
@@ -0,0 +1,6 @@
+What:  /sys/bus/vmbus/devices/.../driver_override
+Date:  April 2018
+Contact:   Stephen Hemminger 
+Description:
+   This file allows the driver for a device to be specified which
+   will override standard static and dynamic ID matching.
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b10fe26c4891..b5e3b3dc1bbd 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -482,6 +482,54 @@ static ssize_t device_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(device);
 
+static ssize_t driver_override_store(struct device *dev,
+struct device_attribute *attr,
+const char *buf, size_t count)
+{
+   struct hv_device *hv_dev = device_to_hv_device(dev);
+   char *driver_override, *old, *cp;
+
+   /* We need to keep extra room for a newline */
+   if (count >= (PAGE_SIZE - 1))
+   return -EINVAL;
+
+   driver_override = kstrndup(buf, count, GFP_KERNEL);
+   if (!driver_override)
+   return -ENOMEM;
+
+   cp = strchr(driver_override, '\n');
+   if (cp)
+   *cp = '\0';
+
+   device_lock(dev);
+   old = hv_dev->driver_override;
+   if (strlen(driver_override)) {
+   hv_dev->driver_override = driver_override;
+   } else {
+   kfree(driver_override);
+   hv_dev->driver_override = NULL;
+   }
+   device_unlock(dev);
+
+   kfree(old);
+
+   return count;
+}
+
+static ssize_t driver_override_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct hv_device *hv_dev = device_to_hv_device(dev);
+   ssize_t len;
+
+   device_lock(dev);
+   len = snprintf(buf, PAGE_SIZE, "%s\n", hv_dev->driver_override);
+   device_unlock(dev);
+
+   return len;
+}
+static DEVICE_ATTR_RW(driver_override);
+
 /* Set up per device attributes in /sys/bus/vmbus/devices/ */
 static struct attribute *vmbus_dev_attrs[] = {
_attr_id.attr,
@@ -509,6 +557,7 @@ static struct attribute *vmbus_dev_attrs[] = {
_attr_channel_vp_mapping.attr,
_attr_vendor.attr,
_attr_device.attr,
+   _attr_driver_override.attr,
NULL,
 };
 ATTRIBUTE_GROUPS(vmbus_dev);
@@ -544,17 +593,26 @@ static inline bool is_null_guid(const uuid_le *guid)
return true;
 }
 
-/*
- * Return a matching hv_vmbus_device_id pointer.
- * If there is no match, return NULL.
- */
-static const struct hv_vmbus_device_id *hv_vmbus_get_id(struct hv_driver *drv,
-   const uuid_le *guid)
+static const struct hv_vmbus_device_id *
+hv_vmbus_dev_match(const struct hv_vmbus_device_id *id, const uuid_le *guid)
+
+{
+   if (id == NULL)
+   return NULL; /* empty device table */
+
+   for (; !is_null_guid(>guid); id++)
+   if (!uuid_le_cmp(id->guid, *guid))
+   return id;
+
+   return NULL;
+}
+
+static const struct hv_vmbus_device_id *
+hv_vmbus_dynid_match(struct hv_driver *drv, const uuid_le *guid)
 {
const struct hv_vmbus_device_id *id = NULL;
struct vmbus_dynid *dynid;
 
-   /* Look at the dynamic ids first, before the static ones */
spin_lock(>dynids.lock);
list_for_each_entry(dynid, >dynids.list, node) {
if (!uuid_le_cmp(dynid->id.guid, *guid)) {
@@ -564,18 +622,37 @@ static const struct hv_vmbus_device_id 
*hv_vmbus_get_id(struct hv_driver *drv,
}
spin_unlock(>dynids.lock);
 
-   if (id)
-   return id;
+   return id;
+}
 
-   id = drv->id_table;
-   if (id == NULL)
-   return NULL; /* empty device table */
+static const struct hv_vmbus_device_id vmbus_device_null = {
+   .guid = NULL_UUID_LE,
+};
 
-   for (; !is_null_guid(>guid); id++)

Re: [PATCH net 2/3] hv_netvsc: fix network namespace issues with VF support

2018-06-12 Thread Stephen Hemminger
On Tue, 12 Jun 2018 12:51:28 +0300
Dan Carpenter  wrote:

> On Mon, Jun 11, 2018 at 12:44:55PM -0700, Stephen Hemminger wrote:
> > When finding the parent netvsc device, the search needs to be across
> > all netvsc device instances (independent of network namespace).
> > 
> > Find parent device of VF using upper_dev_get routine which
> > searches only adjacent list.
> > 
> > Fixes: e8ff40d4bff1 ("hv_netvsc: improve VF device matching")
> > Signed-off-by: Stephen Hemminger 
> > 
> > netns aware byref  
> 
> What?  Presumably that wasn't supposed to be part of the commit message.

That was leftover from earlier commit message
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH net 2/3] hv_netvsc: fix network namespace issues with VF support

2018-06-11 Thread Stephen Hemminger
When finding the parent netvsc device, the search needs to be across
all netvsc device instances (independent of network namespace).

Find parent device of VF using upper_dev_get routine which
searches only adjacent list.

Fixes: e8ff40d4bff1 ("hv_netvsc: improve VF device matching")
Signed-off-by: Stephen Hemminger 

netns aware byref
---
 drivers/net/hyperv/hyperv_net.h |  2 ++
 drivers/net/hyperv/netvsc_drv.c | 43 +++--
 2 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 1be34d2e3563..c0b3f3c125d4 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -902,6 +902,8 @@ struct net_device_context {
struct hv_device *device_ctx;
/* netvsc_device */
struct netvsc_device __rcu *nvdev;
+   /* list of netvsc net_devices */
+   struct list_head list;
/* reconfigure work */
struct delayed_work dwork;
/* last reconfig time */
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 2d4370c94b6e..8cb21e013d1d 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -69,6 +69,8 @@ static int debug = -1;
 module_param(debug, int, 0444);
 MODULE_PARM_DESC(debug, "Debug level (0=none,...,16=all)");
 
+static LIST_HEAD(netvsc_dev_list);
+
 static void netvsc_change_rx_flags(struct net_device *net, int change)
 {
struct net_device_context *ndev_ctx = netdev_priv(net);
@@ -1783,13 +1785,10 @@ static void netvsc_link_change(struct work_struct *w)
 
 static struct net_device *get_netvsc_bymac(const u8 *mac)
 {
-   struct net_device *dev;
-
-   ASSERT_RTNL();
+   struct net_device_context *ndev_ctx;
 
-   for_each_netdev(_net, dev) {
-   if (dev->netdev_ops != _ops)
-   continue;   /* not a netvsc device */
+   list_for_each_entry(ndev_ctx, _dev_list, list) {
+   struct net_device *dev = hv_get_drvdata(ndev_ctx->device_ctx);
 
if (ether_addr_equal(mac, dev->perm_addr))
return dev;
@@ -1800,25 +1799,18 @@ static struct net_device *get_netvsc_bymac(const u8 
*mac)
 
 static struct net_device *get_netvsc_byref(struct net_device *vf_netdev)
 {
+   struct net_device_context *net_device_ctx;
struct net_device *dev;
 
-   ASSERT_RTNL();
-
-   for_each_netdev(_net, dev) {
-   struct net_device_context *net_device_ctx;
+   dev = netdev_master_upper_dev_get(vf_netdev);
+   if (!dev || dev->netdev_ops != _ops)
+   return NULL;/* not a netvsc device */
 
-   if (dev->netdev_ops != _ops)
-   continue;   /* not a netvsc device */
+   net_device_ctx = netdev_priv(dev);
+   if (!rtnl_dereference(net_device_ctx->nvdev))
+   return NULL;/* device is removed */
 
-   net_device_ctx = netdev_priv(dev);
-   if (!rtnl_dereference(net_device_ctx->nvdev))
-   continue;   /* device is removed */
-
-   if (rtnl_dereference(net_device_ctx->vf_netdev) == vf_netdev)
-   return dev; /* a match */
-   }
-
-   return NULL;
+   return dev;
 }
 
 /* Called when VF is injecting data into network stack.
@@ -2095,15 +2087,19 @@ static int netvsc_probe(struct hv_device *dev,
else
net->max_mtu = ETH_DATA_LEN;
 
-   ret = register_netdev(net);
+   rtnl_lock();
+   ret = register_netdevice(net);
if (ret != 0) {
pr_err("Unable to register netdev.\n");
goto register_failed;
}
 
-   return ret;
+   list_add(_device_ctx->list, _dev_list);
+   rtnl_unlock();
+   return 0;
 
 register_failed:
+   rtnl_unlock();
rndis_filter_device_remove(dev, nvdev);
 rndis_failed:
free_percpu(net_device_ctx->vf_stats);
@@ -2149,6 +2145,7 @@ static int netvsc_remove(struct hv_device *dev)
rndis_filter_device_remove(dev, nvdev);
 
unregister_netdevice(net);
+   list_del(_ctx->list);
 
rtnl_unlock();
rcu_read_unlock();
-- 
2.17.1

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH net 3/3] hv_netvsc: move VF to same namespace as netvsc device

2018-06-11 Thread Stephen Hemminger
When VF is added, the paravirtual device is already present
and may have been moved to another network namespace. For example,
sometimes the management interface is put in another net namespace
in some environments.

The VF should get moved to where the netvsc device is when the
VF is discovered. The user can move it later (if desired).

Signed-off-by: Stephen Hemminger 
---
 drivers/net/hyperv/netvsc_drv.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 8cb21e013d1d..bf1b845c1147 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1930,6 +1930,7 @@ static int netvsc_register_vf(struct net_device 
*vf_netdev)
struct net_device *ndev;
struct net_device_context *net_device_ctx;
struct netvsc_device *netvsc_dev;
+   int ret;
 
if (vf_netdev->addr_len != ETH_ALEN)
return NOTIFY_DONE;
@@ -1948,11 +1949,29 @@ static int netvsc_register_vf(struct net_device 
*vf_netdev)
if (!netvsc_dev || rtnl_dereference(net_device_ctx->vf_netdev))
return NOTIFY_DONE;
 
-   if (netvsc_vf_join(vf_netdev, ndev) != 0)
+   /* if syntihetic interface is a different namespace,
+* then move the VF to that namespace; join will be
+* done again in that context.
+*/
+   if (!net_eq(dev_net(ndev), dev_net(vf_netdev))) {
+   ret = dev_change_net_namespace(vf_netdev,
+  dev_net(ndev), "eth%d");
+   if (ret)
+   netdev_err(vf_netdev,
+  "could not move to same namespace as %s: 
%d\n",
+  ndev->name, ret);
+   else
+   netdev_info(vf_netdev,
+   "VF moved to namespace with: %s\n",
+   ndev->name);
return NOTIFY_DONE;
+   }
 
netdev_info(ndev, "VF registering: %s\n", vf_netdev->name);
 
+   if (netvsc_vf_join(vf_netdev, ndev) != 0)
+   return NOTIFY_DONE;
+
dev_hold(vf_netdev);
rcu_assign_pointer(net_device_ctx->vf_netdev, vf_netdev);
return NOTIFY_OK;
-- 
2.17.1

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH net 1/3] hv_netvsc: drop common code until callback model fixed

2018-06-11 Thread Stephen Hemminger
The callback model of handling network failover is not suitable
in the current form.
  1. It was merged without addressing all the review feedback.
  2. It was merged without approval of any of the netvsc maintainers.
  3. Design discussion on how to handle PV/VF fallback is still
 not complete.
  4. IMHO the code model using callbacks is trying to make
 something common which isn't.

Revert the netvsc specific changes for now. Does not impact ongoing
development of failover model for virtio.
Revisit this after a simpler library based failover kernel
routines are extracted.

This reverts
commit 9c6ffbacdb57 ("hv_netvsc: fix error return code in netvsc_probe()")
and
commit 1ff78076d8dd ("netvsc: refactor notifier/event handling code to use the 
failover framework")

Signed-off-by: Stephen Hemminger 
---
 drivers/net/hyperv/Kconfig  |   1 -
 drivers/net/hyperv/hyperv_net.h |   2 -
 drivers/net/hyperv/netvsc_drv.c | 224 +++-
 3 files changed, 165 insertions(+), 62 deletions(-)

diff --git a/drivers/net/hyperv/Kconfig b/drivers/net/hyperv/Kconfig
index 23a2d145813a..0765d5f61714 100644
--- a/drivers/net/hyperv/Kconfig
+++ b/drivers/net/hyperv/Kconfig
@@ -2,6 +2,5 @@ config HYPERV_NET
tristate "Microsoft Hyper-V virtual network driver"
depends on HYPERV
select UCS2_STRING
-   select FAILOVER
help
  Select this option to enable the Hyper-V virtual network driver.
diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 99d8e7398a5b..1be34d2e3563 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -932,8 +932,6 @@ struct net_device_context {
u32 vf_alloc;
/* Serial number of the VF to team with */
u32 vf_serial;
-
-   struct failover *failover;
 };
 
 /* Per channel data */
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 8eec156418ea..2d4370c94b6e 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -43,7 +43,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "hyperv_net.h"
 
@@ -1782,6 +1781,46 @@ static void netvsc_link_change(struct work_struct *w)
rtnl_unlock();
 }
 
+static struct net_device *get_netvsc_bymac(const u8 *mac)
+{
+   struct net_device *dev;
+
+   ASSERT_RTNL();
+
+   for_each_netdev(_net, dev) {
+   if (dev->netdev_ops != _ops)
+   continue;   /* not a netvsc device */
+
+   if (ether_addr_equal(mac, dev->perm_addr))
+   return dev;
+   }
+
+   return NULL;
+}
+
+static struct net_device *get_netvsc_byref(struct net_device *vf_netdev)
+{
+   struct net_device *dev;
+
+   ASSERT_RTNL();
+
+   for_each_netdev(_net, dev) {
+   struct net_device_context *net_device_ctx;
+
+   if (dev->netdev_ops != _ops)
+   continue;   /* not a netvsc device */
+
+   net_device_ctx = netdev_priv(dev);
+   if (!rtnl_dereference(net_device_ctx->nvdev))
+   continue;   /* device is removed */
+
+   if (rtnl_dereference(net_device_ctx->vf_netdev) == vf_netdev)
+   return dev; /* a match */
+   }
+
+   return NULL;
+}
+
 /* Called when VF is injecting data into network stack.
  * Change the associated network device from VF to netvsc.
  * note: already called with rcu_read_lock
@@ -1804,6 +1843,46 @@ static rx_handler_result_t netvsc_vf_handle_frame(struct 
sk_buff **pskb)
return RX_HANDLER_ANOTHER;
 }
 
+static int netvsc_vf_join(struct net_device *vf_netdev,
+ struct net_device *ndev)
+{
+   struct net_device_context *ndev_ctx = netdev_priv(ndev);
+   int ret;
+
+   ret = netdev_rx_handler_register(vf_netdev,
+netvsc_vf_handle_frame, ndev);
+   if (ret != 0) {
+   netdev_err(vf_netdev,
+  "can not register netvsc VF receive handler (err = 
%d)\n",
+  ret);
+   goto rx_handler_failed;
+   }
+
+   ret = netdev_master_upper_dev_link(vf_netdev, ndev,
+  NULL, NULL, NULL);
+   if (ret != 0) {
+   netdev_err(vf_netdev,
+  "can not set master device %s (err = %d)\n",
+  ndev->name, ret);
+   goto upper_link_failed;
+   }
+
+   /* set slave flag before open to prevent IPv6 addrconf */
+   vf_netdev->flags |= IFF_SLAVE;
+
+   schedule_delayed_work(_ctx->vf_takeover, VF_TAKEOVER_INT);
+
+   call_netdevice_notifiers(NETDEV_JOIN, vf_netdev);
+
+   netdev_info(vf_netdev, "joined to %s\n", ndev->name);
+   return 0;
+
+upper_lin

[PATCH net 0/3] hv_netvsc: notification and namespace fixes

2018-06-11 Thread Stephen Hemminger
This set of patches addresses two set of fixes. First it backs out
the common callback model which was merged in net-next without
completing all the review feedback or getting maintainer approval.

Then it fixes the transparent VF management code to handle network
namespaces.

Stephen Hemminger (3):
  hv_netvsc: drop common code until callback model fixed
  hv_netvsc: fix network namespace issues with VF support
  hv_netvsc: move VF to same namespace as netvsc device

 drivers/net/hyperv/Kconfig  |   1 -
 drivers/net/hyperv/hyperv_net.h |   4 +-
 drivers/net/hyperv/netvsc_drv.c | 242 
 3 files changed, 184 insertions(+), 63 deletions(-)

-- 
2.17.1

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH] hv_netvsc: Add per-cpu ethtool stats for netvsc

2018-06-06 Thread Stephen Hemminger
On Wed,  6 Jun 2018 15:27:00 -0700
Yidong Ren  wrote:

> From: Yidong Ren 
> 
> This patch implements following ethtool stats fields for netvsc:
> cpu_tx/rx_packets/bytes
> cpu_vf_tx/rx_packets/bytes
> 
> Corresponding per-cpu counters exist in current code. Exposing these
> counters will help troubleshooting performance issues.
> 
> Signed-off-by: Yidong Ren 

This patch would be targeted for net-next (davem's tree);
but net-next is currently closed until 4.19-rc1 is done.

> ---
>  drivers/net/hyperv/hyperv_net.h |  11 
>  drivers/net/hyperv/netvsc_drv.c | 104 +++-
>  2 files changed, 113 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
> index 960f06141472..f8c798bf9418 100644
> --- a/drivers/net/hyperv/hyperv_net.h
> +++ b/drivers/net/hyperv/hyperv_net.h
> @@ -710,6 +710,17 @@ struct netvsc_ethtool_stats {
>   unsigned long wake_queue;
>  };
>  
> +struct netvsc_ethtool_pcpu_stats {
> + u64 rx_packets;
> + u64 rx_bytes;
> + u64 tx_packets;
> + u64 tx_bytes;
> + u64 vf_rx_packets;
> + u64 vf_rx_bytes;
> + u64 vf_tx_packets;
> + u64 vf_tx_bytes;
> +};
> +
>  struct netvsc_vf_pcpu_stats {
>   u64 rx_packets;
>   u64 rx_bytes;
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index da07ccdf84bf..c43e64606c1a 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -1104,6 +1104,66 @@ static void netvsc_get_vf_stats(struct net_device *net,
>   }
>  }
>  
> +static void netvsc_get_pcpu_stats(struct net_device *net,
> +   struct netvsc_ethtool_pcpu_stats
> + __percpu *pcpu_tot)
> +{
> + struct net_device_context *ndev_ctx = netdev_priv(net);
> + struct netvsc_device *nvdev = rcu_dereference_rtnl(ndev_ctx->nvdev);
> + int i;
> +
> + // fetch percpu stats of vf

If you ran checkpatch you would see that Linux always uses C style
comments, and not C++ style //

> + for_each_possible_cpu(i) {
> + const struct netvsc_vf_pcpu_stats *stats =
> + per_cpu_ptr(ndev_ctx->vf_stats, i);
> + struct netvsc_ethtool_pcpu_stats *this_tot =
> + per_cpu_ptr(pcpu_tot, i);
> + unsigned int start;
> +
> + do {
> + start = u64_stats_fetch_begin_irq(>syncp);
> + this_tot->vf_rx_packets = stats->rx_packets;
> + this_tot->vf_tx_packets = stats->tx_packets;
> + this_tot->vf_rx_bytes = stats->rx_bytes;
> + this_tot->vf_tx_bytes = stats->tx_bytes;
> + } while (u64_stats_fetch_retry_irq(>syncp, start));
> + this_tot->rx_packets = this_tot->vf_rx_packets;
> + this_tot->tx_packets = this_tot->vf_tx_packets;
> + this_tot->rx_bytes   = this_tot->vf_rx_bytes;
> + this_tot->tx_bytes   = this_tot->vf_tx_bytes;
> + }
> +
> + // fetch percpu stats of netvsc
> + for (i = 0; i < nvdev->num_chn; i++) {
> + const struct netvsc_channel *nvchan = >chan_table[i];
> + const struct netvsc_stats *stats;
> + struct netvsc_ethtool_pcpu_stats *this_tot =
> + per_cpu_ptr(pcpu_tot, nvchan->channel->target_cpu);
> + u64 packets, bytes;
> + unsigned int start;
> +
> + stats = >tx_stats;
> + do {
> + start = u64_stats_fetch_begin_irq(>syncp);
> + packets = stats->packets;
> + bytes = stats->bytes;
> + } while (u64_stats_fetch_retry_irq(>syncp, start));
> +
> + this_tot->tx_bytes  += bytes;
> + this_tot->tx_packets+= packets;
> +
> + stats = >rx_stats;
> + do {
> + start = u64_stats_fetch_begin_irq(>syncp);
> + packets = stats->packets;
> + bytes = stats->bytes;
> + } while (u64_stats_fetch_retry_irq(>syncp, start));
> +
> + this_tot->rx_bytes  += bytes;
> + this_tot->rx_packets+= packets;
> + }
> +}
> +
>  static void netvsc_get_stats64(struct net_device *net,
>  struct rtnl_link_stats64 *t)
>  {
> @@ -1201,6 +1261,23 @@ static const struct {
>   { "rx_no_memory", offsetof(struct netvsc_ethtool_stats, rx_no_memory) },
>   { "stop_queue", offsetof(struct netvsc_ethtool_stats, stop_queue) },
>   { "wake_queue", offsetof(struct netvsc_ethtool_stats, wake_queue) },
> +}, pcpu_stats[] = {
> + { "cpu%u_rx_packets",
> + offsetof(struct netvsc_ethtool_pcpu_stats, rx_packets) },
> + { "cpu%u_rx_bytes",
> + offsetof(struct netvsc_ethtool_pcpu_stats, rx_bytes) },
> + { 

[PATCH 1/3] PCI: hv: remove unused reason for refcount handler

2018-05-23 Thread Stephen Hemminger
The get/put functions were taking a reason code. This appears to be
a debug infrastructure that is no longer used.

Move the functions to start of file to eliminate need for
forward declaration. Forward declarations are discouraged on
Linux.

Signed-off-by: Stephen Hemminger <sthem...@microsoft.com>
---
 drivers/pci/host/pci-hyperv.c | 70 +--
 1 file changed, 26 insertions(+), 44 deletions(-)

diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 50cdefe3f6d3..505453e23c22 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -488,17 +488,6 @@ enum hv_pcichild_state {
hv_pcichild_maximum
 };
 
-enum hv_pcidev_ref_reason {
-   hv_pcidev_ref_invalid = 0,
-   hv_pcidev_ref_initial,
-   hv_pcidev_ref_by_slot,
-   hv_pcidev_ref_packet,
-   hv_pcidev_ref_pnp,
-   hv_pcidev_ref_childlist,
-   hv_pcidev_irqdata,
-   hv_pcidev_ref_max
-};
-
 struct hv_pci_dev {
/* List protected by pci_rescan_remove_lock */
struct list_head list_entry;
@@ -548,10 +537,17 @@ static void hv_pci_generic_compl(void *context, struct 
pci_response *resp,
 
 static struct hv_pci_dev *get_pcichild_wslot(struct hv_pcibus_device *hbus,
u32 wslot);
-static void get_pcichild(struct hv_pci_dev *hv_pcidev,
-enum hv_pcidev_ref_reason reason);
-static void put_pcichild(struct hv_pci_dev *hv_pcidev,
-enum hv_pcidev_ref_reason reason);
+
+static void get_pcichild(struct hv_pci_dev *hpdev)
+{
+   refcount_inc(>refs);
+}
+
+static void put_pcichild(struct hv_pci_dev *hpdev)
+{
+   if (refcount_dec_and_test(>refs))
+   kfree(hpdev);
+}
 
 static void get_hvpcibus(struct hv_pcibus_device *hv_pcibus);
 static void put_hvpcibus(struct hv_pcibus_device *hv_pcibus);
@@ -762,7 +758,7 @@ static int hv_pcifront_read_config(struct pci_bus *bus, 
unsigned int devfn,
 
_hv_pcifront_read_config(hpdev, where, size, val);
 
-   put_pcichild(hpdev, hv_pcidev_ref_by_slot);
+   put_pcichild(hpdev);
return PCIBIOS_SUCCESSFUL;
 }
 
@@ -790,7 +786,7 @@ static int hv_pcifront_write_config(struct pci_bus *bus, 
unsigned int devfn,
 
_hv_pcifront_write_config(hpdev, where, size, val);
 
-   put_pcichild(hpdev, hv_pcidev_ref_by_slot);
+   put_pcichild(hpdev);
return PCIBIOS_SUCCESSFUL;
 }
 
@@ -856,7 +852,7 @@ static void hv_msi_free(struct irq_domain *domain, struct 
msi_domain_info *info,
}
 
hv_int_desc_free(hpdev, int_desc);
-   put_pcichild(hpdev, hv_pcidev_ref_by_slot);
+   put_pcichild(hpdev);
 }
 
 static int hv_set_affinity(struct irq_data *data, const struct cpumask *dest,
@@ -1186,13 +1182,13 @@ static void hv_compose_msi_msg(struct irq_data *data, 
struct msi_msg *msg)
msg->address_lo = comp.int_desc.address & 0x;
msg->data = comp.int_desc.data;
 
-   put_pcichild(hpdev, hv_pcidev_ref_by_slot);
+   put_pcichild(hpdev);
return;
 
 free_int_desc:
kfree(int_desc);
 drop_reference:
-   put_pcichild(hpdev, hv_pcidev_ref_by_slot);
+   put_pcichild(hpdev);
 return_null_message:
msg->address_hi = 0;
msg->address_lo = 0;
@@ -1508,19 +1504,6 @@ static void q_resource_requirements(void *context, 
struct pci_response *resp,
complete(>host_event);
 }
 
-static void get_pcichild(struct hv_pci_dev *hpdev,
-   enum hv_pcidev_ref_reason reason)
-{
-   refcount_inc(>refs);
-}
-
-static void put_pcichild(struct hv_pci_dev *hpdev,
-   enum hv_pcidev_ref_reason reason)
-{
-   if (refcount_dec_and_test(>refs))
-   kfree(hpdev);
-}
-
 /**
  * new_pcichild_device() - Create a new child device
  * @hbus:  The internal struct tracking this root PCI bus.
@@ -1572,7 +1555,7 @@ static struct hv_pci_dev *new_pcichild_device(struct 
hv_pcibus_device *hbus,
 
hpdev->desc = *desc;
refcount_set(>refs, 1);
-   get_pcichild(hpdev, hv_pcidev_ref_childlist);
+   get_pcichild(hpdev);
spin_lock_irqsave(>device_list_lock, flags);
 
/*
@@ -1618,7 +1601,7 @@ static struct hv_pci_dev *get_pcichild_wslot(struct 
hv_pcibus_device *hbus,
list_for_each_entry(iter, >children, list_entry) {
if (iter->desc.win_slot.slot == wslot) {
hpdev = iter;
-   get_pcichild(hpdev, hv_pcidev_ref_by_slot);
+   get_pcichild(hpdev);
break;
}
}
@@ -1735,7 +1718,7 @@ static void pci_devices_present_work(struct work_struct 
*work)
 list_entry);
if (hpdev->reported_missing) {
found = true;
-   

[PATCH 3/3] PCI: hv: use list_for_each_entry

2018-05-23 Thread Stephen Hemminger
There are several places where list_for_each_entry could be
used to simplify the code.

Signed-off-by: Stephen Hemminger <sthem...@microsoft.com>
---
 drivers/pci/host/pci-hyperv.c | 27 +++
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 19eb47f58ccb..afc30dee6001 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -1279,7 +1279,6 @@ static u64 get_bar_size(u64 bar_val)
  */
 static void survey_child_resources(struct hv_pcibus_device *hbus)
 {
-   struct list_head *iter;
struct hv_pci_dev *hpdev;
resource_size_t bar_size = 0;
unsigned long flags;
@@ -1305,8 +1304,7 @@ static void survey_child_resources(struct 
hv_pcibus_device *hbus)
 * for a child device are a power of 2 in size and aligned in memory,
 * so it's sufficient to just add them up without tracking alignment.
 */
-   list_for_each(iter, >children) {
-   hpdev = container_of(iter, struct hv_pci_dev, list_entry);
+   list_for_each_entry(hpdev, >children, list_entry) {
for (i = 0; i < 6; i++) {
if (hpdev->probed_bar[i] & PCI_BASE_ADDRESS_SPACE_IO)
dev_err(>hdev->device,
@@ -1359,7 +1357,6 @@ static void prepopulate_bars(struct hv_pcibus_device 
*hbus)
resource_size_t low_base = 0;
resource_size_t bar_size;
struct hv_pci_dev *hpdev;
-   struct list_head *iter;
unsigned long flags;
u64 bar_val;
u32 command;
@@ -1381,9 +1378,7 @@ static void prepopulate_bars(struct hv_pcibus_device 
*hbus)
 
/* Pick addresses for the BARs. */
do {
-   list_for_each(iter, >children) {
-   hpdev = container_of(iter, struct hv_pci_dev,
-list_entry);
+   list_for_each_entry(hpdev, >children, list_entry) {
for (i = 0; i < 6; i++) {
bar_val = hpdev->probed_bar[i];
if (bar_val == 0)
@@ -1637,7 +1632,6 @@ static void pci_devices_present_work(struct work_struct 
*work)
 {
u32 child_no;
bool found;
-   struct list_head *iter;
struct pci_function_description *new_desc;
struct hv_pci_dev *hpdev;
struct hv_pcibus_device *hbus;
@@ -1674,10 +1668,8 @@ static void pci_devices_present_work(struct work_struct 
*work)
 
/* First, mark all existing children as reported missing. */
spin_lock_irqsave(>device_list_lock, flags);
-   list_for_each(iter, >children) {
-   hpdev = container_of(iter, struct hv_pci_dev,
-list_entry);
-   hpdev->reported_missing = true;
+   list_for_each_entry(hpdev, >children, list_entry) {
+   hpdev->reported_missing = true;
}
spin_unlock_irqrestore(>device_list_lock, flags);
 
@@ -1687,11 +1679,8 @@ static void pci_devices_present_work(struct work_struct 
*work)
new_desc = >func[child_no];
 
spin_lock_irqsave(>device_list_lock, flags);
-   list_for_each(iter, >children) {
-   hpdev = container_of(iter, struct hv_pci_dev,
-list_entry);
-   if ((hpdev->desc.win_slot.slot ==
-new_desc->win_slot.slot) &&
+   list_for_each_entry(hpdev, >children, list_entry) {
+   if ((hpdev->desc.win_slot.slot == 
new_desc->win_slot.slot) &&
(hpdev->desc.v_id == new_desc->v_id) &&
(hpdev->desc.d_id == new_desc->d_id) &&
(hpdev->desc.ser == new_desc->ser)) {
@@ -1713,9 +1702,7 @@ static void pci_devices_present_work(struct work_struct 
*work)
spin_lock_irqsave(>device_list_lock, flags);
do {
found = false;
-   list_for_each(iter, >children) {
-   hpdev = container_of(iter, struct hv_pci_dev,
-list_entry);
+   list_for_each_entry(hpdev, >children, list_entry) {
if (hpdev->reported_missing) {
found = true;
put_pcichild(hpdev);
-- 
2.17.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH 2/3] PCI: hv: convert remove_lock to refcount

2018-05-23 Thread Stephen Hemminger
Use refcount instead of atomic for the reference counting
on bus. Refcount is safer because it handles overflow correctly.

Signed-off-by: Stephen Hemminger <sthem...@microsoft.com>
---
 drivers/pci/host/pci-hyperv.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 505453e23c22..19eb47f58ccb 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -433,7 +433,7 @@ enum hv_pcibus_state {
 struct hv_pcibus_device {
struct pci_sysdata sysdata;
enum hv_pcibus_state state;
-   atomic_t remove_lock;
+   refcount_t remove_lock;
struct hv_device *hdev;
resource_size_t low_mmio_space;
resource_size_t high_mmio_space;
@@ -2441,12 +2441,12 @@ static int hv_send_resources_released(struct hv_device 
*hdev)
 
 static void get_hvpcibus(struct hv_pcibus_device *hbus)
 {
-   atomic_inc(>remove_lock);
+   refcount_inc(>remove_lock);
 }
 
 static void put_hvpcibus(struct hv_pcibus_device *hbus)
 {
-   if (atomic_dec_and_test(>remove_lock))
+   if (refcount_dec_and_test(>remove_lock))
complete(>remove_event);
 }
 
@@ -2490,7 +2490,7 @@ static int hv_pci_probe(struct hv_device *hdev,
   hdev->dev_instance.b[8] << 8;
 
hbus->hdev = hdev;
-   atomic_inc(>remove_lock);
+   refcount_set(>remove_lock, 1);
INIT_LIST_HEAD(>children);
INIT_LIST_HEAD(>dr_list);
INIT_LIST_HEAD(>resources_for_children);
-- 
2.17.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH 0/3] PCI: hv: cleanup patches

2018-05-23 Thread Stephen Hemminger
These are minor code cleanups found while reviewing and implementing
other things in Hyper-V PCI host driver.

Stephen Hemminger (3):
  PCI: hv: remove unused reason for refcount handler
  PCI: hv: convert remove_lock to refcount
  PCI: hv: use list_for_each_entry

 drivers/pci/host/pci-hyperv.c | 105 --
 1 file changed, 37 insertions(+), 68 deletions(-)

-- 
2.17.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v2] Drivers: HV: Send one page worth of kmsg dump over Hyper-V during panic

2018-05-18 Thread Stephen Hemminger
On Fri, 18 May 2018 19:09:10 +
Sunil Muthuswamy  wrote:

>  
> +/*
> + * Boolean to control whether to report panic messages over Hyper-V.
> + *
> + * It can be set via /proc/sys/kernel/hyperv/record_panic_msg
> + */
> +int sysctl_record_panic_msg = 1;
> +

Looks good, this will help.

Please create new documentation to Documentation/sysctl/ for the new
sysctl.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH 1/1] scsi: storvsc: Avoid allocating memory for temp cpumasks

2018-05-17 Thread Stephen Hemminger
On Thu, 17 May 2018 14:07:40 -0700
Michael Kelley <mhkelle...@gmail.com> wrote:

> Current code allocates 240 Kbytes (in typical configs) for
> each synthetic SCSI controller to use as temp cpumask variables.
> Recode to avoid needing the temp cpumask variables and remove the
> memory allocation.
> 
> Signed-off-by: Michael Kelley <mikel...@microsoft.com>

Acked-by: Stephen Hemminger <sthem...@microsoft.com>
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH 1/1] Drivers: hv: vmbus: enable VMBus protocol version 5.0

2018-05-14 Thread Stephen Hemminger
On Mon, 14 May 2018 18:14:15 +
Dexuan Cui <de...@microsoft.com> wrote:

> > From: devel <driverdev-devel-boun...@linuxdriverproject.org> On Behalf Of
> > Stephen Hemminger
> > Sent: Sunday, May 13, 2018 10:24  
> > > ...
> > > @@ -372,6 +400,18 @@ int vmbus_post_msg(void *buffer, size_t buflen,  
> > bool can_sleep)  
> > > ...
> > > + hdr = (struct vmbus_channel_message_header *)buffer;  
> > 
> > Hate to pick o the details, but buffer is void * so cast is not necessary 
> > here.  
> 
> Yes, it's unnecessary in C, though it's necessary in C++.
> 
> I found the patch went into char-misc 4 hours ago, so it looks we may
> as well leave it as is. IMHO an explicit cast is not a bad thing. :-)
> 
> Thanks,
> -- Dexuan

Kernel developers like to be concise. In fact there is a smatch script that 
perodically
gets run and more cleanup patches get sent.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH 1/1] Drivers: hv: vmbus: enable VMBus protocol version 5.0

2018-05-13 Thread Stephen Hemminger
On Sat, 12 May 2018 02:30:33 -0700
k...@linuxonhyperv.com wrote:

>  int vmbus_post_msg(void *buffer, size_t buflen, bool can_sleep)
>  {
> + struct vmbus_channel_message_header *hdr;
>   union hv_connection_id conn_id;
>   int ret = 0;
>   int retries = 0;
>   u32 usec = 1;
>  
>   conn_id.asu32 = 0;
> - conn_id.u.id = VMBUS_MESSAGE_CONNECTION_ID;
> + conn_id.u.id = vmbus_connection.msg_conn_id;
>  
>   /*
>* hv_post_message() can have transient failures because of
> @@ -372,6 +400,18 @@ int vmbus_post_msg(void *buffer, size_t buflen, bool 
> can_sleep)
>  
>   switch (ret) {
>   case HV_STATUS_INVALID_CONNECTION_ID:
> + /*
> +  * See vmbus_negotiate_version(): VMBus protocol 5.0
> +  * requires that we must use
> +  * VMBUS_MESSAGE_CONNECTION_ID_4 for the Initiate
> +  * Contact message, but on old hosts that only
> +  * support VMBus protocol 4.0 or lower, here we get
> +  * HV_STATUS_INVALID_CONNECTION_ID and we should
> +  * return an error immediately without retrying.
> +  */
> + hdr = (struct vmbus_channel_message_header *)buffer;

Hate to pick o the details, but buffer is void * so cast is not necessary here.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v2] hv_netvsc: Fix net device attach on older Windows hosts

2018-05-09 Thread Stephen Hemminger
On Wed,  9 May 2018 10:17:34 +0200
Mohammed Gamal <mga...@redhat.com> wrote:

> On older windows hosts the net_device instance is returned to
> the caller of rndis_filter_device_add() without having the presence
> bit set first. This would cause any subsequent calls to network device
> operations (e.g. MTU change, channel change) to fail after the device
> is detached once, returning -ENODEV.
> 
> Instead of returning the device instabce, we take the exit path where
> we call netif_device_attach()
> 
> Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic")
> 
> Signed-off-by: Mohammed Gamal <mga...@redhat.com>

The standard for patch submission is no blank lines between Fixes
and Signed-off-by to make it easier for bots.

> ---
>  drivers/net/hyperv/rndis_filter.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/hyperv/rndis_filter.c 
> b/drivers/net/hyperv/rndis_filter.c
> index 6b127be..e7ca5b5 100644
> --- a/drivers/net/hyperv/rndis_filter.c
> +++ b/drivers/net/hyperv/rndis_filter.c
> @@ -1288,7 +1288,7 @@ struct netvsc_device *rndis_filter_device_add(struct 
> hv_device *dev,
>  rndis_device->link_state ? "down" : "up");
>  
>   if (net_device->nvsp_version < NVSP_PROTOCOL_VERSION_5)
> - return net_device;
> + goto out;
>  
>   rndis_filter_query_link_speed(rndis_device, net_device);
>  


Reviewed-by: Stephen Hemminger <sthem...@microsoft.com>
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH] hv_netvsc: Fix net device attach on older Windows hosts

2018-05-08 Thread Stephen Hemminger
On Tue, 08 May 2018 20:17:51 +0200
Mohammed Gamal <mga...@redhat.com> wrote:

> On Tue, 2018-05-08 at 11:13 -0700, Stephen Hemminger wrote:
> > On Tue,  8 May 2018 19:40:47 +0200
> > Mohammed Gamal <mga...@redhat.com> wrote:
> >   
> > > On older windows hosts the net_device instance is returned to
> > > the caller of rndis_filter_device_add() without having the presence
> > > bit set first. This would cause any subsequent calls to network
> > > device
> > > operations (e.g. MTU change, channel change) to fail after the
> > > device
> > > is detached once, returning -ENODEV.
> > > 
> > > Make sure we explicitly call netif_device_attach() before returning
> > > the net_device instance to make sure the presence bit is set
> > > 
> > > Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic")
> > > 
> > > Signed-off-by: Mohammed Gamal <mga...@redhat.com>
> > > ---
> > >  drivers/net/hyperv/rndis_filter.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/net/hyperv/rndis_filter.c
> > > b/drivers/net/hyperv/rndis_filter.c
> > > index 6b127be..09a3c1d 100644
> > > --- a/drivers/net/hyperv/rndis_filter.c
> > > +++ b/drivers/net/hyperv/rndis_filter.c
> > > @@ -1287,8 +1287,10 @@ struct netvsc_device
> > > *rndis_filter_device_add(struct hv_device *dev,
> > >      rndis_device->hw_mac_adr,
> > >      rndis_device->link_state ? "down" : "up");
> > >  
> > > - if (net_device->nvsp_version < NVSP_PROTOCOL_VERSION_5)
> > > + if (net_device->nvsp_version < NVSP_PROTOCOL_VERSION_5) {
> > > + netif_device_attach(net);
> > >   return net_device;
> > > + }  
> > 
> > Yes, this looks right, but it might be easier to use goto existing
> > exit
> > path.
> >   
> 
> I was just not sure if we should set max_chn and num_chn here. I will
> modify the patch and resend.


On older code it was just a goto. At that point: num_chn = max_chn = 1
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH] Drivers: hv: vmbus: enable VMBus protocol version 5.0

2018-05-08 Thread Stephen Hemminger
On Tue, 8 May 2018 22:26:21 +
Dexuan Cui  wrote:

> --- a/drivers/hv/connection.c
> +++ b/drivers/hv/connection.c
> @@ -63,6 +63,9 @@ static __u32 vmbus_get_next_version(__u32 current_version)
>   case (VERSION_WIN10):
>   return VERSION_WIN8_1;
>  
> + case (VERSION_WIN10_V5):
> + return VERSION_WIN10;
> +
>   case (VERSION_WS2008):

In a future patch, we should get rid of the use of parenthesis on this switch
statement.  Lots of other cleanup of this code is possible as well.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH] hv_netvsc: Fix net device attach on older Windows hosts

2018-05-08 Thread Stephen Hemminger
On Tue,  8 May 2018 19:40:47 +0200
Mohammed Gamal  wrote:

> On older windows hosts the net_device instance is returned to
> the caller of rndis_filter_device_add() without having the presence
> bit set first. This would cause any subsequent calls to network device
> operations (e.g. MTU change, channel change) to fail after the device
> is detached once, returning -ENODEV.
> 
> Make sure we explicitly call netif_device_attach() before returning
> the net_device instance to make sure the presence bit is set
> 
> Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic")
> 
> Signed-off-by: Mohammed Gamal 
> ---
>  drivers/net/hyperv/rndis_filter.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/hyperv/rndis_filter.c 
> b/drivers/net/hyperv/rndis_filter.c
> index 6b127be..09a3c1d 100644
> --- a/drivers/net/hyperv/rndis_filter.c
> +++ b/drivers/net/hyperv/rndis_filter.c
> @@ -1287,8 +1287,10 @@ struct netvsc_device *rndis_filter_device_add(struct 
> hv_device *dev,
>  rndis_device->hw_mac_adr,
>  rndis_device->link_state ? "down" : "up");
>  
> - if (net_device->nvsp_version < NVSP_PROTOCOL_VERSION_5)
> + if (net_device->nvsp_version < NVSP_PROTOCOL_VERSION_5) {
> + netif_device_attach(net);
>   return net_device;
> + }

Yes, this looks right, but it might be easier to use goto existing exit
path.

diff --git a/drivers/net/hyperv/rndis_filter.c 
b/drivers/net/hyperv/rndis_filter.c
index 3b6dbacaf77d..ed941c5a0be9 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -1316,7 +1316,7 @@ struct netvsc_device *rndis_filter_device_add(struct 
hv_device *dev,
   rndis_device->link_state ? "down" : "up");
 
if (net_device->nvsp_version < NVSP_PROTOCOL_VERSION_5)
-   return net_device;
+   goto out;
 
rndis_filter_query_link_speed(rndis_device, net_device);

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH] doc: fix sysfs ABI documentation

2018-05-01 Thread Stephen Hemminger
In 4.9 kernel, the sysfs files for Hyper-V VMBus changed name but
the documentation files were not updated. The current sysfs file
names are /sys/bus/vmbus/devices//...

See commit 9a56e5d6a0ba ("Drivers: hv: make VMBus bus ids persistent")
and commit f6b2db084b65 ("vmbus: make sysfs names consistent with PCI")

Reported-by: Michael Kelley <mikel...@microsoft.com>
Signed-off-by: Stephen Hemminger <sthem...@microsoft.com>
---
 Documentation/ABI/stable/sysfs-bus-vmbus | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus 
b/Documentation/ABI/stable/sysfs-bus-vmbus
index 0c9d9dcd2151..3eaffbb2d468 100644
--- a/Documentation/ABI/stable/sysfs-bus-vmbus
+++ b/Documentation/ABI/stable/sysfs-bus-vmbus
@@ -1,25 +1,25 @@
-What:  /sys/bus/vmbus/devices/vmbus_*/id
+What:  /sys/bus/vmbus/devices//id
 Date:  Jul 2009
 KernelVersion: 2.6.31
 Contact:   K. Y. Srinivasan <k...@microsoft.com>
 Description:   The VMBus child_relid of the device's primary channel
 Users: tools/hv/lsvmbus
 
-What:  /sys/bus/vmbus/devices/vmbus_*/class_id
+What:  /sys/bus/vmbus/devices//class_id
 Date:  Jul 2009
 KernelVersion: 2.6.31
 Contact:   K. Y. Srinivasan <k...@microsoft.com>
 Description:   The VMBus interface type GUID of the device
 Users: tools/hv/lsvmbus
 
-What:  /sys/bus/vmbus/devices/vmbus_*/device_id
+What:  /sys/bus/vmbus/devices//device_id
 Date:  Jul 2009
 KernelVersion: 2.6.31
 Contact:   K. Y. Srinivasan <k...@microsoft.com>
 Description:   The VMBus interface instance GUID of the device
 Users: tools/hv/lsvmbus
 
-What:  /sys/bus/vmbus/devices/vmbus_*/channel_vp_mapping
+What:  /sys/bus/vmbus/devices//channel_vp_mapping
 Date:  Jul 2015
 KernelVersion: 4.2.0
 Contact:   K. Y. Srinivasan <k...@microsoft.com>
@@ -28,112 +28,112 @@ Description:  The mapping of which primary/sub 
channels are bound to which
Format: <channel's child_relid:the bound cpu's number>
 Users: tools/hv/lsvmbus
 
-What:  /sys/bus/vmbus/devices/vmbus_*/device
+What:  /sys/bus/vmbus/devices//device
 Date:  Dec. 2015
 KernelVersion: 4.5
 Contact:   K. Y. Srinivasan <k...@microsoft.com>
 Description:   The 16 bit device ID of the device
 Users: tools/hv/lsvmbus and user level RDMA libraries
 
-What:  /sys/bus/vmbus/devices/vmbus_*/vendor
+What:  /sys/bus/vmbus/devices//vendor
 Date:  Dec. 2015
 KernelVersion: 4.5
 Contact:   K. Y. Srinivasan <k...@microsoft.com>
 Description:   The 16 bit vendor ID of the device
 Users: tools/hv/lsvmbus and user level RDMA libraries
 
-What:  /sys/bus/vmbus/devices/vmbus_*/channels/NN
+What:  /sys/bus/vmbus/devices//channels/
 Date:  September. 2017
 KernelVersion: 4.14
 Contact:   Stephen Hemminger <sthem...@microsoft.com>
 Description:   Directory for per-channel information
NN is the VMBUS relid associtated with the channel.
 
-What:  /sys/bus/vmbus/devices/vmbus_*/channels/NN/cpu
+What:  /sys/bus/vmbus/devices//channels//cpu
 Date:  September. 2017
 KernelVersion: 4.14
 Contact:   Stephen Hemminger <sthem...@microsoft.com>
 Description:   VCPU (sub)channel is affinitized to
 Users: tools/hv/lsvmbus and other debugging tools
 
-What:  /sys/bus/vmbus/devices/vmbus_*/channels/NN/cpu
+What:  /sys/bus/vmbus/devices//channels//cpu
 Date:  September. 2017
 KernelVersion: 4.14
 Contact:   Stephen Hemminger <sthem...@microsoft.com>
 Description:   VCPU (sub)channel is affinitized to
 Users: tools/hv/lsvmbus and other debugging tools
 
-What:  /sys/bus/vmbus/devices/vmbus_*/channels/NN/in_mask
+What:  /sys/bus/vmbus/devices//channels//in_mask
 Date:  September. 2017
 KernelVersion: 4.14
 Contact:   Stephen Hemminger <sthem...@microsoft.com>
 Description:   Host to guest channel interrupt mask
 Users: Debugging tools
 
-What:  /sys/bus/vmbus/devices/vmbus_*/channels/NN/latency
+What:  /sys/bus/vmbus/devices//channels//latency
 Date:  September. 2017
 KernelVersion: 4.14
 Contact:   Stephen Hemminger <sthem...@microsoft.com>
 Description:   Channel signaling latency
 Users: Debugging tools
 
-What:  /sys/bus/vmbus/devices/vmbus_*/channels/NN/out_mask
+What:  /sys/bus/vmbus/devices//channels//out_mask
 Date:  September. 2017
 KernelVersion: 4.14
 Contact:   Stephen Hemminger <sthem...@microsoft.com>
 Description:   Guest to host channel interrupt mask
 Users: Debugging tools
 
-What:  /sys/bus/vmbus/devices/vmbus_*/channels/NN/pending
+What:  /sys/bus/vmbus/devices//c

Re: [Patch v2] Storvsc: Select channel based on available percentage of ring buffer to write

2018-04-20 Thread Stephen Hemminger
On Thu, 19 Apr 2018 14:54:24 -0700
Long Li <lon...@linuxonhyperv.com> wrote:

> From: Long Li <lon...@microsoft.com>
> 
> This is a best effort for estimating on how busy the ring buffer is for
> that channel, based on available buffer to write in percentage. It is still
> possible that at the time of actual ring buffer write, the space may not be
> available due to other processes may be writing at the time.
> 
> Selecting a channel based on how full it is can reduce the possibility that
> a ring buffer write will fail, and avoid the situation a channel is over
> busy.
> 
> Now it's possible that storvsc can use a smaller ring buffer size
> (e.g. 40k bytes) to take advantage of cache locality.
> 
> Changes.
> v2: Pre-allocate struct cpumask on the heap.
> Struct cpumask is a big structure (1k bytes) when CONFIG_NR_CPUS=8192 (default
> value when CONFIG_MAXSMP=y). Don't use kernel stack for it by pre-allocating
> them using kmalloc when channels are first initialized.
> 
> Signed-off-by: Long Li <lon...@microsoft.com>

Reviewed-by: Stephen Hemminger <sthem...@microsoft.com>
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v2 2/4] uio_hv_generic: make ring buffer attribute for primary channel

2018-04-16 Thread Stephen Hemminger
The primary channel also needs a ring buffer attribute. This allows
application to check if kernel supports uio sub channels, and also
makes all channels use consistent API.

Signed-off-by: Stephen Hemminger <sthem...@microsoft.com>
---
 drivers/uio/uio_hv_generic.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index 7ff659dff11d..972a42dd2a46 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -326,6 +326,11 @@ hv_uio_probe(struct hv_device *dev,
vmbus_set_chn_rescind_callback(dev->channel, hv_uio_rescind);
vmbus_set_sc_create_callback(dev->channel, hv_uio_new_channel);
 
+   ret = sysfs_create_bin_file(>channel->kobj, _buffer_bin_attr);
+   if (ret)
+   dev_notice(>device,
+  "sysfs create ring bin file failed; %d\n", ret);
+
hv_set_drvdata(dev, pdata);
 
return 0;
-- 
2.17.0

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


  1   2   3   4   5   6   >