RE: [net-next 04/14] i40e: dump VF information in debugfs

2017-04-20 Thread Williams, Mitch A


> -Original Message-
> From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org] On
> Behalf Of David Miller
> Sent: Thursday, April 20, 2017 8:20 AM
> To: yuval.mi...@cavium.com
> Cc: gerlitz...@gmail.com; Kirsher, Jeffrey T <jeffrey.t.kirs...@intel.com>;
> Williams, Mitch A <mitch.a.willi...@intel.com>; netdev@vger.kernel.org;
> nhor...@redhat.com; sassm...@redhat.com; jogre...@redhat.com
> Subject: Re: [net-next 04/14] i40e: dump VF information in debugfs
> 
> From: "Mintz, Yuval" <yuval.mi...@cavium.com>
> Date: Thu, 20 Apr 2017 10:09:28 +
> 
> >> > Dump some internal state about VFs through debugfs. This provides
> >> > information not available with 'ip link show'.
> >>
> >> such as?
> >>
> >> donnwantobethedebugfspolice, but still, in the 2-3 times we tried to push
> >> debugfs to MLNX NIC drivers, Dave disallowed that, and lately the switch
> >> team even went further and deleted that portion of the mlxsw driver --
> all to
> >> all,  I don't see much point for these type of changes, thoughts?
> >
> > Don't want to hikjack your thread, but continuing this topic -
> > Is there some flat-out disapproval for debugfs in net-next now?
> >
> > We're currently internally engaged with adding qed support for register
> dumps
> > [~equivalents for `ethtool -d' outputs] through debugfs, on behalf of
> storage
> > drivers [qedi/qedf] lacking the API for doing that.
> 
> I really hate to see debugfs things in networking drivers.  It's a
> complete cop out for doing things properly.
> 
> I push back, but I can only fight too much.  If people want to keep
> adding stupid poorly designed crap endlessly to their drivers there
> is only so much I can do...

Honestly, Dave, I try very hard not to put random crap into debugfs. I felt 
that there was enough utility here to send this one up.

We already have debugfs hooks upstream that allow us to dump information about 
a specific VSI. This just adds a tiny bit to that to allow us to link a VF with 
its specific VSI. It's the missing link, if you will.

I'm not adding backdoor hooks to do naughty things, I'm just (very slightly) 
enhancing what's already there.

This particular piece has been very useful for my debugging, and we have 
customers that have asked for it as well.

If you still want to reject this, I won't fight and cry and stomp my little 
feet. But I honestly think that this a) has genuine utility, b) is fairly 
device specific, and c) isn't crap.

-Mitch



RE: [PATCH] i40e: limit client interface to X722 hardware

2017-04-04 Thread Williams, Mitch A


> -Original Message-
> From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org] On
> Behalf Of Stefan Assmann
> Sent: Tuesday, April 04, 2017 12:52 PM
> To: Or Gerlitz 
> Cc: intel-wired-...@lists.osuosl.org; Linux Netdev List
> ; David Miller ; Kirsher,
> Jeffrey T 
> Subject: Re: [PATCH] i40e: limit client interface to X722 hardware
> 
> On 04.04.2017 18:56, Or Gerlitz wrote:
> > On Tue, Apr 4, 2017 at 5:34 PM, Stefan Assmann  wrote:
> >> The client interface is meant for X722 iWARP support. Modprobing i40iw
> >> on systems with X710/XL710 NICs currently may crash the system.
> >
> > just curious may or crash? and why?
> 
> The backtrace I got was not really conclusive. The code is not meant to
> be run on that hardware so I didn't bother to dig deeper.
> 
>   Stefan

The i40iw module can't easily determine which hardware its loaded upon. So it 
assumes that we (i40e, that is) have handed it a handle to valid hardware. When 
the interface is opened, it starts reading and writing registers that are 
nonexistent on X710/XL710.

-Mitch


RE: [PATCH] i40e: limit client interface to X722 hardware

2017-04-04 Thread Williams, Mitch A


> -Original Message-
> From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org] On
> Behalf Of Stefan Assmann
> Sent: Tuesday, April 04, 2017 7:35 AM
> To: intel-wired-...@lists.osuosl.org
> Cc: netdev@vger.kernel.org; da...@davemloft.net; Kirsher, Jeffrey T
> ; sassm...@kpanic.de
> Subject: [PATCH] i40e: limit client interface to X722 hardware
> 
> The client interface is meant for X722 iWARP support. Modprobing i40iw
> on systems with X710/XL710 NICs currently may crash the system. Adding a
> check which limits client interface access to the appropriate hardware.
> 
> Signed-off-by: Stefan Assmann 
> ---
>  drivers/net/ethernet/intel/i40e/i40e_client.c | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_client.c
> b/drivers/net/ethernet/intel/i40e/i40e_client.c
> index 191028b..6f873449 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_client.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_client.c
> @@ -525,15 +525,17 @@ static void i40e_client_release(struct i40e_client
> *client)
>  static void i40e_client_prepare(struct i40e_client *client)
>  {
>   struct i40e_device *ldev;
> - struct i40e_pf *pf;
> 
>   mutex_lock(_device_mutex);
>   list_for_each_entry(ldev, _devices, list) {
> - pf = ldev->pf;
> - i40e_client_add_instance(pf);
> - /* Start the client subtask */
> - pf->flags |= I40E_FLAG_SERVICE_CLIENT_REQUESTED;
> - i40e_service_event_schedule(pf);
> + struct i40e_pf *pf = ldev->pf;
> +
> + if (pf->hw.mac.type == I40E_MAC_X722) {
> + i40e_client_add_instance(pf);
> + /* Start the client subtask */
> + pf->flags |= I40E_FLAG_SERVICE_CLIENT_REQUESTED;
> + i40e_service_event_schedule(pf);
> + }
>   }
>   mutex_unlock(_device_mutex);
>  }
> --
> 2.9.3

Thanks for pointing this out, Stephan. I think that's not exactly the right way 
to do this check. Instead, we need to look at the IWARP flag. I'll get a patch 
out today to take care of it.

So consider this a thankful NAK.

-Mitch



RE: Kernel 4.6.7-rt13: Intel Ethernet driver igb causes huge latencies in cyclictest

2016-10-06 Thread Williams, Mitch A


> -Original Message-
> From: Intel-wired-lan [mailto:intel-wired-lan-boun...@lists.osuosl.org] On
> Behalf Of Koehrer Mathias (ETAS/ESW5)
> Sent: Thursday, October 06, 2016 12:02 AM
> To: Julia Cartwright ; Kirsher, Jeffrey T
> ; Greg 
> Cc: netdev@vger.kernel.org; intel-wired-...@lists.osuosl.org; linux-rt-
> us...@vger.kernel.org; Sebastian Andrzej Siewior
> 
> Subject: Re: [Intel-wired-lan] Kernel 4.6.7-rt13: Intel Ethernet driver igb
> causes huge latencies in cyclictest
> 
> Hi all,
> >
> > Although, to be clear, it isn't the fact that there exists 8 threads, it's
> that the device is
> > firing all 8 interrupts at the same time.  The time spent in hardirq
> context just waking
> > up all 8 of those threads (and the cyclictest wakeup) is enough to cause
> your
> > regression.
> >
> > netdev/igb folks-
> >
> > Under what conditions should it be expected that the i350 trigger all of
> the TxRx
> > interrupts simultaneously?  Any ideas here?

I can answer that! I wrote that code.

We trigger the interrupts once a second because MSI and MSI-X interrupts are 
NOT guaranteed to be delivered. If this happens, the queues being serviced by 
this "lost" interrupt are completely stuck.

The device automatically masks each interrupt vector after it fires, expecting 
the ISR to re-enable the vector after processing is complete. If the interrupt 
is lost, the ISR doesn't run, so the vector ends up permanently masked. At this 
point, any queues associated with that vector are stuck. The only recovery is 
through the netdev watchdog, which initiates a reset.

During development of igb, we had several platforms with chipsets that 
routinely dropped MSI messages under stress. Things would be running fine and 
then, pow, all the traffic on a queue would stop. 

So, I added code to fire each vector once per second. Just unmasking the 
interrupt isn't enough - we need to trigger the ISR to get the queues cleaned 
up so the device can work again.

Is this workaround still needed? I don't know. Modern chipsets don't break a 
sweat handling gigabit-speed traffic, and they *probably* don't drop 
interrupts. But I'd still rather have that insurance.

You could try to remove the write to the EICS registers in the watchdog task to 
see if that takes care of your problem. But I wouldn't want to remove that code 
permanently, because we have seen lost interrupts in the past.

You also could try staggering the writes so that not all vectors fire each 
second. But then you'll potentially incur a much longer delay if an interrupt 
does get lost, which means you could trigger netdev watchdog events.

-Mitch



> >
> > See the start of this thread here:
> >
> >   http://lkml.kernel.org/r/d648628329bc446fa63b5e19d4d3fb56@FE-
> > MBX1012.de.bosch.com
> >
> Greg recommended to use "ethtool -L eth2 combined 1" to reduce the number of
> queues.
> I tried that. Now, I have actually only three irqs (eth2, eth2-rx-0, eth2-
> tx-0).
> However the issue remains the same.
> 
> I ran the cyclictest again:
> # cyclictest -a -i 105 -m -n -p 80 -t 1  -b 23 -C
> (Note: When using 105us instead of 100us the long latencies seem to occur
> more often).
> 
> Here are the final lines of the kernel trace output:
>   -0   4d...2.. 1344661649us : sched_switch: prev_comm=swapper/4
> prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcuc/4 next_pid=56
> next_prio=98
> ktimerso-46  3d...2.. 1344661650us : sched_switch:
> prev_comm=ktimersoftd/3 prev_pid=46 prev_prio=98 prev_state=S ==>
> next_comm=swapper/3 next_pid=0 next_prio=120
> ktimerso-24  1d...2.. 1344661650us : sched_switch:
> prev_comm=ktimersoftd/1 prev_pid=24 prev_prio=98 prev_state=S ==>
> next_comm=swapper/1 next_pid=0 next_prio=120
> ktimerso-79  6d...2.. 1344661650us : sched_switch:
> prev_comm=ktimersoftd/6 prev_pid=79 prev_prio=98 prev_state=S ==>
> next_comm=swapper/6 next_pid=0 next_prio=120
> ktimerso-35  2d...2.. 1344661650us : sched_switch:
> prev_comm=ktimersoftd/2 prev_pid=35 prev_prio=98 prev_state=S ==>
> next_comm=swapper/2 next_pid=0 next_prio=120
>   rcuc/5-67  5d...2.. 1344661650us : sched_switch: prev_comm=rcuc/5
> prev_pid=67 prev_prio=98 prev_state=S ==> next_comm=ktimersoftd/5
> next_pid=68 next_prio=98
>   rcuc/7-89  7d...2.. 1344661650us : sched_switch: prev_comm=rcuc/7
> prev_pid=89 prev_prio=98 prev_state=S ==> next_comm=ktimersoftd/7
> next_pid=90 next_prio=98
> ktimerso-4   0d...211 1344661650us : sched_wakeup: comm=rcu_preempt
> pid=8 prio=98 target_cpu=000
>   rcuc/4-56  4d...2.. 1344661651us : sched_switch: prev_comm=rcuc/4
> prev_pid=56 prev_prio=98 prev_state=S ==> next_comm=ktimersoftd/4
> next_pid=57 next_prio=98
> ktimerso-4   0d...2.. 1344661651us : sched_switch:
> prev_comm=ktimersoftd/0 prev_pid=4 prev_prio=98 prev_state=S ==>
> next_comm=rcu_preempt next_pid=8 next_prio=98
> ktimerso-90  7d...2.. 

RE: [Intel-wired-lan] [PATCH 2/2] ixgbe: Teardown SR-IOV before unregister_netdev()

2015-07-27 Thread Williams, Mitch A
ACK

 -Original Message-
 From: Intel-wired-lan [mailto:intel-wired-lan-boun...@lists.osuosl.org] On
 Behalf Of Alex Williamson
 Sent: Monday, July 27, 2015 4:19 PM
 To: intel-wired-...@lists.osuosl.org; Kirsher, Jeffrey T
 Cc: netdev@vger.kernel.org; linux-ker...@vger.kernel.org
 Subject: [Intel-wired-lan] [PATCH 2/2] ixgbe: Teardown SR-IOV before
 unregister_netdev()
 
 When the .remove() callback for a PF is called, SR-IOV support for the
 device is disabled, which requires unbinding and removing the VFs.
 The VFs may be in-use either by the host kernel or userspace, such as
 assigned to a VM through vfio-pci.  In this latter case, the VFs may
 be removed either by shutting down the VM or hot-unplugging the
 devices from the VM.  Unfortunately in the case of a Windows 2012 R2
 guest, hot-unplug is broken due to the ordering of the PF driver
 teardown.  Disabling SR-IOV prior to unregister_netdev() avoids this
 issue.
 
 Signed-off-by: Alex Williamson alex.william...@redhat.com
 ---
  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)
 
 diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
 b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
 index f775123..e27813c 100644
 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
 +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
 @@ -9035,12 +9035,12 @@ static void ixgbe_remove(struct pci_dev *pdev)
   /* remove the added san mac */
   ixgbe_del_sanmac_netdev(netdev);
 
 - if (netdev-reg_state == NETREG_REGISTERED)
 - unregister_netdev(netdev);
 -
  #ifdef CONFIG_PCI_IOV
   ixgbe_disable_sriov(adapter);
  #endif
 + if (netdev-reg_state == NETREG_REGISTERED)
 + unregister_netdev(netdev);
 +
   ixgbe_clear_interrupt_scheme(adapter);
 
   ixgbe_release_hw_control(adapter);
 
 ___
 Intel-wired-lan mailing list
 intel-wired-...@lists.osuosl.org
 http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [Intel-wired-lan] [PATCH 1/2] igb: Teardown SR-IOV before unregister_netdev()

2015-07-27 Thread Williams, Mitch A
ACK

 -Original Message-
 From: Intel-wired-lan [mailto:intel-wired-lan-boun...@lists.osuosl.org] On
 Behalf Of Alex Williamson
 Sent: Monday, July 27, 2015 4:19 PM
 To: intel-wired-...@lists.osuosl.org; Kirsher, Jeffrey T
 Cc: netdev@vger.kernel.org; linux-ker...@vger.kernel.org
 Subject: [Intel-wired-lan] [PATCH 1/2] igb: Teardown SR-IOV before
 unregister_netdev()
 
 When the .remove() callback for a PF is called, SR-IOV support for the
 device is disabled, which requires unbinding and removing the VFs.
 The VFs may be in-use either by the host kernel or userspace, such as
 assigned to a VM through vfio-pci.  In this latter case, the VFs may
 be removed either by shutting down the VM or hot-unplugging the
 devices from the VM.  Unfortunately in the case of a Windows 2012 R2
 guest, hot-unplug is broken due to the ordering of the PF driver
 teardown.  Disabling SR-IOV prior to unregister_netdev() avoids this
 issue.
 
 Signed-off-by: Alex Williamson alex.william...@redhat.com
 ---
  drivers/net/ethernet/intel/igb/igb_main.c |8 
  1 file changed, 4 insertions(+), 4 deletions(-)
 
 diff --git a/drivers/net/ethernet/intel/igb/igb_main.c
 b/drivers/net/ethernet/intel/igb/igb_main.c
 index 517746f..606a7ae 100644
 --- a/drivers/net/ethernet/intel/igb/igb_main.c
 +++ b/drivers/net/ethernet/intel/igb/igb_main.c
 @@ -2805,14 +2805,14 @@ static void igb_remove(struct pci_dev *pdev)
*/
   igb_release_hw_control(adapter);
 
 - unregister_netdev(netdev);
 -
 - igb_clear_interrupt_scheme(adapter);
 -
  #ifdef CONFIG_PCI_IOV
   igb_disable_sriov(pdev);
  #endif
 
 + unregister_netdev(netdev);
 +
 + igb_clear_interrupt_scheme(adapter);
 +
   pci_iounmap(pdev, hw-hw_addr);
   if (hw-flash_address)
   iounmap(hw-flash_address);
 
 ___
 Intel-wired-lan mailing list
 intel-wired-...@lists.osuosl.org
 http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC 2/2] shrink size of scatterlist on common i386/x86-64

2007-07-11 Thread Williams, Mitch A
 
Muli Ben-Yehuda wrote:
 Net result:  zilch.  No performance increase, no noticeable CPU
 utilization
 benefits.  Nothing.  So I dropped it.

Do you have pointers to the patches perchance?

Muli, I've been looking for this code and it looks like it's gone. 
I was using a Power5 system that I had borrowed and the folks I
borrowed it from have reformatted the drive.  I never bothered
to hang on to the patches because they were neither particularly
pretty or particularly useful.  Sorry.
-Mitch
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC 2/2] shrink size of scatterlist on common i386/x86-64

2007-07-06 Thread Williams, Mitch A
David Miller wrote:
 Okay, but then using SG lists makes skbuff's much bigger.
 
  fraglistscatterlistper skbuff
 32 bit   8   20  +12 * 18 = +216!
 64 bit   16  32  +16 * 18 = +288
 
 So never mind...

I know, this is why nobody ever really tries to tackle this.

 I'll do a fraglist to scatter list set of routines, but not sure
 if it's worth it.

It's better to add dma_map_skb() et al. interfaces to be honest.

Also even with the scatterlist idea, we'd still need to do two
map calls, one for skb-data and one for the page vector.

FWIW, I tried this about a year ago to try to improve e1000 performance
on pSeries.  I was hoping to simplify the driver transmit code and make
IOMMU mapping easier.  This was on 2.6.16 or thereabouts.

Net result:  zilch.  No performance increase, no noticeable CPU
utilization
benefits.  Nothing.  So I dropped it.

Slightly off topic:
The real problem that I saw on pSeries is lock contention for the IOMMU.
It's architected with a single table per slot, which is great in that
two boards in separate slots won't have lock contention.  However, this
all goes out the window when you drop a quad-port gigabit adapter in
there.
The time spent waiting for the IOMMU table lock goes up exponentially
as you activate each additional port.

In my opinion, IOMMU table locking is the major issue with this type of
architecture.  Since both Intel and AMD are touting IOMMUs for virtual-
ization support, this is an issue that's going to need a lot of
scrutiny.

-Mitch
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [E1000-devel] e1000: backport ich9 support from 7.5.5 ?

2007-07-02 Thread Williams, Mitch A
 

Mark McLoughlin wrote:

 I disagree, we should not break the current e1000 driver in 
the kernel while 
 there is a new driver coming up that introduces ich9 support 
without breaking 
 (the old e1000) support for all other devices. This is why 
we want to drop a new 
 version of the e1000 driver upstream instead, and put all 
newer devices in that 
 driver.

   Clearly some major changes are happening around e1000, 
but the point
I'm making is that ich9 support, in particular, doesn't need to be held
hostage to these longer term changes. I think the backport shows that:


There seems to be a lot of concern over obsoleting the e1000 driver
too quickly, and with confusing users (and startup scripts) about which
driver to load.

Obviously, we at Intel want to get e1000new into the kernel as quickly
as possible, and to obsolete e1000 also as quickly as possible.  The
point of this exercise is to reduce our support and maintenance burden,
and e1000new has shown itself internally to be much easier to work on.

So how about this:
- We include e1000new in 2.6.23, along side e1000.  We expose ICH9
  device IDs in e1000new, and gate the rest of the IDs inside
  #ifndef CONFIG_E1000.  This gives distis ICH9 support, along
  with a guarantee of the known e1000 driver.  It also lets the more
  bleeding-edge people turn off e1000 and run with just e1000new to
  work out any kinks.
- For 2.6.24, we reverse the situation:  Gate all device IDs in e1000
  behind #ifndef CONFIG_E1000NEW, and expose all of them in e1000new.
  At this point we (i.e. the community, not just Intel) should be 
  confident in e1000new, and we can mark e1000 as obsolete.  It's still
  a fallback option for those with old/funky/misconfigured hardware.
- After a year (or whatever), we remove e1000.

Seems to me that this plan should appease either everybody or nobody.
We get ICH9 support out there, e1000new gets in the kernel and
exercised,
and we get to set a definite date for obsoleting e1000.

-Mitch
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Netpoll client software?

2006-11-14 Thread Williams, Mitch A
Hi folks, I'm looking for some suggestions for software that uses
the kernel's netpoll interface to receive packets.  The only in-kernel
application that uses netpoll right now is netconsole, and that only
sends packets.

Used to be, netdump would use the netpoll interface for both transmit
and receive, but netdump is gone now, and won't even compile on any
recent kernel.

I've made some changes to e1000's netpoll code to support some Top
Secret Upcoming Hardware, and I would like to be able to validate
my changes.

Any suggestions?

Thanks,
Mitch
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/23] e1000: add multicast stats counters

2006-09-20 Thread Williams, Mitch A
 +{ rx_broadcast, E1000_STAT(stats.bprc) },
 +{ tx_broadcast, E1000_STAT(stats.bptc) },
 +{ rx_multicast, E1000_STAT(stats.mprc) },
 +{ tx_multicast, E1000_STAT(stats.mptc) },
  { rx_errors, E1000_STAT(net_stats.rx_errors) },
  { tx_errors, E1000_STAT(net_stats.tx_errors) },
  { tx_dropped, E1000_STAT(net_stats.tx_dropped) },

NAK -- you also need to remove the standard net stats, which are 
exported elsewhere

Jeff, can you please explain the reason for this NAK a little more?
Neither Auke nor I understand why you rejected the patch.

This patch just adds the display of a few more stats in Ethtool.  It
doesn't affect any other counters, and is really just a convenience
feature.  I added this to the driver because of a customer request.

Thank you in advance for edifying us.

-Mitch

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html