Re: [PATCH] net: asix: Avoid looping when the device does not respond

2016-10-14 Thread David Miller
From: Guenter Roeck 
Date: Thu, 13 Oct 2016 16:43:16 -0700

> Check answers from USB stack and avoid re-sending the request
> multiple times if the device does not respond.
> 
> This fixes the following problem, observed with a probably flaky adapter.
 ...
> Since the USB timeout is 5 seconds, and the operation is retried 30 times,
> this results in
 ...
> Signed-off-by: Guenter Roeck 

Applied, thanks.
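
For readers skimming the archive: with the numbers quoted above, 30 retries at a 5 second USB timeout add up to 150 seconds for a single stuck operation. Below is a minimal userspace sketch of the idea in the patch summary, namely to stop polling as soon as the transfer itself fails. The names xfer_fn, poll_until_ready and dead_device are hypothetical stand-ins for illustration and are not taken from the asix driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transfer callback: returns 0 on success and a negative
 * value on error; on success it stores a status word in *status. */
typedef int (*xfer_fn)(void *ctx, int *status);

/* Poll up to max_tries times for a non-zero "ready" status.
 * Returns 1 when ready, 0 when all tries were used up, or the negative
 * error of the first failed transfer. Returning on the first error means
 * an unresponsive device costs one timeout instead of max_tries timeouts. */
static int poll_until_ready(xfer_fn xfer, void *ctx, int max_tries)
{
    assert(xfer != NULL && max_tries > 0);

    for (int i = 0; i < max_tries; i++) {
        int status = 0;
        int ret = xfer(ctx, &status);

        if (ret < 0)
            return ret;     /* device did not answer: give up now */
        if (status)
            return 1;       /* device reports ready */
    }
    return 0;               /* answered every time, never became ready */
}

/* Test double: a device that never answers; counts how often it is asked. */
static int dead_device(void *ctx, int *status)
{
    (void)status;
    ++*(int *)ctx;
    return -1;
}
```

With the dead_device double, the loop asks exactly once and then returns the error, rather than waiting out all 30 attempts.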


Re: [PATCH NET] ethtool: silence warning on bit loss

2016-10-14 Thread David Miller
From: Jesse Brandeburg 
Date: Thu, 13 Oct 2016 16:13:55 -0700

> Sparse was complaining when we went to prototype some code
> using ethtool_cmd_speed_set and SPEED_10, which uses
> the upper 16 bits of __u32 speed for the first time.
> 
> CHECK
> ...
> .../uapi/linux/ethtool.h:123:28: warning:
>   cast truncates bits from constant value (186a0 becomes 86a0)
> 
> The warning is actually bogus, as no bits are really lost, but
> we can get rid of the sparse warning with this one small change.
> 
> Reported-by: Preethi Banala 
> Signed-off-by: Jesse Brandeburg 

Ok, I'll apply this.

There were alternative suggestions but I like this patch
because it makes it explicit what is going on.

Just removing the u16 cast requires the reader to implicitly
understand and know the types in the structure.
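
To make the point about explicitness concrete, here is a small userspace sketch of splitting a 32-bit speed into two 16-bit halves. It assumes the mask in question is the low 16 bits (0xFFFF); struct speed_pair and the helper names are made up for the illustration and are not the ethtool UAPI.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for the two 16-bit speed fields. */
struct speed_pair {
    uint16_t lo;
    uint16_t hi;
};

/* Split a 32-bit speed. Masking before the cast states outright that only
 * the low 16 bits are kept here; the high bits go into the other field. */
static struct speed_pair split_speed(uint32_t speed)
{
    struct speed_pair p;

    p.lo = (uint16_t)(speed & 0xFFFFu);
    p.hi = (uint16_t)(speed >> 16);
    return p;
}

/* Reassemble the original value from the two halves. */
static uint32_t join_speed(struct speed_pair p)
{
    return ((uint32_t)p.hi << 16) | p.lo;
}

/* Round-trip check: no bits are lost across split and join. */
static void check_roundtrip(uint32_t speed)
{
    assert(join_speed(split_speed(speed)) == speed);
}
```

For the constant in the sparse warning, 0x186a0, the low half is 0x86a0 and the high half is 0x1, so nothing is lost once both fields are written.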


Re: [PATCH] net: limit a number of namespaces which can be cleaned up concurrently

2016-10-14 Thread Andrei Vagin
On Thu, Oct 13, 2016 at 10:06:28PM -0500, Eric W. Biederman wrote:
> Andrei Vagin  writes:
> 
> > On Thu, Oct 13, 2016 at 10:49:38AM -0500, Eric W. Biederman wrote:
> >> Andrei Vagin  writes:
> >> 
> >> > From: Andrey Vagin 
> >> >
> >> > The operation of destroying netns is heavy and it is executed under
> >> > net_mutex. If many namespaces are destroyed concurrently, net_mutex can
> >> > be locked for a long time. It is impossible to create a new netns during
> >> > this period of time.
> >> 
> >> This may be the right approach or at least the right approach to bound
> >> net_mutex hold times but I have to take exception to calling network
> >> namespace cleanup heavy.
> >> 
> >> The only particularly time consuming operations I have ever found are
> >> calls to synchronize_rcu/synchronize_sched/synchronize_net.
> >
> > I booted the kernel with maxcpus=1; in this case these functions run
> > very fast and the problem is still there anyway.
> >
> > According to perf, we spend a lot of time in kobject_uevent:
> >
> > -   99.96% 0.00%  kworker/u4:1 [kernel.kallsyms]  [k] unregister_netdevice_many
> >- unregister_netdevice_many
> >   - 99.95% rollback_registered_many
> >  - 99.64% netdev_unregister_kobject
> > - 33.43% netdev_queue_update_kobjects
> >- 33.40% kobject_put
> >   - kobject_release
> >  + 33.37% kobject_uevent
> >  + 0.03% kobject_del
> >+ 0.03% sysfs_remove_group
> > - 33.13% net_rx_queue_update_kobjects
> >- kobject_put
> >- kobject_release
> >   + 33.11% kobject_uevent
> >   + 0.01% kobject_del
> > 0.00% rx_queue_release
> > - 33.08% device_del
> >+ 32.75% kobject_uevent
> >+ 0.17% device_remove_attrs
> >+ 0.07% dpm_sysfs_remove
> >+ 0.04% device_remove_class_symlinks
> >+ 0.01% kobject_del
> >+ 0.01% device_pm_remove
> >+ 0.01% sysfs_remove_file_ns
> >+ 0.00% klist_del
> >+ 0.00% driver_deferred_probe_del
> >  0.00% cleanup_glue_dir.isra.14.part.15
> >  0.00% to_acpi_device_node
> >  0.00% sysfs_remove_group
> >   0.00% klist_del
> >   0.00% device_remove_attrs
> >  + 0.26% call_netdevice_notifiers_info
> >  + 0.04% rtmsg_ifinfo_build_skb
> >  + 0.01% rtmsg_ifinfo_send
> > 0.00% dev_uc_flush
> > 0.00% netif_reset_xps_queues_gt
> >
> > Someone may be listening for these uevents, so we can't stop sending
> > them without breaking backward compatibility. We can try to optimize
> > kobject_uevent...
> 
> Oh that is a surprise.  We can definitely skip generating uevents for
> network namespaces that are exiting because by definition no one can see
> those network namespaces.  If a socket existed that could see those
> uevents it would hold a reference to the network namespace and as such
> the network namespace could not exit.
> 
> That sounds like it is worth investigating a little more deeply.
> 
> I am surprised that allocation and freeing is so heavy we are spending
> lots of time doing that.  On the other hand kobj_bcast_filter is very
> dumb and very late so I expect something can be moved earlier and make
> that code cheaper with the tiniest bit of work.
> 

I'm sorry, I collected this data on a kernel with debug options
(DEBUG_SPINLOCK, PROVE_LOCKING, DEBUG_LIST, etc). If the kernel is
compiled without debug options, kobject_uevent becomes cheaper,
but it is still expensive.

-   98.64% 0.00%  kworker/u4:2  [kernel.kallsyms][k] cleanup_net
   - cleanup_net
  - 98.54% ops_exit_list.isra.4
 - 60.48% default_device_exit_batch
- 60.40% unregister_netdevice_many
   - rollback_registered_many
  - 59.82% netdev_unregister_kobject
 - 20.10% device_del
+ 19.44% kobject_uevent
+ 0.40% device_remove_attrs
+ 0.17% dpm_sysfs_remove
+ 0.04% device_remove_class_symlinks
+ 0.04% kobject_del
+ 0.01% device_pm_remove
+ 0.01% sysfs_remove_file_ns
 - 19.89% netdev_queue_update_kobjects
+ 19.81% kobject_put
+ 0.07% sysfs_remove_group
 - 19.79% net_rx_queue_update_kobjects
  kobject_put
- kobject_release
   + 19.77% kobject_uevent
   + 0.02% kobject_del
 0.01% rx_queue_release
 + 0.02% kset_unregister
 

Re: Kernel 4.6.7-rt13: Intel Ethernet driver igb causes huge latencies in cyclictest

2016-10-14 Thread Julia Cartwright
On Fri, Oct 14, 2016 at 08:58:22AM +, Koehrer Mathias (ETAS/ESW5) wrote:
> Hi Julia,
>
> > Have you tested on a vanilla (non-RT) kernel?  I doubt there is anything
> > RT specific about what you are seeing, but it might be nice to get
> > confirmation.  Also, bisection would probably be easier if you confirm
> > on a vanilla kernel.
> >
> > I find it unlikely that it's a kernel config option that changed which
> > regressed you, but instead was a code change to a driver.  Which driver
> > is now the question, and the surface area is still big (processor
> > mapping attributes for this region, PCI root complex configuration, PCI
> > bridge configuration, igb driver itself, etc.).
> >
> > Big enough that I'd recommend a bisection.  It looks like a bisection
> > between 3.18 and 4.8 would take you about 18 tries to narrow down,
> > assuming all goes well.
> >
>
> I have now repeated my tests using the vanilla kernel.
> There I got the very same issue.
> Using kernel 4.0 is fine, however starting with kernel 4.1, the issue appears.

Great, thanks for confirming!  That helps narrow things down quite a
bit.

> Here is my exact (reproducible) test description:
> I applied the following patch to the kernel to get the igb trace.
> This patch instruments the igb_rd32() function to measure the call
> to readl() which is used to access registers of the igb NIC.

I took your test setup and ran it between 4.0 and 4.1 on the hardware on
my desk, which is an Atom-based board with dual I210s, however I didn't
see much difference.

However, it's a fairly simple board, with a much simpler PCI topology
than your workstation.  I'll see if I can find some other hardware to
test on.

[..]
> This means, that I think that some other stuff in kernel 4.1 has changed,
> which has impact on the igb accesses.
>
> Any idea what component could cause this kind of issue?

Can you continue your bisection using 'git bisect'?  You've already
narrowed it down between 4.0 and 4.1, so you're well on your way.
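
As a rough guide to how many builds a bisection costs: a binary search over n candidate commits needs about log2(n) test steps, so the "about 18 tries" quoted above corresponds to a range on the order of 2^18 (roughly 262,000) commits. The sketch below only does that arithmetic; git's real step count can differ a little because history is a graph rather than a straight line, and the function name is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case number of test builds a binary search needs to single out
 * one bad commit among n candidates: the smallest k with 2^k >= n. */
static unsigned bisect_steps(uint64_t n)
{
    unsigned k = 0;
    uint64_t span = 1;

    /* Upper bound keeps the doubling below from overflowing. */
    assert(n > 0 && n <= (1ULL << 63));

    while (span < n) {
        span <<= 1;
        k++;
    }
    return k;
}
```

Each release boundary that is already known good or bad shrinks n, which is why narrowing the range to 4.0..4.1 first saves several rounds.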

Another option might be to try to eliminate igb from the picture as
well, and try reading from another device from the same (or, perhaps
nearest) bus segment, and see if you see the same results.

   Julia


Re: [PATCH v2] r8169: set coherent DMA mask as well as streaming DMA mask

2016-10-14 Thread Francois Romieu
Ard Biesheuvel  :
> PCI devices that are 64-bit DMA capable should set the coherent
> DMA mask as well as the streaming DMA mask. On some architectures,
> these are managed separately, and so the coherent DMA mask will be
> left at its default value of 32 if it is not set explicitly. This
> results in errors such as
> 
>  r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>  hwdev DMA mask = 0x, dev_addr = 0x0080fbfff000
>  swiotlb: coherent allocation failed for device :02:00.0 size=4096
>  CPU: 0 PID: 1062 Comm: systemd-udevd Not tainted 4.8.0+ #35
>  Hardware name: AMD Seattle/Seattle, BIOS 10:53:24 Oct 13 2016
> 
> on systems without memory that is 32-bit addressable by PCI devices.
> 
> Signed-off-by: Ard Biesheuvel 

Acked-by: Francois Romieu 

Unless someone plans to plug an acenic, a 83820 (pci-e gem board, anyone ?)
on top of a pci <-> pci-e adapter on this kind of motherboard, no other
network driver that uses the pci_... dma api exhibits this mixed 32 / 64 bit
support bug. I haven't checked devices with 32 < mask < 64 nor plain DMA api
converted ones.
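
To illustrate the failure in the quoted log: an allocation is only usable if the whole buffer lies below the device's DMA mask. The sketch below is userspace arithmetic only. DMA_BIT_MASK is written in the same shape as the kernel macro of that name; dma_addr_fits is a hypothetical helper for this example and not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(n): n low bits set. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: does the buffer [addr, addr + size) lie entirely
 * within the range a device with this mask can address? */
static int dma_addr_fits(uint64_t addr, uint64_t size, uint64_t mask)
{
    uint64_t last;

    assert(size > 0);
    last = addr + (size - 1);

    /* last < addr means the range wrapped around 64 bits. */
    return last >= addr && last <= mask;
}
```

With the dev_addr from the log, 0x0080fbfff000, and size 4096, the check fails for a 32-bit mask and passes for a 64-bit one, which matches the description of a coherent mask left at its 32-bit default.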

-- 
Ueimor


[PATCH v2] vmxnet3: avoid assumption about invalid dma_pa in vmxnet3_set_mc()

2016-10-14 Thread Alexey Khoroshilov
vmxnet3_set_mc() checks new_table_pa returned by dma_map_single()
with dma_mapping_error(), but it still assumes that zero is an invalid pa
(it assumes dma_mapping_error(..., 0) returns true if new_table is NULL).

The patch adds an explicit variable to track status of new_table_pa.

Found by Linux Driver Verification project (linuxtesting.org).

v2: use "bool" and "true"/"false" for boolean variables.
Signed-off-by: Alexey Khoroshilov 
---
 drivers/net/vmxnet3/vmxnet3_drv.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index b5554f2ebee4..ef83ae3b0a44 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -2279,6 +2279,7 @@ vmxnet3_set_mc(struct net_device *netdev)
>shared->devRead.rxFilterConf;
u8 *new_table = NULL;
dma_addr_t new_table_pa = 0;
+   bool new_table_pa_valid = false;
u32 new_mode = VMXNET3_RXM_UCAST;
 
if (netdev->flags & IFF_PROMISC) {
@@ -2307,13 +2308,15 @@ vmxnet3_set_mc(struct net_device *netdev)
new_table,
sz,
PCI_DMA_TODEVICE);
+   if (!dma_mapping_error(>pdev->dev,
+  new_table_pa)) {
+   new_mode |= VMXNET3_RXM_MCAST;
+   new_table_pa_valid = true;
+   rxConf->mfTablePA = cpu_to_le64(
+   new_table_pa);
+   }
}
-
-   if (!dma_mapping_error(>pdev->dev,
-  new_table_pa)) {
-   new_mode |= VMXNET3_RXM_MCAST;
-   rxConf->mfTablePA = cpu_to_le64(new_table_pa);
-   } else {
+   if (!new_table_pa_valid) {
netdev_info(netdev,
"failed to copy mcast list, setting ALL_MULTI\n");
new_mode |= VMXNET3_RXM_ALL_MULTI;
@@ -2338,7 +2341,7 @@ vmxnet3_set_mc(struct net_device *netdev)
   VMXNET3_CMD_UPDATE_MAC_FILTERS);
spin_unlock_irqrestore(>cmd_lock, flags);
 
-   if (new_table_pa)
+   if (new_table_pa_valid)
dma_unmap_single(>pdev->dev, new_table_pa,
 rxConf->mfTableLen, PCI_DMA_TODEVICE);
kfree(new_table);
-- 
2.7.4
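
For readers who want the pattern in isolation: the patch replaces "treat address 0 as invalid" with an explicit validity flag. The userspace sketch below shows why that matters when 0 can be a legitimate bus address. Everything in it (fake_map, fake_mapping_error, FAKE_MAPPING_ERROR, struct table_state) is a hypothetical stand-in for illustration and not the kernel DMA API or the vmxnet3 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error cookie; note that it is not zero. */
#define FAKE_MAPPING_ERROR UINT64_MAX

/* Hypothetical mapper: returns bus_base for a real buffer, so a bus
 * address of 0 is a perfectly valid result. */
static uint64_t fake_map(const void *buf, uint64_t bus_base)
{
    assert(bus_base != FAKE_MAPPING_ERROR);
    return buf ? bus_base : FAKE_MAPPING_ERROR;
}

static bool fake_mapping_error(uint64_t pa)
{
    return pa == FAKE_MAPPING_ERROR;
}

/* The address plus an explicit flag saying whether it may be used. */
struct table_state {
    uint64_t pa;
    bool pa_valid;
};

/* Map only when there is a table, and record success explicitly instead
 * of inferring it later from pa != 0. */
static struct table_state map_table(const void *table, uint64_t bus_base)
{
    struct table_state st = { .pa = 0, .pa_valid = false };

    if (table) {
        uint64_t pa = fake_map(table, bus_base);

        if (!fake_mapping_error(pa)) {
            st.pa = pa;
            st.pa_valid = true;
        }
    }
    return st;
}
```

A later cleanup step that tests pa_valid will unmap a buffer mapped at address 0, whereas a test on the address itself would skip it and leak the mapping.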



Re: pull-request: wireless-drivers 2016-10-14

2016-10-14 Thread David Miller
From: Kalle Valo 
Date: Fri, 14 Oct 2016 10:18:42 +0300

> first wireless-drivers pull request for 4.9 and this time we have
> unusually many fixes even before -rc1 is released. Most important here
> are the wlcore and rtlwifi commits which fix critical regressions,
> otherwise smaller impact fixes and one new sdio id for ath6kl.
> 
> Please let me know if there are any problems.

Pulled, thanks Kalle.


Re: Kernel 4.6.7-rt13: Intel Ethernet driver igb causes huge latencies in cyclictest

2016-10-14 Thread Richard Cochran
On Fri, Oct 14, 2016 at 08:58:22AM +, Koehrer Mathias (ETAS/ESW5) wrote:
> @@ -753,7 +756,9 @@ u32 igb_rd32(struct e1000_hw *hw, u32 re
>   if (E1000_REMOVED(hw_addr))
>   return ~value;
>  
> +trace_igb(801);
>   value = readl(_addr[reg]);
> +trace_igb(802);

Nothing prevents this code from being preempted between the two trace
points, and so you can't be sure whether the time delta in the trace
is caused by the PCIe read stalling or not.

Thanks,
Richard




RE: [PATCH NET] ethtool: silence warning on bit loss

2016-10-14 Thread David Laight
From: Jesse Brandeburg
> Sent: 14 October 2016 00:14
> Sparse was complaining when we went to prototype some code
> using ethtool_cmd_speed_set and SPEED_10, which uses
> the upper 16 bits of __u32 speed for the first time.
...
> Reported-by: Preethi Banala 
> Signed-off-by: Jesse Brandeburg 
> ---
>  include/uapi/linux/ethtool.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
> index 099a420..8e54723 100644
> --- a/include/uapi/linux/ethtool.h
> +++ b/include/uapi/linux/ethtool.h
> @@ -119,8 +119,7 @@ struct ethtool_cmd {
>  static inline void ethtool_cmd_speed_set(struct ethtool_cmd *ep,
>__u32 speed)
>  {
> -
> - ep->speed = (__u16)speed;
> + ep->speed = (__u16)(speed & 0x);
>   ep->speed_hi = (__u16)(speed >> 16);

I suspect that deleting both (__u16) casts also fixes it?

David



Re: [PATCH net 2/2] conntrack: enable to tune gc parameters

2016-10-14 Thread Pablo Neira Ayuso
On Fri, Oct 14, 2016 at 12:37:26PM +0200, Florian Westphal wrote:
> Nicolas Dichtel  wrote:
> > Le 13/10/2016 à 22:43, Florian Westphal a écrit :
[...]
> > > (Or cause too many useless scans)
> > > 
> > > Another idea worth trying might be to get rid of the max cap and
> > > instead break early in case too many jiffies expired.
> > > 
> > > I don't want to add sysctl knobs for this unless absolutely needed; it's
> > > already possible to 'force' an eviction cycle by running 'conntrack -L'.
> > > 
> > Sure, but this is not a "real" solution, just a workaround.
> > We need to find a way to deliver conntrack deletion events in a reasonable
> > delay, whatever the traffic on the machine is.
> 
> Agree, but that depends on what 'reasonable' means and what kind of
> unneeded cpu churn we're willing to add.
> 
> We can add a sysctl for this but we should use a low default to not do
> too much unneeded work.
> 
> So what about your original patch, but only add
> 
> nf_conntrack_gc_interval
> 
> (and also add instant-resched in case entire budget was consumed)?

I would prefer not to expose sysctl knobs; if we don't really know
what good default values are, then we cannot expect our users to
know this for us.

I would tune this so that it resembles the previous behaviour.
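
A sketch of the two ideas floated in the quoted text, breaking out of the scan once a time budget is used up and rescheduling immediately when that happens, might look like the following. This is a userspace illustration under stated assumptions: gc_scan, struct gc_result, tick_fn and fake_clock are hypothetical names, the tick source stands in for jiffies, and none of it is the nf_conntrack implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where to resume, and whether the caller should run again right away. */
struct gc_result {
    size_t next_bucket;
    bool resched_now;
};

/* Hypothetical tick source standing in for jiffies. */
typedef unsigned long (*tick_fn)(void *ctx);

/* Walk the table once starting at 'start', but stop early when
 * budget_ticks have elapsed. If the budget ran out before the walk was
 * complete, ask for an immediate reschedule instead of a normal delay. */
static struct gc_result gc_scan(size_t start, size_t nbuckets,
                                unsigned long budget_ticks,
                                tick_fn now, void *ctx)
{
    struct gc_result r = { start, false };
    unsigned long t0;
    size_t i = start;
    size_t scanned = 0;

    assert(nbuckets > 0 && start < nbuckets && now != NULL);
    t0 = now(ctx);

    while (scanned < nbuckets) {
        /* ... evict expired entries of bucket i here ... */
        i = (i + 1) % nbuckets;
        scanned++;

        if (now(ctx) - t0 >= budget_ticks) {
            r.resched_now = scanned < nbuckets;
            break;
        }
    }
    r.next_bucket = i;
    return r;
}

/* Test double: a clock that advances by one tick per reading. */
static unsigned long fake_clock(void *ctx)
{
    return (*(unsigned long *)ctx)++;
}
```

The design choice here is that the bound is expressed in time rather than in entries, so no tunable for the scan size is needed; whether that matches the previous behaviour closely enough is exactly the open question in this thread.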


Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Johannes Berg
On Thu, 2016-10-13 at 14:49 -0700, Andy Lutomirski wrote:
> 
> It's failing before that.  With CONFIG_VMAP_STACK=y, the stack may
> not be physically contiguous and can't be used for DMA, so putting it
> in a scatterlist is bogus in general, and the crypto code mostly
> wants a scatterlist.

I see, so all this stuff is getting inlined, and we crash in
sg_set_buf() because it does sg_set_page() and that obviously needs to
do virt_to_page(), which is invalid on this address now.
With CONFIG_DEBUG_SG we'd have hit the BUG_ON there instead.

It does indeed look like AEAD doesn't have any non-SG API.

So ultimately, the bug already goes back to Ard's commit 7ec7c4a9a686
("mac80211: port CCMP to cryptoapi's CCM driver") since that already
potentially used stack space for DMA.

Since we don't have any space in the SKB or anywhere else at this point
(other than the stack that we can't use), I see two ways out of this:

   1. revert that patch (doing so would need some major adjustments now,
  since it's pretty old and a number of new things were added in the
  meantime)
   2. allocate a per-CPU buffer for all the things that we put on the
  stack and use in SG lists, those are:
   * CCM/GCM: AAD (32B), B_0/J_0 (16B)
   * GMAC: AAD (20B), zero (16B)
   * (not sure why CMAC isn't using this API, but it would be like
  GMAC)

Thoughts?

johannes


RE: Kernel 4.6.7-rt13: Intel Ethernet driver igb causes huge latencies in cyclictest

2016-10-14 Thread Koehrer Mathias (ETAS/ESW5)
Hi Julia,
> Have you tested on a vanilla (non-RT) kernel?  I doubt there is anything
> RT specific about what you are seeing, but it might be nice to get
> confirmation.  Also, bisection would probably be easier if you confirm
> on a vanilla kernel.
> 
> I find it unlikely that it's a kernel config option that changed which
> regressed you, but instead was a code change to a driver.  Which driver
> is now the question, and the surface area is still big (processor
> mapping attributes for this region, PCI root complex configuration, PCI
> bridge configuration, igb driver itself, etc.).
> 
> Big enough that I'd recommend a bisection.  It looks like a bisection
> between 3.18 and 4.8 would take you about 18 tries to narrow down,
> assuming all goes well.
> 

I have now repeated my tests using the vanilla kernel.
There I got the very same issue.
Using kernel 4.0 is fine, however starting with kernel 4.1, the issue appears.


Here is my exact (reproducible) test description:
I applied the following patch to the kernel to get the igb trace.
This patch instruments the igb_rd32() function to measure the call
to readl() which is used to access registers of the igb NIC.


++ BEGIN PATCH 

Index: linux-4.8/drivers/net/ethernet/intel/igb/trace.h
===
--- /dev/null
+++ linux-4.8/drivers/net/ethernet/intel/igb/trace.h
@@ -0,0 +1,34 @@
+
+#if !defined(_TRACE_IGB_H_) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_IGB_H_ 
+
+#include 
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM igb
+
+
+#define _TRACE_H_
+
+
+TRACE_EVENT(igb,
+TP_PROTO(u32 val),
+TP_ARGS(val),
+TP_STRUCT__entry(
+__field(u32, val)
+),
+TP_fast_assign(
+__entry->val = val;
+),
+TP_printk("val: %u",
+   __entry->val)
+);
+
+
+#endif
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH drivers/net/ethernet/intel/igb 
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE trace
+
+#include 
Index: linux-4.8/drivers/net/ethernet/intel/igb/Makefile
===
--- linux-4.8.orig/drivers/net/ethernet/intel/igb/Makefile
+++ linux-4.8/drivers/net/ethernet/intel/igb/Makefile
@@ -28,6 +28,7 @@
 #
 # Makefile for the Intel(R) 82575 PCI-Express ethernet driver
 #
+ccflags-y += -I.
 
 obj-$(CONFIG_IGB) += igb.o
 
Index: linux-4.8/drivers/net/ethernet/intel/igb/igb_main.c
===
--- linux-4.8.orig/drivers/net/ethernet/intel/igb/igb_main.c
+++ linux-4.8/drivers/net/ethernet/intel/igb/igb_main.c
@@ -57,6 +57,9 @@
 #include 
 #include "igb.h"
 
+#define CREATE_TRACE_POINTS
+#include "trace.h"
+
 #define MAJ 5
 #define MIN 3
 #define BUILD 0
@@ -753,7 +756,9 @@ u32 igb_rd32(struct e1000_hw *hw, u32 re
if (E1000_REMOVED(hw_addr))
return ~value;
 
+trace_igb(801);
value = readl(_addr[reg]);
+trace_igb(802);
 
/* reads should not return all F's */
if (!(~value) && (!reg || !(~readl(hw_addr {


++ END PATCH 


I built the kernel with this patch applied, rebooted the PC to run this
kernel, and used the following script for my test.

++ BEGIN SCRIPT  +
#!/bin/bash

for f in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor ; do
if [ -w $f ]; then
echo "performance" > $f
fi
done

if true; then
rmmod igb
modprobe igb
ethtool -L eth2 combined 1
ifconfig eth2 up 192.168.100.111
fi

ifconfig

mount /sys/kernel/debug

( cd /sys/kernel/debug/tracing
  echo 0 > tracing_on
  echo 0 > events/enable
  echo 1 > events/igb/enable
  echo "print-parent" > trace_options
  echo "latency-format" > trace_options
  echo 1 > tracing_on

  sleep 4
  cat trace
)
++ END SCRIPT  +

The results of this for kernel 4.0:
[...]
kworker/-12393...1 49699046us : igb: val: 801
kworker/-12393...1 49699047us : igb: val: 802
kworker/-12393...1 49699047us : igb: val: 801
kworker/-12393...1 49699048us+: igb: val: 802
kworker/-12393...1 49699099us : igb: val: 801
kworker/-12393...1 49699100us : igb: val: 802
kworker/-12393...1 49699100us : igb: val: 801
kworker/-12393...1 49699102us : igb: val: 802
kworker/-12393...1 49699102us : igb: val: 801
kworker/-12393...1 49699103us : igb: val: 802
kworker/-12393...1 49699103us : igb: val: 801
kworker/-12393...1 49699104us : igb: val: 802
kworker/-12393...1 49699104us : igb: val: 801
kworker/-12393...1 49699105us : igb: val: 802
kworker/-12393...1 49699105us : igb: val: 801
kworker/-12393...1 49699107us : igb: val: 802
kworker/-12393...1 49699107us : igb: val: 801
kworker/-12393...1 49699108us : igb: val: 

Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Johannes Berg
On Fri, 2016-10-14 at 10:05 +0100, Ard Biesheuvel wrote:
> 
> Indeed. And the decrypt path does the same for auth_tag[].

Hadn't gotten that far, due to the BUG_ON() in CONFIG_DEBUG_SG in the
encrypt path :)

> But that still means there are two separate problems here, one which
> affects the WPA code, and one that only affects the generic CCM
> chaining mode (but not the accelerated arm64 implementation)

Yes. The generic CCM chaining still doesn't typically have a request on
the stack though. In fact, ESP (net/ipv4/esp4.c) for example will do
temporary allocations with kmalloc for every frame, it seems.

> Unsurprisingly, I would strongly prefer those to be fixed properly
> rather than backing out my patch, but I'm happy to help out whichever
> solution we reach consensus on.

Yeah, obviously, it would be good to use the accelerated versions after
all.

> I will check whether this removes the issue when not using
> crypto/ccm.ko

Ok. I think we can probably live with having those 48 bytes in per-CPU
buffers, but I suppose we don't really want to have ~500.

johannes


Re: [PATCH net 2/5] net/ncsi: Split out logic for ncsi_dev_state_suspend_select

2016-10-14 Thread Gavin Shan
On Fri, Oct 14, 2016 at 04:32:22PM +1030, Joel Stanley wrote:
>Hi Gavin,
>
>On Fri, Oct 14, 2016 at 1:23 PM, Gavin Shan  wrote:
>> This splits out the code that handles ncsi_dev_state_suspend_select
>> so that we can add more code to the handler in a subsequent patch.
>> Apart from adding an error tag to reuse the code in the error path,
>> no logical changes are introduced.
>>
>> Signed-off-by: Gavin Shan 
>> ---
>>  net/ncsi/ncsi-manage.c | 38 +-
>>  1 file changed, 25 insertions(+), 13 deletions(-)
>>
>> diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
>> index 1bc96dc..5758a26 100644
>> --- a/net/ncsi/ncsi-manage.c
>> +++ b/net/ncsi/ncsi-manage.c
>> @@ -540,21 +540,30 @@ static void ncsi_suspend_channel(struct ncsi_dev_priv *ndp)
>> nd->state = ncsi_dev_state_suspend_select;
>> /* Fall through */
>> case ncsi_dev_state_suspend_select:
>> +   ndp->pending_req_num = 1;
>> +
>> +   nca.type = NCSI_PKT_CMD_SP;
>> +   nca.package = np->id;
>> +   nca.channel = NCSI_RESERVED_CHANNEL;
>> +   if (ndp->flags & NCSI_DEV_HWA)
>> +   nca.bytes[0] = 0;
>> +   else
>> +   nca.bytes[0] = 1;
>> +
>> +   nd->state = ncsi_dev_state_suspend_dcnt;
>> +
>> +   ret = ncsi_xmit_cmd();
>> +   if (ret)
>> +   goto error;
>> +
>> +   break;
>> case ncsi_dev_state_suspend_dcnt:
>> case ncsi_dev_state_suspend_dc:
>> case ncsi_dev_state_suspend_deselect:
>> ndp->pending_req_num = 1;
>>
>> nca.package = np->id;
>> -   if (nd->state == ncsi_dev_state_suspend_select) {
>> -   nca.type = NCSI_PKT_CMD_SP;
>> -   nca.channel = NCSI_RESERVED_CHANNEL;
>> -   if (ndp->flags & NCSI_DEV_HWA)
>> -   nca.bytes[0] = 0;
>> -   else
>> -   nca.bytes[0] = 1;
>> -   nd->state = ncsi_dev_state_suspend_dcnt;
>> -   } else if (nd->state == ncsi_dev_state_suspend_dcnt) {
>> +   if (nd->state == ncsi_dev_state_suspend_dcnt) {
>> nca.type = NCSI_PKT_CMD_DCNT;
>> nca.channel = nc->id;
>> nd->state = ncsi_dev_state_suspend_dc;
>
>This is a messy switch statement. How about breaking out all of the
>states as you've done with suspend_select, instead of grouping them
>and then doing if ... else if ... else if? I realise there might be one
>or two lines duplicated for each state, but I think that's an
>acceptable price for readability.
>
>Also, patch 1 could also be merged into this when making this cleanup.
>
>What do you think?
>

Thanks, Joel. I agree with you that code readability is more important
than avoiding duplicated code. I will do that in the next revision.

Thanks,
Gavin

>> @@ -570,10 +579,8 @@ static void ncsi_suspend_channel(struct ncsi_dev_priv *ndp)
>> }
>>
>> ret = ncsi_xmit_cmd();
>> -   if (ret) {
>> -   nd->state = ncsi_dev_state_functional;
>> -   return;
>> -   }
>> +   if (ret)
>> +   goto error;
>>
>> break;
>> case ncsi_dev_state_suspend_done:
>> @@ -587,6 +594,11 @@ static void ncsi_suspend_channel(struct ncsi_dev_priv *ndp)
>> netdev_warn(nd->dev, "Wrong NCSI state 0x%x in suspend\n",
>> nd->state);
>> }
>> +
>> +   return;
>> +
>> +error:
>> +   nd->state = ncsi_dev_state_functional;
>>  }
>>
>>  static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
>> --
>> 2.1.0
>>
>
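
One possible shape for the "one case per state" layout discussed above is sketched below as plain userspace C. It is only an illustration: the enum values, xmit_fn and suspend_step are hypothetical names, the select and dcnt transitions follow the quoted hunk, and the later transitions (dc, deselect, done) are assumptions made for the example rather than the NCSI code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states and commands, loosely modelled on the quoted hunk. */
enum suspend_state {
    ST_SELECT, ST_DCNT, ST_DC, ST_DESELECT, ST_DONE, ST_FUNCTIONAL
};
enum cmd_type {
    CMD_SELECT_PKG, CMD_DISABLE_NET, CMD_DISABLE_CHAN, CMD_DESELECT_PKG
};

/* Hypothetical transmit hook: returns 0 on success, non-zero on error. */
typedef int (*xmit_fn)(enum cmd_type cmd);

/* Advance the suspend sequence by one step. Each state has its own case
 * naming its command and successor; the transmit call and the common
 * error handling live in one place after the switch. */
static enum suspend_state suspend_step(enum suspend_state st, xmit_fn xmit)
{
    enum cmd_type cmd;
    enum suspend_state next;

    assert(xmit != NULL);

    switch (st) {
    case ST_SELECT:
        cmd = CMD_SELECT_PKG;
        next = ST_DCNT;
        break;
    case ST_DCNT:
        cmd = CMD_DISABLE_NET;
        next = ST_DC;
        break;
    case ST_DC:                 /* assumed transition */
        cmd = CMD_DISABLE_CHAN;
        next = ST_DESELECT;
        break;
    case ST_DESELECT:           /* assumed transition */
        cmd = CMD_DESELECT_PKG;
        next = ST_DONE;
        break;
    default:
        return st;              /* nothing to send in this state */
    }

    if (xmit(cmd))
        return ST_FUNCTIONAL;   /* shared error path */
    return next;
}

/* Test doubles for the transmit hook. */
static int ok_xmit(enum cmd_type cmd)  { (void)cmd; return 0; }
static int bad_xmit(enum cmd_type cmd) { (void)cmd; return -1; }
```

The duplication per state is limited to two assignments, and the error path stays in a single spot, which is the same effect the "goto error" label has in the patch.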



Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Johannes Berg
On Fri, 2016-10-14 at 09:41 +0100, Ard Biesheuvel wrote:

> > I assume the stack buffer itself is not the problem here, but aad,
> > which is allocated on the stack one frame up.
> > Do we really need to revert the whole patch to fix that?
> 
> Ah never mind, this is about 'odata'. Apologies, should have read
> first

Right, odata also goes into an sg list and further on.

I think we should wait for Herbert to chime in before we do any further
work though, perhaps he has any better ideas.

johannes


Re: [PATCH net-next 1/2] lwtunnel: Add destroy state operation

2016-10-14 Thread Jiri Benc
On Thu, 13 Oct 2016 17:57:42 -0700, Tom Herbert wrote:
> @@ -43,13 +44,11 @@ struct lwtunnel_encap_ops {
>   int (*get_encap_size)(struct lwtunnel_state *lwtstate);
>   int (*cmp_encap)(struct lwtunnel_state *a, struct lwtunnel_state *b);
>   int (*xmit)(struct sk_buff *skb);
> + void (*destroy_state)(struct lwtunnel_state *lws);
>  };

Could you add destroy_state next to build_state? Seems weird to have
those two scattered at the opposite ends of the structure. Looks good
otherwise.

Thanks,

 Jiri


Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Johannes Berg

>    1. revert that patch (doing so would need some major adjustments now,
>   since it's pretty old and a number of new things were added in the
>   meantime)

This it will have to be, I guess.

>    2. allocate a per-CPU buffer for all the things that we put on the
>   stack and use in SG lists, those are:
>    * CCM/GCM: AAD (32B), B_0/J_0 (16B)
>    * GMAC: AAD (20B), zero (16B)
>    * (not sure why CMAC isn't using this API, but it would be like GMAC)

This doesn't work - I tried to move the mac80211 buffers, but because
we also put the struct aead_request on the stack, and crypto_ccm has
the "odata" in there, and we can't separate the odata from that struct,
we'd have to also put that into a per-CPU buffer, but it's very big -
456 bytes for CCM, didn't measure the others but I'd expect them to be
larger, if different.

I don't think we can allocate half a kb for each CPU just to be able to
possibly use the acceleration here. We can't even make that conditional
on not having hardware crypto in the wifi NIC because drivers are
always allowed to pass undecrypted frames, regardless of whether or not
HW crypto was attempted, so we don't know upfront if we'll have to
decrypt anything in software...
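
To put numbers on the per-CPU cost: the figures in this thread are 32 bytes of AAD, 16 bytes of B_0 and 456 bytes for the CCM request, which add up to 504 bytes, i.e. the "half a kb" above. The struct below is only an accounting sketch with made-up names (ccm_percpu_scratch, scratch_total); the req member is an opaque stand-in sized from the 456-byte measurement and not the real struct aead_request layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    CCM_AAD_LEN = 32,   /* AAD size from the list earlier in the thread */
    CCM_B0_LEN  = 16,   /* B_0 size from the same list */
    CCM_REQ_LEN = 456   /* request size as measured in the message above */
};

/* Hypothetical per-CPU scratch area holding everything that currently
 * lives on the stack and ends up in a scatterlist. */
struct ccm_percpu_scratch {
    uint8_t aad[CCM_AAD_LEN];
    uint8_t b_0[CCM_B0_LEN];
    uint8_t req[CCM_REQ_LEN];   /* opaque stand-in for the request */
};

/* Total memory such a scheme would pin for ncpus CPUs. */
static size_t scratch_total(unsigned ncpus)
{
    assert(ncpus > 0);
    return (size_t)ncpus * sizeof(struct ccm_percpu_scratch);
}
```

The first 48 bytes are the part considered acceptable earlier in the thread; the request is what pushes the total to about 500 bytes per CPU.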

Given that, I think we have had a bug in here basically since Ard's
patch, we never should've put these structs on the stack. Herbert, you
also touched this later and converted the API usage, did you see the
way the stack is used here and think it should be OK, or did you simply
not realize that?

Ard, are you able to help out working on a revert of your patch? That
would require also reverting a number of other patches (various fixes,
API adjustments, etc. to the AEAD usage), but the more complicated part
is that in the meantime Jouni introduced GCMP and CCMP-256, both of
which we of course need to retain.

johannes


hello

2016-10-14 Thread maowenan
I want to subscribe to this mailing list, thank you very much.



Hello

2016-10-14 Thread yuehaibing
subscribe linux-kernel



Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Ard Biesheuvel
On 14 October 2016 at 10:10, Johannes Berg  wrote:
> On Fri, 2016-10-14 at 10:05 +0100, Ard Biesheuvel wrote:
>>
>> Indeed. And the decrypt path does the same for auth_tag[].
>
> Hadn't gotten that far, due to the BUG_ON() in CONFIG_DEBUG_SG in the
> encrypt path :)
>
>> But that still means there are two separate problems here, one which
>> affects the WPA code, and one that only affects the generic CCM
>> chaining mode (but not the accelerated arm64 implementation)
>
> Yes. The generic CCM chaining still doesn't typically have a request on
> the stack though. In fact, ESP (net/ipv4/esp4.c) for example will do
> temporary allocations with kmalloc for every frame, it seems.
>

It is annotated with a TODO, though :-)

38320c70d282b (Herbert Xu   2008-01-28 19:35:05 -0800  41)
 * TODO: Use spare space in skb for this where possible.

>> Unsurprisingly, I would strongly prefer those to be fixed properly
>> rather than backing out my patch, but I'm happy to help out whichever
>> solution we reach consensus on.
>
> Yeah, obviously, it would be good to use the accelerated versions after
> all.
>
>> I will check whether this removes the issue when not using
>> crypto/ccm.ko
>
> Ok. I think we can probably live with having those 48 bytes in per-CPU
> buffers, but I suppose we don't really want to have ~500.
>

Agreed.


Re: [PATCH v3] IB/ipoib: move back IB LL address into the hard header

2016-10-14 Thread Or Gerlitz
On Thu, Oct 13, 2016 at 7:26 PM, Paolo Abeni  wrote:
> After the commit 9207f9d45b0a ("net: preserve IP control block
> during GSO segmentation"), the GSO CB and the IPoIB CB conflict.

the commit --> commit (remove the word "the" to make the sentence a
bit more clear)

> That destroy the IPoIB address information cached there,
> causing a severe performance regression, as better described here:

> http://marc.info/?l=linux-kernel=146787279825501=2

I don't think links into this archive last for long; try to find
something better. Best if you can provide quick wording telling what is
broken (e.g. HW LSO).

> This change moves the data cached by the IPoIB driver from the
> skb control lock into the IPoIB hard header, as done before

lock --> block ?

> the commit 936d7de3d736 ("IPoIB: Stop lying about hard_header_len

the commit --> commit

> and use skb->cb to stash LL addresses").
> In order to avoid GRO issue, on packet reception, the IPoIB driver
> stash into the skb a dummy pseudo header, so that the received
> packets have actually a hard header matching the declared length.
> To avoid changing the connected mode maximum mtu, the allocated
> head buffer size is increased by the pseudo header length.

> After this commit, IPoIB performances are back to pre-regression
> value.
>
> v2 -> v3: rebased
> v1 -> v2: avoid changing the max mtu, increasing the head buf size

> Fixes: 9207f9d45b0a ("net: preserve IP control block during GSO segmentation")
> Signed-off-by: Paolo Abeni 

Paolo,

Is this fix backportable to any kernel since the breakage? AFAIR,
Roland mentioned that a 2nd change introduced in 4.7-rc1 changed things
a bit more, such that the fix he had in his head didn't apply any more.

Dave, Doug

I am still travelling after netdev and would like to take a look at the
patch and also see that someone @mellanox.com provides a Tested-by.
Considering the fact that the bug is soon to (de-)celebrate its 1st
anniversary and as we're still not in rc1, the patch has enough time to
get into 4.9... can you let it sit here for another week or so?

Or


[PATCH] brcmfmac: print name of connect status event

2016-10-14 Thread Rafał Miłecki
From: Rafał Miłecki 

This simplifies debugging. The format "%s (%u)" matches a similar
debugging message in brcmf_fweh_event_worker.

Signed-off-by: Rafał Miłecki 
---
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c | 3 ++-
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c | 4 ++--
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h | 2 ++
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
index b777e1b..1e7c6f0 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
@@ -5506,7 +5506,8 @@ brcmf_notify_connect_status_ap(struct brcmf_cfg80211_info *cfg,
u32 reason = e->reason;
struct station_info sinfo;
 
-   brcmf_dbg(CONN, "event %d, reason %d\n", event, reason);
+   brcmf_dbg(CONN, "event %s (%u), reason %d\n",
+ brcmf_fweh_event_name(event), event, reason);
if (event == BRCMF_E_LINK && reason == BRCMF_E_REASON_LINK_BSSCFG_DIS &&
ndev != cfg_to_ndev(cfg)) {
brcmf_dbg(CONN, "AP mode link down\n");
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c
index 79c081f..c79306b 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c
@@ -69,7 +69,7 @@ static struct brcmf_fweh_event_name fweh_event_names[] = {
  *
  * @code: code to lookup.
  */
-static const char *brcmf_fweh_event_name(enum brcmf_fweh_event_code code)
+const char *brcmf_fweh_event_name(enum brcmf_fweh_event_code code)
 {
int i;
for (i = 0; i < ARRAY_SIZE(fweh_event_names); i++) {
@@ -79,7 +79,7 @@ static const char *brcmf_fweh_event_name(enum brcmf_fweh_event_code code)
return "unknown";
 }
 #else
-static const char *brcmf_fweh_event_name(enum brcmf_fweh_event_code code)
+const char *brcmf_fweh_event_name(enum brcmf_fweh_event_code code)
 {
return "nodebug";
 }
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h
index 26ff5a9..5fba4b4 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h
@@ -287,6 +287,8 @@ struct brcmf_fweh_info {
 void *data);
 };
 
+const char *brcmf_fweh_event_name(enum brcmf_fweh_event_code code);
+
 void brcmf_fweh_attach(struct brcmf_pub *drvr);
 void brcmf_fweh_detach(struct brcmf_pub *drvr);
 int brcmf_fweh_register(struct brcmf_pub *drvr, enum brcmf_fweh_event_code code,
-- 
2.9.3
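
For readers following along, here is a minimal userspace sketch of the
lookup-with-fallback pattern the patch exports, plus the "%s (%u)" message
format. All demo_* names and the two table entries are hypothetical
stand-ins, not the driver's real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical event codes; the driver's real list lives in fweh.h. */
enum demo_event_code {
	DEMO_E_SET_SSID = 0,
	DEMO_E_LINK = 16,
};

struct demo_event_name {
	enum demo_event_code code;
	const char *name;
};

static const struct demo_event_name demo_event_names[] = {
	{ DEMO_E_SET_SSID, "SET_SSID" },
	{ DEMO_E_LINK, "LINK" },
};

/* Linear table lookup that falls back to "unknown" for unlisted codes. */
const char *demo_event_name(enum demo_event_code code)
{
	size_t i;

	for (i = 0; i < sizeof(demo_event_names) / sizeof(demo_event_names[0]); i++) {
		if (demo_event_names[i].code == code)
			return demo_event_names[i].name;
	}
	return "unknown";
}

/* Build a message in the "event %s (%u), reason ..." style. */
int demo_format_status(char *buf, size_t len,
		       enum demo_event_code event, unsigned int reason)
{
	return snprintf(buf, len, "event %s (%u), reason %u",
			demo_event_name(event), (unsigned int)event, reason);
}
```

Printing both the name and the number keeps the log greppable by either,
which is presumably why the same format is reused here.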



Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Sergey Senozhatsky
On (10/13/16 14:49), Andy Lutomirski wrote:
[..]
> > >  FAIL: 412cba02 > c900802cba02 || 1 -> (412cba02
> > > >> 39) == 130
> >
> > Yeah, we already know that in this function the aad variable is on the
> > stack, it explicitly is.
> >
> > The question, though, is why precisely that fails in the crypto code.
> > Can you send the Oops report itself?
> >
> 
> It's failing before that.  With CONFIG_VMAP_STACK=y, the stack may not
> be physically contiguous and can't be used for DMA, so putting it in a
> scatterlist is bogus in general, and the crypto code mostly wants a
> scatterlist.
> 
> There are a couple (faster!) APIs for crypto that don't use
> scatterlists, but I don't think AEAD works with them.

given that we have a known issue shouldn't VMAP_STACK be
disabled for now, or would you rather prefer to mark MAC80211
as incompatible: "depends on CFG80211 && !VMAP_STACK"?

-ss


Re: [PATCH net-next 2/2] ila: Cache a route to translated address

2016-10-14 Thread Jiri Benc
On Thu, 13 Oct 2016 23:22:14 -0700, Roopa Prabhu wrote:
> This removes the last and only user of lwt orig_output. we can drop it
> subsequently. But since orig_input is still in use, probably better to keep it
> around for symmetry and for other uses in the future.

If it's no longer used, let's remove it. It can always be added back
later if needed. We don't keep things just because they might be
useful for something in the future.

Thanks,

 Jiri


Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Johannes Berg

> So why is the performance hit acceptable for ESP but not for WPA? We
> could easily implement the same thing, i.e.,
> kmalloc(GFP_ATOMIC)/kfree the aead_req struct rather than allocate it
> on the stack

Yeah, maybe we should. It's likely a much bigger allocation, but I
don't actually know if that affects speed.

In most cases where you want high performance we never hit this anyway
since we'll have hardware crypto. I know for our (Intel's) devices we
normally never hit these code paths.

But on the other hand, you also did your changes for a reason, and the
only reason I can see for that is performance. So you'd be the one with
most "skin in the game", I guess?

johannes


pull-request: wireless-drivers 2016-10-14

2016-10-14 Thread Kalle Valo
Hi Dave,

first wireless-drivers pull request for 4.9 and this time we have
unusually many fixes even before -rc1 is released. Most important here
are the wlcore and rtlwifi commits which fix critical regressions,
otherwise smaller impact fixes and one new sdio id for ath6kl.

Please let me know if there are any problems.

Kalle

The following changes since commit 03a1eabc3f54469abd4f1784182851b2e29630cc:

  Merge branch 'mlxsw-fixes' (2016-10-04 20:28:10 -0400)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git tags/wireless-drivers-for-davem-2016-10-14

for you to fetch changes up to 1ea2643961b0d1b8d0e4a11af5aa69b0f92d0533:

  ath6kl: add Dell OEM SDIO I/O for the Venue 8 Pro (2016-10-13 14:16:33 +0300)


wireless-drivers fixes for 4.9

wlcore

* fix a double free regression causing hard to track crashes

rtl8xxxu

* fix driver reload issues, a memory leak and an endian bug

rtlwifi

* fix a major regression introduced in 4.9 with firmware loading on
  certain hardware

ath10k

* fix regression about broken cal_data debugfs file (since 4.7)

ath9k

* revert temperature compensation for AR9003+ devices, it was causing
  too many problems

ath6kl

* add Dell OEM SDIO I/O for the Venue 8 Pro


Adam Williamson (1):
  ath6kl: add Dell OEM SDIO I/O for the Venue 8 Pro

Felix Fietkau (1):
  Revert "ath9k_hw: implement temperature compensation support for AR9003+"

Jes Sorensen (4):
  rtl8xxxu: Fix memory leak in handling rxdesc16 packets
  rtl8xxxu: Fix big-endian problem reporting mactime
  rtl8xxxu: Fix rtl8723bu driver reload issue
  rtl8xxxu: Fix rtl8192eu driver reload issue

Larry Finger (1):
  rtlwifi: Fix regression caused by commit d86e64768859

Marty Faltesek (1):
  ath10k: cache calibration data when the core is stopped

Wei Yongjun (1):
  wlcore: sdio: drop kfree for memory allocated with devm_kzalloc

 drivers/net/wireless/ath/ath10k/core.h             |  1 +
 drivers/net/wireless/ath/ath10k/debug.c            | 75 ++--
 drivers/net/wireless/ath/ath6kl/sdio.c             |  1 +
 drivers/net/wireless/ath/ath9k/ar9003_calib.c      | 25 +--
 drivers/net/wireless/ath/ath9k/hw.h                |  1 -
 drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h   |  4 +-
 .../net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c |  8 ++-
 .../net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c |  4 ++
 .../net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c  | 11 ++-
 drivers/net/wireless/realtek/rtlwifi/core.c        |  2 +-
 .../net/wireless/realtek/rtlwifi/rtl8188ee/sw.c    |  8 +--
 .../net/wireless/realtek/rtlwifi/rtl8192ce/sw.c    | 13 ++--
 .../net/wireless/realtek/rtlwifi/rtl8192cu/sw.c    | 12 ++--
 .../net/wireless/realtek/rtlwifi/rtl8192de/sw.c    |  6 +-
 .../net/wireless/realtek/rtlwifi/rtl8192ee/sw.c    |  8 +--
 .../net/wireless/realtek/rtlwifi/rtl8192se/sw.c    |  9 +--
 .../net/wireless/realtek/rtlwifi/rtl8723ae/sw.c    | 12 ++--
 .../net/wireless/realtek/rtlwifi/rtl8723be/sw.c    |  6 +-
 .../net/wireless/realtek/rtlwifi/rtl8821ae/sw.c    | 18 ++---
 drivers/net/wireless/realtek/rtlwifi/wifi.h        |  2 -
 drivers/net/wireless/ti/wlcore/sdio.c              |  1 -
 21 files changed, 110 insertions(+), 117 deletions(-)


[PATCH trivial] net: add bbr to config DEFAULT_TCP_CONG

2016-10-14 Thread Markus Trippelsdorf
While playing with BBR I noticed that it was missing in the list of
possible config DEFAULT_TCP_CONG choices. Fixed thusly.

Signed-off-by: Markus Trippelsdorf 

diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
index 300b06888fdf..b54b3ca939db 100644
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -715,6 +715,7 @@ config DEFAULT_TCP_CONG
default "reno" if DEFAULT_RENO
default "dctcp" if DEFAULT_DCTCP
default "cdg" if DEFAULT_CDG
+   default "bbr" if DEFAULT_BBR
default "cubic"
 
 config TCP_MD5SIG

-- 
Markus


Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Ard Biesheuvel
On 14 October 2016 at 09:28, Johannes Berg  wrote:
>
>>1. revert that patch (doing so would need some major adjustments now,
>>   since it's pretty old and a number of new things were added in the
>>   meantime)
>
> This it will have to be, I guess.
>
>>2. allocate a per-CPU buffer for all the things that we put on the
>>   stack and use in SG lists, those are:
>>* CCM/GCM: AAD (32B), B_0/J_0 (16B)
>>* GMAC: AAD (20B), zero (16B)
>>* (not sure why CMAC isn't using this API, but it would be like GMAC)
>
> This doesn't work - I tried to move the mac80211 buffers, but because
> we also put the struct aead_request on the stack, and crypto_ccm has
> the "odata" in there, and we can't separate the odata from that struct,
> we'd have to also put that into a per-CPU buffer, but it's very big -
> 456 bytes for CCM, didn't measure the others but I'd expect them to be
> larger, if different.
>
> I don't think we can allocate half a kb for each CPU just to be able to
> possibly use the acceleration here. We can't even make that conditional
> on not having hardware crypto in the wifi NIC because drivers are
> always allowed to pass undecrypted frames, regardless of whether or not
> HW crypto was attempted, so we don't know upfront if we'll have to
> decrypt anything in software...
>
> Given that, I think we have had a bug in here basically since Ard's
> patch, we never should've put these structs on the stack. Herbert, you
> also touched this later and converted the API usage, did you see the
> way the stack is used here and think it should be OK, or did you simply
> not realize that?
>
> Ard, are you able to help out working on a revert of your patch? That
> would require also reverting a number of other patches (various fixes,
> API adjustments, etc. to the AEAD usage), but the more complicated part
> is that in the meantime Jouni introduced GCMP and CCMP-256, both of
> which we of course need to retain.
>

I am missing some context here, but could you explain what exactly is
the problem here?

Look at this code

"""
struct scatterlist sg[3];

char aead_req_data[sizeof(struct aead_request) +
crypto_aead_reqsize(tfm)]
__aligned(__alignof__(struct aead_request));
struct aead_request *aead_req = (void *) aead_req_data;

memset(aead_req, 0, sizeof(aead_req_data));

sg_init_table(sg, 3);
sg_set_buf(&sg[0], &aad[2], be16_to_cpup((__be16 *)aad));
sg_set_buf(&sg[1], data, data_len);
sg_set_buf(&sg[2], mic, mic_len);

aead_request_set_tfm(aead_req, tfm);
aead_request_set_crypt(aead_req, sg, sg, data_len, b_0);
aead_request_set_ad(aead_req, sg[0].length);
"""

I assume the stack buffer itself is not the problem here, but aad,
which is allocated on the stack one frame up.
Do we really need to revert the whole patch to fix that?


Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Ard Biesheuvel
On 14 October 2016 at 09:42, Johannes Berg  wrote:
> On Fri, 2016-10-14 at 09:41 +0100, Ard Biesheuvel wrote:
>
>> > I assume the stack buffer itself is not the problem here, but aad,
>> > which is allocated on the stack one frame up.
>> > Do we really need to revert the whole patch to fix that?
>>
>> Ah never mind, this is about 'odata'. Apologies, should have read
>> first
>
> Right, odata also goes into an sg list and further on.
>
> I think we should wait for Herbert to chime in before we do any further
> work though, perhaps he has any better ideas.
>

Do you have a reference for the sg_set_buf() call on odata?
crypto/ccm.c does not seem to have it (afaict), and the same problem
does not exist in the accelerated arm64 implementation. In the
meantime, I will try and see if we can move aad[] off the stack in the
WPA code.


Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Johannes Berg
On Fri, 2016-10-14 at 17:39 +0900, Sergey Senozhatsky wrote:
> 
> given that we have a known issue shouldn't VMAP_STACK be
> disabled for now, or would you rather prefer to mark MAC80211
> as incompatible: "depends on CFG80211 && !VMAP_STACK"?

Yeah. It's a bit complicated by the fact that most people will probably
have hardware crypto in their wifi NICs, so that they won't actually
hit the software crypto path. As I said in my other email though, we
can't guarantee - even if the driver says it can do hardware crypto -
that it really will do it for all frames (some might not be able to do
it for management frames, for example), so we also can't really catch
this at runtime ...

Making mac80211 depend on !VMAP_STACK is probably technically best, but
I fear it'll break a lot of people's configurations who don't have a
problem right now (e.g. Linus's, who probably enabled this, but I know
where he uses wifi he uses an Intel NIC that will always do HW crypto).

Andy, what do you think?

johannes


Re: [PATCH net 2/2] conntrack: enable to tune gc parameters

2016-10-14 Thread Nicolas Dichtel
On 13/10/2016 at 22:43, Florian Westphal wrote:
> Nicolas Dichtel wrote:
>> On 10/10/2016 at 16:04, Florian Westphal wrote:
>>> Nicolas Dichtel wrote:
>>>> After commit b87a2f9199ea ("netfilter: conntrack: add gc worker to remove
>>>> timed-out entries"), netlink conntrack deletion events may be sent with a
>>>> huge delay. It could be interesting to let the user tweak gc parameters
>>>> depending on its use case.
>>>
>>> Hmm, care to elaborate?
>>>
>>> I am not against doing this but I'd like to hear/read your use case.
>>>
>>> The expectation is that in almost all cases eviction will happen from
>>> packet path.  The gc worker is just there for case where a busy system
>>> goes idle.
>> It was precisely that case. After a period of activity, the event is sent a
>> long time after the timeout. If the router does not manage a lot of flows,
>> why not trying to parse more entries instead of the default 1/64 of the table?
>> In fact, I don't understand why using GC_MAX_BUCKETS_DIV instead of always
>> using GC_MAX_BUCKETS whatever the size of the table is.
> 
> I wanted to make sure that we have a known upper bound on the number of
> buckets we process so that we do not block other pending kworker items
> for too long.
I don't understand. GC_MAX_BUCKETS is the upper bound and I agree that it is
needed. But why GC_MAX_BUCKETS_DIV (i.e. 1/64)?
In other words, why this line:
goal = min(nf_conntrack_htable_size / GC_MAX_BUCKETS_DIV, GC_MAX_BUCKETS);
instead of:
goal = GC_MAX_BUCKETS;
?

> 
> (Or cause too many useless scans)
> 
> Another idea worth trying might be to get rid of the max cap and
> instead break early in case too many jiffies expired.
> 
> I don't want to add sysctl knobs for this unless absolutely needed; it's
> already possible to 'force' eviction cycle by running 'conntrack -L'.
> 
Sure, but this is not a "real" solution, just a workaround.
We need to find a way to deliver conntrack deletion events in a reasonable
delay, whatever the traffic on the machine is.


Re: [PATCH net 3/5] net/ncsi: Fix stale link state of inactive channels on failover

2016-10-14 Thread Gavin Shan
On Fri, Oct 14, 2016 at 04:32:28PM +1030, Joel Stanley wrote:
>On Fri, Oct 14, 2016 at 1:23 PM, Gavin Shan  wrote:
>> The issue was found on BCM5718 which has two NCSI channels in one
>> package: C0 and C1. Both of them are connected to different LANs,
>> means they are in link-up state and C0 is chosen  as the active
>> one until resetting BCM5718 happens as below.
>>
>> Resetting BCM5718 results in LSC (Link State Change) AEN packet
>> received on C0, meaning LSC AEN is missed on C1. When LSC AEN packet
>> received on C0 to report link-down, it fails over to C1 because C1
>> is in link-up state as software can see. However, C1 is in link-down
>> state in hardware. It means the link state is out of synchronization
>> between hardware and software, resulting in inappropriate channel (C1)
>> selected as active one.
>>
>> This resolves the issue by sending separate GLS (Get Link Status)
>> commands to all channels in the package before trying to do failover.
>> The last link state on all channels in the package is retrieved. With
>> it, C0 is selected as active one as expected.
>
>I follow this, and can see that happening in the
>ncsi_dev_state_suspend_gls state. However, what is
>
>> -   nd->state = ncsi_dev_state_suspend_dcnt;
>> +   if (ndp->flags & NCSI_DEV_RESHUFFLE)
>> +   nd->state = ncsi_dev_state_suspend_gls;
>> +   else
>> +   nd->state = ncsi_dev_state_suspend_dcnt;
>
>However, what is this doing? I'm not quite sure what
>NCSI_DEV_RESHUFFLE is and why we enable it?
>

NCSI_DEV_RESHUFFLE is set when we need failover, which happens on
resetting the NIC, unplugging the cable connected to the active channel
(port), or other events. The first step of failover is to suspend the
currently active channel and then choose the best one (a channel's link
state is an important factor) to be active. ncsi_dev_state_suspend_gls
ensures we get the updated link state of the available channels before
choosing and enabling the next active channel. If no failover is
happening, we needn't get the updated link state of the available
channels and the state ncsi_dev_state_suspend_gls will be skipped.

I think I need to put comments here to explain the change in the next revision.
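
To make the branch in the quoted hunk concrete, here is a small standalone
sketch of the state choice. The flag value and the demo_* names are
hypothetical; only the decision itself (GLS step on failover, straight to
the DCNT step otherwise) is taken from the patch.

```c
#include <assert.h>

/* Hypothetical flag bit standing in for NCSI_DEV_RESHUFFLE. */
#define DEMO_DEV_RESHUFFLE 0x2u

/* Hypothetical names for the two possible follow-up suspend states. */
enum demo_suspend_state {
	DEMO_SUSPEND_GLS,	/* query link status on every channel first */
	DEMO_SUSPEND_DCNT,	/* go straight on with the normal suspend steps */
};

/* Pick the next suspend state: refresh link state only when failing over. */
enum demo_suspend_state demo_next_suspend_state(unsigned int flags)
{
	if (flags & DEMO_DEV_RESHUFFLE)
		return DEMO_SUSPEND_GLS;
	return DEMO_SUSPEND_DCNT;
}
```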

Thanks,
Gavin

>>
>> ret = ncsi_xmit_cmd(&nca);
>> if (ret)
>> goto error;
>>
>> break;
>> +   case ncsi_dev_state_suspend_gls:
>> +   ndp->pending_req_num = np->channel_num;
>> +
>> +   nca.type = NCSI_PKT_CMD_GLS;
>> +   nca.package = np->id;
>> +   nd->state = ncsi_dev_state_suspend_dcnt;
>> +
>> +   NCSI_FOR_EACH_CHANNEL(np, nc) {
>> +   nca.channel = nc->id;
>> +   ret = ncsi_xmit_cmd(&nca);
>> +   if (ret)
>> +   goto error;
>> +   }
>> +
>> +   break;
>> case ncsi_dev_state_suspend_dcnt:
>> case ncsi_dev_state_suspend_dc:
>> case ncsi_dev_state_suspend_deselect:
>> --
>> 2.1.0
>>
>



Re: [PATCH net 2/2] conntrack: enable to tune gc parameters

2016-10-14 Thread Florian Westphal
Nicolas Dichtel  wrote:
> On 13/10/2016 at 22:43, Florian Westphal wrote:
> > Nicolas Dichtel wrote:
> >> On 10/10/2016 at 16:04, Florian Westphal wrote:
> >>> Nicolas Dichtel wrote:
> >>>> After commit b87a2f9199ea ("netfilter: conntrack: add gc worker to remove
> >>>> timed-out entries"), netlink conntrack deletion events may be sent with a
> >>>> huge delay. It could be interesting to let the user tweak gc parameters
> >>>> depending on its use case.
> >>>
> >>> Hmm, care to elaborate?
> >>>
> >>> I am not against doing this but I'd like to hear/read your use case.
> >>>
> >>> The expectation is that in almost all cases eviction will happen from
> >>> packet path.  The gc worker is just there for case where a busy system
> >>> goes idle.
> >> It was precisely that case. After a period of activity, the event is sent
> >> a long time after the timeout. If the router does not manage a lot of
> >> flows, why not trying to parse more entries instead of the default 1/64
> >> of the table?
> >> In fact, I don't understand why using GC_MAX_BUCKETS_DIV instead of always
> >> using GC_MAX_BUCKETS whatever the size of the table is.
> > 
> > I wanted to make sure that we have a known upper bound on the number of
> > buckets we process so that we do not block other pending kworker items
> > for too long.
> I don't understand. GC_MAX_BUCKETS is the upper bound and I agree that it is
> needed. But why GC_MAX_BUCKETS_DIV (ie 1/64)?
> In other words, why this line:
> goal = min(nf_conntrack_htable_size / GC_MAX_BUCKETS_DIV, GC_MAX_BUCKETS);
> instead of:
> goal = GC_MAX_BUCKETS;

Sure, we can do that.  But why is a fixed size better than a fraction?

E.g. with 8k buckets and simple goal = GC_MAX_BUCKETS we scan entire
table on every run, currently we only scan 128.
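
A quick worked version of that arithmetic, as a standalone sketch. The 1/64
divisor is the one named in the thread; the 8192 cap is an assumption chosen
to match the "8k buckets means a full scan" remark, and bounding the fixed
goal by the table size is added here only for illustration.

```c
#include <assert.h>

/* 1/64 comes from the thread; the 8192 cap is an assumed value. */
#define GC_MAX_BUCKETS_DIV 64u
#define GC_MAX_BUCKETS     8192u

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Current behaviour: scan a fraction of the table, up to the cap. */
unsigned int gc_goal_fraction(unsigned int htable_size)
{
	return min_uint(htable_size / GC_MAX_BUCKETS_DIV, GC_MAX_BUCKETS);
}

/* Suggested alternative: always aim for the cap (bounded by table size). */
unsigned int gc_goal_fixed(unsigned int htable_size)
{
	return min_uint(htable_size, GC_MAX_BUCKETS);
}
```

With an 8192-bucket table the fractional goal is 128 buckets per run while
the fixed goal covers all 8192; the two only converge once the table has
64 * 8192 buckets or more.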

I wanted to keep too many destroy notifications from firing at once
but maybe I was too paranoid...

> > (Or cause too many useless scans)
> > 
> > Another idea worth trying might be to get rid of the max cap and
> > instead break early in case too many jiffies expired.
> > 
> > I don't want to add sysctl knobs for this unless absolutely needed; it's
> > already possible to 'force' eviction cycle by running 'conntrack -L'.
> > 
> Sure, but this is not a "real" solution, just a workaround.
> We need to find a way to deliver conntrack deletion events in a reasonable
> delay, whatever the traffic on the machine is.

Agree, but that depends on what 'reasonable' means and what kind of
unneeded CPU churn we're willing to add.

We can add a sysctl for this but we should use a low default to not do
too much unneeded work.

So what about your original patch, but only add

nf_conntrack_gc_interval

(and also add instant-resched in case entire budget was consumed)?



RE: [PATCH 4/6] fjes: Implement debug mode for fjes driver

2016-10-14 Thread Izumi, Taku
Dear David,

> -----Original Message-----
> From: David Miller [mailto:da...@davemloft.net]
> Sent: Thursday, October 13, 2016 11:20 PM
> To: Izumi, Taku/泉 拓
> Cc: netdev@vger.kernel.org
> Subject: Re: [PATCH 4/6] fjes: Implement debug mode for fjes driver
> 
> From: Taku Izumi 
> Date: Tue, 11 Oct 2016 17:55:20 +0900
> 
> > This patch implements debug mode for fjes driver.
> > You can get firmware activity information by enabling
> > debug mode. This is useful for debugging.
> >
> > To enable debug mode, write value of debugging mode to
> > debug_mode file in debugfs:
> >
> >   # echo 1 > /sys/kernel/debug/fjes/fjes.0/debug_mode
> >
> > To disable debug mode, write 0 to debug_mode file in debugfs:
> >
> >   # echo 0 > /sys/kernel/debug/fjes/fjes.0/debug_mode
> >
> > Firmware activity information can be retrieved via
> > /sys/kernel/debug/fjes/fjes.0/debug_data file.
> >
> > Signed-off-by: Taku Izumi 
> 
> There is no reason to use debugfs for this, we have facilities such
> as ETHTOOL_SET_DUMP et al. that you can use to implement this.

  Thank you for reviewing.
  I see. I'll rewrite this patch to use ethtool.

Sincerely,
Taku Izumi


Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Johannes Berg
On Fri, 2016-10-14 at 09:47 +0100, Ard Biesheuvel wrote:
> 
> Do you have a reference for the sg_set_buf() call on odata?
> crypto/ccm.c does not seem to have it (afaict), 

It's indirect - crypto_ccm_encrypt() calls crypto_ccm_init_crypt()
which does it.

> and the same problem
> does not exist in the accelerated arm64 implementation. In the mean
> time, I will try and see if we can move aad[] off the stack in the
> WPA code.

I had that with per-CPU buffers, just sent the patch upthread.

johannes


Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Johannes Berg
On Fri, 2016-10-14 at 10:21 +0100, Ard Biesheuvel wrote:

> It is annotated with a TODO, though :-)
> 
> 38320c70d282b (Herbert Xu 2008-01-28 19:35:05 -0800 41)
>  * TODO: Use spare space in skb for this where possible.

I saw that, but I don't think generally there will be spare space for
it - the stuff there is likely far too big. Anyway ... same problem
that we have.

I'm not inclined to allocate ~500 bytes temporarily for every frame
either though.

Maybe we could try to manage it in mac80211, we'd "only" need 5 AEAD
structs (which are today on the stack) in parallel for each key (4 TX,
1 RX), but in a typical case of having 3 keys that's already 7.5K worth
of memory that we almost never use. Again, with more complexity, we
could know that the TX will not be used if the driver does the TX, but
the single RX one we'd need unconditionally... decisions decisions...
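
Spelling out the estimate above as a small sketch: the 500-byte figure is
the rough per-request size used in this mail (456 bytes was measured for CCM
earlier in the thread), and the demo_* names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Rough per-request size assumed here; the thread measured 456 bytes for CCM. */
#define DEMO_AEAD_REQ_BYTES	500u
#define DEMO_TX_REQS_PER_KEY	4u
#define DEMO_RX_REQS_PER_KEY	1u

/* Bytes needed to preallocate AEAD requests for a given number of keys. */
size_t demo_prealloc_bytes(unsigned int keys, int driver_does_tx)
{
	unsigned int per_key = DEMO_RX_REQS_PER_KEY;

	/* TX requests are only needed when TX crypto is done in software. */
	if (!driver_does_tx)
		per_key += DEMO_TX_REQS_PER_KEY;

	return (size_t)keys * per_key * DEMO_AEAD_REQ_BYTES;
}
```

Three keys with software TX come to 7500 bytes, the "7.5K" above; dropping
the TX requests leaves 1500 bytes for the RX side alone.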

johannes


[PATCH] p54: memset(0) whole array

2016-10-14 Thread Jiri Slaby
gcc 7 complains:
drivers/net/wireless/intersil/p54/fwio.c: In function 'p54_scan':
drivers/net/wireless/intersil/p54/fwio.c:491:4: warning: 'memset' used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size]

Fix that by passing the correct size to memset.

Signed-off-by: Jiri Slaby 
Cc: Christian Lamparter 
Cc: Kalle Valo 
Cc: linux-wirel...@vger.kernel.org
Cc: netdev@vger.kernel.org
---
 drivers/net/wireless/intersil/p54/fwio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/intersil/p54/fwio.c b/drivers/net/wireless/intersil/p54/fwio.c
index 257a9eadd595..4ac6764f4897 100644
--- a/drivers/net/wireless/intersil/p54/fwio.c
+++ b/drivers/net/wireless/intersil/p54/fwio.c
@@ -488,7 +488,7 @@ int p54_scan(struct p54_common *priv, u16 mode, u16 dwell)
 
entry += sizeof(__le16);
chan->pa_points_per_curve = 8;
-   memset(chan->curve_data, 0, sizeof(*chan->curve_data));
+   memset(chan->curve_data, 0, sizeof(chan->curve_data));
memcpy(chan->curve_data, entry,
   sizeof(struct p54_pa_curve_data_sample) *
   min((u8)8, curve_data->points_per_channel));
-- 
2.10.1
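
For anyone unsure why the one-character change matters, a standalone sketch
of the two sizeof forms follows. The demo_* structures are hypothetical
stand-ins for the driver's types; only the sizeof(*array) versus
sizeof(array) distinction is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's sample and channel structures. */
struct demo_sample {
	unsigned char rf_power;
	unsigned char pa_detector;
	unsigned char data[4];
};

struct demo_chan {
	struct demo_sample curve_data[8];
};

/* sizeof(*array) is the size of ONE element. */
size_t demo_one_element_size(const struct demo_chan *chan)
{
	return sizeof(*chan->curve_data);
}

/* sizeof(array) is the size of the WHOLE array. */
size_t demo_whole_array_size(const struct demo_chan *chan)
{
	return sizeof(chan->curve_data);
}

/* Old form: clears only the first element. */
void demo_clear_first_only(struct demo_chan *chan)
{
	memset(chan->curve_data, 0, sizeof(*chan->curve_data));
}

/* Fixed form: clears every element. */
void demo_clear_all(struct demo_chan *chan)
{
	memset(chan->curve_data, 0, sizeof(chan->curve_data));
}
```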



[PATCH] ethtool: Zero memory allocated for statistics

2016-10-14 Thread Vlad Tsyrklevich
Zero allocations before they're passed to drivers to be filled out with
statistics. While many drivers always correctly fill out the entire
allocated space, under some failure conditions some drivers will not
clear the allocated space appropriately. Unprivileged users could
induce some of these failure conditions to leak kernel memory. Instead
of fixing drivers one by one, the best solution is to eliminate the
possibility of driver errors leaking kernel memory entirely.

Given that ethtool_get_stats(), ethtool_get_phy_stats(), and
ethtool_get_tunable() are accessible without CAP_NET_ADMIN they are the
most important to clear to avoid memory leaks. ethtool_self_test() and
ethtool_get_any_eeprom() require CAP_NET_ADMIN but were also included
for completeness.

Some examples of driver methods that could fail to fill out memory:
enic_get_ethtool_stats(), cp_get_ethtool_stats(),
mv88e6xxx_get_ethtool_stats(), bnx2x_self_test(), be_self_test(), etc.

Signed-off-by: Vlad Tsyrklevich 
---
 net/core/ethtool.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 9774898..7202915 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -1538,7 +1538,7 @@ static int ethtool_get_any_eeprom(struct net_device *dev, void __user *useraddr,
if (eeprom.offset + eeprom.len > total_len)
return -EINVAL;
 
-   data = kmalloc(PAGE_SIZE, GFP_USER);
+   data = kzalloc(PAGE_SIZE, GFP_USER);
if (!data)
return -ENOMEM;
 
@@ -1775,7 +1775,7 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
return -EFAULT;
 
test.len = test_len;
-   data = kmalloc(test_len * sizeof(u64), GFP_USER);
+   data = kcalloc(test_len, sizeof(u64), GFP_USER);
if (!data)
return -ENOMEM;
 
@@ -1907,7 +1907,7 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
return -EFAULT;
 
stats.n_stats = n_stats;
-   data = kmalloc(n_stats * sizeof(u64), GFP_USER);
+   data = kcalloc(n_stats, sizeof(u64), GFP_USER);
if (!data)
return -ENOMEM;
 
@@ -1946,7 +1946,7 @@ static int ethtool_get_phy_stats(struct net_device *dev, void __user *useraddr)
return -EFAULT;
 
stats.n_stats = n_stats;
-   data = kmalloc_array(n_stats, sizeof(u64), GFP_USER);
+   data = kcalloc(n_stats, sizeof(u64), GFP_USER);
if (!data)
return -ENOMEM;
 
@@ -2269,7 +2269,7 @@ static int ethtool_get_tunable(struct net_device *dev, void __user *useraddr)
ret = ethtool_tunable_valid(&tuna);
if (ret)
return ret;
-   data = kmalloc(tuna.len, GFP_USER);
+   data = kzalloc(tuna.len, GFP_USER);
if (!data)
return -ENOMEM;
ret = ops->get_tunable(dev, &tuna, data);
-- 
2.7.0
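
A userspace analogy of the approach, as a rough sketch: calloc() stands in
for kcalloc(), and demo_get_stats() is a made-up callback that bails out
early the way a failing driver method might. None of the demo_* names
exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical driver callback: on failure it returns early and leaves
 * most of the buffer untouched.
 */
static void demo_get_stats(uint64_t *data, size_t n_stats, int fail)
{
	size_t i;

	if (fail) {
		if (n_stats)
			data[0] = 1;
		return;
	}
	for (i = 0; i < n_stats; i++)
		data[i] = i + 1;
}

/*
 * calloc() plays the role of kcalloc() here: the buffer starts zeroed, so
 * whatever the callback skips is copied out as zeros, not stale memory.
 * The caller owns the returned buffer and must free() it.
 */
uint64_t *demo_collect_stats(size_t n_stats, int fail)
{
	uint64_t *data = calloc(n_stats, sizeof(*data));

	if (!data)
		return NULL;
	demo_get_stats(data, n_stats, fail);
	return data;
}
```

The caller never has to know which callbacks are well behaved, which is the
argument the commit message makes for fixing this in one place.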



[PATCH v3 net-next 6/7] qed: Handle malicious VFs events

2016-10-14 Thread Manish Chopra
From: Yuval Mintz 

Malicious VFs might be caught in several different ways:
  - Misusing their bar permission and being blocked by hardware.
  - Misusing their fastpath logic and being blocked by firmware.
  - Misusing their interaction with their PF via hw-channel,
    and being blocked by PF driver.

On the first two items, firmware would indicate to driver that
the VF is to be considered malicious, but would sometimes still
allow the VF to communicate with the PF [depending on the exact
nature of the malicious activity done by the VF].
The existing logic on the PF side lacks handling of such events,
and might allow the PF to perform some incorrect configuration on behalf
of a VF that was previously indicated as malicious.

The new scheme is simple -
Once the PF determines a VF is malicious it would:
 a. Ignore any further requests on behalf of the VF-driver.
 b. Prevent any configurations initiated by the hyperuser for
the malicious VF, as firmware isn't willing to serve such.

The malicious indication would be cleared upon the VF flr,
after which it would become usable once again.

Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_sriov.c | 114 +++-
 drivers/net/ethernet/qlogic/qed/qed_sriov.h |   1 +
 drivers/net/ethernet/qlogic/qed/qed_vf.h|   1 +
 3 files changed, 96 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_sriov.c b/drivers/net/ethernet/qlogic/qed/qed_sriov.c
index d2d6621..6f029f9 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_sriov.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_sriov.c
@@ -109,7 +109,8 @@ static int qed_sp_vf_stop(struct qed_hwfn *p_hwfn,
 }
 
 static bool qed_iov_is_valid_vfid(struct qed_hwfn *p_hwfn,
- int rel_vf_id, bool b_enabled_only)
+ int rel_vf_id,
+ bool b_enabled_only, bool b_non_malicious)
 {
if (!p_hwfn->pf_iov_info) {
DP_NOTICE(p_hwfn->cdev, "No iov info\n");
@@ -124,6 +125,10 @@ static bool qed_iov_is_valid_vfid(struct qed_hwfn *p_hwfn,
b_enabled_only)
return false;
 
+   if ((p_hwfn->pf_iov_info->vfs_array[rel_vf_id].b_malicious) &&
+   b_non_malicious)
+   return false;
+
return true;
 }
 
@@ -138,7 +143,8 @@ static struct qed_vf_info *qed_iov_get_vf_info(struct qed_hwfn *p_hwfn,
return NULL;
}
 
-   if (qed_iov_is_valid_vfid(p_hwfn, relative_vf_id, b_enabled_only))
+   if (qed_iov_is_valid_vfid(p_hwfn, relative_vf_id,
+ b_enabled_only, false))
vf = &p_hwfn->pf_iov_info->vfs_array[relative_vf_id];
else
DP_ERR(p_hwfn, "qed_iov_get_vf_info: VF[%d] is not enabled\n",
@@ -542,7 +548,8 @@ int qed_iov_hw_info(struct qed_hwfn *p_hwfn)
return 0;
 }
 
-static bool qed_iov_pf_sanity_check(struct qed_hwfn *p_hwfn, int vfid)
+bool _qed_iov_pf_sanity_check(struct qed_hwfn *p_hwfn,
+ int vfid, bool b_fail_malicious)
 {
/* Check PF supports sriov */
if (IS_VF(p_hwfn->cdev) || !IS_QED_SRIOV(p_hwfn->cdev) ||
@@ -550,12 +557,17 @@ static bool qed_iov_pf_sanity_check(struct qed_hwfn *p_hwfn, int vfid)
return false;
 
/* Check VF validity */
-   if (!qed_iov_is_valid_vfid(p_hwfn, vfid, true))
+   if (!qed_iov_is_valid_vfid(p_hwfn, vfid, true, b_fail_malicious))
return false;
 
return true;
 }
 
+bool qed_iov_pf_sanity_check(struct qed_hwfn *p_hwfn, int vfid)
+{
+   return _qed_iov_pf_sanity_check(p_hwfn, vfid, true);
+}
+
 static void qed_iov_set_vf_to_disable(struct qed_dev *cdev,
  u16 rel_vf_id, u8 to_disable)
 {
@@ -652,6 +664,9 @@ static int qed_iov_enable_vf_access(struct qed_hwfn *p_hwfn,
 
qed_iov_vf_igu_reset(p_hwfn, p_ptt, vf);
 
+   /* It's possible VF was previously considered malicious */
+   vf->b_malicious = false;
+
rc = qed_mcp_config_vf_msix(p_hwfn, p_ptt, vf->abs_vf_id, vf->num_sbs);
if (rc)
return rc;
@@ -2804,6 +2819,13 @@ qed_iov_execute_vf_flr_cleanup(struct qed_hwfn *p_hwfn,
return rc;
}
 
+   /* Workaround to make VF-PF channel ready, as FW
+* doesn't do that as a part of FLR.
+*/
+   REG_WR(p_hwfn,
+  GTT_BAR0_MAP_REG_USDM_RAM +
+  USTORM_VF_PF_CHANNEL_READY_OFFSET(vfid), 1);
+
/* VF_STOPPED has to be set only after final cleanup
 * but prior to re-enabling the VF.
 */
@@ -2942,7 +2964,8 @@ static void qed_iov_process_mbx_req(struct qed_hwfn 
*p_hwfn,
mbx->first_tlv = mbx->req_virt->first_tlv;
 
/* check if tlv type is known */

[PATCH v3 net-next 2/7] qede: GSO support for tunnels with outer csum

2016-10-14 Thread Manish Chopra
From: Manish Chopra 

This patch adds GSO support for GRE and UDP tunnels
where outer checksums are enabled.

Signed-off-by: Manish Chopra 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qede/qede.h  |  1 +
 drivers/net/ethernet/qlogic/qede/qede_main.c | 26 +++---
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qede/qede.h 
b/drivers/net/ethernet/qlogic/qede/qede.h
index 28c0e9f..f50e527 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -320,6 +320,7 @@ struct qede_fastpath {
 #define XMIT_L4_CSUM   BIT(0)
 #define XMIT_LSO   BIT(1)
 #define XMIT_ENC   BIT(2)
+#define XMIT_ENC_GSO_L4_CSUM   BIT(3)
 
 #define QEDE_CSUM_ERRORBIT(0)
 #define QEDE_CSUM_UNNECESSARY  BIT(1)
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c 
b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 9866d95..7d5dc1e 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -400,8 +400,19 @@ static u32 qede_xmit_type(struct qede_dev *edev,
(ipv6_hdr(skb)->nexthdr == NEXTHDR_IPV6))
*ipv6_ext = 1;
 
-   if (skb->encapsulation)
+   if (skb->encapsulation) {
rc |= XMIT_ENC;
+   if (skb_is_gso(skb)) {
+   unsigned short gso_type = skb_shinfo(skb)->gso_type;
+
+   if ((gso_type & SKB_GSO_UDP_TUNNEL_CSUM) ||
+   (gso_type & SKB_GSO_GRE_CSUM))
+   rc |= XMIT_ENC_GSO_L4_CSUM;
+
+   rc |= XMIT_LSO;
+   return rc;
+   }
+   }
 
if (skb_is_gso(skb))
rc |= XMIT_LSO;
@@ -637,6 +648,12 @@ static netdev_tx_t qede_start_xmit(struct sk_buff *skb,
if (unlikely(xmit_type & XMIT_ENC)) {
first_bd->data.bd_flags.bitfields |=
1 << ETH_TX_1ST_BD_FLAGS_TUNN_IP_CSUM_SHIFT;
+
+   if (xmit_type & XMIT_ENC_GSO_L4_CSUM) {
+   u8 tmp = ETH_TX_1ST_BD_FLAGS_TUNN_L4_CSUM_SHIFT;
+
+   first_bd->data.bd_flags.bitfields |= 1 << tmp;
+   }
hlen = qede_get_skb_hlen(skb, true);
} else {
first_bd->data.bd_flags.bitfields |=
@@ -2320,11 +2337,14 @@ static void qede_init_ndev(struct qede_dev *edev)
 
/* Encap features*/
hw_features |= NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL |
-  NETIF_F_TSO_ECN;
+  NETIF_F_TSO_ECN | NETIF_F_GSO_UDP_TUNNEL_CSUM |
+  NETIF_F_GSO_GRE_CSUM;
ndev->hw_enc_features = NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
NETIF_F_SG | NETIF_F_TSO | NETIF_F_TSO_ECN |
NETIF_F_TSO6 | NETIF_F_GSO_GRE |
-   NETIF_F_GSO_UDP_TUNNEL | NETIF_F_RXCSUM;
+   NETIF_F_GSO_UDP_TUNNEL | NETIF_F_RXCSUM |
+   NETIF_F_GSO_UDP_TUNNEL_CSUM |
+   NETIF_F_GSO_GRE_CSUM;
 
ndev->vlan_features = hw_features | NETIF_F_RXHASH | NETIF_F_RXCSUM |
  NETIF_F_HIGHDMA;
-- 
2.7.2



[PATCH v3 net-next 3/7] qede: Prevent GSO on long Geneve headers

2016-10-14 Thread Manish Chopra
From: Manish Chopra 

Due to a hardware limitation, when transmitting a geneve-encapsulated
packet with more than 32 bytes worth of geneve options, the hardware
is unable to parse the packet and treats it as a regular UDP
packet.

This implements ndo_features_check() in qede in order to prevent
GSO on such packets.

Signed-off-by: Manish Chopra 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qede/qede_main.c | 35 
 1 file changed, 35 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c 
b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 7d5dc1e..6c2b09c 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -2240,6 +2240,40 @@ static void qede_udp_tunnel_del(struct net_device *dev,
schedule_delayed_work(&edev->sp_task, 0);
 }
 
+/* 8B udp header + 8B base tunnel header + 32B option length */
+#define QEDE_MAX_TUN_HDR_LEN 48
+
+static netdev_features_t qede_features_check(struct sk_buff *skb,
+struct net_device *dev,
+netdev_features_t features)
+{
+   if (skb->encapsulation) {
+   u8 l4_proto = 0;
+
+   switch (vlan_get_protocol(skb)) {
+   case htons(ETH_P_IP):
+   l4_proto = ip_hdr(skb)->protocol;
+   break;
+   case htons(ETH_P_IPV6):
+   l4_proto = ipv6_hdr(skb)->nexthdr;
+   break;
+   default:
+   return features;
+   }
+
+   /* Disable offloads for geneve tunnels, as HW can't parse
+* the geneve header which has option length greater than 32B.
+*/
+   if ((l4_proto == IPPROTO_UDP) &&
+   ((skb_inner_mac_header(skb) -
+ skb_transport_header(skb)) > QEDE_MAX_TUN_HDR_LEN))
+   return features & ~(NETIF_F_CSUM_MASK |
+   NETIF_F_GSO_MASK);
+   }
+
+   return features;
+}
+
 static const struct net_device_ops qede_netdev_ops = {
.ndo_open = qede_open,
.ndo_stop = qede_close,
@@ -2264,6 +2298,7 @@ static const struct net_device_ops qede_netdev_ops = {
 #endif
.ndo_udp_tunnel_add = qede_udp_tunnel_add,
.ndo_udp_tunnel_del = qede_udp_tunnel_del,
+   .ndo_features_check = qede_features_check,
 };
 
 /* -
-- 
2.7.2
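
The length check in qede_features_check() boils down to simple arithmetic on header offsets. A minimal sketch, assuming plain byte offsets in place of skb_transport_header() and skb_inner_mac_header(); the helper name `tunnel_hdr_too_long()` and the macro names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define UDP_HDR_LEN      8	/* outer UDP header */
#define TUN_BASE_HDR_LEN 8	/* base tunnel header */
#define MAX_TUN_OPT_LEN  32	/* longest option area the HW can parse */

/* 8B + 8B + 32B = 48B, the value of QEDE_MAX_TUN_HDR_LEN in the patch. */
#define MAX_TUN_HDR_LEN (UDP_HDR_LEN + TUN_BASE_HDR_LEN + MAX_TUN_OPT_LEN)

/* Offsets are measured from the start of the packet, in bytes. */
static bool tunnel_hdr_too_long(size_t transport_off, size_t inner_mac_off)
{
	assert(inner_mac_off >= transport_off);

	return (inner_mac_off - transport_off) > MAX_TUN_HDR_LEN;
}
```

A packet whose option area is exactly 32 bytes still passes; only longer option areas cause the offload feature bits to be masked out.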



[PATCH v3 net-next 1/7] qed: Pass MAC hints to VFs

2016-10-14 Thread Manish Chopra
From: Yuval Mintz 

Some hypervisors can support MAC hints to their VFs.
Even though we don't have such a hypervisor API in Linux, we add
sufficient logic for the VF to be able to receive such hints and
set the MAC accordingly - as long as the VF has not been set with
a MAC already.

Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_vf.c | 4 ++--
 drivers/net/ethernet/qlogic/qede/qede_main.c | 6 +-
 include/linux/qed/qed_eth_if.h   | 2 +-
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_vf.c 
b/drivers/net/ethernet/qlogic/qed/qed_vf.c
index abf5bf1..f580bf4 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_vf.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_vf.c
@@ -1230,8 +1230,8 @@ static void qed_handle_bulletin_change(struct qed_hwfn 
*hwfn)
 
is_mac_exist = qed_vf_bulletin_get_forced_mac(hwfn, mac,
  &is_mac_forced);
-   if (is_mac_exist && is_mac_forced && cookie)
-   ops->force_mac(cookie, mac);
+   if (is_mac_exist && cookie)
+   ops->force_mac(cookie, mac, !!is_mac_forced);
 
/* Always update link configuration according to bulletin */
qed_link_update(hwfn);
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c 
b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 343038c..9866d95 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -171,10 +171,14 @@ static struct pci_driver qede_pci_driver = {
 #endif
 };
 
-static void qede_force_mac(void *dev, u8 *mac)
+static void qede_force_mac(void *dev, u8 *mac, bool forced)
 {
struct qede_dev *edev = dev;
 
+   /* MAC hints take effect only if we haven't set one already */
+   if (is_valid_ether_addr(edev->ndev->dev_addr) && !forced)
+   return;
+
ether_addr_copy(edev->ndev->dev_addr, mac);
ether_addr_copy(edev->primary_mac, mac);
 }
diff --git a/include/linux/qed/qed_eth_if.h b/include/linux/qed/qed_eth_if.h
index 33c24eb..1c77948 100644
--- a/include/linux/qed/qed_eth_if.h
+++ b/include/linux/qed/qed_eth_if.h
@@ -129,7 +129,7 @@ struct qed_tunn_params {
 
 struct qed_eth_cb_ops {
struct qed_common_cb_ops common;
-   void (*force_mac) (void *dev, u8 *mac);
+   void (*force_mac) (void *dev, u8 *mac, bool forced);
 };
 
 #ifdef CONFIG_DCB
-- 
2.7.2
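
The hint-versus-forced decision in qede_force_mac() can be sketched in userspace like this. It is an approximation, not the driver code: `mac_is_valid()` is a hypothetical stand-in for the kernel's is_valid_ether_addr() (address is neither all-zero nor multicast), and `apply_bulletin_mac()` is an illustrative name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Userspace approximation: not all-zero and not multicast. */
static bool mac_is_valid(const uint8_t *mac)
{
	static const uint8_t zero[ETH_ALEN];

	return memcmp(mac, zero, ETH_ALEN) != 0 && !(mac[0] & 0x01);
}

/* Returns true if dev_addr was overwritten with the bulletin MAC. */
static bool apply_bulletin_mac(uint8_t *dev_addr, const uint8_t *mac,
			       bool forced)
{
	assert(dev_addr != NULL && mac != NULL);

	/* A hint takes effect only if no valid MAC has been set yet. */
	if (mac_is_valid(dev_addr) && !forced)
		return false;

	memcpy(dev_addr, mac, ETH_ALEN);
	return true;
}
```

A forced MAC always wins, while a hint is silently ignored once the interface already has a valid address.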



[PATCH v3 net-next 5/7] qed: Allow chance for fast ramrod completions

2016-10-14 Thread Manish Chopra
From: Yuval Mintz 

Whenever a ramrod is being sent for some device configuration,
the driver is going to sleep at least 5ms between each iteration
of polling on the completion of the ramrod.

However, in almost every configuration scenario the firmware
would be able to comply and complete the ramrod in a matter of
several usecs. This is especially important in cases where there
might be a lot of sequential configurations applying to the hardware
[e.g., RoCE], in which case the existing scheme might cause some
visible user delays.

This patch changes the completion scheme - instead of immediately
starting to sleep for a 'long' period, the driver first polls
quickly, with a delay of a few usecs between iterations.

Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_spq.c | 85 +--
 1 file changed, 59 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_spq.c 
b/drivers/net/ethernet/qlogic/qed/qed_spq.c
index caff415..259a615 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_spq.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_spq.c
@@ -37,7 +37,11 @@
 ***/
 
 #define SPQ_HIGH_PRI_RESERVE_DEFAULT(1)
-#define SPQ_BLOCK_SLEEP_LENGTH  (1000)
+
+#define SPQ_BLOCK_DELAY_MAX_ITER(10)
+#define SPQ_BLOCK_DELAY_US  (10)
+#define SPQ_BLOCK_SLEEP_MAX_ITER(1000)
+#define SPQ_BLOCK_SLEEP_MS  (5)
 
 /***
 * Blocking Imp. (BLOCK/EBLOCK mode)
@@ -57,53 +61,81 @@ static void qed_spq_blocking_cb(struct qed_hwfn *p_hwfn,
smp_wmb();
 }
 
-static int qed_spq_block(struct qed_hwfn *p_hwfn,
-struct qed_spq_entry *p_ent,
-u8 *p_fw_ret)
+static int __qed_spq_block(struct qed_hwfn *p_hwfn,
+  struct qed_spq_entry *p_ent,
+  u8 *p_fw_ret, bool sleep_between_iter)
 {
-   int sleep_count = SPQ_BLOCK_SLEEP_LENGTH;
struct qed_spq_comp_done *comp_done;
-   int rc;
+   u32 iter_cnt;
 
comp_done = (struct qed_spq_comp_done *)p_ent->comp_cb.cookie;
-   while (sleep_count) {
-   /* validate we receive completion update */
+   iter_cnt = sleep_between_iter ? SPQ_BLOCK_SLEEP_MAX_ITER
+ : SPQ_BLOCK_DELAY_MAX_ITER;
+
+   while (iter_cnt--) {
+   /* Validate we receive completion update */
smp_rmb();
if (comp_done->done == 1) {
if (p_fw_ret)
*p_fw_ret = comp_done->fw_return_code;
return 0;
}
-   usleep_range(5000, 10000);
-   sleep_count--;
+
+   if (sleep_between_iter)
+   msleep(SPQ_BLOCK_SLEEP_MS);
+   else
+   udelay(SPQ_BLOCK_DELAY_US);
}
 
+   return -EBUSY;
+}
+
+static int qed_spq_block(struct qed_hwfn *p_hwfn,
+struct qed_spq_entry *p_ent,
+u8 *p_fw_ret, bool skip_quick_poll)
+{
+   struct qed_spq_comp_done *comp_done;
+   int rc;
+
+   /* A relatively short polling period w/o sleeping, to allow the FW to
+* complete the ramrod and thus possibly to avoid the following sleeps.
+*/
+   if (!skip_quick_poll) {
+   rc = __qed_spq_block(p_hwfn, p_ent, p_fw_ret, false);
+   if (!rc)
+   return 0;
+   }
+
+   /* Move to polling with a sleeping period between iterations */
+   rc = __qed_spq_block(p_hwfn, p_ent, p_fw_ret, true);
+   if (!rc)
+   return 0;
+
DP_INFO(p_hwfn, "Ramrod is stuck, requesting MCP drain\n");
rc = qed_mcp_drain(p_hwfn, p_hwfn->p_main_ptt);
-   if (rc != 0)
+   if (rc) {
DP_NOTICE(p_hwfn, "MCP drain failed\n");
+   goto err;
+   }
 
/* Retry after drain */
-   sleep_count = SPQ_BLOCK_SLEEP_LENGTH;
-   while (sleep_count) {
-   /* validate we receive completion update */
-   smp_rmb();
-   if (comp_done->done == 1) {
-   if (p_fw_ret)
-   *p_fw_ret = comp_done->fw_return_code;
-   return 0;
-   }
-   usleep_range(5000, 10000);
-   sleep_count--;
-   }
+   rc = __qed_spq_block(p_hwfn, p_ent, p_fw_ret, true);
+   if (!rc)
+   return 0;
 
+   comp_done = (struct qed_spq_comp_done *)p_ent->comp_cb.cookie;
if (comp_done->done == 1) {
if (p_fw_ret)
*p_fw_ret = comp_done->fw_return_code;
return 0;
}
-
-   DP_NOTICE(p_hwfn, 

[PATCH v3 net-next 4/7] qed*: Allow unicast filtering

2016-10-14 Thread Manish Chopra
From: Yuval Mintz 

Apparently qede fails to set IFF_UNICAST_FLT, and as a result is not
actually performing unicast MAC filtering.
While we're at it - relax a hard-coded limitation that restricts each
interface to at most 15 unicast MAC addresses before turning
promiscuous. Instead utilize the HW resources to their limit.

Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_l2.c | 12 ++--
 drivers/net/ethernet/qlogic/qede/qede_main.c |  4 +++-
 include/linux/qed/qed_eth_if.h   |  1 +
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_l2.c 
b/drivers/net/ethernet/qlogic/qed/qed_l2.c
index ddd410a..6b0e22d 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_l2.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_l2.c
@@ -1652,6 +1652,7 @@ static int qed_fill_eth_dev_info(struct qed_dev *cdev,
 
if (IS_PF(cdev)) {
int max_vf_vlan_filters = 0;
+   int max_vf_mac_filters = 0;
 
if (cdev->int_params.out.int_mode == QED_INT_MODE_MSIX) {
for_each_hwfn(cdev, i)
@@ -1665,11 +1666,18 @@ static int qed_fill_eth_dev_info(struct qed_dev *cdev,
info->num_queues = cdev->num_hwfns;
}
 
-   if (IS_QED_SRIOV(cdev))
+   if (IS_QED_SRIOV(cdev)) {
max_vf_vlan_filters = cdev->p_iov_info->total_vfs *
  QED_ETH_VF_NUM_VLAN_FILTERS;
-   info->num_vlan_filters = RESC_NUM(&cdev->hwfns[0], QED_VLAN) -
+   max_vf_mac_filters = cdev->p_iov_info->total_vfs *
+QED_ETH_VF_NUM_MAC_FILTERS;
+   }
+   info->num_vlan_filters = RESC_NUM(QED_LEADING_HWFN(cdev),
+ QED_VLAN) -
 max_vf_vlan_filters;
+   info->num_mac_filters = RESC_NUM(QED_LEADING_HWFN(cdev),
+QED_MAC) -
+   max_vf_mac_filters;
 
ether_addr_copy(info->port_mac,
cdev->hwfns[0].hw_info.hw_mac_addr);
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c 
b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 6c2b09c..0e483af 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -2365,6 +2365,8 @@ static void qede_init_ndev(struct qede_dev *edev)
 
qede_set_ethtool_ops(ndev);
 
+   ndev->priv_flags = IFF_UNICAST_FLT;
+
/* user-changeble features */
hw_features = NETIF_F_GRO | NETIF_F_SG |
  NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
@@ -3937,7 +3939,7 @@ static void qede_config_rx_mode(struct net_device *ndev)
 
/* Check for promiscuous */
if ((ndev->flags & IFF_PROMISC) ||
-   (uc_count > 15)) { /* @@@TBD resource allocation - 1 */
+   (uc_count > edev->dev_info.num_mac_filters - 1)) {
accept_flags = QED_FILTER_RX_MODE_TYPE_PROMISC;
} else {
/* Add MAC filters according to the unicast secondary macs */
diff --git a/include/linux/qed/qed_eth_if.h b/include/linux/qed/qed_eth_if.h
index 1c77948..1513080 100644
--- a/include/linux/qed/qed_eth_if.h
+++ b/include/linux/qed/qed_eth_if.h
@@ -23,6 +23,7 @@ struct qed_dev_eth_info {
 
u8  port_mac[ETH_ALEN];
u8  num_vlan_filters;
+   u16 num_mac_filters;
 
/* Legacy VF - this affects the datapath, so qede has to know */
bool is_legacy;
-- 
2.7.2
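
The filter budget introduced by this patch is plain arithmetic. A minimal sketch, with hypothetical helper names and made-up resource counts in the usage note; the real values come from RESC_NUM() and QED_ETH_VF_NUM_MAC_FILTERS, which are not shown in this thread:

```c
#include <assert.h>
#include <stdbool.h>

/* MAC filters left for the PF after reserving a fixed share per VF. */
static int pf_mac_filters(int total_macs, int total_vfs, int per_vf)
{
	int reserved = total_vfs * per_vf;

	assert(total_macs >= 0 && reserved >= 0 && reserved <= total_macs);

	return total_macs - reserved;
}

/*
 * Mirrors the check in qede_config_rx_mode(): one filter is kept aside
 * (presumably for the primary MAC), the rest may hold secondary
 * unicast addresses before the interface has to turn promiscuous.
 */
static bool must_go_promisc(bool iff_promisc, int uc_count,
			    int num_mac_filters)
{
	return iff_promisc || uc_count > num_mac_filters - 1;
}
```

For example, with 256 MAC resources, 8 VFs and 1 filter per VF (illustrative numbers only), the PF would be left with 248 filters and would go promiscuous once more than 247 secondary unicast addresses are configured.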



[PATCH v3 net-next 0/7] qed*: driver updates

2016-10-14 Thread Manish Chopra
From: Manish Chopra 

Hi David,

There are several new additions in this series;
Most are connected to either Tx offloading or Rx classifications
[either fastpath changes or supporting configuration].

In addition, there's a single IOV enhancement.

Please consider applying this series to `net-next'.

V2->V3:
Fixes the below kbuild warning:
call to '__compiletime_assert_60' declared with
attribute error: Need native word sized stores/loads for atomicity.

V1->V2:
Added a fix for the race in ramrod handling
pointed out by Eric Dumazet [patch 7].

Thanks,
Manish

Manish Chopra (3):
  qede: GSO support for tunnels with outer csum
  qede: Prevent GSO on long Geneve headers
  qed: Fix possible race when reading firmware return code.

Yuval Mintz (4):
  qed: Pass MAC hints to VFs
  qed*: Allow unicast filtering
  qed: Allow chance for fast ramrod completions
  qed: Handle malicious VFs events

 drivers/net/ethernet/qlogic/qed/qed_l2.c |  12 ++-
 drivers/net/ethernet/qlogic/qed/qed_sp.h |   4 +-
 drivers/net/ethernet/qlogic/qed/qed_spq.c|  97 +++
 drivers/net/ethernet/qlogic/qed/qed_sriov.c  | 114 ++-
 drivers/net/ethernet/qlogic/qed/qed_sriov.h  |   1 +
 drivers/net/ethernet/qlogic/qed/qed_vf.c |   4 +-
 drivers/net/ethernet/qlogic/qed/qed_vf.h |   1 +
 drivers/net/ethernet/qlogic/qede/qede.h  |   1 +
 drivers/net/ethernet/qlogic/qede/qede_main.c |  71 +++--
 include/linux/qed/qed_eth_if.h   |   3 +-
 10 files changed, 244 insertions(+), 64 deletions(-)

-- 
2.7.2



[PATCH v3 net-next 7/7] qed: Fix possible race when reading firmware return code.

2016-10-14 Thread Manish Chopra
From: Manish Chopra 

While handling SPQ ramrod completion, there is a possible race
where the driver might observe the completion done flag but still
read a stale fw return code. This patch ensures that the fw return
code is written first and the completion done flag is updated
afterwards, using appropriate memory barriers.

Signed-off-by: Manish Chopra 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_sp.h  |  4 ++--
 drivers/net/ethernet/qlogic/qed/qed_spq.c | 12 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_sp.h 
b/drivers/net/ethernet/qlogic/qed/qed_sp.h
index 652c908..27c450f 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_sp.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_sp.h
@@ -111,8 +111,8 @@ union qed_spq_req_comp {
 };
 
 struct qed_spq_comp_done {
-   u64 done;
-   u8  fw_return_code;
+   unsigned intdone;
+   u8  fw_return_code;
 };
 
 struct qed_spq_entry {
diff --git a/drivers/net/ethernet/qlogic/qed/qed_spq.c 
b/drivers/net/ethernet/qlogic/qed/qed_spq.c
index 259a615..6c05402 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_spq.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_spq.c
@@ -54,11 +54,10 @@ static void qed_spq_blocking_cb(struct qed_hwfn *p_hwfn,
 
comp_done = (struct qed_spq_comp_done *)cookie;
 
-   comp_done->done = 0x1;
-   comp_done->fw_return_code   = fw_return_code;
+   comp_done->fw_return_code = fw_return_code;
 
-   /* make update visible to waiting thread */
-   smp_wmb();
+   /* Make sure completion done is visible on waiting thread */
+   smp_store_release(&comp_done->done, 0x1);
 }
 
 static int __qed_spq_block(struct qed_hwfn *p_hwfn,
@@ -74,8 +73,9 @@ static int __qed_spq_block(struct qed_hwfn *p_hwfn,
 
while (iter_cnt--) {
/* Validate we receive completion update */
-   smp_rmb();
-   if (comp_done->done == 1) {
+   if (READ_ONCE(comp_done->done) == 1) {
+   /* Read updated FW return value */
+   smp_read_barrier_depends();
if (p_fw_ret)
*p_fw_ret = comp_done->fw_return_code;
return 0;
-- 
2.7.2



Re: [PATCH net 4/5] net/ncsi: Choose hot channel as active one if necessary

2016-10-14 Thread Gavin Shan
On Fri, Oct 14, 2016 at 04:32:36PM +1030, Joel Stanley wrote:
>On Fri, Oct 14, 2016 at 1:23 PM, Gavin Shan  wrote:
>> The issue was found on BCM5718 which has two NCSI channels in one
>> package: C0 and C1. C0 is in link-up state while C1 is in link-down
>> state. C0 is chosen as active channel until unplugging and plugging
>> C0's cable:  On unplugging C0's cable, an LSC (Link State Change) AEN
>> packet is received on C0 to report the link-down event. After that, C1
>> is chosen as active channel. The LSC AEN for the link-up event is lost
>> on C0 when plugging C0's cable back. We lose the network even though
>> C0 is usable.
>
>Why do we lose the LSC AEN packet?
>
>Is this a bug in the BCM5718? If so, we shouldn't put it in the common
>ncsi code without adding a quirk for that hardware.
>

It's not a BCM5718 bug. LSC AEN is only received on the active channel.
After the failover (C0 -> C1), C0 becomes inactive, so, as expected, no
LSC AEN packet is received on it.

Thanks,
Gavin

>>
>> This resolves the issue by recording the (hot) channel that was ever
>> chosen as the active one. The hot channel is chosen to be the active
>> one if none of the available channels is in link-up state. With this,
>> C0 is still the active one after unplugging C0's cable. An LSC AEN
>> packet is received on C0 when plugging its cable back.
>>
>> Signed-off-by: Gavin Shan 
>> ---
>>  net/ncsi/internal.h|  1 +
>>  net/ncsi/ncsi-manage.c | 22 +++---
>>  2 files changed, 20 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h
>> index eac4858..1308a56 100644
>> --- a/net/ncsi/internal.h
>> +++ b/net/ncsi/internal.h
>> @@ -265,6 +265,7 @@ struct ncsi_dev_priv {
>>  #endif
>> unsigned intpackage_num; /* Number of packages */
>> struct list_headpackages;/* List of packages   */
>> +   struct ncsi_channel *hot_channel;/* Channel was ever active*/
>> struct ncsi_request requests[256];   /* Request table  */
>> unsigned intrequest_id;  /* Last used request ID   */
>>  #define NCSI_REQ_START_IDX 1
>> diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
>> index e959979..cccedcf 100644
>> --- a/net/ncsi/ncsi-manage.c
>> +++ b/net/ncsi/ncsi-manage.c
>> @@ -625,6 +625,7 @@ static void ncsi_configure_channel(struct ncsi_dev_priv 
>> *ndp)
>> struct net_device *dev = nd->dev;
>> struct ncsi_package *np = ndp->active_package;
>> struct ncsi_channel *nc = ndp->active_channel;
>> +   struct ncsi_channel *hot_nc = NULL;
>> struct ncsi_cmd_arg nca;
>> unsigned char index;
>> unsigned long flags;
>> @@ -730,12 +731,20 @@ static void ncsi_configure_channel(struct 
>> ncsi_dev_priv *ndp)
>> break;
>> case ncsi_dev_state_config_done:
>> spin_lock_irqsave(>lock, flags);
>> -   if (nc->modes[NCSI_MODE_LINK].data[2] & 0x1)
>> +   if (nc->modes[NCSI_MODE_LINK].data[2] & 0x1) {
>> +   hot_nc = nc;
>> nc->state = NCSI_CHANNEL_ACTIVE;
>> -   else
>> +   } else {
>> +   hot_nc = NULL;
>> nc->state = NCSI_CHANNEL_INACTIVE;
>> +   }
>> spin_unlock_irqrestore(>lock, flags);
>>
>> +   /* Update the hot channel */
>> +   spin_lock_irqsave(>lock, flags);
>> +   ndp->hot_channel = hot_nc;
>> +   spin_unlock_irqrestore(>lock, flags);
>> +
>> ncsi_start_channel_monitor(nc);
>> ncsi_process_next_channel(ndp);
>> break;
>> @@ -753,10 +762,14 @@ static void ncsi_configure_channel(struct 
>> ncsi_dev_priv *ndp)
>>  static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
>>  {
>> struct ncsi_package *np;
>> -   struct ncsi_channel *nc, *found;
>> +   struct ncsi_channel *nc, *found, *hot_nc;
>> struct ncsi_channel_mode *ncm;
>> unsigned long flags;
>>
>> +   spin_lock_irqsave(>lock, flags);
>> +   hot_nc = ndp->hot_channel;
>> +   spin_unlock_irqrestore(>lock, flags);
>> +
>> /* The search is done once an inactive channel with up
>>  * link is found.
>>  */
>> @@ -774,6 +787,9 @@ static int ncsi_choose_active_channel(struct 
>> ncsi_dev_priv *ndp)
>> if (!found)
>> found = nc;
>>
>> +   if (nc == hot_nc)
>> +   found = nc;
>> +
>> ncm = &nc->modes[NCSI_MODE_LINK];
>> if (ncm->data[2] & 0x1) {
>> spin_unlock_irqrestore(>lock, flags);
>> --
>> 2.1.0
>>
>
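
The channel selection described in the quoted patch can be sketched as follows. This is a simplified userspace model: `struct channel` and `choose_active()` are hypothetical, and locking as well as the channel state checks of the real ncsi_choose_active_channel() are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of an NCSI channel. */
struct channel {
	int id;
	bool link_up;
};

static const struct channel *choose_active(const struct channel *chans,
					   size_t n,
					   const struct channel *hot)
{
	const struct channel *found = NULL;

	assert(chans != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct channel *nc = &chans[i];

		/* Fallback: the first candidate seen. */
		if (!found)
			found = nc;

		/* Better fallback: the previously active (hot) channel. */
		if (nc == hot)
			found = nc;

		/* A channel with link up ends the search at once. */
		if (nc->link_up)
			return nc;
	}

	return found;
}
```

When no channel has link, the hot channel wins over the first candidate, which keeps it in place to receive the LSC AEN once its cable is plugged back in.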



Re: [PATCH trivial] net: add bbr to config DEFAULT_TCP_CONG

2016-10-14 Thread Eric Dumazet
On Fri, 2016-10-14 at 09:33 +0200, Markus Trippelsdorf wrote:
> While playing with BBR I noticed that it was missing in the list of
> possible config DEFAULT_TCP_CONG choices. Fixed thusly.
> 
> Signed-off-by: Markus Trippelsdorf 
> 
> diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
> index 300b06888fdf..b54b3ca939db 100644
> --- a/net/ipv4/Kconfig
> +++ b/net/ipv4/Kconfig
> @@ -715,6 +715,7 @@ config DEFAULT_TCP_CONG
>   default "reno" if DEFAULT_RENO
>   default "dctcp" if DEFAULT_DCTCP
>   default "cdg" if DEFAULT_CDG
> + default "bbr" if DEFAULT_BBR
>   default "cubic"

Not sure if we want this at this moment.

BBR needs FQ packet scheduler, and this is not exactly trivial to
achieve.





Re: [PATCH trivial] net: add bbr to config DEFAULT_TCP_CONG

2016-10-14 Thread Markus Trippelsdorf
On 2016.10.14 at 09:43 +0200, Eric Dumazet wrote:
> On Fri, 2016-10-14 at 09:33 +0200, Markus Trippelsdorf wrote:
> > While playing with BBR I noticed that it was missing in the list of
> > possible config DEFAULT_TCP_CONG choices. Fixed thusly.
> > 
> > Signed-off-by: Markus Trippelsdorf 
> > 
> > diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
> > index 300b06888fdf..b54b3ca939db 100644
> > --- a/net/ipv4/Kconfig
> > +++ b/net/ipv4/Kconfig
> > @@ -715,6 +715,7 @@ config DEFAULT_TCP_CONG
> > default "reno" if DEFAULT_RENO
> > default "dctcp" if DEFAULT_DCTCP
> > default "cdg" if DEFAULT_CDG
> > +   default "bbr" if DEFAULT_BBR
> > default "cubic"
> 
> Not sure if we want this at this moment.
> 
> BBR needs FQ packet scheduler, and this is not exactly trivial to
> achieve.

For a start, it could be automatically selected:

diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
index 300b06888fdf..845d8d3e9e27 100644
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -642,6 +642,8 @@ config TCP_CONG_CDG
 
 config TCP_CONG_BBR
tristate "BBR TCP"
+   select NET_SCHED
+   select NET_SCH_FQ
default n
---help---
 
-- 
Markus


Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Ard Biesheuvel
On 14 October 2016 at 09:39, Ard Biesheuvel  wrote:
> On 14 October 2016 at 09:28, Johannes Berg  wrote:
>>
>>>1. revert that patch (doing so would need some major adjustments now,
>>>   since it's pretty old and a number of new things were added in the
>>>   meantime)
>>
>> This it will have to be, I guess.
>>
>>>2. allocate a per-CPU buffer for all the things that we put on the
>>>   stack and use in SG lists, those are:
>>>* CCM/GCM: AAD (32B), B_0/J_0 (16B)
>>>* GMAC: AAD (20B), zero (16B)
>>>* (not sure why CMAC isn't using this API, but it would be like GMAC)
>>
>> This doesn't work - I tried to move the mac80211 buffers, but because
>> we also put the struct aead_request on the stack, and crypto_ccm has
>> the "odata" in there, and we can't separate the odata from that struct,
>> we'd have to also put that into a per-CPU buffer, but it's very big -
>> 456 bytes for CCM, didn't measure the others but I'd expect them to be
>> larger, if different.
>>
>> I don't think we can allocate half a kb for each CPU just to be able to
>> possibly use the acceleration here. We can't even make that conditional
>> on not having hardware crypto in the wifi NIC because drivers are
>> always allowed to pass undecrypted frames, regardless of whether or not
>> HW crypto was attempted, so we don't know upfront if we'll have to
>> decrypt anything in software...
>>
>> Given that, I think we have had a bug in here basically since Ard's
>> patch, we never should've put these structs on the stack. Herbert, you
>> also touched this later and converted the API usage, did you see the
>> way the stack is used here and think it should be OK, or did you simply
>> not realize that?
>>
>> Ard, are you able to help out working on a revert of your patch? That
>> would require also reverting a number of other patches (various fixes,
>> API adjustments, etc. to the AEAD usage), but the more complicated part
>> is that in the meantime Jouni introduced GCMP and CCMP-256, both of
>> which we of course need to retain.
>>
>
> I am missing some context here, but could you explain what exactly is
> the problem here?
>
> Look at this code
>
> """
> struct scatterlist sg[3];
>
> char aead_req_data[sizeof(struct aead_request) +
> crypto_aead_reqsize(tfm)]
> __aligned(__alignof__(struct aead_request));
> struct aead_request *aead_req = (void *) aead_req_data;
>
> memset(aead_req, 0, sizeof(aead_req_data));
>
> sg_init_table(sg, 3);
> sg_set_buf(&sg[0], &aad[2], be16_to_cpup((__be16 *)aad));
> sg_set_buf(&sg[1], data, data_len);
> sg_set_buf(&sg[2], mic, mic_len);
>
> aead_request_set_tfm(aead_req, tfm);
> aead_request_set_crypt(aead_req, sg, sg, data_len, b_0);
> aead_request_set_ad(aead_req, sg[0].length);
> """
>
> I assume the stack buffer itself is not the problem here, but aad,
> which is allocated on the stack one frame up.
> Do we really need to revert the whole patch to fix that?

Ah never mind, this is about 'odata'. Apologies, should have read first


Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Johannes Berg
For reference, this was my patch moving the mac80211 buffers to percpu.

diff --git a/net/mac80211/aes_ccm.c b/net/mac80211/aes_ccm.c
index 7663c28ba353..c3709ddf71e9 100644
--- a/net/mac80211/aes_ccm.c
+++ b/net/mac80211/aes_ccm.c
@@ -29,6 +29,8 @@ void ieee80211_aes_ccm_encrypt(struct crypto_aead *tfm, u8 
*b_0, u8 *aad,
__aligned(__alignof__(struct aead_request));
struct aead_request *aead_req = (void *) aead_req_data;
 
+   printk(KERN_INFO "ccm size: %d\n", sizeof(aead_req_data));
+
memset(aead_req, 0, sizeof(aead_req_data));
 
sg_init_table(sg, 3);
@@ -37,6 +39,9 @@ void ieee80211_aes_ccm_encrypt(struct crypto_aead *tfm, u8 
*b_0, u8 *aad,
sg_set_buf(&sg[2], mic, mic_len);
 
aead_request_set_tfm(aead_req, tfm);
+
+   printk(KERN_INFO "aead: %pf\n", 
crypto_aead_alg(crypto_aead_reqtfm(aead_req))->encrypt);
+
aead_request_set_crypt(aead_req, sg, sg, data_len, b_0);
aead_request_set_ad(aead_req, sg[0].length);
 
@@ -67,6 +72,8 @@ int ieee80211_aes_ccm_decrypt(struct crypto_aead *tfm, u8 
*b_0, u8 *aad,
aead_request_set_crypt(aead_req, sg, sg, data_len + mic_len, b_0);
aead_request_set_ad(aead_req, sg[0].length);
 
+   printk(KERN_INFO "aead: %pf\n", 
crypto_aead_alg(crypto_aead_reqtfm(aead_req))->decrypt);
+
return crypto_aead_decrypt(aead_req);
 }
 
diff --git a/net/mac80211/aes_cmac.c b/net/mac80211/aes_cmac.c
index bdf0790d89cc..ebb8c2dc9928 100644
--- a/net/mac80211/aes_cmac.c
+++ b/net/mac80211/aes_cmac.c
@@ -20,7 +20,6 @@
 
 #define CMAC_TLEN 8 /* CMAC TLen = 64 bits (8 octets) */
 #define CMAC_TLEN_256 16 /* CMAC TLen = 128 bits (16 octets) */
-#define AAD_LEN 20
 
 
 static void gf_mulx(u8 *pad)
@@ -101,7 +100,7 @@ void ieee80211_aes_cmac(struct crypto_cipher *tfm, const u8 
*aad,
 
memset(zero, 0, CMAC_TLEN);
addr[0] = aad;
-   len[0] = AAD_LEN;
+   len[0] = CMAC_AAD_LEN;
addr[1] = data;
len[1] = data_len - CMAC_TLEN;
addr[2] = zero;
@@ -119,7 +118,7 @@ void ieee80211_aes_cmac_256(struct crypto_cipher *tfm, 
const u8 *aad,
 
memset(zero, 0, CMAC_TLEN_256);
addr[0] = aad;
-   len[0] = AAD_LEN;
+   len[0] = CMAC_AAD_LEN;
addr[1] = data;
len[1] = data_len - CMAC_TLEN_256;
addr[2] = zero;
diff --git a/net/mac80211/aes_cmac.h b/net/mac80211/aes_cmac.h
index 3702041f44fd..6645f8963278 100644
--- a/net/mac80211/aes_cmac.h
+++ b/net/mac80211/aes_cmac.h
@@ -11,6 +11,8 @@
 
 #include 
 
+#define CMAC_AAD_LEN 20
+
 struct crypto_cipher *ieee80211_aes_cmac_key_setup(const u8 key[],
   size_t key_len);
 void ieee80211_aes_cmac(struct crypto_cipher *tfm, const u8 *aad,
diff --git a/net/mac80211/aes_gcm.c b/net/mac80211/aes_gcm.c
index 3afe361fd27c..13e64d383c46 100644
--- a/net/mac80211/aes_gcm.c
+++ b/net/mac80211/aes_gcm.c
@@ -25,6 +25,8 @@ void ieee80211_aes_gcm_encrypt(struct crypto_aead *tfm, u8 
*j_0, u8 *aad,
__aligned(__alignof__(struct aead_request));
struct aead_request *aead_req = (void *)aead_req_data;
 
+   printk(KERN_DEBUG "gcm size: %d\n", sizeof(aead_req_data));
+
memset(aead_req, 0, sizeof(aead_req_data));
 
sg_init_table(sg, 3);
diff --git a/net/mac80211/aes_gmac.c b/net/mac80211/aes_gmac.c
index 3ddd927aaf30..a2fc69ec5ca9 100644
--- a/net/mac80211/aes_gmac.c
+++ b/net/mac80211/aes_gmac.c
@@ -19,17 +19,18 @@
 
 #define GMAC_MIC_LEN 16
 #define GMAC_NONCE_LEN 12
-#define AAD_LEN 20
 
 int ieee80211_aes_gmac(struct crypto_aead *tfm, const u8 *aad, u8 *nonce,
-  const u8 *data, size_t data_len, u8 *mic)
+  const u8 *data, size_t data_len, u8 *mic, u8 *zero)
 {
struct scatterlist sg[4];
char aead_req_data[sizeof(struct aead_request) +
   crypto_aead_reqsize(tfm)]
__aligned(__alignof__(struct aead_request));
struct aead_request *aead_req = (void *)aead_req_data;
-   u8 zero[GMAC_MIC_LEN], iv[AES_BLOCK_SIZE];
+   u8 iv[AES_BLOCK_SIZE];
+
+   printk(KERN_DEBUG "gmac size: %d\n", sizeof(aead_req_data));
 
if (data_len < GMAC_MIC_LEN)
return -EINVAL;
@@ -38,7 +39,7 @@ int ieee80211_aes_gmac(struct crypto_aead *tfm, const u8 
*aad, u8 *nonce,
 
memset(zero, 0, GMAC_MIC_LEN);
sg_init_table(sg, 4);
-   sg_set_buf(&sg[0], aad, AAD_LEN);
+   sg_set_buf(&sg[0], aad, GMAC_AAD_LEN);
sg_set_buf(&sg[1], data, data_len - GMAC_MIC_LEN);
sg_set_buf(&sg[2], zero, GMAC_MIC_LEN);
sg_set_buf(&sg[3], mic, GMAC_MIC_LEN);
@@ -49,7 +50,7 @@ int ieee80211_aes_gmac(struct crypto_aead *tfm, const u8 
*aad, u8 *nonce,
 
aead_request_set_tfm(aead_req, tfm);
aead_request_set_crypt(aead_req, sg, sg, 0, iv);
-   aead_request_set_ad(aead_req, AAD_LEN + data_len);
+   aead_request_set_ad(aead_req, GMAC_AAD_LEN + data_len);
 

Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Ard Biesheuvel
On 14 October 2016 at 09:55, Johannes Berg  wrote:
> On Fri, 2016-10-14 at 09:47 +0100, Ard Biesheuvel wrote:
>>
>> Do you have a reference for the sg_set_buf() call on odata?
>> crypto/ccm.c does not seem to have it (afaict),
>
> It's indirect - crypto_ccm_encrypt() calls crypto_ccm_init_crypt()
> which does it.
>

Indeed. And the decrypt path does the same for auth_tag[].

But that still means there are two separate problems here, one which
affects the WPA code, and one that only affects the generic CCM
chaining mode (but not the accelerated arm64 implementation)

Unsurprisingly, I would strongly prefer those to be fixed properly
rather than backing out my patch, but I'm happy to help out whichever
solution we reach consensus on.

>> and the same problem
>> does not exist in the accelerated arm64 implementation. In the mean
>> time, I will try and see if we can move aad[] off the stack in the
>> WPA code.
>
> I had that with per-CPU buffers, just sent the patch upthread.
>

I will check whether this removes the issue when not using crypto/ccm.ko


Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Ard Biesheuvel
On 14 October 2016 at 10:25, Johannes Berg  wrote:
> On Fri, 2016-10-14 at 10:21 +0100, Ard Biesheuvel wrote:
>
>> It is annotated with a TODO, though :-)
>>
>> 38320c70d282b (Herbert Xu   2008-01-28 19:35:05
>> -0800  41)
>>  * TODO: Use spare space in skb for this where possible.
>
> I saw that, but I don't think generally there will be spare space for
> it - the stuff there is likely far too big. Anyway ... same problem
> that we have.
>
> I'm not inclined to allocate ~500 bytes temporarily for every frame
> either though.
>
> Maybe we could try to manage it in mac80211, we'd "only" need 5 AEAD
> structs (which are today on the stack) in parallel for each key (4 TX,
> 1 RX), but in a typical case of having 3 keys that's already 7.5K worth
> of memory that we almost never use. Again, with more complexity, we
> could know that the TX will not be used if the driver does the TX, but
> the single RX one we'd need unconditionally... decisions decisions...
>

So why is the performance hit acceptable for ESP but not for WPA? We
could easily implement the same thing, i.e., kmalloc(GFP_ATOMIC)/kfree
the aead_req struct rather than allocate it on the stack
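
A minimal userspace sketch of the idea above, assuming calloc()/free() as
stand-ins for kmalloc(GFP_ATOMIC)/kfree(). The struct and function names
are hypothetical; in the kernel, aead_request_alloc() and
aead_request_free() from <crypto/aead.h> play the corresponding roles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for struct aead_request: a fixed header followed
 * by a transform-specific context whose size is only known at run time.
 */
struct req_sketch {
	size_t ctx_len;
	unsigned char ctx[];	/* flexible array member */
};

/* Allocate header plus context on the heap instead of in a stack array. */
static struct req_sketch *req_alloc(size_t ctx_len)
{
	struct req_sketch *req = calloc(1, sizeof(*req) + ctx_len);

	if (req)
		req->ctx_len = ctx_len;
	return req;	/* NULL on failure, which the caller must handle */
}

static void req_free(struct req_sketch *req)
{
	free(req);
}

/* Usage: one allocation and one free per frame. */
void req_sketch_example(void)
{
	struct req_sketch *req = req_alloc(500);

	assert(req != NULL);
	assert(req->ctx_len == 500);
	req_free(req);
}
```

The cost is one allocation per frame and a failure path that the stack
version does not have, which is the trade-off being debated here.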


Layer 2 over IPv6 GRE and path MTU discovery

2016-10-14 Thread Mike Walker
When using a layer 2 GREv6 tunnel (ip6gretap), I am using a Linux
bridge to push Ethernet frames from an Ethernet port to the GREv6
device.

Here is an example of the topology:

PC -> eth0 -> grebridge -> gre6dev -> (internet) -> GRE endpoint -> Remote host

In this case, the PC connected to the Ethernet port is using IPv6 to
communicate with the remote host, so the source and destination IP of
the traffic being sent by the PC are both IPv6 addresses.  So we have
an IPv6 header, Ethernet header, then GRE header once the
encapsulation is done.

Sometimes these packets are too large for the GRE tunnel's MTU.  When
this happens, the router's kernel wants to send an ICMP "packet too
big" error message back to the PC.

However, the router has no routing information for the PC.  The path
from the PC to the remote host is all supposed to be layer 2.  The
router is not configured to route traffic to the PC or the remote
host, only to bridge the layer 2 frames.

What happens then is Linux tries to send an ICMP error, it can't find
the route, or else it sends it to its default route, none of which do
any good.

If the PC doesn't get this ICMP error, it will not know why the
packets were dropped, or it won't even know they were dropped.  It's
an ICMP blackhole scenario right?

So, one solution I tried was hacking the kernel so that if it's trying
to send this ICMP "packet too big" error to a host, and we know it's a
layer 2 GRE tunnel, instead of the normal logic, force the ICMP error
message to be sent back out via the network interface the offending
packet was received on.

This mostly worked: the PC receives the ICMP error and adjusts its
path MTU, so in the future it will know to fragment the packet if it's
too big.

Problem is, I don't know what source IP and mac address I should be
using when I send back this ICMP error to the PC.  Normally this
network path doesn't have any layer 3 address, and even the mac
address normally is transparent / unknown to the PC.  For my prototype
I simply set the source IP of the ICMP error to whatever was the
destination IP of the packet that was too big.  I let the kernel use
the mac address of either the bridge or eth0.

I couldn't seem to find any RFC that says how this should be handled.
Any ideas?
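
For reference, a small sketch of the size arithmetic behind the "packet
too big" case, assuming a 40-byte outer IPv6 header, a 4-byte base GRE
header (no key, checksum, or sequence number) and the 14-byte bridged
Ethernet header. The function name and the extra_overhead parameter are
hypothetical; any additional per-packet overhead on a given setup would
go into extra_overhead.

```c
#include <assert.h>
#include <stddef.h>

#define OUTER_IPV6_HLEN 40	/* fixed IPv6 header */
#define GRE_BASE_HLEN    4	/* GRE header without optional fields */
#define INNER_ETH_HLEN  14	/* bridged Ethernet header carried inside GRE */

/* Largest inner IP packet that fits in one outer packet of path_mtu bytes. */
static size_t gretap6_inner_mtu(size_t path_mtu, size_t extra_overhead)
{
	size_t overhead = OUTER_IPV6_HLEN + GRE_BASE_HLEN + INNER_ETH_HLEN +
			  extra_overhead;

	return path_mtu > overhead ? path_mtu - overhead : 0;
}

/* Usage: a 1500-byte path leaves 1442 bytes for the PC's IP packet. */
void gretap6_example(void)
{
	assert(gretap6_inner_mtu(1500, 0) == 1442);
}
```

Under these assumptions, any packet from the PC larger than the returned
value is the kind that triggers the ICMP error discussed above.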


Re: bug in ixgbe_atr

2016-10-14 Thread Sowmini Varadhan
On (10/14/16 16:09), Duyck, Alexander H wrote:
> Sorry I was thinking of a different piece of code.  In the case of the
> atr code it would be hdr.network, not hdr.raw.  Basically the thought
> was to validate that there is enough data in skb_headlen() that we can
> verify that from where the network header should be we have at least
> 40 bytes of data as that would be the minimum needed for a TCP header
> and an IPv4 header, or just an IPv6 header.  We would probably need a
> separate follow-up for the TCP header after we validate network header.
   :
>> Dropping it is fine with me I guess - maybe just return, if the
>> skb_headlen() doesn't have enough bytes for a network header, i.e.,
>> skb_headlen
>> is at least ETH_HLEN + sizeof (struct iphdr) for ETH_P_IP, or  ETH_HLEN +
>> sizeof (struct ipv6hdr) for ETH_P_IPV6?

> Right that is kind of what I was thinking.  If we validate that we
> have at least 40 before inspecting the network header, and at least 20
> before we validate the TCP header that would work for me.

yes, I was on a plane through most of the day today but thought about
this. I think we can check if skb_network_offset() is between
skb->data and tail, and also make sure there are "enough" bytes for
trying to find the ip and transport header. 
Let me try to put a RFC patch together for this tomorrow.
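
A rough userspace sketch of the length check discussed above, assuming
the minimum header sizes (20-byte IPv4, 40-byte IPv6, 20-byte TCP). The
function name is hypothetical, and the plain integer parameters stand in
for skb_headlen() and skb_network_offset(); this is not the eventual
ixgbe patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_IP    0x0800	/* IPv4 ethertype */
#define ETH_P_IPV6  0x86DD	/* IPv6 ethertype */
#define IPV4_HLEN   20		/* minimum IPv4 header, no options */
#define IPV6_HLEN   40		/* fixed IPv6 header */
#define TCP_HLEN    20		/* minimum TCP header, no options */

/*
 * Return true when the linear area (headlen bytes) holds a minimal
 * network header plus a minimal TCP header, starting at net_off.
 */
static bool atr_headers_in_linear_area(size_t headlen, size_t net_off,
				       uint16_t proto)
{
	size_t l3_len;

	if (proto == ETH_P_IP)
		l3_len = IPV4_HLEN;
	else if (proto == ETH_P_IPV6)
		l3_len = IPV6_HLEN;
	else
		return false;	/* not a protocol this check handles */

	/* The network header must start inside the linear area. */
	if (net_off > headlen)
		return false;

	/* Subtract rather than add so the comparison cannot overflow. */
	return headlen - net_off >= l3_len + TCP_HLEN;
}

/* Usage: 14-byte Ethernet header, then IPv4 and TCP with no options. */
void atr_check_example(void)
{
	assert(atr_headers_in_linear_area(54, 14, ETH_P_IP));
	assert(!atr_headers_in_linear_area(53, 14, ETH_P_IP));
}
```

A real check would also have to account for IPv4 options and IPv6
extension headers before locating the TCP header.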


[PATCH net-next v2 1/6] fjes: ethtool -d support for fjes driver

2016-10-14 Thread Taku Izumi
This patch adds ethtool -d support to the fjes driver.
With ethtool -d, you can get a register dump of the
Extended Socket device.

  # ethtool -d es0

Offset  Values
------  ------
0x0000: 01 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00
0x0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0020: 02 00 00 80 02 00 00 80 64 a6 58 08 07 00 00 00
0x0030: 00 00 00 00 28 80 00 00 00 00 f9 e3 06 00 00 00
0x0040: 00 00 00 00 18 00 00 00 80 a4 58 08 07 00 00 00
0x0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0080: 00 00 00 00 00 00 e0 7f 00 00 01 00 00 00 01 00
0x0090: 00 00 00 00

Signed-off-by: Taku Izumi 
---
 drivers/net/fjes/fjes_ethtool.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/drivers/net/fjes/fjes_ethtool.c b/drivers/net/fjes/fjes_ethtool.c
index 9c218e1..8397634 100644
--- a/drivers/net/fjes/fjes_ethtool.c
+++ b/drivers/net/fjes/fjes_ethtool.c
@@ -121,12 +121,60 @@ static int fjes_get_settings(struct net_device *netdev,
return 0;
 }
 
+static int fjes_get_regs_len(struct net_device *netdev)
+{
+#define FJES_REGS_LEN  37
+   return FJES_REGS_LEN * sizeof(u32);
+}
+
+static void fjes_get_regs(struct net_device *netdev,
+ struct ethtool_regs *regs, void *p)
+{
+   struct fjes_adapter *adapter = netdev_priv(netdev);
+   struct fjes_hw *hw = &adapter->hw;
+   u32 *regs_buff = p;
+
+   memset(p, 0, FJES_REGS_LEN * sizeof(u32));
+
+   regs->version = 1;
+
+   /* Information registers */
+   regs_buff[0] = rd32(XSCT_OWNER_EPID);
+   regs_buff[1] = rd32(XSCT_MAX_EP);
+
+   /* Device Control registers */
+   regs_buff[4] = rd32(XSCT_DCTL);
+
+   /* Command Control registers */
+   regs_buff[8] = rd32(XSCT_CR);
+   regs_buff[9] = rd32(XSCT_CS);
+   regs_buff[10] = rd32(XSCT_SHSTSAL);
+   regs_buff[11] = rd32(XSCT_SHSTSAH);
+
+   regs_buff[13] = rd32(XSCT_REQBL);
+   regs_buff[14] = rd32(XSCT_REQBAL);
+   regs_buff[15] = rd32(XSCT_REQBAH);
+
+   regs_buff[17] = rd32(XSCT_RESPBL);
+   regs_buff[18] = rd32(XSCT_RESPBAL);
+   regs_buff[19] = rd32(XSCT_RESPBAH);
+
+   /* Interrupt Control registers */
+   regs_buff[32] = rd32(XSCT_IS);
+   regs_buff[33] = rd32(XSCT_IMS);
+   regs_buff[34] = rd32(XSCT_IMC);
+   regs_buff[35] = rd32(XSCT_IG);
+   regs_buff[36] = rd32(XSCT_ICTL);
+}
+
 static const struct ethtool_ops fjes_ethtool_ops = {
.get_settings   = fjes_get_settings,
.get_drvinfo= fjes_get_drvinfo,
.get_ethtool_stats = fjes_get_ethtool_stats,
.get_strings  = fjes_get_strings,
.get_sset_count   = fjes_get_sset_count,
+   .get_regs   = fjes_get_regs,
+   .get_regs_len   = fjes_get_regs_len,
 };
 
 void fjes_set_ethtool_ops(struct net_device *netdev)
-- 
2.6.6
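
A small sketch of how the slot indices in fjes_get_regs() relate to the
offsets in the example dump. FJES_REGS_LEN is taken from the patch; the
two helper names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FJES_REGS_LEN 37	/* number of u32 slots, from the patch */

/* Byte length reported to ethtool for the dump buffer. */
static size_t regs_len_bytes(void)
{
	return FJES_REGS_LEN * sizeof(uint32_t);
}

/* Byte offset at which slot idx appears in the "ethtool -d" hex dump. */
static size_t reg_dump_offset(unsigned int idx)
{
	assert(idx < FJES_REGS_LEN);
	return idx * sizeof(uint32_t);
}

/* Usage: the last slot (XSCT_ICTL, index 36) lands at offset 0x90. */
void fjes_regs_example(void)
{
	assert(regs_len_bytes() == 0x94);
	assert(reg_dump_offset(36) == 0x90);
}
```

This matches the dump above, whose last row starts at 0x0090 and holds a
single 4-byte value. Slots that the function does not fill stay zero
because of the memset().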



[PATCH net-next v2 0/6] FUJITSU Extended Socket driver version 1.2

2016-10-14 Thread Taku Izumi
This patchset updates FUJITSU Extended Socket network driver into version 1.2.
This includes the following enhancements:
  - ethtool -d support
  - ethtool -S enhancement
  - ethtool -w/-W support
  - Add some debugging feature (tracepoints etc)

v1 -> v2:
  - Use u64 instead of phys_addr_t as TP_STRUCT__entry
  - Use ethtool facility to achieve debug mode instead of using debugfs


Taku Izumi (6):
  fjes: ethtool -d support for fjes driver
  fjes: Enhance ethtool -S for fjes driver
  fjes: Add tracepoints in fjes driver
  fjes: ethtool -w and -W support for fjes driver
  fjes: Add debugfs entry for EP status information in fjes driver
  fjes: Update fjes driver version : 1.2

 drivers/net/fjes/Makefile   |   2 +-
 drivers/net/fjes/fjes.h |  16 ++
 drivers/net/fjes/fjes_debugfs.c | 117 +
 drivers/net/fjes/fjes_ethtool.c | 181 ++-
 drivers/net/fjes/fjes_hw.c  | 171 +-
 drivers/net/fjes/fjes_hw.h  |  34 
 drivers/net/fjes/fjes_main.c|  63 ++-
 drivers/net/fjes/fjes_trace.c   |  30 
 drivers/net/fjes/fjes_trace.h   | 380 
 9 files changed, 983 insertions(+), 11 deletions(-)
 create mode 100644 drivers/net/fjes/fjes_debugfs.c
 create mode 100644 drivers/net/fjes/fjes_trace.c
 create mode 100644 drivers/net/fjes/fjes_trace.h

-- 
2.6.6



[PATCH net-next 2/2] net: phy: Add Fast Link Failure - 2 set driver for Microsemi PHYs.

2016-10-14 Thread Raju Lakkaraju
From: Raju Lakkaraju 

VSC8531 Fast Link Failure 2 feature enables the PHY to indicate the
onset of a potential link failure in < 100 usec for 100BASE-TX
operation. FLF2 is supported through the MDINT (active low) pin.

Signed-off-by: Raju Lakkaraju 
Signed-off-by: Allan W. Nielsen 
---
 .../devicetree/bindings/net/mscc-phy-vsc8531.txt   |  6 +++
 drivers/net/phy/mscc.c | 45 ++
 2 files changed, 51 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.txt b/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.txt
index 062d115..472fc68 100644
--- a/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.txt
+++ b/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.txt
@@ -32,6 +32,11 @@ Optional properties:
  after a 'downshift-cnt' of failed attempts at
  1000BAST-T. Allowed values: 0, 2, 3, 4, 5.
  0 is default and will disable downshifting.
+- flf2 : Fast Link Failure 2 (FLF2) feature enables the PHY
+ to indicate the onset of a potential link failure in
+ < 100 usec for 100BASE-TX operation. FLF2 is
+ supported through the MDINT (active low) pin.
+ Default will be disable flf2.
 
 Table: 1 - Edge rate change
 |
@@ -66,4 +71,5 @@ Example:
 vsc8531,vddmac = <3300>;
 vsc8531,edge-slowdown  = <7>;
 vsc8531,downshift-cnt   = <3>;
+   vsc8531,flf2;
 };
diff --git a/drivers/net/phy/mscc.c b/drivers/net/phy/mscc.c
index e87d9f0..57bd628 100644
--- a/drivers/net/phy/mscc.c
+++ b/drivers/net/phy/mscc.c
@@ -57,6 +57,7 @@ enum rgmii_rx_clock_delay {
 
 /* Extended Page 2 Registers */
 #define MSCC_PHY_RGMII_CNTL  20
+#define FLF2_ENABLE  0x8000
 #define RGMII_RX_CLK_DELAY_MASK  0x0070
 #define RGMII_RX_CLK_DELAY_POS   4
 
@@ -83,6 +84,7 @@ enum rgmii_rx_clock_delay {
 struct vsc8531_private {
int rate_magic;
u8  downshift_magic;
+   bool flf2;  /* Fast Link Failure-2 Enable/Disable */
 };
 
 #ifdef CONFIG_OF_MDIO
@@ -107,6 +109,33 @@ static int vsc85xx_phy_page_set(struct phy_device *phydev, u8 page)
return rc;
 }
 
+static int vsc85xx_flf2_set(struct phy_device *phydev, bool op)
+{
+   int rc;
+   u16 reg_val;
+
+   mutex_lock(&phydev->lock);
+   rc = vsc85xx_phy_page_set(phydev, MSCC_PHY_PAGE_EXTENDED_2);
+   if (rc != 0)
+   goto out_unlock;
+
+   reg_val = phy_read(phydev, MSCC_PHY_RGMII_CNTL);
+   if (op)
+   reg_val |= FLF2_ENABLE;
+   else
+   reg_val &= ~FLF2_ENABLE;
+   rc = phy_write(phydev, MSCC_PHY_RGMII_CNTL, reg_val);
+   if (rc != 0)
+   goto out_unlock;
+
+   rc = vsc85xx_phy_page_set(phydev, MSCC_PHY_PAGE_STANDARD);
+
+out_unlock:
+   mutex_unlock(&phydev->lock);
+
+   return rc;
+}
+
 static int vsc85xx_downshift_set(struct phy_device *phydev, u8 magic)
 {
int rc;
@@ -412,6 +441,10 @@ static int vsc85xx_config_init(struct phy_device *phydev)
if (rc)
return rc;
 
+   rc = vsc85xx_flf2_set(phydev, vsc8531->flf2);
+   if (rc)
+   return rc;
+
rc = genphy_config_init(phydev);
 
return rc;
@@ -449,6 +482,11 @@ static int vsc85xx_probe(struct phy_device *phydev)
int rate_magic;
int downshift_magic;
struct vsc8531_private *vsc8531;
+   struct device *dev = &phydev->mdio.dev;
+   struct device_node *of_node = dev->of_node;
+
+   if (!of_node)
+   return -ENODEV;
 
rate_magic = vsc85xx_edge_rate_magic_get(phydev);
if (rate_magic < 0)
@@ -466,6 +504,13 @@ static int vsc85xx_probe(struct phy_device *phydev)
vsc8531->rate_magic = rate_magic;
vsc8531->downshift_magic = downshift_magic;
 
+#ifdef CONFIG_OF_MDIO
+   /* Fast Link Failure 2 */
+   vsc8531->flf2 = of_property_read_bool(of_node, "vsc8531,flf2");
+#else
+   vsc8531->flf2 = 0;
+#endif
+
return 0;
 }
 
-- 
2.7.4
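
The register update in vsc85xx_flf2_set() is a plain read-modify-write
of one bit. A minimal sketch of that step in isolation, assuming only
the FLF2_ENABLE value from the patch; flf2_apply() is a hypothetical
helper, not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLF2_ENABLE 0x8000	/* bit 15 of MSCC_PHY_RGMII_CNTL, per the patch */

/* Return reg_val with the FLF2 bit set or cleared and all other bits kept. */
static uint16_t flf2_apply(uint16_t reg_val, bool enable)
{
	if (enable)
		reg_val |= FLF2_ENABLE;
	else
		reg_val &= (uint16_t)~FLF2_ENABLE;
	return reg_val;
}

/* Usage: the RX clock delay bits (0x0070) survive both operations. */
void flf2_example(void)
{
	assert(flf2_apply(0x0070, true) == 0x8070);
	assert(flf2_apply(0x8070, false) == 0x0070);
}
```

Note that phy_read() returns an int that is negative on error, so a
driver would normally check the result before treating it as a 16-bit
register value.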



[PATCH net-next 0/2] net: phy: Add Downshift, FLF2 drivers for Microsemi

2016-10-14 Thread Raju Lakkaraju

From: Raju Lakkaraju 

This series adds Speed downshift and Fast Link Failure 2 (FLF2)
support to the driver for Microsemi PHYs.

Patch 1/2: Link Speed downshift:
For operation in cabling environments that are incompatible with
1000BASE-T, the VSC8531 device provides an automatic link speed
downshift operation. When enabled, the device automatically changes
its 1000BASE-T auto-negotiation to the next slower speed after
a set number of failed attempts at 1000BASE-T.
This feature is useful in networks using older cable
installations that include only pairs A and B, and not pairs C and D.

Patch 2/2: Fast Link Failure 2:
VSC8531 Fast Link Failure 2 feature enables the PHY to indicate the
onset of a potential link failure in < 100 usec for 100BASE-TX
operation. FLF2 is supported through the MDINT (active low) pin.

All these features were tested on a BeagleBone Black with a VSC8531 PHY.

Raju Lakkaraju (2):
  net: phy: Add Speed downshift set driver for Microsemi PHYs.
  net: phy: Add Fast Link Failure - 2 set driver for Microsemi PHYs.

 .../devicetree/bindings/net/mscc-phy-vsc8531.txt   |  12 +++
 drivers/net/phy/mscc.c | 120 -
 2 files changed, 131 insertions(+), 1 deletion(-)

-- 
2.7.4



Re: Need help with mdiobus_register and phy

2016-10-14 Thread Timur Tabi

Andrew Lunn wrote:

> Please can you tell us which PHY this is, and how it is put to sleep
> and woken up.


It's the at803x driver.

http://lxr.free-electrons.com/source/drivers/net/phy/at803x.c

It goes to sleep in its at803x_suspend() function, which is called by 
phy_suspend().


There is a corresponding at803x_resume().  The problem is that this is 
not called by mdiobus_register().  I'm guessing that mdiobus_register() 
assumes that the phy is awake.


It seems like a catch-22.  mdiobus_register() assumes that the phy is 
awake, but you can't wake up the phy until after you call 
mdiobus_register().



> If the PHY cannot be woken up using MDIO, then maybe you need to look
> at the mdio bus reset call?


I looked at that, but it won't work because there is no phydev when the 
reset function is called:


http://lxr.free-electrons.com/source/drivers/net/phy/mdio_bus.c#L328

It's the same catch-22.

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.


Re: [PATCH] r8169: set coherent DMA mask as well as streaming DMA mask

2016-10-14 Thread David Miller
From: Ard Biesheuvel 
Date: Fri, 14 Oct 2016 12:39:30 +0100

> PCI devices that are 64-bit DMA capable should set the coherent
> DMA mask as well as the streaming DMA mask. On some architectures,
> these are managed separately, and so the coherent DMA mask will be
> left at its default value of 32 if it is not set explicitly. This
> results in errors such as
> 
>  r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>  hwdev DMA mask = 0x, dev_addr = 0x0080fbfff000
>  swiotlb: coherent allocation failed for device :02:00.0 size=4096
>  CPU: 0 PID: 1062 Comm: systemd-udevd Not tainted 4.8.0+ #35
>  Hardware name: AMD Seattle/Seattle, BIOS 10:53:24 Oct 13 2016
> 
> on systems without memory that is 32-bit addressable by PCI devices.
> 
> Signed-off-by: Ard Biesheuvel 
 ...
> @@ -8281,6 +8282,8 @@ static int rtl_init_one(struct pci_dev *pdev, const 
> struct pci_device_id *ent)
>   dev->features |= NETIF_F_HIGHDMA;
>   } else {
>   rc = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
> + if (!rc)
> + rc = pci_set_consistent_dma_mask(pdev, 
> DMA_BIT_MASK(32));

As you state 32-bit is the default, therefore this part of your patch is 
unnecessary.


Re: [PATCH] mac80211: aes_ccm: move struct aead_req off the stack

2016-10-14 Thread Johannes Berg

> So use kzalloc

Do we really need kzalloc()? We have things on the stack right now, and
don't initialize, so surely we don't really need to zero things?

> This only addresses one half of the problem. The other problem, i.e.,
> the fact that the aad[] array lives on the stack of the caller, is
> handled adequately imo by the change proposed by Johannes.

But if we allocate things anyway, is it worth expending per-CPU buffers
on these?

johannes


Re: Need help with mdiobus_register and phy

2016-10-14 Thread Andrew Lunn
On Fri, Oct 14, 2016 at 08:03:18AM -0500, Timur Tabi wrote:
> Andrew Lunn wrote:
> >Have you tried using the ethernet-phy-id device tree property? It
> >looks like that will allow you to skip get_phy_device and just create
> >the phy device. You can then bring the phy out of sleep in the probe
> >function?
> 
> The problem I'm experiencing is with ACPI, so I can't use any of the
> fancy of_ apis like of_get_phy_id().  But I'll look into it.
> 
> Is it possible that at803x_suspend() is too aggressive?  That it's
> effectively disabling the phy?  While the phy is suspended, should
> it still respond to MII_PHYSID1 and MII_PHYSID2 requests?

That is a basic assumption of the code. If you cannot read the IDs how
are you supposed to know what device it is, and what quirks you need
to work around its broken features...

Does the datasheet say anything about this?

I would say for this device, suspend() is too aggressive.

  Andrew
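
A sketch of the assumption being described: the probe code combines the
two ID registers and gives up if the result looks like an empty bus. As
far as I recall, the kernel treats an ID whose low 29 bits are all ones
as "no device"; the function names below are hypothetical, and the ID
values are arbitrary examples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Combine the two 16-bit ID registers; MII_PHYSID1 is the high half. */
static uint32_t phy_id_combine(uint16_t physid1, uint16_t physid2)
{
	return ((uint32_t)physid1 << 16) | physid2;
}

/* An ID whose low 29 bits are all ones is treated as "no PHY here". */
static bool phy_id_is_absent(uint32_t phy_id)
{
	return (phy_id & 0x1fffffff) == 0x1fffffff;
}

/* Usage: a PHY that answers with all ones is indistinguishable from none. */
void phy_id_example(void)
{
	assert(!phy_id_is_absent(phy_id_combine(0x004d, 0xd074)));
	assert(phy_id_is_absent(phy_id_combine(0xffff, 0xffff)));
}
```

So a PHY that does not answer ID reads while suspended never gets a
phy_device, and its resume callback is never reached.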


Re: [PATCH] mac80211: aes_ccm: move struct aead_req off the stack

2016-10-14 Thread Ard Biesheuvel
On 14 October 2016 at 14:15, Johannes Berg  wrote:
> On Fri, 2016-10-14 at 14:13 +0100, Ard Biesheuvel wrote:
>>
>> > But if we allocate things anyway, is it worth expending per-CPU
>> > buffers on these?
>>
>> Ehmm, maybe not. I could spin a v2 that allocates a bigger buffer,
>> and copies aad[] into it as well
>
> Copies in/out, I guess. Also there's B_0/J_0 for CCM/GCM, and the
> 'zero' thing that GMAC has.
>

Is the aad[] actually reused? I would assume it only affects the mac
on encryption, and the verification on decryption but I don't think we
actually need it back from the crypto routines.

>> That does not help the other algos though
>
> What do you mean?
>

Exactly what you said above :-) My patch only touches CCM but as you said,

"""
'Also there's B_0/J_0 for CCM/GCM, and the 'zero' thing that GMAC has.
"""


[PATCH 4.4 01/21] time: Add cycles to nanoseconds translation

2016-10-14 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Christopher S. Hall 

commit 6bd58f09e1d8cc6c50a824c00bf0d617919986a1 upstream.

The timekeeping code does not currently provide a way to translate
externally provided clocksource cycles to system time. The cycle count
is always provided by the result clocksource read() method internal to
the timekeeping code. The added function timekeeping_cycles_to_ns()
calculated a nanosecond value from a cycle count that can be added to
tk_read_base.base value yielding the current system time. This allows
clocksource cycle values external to the timekeeping code to provide a
cycle count that can be transformed to system time.

Cc: Prarit Bhargava 
Cc: Richard Cochran 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Andy Lutomirski 
Cc: kevin.b.stan...@intel.com
Cc: kevin.j.cla...@intel.com
Cc: h...@zytor.com
Cc: jeffrey.t.kirs...@intel.com
Cc: netdev@vger.kernel.org
Reviewed-by: Thomas Gleixner 
Signed-off-by: Christopher S. Hall 
Signed-off-by: John Stultz 
Signed-off-by: Greg Kroah-Hartman 

---
 kernel/time/timekeeping.c |   25 +
 1 file changed, 21 insertions(+), 4 deletions(-)

--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -298,17 +298,34 @@ u32 (*arch_gettimeoffset)(void) = defaul
 static inline u32 arch_gettimeoffset(void) { return 0; }
 #endif
 
+static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
+ cycle_t delta)
+{
+   s64 nsec;
+
+   nsec = delta * tkr->mult + tkr->xtime_nsec;
+   nsec >>= tkr->shift;
+
+   /* If arch requires, add in get_arch_timeoffset() */
+   return nsec + arch_gettimeoffset();
+}
+
 static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
 {
cycle_t delta;
-   s64 nsec;
 
delta = timekeeping_get_delta(tkr);
+   return timekeeping_delta_to_ns(tkr, delta);
+}
 
-   nsec = (delta * tkr->mult + tkr->xtime_nsec) >> tkr->shift;
+static inline s64 timekeeping_cycles_to_ns(struct tk_read_base *tkr,
+   cycle_t cycles)
+{
+   cycle_t delta;
 
-   /* If arch requires, add in get_arch_timeoffset() */
-   return nsec + arch_gettimeoffset();
+   /* calculate the delta since the last update_wall_time */
+   delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);
+   return timekeeping_delta_to_ns(tkr, delta);
 }
 
 /**
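
A userspace sketch of the arithmetic this patch factors out, assuming
the plain masked-subtraction form of clocksource_delta() and leaving out
arch_gettimeoffset(). The struct and function names are hypothetical
mirrors of the kernel ones, and unsigned arithmetic is used for
simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the fields used from struct tk_read_base. */
struct tkr_sketch {
	uint64_t mask;		/* valid bits of the counter */
	uint64_t cycle_last;	/* counter value at the last update */
	uint32_t mult;		/* cycles to shifted-nanoseconds multiplier */
	uint32_t shift;		/* right shift applied after the multiply */
	uint64_t xtime_nsec;	/* accumulated shifted nanoseconds */
};

/* Masked difference, so a wrapped counter still yields a small delta. */
static uint64_t cycles_delta(uint64_t now, uint64_t last, uint64_t mask)
{
	return (now - last) & mask;
}

/* (delta * mult + xtime_nsec) >> shift, as in timekeeping_delta_to_ns(). */
static int64_t delta_to_ns(const struct tkr_sketch *tkr, uint64_t delta)
{
	uint64_t nsec;

	assert(tkr->shift < 64);
	nsec = delta * tkr->mult + tkr->xtime_nsec;
	return (int64_t)(nsec >> tkr->shift);
}

/* Convert an externally supplied cycle count, as the new helper does. */
static int64_t cycles_to_ns(const struct tkr_sketch *tkr, uint64_t cycles)
{
	return delta_to_ns(tkr, cycles_delta(cycles, tkr->cycle_last,
					     tkr->mask));
}

/* Usage: a 32-bit counter at 4 ns per cycle that wrapped since cycle_last. */
void tk_sketch_example(void)
{
	struct tkr_sketch tkr = {
		.mask = 0xffffffffULL, .cycle_last = 0xfffffff0ULL,
		.mult = 4096, .shift = 10, .xtime_nsec = 0,
	};

	assert(cycles_to_ns(&tkr, 0x10) == 128);	/* 32 cycles * 4 ns */
}
```

The result is the nanosecond offset since the last update; the caller
adds it to the base time to get system time.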




Re: [PATCH net-next 2/2] net: phy: Add Fast Link Failure - 2 set driver for Microsemi PHYs.

2016-10-14 Thread Andrew Lunn
> On Fri, Oct 14, 2016 at 05:10:33PM +0530, Raju Lakkaraju wrote:
> From: Raju Lakkaraju 
> 
> VSC8531 Fast Link Failure 2 feature enables the PHY to indicate the
> onset of a potential link failure in < 100 usec for 100BASE-TX
> operation. FLF2 is supported through the MDINT (active low) pin.

Is the MDINT pin specific to this feature, or a general interrupt pin?

Device tree is used to describe the hardware. It should not really
describe software or configuration. But the borders are a bit
fluffy. Signal edge rates are close to hardware. This is a lot more
towards configuration. So i'm not sure a device tree property is the
correct way to describe this.

This is also a feature i know other PHYs support. The Marvell PHY has
a "Metro Ethernet" extension which allows it to report link failures
for 1000BASE-T in 10, 20 or 40ms, instead of the usual 750ms. So we
need a generic solution other PHYs can implement.

As with cable testing, i think it should be an ethtool option.

   Andrew


Re: Need help with mdiobus_register and phy

2016-10-14 Thread Timur Tabi

Andrew Lunn wrote:

> So are you seeing that the reads to MII_PHYSID1 and MII_PHYSID2 return
> 0x, when called from get_phy_id()?


Yes.

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.


Re: Need help with mdiobus_register and phy

2016-10-14 Thread Andrew Lunn
> It's the at803x driver.

The at803x_resume() just does normal MDIO transactions. Which suggests
the MDIO bus side of the device is still awake. Or at least, the
MII_BMCR register is.

So are you seeing that the reads to MII_PHYSID1 and MII_PHYSID2 return
0x, when called from get_phy_id()?

Andrew


RE: [PATCH v2] r8169: set coherent DMA mask as well as streaming DMA mask

2016-10-14 Thread David Laight
From: Of Ard Biesheuvel
> Sent: 14 October 2016 14:41
> PCI devices that are 64-bit DMA capable should set the coherent
> DMA mask as well as the streaming DMA mask. On some architectures,
> these are managed separately, and so the coherent DMA mask will be
> left at its default value of 32 if it is not set explicitly. This
> results in errors such as
> 
>  r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>  hwdev DMA mask = 0x, dev_addr = 0x0080fbfff000
>  swiotlb: coherent allocation failed for device :02:00.0 size=4096
>  CPU: 0 PID: 1062 Comm: systemd-udevd Not tainted 4.8.0+ #35
>  Hardware name: AMD Seattle/Seattle, BIOS 10:53:24 Oct 13 2016
> 
> on systems without memory that is 32-bit addressable by PCI devices.
> 
> Signed-off-by: Ard Biesheuvel 
> ---
> v2: dropped the hunk that sets the coherent DMA mask to DMA_BIT_MASK(32),
> which is unnecessary given that it is the default
> 
>  drivers/net/ethernet/realtek/r8169.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c 
> b/drivers/net/ethernet/realtek/r8169.c
> index e55638c7505a..bf000d819a21 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -8273,7 +8273,8 @@ static int rtl_init_one(struct pci_dev *pdev, const 
> struct pci_device_id *ent)
>   if ((sizeof(dma_addr_t) > 4) &&
>   (use_dac == 1 || (use_dac == -1 && pci_is_pcie(pdev) &&
> tp->mac_version >= RTL_GIGA_MAC_VER_18)) &&
> - !pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
> + !pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) &&
> + !pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64))) {

Isn't there a dma_set_mask_and_coherent() function ?

David



Re: [PATCH] mac80211: aes_ccm: move struct aead_req off the stack

2016-10-14 Thread Johannes Berg

> 
> Is the aad[] actually reused? I would assume it only affects the mac
> on encryption, and the verification on decryption but I don't think
> we actually need it back from the crypto routines.

I don't think it's reused.

> Exactly what you said above :-) My patch only touches CCM but as you
> said,
> 
> """
> 'Also there's B_0/J_0 for CCM/GCM, and the 'zero' thing that GMAC
> has.
> """

Ah, but we can/should do the same for the others, no?

johannes


Re: [PATCH net 2/2] conntrack: enable to tune gc parameters

2016-10-14 Thread Florian Westphal
Pablo Neira Ayuso  wrote:
> I would prefer not to expose sysctl knobs, if we don't really know
> what good default values are, then we cannot expect our users to
> know this for us.
> 
> I would go tune this in a way that this resembles the previous
> behaviour.

I do not see how this is possible without reverting to old per-conntrack
timer scheme.

With per-ct timer userspace gets notified the moment the timer
fires, without it notification comes 'when kernel detects the timeout'
which in worst case, as Nicholas describes, is when gc worker comes
along.

You can run the gc worker every jiffy of course, but that's just
wasting cpu cycles (and you still get a small delay).

I don't see a way to do run-time tuning except faster restarts when
old entries start accumulating.  This is what the code tries to do,
perhaps you have a better idea for the 'next gc run' computation.



Re: [PATCH] r8169: set coherent DMA mask as well as streaming DMA mask

2016-10-14 Thread Ard Biesheuvel
On 14 October 2016 at 14:31, David Miller  wrote:
> From: Ard Biesheuvel 
> Date: Fri, 14 Oct 2016 12:39:30 +0100
>
>> PCI devices that are 64-bit DMA capable should set the coherent
>> DMA mask as well as the streaming DMA mask. On some architectures,
>> these are managed separately, and so the coherent DMA mask will be
>> left at its default value of 32 if it is not set explicitly. This
>> results in errors such as
>>
>>  r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>>  hwdev DMA mask = 0x, dev_addr = 0x0080fbfff000
>>  swiotlb: coherent allocation failed for device :02:00.0 size=4096
>>  CPU: 0 PID: 1062 Comm: systemd-udevd Not tainted 4.8.0+ #35
>>  Hardware name: AMD Seattle/Seattle, BIOS 10:53:24 Oct 13 2016
>>
>> on systems without memory that is 32-bit addressable by PCI devices.
>>
>> Signed-off-by: Ard Biesheuvel 
>  ...
>> @@ -8281,6 +8282,8 @@ static int rtl_init_one(struct pci_dev *pdev, const 
>> struct pci_device_id *ent)
>>   dev->features |= NETIF_F_HIGHDMA;
>>   } else {
>>   rc = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
>> + if (!rc)
>> + rc = pci_set_consistent_dma_mask(pdev, 
>> DMA_BIT_MASK(32));
>
> As you state 32-bit is the default, therefore this part of your patch is 
> unnecessary.

Perhaps, but the original code did not assume that either. Should I
remove the other call in a subsequent patch as well?


Re: [PATCH] r8169: set coherent DMA mask as well as streaming DMA mask

2016-10-14 Thread David Miller
From: Ard Biesheuvel 
Date: Fri, 14 Oct 2016 14:32:24 +0100

> On 14 October 2016 at 14:31, David Miller  wrote:
>> From: Ard Biesheuvel 
>> Date: Fri, 14 Oct 2016 12:39:30 +0100
>>
>>> PCI devices that are 64-bit DMA capable should set the coherent
>>> DMA mask as well as the streaming DMA mask. On some architectures,
>>> these are managed separately, and so the coherent DMA mask will be
>>> left at its default value of 32 if it is not set explicitly. This
>>> results in errors such as
>>>
>>>  r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>>>  hwdev DMA mask = 0x, dev_addr = 0x0080fbfff000
>>>  swiotlb: coherent allocation failed for device :02:00.0 size=4096
>>>  CPU: 0 PID: 1062 Comm: systemd-udevd Not tainted 4.8.0+ #35
>>>  Hardware name: AMD Seattle/Seattle, BIOS 10:53:24 Oct 13 2016
>>>
>>> on systems without memory that is 32-bit addressable by PCI devices.
>>>
>>> Signed-off-by: Ard Biesheuvel 
>>  ...
>>> @@ -8281,6 +8282,8 @@ static int rtl_init_one(struct pci_dev *pdev, const 
>>> struct pci_device_id *ent)
>>>   dev->features |= NETIF_F_HIGHDMA;
>>>   } else {
>>>   rc = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
>>> + if (!rc)
>>> + rc = pci_set_consistent_dma_mask(pdev, 
>>> DMA_BIT_MASK(32));
>>
>> As you state 32-bit is the default, therefore this part of your patch is 
>> unnecessary.
> 
> Perhaps, but the original code did not assume that either. Should I
> remove the other call in a subsequent patch as well?

I simply want you to respin this with the above hunk removed.

Your code changes and your commit message must be consistent.


[PATCH v2] r8169: set coherent DMA mask as well as streaming DMA mask

2016-10-14 Thread Ard Biesheuvel
PCI devices that are 64-bit DMA capable should set the coherent
DMA mask as well as the streaming DMA mask. On some architectures,
these are managed separately, and so the coherent DMA mask will be
left at its default value of 32 if it is not set explicitly. This
results in errors such as

 r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
 hwdev DMA mask = 0x, dev_addr = 0x0080fbfff000
 swiotlb: coherent allocation failed for device :02:00.0 size=4096
 CPU: 0 PID: 1062 Comm: systemd-udevd Not tainted 4.8.0+ #35
 Hardware name: AMD Seattle/Seattle, BIOS 10:53:24 Oct 13 2016

on systems without memory that is 32-bit addressable by PCI devices.

Signed-off-by: Ard Biesheuvel 
---
v2: dropped the hunk that sets the coherent DMA mask to DMA_BIT_MASK(32),
which is unnecessary given that it is the default

 drivers/net/ethernet/realtek/r8169.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index e55638c7505a..bf000d819a21 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -8273,7 +8273,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
if ((sizeof(dma_addr_t) > 4) &&
(use_dac == 1 || (use_dac == -1 && pci_is_pcie(pdev) &&
  tp->mac_version >= RTL_GIGA_MAC_VER_18)) &&
-   !pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
+   !pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) &&
+   !pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64))) {
 
/* CPlusCmd Dual Access Cycle is only needed for non-PCIe */
if (!pci_is_pcie(pdev))
-- 
2.7.4
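
A minimal user-space sketch of why the two masks matter, assuming a simplified model in which an allocation is usable only if it fits under the relevant mask. DMA_BIT_MASK() has the same shape as the kernel macro; struct dma_masks and dma_addr_fits() are hypothetical names used only for illustration. The address in the usage note is the dev_addr value from the log in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(): an n-bit all-ones mask. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical model of a device that tracks two independent masks. */
struct dma_masks {
	uint64_t streaming; /* what pci_set_dma_mask() configures */
	uint64_t coherent;  /* what pci_set_consistent_dma_mask() configures */
};

/* True if every byte of [addr, addr + size) is addressable under mask. */
static bool dma_addr_fits(uint64_t addr, uint64_t size, uint64_t mask)
{
	if (size == 0 || addr > mask)
		return false;
	/* Written this way so addr + size cannot overflow. */
	return size - 1 <= mask - addr;
}
```

With streaming set to DMA_BIT_MASK(64) and coherent left at DMA_BIT_MASK(32), a 4096-byte buffer at 0x0080fbfff000 fits the streaming mask but not the coherent one, which matches the failure mode described in the commit message.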



Re: net/sctp: BUG: KASAN: stack-out-of-bounds in memcmp

2016-10-14 Thread Xin Long
On Sat, Aug 20, 2016 at 3:51 PM, Baozeng Ding  wrote:
> Hello all,
> The following program triggers  stack-out-of-bounds in memcmp. The kernel 
> version is 4.8.0-rc1+ (on Aug 13 commit 
> 118253a593bd1c57de2d1193df1ccffe1abe745b). Thanks.
...
>
> #define _GNU_SOURCE
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
>
> int main()
> {
> int fd;
> mmap((void *)0x2000ul, 0xff2000ul, 0x3ul, 0x32ul, -1, 0x0ul);
> fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_SCTP);
> memcpy((void*)0x20f82f80, 
> "\x0a\x00\xab\x12\x72\xd4\x19\x9a\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x85\xda\x00\xa0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
>  128);
> bind(fd, (struct sockaddr*)0x20f82f80ul, 0x80ul);
> *(uint64_t*)0x202e1fc8 = (uint64_t)0x20f77f80;
> *(uint32_t*)0x202e1fd0 = (uint32_t)0x80;
> *(uint64_t*)0x202e1fd8 = (uint64_t)0x20f7dfe0;
> *(uint64_t*)0x202e1fe0 = (uint64_t)0x2;
> *(uint64_t*)0x202e1fe8 = (uint64_t)0x20f77000;
> *(uint64_t*)0x202e1ff0 = (uint64_t)0x3;
> *(uint32_t*)0x202e1ff8 = (uint32_t)0x80;
> memcpy((void*)0x20f77f80, 
> "\x0a\x00\xab\x12\xb0\xb3\x20\x7b\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xc2\xc2\x0b\xb2\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
>  128);
> *(uint64_t*)0x20f7dfe0 = (uint64_t)0x20f77fc5;
> *(uint64_t*)0x20f7dfe8 = (uint64_t)0x3b;
> *(uint64_t*)0x20f7dff0 = (uint64_t)0x20f77fac;
> *(uint64_t*)0x20f7dff8 = (uint64_t)0x54;
> memcpy((void*)0x20f77fc5, 
> "\xa5\x7d\xf3\xc4\xfe\xd3\xfd\x44\x63\x00\x8c\x1e\x4c\x2e\x8d\x8d\x9a\x9c\x9c\x9d\x5b\x7c\xe1\x06\xf7\x15\x16\xed\x68\xd1\xfc\xf4\xa4\x3a\xe4\x69\x51\x16\x74\xf4\x1a\xcf\x0e\x99\xc3\xa3\x87\xe7\x81\x6c\x10\x78\x75\x17\x69\x9d\x11\x0c\xc7",
>  59);
> memcpy((void*)0x20f77fac, 
> "\x86\x08\x89\x3c\xf3\x58\xea\xe7\x64\x6a\xfb\xb5\xe8\xdd\x5f\x69\xa5\xd4\xdc\xd9\xe7\x71\x95\x07\x78\x7b\x21\xda\x43\x9c\x62\x4d\xca\x64\xb5\x6e\x96\x55\xe9\x58\x76\x66\x1d\xb9\x7b\xe6\x20\xc1\xa9\xed\x70\xc1\x2b\x7c\x86\x8c\xba\x28\xb3\x2c\xb9\x64\xb7\x84\x65\x0d\x7f\xa6\x98\x6f\x49\xcb\x35\xad\x5a\xdf\x13\x75\x99\x57\x7e\xbb\x38\x89",
>  84);
> *(uint64_t*)0x20f77000 = (uint64_t)0x15;
> *(uint32_t*)0x20f77008 = (uint32_t)0x1;
> *(uint32_t*)0x20f7700c = (uint32_t)0xfffe;
> *(uint8_t*)0x20f77010 = (uint8_t)0xbb;
> *(uint8_t*)0x20f77011 = (uint8_t)0x2;
> *(uint8_t*)0x20f77012 = (uint8_t)0x5;
> *(uint8_t*)0x20f77013 = (uint8_t)0x2;
> *(uint8_t*)0x20f77014 = (uint8_t)0x8000;
> *(uint64_t*)0x20f77015 = (uint64_t)0x10;
> *(uint32_t*)0x20f7701d = (uint32_t)0x;
> *(uint32_t*)0x20f77021 = (uint32_t)0x1;
> *(uint64_t*)0x20f77025 = (uint64_t)0x13;
> *(uint32_t*)0x20f7702d = (uint32_t)0x6;
> *(uint32_t*)0x20f77031 = (uint32_t)0xfe00;
> *(uint8_t*)0x20f77035 = (uint8_t)0x8000;
> *(uint8_t*)0x20f77036 = (uint8_t)0xfff8;
> sendmmsg(fd, (struct mmsghdr *)0x202e1fc8ul, 0x1ul, 0x1ul);
> return 0;
> }
>
Hi Baozeng, I couldn't reproduce this issue with this program,
even at commit 118253a593bd1c57de2d1193df1ccffe1abe745b.
Do I need any extra configuration to trigger it?


RE: [PATCH v2 net-next 0/7] qed*: driver updates

2016-10-14 Thread Chopra, Manish
> -----Original Message-----
> From: Manish Chopra [mailto:manish.cho...@qlogic.com]
> Sent: Thursday, October 13, 2016 4:53 PM
> To: da...@davemloft.net
> Cc: netdev@vger.kernel.org; yuval.mi...@qlogic.com; Chopra, Manish
> 
> Subject: [PATCH v2 net-next 0/7] qed*: driver updates
> 
> From: Manish Chopra 
> 
> Hi David,
> 
> There are several new additions in this series;
> Most are connected to either Tx offloading or Rx classifications
> [either fastpath changes or supporting configuration].
> 
> In addition, there's a single IOV enhancement.
> 
> Please consider applying this series to `net-next'.
> 
> V2:
> Added a fix for the race in ramrod handling
> pointed by Eric Dumazet [patch 7].
> 

Hi David, the last patch in this series caused the kbuild failure below.

drivers/net/ethernet/qlogic/qed/qed_spq.c: In function 'qed_spq_blocking_cb':
drivers/net/ethernet/qlogic/qed/qed_spq.c:62:85: error: call to 
'__compiletime_assert_60' declared with attribute error: Need native word sized 
stores/loads for atomicity.

I will respin the series with a kbuild fix and send it out soon.

Thanks,
Manish
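
For readers unfamiliar with the error text above: the kernel's acquire/release helpers reject objects whose size is not that of a native word. The sketch below mirrors that size test in user-space C11. NATIVE_WORD() follows the shape of the kernel's __native_word() check; the two structs are hypothetical and are not the qed driver's real types, and the message does not say which type triggered the failure.

```c
/* Mirrors the size test behind "Need native word sized stores/loads". */
#define NATIVE_WORD(t) \
	(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
	 sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))

/* Hypothetical completion record that is exactly one int wide. */
struct done_flag_ok {
	int done;
};

/* Hypothetical 3-byte record: no native word has this size. */
struct done_flag_bad {
	char done;
	char fw_return_code;
	char pad;
};

/* Compile-time check, analogous in spirit to the kernel's assertion. */
_Static_assert(NATIVE_WORD(struct done_flag_ok),
	       "need native word sized stores/loads for atomicity");
```

Applying the same _Static_assert to struct done_flag_bad would stop the build, which is the kind of failure the kbuild report shows.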




Re: Need help with mdiobus_register and phy

2016-10-14 Thread Timur Tabi

Andrew Lunn wrote:

> Normally, a sleeping PHY does respond to MDIO. Otherwise, how do you
> wake it?
>
> So i assume this phy has some other means to wake it. What is this
> means?


I'm guessing that someone has to call phy_resume() before/during the 
call to mdiobus_register, but I don't see how that's possible.


--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.


[PATCH net-next v2 5/6] fjes: Add debugfs entry for EP status information in fjes driver

2016-10-14 Thread Taku Izumi
This patch adds a debugfs entry that shows EP status information.
You can read each EP's status as follows:

  # cat /sys/kernel/debug/fjes/fjes.0/status

EPID    STATUS           SAME_ZONE        CONNECTED
ep0     shared           Y                Y
ep1     -                -                -
ep2     unshared         N                N
ep3     unshared         N                N
ep4     unshared         N                N
ep5     unshared         N                N
ep6     unshared         N                N
ep7     unshared         N                N
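
The row layout above can be reproduced in user space with the same format strings that fjes_dbg_status_show() uses in this patch. This is only a sketch: format_ep_row() and its parameters are hypothetical stand-ins for the driver's seq_printf() calls and hardware queries.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format one status row into buf. The local EP (is_self != 0) prints '-'
 * in every column; other EPs print a status string and two Y/N flags.
 * Returns the number of characters snprintf() would have written.
 */
static int format_ep_row(char *buf, size_t len, int epid, int is_self,
			 const char *status, int same_zone, int shared)
{
	if (is_self)
		return snprintf(buf, len, "ep%d\t%-16c %-16c %-16c\n",
				epid, '-', '-', '-');

	return snprintf(buf, len, "ep%d\t%-16s %-16c %-16c\n",
			epid, status,
			same_zone ? 'Y' : 'N',
			shared ? 'Y' : 'N');
}
```

Each field is left-aligned and padded to 16 characters, so every row has the same length regardless of the status string.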

Signed-off-by: Taku Izumi 
---
 drivers/net/fjes/Makefile   |   2 +-
 drivers/net/fjes/fjes.h |  16 ++
 drivers/net/fjes/fjes_debugfs.c | 117 
 drivers/net/fjes/fjes_main.c|  12 -
 4 files changed, 145 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/fjes/fjes_debugfs.c

diff --git a/drivers/net/fjes/Makefile b/drivers/net/fjes/Makefile
index 6705d1b..bc47b35 100644
--- a/drivers/net/fjes/Makefile
+++ b/drivers/net/fjes/Makefile
@@ -27,4 +27,4 @@
 
 obj-$(CONFIG_FUJITSU_ES) += fjes.o
 
-fjes-objs := fjes_main.o fjes_hw.o fjes_ethtool.o fjes_trace.o
+fjes-objs := fjes_main.o fjes_hw.o fjes_ethtool.o fjes_trace.o fjes_debugfs.o
diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index a592fe2..0372be3 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -66,6 +66,10 @@ struct fjes_adapter {
bool interrupt_watch_enable;
 
struct fjes_hw hw;
+
+#ifdef CONFIG_DEBUG_FS
+   struct dentry *dbg_adapter;
+#endif
 };
 
 extern char fjes_driver_name[];
@@ -74,4 +78,16 @@ extern const u32 fjes_support_mtu[];
 
 void fjes_set_ethtool_ops(struct net_device *);
 
+#ifdef CONFIG_DEBUG_FS
+void fjes_dbg_adapter_init(struct fjes_adapter *adapter);
+void fjes_dbg_adapter_exit(struct fjes_adapter *adapter);
+void fjes_dbg_init(void);
+void fjes_dbg_exit(void);
+#else
+static inline void fjes_dbg_adapter_init(struct fjes_adapter *adapter) {}
+static inline void fjes_dbg_adapter_exit(struct fjes_adapter *adapter) {}
+static inline void fjes_dbg_init(void) {}
+static inline void fjes_dbg_exit(void) {}
+#endif /* CONFIG_DEBUG_FS */
+
 #endif /* FJES_H_ */
diff --git a/drivers/net/fjes/fjes_debugfs.c b/drivers/net/fjes/fjes_debugfs.c
new file mode 100644
index 000..30052eb
--- /dev/null
+++ b/drivers/net/fjes/fjes_debugfs.c
@@ -0,0 +1,117 @@
+/*
+ *  FUJITSU Extended Socket Network Device driver
+ *  Copyright (c) 2015-2016 FUJITSU LIMITED
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, see .
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+/* debugfs support for fjes driver */
+
+#ifdef CONFIG_DEBUG_FS
+
+#include 
+#include 
+#include 
+
+#include "fjes.h"
+
+static struct dentry *fjes_debug_root;
+
+static const char * const ep_status_string[] = {
+   "unshared",
+   "shared",
+   "waiting",
+   "complete",
+};
+
+static int fjes_dbg_status_show(struct seq_file *m, void *v)
+{
+   struct fjes_adapter *adapter = m->private;
+   struct fjes_hw *hw = &adapter->hw;
+   int max_epid = hw->max_epid;
+   int my_epid = hw->my_epid;
+   int epidx;
+
+   seq_puts(m, "EPID\tSTATUS           SAME_ZONE        CONNECTED\n");
+   for (epidx = 0; epidx < max_epid; epidx++) {
+   if (epidx == my_epid) {
+   seq_printf(m, "ep%d\t%-16c %-16c %-16c\n",
+  epidx, '-', '-', '-');
+   } else {
+   seq_printf(m, "ep%d\t%-16s %-16c %-16c\n",
+  epidx,
+  ep_status_string[fjes_hw_get_partner_ep_status(hw, epidx)],
+  fjes_hw_epid_is_same_zone(hw, epidx) ? 'Y' : 'N',
+  fjes_hw_epid_is_shared(hw->hw_info.share, epidx) ? 'Y' : 'N');
+   }
+   }
+
+   return 0;
+}
+
+static int fjes_dbg_status_open(struct inode *inode, struct file *file)
+{
+   return single_open(file, fjes_dbg_status_show, inode->i_private);
+}
+
+static const struct file_operations fjes_dbg_status_fops = {
+   .owner  = THIS_MODULE,
+   .open   = fjes_dbg_status_open,
+   .read   = seq_read,
+   .llseek = seq_lseek,
+   

[PATCH net-next v2 3/6] fjes: Add tracepoints in fjes driver

2016-10-14 Thread Taku Izumi
This patch adds tracepoints to the fjes driver.
They are useful for debugging.

Signed-off-by: Taku Izumi 
---
 drivers/net/fjes/Makefile |   2 +-
 drivers/net/fjes/fjes_hw.c|  25 +++-
 drivers/net/fjes/fjes_main.c  |   5 +
 drivers/net/fjes/fjes_trace.c |  30 
 drivers/net/fjes/fjes_trace.h | 311 ++
 5 files changed, 369 insertions(+), 4 deletions(-)
 create mode 100644 drivers/net/fjes/fjes_trace.c
 create mode 100644 drivers/net/fjes/fjes_trace.h

diff --git a/drivers/net/fjes/Makefile b/drivers/net/fjes/Makefile
index 523e3d7..6705d1b 100644
--- a/drivers/net/fjes/Makefile
+++ b/drivers/net/fjes/Makefile
@@ -27,4 +27,4 @@
 
 obj-$(CONFIG_FUJITSU_ES) += fjes.o
 
-fjes-objs := fjes_main.o fjes_hw.o fjes_ethtool.o
+fjes-objs := fjes_main.o fjes_hw.o fjes_ethtool.o fjes_trace.o
diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 82b56e8..dba59dc 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -21,6 +21,7 @@
 
 #include "fjes_hw.h"
 #include "fjes.h"
+#include "fjes_trace.h"
 
 static void fjes_hw_update_zone_task(struct work_struct *);
 static void fjes_hw_epstop_task(struct work_struct *);
@@ -371,7 +372,7 @@ fjes_hw_issue_request_command(struct fjes_hw *hw,
enum fjes_dev_command_response_e ret = FJES_CMD_STATUS_UNKNOWN;
union REG_CR cr;
union REG_CS cs;
-   int timeout;
+   int timeout = FJES_COMMAND_REQ_TIMEOUT * 1000;
 
cr.reg = 0;
cr.bits.req_start = 1;
@@ -408,6 +409,8 @@ fjes_hw_issue_request_command(struct fjes_hw *hw,
}
}
 
+   trace_fjes_hw_issue_request_command(&cr, &cs, timeout, ret);
+
return ret;
 }
 
@@ -427,11 +430,13 @@ int fjes_hw_request_info(struct fjes_hw *hw)
res_buf->info.code = 0;
 
ret = fjes_hw_issue_request_command(hw, FJES_CMD_REQ_INFO);
+   trace_fjes_hw_request_info(hw, res_buf);
 
result = 0;
 
if (FJES_DEV_COMMAND_INFO_RES_LEN((*hw->hw_info.max_epid)) !=
res_buf->info.length) {
+   trace_fjes_hw_request_info_err("Invalid res_buf");
result = -ENOMSG;
} else if (ret == FJES_CMD_STATUS_NORMAL) {
switch (res_buf->info.code) {
@@ -448,6 +453,7 @@ int fjes_hw_request_info(struct fjes_hw *hw)
result = -EPERM;
break;
case FJES_CMD_STATUS_TIMEOUT:
+   trace_fjes_hw_request_info_err("Timeout");
result = -EBUSY;
break;
case FJES_CMD_STATUS_ERROR_PARAM:
@@ -512,6 +518,8 @@ int fjes_hw_register_buff_addr(struct fjes_hw *hw, int dest_epid,
res_buf->share_buffer.length = 0;
res_buf->share_buffer.code = 0;
 
+   trace_fjes_hw_register_buff_addr_req(req_buf, buf_pair);
+
ret = fjes_hw_issue_request_command(hw, FJES_CMD_REQ_SHARE_BUFFER);
 
timeout = FJES_COMMAND_REQ_BUFF_TIMEOUT * 1000;
@@ -532,16 +540,20 @@ int fjes_hw_register_buff_addr(struct fjes_hw *hw, int dest_epid,
 
result = 0;
 
+   trace_fjes_hw_register_buff_addr(res_buf, timeout);
+
if (res_buf->share_buffer.length !=
-   FJES_DEV_COMMAND_SHARE_BUFFER_RES_LEN)
+   FJES_DEV_COMMAND_SHARE_BUFFER_RES_LEN) {
+   trace_fjes_hw_register_buff_addr_err("Invalid res_buf");
result = -ENOMSG;
-   else if (ret == FJES_CMD_STATUS_NORMAL) {
+   } else if (ret == FJES_CMD_STATUS_NORMAL) {
switch (res_buf->share_buffer.code) {
case FJES_CMD_REQ_RES_CODE_NORMAL:
result = 0;
set_bit(dest_epid, &hw->hw_info.buffer_share_bit);
break;
case FJES_CMD_REQ_RES_CODE_BUSY:
+   trace_fjes_hw_register_buff_addr_err("Busy Timeout");
result = -EBUSY;
break;
default:
@@ -554,6 +566,7 @@ int fjes_hw_register_buff_addr(struct fjes_hw *hw, int dest_epid,
result = -EPERM;
break;
case FJES_CMD_STATUS_TIMEOUT:
+   trace_fjes_hw_register_buff_addr_err("Timeout");
result = -EBUSY;
break;
case FJES_CMD_STATUS_ERROR_PARAM:
@@ -595,6 +608,7 @@ int fjes_hw_unregister_buff_addr(struct fjes_hw *hw, int dest_epid)
res_buf->unshare_buffer.length = 0;
res_buf->unshare_buffer.code = 0;
 
+   trace_fjes_hw_unregister_buff_addr_req(req_buf);
ret = fjes_hw_issue_request_command(hw, FJES_CMD_REQ_UNSHARE_BUFFER);
 
timeout = FJES_COMMAND_REQ_BUFF_TIMEOUT * 1000;
@@ -616,8 +630,11 @@ int fjes_hw_unregister_buff_addr(struct fjes_hw *hw, int dest_epid)
 
result = 0;
 
+   

[PATCH net-next v2 4/6] fjes: ethtool -w and -W support for fjes driver

2016-10-14 Thread Taku Izumi
This patch implements ethtool -w and -W support
for the fjes driver.

You can enable and disable firmware debug mode by
using ethtool -W, and also retrieve firmware
activity information by using ethtool -w.

This is useful for debugging.
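
The enable/disable rules that fjes_set_dump() applies in this patch can be summarised with a small user-space model. It is a sketch only: struct dbg_state and set_dump() are hypothetical, hw_result stands in for the return value of fjes_hw_start_debug() or fjes_hw_stop_debug(), and clearing the flag after a successful stop is an assumption, since that part of the patch is not visible here.

```c
#include <errno.h>

/* Hypothetical stand-in for the debug_mode field kept in struct fjes_hw. */
struct dbg_state {
	unsigned int debug_mode;
};

/*
 * Model of "ethtool -W": a non-zero flag enables debug mode, zero disables
 * it. Enabling twice, or disabling while disabled, is rejected with -EPERM.
 */
static int set_dump(struct dbg_state *s, unsigned int flag, int hw_result)
{
	if (flag) {
		if (s->debug_mode)
			return -EPERM;

		s->debug_mode = flag;
		if (hw_result)
			s->debug_mode = 0; /* roll back if the start failed */
		return hw_result;
	}

	if (!s->debug_mode)
		return -EPERM;

	if (!hw_result)
		s->debug_mode = 0; /* assumption: a successful stop clears it */
	return hw_result;
}
```

The roll-back on a failed start keeps the recorded mode in step with the firmware, so a later enable request is not refused.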

Signed-off-by: Taku Izumi 
---
 drivers/net/fjes/fjes_ethtool.c |  63 ++
 drivers/net/fjes/fjes_hw.c  | 137 
 drivers/net/fjes/fjes_hw.h  |  15 +
 drivers/net/fjes/fjes_trace.h   |  69 
 4 files changed, 284 insertions(+)

diff --git a/drivers/net/fjes/fjes_ethtool.c b/drivers/net/fjes/fjes_ethtool.c
index 68ef287..6575f88 100644
--- a/drivers/net/fjes/fjes_ethtool.c
+++ b/drivers/net/fjes/fjes_ethtool.c
@@ -235,6 +235,66 @@ static void fjes_get_regs(struct net_device *netdev,
regs_buff[36] = rd32(XSCT_ICTL);
 }
 
+static int fjes_set_dump(struct net_device *netdev, struct ethtool_dump *dump)
+{
+   struct fjes_adapter *adapter = netdev_priv(netdev);
+   struct fjes_hw *hw = &adapter->hw;
+   int ret = 0;
+
+   if (dump->flag) {
+   if (hw->debug_mode)
+   return -EPERM;
+
+   hw->debug_mode = dump->flag;
+
+   /* enable debug mode */
+   mutex_lock(&hw->hw_info.lock);
+   ret = fjes_hw_start_debug(hw);
+   mutex_unlock(&hw->hw_info.lock);
+
+   if (ret)
+   hw->debug_mode = 0;
+   } else {
+   if (!hw->debug_mode)
+   return -EPERM;
+
+   /* disable debug mode */
+   mutex_lock(&hw->hw_info.lock);
+   ret = fjes_hw_stop_debug(hw);
+   mutex_unlock(&hw->hw_info.lock);
+   }
+
+   return ret;
+}
+
+static int fjes_get_dump_flag(struct net_device *netdev,
+ struct ethtool_dump *dump)
+{
+   struct fjes_adapter *adapter = netdev_priv(netdev);
+   struct fjes_hw *hw = &adapter->hw;
+
+   dump->len = hw->hw_info.trace_size;
+   dump->version = 1;
+   dump->flag = hw->debug_mode;
+
+   return 0;
+}
+
+static int fjes_get_dump_data(struct net_device *netdev,
+ struct ethtool_dump *dump, void *buf)
+{
+   struct fjes_adapter *adapter = netdev_priv(netdev);
+   struct fjes_hw *hw = &adapter->hw;
+   int ret = 0;
+
+   if (hw->hw_info.trace)
+   memcpy(buf, hw->hw_info.trace, hw->hw_info.trace_size);
+   else
+   ret = -EPERM;
+
+   return ret;
+}
+
 static const struct ethtool_ops fjes_ethtool_ops = {
.get_settings   = fjes_get_settings,
.get_drvinfo= fjes_get_drvinfo,
@@ -243,6 +303,9 @@ static const struct ethtool_ops fjes_ethtool_ops = {
.get_sset_count   = fjes_get_sset_count,
.get_regs   = fjes_get_regs,
.get_regs_len   = fjes_get_regs_len,
+   .set_dump   = fjes_set_dump,
+   .get_dump_flag  = fjes_get_dump_flag,
+   .get_dump_data  = fjes_get_dump_data,
 };
 
 void fjes_set_ethtool_ops(struct net_device *netdev)
diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index dba59dc..9c652c0 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -343,6 +343,9 @@ int fjes_hw_init(struct fjes_hw *hw)
 
ret = fjes_hw_setup(hw);
 
+   hw->hw_info.trace = vzalloc(FJES_DEBUG_BUFFER_SIZE);
+   hw->hw_info.trace_size = FJES_DEBUG_BUFFER_SIZE;
+
return ret;
 }
 
@@ -351,6 +354,18 @@ void fjes_hw_exit(struct fjes_hw *hw)
int ret;
 
if (hw->base) {
+
+   if (hw->debug_mode) {
+   /* disable debug mode */
+   mutex_lock(&hw->hw_info.lock);
+   fjes_hw_stop_debug(hw);
+   mutex_unlock(&hw->hw_info.lock);
+   }
+   vfree(hw->hw_info.trace);
+   hw->hw_info.trace = NULL;
+   hw->hw_info.trace_size = 0;
+   hw->debug_mode = 0;
+
ret = fjes_hw_reset(hw);
if (ret)
pr_err("%s: reset error", __func__);
@@ -1175,3 +1190,125 @@ static void fjes_hw_epstop_task(struct work_struct *work)
}
}
 }
+
+int fjes_hw_start_debug(struct fjes_hw *hw)
+{
+   union fjes_device_command_req *req_buf = hw->hw_info.req_buf;
+   union fjes_device_command_res *res_buf = hw->hw_info.res_buf;
+   enum fjes_dev_command_response_e ret;
+   int page_count;
+   int result = 0;
+   void *addr;
+   int i;
+
+   if (!hw->hw_info.trace)
+   return -EPERM;
+   memset(hw->hw_info.trace, 0, FJES_DEBUG_BUFFER_SIZE);
+
+   memset(req_buf, 0, hw->hw_info.req_buf_size);
+   memset(res_buf, 0, hw->hw_info.res_buf_size);
+
+   req_buf->start_trace.length =
+

[PATCH net-next v2 2/6] fjes: Enhance ethtool -S for fjes driver

2016-10-14 Thread Taku Izumi
This patch enhances ethtool -S for the fjes driver so that
EP-related statistics can be retrieved.

The following statistics can be displayed via ethtool -S:

 ep%d_com_regist_buf_exec
 ep%d_com_unregist_buf_exec
 ep%d_send_intr_rx
 ep%d_send_intr_unshare
 ep%d_send_intr_zoneupdate
 ep%d_recv_intr_rx
 ep%d_recv_intr_unshare
 ep%d_recv_intr_stop
 ep%d_recv_intr_zoneupdate
 ep%d_tx_buffer_full
 ep%d_tx_dropped_not_shared
 ep%d_tx_dropped_ver_mismatch
 ep%d_tx_dropped_buf_size_mismatch
 ep%d_tx_dropped_vlanid_mismatch
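
A user-space sketch of how the per-EP names and the total count fit together, following the FJES_STATS_LEN arithmetic in this patch (global statistics plus 14 counters for every EP except the local one). stats_len(), fill_ep_strings(), and ep_stat_names are hypothetical helpers; the sketch uses snprintf() so that each name stays inside its fixed-size slot, whereas the patch itself uses sprintf().

```c
#include <stddef.h>
#include <stdio.h>

#define GSTRING_LEN  32 /* same value as the kernel's ETH_GSTRING_LEN */
#define EP_STATS_LEN 14 /* per-EP counters, as in FJES_EP_STATS_LEN */

static const char *const ep_stat_names[EP_STATS_LEN] = {
	"com_regist_buf_exec", "com_unregist_buf_exec",
	"send_intr_rx", "send_intr_unshare", "send_intr_zoneupdate",
	"recv_intr_rx", "recv_intr_unshare", "recv_intr_stop",
	"recv_intr_zoneupdate", "tx_buffer_full",
	"tx_dropped_not_shared", "tx_dropped_ver_mismatch",
	"tx_dropped_buf_size_mismatch", "tx_dropped_vlanid_mismatch",
};

/* Total number of statistics: globals plus 14 per remote EP. */
static size_t stats_len(size_t n_global, int max_epid)
{
	return n_global + (size_t)(max_epid - 1) * EP_STATS_LEN;
}

/*
 * Write one GSTRING_LEN-sized name slot per counter for every EP except
 * my_epid. Returns the number of slots written.
 */
static size_t fill_ep_strings(char *data, int max_epid, int my_epid)
{
	char *p = data;
	int epid, j;

	for (epid = 0; epid < max_epid; epid++) {
		if (epid == my_epid)
			continue;
		for (j = 0; j < EP_STATS_LEN; j++) {
			snprintf(p, GSTRING_LEN, "ep%d_%s", epid,
				 ep_stat_names[j]);
			p += GSTRING_LEN;
		}
	}

	return (size_t)(p - data) / GSTRING_LEN;
}
```

The caller is expected to provide (max_epid - 1) * EP_STATS_LEN * GSTRING_LEN bytes, which is the same count that stats_len() adds on top of the global statistics.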

Signed-off-by: Taku Izumi 
---
 drivers/net/fjes/fjes_ethtool.c | 70 -
 drivers/net/fjes/fjes_hw.c  |  9 ++
 drivers/net/fjes/fjes_hw.h  | 19 +++
 drivers/net/fjes/fjes_main.c| 44 +++---
 4 files changed, 137 insertions(+), 5 deletions(-)

diff --git a/drivers/net/fjes/fjes_ethtool.c b/drivers/net/fjes/fjes_ethtool.c
index 8397634..68ef287 100644
--- a/drivers/net/fjes/fjes_ethtool.c
+++ b/drivers/net/fjes/fjes_ethtool.c
@@ -49,10 +49,18 @@ static const struct fjes_stats fjes_gstrings_stats[] = {
FJES_STAT("tx_dropped", stats64.tx_dropped),
 };
 
+#define FJES_EP_STATS_LEN 14
+#define FJES_STATS_LEN \
+   (ARRAY_SIZE(fjes_gstrings_stats) + \
+((&((struct fjes_adapter *)netdev_priv(netdev))->hw)->max_epid - 1) * \
+FJES_EP_STATS_LEN)
+
 static void fjes_get_ethtool_stats(struct net_device *netdev,
   struct ethtool_stats *stats, u64 *data)
 {
struct fjes_adapter *adapter = netdev_priv(netdev);
+   struct fjes_hw *hw = &adapter->hw;
+   int epidx;
char *p;
int i;
 
@@ -61,11 +69,39 @@ static void fjes_get_ethtool_stats(struct net_device *netdev,
data[i] = (fjes_gstrings_stats[i].sizeof_stat == sizeof(u64))
? *(u64 *)p : *(u32 *)p;
}
+   for (epidx = 0; epidx < hw->max_epid; epidx++) {
+   if (epidx == hw->my_epid)
+   continue;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats
+   .com_regist_buf_exec;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats
+   .com_unregist_buf_exec;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats.send_intr_rx;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats.send_intr_unshare;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats
+   .send_intr_zoneupdate;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats.recv_intr_rx;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats.recv_intr_unshare;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats.recv_intr_stop;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats
+   .recv_intr_zoneupdate;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats.tx_buffer_full;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats
+   .tx_dropped_not_shared;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats
+   .tx_dropped_ver_mismatch;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats
+   .tx_dropped_buf_size_mismatch;
+   data[i++] = hw->ep_shm_info[epidx].ep_stats
+   .tx_dropped_vlanid_mismatch;
+   }
 }
 
 static void fjes_get_strings(struct net_device *netdev,
 u32 stringset, u8 *data)
 {
+   struct fjes_adapter *adapter = netdev_priv(netdev);
+   struct fjes_hw *hw = &adapter->hw;
u8 *p = data;
int i;
 
@@ -76,6 +112,38 @@ static void fjes_get_strings(struct net_device *netdev,
   ETH_GSTRING_LEN);
p += ETH_GSTRING_LEN;
}
+   for (i = 0; i < hw->max_epid; i++) {
+   if (i == hw->my_epid)
+   continue;
+   sprintf(p, "ep%u_com_regist_buf_exec", i);
+   p += ETH_GSTRING_LEN;
+   sprintf(p, "ep%u_com_unregist_buf_exec", i);
+   p += ETH_GSTRING_LEN;
+   sprintf(p, "ep%u_send_intr_rx", i);
+   p += ETH_GSTRING_LEN;
+   sprintf(p, "ep%u_send_intr_unshare", i);
+   p += ETH_GSTRING_LEN;
+   sprintf(p, "ep%u_send_intr_zoneupdate", i);
+   p += ETH_GSTRING_LEN;
+   sprintf(p, "ep%u_recv_intr_rx", i);
+   p += ETH_GSTRING_LEN;
+   sprintf(p, "ep%u_recv_intr_unshare", i);
+   p += ETH_GSTRING_LEN;
+   sprintf(p, "ep%u_recv_intr_stop", i);
+   p += ETH_GSTRING_LEN;
+   

[PATCH net-next v2 6/6] fjes: Update fjes driver version : 1.2

2016-10-14 Thread Taku Izumi
Signed-off-by: Taku Izumi 
---
 drivers/net/fjes/fjes_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 359e7a5..f36eb4a 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -30,7 +30,7 @@
 #include "fjes_trace.h"
 
 #define MAJ 1
-#define MIN 1
+#define MIN 2
 #define DRV_VERSION __stringify(MAJ) "." __stringify(MIN)
 #define DRV_NAME   "fjes"
 char fjes_driver_name[] = DRV_NAME;
-- 
2.6.6



[PATCH] r8169: set coherent DMA mask as well as streaming DMA mask

2016-10-14 Thread Ard Biesheuvel
PCI devices that are 64-bit DMA capable should set the coherent
DMA mask as well as the streaming DMA mask. On some architectures,
these are managed separately, and so the coherent DMA mask will be
left at its default value of 32 if it is not set explicitly. This
results in errors such as

 r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
 hwdev DMA mask = 0x, dev_addr = 0x0080fbfff000
 swiotlb: coherent allocation failed for device :02:00.0 size=4096
 CPU: 0 PID: 1062 Comm: systemd-udevd Not tainted 4.8.0+ #35
 Hardware name: AMD Seattle/Seattle, BIOS 10:53:24 Oct 13 2016

on systems without memory that is 32-bit addressable by PCI devices.

Signed-off-by: Ard Biesheuvel 
---
 drivers/net/ethernet/realtek/r8169.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index e55638c7505a..04957a36b11f 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -8273,7 +8273,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
if ((sizeof(dma_addr_t) > 4) &&
(use_dac == 1 || (use_dac == -1 && pci_is_pcie(pdev) &&
  tp->mac_version >= RTL_GIGA_MAC_VER_18)) &&
-   !pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
+   !pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) &&
+   !pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64))) {
 
/* CPlusCmd Dual Access Cycle is only needed for non-PCIe */
if (!pci_is_pcie(pdev))
@@ -8281,6 +8282,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
dev->features |= NETIF_F_HIGHDMA;
} else {
rc = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
+   if (!rc)
+   rc = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
if (rc < 0) {
netif_err(tp, probe, dev, "DMA configuration failed\n");
goto err_out_unmap_4;
-- 
2.7.4



[PATCH net-next 1/2] net: phy: Add Speed downshift set driver for Microsemi PHYs.

2016-10-14 Thread Raju Lakkaraju
From: Raju Lakkaraju 

For operation in cabling environments that are incompatible with
1000BASE-T, the VSC8531 device provides an automatic link speed
downshift operation. When enabled, the device automatically changes
its 1000BASE-T auto-negotiation to the next slower speed after
a configured number of failed attempts at 1000BASE-T.
This feature is useful in networks using older cable
installations that include only pairs A and B, and not pairs C and D.
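
As a sketch of how the configured attempt count becomes a register value, the function below repeats the arithmetic of vsc85xx_downshift_magic_get() from this patch in plain C. DOWNSHIFT_EN and DOWNSHIFT_CNTL_POS are the values defined in the patch; downshift_magic() is a hypothetical name, and the sketch does not read the device tree or touch MDIO.

```c
#include <errno.h>

#define DOWNSHIFT_EN       0x0010 /* enable bit, as defined in the patch */
#define DOWNSHIFT_CNTL_POS 2      /* bit position of the count field */

/*
 * Map a downshift count to the register bits the driver would program.
 * 0 disables downshifting; 2..5 are encoded as 0..3 in the count field
 * with the enable bit set; every other value is rejected.
 */
static int downshift_magic(unsigned int count)
{
	if (count == 0)
		return 0;
	if (count == 1 || count > 5)
		return -EINVAL;

	return (int)(((count - 2) << DOWNSHIFT_CNTL_POS) | DOWNSHIFT_EN);
}
```

For the device tree example in this patch, a count of 3 yields 0x14: field value 1 shifted left by 2, plus the enable bit.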

Signed-off-by: Raju Lakkaraju 
Signed-off-by: Allan W. Nielsen 
---
 .../devicetree/bindings/net/mscc-phy-vsc8531.txt   |  6 ++
 drivers/net/phy/mscc.c | 75 +-
 2 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.txt b/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.txt
index bdefefc6..062d115 100644
--- a/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.txt
+++ b/Documentation/devicetree/bindings/net/mscc-phy-vsc8531.txt
@@ -27,6 +27,11 @@ Optional properties:
  'vddmac'.
  Default value is 0%.
  Ref: Table:1 - Edge rate change (below).
+- downshift-cnt: When enabled, the device automatically changes its
+ 1000BASE-T auto-negotiation to the next slower speed
+ after a 'downshift-cnt' of failed attempts at
+ 1000BASE-T. Allowed values: 0, 2, 3, 4, 5.
+ 0 is default and will disable downshifting.
 
 Table: 1 - Edge rate change
 |
@@ -60,4 +65,5 @@ Example:
 compatible = "ethernet-phy-id0007.0570";
 vsc8531,vddmac = <3300>;
 vsc8531,edge-slowdown  = <7>;
+vsc8531,downshift-cnt   = <3>;
 };
diff --git a/drivers/net/phy/mscc.c b/drivers/net/phy/mscc.c
index 43a7545..e87d9f0 100644
--- a/drivers/net/phy/mscc.c
+++ b/drivers/net/phy/mscc.c
@@ -46,8 +46,15 @@ enum rgmii_rx_clock_delay {
 
 #define MSCC_EXT_PAGE_ACCESS 31
 #define MSCC_PHY_PAGE_STANDARD   0x /* Standard registers */
+#define MSCC_PHY_PAGE_EXTENDED   0x0001 /* Extended registers */
 #define MSCC_PHY_PAGE_EXTENDED_2 0x0002 /* Extended reg - page 2 */
 
+/* Extended Page 1 Registers */
+#define MSCC_PHY_ACTIPHY_CNTL  20
+#define DOWNSHIFT_CNTL_MASK  0x000C
+#define DOWNSHIFT_EN 0x0010
+#define DOWNSHIFT_CNTL_POS   2
+
 /* Extended Page 2 Registers */
 #define MSCC_PHY_RGMII_CNTL  20
 #define RGMII_RX_CLK_DELAY_MASK  0x0070
@@ -75,6 +82,7 @@ enum rgmii_rx_clock_delay {
 
 struct vsc8531_private {
int rate_magic;
+   u8  downshift_magic;
 };
 
 #ifdef CONFIG_OF_MDIO
@@ -99,6 +107,31 @@ static int vsc85xx_phy_page_set(struct phy_device *phydev, u8 page)
return rc;
 }
 
+static int vsc85xx_downshift_set(struct phy_device *phydev, u8 magic)
+{
+   int rc;
+   u16 reg_val;
+
+   mutex_lock(&phydev->lock);
+   rc = vsc85xx_phy_page_set(phydev, MSCC_PHY_PAGE_EXTENDED);
+   if (rc != 0)
+   goto out_unlock;
+
+   reg_val = phy_read(phydev, MSCC_PHY_ACTIPHY_CNTL);
+   reg_val &= ~(DOWNSHIFT_CNTL_MASK);
+   reg_val |= magic;
+   rc = phy_write(phydev, MSCC_PHY_ACTIPHY_CNTL, reg_val);
+   if (rc != 0)
+   goto out_unlock;
+
+   rc = vsc85xx_phy_page_set(phydev, MSCC_PHY_PAGE_STANDARD);
+
+out_unlock:
+   mutex_unlock(&phydev->lock);
+
+   return rc;
+}
+
 static int vsc85xx_wol_set(struct phy_device *phydev,
   struct ethtool_wolinfo *wol)
 {
@@ -239,11 +272,42 @@ static int vsc85xx_edge_rate_magic_get(struct phy_device *phydev)
 
return -EINVAL;
 }
+
+static int vsc85xx_downshift_magic_get(struct phy_device *phydev)
+{
+   int rc;
+   u8 ds;
+   struct device *dev = &phydev->mdio.dev;
+   struct device_node *of_node = dev->of_node;
+
+   if (!of_node)
+   return -ENODEV;
+
+   rc = of_property_read_u8(of_node, "vsc8531,downshift-cnt", &ds);
+   if ((rc == -EINVAL) || (ds == 0))
+   return 0;
+   if (ds == 1 || ds > 5) {
+   phydev_err(phydev, "Invalid downshift count\n");
+   return -EINVAL;
+   }
+
+   /* ds is either 2,3,4 or 5 */
+   ds -= 2;
+   ds <<= DOWNSHIFT_CNTL_POS;
+   ds |= DOWNSHIFT_EN;
+
+   return ds;
+}
 #else
 static int vsc85xx_edge_rate_magic_get(struct phy_device *phydev)
 {
return 0;
 }
+
+static int vsc85xx_downshift_magic_get(struct phy_device *phydev)
+{
+   return 0;
+}
 #endif /* CONFIG_OF_MDIO */
 
 static int vsc85xx_edge_rate_cntl_set(struct phy_device *phydev, u8 edge_rate)
@@ -344,6 +408,10 @@ 

Re: [PATCH] mac80211: aes_ccm: move struct aead_req off the stack

2016-10-14 Thread Johannes Berg
On Fri, 2016-10-14 at 15:10 +0200, Johannes Berg wrote:
> > 
> > So use kzalloc
> 
> Do we really need kzalloc()? We have things on the stack right now,
> and don't initialize, so surely we don't really need to zero things? 

Err, never mind, I'm an idiot - we *do* initialize to 0, of course.

johannes


Re: [PATCH] mac80211: aes_ccm: move struct aead_req off the stack

2016-10-14 Thread Ard Biesheuvel
On 14 October 2016 at 14:10, Johannes Berg  wrote:
>
>> So use kzalloc
>
> Do we really need kzalloc()? We have things on the stack right now, and
> don't initialize, so surely we don't really need to zero things?
>
>> This only addresses one half of the problem. The other problem, i.e.,
>> the fact that the aad[] array lives on the stack of the caller, is
>> handled adequately imo by the change proposed by Johannes.
>
> But if we allocate things anyway, is it worth expending per-CPU buffers
> on these?
>

Ehmm, maybe not. I could spin a v2 that allocates a bigger buffer and
copies aad[] into it as well.
That does not help the other algos, though.


Re: [PATCH] r8169: set coherent DMA mask as well as streaming DMA mask

2016-10-14 Thread Ard Biesheuvel
On 14 October 2016 at 14:34, David Miller  wrote:
> From: Ard Biesheuvel 
> Date: Fri, 14 Oct 2016 14:32:24 +0100
>
>> On 14 October 2016 at 14:31, David Miller  wrote:
>>> From: Ard Biesheuvel 
>>> Date: Fri, 14 Oct 2016 12:39:30 +0100
>>>
>>>> PCI devices that are 64-bit DMA capable should set the coherent
>>>> DMA mask as well as the streaming DMA mask. On some architectures,
>>>> these are managed separately, and so the coherent DMA mask will be
>>>> left at its default value of 32 if it is not set explicitly. This
>>>> results in errors such as
>>>>
>>>>  r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>>>>  hwdev DMA mask = 0x, dev_addr = 0x0080fbfff000
>>>>  swiotlb: coherent allocation failed for device :02:00.0 size=4096
>>>>  CPU: 0 PID: 1062 Comm: systemd-udevd Not tainted 4.8.0+ #35
>>>>  Hardware name: AMD Seattle/Seattle, BIOS 10:53:24 Oct 13 2016
>>>>
>>>> on systems without memory that is 32-bit addressable by PCI devices.
>>>>
>>>> Signed-off-by: Ard Biesheuvel 
>>>  ...
>>>> @@ -8281,6 +8282,8 @@ static int rtl_init_one(struct pci_dev *pdev, const 
>>>> struct pci_device_id *ent)
>>>>   dev->features |= NETIF_F_HIGHDMA;
>>>>   } else {
>>>>   rc = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
>>>> + if (!rc)
>>>> + rc = pci_set_consistent_dma_mask(pdev, 
>>>> DMA_BIT_MASK(32));
>>>
>>> As you state 32-bit is the default, therefore this part of your patch is 
>>> unnecessary.
>>
>> Perhaps, but the original code did not assume that either. Should I
>> remove the other call in a subsequent patch as well?
>
> I simply want you to respin this with the above hunk removed.
>
> Your code changes and your commit message must be consistent.

OK, fair enough


Re: [PATCH v2] r8169: set coherent DMA mask as well as streaming DMA mask

2016-10-14 Thread Ard Biesheuvel


> On 14 Oct 2016, at 14:42, David Laight wrote:
> 
> From: Of Ard Biesheuvel
>> Sent: 14 October 2016 14:41
>> PCI devices that are 64-bit DMA capable should set the coherent
>> DMA mask as well as the streaming DMA mask. On some architectures,
>> these are managed separately, and so the coherent DMA mask will be
>> left at its default value of 32 if it is not set explicitly. This
>> results in errors such as
>> 
>> r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>> hwdev DMA mask = 0x, dev_addr = 0x0080fbfff000
>> swiotlb: coherent allocation failed for device :02:00.0 size=4096
>> CPU: 0 PID: 1062 Comm: systemd-udevd Not tainted 4.8.0+ #35
>> Hardware name: AMD Seattle/Seattle, BIOS 10:53:24 Oct 13 2016
>> 
>> on systems without memory that is 32-bit addressable by PCI devices.
>> 
>> Signed-off-by: Ard Biesheuvel 
>> ---
>> v2: dropped the hunk that sets the coherent DMA mask to DMA_BIT_MASK(32),
>>which is unnecessary given that it is the default
>> 
>> drivers/net/ethernet/realtek/r8169.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/net/ethernet/realtek/r8169.c 
>> b/drivers/net/ethernet/realtek/r8169.c
>> index e55638c7505a..bf000d819a21 100644
>> --- a/drivers/net/ethernet/realtek/r8169.c
>> +++ b/drivers/net/ethernet/realtek/r8169.c
>> @@ -8273,7 +8273,8 @@ static int rtl_init_one(struct pci_dev *pdev, const 
>> struct pci_device_id *ent)
>>if ((sizeof(dma_addr_t) > 4) &&
>>(use_dac == 1 || (use_dac == -1 && pci_is_pcie(pdev) &&
>>  tp->mac_version >= RTL_GIGA_MAC_VER_18)) &&
>> -!pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
>> +!pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) &&
>> +!pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64))) {
> 
> Isn't there a dma_set_mask_and_coherent() function ?
> 
>David
> 
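
For readers following the streaming versus coherent distinction, here is a small userspace model of the failure described in the commit message. It is a sketch, not driver code: struct toy_dev and the toy_* helpers are hypothetical, and only the shape of TOY_DMA_BIT_MASK() mirrors the kernel's DMA_BIT_MASK(). The mainline kernel does provide dma_set_mask_and_coherent(struct device *dev, u64 mask) in <linux/dma-mapping.h>, which sets both masks in one call; toy_set_mask_and_coherent() only imitates that behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(): n low bits set, 64 special-cased. */
#define TOY_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical device model that tracks the two masks separately. */
struct toy_dev {
	uint64_t streaming_mask;
	uint64_t coherent_mask;
};

/* Both masks start out at 32 bits, as the commit message describes. */
static void toy_dev_init(struct toy_dev *dev)
{
	dev->streaming_mask = TOY_DMA_BIT_MASK(32);
	dev->coherent_mask = TOY_DMA_BIT_MASK(32);
}

/* Models setting only the streaming mask (the pre-patch driver behaviour). */
static void toy_set_streaming_mask(struct toy_dev *dev, uint64_t mask)
{
	dev->streaming_mask = mask;
}

/* Models a combined helper that updates both masks together. */
static void toy_set_mask_and_coherent(struct toy_dev *dev, uint64_t mask)
{
	dev->streaming_mask = mask;
	dev->coherent_mask = mask;
}

/* An address is reachable only if it has no bits set outside the mask. */
static bool toy_addr_fits(uint64_t mask, uint64_t addr)
{
	return (addr & ~mask) == 0;
}
```

With only toy_set_streaming_mask(&dev, TOY_DMA_BIT_MASK(64)) applied, the dev_addr from the log above (0x0080fbfff000) fits the streaming mask but not the coherent one, which matches the reported coherent allocation failure.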


Re: [PATCH net] net/mlx4_en: fixup xdp tx irq to match rx

2016-10-14 Thread David Miller
From: Brenden Blanco 
Date: Thu, 13 Oct 2016 13:13:11 -0700

> In cases where the number of tx rings is not a multiple of the number of
> rx rings, the tx completion event will be handled on a different core
> from the transmit and population of the ring. Races on the ring will
> lead to a double-free of the page, and possibly other corruption.
> 
> The rings are initialized by default with a valid multiple of rings,
> based on the number of cpus, therefore an invalid configuration requires
> ethtool to change the ring layout. For instance 'ethtool -L eth0 rx 9 tx
> 8' will cause packets received on rx0, and XDP_TX'd to tx48, to be
> completed on cpu3 (48 % 9 == 3).
> 
> Resolve this discrepancy by shifting the irq for the xdp tx queues to
> start again from 0, modulo rx_ring_num.
> 
> Fixes: 9ecc2d86171a ("net/mlx4_en: add xdp forwarding and data write support")
> Reported-by: Jesper Dangaard Brouer 
> Signed-off-by: Brenden Blanco 

Applied and queued up for -stable, thanks.
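
The arithmetic in the commit message can be checked with a short sketch. The function names are hypothetical and this is not the mlx4 code; the index of the first XDP tx ring (48) is an assumption inferred from the example, where packets received on rx0 are forwarded to tx48.

```c
/* Before the fix: completion vector is a plain modulo of the global tx ring index. */
static int toy_cq_vector_before(int tx_ring, int rx_ring_num)
{
	return tx_ring % rx_ring_num;
}

/*
 * After the fix, as described above: XDP tx rings are renumbered to start
 * again from 0 before taking the modulo, so the XDP ring paired with rx N
 * completes on the same vector as rx N. Regular tx rings are left alone.
 */
static int toy_cq_vector_after(int tx_ring, int first_xdp_tx_ring,
			       int rx_ring_num)
{
	if (tx_ring >= first_xdp_tx_ring)
		tx_ring -= first_xdp_tx_ring;
	return tx_ring % rx_ring_num;
}
```

For the 'rx 9 tx 8' case, toy_cq_vector_before(48, 9) gives 3, the mismatch from the report, while toy_cq_vector_after(48, 48, 9) gives 0 and so lines up with rx0.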


Re: [PATCH trivial] net: add bbr to config DEFAULT_TCP_CONG

2016-10-14 Thread David Miller
From: Markus Trippelsdorf 
Date: Fri, 14 Oct 2016 10:07:16 +0200

> On 2016.10.14 at 09:43 +0200, Eric Dumazet wrote:
>> On Fri, 2016-10-14 at 09:33 +0200, Markus Trippelsdorf wrote:
>> > While playing with BBR I noticed that it was missing in the list of
>> > possible config DEFAULT_TCP_CONG choices. Fixed thusly.
>> > 
>> > Signed-off-by: Markus Trippelsdorf 
>> > 
>> > diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
>> > index 300b06888fdf..b54b3ca939db 100644
>> > --- a/net/ipv4/Kconfig
>> > +++ b/net/ipv4/Kconfig
>> > @@ -715,6 +715,7 @@ config DEFAULT_TCP_CONG
>> >default "reno" if DEFAULT_RENO
>> >default "dctcp" if DEFAULT_DCTCP
>> >default "cdg" if DEFAULT_CDG
>> > +  default "bbr" if DEFAULT_BBR
>> >default "cubic"
>> 
>> Not sure if we want this at this moment.
>> 
>> BBR needs FQ packet scheduler, and this is not exactly trivial to
>> achieve.
> 
> For a start, it could be automatically selected:

Right but FQ has to be properly enabled and configured as well.


Re: [PATCH net-next 1/2] lwtunnel: Add destroy state operation

2016-10-14 Thread David Miller
From: Tom Herbert 
Date: Thu, 13 Oct 2016 17:57:42 -0700

> @@ -130,6 +130,19 @@ int lwtunnel_build_state(struct net_device *dev, u16 
> encap_type,
>  }
>  EXPORT_SYMBOL(lwtunnel_build_state);
>  
> +void  lwtstate_free(struct lwtunnel_state *lws)

There should only be one space between "void" and "lwtstate_free".


Re: [mac80211] BUG_ON with current -git (4.8.0-11417-g24532f7)

2016-10-14 Thread Ard Biesheuvel
On 14 October 2016 at 11:00, Johannes Berg  wrote:
>
>> So why is the performance hit acceptable for ESP but not for WPA? We
>> could easily implement the same thing, i.e.,
>> kmalloc(GFP_ATOMIC)/kfree the aead_req struct rather than allocate it
>> on the stack
>
> Yeah, maybe we should. It's likely a much bigger allocation, but I
> don't actually know if that affects speed.
>
> In most cases where you want high performance we never hit this anyway
> since we'll have hardware crypto. I know for our (Intel's) devices we
> normally never hit these code paths.
>
> But on the other hand, you also did your changes for a reason, and the
> only reason I can see of that is performance. So you'd be the one with
> most "skin in the game", I guess?
>

Well, what sucks here is that the accelerated driver I implemented for
arm64 does not actually need this, as long as we take aad[] off the
stack. And note that the API was changed since my patch, to add aad[]
to the scatterlist: prior to this change, it used
aead_request_set_assoc() to set the associated data separately.


Re: [PATCH net-next 1/2] net: phy: Add Speed downshift set driver for Microsemi PHYs.

2016-10-14 Thread Andrew Lunn
On Fri, Oct 14, 2016 at 05:10:32PM +0530, Raju Lakkaraju wrote:
> From: Raju Lakkaraju 
> 
> For operation in cabling environments that are incompatible with
> 1000BASE-T, the VSC8531 device provides an automatic link speed
> downshift operation. When enabled, the device automatically changes
> its 1000BASE-T auto-negotiation to the next slower speed after
> a configured number of failed attempts at 1000BASE-T.
> This feature is useful when setting up networks using older cable
> installations that include only pairs A and B, and not pairs C and D.

Any reason not to just turn this on by default when auto-neg is
enabled?

Andrew
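
The downshift behaviour in the quoted description can be modelled as a small counter-driven state machine. This is a sketch of the described behaviour only: the struct, the function names and the convention that a count of 0 disables the feature are assumptions for illustration, not the VSC8531 register interface or the phylib API.

```c
/* Speeds the toy model can advertise, in Mbit/s. */
enum toy_speed {
	TOY_SPEED_100 = 100,
	TOY_SPEED_1000 = 1000,
};

struct toy_downshift {
	int max_attempts;		/* configured failed attempts; 0 = downshift off (assumed) */
	int failed;			/* failed 1000BASE-T attempts so far */
	enum toy_speed advertised;	/* speed currently being negotiated */
};

static void toy_downshift_init(struct toy_downshift *ds, int max_attempts)
{
	ds->max_attempts = max_attempts;
	ds->failed = 0;
	ds->advertised = TOY_SPEED_1000;
}

/*
 * Record one failed link attempt and return the speed to try next.
 * Once the configured number of 1000BASE-T failures is reached, the model
 * drops to the next slower speed and stays there.
 */
static enum toy_speed toy_link_attempt_failed(struct toy_downshift *ds)
{
	if (ds->max_attempts == 0 || ds->advertised != TOY_SPEED_1000)
		return ds->advertised;

	if (++ds->failed >= ds->max_attempts)
		ds->advertised = TOY_SPEED_100;
	return ds->advertised;
}
```

With max_attempts set to 3, the first two failures keep the model at 1000 and the third moves it to 100, which is the situation a two-pair cable plant would trigger.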


Re: [PATCH v2] net: Require exact match for TCP socket lookups if dif is l3mdev

2016-10-14 Thread David Ahern
On 10/14/16 12:33 AM, Eric Dumazet wrote:
> There is a catch here.
> TCP moves IP6CB() in a different location.
> 
> Reference :
> 
> 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")

thanks for the reference.


> Problem is that the lookup can happen from IP early demux, before TCP
> moved IP{6}CB around.

For TCP we only need the exact_dif match for listen sockets, so early demux 
does not apply.

> 
> So you might need to let the caller pass IP6CB(skb)->flags (or
> TCP_SKB_CB(skb)->header.h6.flags ) instead of skb since
> inet6_exact_dif_match() does not know where to fetch the flags.
> 
> Same issue for IPv4.

I'll update the match functions to pull from TCP_SKB_CB instead of IP6CB and 
make a note of the above.

Thanks for the review
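
The layout problem discussed above can be illustrated with a toy model of the per-packet control block. Everything here is hypothetical (the toy_* structs, the flag bit and the field offsets) and heavily simplified compared with IP6CB() and TCP_SKB_CB(); the only point is that once the data has been moved, reading the flags at the old offset gives the wrong answer, which is why passing the flags into the match function is safer than passing the skb.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_L3_SLAVE 0x0001	/* hypothetical "arrived via an l3mdev slave" bit */

/* Hypothetical IP-layer view of the scratch area: flags at offset 0. */
struct toy_ip6cb {
	uint16_t flags;
};

/* Hypothetical TCP-layer view: TCP fields first, the IP data kept after them. */
struct toy_tcpcb {
	uint32_t seq;
	uint32_t end_seq;
	struct toy_ip6cb h6;
};

struct toy_skb {
	unsigned char cb[48];
};

/* Re-pack cb[] from the IP layout into the TCP layout. */
static void toy_move_to_tcp_layout(struct toy_skb *skb, uint32_t seq,
				   uint32_t end_seq)
{
	struct toy_ip6cb ip;
	struct toy_tcpcb tcp;

	memcpy(&ip, skb->cb, sizeof(ip));
	memset(&tcp, 0, sizeof(tcp));
	tcp.seq = seq;
	tcp.end_seq = end_seq;
	tcp.h6 = ip;
	memcpy(skb->cb, &tcp, sizeof(tcp));
}

/* Read the flags assuming the IP layout is still in place. */
static uint16_t toy_flags_ip_layout(const struct toy_skb *skb)
{
	struct toy_ip6cb ip;

	memcpy(&ip, skb->cb, sizeof(ip));
	return ip.flags;
}

/* Read the flags assuming the TCP layout is in place. */
static uint16_t toy_flags_tcp_layout(const struct toy_skb *skb)
{
	struct toy_tcpcb tcp;

	memcpy(&tcp, skb->cb, sizeof(tcp));
	return tcp.h6.flags;
}

/* The match takes the flags, so the caller picks the layout that is valid. */
static bool toy_exact_dif_match(uint16_t flags)
{
	return (flags & TOY_L3_SLAVE) != 0;
}
```

A caller on the early path would pass toy_flags_ip_layout(skb), and a caller that runs after the move would pass toy_flags_tcp_layout(skb); the match function itself never needs to know which one applies.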


Re: Need help with mdiobus_register and phy

2016-10-14 Thread Andrew Lunn
On Fri, Oct 14, 2016 at 07:49:56AM -0500, Timur Tabi wrote:
> Andrew Lunn wrote:
> >So are you seeing that the reads to MII_PHYSID1 and MII_PHYSID2 return
> >0x, when called from get_phy_id()?

Have you tried using the ethernet-phy-id device tree property? It
looks like that will allow you to skip get_phy_device and just create
the phy device. You can then bring the phy out of sleep in the probe
function?

Andrew


Re: Need help with mdiobus_register and phy

2016-10-14 Thread Timur Tabi

Andrew Lunn wrote:

> Does the datasheet say anything about this?
>
> I would say for this device, suspend() is too aggressive.


I'll have to find the datasheet.  Let me do some research and get back 
to you.  Thanks for your help so far.


--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.


Re: [PATCH] mac80211: aes_ccm: move struct aead_req off the stack

2016-10-14 Thread Johannes Berg
On Fri, 2016-10-14 at 14:13 +0100, Ard Biesheuvel wrote:
> 
> > But if we allocate things anyway, is it worth expending per-CPU
> > buffers on these?
> 
> Ehmm, maybe not. I could spin a v2 that allocates a bigger buffer,
> and copies aad[] into it as well

Copies in/out, I guess. Also there's B_0/J_0 for CCM/GCM, and the
'zero' thing that GMAC has.

> That does not help the other algos though

What do you mean?

johannes


Re: [PATCH net 0/3] qed: Fix dependencies and warnings series

2016-10-14 Thread David Miller
From: Yuval Mintz 
Date: Thu, 13 Oct 2016 22:57:00 +0300

> The first patch in this series follows Dan Carpenter's reports about
> Smatch warnings for recent qed additions and fixes those.
> 
> The second patch is the most significant one [and the reason this is
> intended for 'net'] - it's based on Arnd Bergmann's suggestion for fixing
> compilation issues that were introduced with the roce addition as a result
> of certain combinations of qed, qede and qedr Kconfig options.
> 
> The third follows the discussion with Arnd and clears a lot of the warnings
> that arise when compiling the drivers with "C=1".
> 
> Please consider applying this series to 'net'.

Series applied, thanks.


Re: [PATCH v2] net: Require exact match for TCP socket lookups if dif is l3mdev

2016-10-14 Thread David Ahern
On 10/14/16 6:21 AM, David Ahern wrote:
>> So you might need to let the caller pass IP6CB(skb)->flags (or
>> TCP_SKB_CB(skb)->header.h6.flags ) instead of skb since
>> inet6_exact_dif_match() does not know where to fetch the flags.
>>
>> Same issue for IPv4.
> 
> I'll update the match functions to pull from TCP_SKB_CB instead of IP6CB and 
> make a note of the above.

IPv6 does the move after the socket lookup where IPv4 does it before.



Re: [PATCH v3 net-next 0/7] qed*: driver updates

2016-10-14 Thread David Miller
From: Manish Chopra 
Date: Fri, 14 Oct 2016 05:19:16 -0400

> There are several new additions in this series;
> Most are connected to either Tx offloading or Rx classifications
> [either fastpath changes or supporting configuration].
> 
> In addition, there's a single IOV enhancement.
> 
> Please consider applying this series to `net-next'.
> 
> V2->V3:
> Fixes below kbuild warning
> call to '__compiletime_assert_60' declared with
> attribute error: Need native word sized stores/loads for atomicity.
> 
> V1->V2:
> Added a fix for the race in ramrod handling
> pointed out by Eric Dumazet [patch 7].

Series applied, thanks.

