Re: [RFC PATCH net] net/core: don't increment rx_dropped on inactive slaves

2016-01-25 Thread Jarod Wilson
On Mon, Jan 25, 2016 at 09:27:20AM -0500, Jarod Wilson wrote:
> On Sun, Jan 24, 2016 at 10:42:22PM -0800, David Miller wrote:
> > From: Jarod Wilson 
> > Date: Fri, 22 Jan 2016 14:11:22 -0500
> > 
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index 8cba3d8..1354c7b 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -4153,8 +4153,11 @@ ncls:
> > >   else
> > >   ret = pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
> > >   } else {
> > > + if (deliver_exact)
> > > + goto inactive; /* bond or team inactive slave */
> > >  drop:
> > >   atomic_long_inc(&skb->dev->rx_dropped);
> > > +inactive:
> > >   kfree_skb(skb);
> > >   /* Jamal, now you will not able to escape explaining
> > >* me how you were going to use this. :-)
> > 
> > I agree that rx_dropped is not the correct stat to bump here, but
> > I'm totally against the event disappearing completely into thin
> > air.
> > 
> > You have to replace the rx_dropped bump with _something_.
> > 
> > The only reason this hasn't been "fixed" yet is that everyone is
> > too damn lazy to implement that "something".
> 
> Would you want to see all things that shouldn't increment rx_dropped come
> in one shot, along with the four or so other counters, as discussed in the
> prior thread, or can they be done piecemeal? To date, I'm really only
> familiar with this particular case, and could probably get something
> together this week. To address the rest, I'd have to poke around a bit
> more and see what there is to see and do.

Spent a while hacking around today, now have this, p7p1 and p5p2 are
the inactive slaves in the bond:

[root@dell-per720-06 ~]# cat /proc/net/dev
Inter-|   Receive                                                       |  Transmit
 face |bytes    packets errs drop drop_i fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  p6p1:   16024     238    0    0      0    0     0          0       521        0       0    0    0    0     0       0          0
  p7p1: 1691386   16537    0    0  16568    0     0          0       488        0       0    0    0    0     0       0          0
  p7p2: 1709438   16718    0    0      0    0     0          0       561        0       0    0    0    0     0       0          0
 bond0: 6183056   63065    0    0  33151    0     0          0     13964    24747     193    0    0    0     0       0          0
  p4p1:       0       0    0    0      0    0     0          0         0        0       0    0    0    0     0       0          0
  p4p2:       0       0    0    0      0    0     0          0         0        0       0    0    0    0     0       0          0
    lo:    4928      50    0    0      0    0     0          0         0     4928      50    0    0    0     0       0          0
  p5p1: 2259498   23401    0    0      0    0     0          0      6740    24747     193    0    0    0     0       0          0
  p5p2: 2232172   23127    0    0  16583    0     0          0      6736        0       0    0    0    0     0       0          0
   em4: 2347251   18224    0    0      0    0     0          0        90     4541      47    0    0    0     0       0          0
   em2: 1590296   16061    0    0      0    0     0          0        81        0       0    0    0    0     0       0          0
   em1: 1590180   16060    0    0      0    0     0          0        79        0       0    0    0    0     0       0          0
   em3: 2343156   18209    0    0      0    0     0          0        94        0       0    0    0    0     0       0          0
[root@dell-per720-06 ~]# cat /sys/devices/virtual/net/bond0/statistics/rx_dropped_inactive
33181

Haven't yet thrown together anything for ethtool -S output as Eric had
suggested, but I'll dig into that tomorrow.
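
For anyone following along, the accounting split can be sketched in userspace C roughly as below. This is only an illustration and not the patch itself: struct rx_stats and count_rx_drop() are hypothetical names, and the deliver_exact argument stands in for the flag of the same name in the diff quoted above. The rx_dropped_inactive counter name matches the sysfs file shown.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-device counters; names mirror the stats shown above. */
struct rx_stats {
	atomic_long rx_dropped;          /* genuine drops */
	atomic_long rx_dropped_inactive; /* frames discarded on an inactive slave */
};

/*
 * Account for a frame that no protocol handler accepted.
 * deliver_exact is true when the frame arrived on an inactive
 * bond/team slave and was only eligible for exact-match delivery.
 */
static void count_rx_drop(struct rx_stats *stats, bool deliver_exact)
{
	if (deliver_exact)
		atomic_fetch_add(&stats->rx_dropped_inactive, 1);
	else
		atomic_fetch_add(&stats->rx_dropped, 1);
}
```

The event is still counted either way; it just no longer lands in rx_dropped when the slave is inactive.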

-- 
Jarod Wilson
ja...@redhat.com



Re: [BISECTED] v4.5-rc1 phylib regression

2016-01-25 Thread Andrew Lunn
On Mon, Jan 25, 2016 at 05:45:21PM +0200, Aaro Koskinen wrote:
> Hi,
> 
> I get the below crash on OCTEON (with octeon_mgmt interface, genphy)
> always during systemd boot.

Hi Aaro

I think I know what is going on now.

What does your phy look like in DT?

Thanks
Andrew


Re: [PATCH net] inet: frag: Always orphan skbs inside ip_defrag()

2016-01-25 Thread Eric Dumazet
On Mon, 2016-01-25 at 17:11 -0800, Joe Stringer wrote:

> Thanks, I can roll this into a v2 (or keep as a separate patch?). I
> got sidetracked on the IPv6 side, some other issues are blocking me on
> that but I intend to continue following up there as well.

No, don't worry, I will submit this in a separate patch.

Thanks.




Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection

2016-01-25 Thread zhuyj

On 01/26/2016 02:26 PM, zhuyj wrote:

On 01/26/2016 02:00 PM, Jay Vosburgh wrote:

zhuyj  wrote:


On 01/26/2016 08:43 AM, Jay Vosburgh wrote:

 wrote:


From: Zhu Yanjun 

Bonding will utilize notifier callbacks to detect slave
link state changes. It is intended to be used with miimon
set to zero, and does not support the updelay or downdelay
options to bonding.

Because of link flap from the slave interface, if the notifier
is NETDEV_UP while the actual link state is down, it is not
necessary to continue.

Signed-off-by: Jay Vosburgh 

I haven't signed off on this patch.

I've just started some testing, but as before immediately get an
RCU warning; it looks to be coming from bond_miimon_inspect_slave();

[  316.473050] bond1: Enslaving eth1 as a backup interface with an up link

[  316.473059]
[  316.473806] ===
[  316.475630] [ INFO: suspicious RCU usage. ]
[  316.477519] 4.4.0+ #38 Not tainted
[  316.479094] ---
[  316.480765] drivers/net/bonding/bond_main.c:2024 suspicious rcu_dereference_check() usage!


This is presumably because the "case NETDEV_DOWN" call to
bond_miimon_inspect_slave does not hold RCU.  It does hold RTNL, though,
which should be safe for this usage (RTNL mutexes changes to the active
slave).  The appended patch on top of the original makes the warning go
away.

I'm still testing the patch and have no comment about its
functionality as yet.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9f67948..e3faee9 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2014,14 +2014,14 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
 /* Monitoring
---*/
 
-/* called with rcu_read_lock() */
+/* called with rcu_read_lock() or RTNL */
 static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave,
 				     unsigned long event)
 {
 	int link_state;
 	bool ignore_updelay;
 
-	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
+	ignore_updelay = !rcu_dereference_rtnl(bond->curr_active_slave);

Thanks a lot.
Because kernel v4.4 needs this kind of patch, I backport this patch from
net-next to kernel v4.4.

If it is not appropriate, I will revert this patch.

I don't understand what you mean here.

I've tested the patch (with my above modification), and while I
seem to be hitting an unrelated bug in the ARP monitor, I believe this
patch will misbehave when the ARP monitor is running.

For example, if arp_interval=1000 and miimon=0, the link state
notifier callback will change a slave to up should a notifier event take
place.  So, hypothetically, if a slave is "down" according to the ARP
monitor (but actually carrier up), and then experience a carrier down
then up transition, the slave would be set to "up" even though the ARP
monitor believes it to be down.

I'm not able to induce the speedy link flap events, so I'm not
sure about this portion of the patch:

+	/* Because of link flap from the slave interface, it is possible that
+	 * the notifier is NETDEV_UP while the actual link state is down. If
+	 * so, it is not necessary to continue.
+	 */
+	switch (event) {
+	case NETDEV_UP:
+		if (!link_state)
+			return 0;
+		break;
+
+	case NETDEV_DOWN:
+		if (link_state)
+			return 0;
+		break;
+	}
+

Unless I misunderstood, Emil's comments elsewhere suggest that
the current ixgbe driver won't cause those, though.

This patch will avoid useless configuration because of link flap.

Hi, Jay

Sorry. My bad. If there is no link flap in the current ixgbe driver, this
patch is not necessary.;-)

Best Regards!
Zhu Yanjun



Hi, Emil

Does the current ixgbe driver not cause link flap?

Thanks a lot.
Zhu Yanjun



-J

---
-Jay Vosburgh, jay.vosbu...@canonical.com







[PATCH] fddi: Fixup potential uninitialized bars

2016-01-25 Thread Hannes Reinecke
dfx_get_bars() allocates the various bars, depending on the
bus type. But as the function itself returns void and there
is no default selection there is a risk of the function
returning without allocating any bars.
This patch moves the entries around so that PCI is assumed
to be the default bus, and adds a WARN_ON check in case that
should not be the case.
And I've made some minor code reshuffles to keep checkpatch
happy.

Signed-off-by: Hannes Reinecke 
---
 drivers/net/fddi/defxx.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/drivers/net/fddi/defxx.c b/drivers/net/fddi/defxx.c
index 7f975a2..5fcaf03 100644
--- a/drivers/net/fddi/defxx.c
+++ b/drivers/net/fddi/defxx.c
@@ -434,19 +434,10 @@ static void dfx_port_read_long(DFX_board_t *bp, int offset, u32 *data)
 static void dfx_get_bars(struct device *bdev,
 resource_size_t *bar_start, resource_size_t *bar_len)
 {
-   int dfx_bus_pci = dev_is_pci(bdev);
int dfx_bus_eisa = DFX_BUS_EISA(bdev);
int dfx_bus_tc = DFX_BUS_TC(bdev);
int dfx_use_mmio = DFX_MMIO || dfx_bus_tc;
 
-   if (dfx_bus_pci) {
-   int num = dfx_use_mmio ? 0 : 1;
-
-   bar_start[0] = pci_resource_start(to_pci_dev(bdev), num);
-   bar_len[0] = pci_resource_len(to_pci_dev(bdev), num);
-   bar_start[2] = bar_start[1] = 0;
-   bar_len[2] = bar_len[1] = 0;
-   }
if (dfx_bus_eisa) {
unsigned long base_addr = to_eisa_device(bdev)->base_addr;
resource_size_t bar_lo;
@@ -476,13 +467,25 @@ static void dfx_get_bars(struct device *bdev,
bar_len[1] = PI_ESIC_K_BURST_HOLDOFF_LEN;
bar_start[2] = base_addr + PI_ESIC_K_ESIC_CSR;
bar_len[2] = PI_ESIC_K_ESIC_CSR_LEN;
-   }
-   if (dfx_bus_tc) {
+   } else if (dfx_bus_tc) {
bar_start[0] = to_tc_dev(bdev)->resource.start +
   PI_TC_K_CSR_OFFSET;
bar_len[0] = PI_TC_K_CSR_LEN;
-   bar_start[2] = bar_start[1] = 0;
-   bar_len[2] = bar_len[1] = 0;
+   bar_start[1] = 0;
+   bar_len[1] = 0;
+   bar_start[2] = 0;
+   bar_len[2] = 0;
+   } else {
+   /* Assume PCI */
+   int num = dfx_use_mmio ? 0 : 1;
+
+   WARN_ON(!dev_is_pci(bdev));
+   bar_start[0] = pci_resource_start(to_pci_dev(bdev), num);
+   bar_len[0] = pci_resource_len(to_pci_dev(bdev), num);
+   bar_start[1] = 0;
+   bar_len[1] = 0;
+   bar_start[2] = 0;
+   bar_len[2] = 0;
}
 }
 
-- 
1.8.5.6
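
To illustrate why the chain is restructured, here is a small userspace sketch of the same control flow. It is an assumption-laden model rather than driver code: enum demo_bus, demo_get_bars() and the length values are made up for the example. The point is only that an if / else if / else chain with a default branch always writes every output, whereas three independent ifs can fall through without writing any.

```c
#include <stdbool.h>

/* Hypothetical bus identifiers for the illustration. */
enum demo_bus { DEMO_BUS_PCI, DEMO_BUS_EISA, DEMO_BUS_TC, DEMO_BUS_OTHER };

/*
 * Fill all three lengths on every path.
 * Returns true if the default branch was taken for a non-PCI bus,
 * which is the situation WARN_ON() flags in the patch.
 */
static bool demo_get_bars(enum demo_bus bus, unsigned long bar_len[3])
{
	bool unexpected = false;

	if (bus == DEMO_BUS_EISA) {
		bar_len[0] = 0x400;	/* placeholder values only */
		bar_len[1] = 0x4;
		bar_len[2] = 0x20;
	} else if (bus == DEMO_BUS_TC) {
		bar_len[0] = 0x80;
		bar_len[1] = 0;
		bar_len[2] = 0;
	} else {
		/* Assume PCI, but remember if that assumption was wrong. */
		unexpected = (bus != DEMO_BUS_PCI);
		bar_len[0] = 0x100;
		bar_len[1] = 0;
		bar_len[2] = 0;
	}
	return unexpected;
}
```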



Re: [PATCH] brcmfmac: sdio: Increase the default timeouts a bit

2016-01-25 Thread Arend van Spriel
On 26-01-16 00:41, Julian Calaby wrote:
> Hi Arend,
> 
> On Tue, Jan 26, 2016 at 2:39 AM, Arend van Spriel  wrote:
>> On 25-01-16 12:06, Julian Calaby wrote:
>>> Hi Sjoerd,
>>>
>>> On Mon, Jan 25, 2016 at 9:47 PM, Sjoerd Simons
>>>  wrote:
>>>> On a Radxa Rock2 board with an Ampak AP6335 (Broadcom 4339 core) it seems
>>>> the card responds very quickly most of the time, unfortunately during
>>>> initialisation it sometimes seems to take just a bit over 2 seconds to
>>>> respond.
>>>>
>>>> This results in initialization failing with messages like:
>>>>   brcmf_c_preinit_dcmds: Retreiving cur_etheraddr failed, -52
>>>>   brcmf_bus_start: failed: -52
>>>>   brcmf_sdio_firmware_callback: dongle is not responding
>>>>
>>>> Increasing the timeout to allow for a bit more headroom allows the
>>>> card to initialize reliably.
>>>>
>>>> A quick search online after diagnosing/fixing this showed that Google
>>>> has a similar patch in their ChromeOS tree, so this doesn't seem
>>>> specific to the board I'm using.
>>>>
>>>> Signed-off-by: Sjoerd Simons 
>>>
>>> Looks sane to me.
>>>
>>> Reviewed-by: Julian Calaby 
>>
>> Not really a cleanup patch :-p , but thanks for the review.
> 
> I'm trying to review any "small" patch from (relatively) new people.

And it is surely appreciated. Just read your reply in "cleanup patch
pile" thread and felt I had to make the stupid remark with just fun
intended.

Regards,
Arend


Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection

2016-01-25 Thread zhuyj

On 01/26/2016 02:00 PM, Jay Vosburgh wrote:

zhuyj  wrote:


On 01/26/2016 08:43 AM, Jay Vosburgh wrote:

 wrote:


From: Zhu Yanjun 

Bonding will utilize notifier callbacks to detect slave
link state changes. It is intended to be used with miimon
set to zero, and does not support the updelay or downdelay
options to bonding.

Because of link flap from the slave interface, if the notifier
is NETDEV_UP while the actual link state is down, it is not
necessary to continue.

Signed-off-by: Jay Vosburgh 

I haven't signed off on this patch.

I've just started some testing, but as before immediately get an
RCU warning; it looks to be coming from bond_miimon_inspect_slave();

[  316.473050] bond1: Enslaving eth1 as a backup interface with an up link
[  316.473059]
[  316.473806] ===
[  316.475630] [ INFO: suspicious RCU usage. ]
[  316.477519] 4.4.0+ #38 Not tainted
[  316.479094] ---
[  316.480765] drivers/net/bonding/bond_main.c:2024 suspicious rcu_dereference_check() usage!

This is presumably because the "case NETDEV_DOWN" call to
bond_miimon_inspect_slave does not hold RCU.  It does hold RTNL, though,
which should be safe for this usage (RTNL mutexes changes to the active
slave).  The appended patch on top of the original makes the warning go
away.

I'm still testing the patch and have no comment about its
functionality as yet.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9f67948..e3faee9 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2014,14 +2014,14 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
 /* Monitoring
---*/
 
-/* called with rcu_read_lock() */
+/* called with rcu_read_lock() or RTNL */
 static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave,
 				     unsigned long event)
 {
 	int link_state;
 	bool ignore_updelay;
 
-	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
+	ignore_updelay = !rcu_dereference_rtnl(bond->curr_active_slave);

Thanks a lot.
Because kernel v4.4 needs this kind of patch, I backport this patch from
net-next to kernel v4.4.

If it is not appropriate, I will revert this patch.

I don't understand what you mean here.

I've tested the patch (with my above modification), and while I
seem to be hitting an unrelated bug in the ARP monitor, I believe this
patch will misbehave when the ARP monitor is running.

For example, if arp_interval=1000 and miimon=0, the link state
notifier callback will change a slave to up should a notifier event take
place.  So, hypothetically, if a slave is "down" according to the ARP
monitor (but actually carrier up), and then experience a carrier down
then up transition, the slave would be set to "up" even though the ARP
monitor believes it to be down.

I'm not able to induce the speedy link flap events, so I'm not
sure about this portion of the patch:

+   /* Because of link flap from the slave interface, it is possible that
+    * the notifier is NETDEV_UP while the actual link state is down. If
+    * so, it is not necessary to continue.
+    */
+   switch (event) {
+   case NETDEV_UP:
+   if (!link_state)
+   return 0;
+   break;
+
+   case NETDEV_DOWN:
+   if (link_state)
+   return 0;
+   break;
+   }
+

Unless I misunderstood, Emil's comments elsewhere suggest that
the current ixgbe driver won't cause those, though.

This patch will avoid useless configuration because of link flap.

Hi, Emil

Does the current ixgbe driver not cause link flap?

Thanks a lot.
Zhu Yanjun



-J

---
-Jay Vosburgh, jay.vosbu...@canonical.com





[PATCH] net_sched: drr: check for NULL pointer in drr_dequeue

2016-01-25 Thread Bernie Harris
There are cases where qdisc_dequeue_peeked can return NULL, and the result
is dereferenced later on in the function.

Similarly to the other qdisc dequeue functions, check whether the skb
pointer is NULL and if it is, goto out.

Signed-off-by: Bernie Harris 
---
 net/sched/sch_drr.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c
index f26bdea..8086c3d 100644
--- a/net/sched/sch_drr.c
+++ b/net/sched/sch_drr.c
@@ -403,6 +403,8 @@ static struct sk_buff *drr_dequeue(struct Qdisc *sch)
if (len <= cl->deficit) {
cl->deficit -= len;
skb = qdisc_dequeue_peeked(cl->qdisc);
+   if (skb == NULL)
+   goto out;
if (cl->qdisc->q.qlen == 0)
list_del(&cl->alist);
 
-- 
2.7.0
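
As a rough illustration of the pattern, the following userspace sketch models the dequeue path with a single-slot queue. All names (demo_pkt, demo_class, demo_dequeue_peeked(), demo_drr_dequeue()) are hypothetical and only loosely follow the hunk above; the part that matters is that the result of the dequeue is tested before it is dereferenced.

```c
#include <stddef.h>

/* Hypothetical stand-ins for an skb and a DRR class. */
struct demo_pkt {
	unsigned int len;
};

struct demo_class {
	struct demo_pkt *head;		/* single-slot queue, NULL when empty */
	unsigned int deficit;
	unsigned long bytes_sent;
};

/* May return NULL, like qdisc_dequeue_peeked() in the description above. */
static struct demo_pkt *demo_dequeue_peeked(struct demo_class *cl)
{
	struct demo_pkt *pkt = cl->head;

	cl->head = NULL;
	return pkt;
}

/* Returns the dequeued packet, or NULL if nothing could be sent. */
static struct demo_pkt *demo_drr_dequeue(struct demo_class *cl, unsigned int len)
{
	struct demo_pkt *pkt;

	if (len > cl->deficit)
		return NULL;

	cl->deficit -= len;
	pkt = demo_dequeue_peeked(cl);
	if (pkt == NULL)		/* the check this patch adds */
		return NULL;

	cl->bytes_sent += pkt->len;	/* safe: pkt is known to be non-NULL */
	return pkt;
}
```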



Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection

2016-01-25 Thread Jay Vosburgh
zhuyj  wrote:

>On 01/26/2016 08:43 AM, Jay Vosburgh wrote:
>>  wrote:
>>
>>> From: Zhu Yanjun 
>>>
>>> Bonding will utilize notifier callbacks to detect slave
>>> link state changes. It is intended to be used with miimon
>>> set to zero, and does not support the updelay or downdelay
>>> options to bonding.
>>>
>>> Because of link flap from the slave interface, if the notifier
>>> is NETDEV_UP while the actual link state is down, it is not
>>> necessary to continue.
>>>
>>> Signed-off-by: Jay Vosburgh 
>>  I haven't signed off on this patch.
>>
>>  I've just started some testing, but as before immediately get an
>> RCU warning; it looks to be coming from bond_miimon_inspect_slave();
>>
>> [  316.473050] bond1: Enslaving eth1 as a backup interface with an up link
>> [  316.473059]
>> [  316.473806] ===
>> [  316.475630] [ INFO: suspicious RCU usage. ]
>> [  316.477519] 4.4.0+ #38 Not tainted
>> [  316.479094] ---
>> [  316.480765] drivers/net/bonding/bond_main.c:2024 suspicious rcu_dereference_check() usage!
>>
>>  This is presumably because the "case NETDEV_DOWN" call to
>> bond_miimon_inspect_slave does not hold RCU.  It does hold RTNL, though,
>> which should be safe for this usage (RTNL mutexes changes to the active
>> slave).  The appended patch on top of the original makes the warning go
>> away.
>>
>>  I'm still testing the patch and have no comment about its
>> functionality as yet.
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 9f67948..e3faee9 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -2014,14 +2014,14 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
>>  /* Monitoring
>> ---*/
>>  
>> -/* called with rcu_read_lock() */
>> +/* called with rcu_read_lock() or RTNL */
>>  static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave,
>>  				     unsigned long event)
>>  {
>>  	int link_state;
>>  	bool ignore_updelay;
>>  
>> -	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
>> +	ignore_updelay = !rcu_dereference_rtnl(bond->curr_active_slave);
>
>Thanks a lot.
>Because kernel v4.4 needs this kind of patch, I backport this patch from
>net-next to kernel v4.4.
>
>If it is not appropriate, I will revert this patch.

I don't understand what you mean here.

I've tested the patch (with my above modification), and while I
seem to be hitting an unrelated bug in the ARP monitor, I believe this
patch will misbehave when the ARP monitor is running.

For example, if arp_interval=1000 and miimon=0, the link state
notifier callback will change a slave to up should a notifier event take
place.  So, hypothetically, if a slave is "down" according to the ARP
monitor (but actually carrier up), and then experience a carrier down
then up transition, the slave would be set to "up" even though the ARP
monitor believes it to be down.
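
To make the conflict concrete, here is a minimal userspace sketch of one possible guard. It is purely hypothetical and is not something proposed in this thread: struct demo_bond_params and demo_notifier_may_set_link() are invented names, and the only assumption is that the notifier path should stay out of the way whenever the ARP monitor (or miimon) owns the link state.

```c
#include <stdbool.h>

/* Hypothetical subset of the bonding parameters discussed above. */
struct demo_bond_params {
	int miimon;		/* MII polling interval in ms, 0 = disabled */
	int arp_interval;	/* ARP monitor interval in ms, 0 = disabled */
};

/*
 * Decide whether a carrier notifier event may change a slave's link state.
 * If either monitor is enabled, that monitor is the authority and the
 * notifier must not override its view.
 */
static bool demo_notifier_may_set_link(const struct demo_bond_params *p)
{
	return p->miimon == 0 && p->arp_interval == 0;
}
```

With arp_interval=1000 and miimon=0, as in the example above, the guard returns false and the notifier event would be ignored.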

I'm not able to induce the speedy link flap events, so I'm not
sure about this portion of the patch:

+   /* Because of link flap from the slave interface, it is possible that
+    * the notifier is NETDEV_UP while the actual link state is down. If
+    * so, it is not necessary to continue.
+    */
+   switch (event) {
+   case NETDEV_UP:
+   if (!link_state)
+   return 0;
+   break;
+
+   case NETDEV_DOWN:
+   if (link_state)
+   return 0;
+   break;
+   }
+

Unless I misunderstood, Emil's comments elsewhere suggest that
the current ixgbe driver won't cause those, though.

-J

---
-Jay Vosburgh, jay.vosbu...@canonical.com


Re: [net 0/2][pull request] Intel Wired LAN Driver Updates 2016-01-25

2016-01-25 Thread David Miller
From: Jeff Kirsher 
Date: Mon, 25 Jan 2016 15:58:50 -0800

> This series contains updates to i40e only and so I won't continue receiving
> patches to fix the same issue (again).
> 
> Arnd fixes the driver from causing the compiler whining about uninitialized
> variables, so initialize those variables.
> 
> Eric fixes the build errors/warnings which were introduced by Anjali
> when she added geneve support to i40e.

Pulled, thanks Jeff.


Re: [PATCH] net: take care of bonding in build_skb_flow_key (v4)

2016-01-25 Thread zhuyj

On 01/22/2016 02:52 PM, Jiri Pirko wrote:

Fri, Jan 22, 2016 at 05:21:28AM CET, wen.gang.w...@oracle.com wrote:


On 2016-01-21 16:35, Jiri Pirko wrote:

Thu, Jan 21, 2016 at 06:32:58AM CET, wen.gang.w...@oracle.com wrote:

In a bonding setup, we determine the fragment size according to the MTU and
PMTU associated with the bonding master. If the slave finds the fragment
size is too big, it drops the fragment and calls ip_rt_update_pmtu(),
passing _skb_ and _pmtu_, trying to update the path MTU.
The problem is that the target device that ip_rt_update_pmtu() actually
tries to update is the slave (skb->dev), not the master. Since no
PMTU change happens on the master, the fragment size for later packets doesn't
change, so all later fragments/packets are dropped too.

The fix is to let build_skb_flow_key() handle the transition of the
device index from the bonding slave to the master. That makes the master
the target device whose PMTU ip_rt_update_pmtu() tries to update.

Signed-off-by: Wengang Wang 
---
net/ipv4/route.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 85f184e..7e766b5 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -524,10 +524,19 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
{
const struct iphdr *iph = ip_hdr(skb);
int oif = skb->dev->ifindex;
+   struct net_device *master;
u8 tos = RT_TOS(iph->tos);
u8 prot = iph->protocol;
u32 mark = skb->mark;

+   if (netif_is_bond_slave(skb->dev)) {
+   rcu_read_lock();
+   master = netdev_master_upper_dev_get_rcu(skb->dev);
+   if (master)
+   oif = master->ifindex;
+   rcu_read_unlock();
+   }

This is certainly not correct as it should not be bond-specific but
rather generic.

Then what you would suggest to fix it?

Note that you may have bond over bond or bridge over
bond or other scenarios, which this patch ignores.

I don't think bond over bond is a good configuration. Do you have a real use
case for that configuration?

Stacking of multiple master devices is absolutely common.

You have to go in the upper tree all the way up, for all master device
types.
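
A rough userspace sketch of what "all the way up" means is below. It is only a model under stated assumptions: struct demo_netdev and demo_top_master_ifindex() are hypothetical, and each device is assumed to have at most one master, reachable through a plain pointer. Real kernel code would use the netdev upper-device helpers under the appropriate locking instead.

```c
#include <stddef.h>

/* Hypothetical model of a net device with at most one master above it. */
struct demo_netdev {
	int ifindex;
	struct demo_netdev *master;	/* upper master device, NULL if none */
};

/*
 * Follow the master chain to the topmost device and return its ifindex.
 * Works for any stacking depth and any master type (bond, team, bridge),
 * because it never looks at what kind of device it is walking through.
 */
static int demo_top_master_ifindex(const struct demo_netdev *dev)
{
	while (dev->master)
		dev = dev->master;
	return dev->ifindex;
}
```

For a slave under a bond that is itself a bridge port, this yields the bridge's ifindex rather than the bond's.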

I am not sure whether the following will work.
It is just a test patch.

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 85f184e..12b4982 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -523,10 +523,19 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
                               const struct sock *sk)
 {
const struct iphdr *iph = ip_hdr(skb);
-   int oif = skb->dev->ifindex;
+   struct net_device *master = NULL;
u8 tos = RT_TOS(iph->tos);
u8 prot = iph->protocol;
u32 mark = skb->mark;
+   int oif = skb->dev->ifindex;
+
+   if (skb->dev->flags & IFF_SLAVE) {
+   rcu_read_lock();
+   master = skb_dst(skb)->dev;
+   if (master)
+   oif = master->ifindex;
+   rcu_read_unlock();
+   }

__build_flow_key(fl4, sk, iph, oif, tos, prot, mark, 0);
 }

Thanks a lot.
Zhu Yanjun




thanks,
wengang





Re: net/sctp: out-of-bounds access in sctp_add_bind_addr

2016-01-25 Thread Neil Horman
On Mon, Jan 25, 2016 at 03:02:38PM +0100, Dmitry Vyukov wrote:
> Hello,
> 
> I've got the following error report while running syzkaller fuzzer:
> 
> ==
> BUG: KASAN: slab-out-of-bounds in memcpy+0x1d/0x40 at addr 88006c6361e8
> Read of size 28 by task syz-executor/12551
> =
> BUG kmalloc-16 (Not tainted): kasan: bad access detected
> -
> 
> INFO: Allocated in sctp_setsockopt_bindx+0xd2/0x3e0 age=12 cpu=2 pid=12551
> [< inline >] kmalloc include/linux/slab.h:468
> [<  none  >] sctp_setsockopt_bindx+0xd2/0x3e0 net/sctp/socket.c:975
> [<  none  >] sctp_setsockopt+0x1493/0x3630 net/sctp/socket.c:3711
> [<  none  >] sock_common_setsockopt+0x97/0xd0 net/core/sock.c:2620
> [< inline >] SYSC_setsockopt net/socket.c:1752
> [<  none  >] SyS_setsockopt+0x15b/0x250 net/socket.c:1731
> [<  none  >] entry_SYSCALL_64_fastpath+0x16/0x7a
> arch/x86/entry/entry_64.S:185
> 
> INFO: Slab 0xea0001b18d80 objects=16 used=4 fp=0x88006c6376e0
> flags=0x5fffc004080
> INFO: Object 0x88006c6361e8 @offset=488 fp=0x0002
> Bytes b4 88006c6361d8: 00 00 00 00 00 00 00 00 2f 98 34 88 ff ff
> ff ff  /.4.
> Object 88006c6361e8: 02 00 00 00 00 00 00 00 02 00 ab 07 7f 00 00
> 01  
> CPU: 2 PID: 12551 Comm: syz-executor Tainted: GB   4.5.0-rc1+ #278
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>   880036397928 8299a02d 88003e807900
>  88006c6361e8 88006c636000 880036397958 81752814
>  88003e807900 ea0001b18d80 88006c6361e8 88006c6361e8
> 
> Call Trace:
>  [] __asan_loadN+0x124/0x1a0 mm/kasan/kasan.c:512
>  [] memcpy+0x1d/0x40 mm/kasan/kasan.c:297
>  [] sctp_add_bind_addr+0xa9/0x270 net/sctp/bind_addr.c:162
>  [] sctp_do_bind+0x336/0x580 net/sctp/socket.c:389
>  [] sctp_bindx_add+0xac/0x1a0 net/sctp/socket.c:471
>  [] sctp_setsockopt_bindx+0x2f8/0x3e0 net/sctp/socket.c:1010
>  [] sctp_setsockopt+0x1493/0x3630 net/sctp/socket.c:3711
>  [] sock_common_setsockopt+0x97/0xd0 net/core/sock.c:2620
>  [< inline >] SYSC_setsockopt net/socket.c:1752
>  [] SyS_setsockopt+0x15b/0x250 net/socket.c:1731
>  [] entry_SYSCALL_64_fastpath+0x16/0x7a
> arch/x86/entry/entry_64.S:185
> 
> Memory state around the buggy address:
>  88006c636080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  88006c636100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >88006c636180: fc fc fc fc fc fc fc fc fc fc fc fc fc 00 00 fc
> ^
>  88006c636200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  88006c636280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ==
> 
> 
> sctp_setsockopt_bindx verifies that the user-passed address has valid
> len for the specified family, but then sctp_add_bind_addr copies whole
> sctp_addr from there. This causes heap out-of-bounds access and can
> crash kernel. Not sure if it is possible to copy out the trailing
> garbage to user-space later.
> 

It does more than that though.  sctp_setsockopt_bindx checks the following:
1) That passed addr_size is greater than zero
2) that the entire range of memory between addrs and addrs+addr_size is readable
3) That at least one address structure worth of data is available (implicit in
the while (walk_size < addr_size) loop).

Could one of the sockaddr_len fields in one of the addresses have been mangled
so that it appeared shorter in the while loop from (3), so that a copy of
sizeof(sctp_addr) in sctp_add_bind_addr overran the allocated memory?

Neil

> On commit 92e963f50fc74041b5e9e744c330dca48e04f08d (Jan 25).


Re: net/sctp: out-of-bounds access in sctp_add_bind_addr

2016-01-25 Thread Dmitry Vyukov
On Mon, Jan 25, 2016 at 3:31 PM, Neil Horman  wrote:
> On Mon, Jan 25, 2016 at 03:02:38PM +0100, Dmitry Vyukov wrote:
>> Hello,
>>
>> I've got the following error report while running syzkaller fuzzer:
>>
>> ==
>> BUG: KASAN: slab-out-of-bounds in memcpy+0x1d/0x40 at addr 88006c6361e8
>> Read of size 28 by task syz-executor/12551
>> =
>> BUG kmalloc-16 (Not tainted): kasan: bad access detected
>> -
>>
>> INFO: Allocated in sctp_setsockopt_bindx+0xd2/0x3e0 age=12 cpu=2 pid=12551
>> [< inline >] kmalloc include/linux/slab.h:468
>> [<  none  >] sctp_setsockopt_bindx+0xd2/0x3e0 net/sctp/socket.c:975
>> [<  none  >] sctp_setsockopt+0x1493/0x3630 net/sctp/socket.c:3711
>> [<  none  >] sock_common_setsockopt+0x97/0xd0 net/core/sock.c:2620
>> [< inline >] SYSC_setsockopt net/socket.c:1752
>> [<  none  >] SyS_setsockopt+0x15b/0x250 net/socket.c:1731
>> [<  none  >] entry_SYSCALL_64_fastpath+0x16/0x7a
>> arch/x86/entry/entry_64.S:185
>>
>> INFO: Slab 0xea0001b18d80 objects=16 used=4 fp=0x88006c6376e0
>> flags=0x5fffc004080
>> INFO: Object 0x88006c6361e8 @offset=488 fp=0x0002
>> Bytes b4 88006c6361d8: 00 00 00 00 00 00 00 00 2f 98 34 88 ff ff
>> ff ff  /.4.
>> Object 88006c6361e8: 02 00 00 00 00 00 00 00 02 00 ab 07 7f 00 00
>> 01  
>> CPU: 2 PID: 12551 Comm: syz-executor Tainted: GB   4.5.0-rc1+ 
>> #278
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>   880036397928 8299a02d 88003e807900
>>  88006c6361e8 88006c636000 880036397958 81752814
>>  88003e807900 ea0001b18d80 88006c6361e8 88006c6361e8
>>
>> Call Trace:
>>  [] __asan_loadN+0x124/0x1a0 mm/kasan/kasan.c:512
>>  [] memcpy+0x1d/0x40 mm/kasan/kasan.c:297
>>  [] sctp_add_bind_addr+0xa9/0x270 net/sctp/bind_addr.c:162
>>  [] sctp_do_bind+0x336/0x580 net/sctp/socket.c:389
>>  [] sctp_bindx_add+0xac/0x1a0 net/sctp/socket.c:471
>>  [] sctp_setsockopt_bindx+0x2f8/0x3e0 
>> net/sctp/socket.c:1010
>>  [] sctp_setsockopt+0x1493/0x3630 net/sctp/socket.c:3711
>>  [] sock_common_setsockopt+0x97/0xd0 net/core/sock.c:2620
>>  [< inline >] SYSC_setsockopt net/socket.c:1752
>>  [] SyS_setsockopt+0x15b/0x250 net/socket.c:1731
>>  [] entry_SYSCALL_64_fastpath+0x16/0x7a
>> arch/x86/entry/entry_64.S:185
>>
>> Memory state around the buggy address:
>>  88006c636080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>  88006c636100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> >88006c636180: fc fc fc fc fc fc fc fc fc fc fc fc fc 00 00 fc
>> ^
>>  88006c636200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>  88006c636280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> ==
>>
>>
>> sctp_setsockopt_bindx verifies that the user-passed address has valid
>> len for the specified family, but then sctp_add_bind_addr copies whole
>> sctp_addr from there. This causes heap out-of-bounds access and can
>> crash kernel. Not sure if it is possible to copy out the trailing
>> garbage to user-space later.
>>
>
> It does more than that though.  sctp_setsockopt_bindx checks the following:
> 1) That passed addr_size is greater than zero
> 2) that the entire range of memory between addrs and addrs+addr_size is 
> readable
> 3) That at least one address structure worth of data is available (implicit in
> the while (walk_size < addr_size) loop).
>
> Could one of the sockaddr_len fields in one of the addresses have been mangled
> so that it appeared shorter in the while loop from (3), so that a copy of
> sizeof(sctp_addr) in sctp_add_bind_addr overran the allocated memory?

I may be missing something, but what I see is:

1. we check that there is at least family:
if (walk_size + sizeof(sa_family_t) > addrs_size) {

2. get family descriptor:
af = sctp_get_af_specific(sa_addr->sa_family);

3. check that the address size is enough to hold the declared family:
if (!af || (walk_size + af->sockaddr_len) > addrs_size) {

4. then we do sctp_add_bind_addr, which copies whole sctp_addr from addr:

int sctp_add_bind_addr(struct sctp_bind_addr *bp, union sctp_addr *new,
...
memcpy(&addr->a, new, sizeof(*new));

Now imagine that the addr is ipv4 (16 or so bytes, that's what we
checked) and we copy 28 bytes (ipv6) from addr.
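
To make the size mismatch concrete, here is a small userspace sketch of a copy that is bounded by the validated per-family length. It is an illustration only and not a proposed fix: union demo_addr, demo_sockaddr_len() and demo_copy_addr() are hypothetical names that loosely model union sctp_addr and af->sockaddr_len.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical stand-in for union sctp_addr. */
union demo_addr {
	struct sockaddr sa;
	struct sockaddr_in v4;
	struct sockaddr_in6 v6;
};

/* Length that the validation step checked for this family, 0 if unsupported. */
static size_t demo_sockaddr_len(sa_family_t family)
{
	switch (family) {
	case AF_INET:
		return sizeof(struct sockaddr_in);
	case AF_INET6:
		return sizeof(struct sockaddr_in6);
	default:
		return 0;
	}
}

/*
 * Copy only as many bytes as were validated for the family and zero the
 * rest of the destination, instead of copying sizeof(union demo_addr).
 * Returns the number of bytes read from src.
 */
static size_t demo_copy_addr(union demo_addr *dst, const void *src,
			     sa_family_t family)
{
	size_t len = demo_sockaddr_len(family);

	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, len);
	return len;
}
```

With an AF_INET source, only sizeof(struct sockaddr_in) bytes are read, so a buffer that holds exactly one IPv4 address is never read past its end.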


Re: net/sctp: out-of-bounds access in sctp_add_bind_addr

2016-01-25 Thread Marcelo Ricardo Leitner
On Mon, Jan 25, 2016 at 03:42:14PM +0100, Dmitry Vyukov wrote:
> On Mon, Jan 25, 2016 at 3:31 PM, Neil Horman  wrote:
> > On Mon, Jan 25, 2016 at 03:02:38PM +0100, Dmitry Vyukov wrote:
> >> Hello,
> >>
> >> I've got the following error report while running syzkaller fuzzer:
> >>
> >> ==
> >> BUG: KASAN: slab-out-of-bounds in memcpy+0x1d/0x40 at addr 88006c6361e8
> >> Read of size 28 by task syz-executor/12551
> >> =
> >> BUG kmalloc-16 (Not tainted): kasan: bad access detected
> >> -
> >>
> >> INFO: Allocated in sctp_setsockopt_bindx+0xd2/0x3e0 age=12 cpu=2 pid=12551
> >> [< inline >] kmalloc include/linux/slab.h:468
> >> [<  none  >] sctp_setsockopt_bindx+0xd2/0x3e0 net/sctp/socket.c:975
> >> [<  none  >] sctp_setsockopt+0x1493/0x3630 net/sctp/socket.c:3711
> >> [<  none  >] sock_common_setsockopt+0x97/0xd0 net/core/sock.c:2620
> >> [< inline >] SYSC_setsockopt net/socket.c:1752
> >> [<  none  >] SyS_setsockopt+0x15b/0x250 net/socket.c:1731
> >> [<  none  >] entry_SYSCALL_64_fastpath+0x16/0x7a
> >> arch/x86/entry/entry_64.S:185
> >>
> >> INFO: Slab 0xea0001b18d80 objects=16 used=4 fp=0x88006c6376e0
> >> flags=0x5fffc004080
> >> INFO: Object 0x88006c6361e8 @offset=488 fp=0x0002
> >> Bytes b4 88006c6361d8: 00 00 00 00 00 00 00 00 2f 98 34 88 ff ff
> >> ff ff  /.4.
> >> Object 88006c6361e8: 02 00 00 00 00 00 00 00 02 00 ab 07 7f 00 00
> >> 01  
> >> CPU: 2 PID: 12551 Comm: syz-executor Tainted: GB   4.5.0-rc1+ 
> >> #278
> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 
> >> 01/01/2011
> >>   880036397928 8299a02d 88003e807900
> >>  88006c6361e8 88006c636000 880036397958 81752814
> >>  88003e807900 ea0001b18d80 88006c6361e8 88006c6361e8
> >>
> >> Call Trace:
> >>  [] __asan_loadN+0x124/0x1a0 mm/kasan/kasan.c:512
> >>  [] memcpy+0x1d/0x40 mm/kasan/kasan.c:297
> >>  [] sctp_add_bind_addr+0xa9/0x270 
> >> net/sctp/bind_addr.c:162
> >>  [] sctp_do_bind+0x336/0x580 net/sctp/socket.c:389
> >>  [] sctp_bindx_add+0xac/0x1a0 net/sctp/socket.c:471
> >>  [] sctp_setsockopt_bindx+0x2f8/0x3e0 
> >> net/sctp/socket.c:1010
> >>  [] sctp_setsockopt+0x1493/0x3630 net/sctp/socket.c:3711
> >>  [] sock_common_setsockopt+0x97/0xd0 net/core/sock.c:2620
> >>  [< inline >] SYSC_setsockopt net/socket.c:1752
> >>  [] SyS_setsockopt+0x15b/0x250 net/socket.c:1731
> >>  [] entry_SYSCALL_64_fastpath+0x16/0x7a
> >> arch/x86/entry/entry_64.S:185
> >>
> >> Memory state around the buggy address:
> >>  88006c636080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >>  88006c636100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >> >88006c636180: fc fc fc fc fc fc fc fc fc fc fc fc fc 00 00 fc
> >> ^
> >>  88006c636200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >>  88006c636280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >> ==
> >>
> >>
> >> sctp_setsockopt_bindx verifies that the user-passed address has valid
> >> len for the specified family, but then sctp_add_bind_addr copies whole
> >> sctp_addr from there. This causes heap out-of-bounds access and can
> >> crash kernel. Not sure if it is possible to copy out the trailing
> >> garbage to user-space later.
> >>
> >
> > It does more than that though.  sctp_setsockopt_bindx checks the following:
> > 1) That passed addr_size is greater than zero
> > > 2) that the entire range of memory between addrs and addrs+addr_size is
> > > readable
> > > 3) That at least one address structure worth of data is available (implicit
> > > in the while (walk_size < addr_size) loop).
> > >
> > > Could one of the sockaddr_len fields in one of the addresses have been
> > > mangled so that it appeared shorter in the while loop from (3), so that a
> > > copy of sizeof(sctp_addr) in sctp_add_bind_addr overruns the allocated memory?
> 
> I may be missing something, but what I see is:
> 
> 1. we check that there is at least family:
> if (walk_size + sizeof(sa_family_t) > addrs_size) {
> 
> 2. get family descriptor:
> af = sctp_get_af_specific(sa_addr->sa_family);
> 
> 3. check that the address size is enough to hold the declared family:
> if (!af || (walk_size + af->sockaddr_len) > addrs_size) {
> 
> 4. then we do sctp_add_bind_addr, which copies whole sctp_addr from addr:
> 
> int sctp_add_bind_addr(struct sctp_bind_addr *bp, union sctp_addr *new,
> ...
> memcpy(&addr->a, new, sizeof(*new));
> 
> Now imagine that the addr is ipv4 (16 or so bytes, that's what we
> checked) and we copy 28 bytes 

Bypass at packet-page level (Was: Optimizing instruction-cache, more packets at each stage)

2016-01-25 Thread Jesper Dangaard Brouer

After reading John's reply about perfect filters, I want to restate
my idea for this very early RX stage, and describe a packet-page
level bypass use-case that John indirectly mentions.


There are two ideas getting mixed up here: (1) bundling from the
RX-ring, and (2) allowing to pick up the "packet-page" directly.

Bundling (1) is something that seems natural, and which helps us
amortize the cost between layers (and utilizes the icache better). Let's
keep that in another thread.

This (2) direct forward of "packet-pages" is a fairly extreme idea,
BUT it has the potential of being a new integration point for
"selective" bypass-solutions and bringing RAW/af_packet (RX) up to
speed with bypass-solutions.


Today, the bypass-solutions grab and control the entire NIC HW.  In
many cases this is not very practical, if you also want to use the NIC
for something else.

Solutions for bypassing only part of the traffic are starting to show
up.  There is both a netmap[1] and a DPDK[2] based approach.

[1] https://blog.cloudflare.com/partial-kernel-bypass-merged-netmap/
[2] http://rhelblog.redhat.com/2015/10/02/getting-the-best-of-both-worlds-with-queue-splitting-bifurcated-driver/

Both approaches install a HW filter in the NIC, and redirect packets
to a separate RX HW queue (via ethtool ntuple + flow-type).  DPDK
needs PCI SR-IOV setup and then runs its own poll-mode driver on top.
Netmap patches the original ixgbe driver and, since CloudFlare/Gilberto's
changes[3], supports a single RX queue mode.

[3] https://github.com/luigirizzo/netmap/pull/87


I'm thinking, why run all this extra driver software on top?  Why
don't we just pick up the (packet)-page from the RX ring, and
hand it over to a registered bypass handler?  (As mentioned before,
the HW descriptor needs to somehow "mark" these packets for us.)

I imagine some kind of page ring structure, and I also imagine
RAW/af_packet being a "bypass" consumer.  I guess the af_packet part
was also something John and Daniel have been looking at.
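
To make the "page ring plus registered bypass handler" idea a bit more concrete, here is a minimal userspace sketch. Everything in it is hypothetical: the `demo_` names, the ring size, and the handler signature are assumptions made for illustration and do not correspond to any existing kernel API. It only shows the shape of the hand-over: the RX side pushes marked pages into a ring, and a registered consumer (af_packet could be one) drains them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_RING_SIZE 8u	/* assumed size, must be a power of two */
static_assert((DEMO_RING_SIZE & (DEMO_RING_SIZE - 1)) == 0,
	      "ring size must be a power of two");

/* Hypothetical "packet-page": raw data plus the received length. */
struct demo_page {
	unsigned char data[4096];
	unsigned int len;
};

/* Hypothetical bypass consumer; returns true if it accepted the page. */
typedef bool (*demo_bypass_fn)(struct demo_page *page, void *ctx);

struct demo_page_ring {
	struct demo_page *slot[DEMO_RING_SIZE];
	unsigned int head;	/* next slot to fill */
	unsigned int tail;	/* next slot to drain */
	demo_bypass_fn handler;
	void *ctx;
};

static void demo_ring_register(struct demo_page_ring *r,
			       demo_bypass_fn fn, void *ctx)
{
	r->handler = fn;
	r->ctx = ctx;
}

/*
 * RX side: queue a page that the HW descriptor marked for bypass.
 * Returns false when no handler is registered or the ring is full,
 * in which case the caller would fall back to the normal stack path.
 */
static bool demo_ring_push(struct demo_page_ring *r, struct demo_page *page)
{
	if (!r->handler || r->head - r->tail == DEMO_RING_SIZE)
		return false;
	r->slot[r->head % DEMO_RING_SIZE] = page;
	r->head++;
	return true;
}

/* Hand every queued page to the handler; returns how many it accepted. */
static unsigned int demo_ring_drain(struct demo_page_ring *r)
{
	unsigned int taken = 0;

	while (r->tail != r->head) {
		struct demo_page *page = r->slot[r->tail % DEMO_RING_SIZE];

		r->tail++;
		if (r->handler && r->handler(page, r->ctx))
			taken++;
	}
	return taken;
}

/* Example consumer: counts pages in the unsigned int that ctx points to. */
static bool demo_count_handler(struct demo_page *page, void *ctx)
{
	(void)page;
	(*(unsigned int *)ctx)++;
	return true;
}
```

A push with no handler registered fails, which models the "not marked for bypass, use the normal path" case; a real design would also need locking or a lockless single-producer/single-consumer protocol, which this sketch leaves out.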


(top post, but left John's reply below, because it got me thinking)
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer




On Sun, 24 Jan 2016 09:28:36 -0800
John Fastabend  wrote:

> On 16-01-24 06:44 AM, Michael S. Tsirkin wrote:
> > On Sun, Jan 24, 2016 at 03:28:14PM +0100, Jesper Dangaard Brouer wrote:  
> >> On Thu, 21 Jan 2016 10:54:01 -0800 (PST)
> >> David Miller  wrote:
> >>  
> >>> From: Jesper Dangaard Brouer 
> >>> Date: Thu, 21 Jan 2016 12:27:30 +0100
> >>>  
[...]

> >>
> >> BUT then I realized, what if we take this even further.  What if we
> >> actually use this information, for something useful, at this very
> >> early RX stage.
> >>
> >> The information I'm interested in, from the HW descriptor, is if this
> >> packet is NOT for local delivery.  If so, we can send the packet on a
> >> "fast-forward" code path.
> >>
> >> Think about bridging packets to a guest OS.  Because we know very
> >> early at RX (from packet HW descriptor) we might even avoid allocating
> >> a SKB.  We could just "forward" the packet-page to the guest OS.  
> > 
> > OK, so you would build a new kind of rx handler, and then
> > e.g. macvtap could maybe get packets this way?
> > Sure - e.g. vhost expects an skb at the moment
> > but it won't be too hard to teach it that there's
> > some other option.  
> 
> + Daniel, Vlad
> 
> If you use the macvtap device with the offload features you can "know"
> via mac address that all packets on a specific hardware queue set belong
> to a specific guest. (the queues are bound to a new netdev) This works
> well with the passthru mode of macvlan. So you can do hardware bridging
> this way. Supporting similar L3 modes probably not via macvlan has been
> on my todo list for awhile but I haven't got there yet. ixgbe and fm10k
> intel drivers support this now maybe others but those are the two I've
> worked with recently.
> 
> The idea here is you remove any overhead from running bridge code, etc.
> but still allowing users to stick netfilter, qos, etc hooks in the
> datapath.
> 
> Also Daniel and I started working on a zero-copy RX mode which would
> further help this by letting vhost-net pass down a set of dma buffers
> we should probably get this working and submit it. iirc Vlad also
> had the same sort of idea. The initial data for this looked good but
> not as good as the solution below. However it had a similar issue as
> below in that you just jumped over netfilter, qos, etc. Our initial
> implementation used af_packet.
> 
> > 
> > Or maybe some kind of stub skb that just has
> > the correct length but no data is easier,
> > I'm not sure.
> >   
> 
> Another option is to use perfect filters to push traffic to a VF and
> then map the VF into user space and use the vhost dpdk bits. This
> works fairly well and gets pkts into the 

Re: [PATCH] ipv4+ipv6: Make INET*_ESP select CRYPTO_ECHAINIV

2016-01-25 Thread Herbert Xu
On Mon, Jan 25, 2016 at 12:58:44PM +0100, Thomas Egerer wrote:
> The ESP algorithms using CBC mode require echainiv. Hence INET*_ESP have
> to select CRYPTO_ECHAINIV in order to work properly. This solves the
> issues caused by a misconfiguration as described in [1].
> The original approach, patching crypto/Kconfig was turned down by
> Herbert Xu [2].
> 
> [1] https://lists.strongswan.org/pipermail/users/2015-December/009074.html
> [2] http://marc.info/?l=linux-crypto-vger&m=145224655809562&w=2
> 
> Signed-off-by: Thomas Egerer 

Acked-by: Herbert Xu 
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


net/sctp: out-of-bounds access in sctp_add_bind_addr

2016-01-25 Thread Dmitry Vyukov
Hello,

I've got the following error report while running syzkaller fuzzer:

==
BUG: KASAN: slab-out-of-bounds in memcpy+0x1d/0x40 at addr 88006c6361e8
Read of size 28 by task syz-executor/12551
=
BUG kmalloc-16 (Not tainted): kasan: bad access detected
-

INFO: Allocated in sctp_setsockopt_bindx+0xd2/0x3e0 age=12 cpu=2 pid=12551
[< inline >] kmalloc include/linux/slab.h:468
[<  none  >] sctp_setsockopt_bindx+0xd2/0x3e0 net/sctp/socket.c:975
[<  none  >] sctp_setsockopt+0x1493/0x3630 net/sctp/socket.c:3711
[<  none  >] sock_common_setsockopt+0x97/0xd0 net/core/sock.c:2620
[< inline >] SYSC_setsockopt net/socket.c:1752
[<  none  >] SyS_setsockopt+0x15b/0x250 net/socket.c:1731
[<  none  >] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185

INFO: Slab 0xea0001b18d80 objects=16 used=4 fp=0x88006c6376e0 flags=0x5fffc004080
INFO: Object 0x88006c6361e8 @offset=488 fp=0x0002
Bytes b4 88006c6361d8: 00 00 00 00 00 00 00 00 2f 98 34 88 ff ff ff ff  /.4.
Object 88006c6361e8: 02 00 00 00 00 00 00 00 02 00 ab 07 7f 00 00 01
CPU: 2 PID: 12551 Comm: syz-executor Tainted: GB   4.5.0-rc1+ #278
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  880036397928 8299a02d 88003e807900
 88006c6361e8 88006c636000 880036397958 81752814
 88003e807900 ea0001b18d80 88006c6361e8 88006c6361e8

Call Trace:
 [] __asan_loadN+0x124/0x1a0 mm/kasan/kasan.c:512
 [] memcpy+0x1d/0x40 mm/kasan/kasan.c:297
 [] sctp_add_bind_addr+0xa9/0x270 net/sctp/bind_addr.c:162
 [] sctp_do_bind+0x336/0x580 net/sctp/socket.c:389
 [] sctp_bindx_add+0xac/0x1a0 net/sctp/socket.c:471
 [] sctp_setsockopt_bindx+0x2f8/0x3e0 net/sctp/socket.c:1010
 [] sctp_setsockopt+0x1493/0x3630 net/sctp/socket.c:3711
 [] sock_common_setsockopt+0x97/0xd0 net/core/sock.c:2620
 [< inline >] SYSC_setsockopt net/socket.c:1752
 [] SyS_setsockopt+0x15b/0x250 net/socket.c:1731
 [] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185

Memory state around the buggy address:
 88006c636080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 88006c636100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>88006c636180: fc fc fc fc fc fc fc fc fc fc fc fc fc 00 00 fc
^
 88006c636200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 88006c636280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==


sctp_setsockopt_bindx verifies that the user-passed address has valid
len for the specified family, but then sctp_add_bind_addr copies whole
sctp_addr from there. This causes heap out-of-bounds access and can
crash kernel. Not sure if it is possible to copy out the trailing
garbage to user-space later.

On commit 92e963f50fc74041b5e9e744c330dca48e04f08d (Jan 25).


Re: [RFC PATCH net] net/core: don't increment rx_dropped on inactive slaves

2016-01-25 Thread Jarod Wilson
On Sun, Jan 24, 2016 at 10:42:22PM -0800, David Miller wrote:
> From: Jarod Wilson 
> Date: Fri, 22 Jan 2016 14:11:22 -0500
> 
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 8cba3d8..1354c7b 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -4153,8 +4153,11 @@ ncls:
> > else
> > ret = pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
> > } else {
> > +   if (deliver_exact)
> > +   goto inactive; /* bond or team inactive slave */
> >  drop:
> > > atomic_long_inc(&skb->dev->rx_dropped);
> > +inactive:
> > kfree_skb(skb);
> > /* Jamal, now you will not able to escape explaining
> >  * me how you were going to use this. :-)
> > -- 
> > 1.8.3.1
> > 
> 
> I agree that rx_dropped is not the correct stat to bump here, but
> I'm totally against the event disappearing completely into thin
> air.
> 
> You have to replace the rx_dropped bump with _something_.
> 
> The only reason this hasn't been "fixed" yet is that everyone is
> too damn lazy to implement that "something".

Would you want to see all things that shouldn't increment rx_dropped come
in one shot, along with the four or so other counters, as discussed in the
prior thread, or can they be done piecemeal? To date, I'm really only
familiar with this particular case, and could probably get something
together this week. To address the rest, I'd have to poke around a bit
more and see what there is to see and do.
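
For the "replace the rx_dropped bump with _something_" point, a rough userspace sketch of the accounting split could look like this. The counter name `rx_inactive`, the `demo_` types and the function itself are assumptions made purely for illustration; this is not the patch under discussion and not existing kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-device counters. */
struct demo_dev_stats {
	atomic_long rx_dropped;		/* packets the stack really dropped */
	atomic_long rx_inactive;	/* assumed new counter: frames that
					 * arrived on an inactive slave */
};

enum demo_verdict { DEMO_DELIVERED, DEMO_DROPPED, DEMO_INACTIVE };

/*
 * Decide how to account for a packet at the end of receive processing.
 * 'have_handler' models "a protocol handler took the packet";
 * 'deliver_exact' models the inactive bond/team slave case.
 */
static enum demo_verdict demo_rx_finish(struct demo_dev_stats *st,
					bool have_handler, bool deliver_exact)
{
	if (have_handler)
		return DEMO_DELIVERED;

	if (deliver_exact) {
		/* Not a real drop, but the event is still counted. */
		atomic_fetch_add(&st->rx_inactive, 1);
		return DEMO_INACTIVE;
	}

	atomic_fetch_add(&st->rx_dropped, 1);
	return DEMO_DROPPED;
}
```

The event no longer inflates rx_dropped, yet it does not vanish either; where such a counter would be exported is a separate question.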

-- 
Jarod Wilson
ja...@redhat.com



Re: [PATCH 2/2] ppp: implement rtnetlink device handling

2016-01-25 Thread Guillaume Nault
On Mon, Jan 25, 2016 at 12:09:34PM +0100, walter harms wrote:
> 
> 
> Am 23.12.2015 21:04, schrieb Guillaume Nault:
> > @@ -1012,7 +1017,24 @@ static int ppp_dev_configure(struct net *src_net, struct net_device *dev,
> > int indx;
> > int err;
> >  
> > -   file = conf->file;
> > +   if (conf->fd >= 0) {
> > +   file = fget(conf->fd);
> > +   if (file) {
> > +   if (file->f_op != &ppp_device_fops) {
> > +   fput(file);
> > +   return -EBADF;
> > +   }
> > +
> > +   /* Don't hold reference on file: ppp_release() is
> > +* responsible for safely freeing the associated
> > +* resources upon release. So file won't go away
> > +* from under us.
> > +*/
> > +   fput(file);
> > +   }
> > +   } else {
> > +   file = conf->file;
> > +   }
> > if (!file)
> > return -EBADF;
> 
> 
> I would write that a bit differently to reduce indent
> and improve readability
> 
> (note: totally untested, just reviewing)
> 
> if (conf->fd < 0) {
>         file = conf->file;
>         if (!file)
>                 return -EBADF;
> } else {
>         file = fget(conf->fd);
>         if (!file)
>                 return -EBADF;
> 
Early return on fget() failure looks indeed simpler.

>         fput(file);
>         if (file->f_op != &ppp_device_fops) {
>                 return -EBADF;
>         }
> 
But this is wrong: we can't act on file after fput(). So we have to
place fput() after the test.

Thanks for your review.
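
The ordering issue pointed out above (inspect the file first, drop the reference afterwards) can be illustrated with a toy reference count in userspace. This is a sketch under assumptions: `demo_file`, `demo_get()` and `demo_put()` are hypothetical stand-ins that only mimic the get/put pattern, they are not the kernel's fget()/fput().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_ops { const char *name; };

/* Toy file object with a plain (non-atomic) reference count. */
struct demo_file {
	int refcount;
	const struct demo_ops *f_op;
	bool freed;
};

static const struct demo_ops demo_ppp_ops = { "ppp" };
static const struct demo_ops demo_other_ops = { "other" };

/* Take a reference; returns NULL if there is nothing valid to take. */
static struct demo_file *demo_get(struct demo_file *f)
{
	if (!f || f->freed)
		return NULL;
	f->refcount++;
	return f;
}

/* Drop a reference; the object is torn down when the count hits zero. */
static void demo_put(struct demo_file *f)
{
	if (--f->refcount == 0) {
		f->f_op = NULL;
		f->freed = true;
	}
}

/* Returns 0 if the file is a "ppp" file, -1 otherwise. */
static int demo_check_file(struct demo_file *candidate)
{
	struct demo_file *file = demo_get(candidate);
	bool is_ppp;

	if (!file)
		return -1;	/* early return, as suggested in the review */

	/* Look at the object while the reference is still held... */
	is_ppp = (file->f_op == &demo_ppp_ops);

	/* ...and only then drop it. */
	demo_put(file);

	return is_ppp ? 0 : -1;
}
```

If demo_put() ran before the f_op comparison and happened to drop the last reference, the comparison would read an object that has already been torn down.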
--
To unsubscribe from this list: send the line "unsubscribe linux-ppp" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] defxx: fix build warning

2016-01-25 Thread Maciej W. Rozycki
On Mon, 25 Jan 2016, Sudip Mukherjee wrote:

> We are getting many build warnings about:
> 'bar_start' may be used uninitialized
> and
> 'bar_len' may be used uninitialized
> 
> They are not actually uninitialized as dfx_get_bars() will initialize
> them properly. But still let's have them initialized just to satisfy the
> compiler (gcc 4.8.2).
> 
> Signed-off-by: Sudip Mukherjee 
> ---

Acked-by: Maciej W. Rozycki 

 Thanks,

  Maciej


[PATCH net-next] hv_netvsc: use skb_get_hash() instead of a homegrown implementation

2016-01-25 Thread Vitaly Kuznetsov
Recent changes to 'struct flow_keys' (e.g. commit d34af823ff40 ("net: Add
VLAN ID to flow_keys")) introduced a performance regression in the netvsc
driver. The problem is, however, not the above mentioned commit but the
fact that the netvsc_set_hash() function made some assumptions about the
struct flow_keys data layout, and this is wrong.

Get rid of netvsc_set_hash() by switching to skb_get_hash(). This change
will also imply switching to Jenkins hash from the currently used Toeplitz
but it seems there is no good excuse for Toeplitz to stay.

Signed-off-by: Vitaly Kuznetsov 
---
- This patch is an alternative to the previously sent "hv_netvsc: don't make
 assumptions on struct flow_keys layout" and Haiyang's "hv_netvsc: Use simple
 parser for IPv4 and v6 headers".
---
 drivers/net/hyperv/netvsc_drv.c | 67 ++---
 1 file changed, 3 insertions(+), 64 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 1c8db9a..1d3a665 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -196,65 +196,6 @@ static void *init_ppi_data(struct rndis_message *msg, u32 ppi_size,
return ppi;
 }
 
-union sub_key {
-   u64 k;
-   struct {
-   u8 pad[3];
-   u8 kb;
-   u32 ka;
-   };
-};
-
-/* Toeplitz hash function
- * data: network byte order
- * return: host byte order
- */
-static u32 comp_hash(u8 *key, int klen, void *data, int dlen)
-{
-   union sub_key subk;
-   int k_next = 4;
-   u8 dt;
-   int i, j;
-   u32 ret = 0;
-
-   subk.k = 0;
-   subk.ka = ntohl(*(u32 *)key);
-
-   for (i = 0; i < dlen; i++) {
-   subk.kb = key[k_next];
-   k_next = (k_next + 1) % klen;
-   dt = ((u8 *)data)[i];
-   for (j = 0; j < 8; j++) {
-   if (dt & 0x80)
-   ret ^= subk.ka;
-   dt <<= 1;
-   subk.k <<= 1;
-   }
-   }
-
-   return ret;
-}
-
-static bool netvsc_set_hash(u32 *hash, struct sk_buff *skb)
-{
-   struct flow_keys flow;
-   int data_len;
-
-   if (!skb_flow_dissect_flow_keys(skb, &flow, 0) ||
-   !(flow.basic.n_proto == htons(ETH_P_IP) ||
- flow.basic.n_proto == htons(ETH_P_IPV6)))
-   return false;
-
-   if (flow.basic.ip_proto == IPPROTO_TCP)
-   data_len = 12;
-   else
-   data_len = 8;
-
-   *hash = comp_hash(netvsc_hash_key, HASH_KEYLEN, &flow, data_len);
-
-   return true;
-}
-
 static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
void *accel_priv, select_queue_fallback_t fallback)
 {
@@ -267,11 +208,9 @@ static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
if (nvsc_dev == NULL || ndev->real_num_tx_queues <= 1)
return 0;
 
-   if (netvsc_set_hash(&hash, skb)) {
-   q_idx = nvsc_dev->send_table[hash % VRSS_SEND_TAB_SIZE] %
-   ndev->real_num_tx_queues;
-   skb_set_hash(skb, hash, PKT_HASH_TYPE_L3);
-   }
+   hash = skb_get_hash(skb);
+   q_idx = nvsc_dev->send_table[hash % VRSS_SEND_TAB_SIZE] %
+   ndev->real_num_tx_queues;
 
if (!nvsc_dev->chn_table[q_idx])
q_idx = 0;
-- 
2.5.0
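
As a side note on the queue selection in the patch above, the hash-to-queue mapping can be sketched in isolation as below. This is a userspace illustration only: the hash is taken as an input instead of coming from skb_get_hash(), `demo_select_queue()` is a hypothetical name, and the table size of 16 is an assumption standing in for VRSS_SEND_TAB_SIZE.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SEND_TAB_SIZE 16u	/* assumed stand-in for VRSS_SEND_TAB_SIZE */

/*
 * Map a flow hash to a TX queue index: the hash picks an entry in the
 * send indirection table, and the entry is folded into the number of
 * queues actually in use.
 */
static uint16_t demo_select_queue(uint32_t hash, const uint32_t *send_table,
				  uint16_t num_tx_queues)
{
	if (num_tx_queues <= 1)
		return 0;

	return (uint16_t)(send_table[hash % DEMO_SEND_TAB_SIZE] %
			  num_tx_queues);
}
```

Because the mapping only depends on the hash value, any well-distributed hash works here, which is consistent with the commit message's remark that Toeplitz is not required.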



[PATCH net] sit: set rtnl_link_ops before calling register_netdevice

2016-01-25 Thread Thadeu Lima de Souza Cascardo
When creating a SIT tunnel with ip tunnel, rtnl_link_ops is not set before
ipip6_tunnel_create is called. When register_netdevice is called, there is
no linkinfo attribute in the NEWLINK message because of that.

Setting rtnl_link_ops before calling register_netdevice fixes that.

Signed-off-by: Thadeu Lima de Souza Cascardo 
---
 net/ipv6/sit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index e794ef6..2066d1c 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -201,14 +201,14 @@ static int ipip6_tunnel_create(struct net_device *dev)
if ((__force u16)t->parms.i_flags & SIT_ISATAP)
dev->priv_flags |= IFF_ISATAP;
 
+   dev->rtnl_link_ops = &sit_link_ops;
+
err = register_netdevice(dev);
if (err < 0)
goto out;
 
ipip6_tunnel_clone_6rd(dev, sitn);
 
-   dev->rtnl_link_ops = &sit_link_ops;
-
dev_hold(dev);
 
ipip6_tunnel_link(sitn, t);
-- 
2.5.0
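
The ordering problem fixed by the patch above can be modelled with a toy example. It assumes, as the commit message states, that the announcement sent at registration time is built from the device state as it is at that moment. All `demo_` names are hypothetical and nothing here is real rtnetlink code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_link_ops { const char *kind; };

struct demo_netdev {
	const struct demo_link_ops *rtnl_link_ops;
	const char *announced_kind;	/* what the "NEWLINK" message carried */
};

static const struct demo_link_ops demo_sit_ops = { "sit" };

/*
 * Stand-in for registration: the announcement is a snapshot of the
 * device state at the time of the call.
 */
static int demo_register(struct demo_netdev *dev)
{
	dev->announced_kind = dev->rtnl_link_ops ?
			      dev->rtnl_link_ops->kind : NULL;
	return 0;
}

/* Old order: ops assigned after registration, so the snapshot misses them. */
static int demo_create_old(struct demo_netdev *dev)
{
	int err = demo_register(dev);

	if (err < 0)
		return err;
	dev->rtnl_link_ops = &demo_sit_ops;
	return 0;
}

/* New order: ops assigned first, so the snapshot includes them. */
static int demo_create_new(struct demo_netdev *dev)
{
	dev->rtnl_link_ops = &demo_sit_ops;
	return demo_register(dev);
}
```

In the old order the device ends up with the right ops pointer, but listeners of the registration message never see the link kind.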



Re: [PATCH net-next] hv_netvsc: Fix book keeping of skb during batching process

2016-01-25 Thread David Miller
From: Haiyang Zhang 
Date: Mon, 25 Jan 2016 09:49:31 -0800

> Since eliminating send_completion_tid from struct hv_netvsc_packet, we
> haven't added proper bookkeeping for the skb of the batched packet. This
> patch fixes this issue and allows the previous skb to be properly freed.
> Otherwise, a panic may happen.
> Thanks to Simon Xiao  for bisecting and analysis.
> 
> Signed-off-by: Haiyang Zhang 
> Reviewed-by: K. Y. Srinivasan 

Applied.


Re: [dm-devel] [PATCH 22/26] iscsi_tcp: Use ahash

2016-01-25 Thread Mike Christie
On 01/24/2016 07:19 AM, Herbert Xu wrote:
> This patch replaces uses of the long obsolete hash interface with
> ahash.
> 
> Signed-off-by: Herbert Xu 
> ---
> 
>  drivers/scsi/iscsi_tcp.c|   54 ++--
>  drivers/scsi/iscsi_tcp.h|4 +--
>  drivers/scsi/libiscsi_tcp.c |   29 +--
>  include/scsi/libiscsi_tcp.h |   13 +-
>  4 files changed, 58 insertions(+), 42 deletions(-)
> 

iSCSI parts look ok.

Reviewed-by: Mike Christie 



Re: [PATCH] brcmfmac: sdio: Increase the default timeouts a bit

2016-01-25 Thread Arend van Spriel
On 25-1-2016 20:23, Doug Anderson wrote:
> Hi,
> 
> On Mon, Jan 25, 2016 at 7:36 AM, Arend van Spriel  wrote:
>> On 25-01-16 11:47, Sjoerd Simons wrote:
>>> On a Radxa Rock2 board with an Ampak AP6335 (Broadcom 4339 core) it seems
>>> the card responds very quickly most of the time, unfortunately during
>>> initialisation it sometimes seems to take just a bit over 2 seconds to
>>> respond.
>>>
>>> This results in initialization failing with messages like:
>>>   brcmf_c_preinit_dcmds: Retreiving cur_etheraddr failed, -52
>>>   brcmf_bus_start: failed: -52
>>>   brcmf_sdio_firmware_callback: dongle is not responding
>>>
>>> Increasing the timeout to allow for a bit more headroom allows the
>>> card to initialize reliably.
>>
>> I would prefer to know where the 2 second response time comes from.
>> Could be sdio retuning. Maybe the chromeos people can comment whether
>> this has been root caused.

Hi Doug,

Thanks for the elaborate response.

> I reviewed Paul's change here
>  but didn't do
> any root causing.
> 
> I think that, like Sjoerd saw, we were seeing this problem at boot
> time.  Certainly at boot time lots of things are happening all at the
> same time in the system and there are often delays, so anything that
> might have been close to timing out in the past may now be actually
> timing out.
> 
> This is the kind of thing that, IMHO, should have a real timeout that
> is 10x what was expected and a non-fatal warning whenever we go over
> the expected time.  ...but maybe that's overdesign.  :-P
> 
> Kinda curious: do we get one or two really slow responses on every
> bootup, or just some bootups?  Do we ever succeed even with a slow
> (like 1.8 or 1.9 seconds) response, or is it always either "fast" or
> "2.1" seconds?

Now these are interesting questions that I should have spelled out in
the first place. Thanks.
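
The suggestion quoted above (a hard timeout of about 10x the expected time, plus a non-fatal warning once the expected time is exceeded) could be sketched like this. It is a userspace illustration under assumptions: `demo_classify_wait()` and the enum are hypothetical names, and the factor of 10 is simply the figure from the mail, not a value taken from any driver.

```c
#include <assert.h>
#include <stdio.h>

enum demo_wait_result { DEMO_WAIT_OK, DEMO_WAIT_SLOW, DEMO_WAIT_TIMEOUT };

/*
 * Classify a response time against an expected budget.
 * Above 10x the budget: hard failure. Above the budget: success, but
 * with a warning so that slow responses show up in the logs.
 */
static enum demo_wait_result demo_classify_wait(unsigned int elapsed_ms,
						unsigned int expected_ms)
{
	/* Wider type so the multiplication cannot overflow. */
	unsigned long long hard_limit_ms = 10ULL * expected_ms;

	if (elapsed_ms > hard_limit_ms)
		return DEMO_WAIT_TIMEOUT;

	if (elapsed_ms > expected_ms) {
		fprintf(stderr, "warning: response took %u ms (expected <= %u ms)\n",
			elapsed_ms, expected_ms);
		return DEMO_WAIT_SLOW;
	}

	return DEMO_WAIT_OK;
}
```

With an expected budget of 2000 ms, a 2100 ms response would then be logged and accepted rather than treated as a dead dongle.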

> In any case, in my experience the Broadcom firmware is fairly
> complicated and has numerous cases where it stretches SDIO more than
> the other SDIO WiFi chip I've worked with.  It wouldn't terribly
> surprise me if there was a period of time during bootup where it was
> non-responsive for 2 seconds.  As unrelated "evidence" showing some of
> the Broadcom SDIO limitations, you can see
>  and also the
> fact that Broadcom often holds the SDIO "busy" signal whereas the
> other SDIO WiFi chip I've worked with never did that.  Also, even with all
> fixes the Broadcom WiFi module will still show periodic SDIO errors
> that the higher level driver just knows to ignore.

The busy signal is in accordance with the SDIO spec. It would be good to
know if that is what is happening. I do have an SDIO analyzer, but
unfortunately I have not reproduced it. I may retry on a veyron device.

> My old debugging from the (sorry, private) bug
> http://crosbug.com/p/36975 showed this periodically even with all
> known fixes:
> 
> [21310.271635] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
> [21550.583598] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
> [21550.616035] brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
> [21550.648460] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
> [21550.683502] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
> [21550.691214] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0100
> [22671.121329] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
> [22671.153167] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x01000104
> [22671.184581] brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
> [22671.192600] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
> [22671.201929] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0114
> [22671.209536] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0100
> [28463.941736] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
> 
> At the time dekim@ responded:
> 
>> There are several sleep/wake controls at different levels. The one we're
>> talking about here is controlled by brcmf_sdio_bus_sleep() in the host driver
>> to turn the bus core on the chip on/off. There can be a period of time when
>> the chip is not paying attention to the host command (cmd52 to the
>> SBSDIO_FUNC1_SLEEPCSR).
> 
> ...and we decided that the periodic SDIO errors weren't causing any
> huge problems (since they were retried).  As far as I know, they still
> happen today.

Were these truly periodic errors, or did they occur at random intervals?

> 
> All of the above may not help you, but it serves as evidence that the
> SDIO communication to Broadcom isn't terribly amazing and apparently
> that's just the way that the module (or perhaps its firmware) is
> designed.  It doesn't seem to affect anything in the real world, so I
> suppose it is just something we need to live with.
> 
> 
> Obviously if you have access to the firmware source code and can debug
> further, that would be awesome.  I'm just not hopeful.

I have, but that does 

Re: [PATCH] brcmfmac: sdio: Increase the default timeouts a bit

2016-01-25 Thread Arend van Spriel
On 25-01-16 12:06, Julian Calaby wrote:
> Hi Sjoerd,
> 
> On Mon, Jan 25, 2016 at 9:47 PM, Sjoerd Simons
>  wrote:
>> On a Radxa Rock2 board with an Ampak AP6335 (Broadcom 4339 core) it seems
>> the card responds very quickly most of the time, unfortunately during
>> initialisation it sometimes seems to take just a bit over 2 seconds to
>> respond.
>>
>> This results in initialization failing with messages like:
>>   brcmf_c_preinit_dcmds: Retreiving cur_etheraddr failed, -52
>>   brcmf_bus_start: failed: -52
>>   brcmf_sdio_firmware_callback: dongle is not responding
>>
>> Increasing the timeout to allow for a bit more headroom allows the
>> card to initialize reliably.
>>
>> A quick search online after diagnosing/fixing this showed that Google
>> has a similar patch in their ChromeOS tree, so this doesn't seem
>> specific to the board I'm using.
>>
>> Signed-off-by: Sjoerd Simons 
> 
> Looks sane to me.
> 
> Reviewed-by: Julian Calaby 

Not really a cleanup patch :-p , but thanks for the review.

Regards,
Arend

> 
>>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
>> index dd66143..75ac4bd 100644
>> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
>> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
>> @@ -45,8 +45,8 @@
>>  #include "chip.h"
>>  #include "firmware.h"
>>
>> -#define DCMD_RESP_TIMEOUT  msecs_to_jiffies(2000)
>> -#define CTL_DONE_TIMEOUT   msecs_to_jiffies(2000)
>> +#define DCMD_RESP_TIMEOUT  msecs_to_jiffies(2500)
>> +#define CTL_DONE_TIMEOUT   msecs_to_jiffies(2500)
>>
>>  #ifdef DEBUG
>>
>> --
>> 2.7.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> 


bonding (IEEE 802.3ad) not working with qemu/virtio

2016-01-25 Thread Bjørnar Ness
As the subject says, 802.3ad bonding is not working with the virtio network
model.

The only error I see is:

No 802.3ad response from the link partner for any adapters in the bond.

Dumping the network traffic shows that no LACP packets are sent from the
host running with the virtio driver; changing to, for example, e1000 solves
this problem with no configuration changes.

Is this a known problem?

-- 
Bjørnar


Re: net/sctp: out-of-bounds access in sctp_add_bind_addr

2016-01-25 Thread Neil Horman
On Mon, Jan 25, 2016 at 12:48:02PM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Jan 25, 2016 at 03:42:14PM +0100, Dmitry Vyukov wrote:
> > On Mon, Jan 25, 2016 at 3:31 PM, Neil Horman  wrote:
> > > On Mon, Jan 25, 2016 at 03:02:38PM +0100, Dmitry Vyukov wrote:
> > >> Hello,
> > >>
> > > >> I've got the following error report while running syzkaller fuzzer:
> > >>
> > >> ==
> > >> BUG: KASAN: slab-out-of-bounds in memcpy+0x1d/0x40 at addr 
> > >> 88006c6361e8
> > >> Read of size 28 by task syz-executor/12551
> > >> =
> > >> BUG kmalloc-16 (Not tainted): kasan: bad access detected
> > >> -
> > >>
> > >> INFO: Allocated in sctp_setsockopt_bindx+0xd2/0x3e0 age=12 cpu=2 
> > >> pid=12551
> > >> [< inline >] kmalloc include/linux/slab.h:468
> > >> [<  none  >] sctp_setsockopt_bindx+0xd2/0x3e0 
> > >> net/sctp/socket.c:975
> > >> [<  none  >] sctp_setsockopt+0x1493/0x3630 net/sctp/socket.c:3711
> > >> [<  none  >] sock_common_setsockopt+0x97/0xd0 
> > >> net/core/sock.c:2620
> > >> [< inline >] SYSC_setsockopt net/socket.c:1752
> > >> [<  none  >] SyS_setsockopt+0x15b/0x250 net/socket.c:1731
> > >> [<  none  >] entry_SYSCALL_64_fastpath+0x16/0x7a
> > >> arch/x86/entry/entry_64.S:185
> > >>
> > >> INFO: Slab 0xea0001b18d80 objects=16 used=4 fp=0x88006c6376e0
> > >> flags=0x5fffc004080
> > >> INFO: Object 0x88006c6361e8 @offset=488 fp=0x0002
> > >> Bytes b4 88006c6361d8: 00 00 00 00 00 00 00 00 2f 98 34 88 ff ff
> > >> ff ff  /.4.
> > >> Object 88006c6361e8: 02 00 00 00 00 00 00 00 02 00 ab 07 7f 00 00
> > >> 01  
> > >> CPU: 2 PID: 12551 Comm: syz-executor Tainted: GB   
> > >> 4.5.0-rc1+ #278
> > >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 
> > >> 01/01/2011
> > >>   880036397928 8299a02d 88003e807900
> > >>  88006c6361e8 88006c636000 880036397958 81752814
> > >>  88003e807900 ea0001b18d80 88006c6361e8 88006c6361e8
> > >>
> > >> Call Trace:
> > >>  [] __asan_loadN+0x124/0x1a0 mm/kasan/kasan.c:512
> > >>  [] memcpy+0x1d/0x40 mm/kasan/kasan.c:297
> > >>  [] sctp_add_bind_addr+0xa9/0x270 
> > >> net/sctp/bind_addr.c:162
> > >>  [] sctp_do_bind+0x336/0x580 net/sctp/socket.c:389
> > >>  [] sctp_bindx_add+0xac/0x1a0 net/sctp/socket.c:471
> > >>  [] sctp_setsockopt_bindx+0x2f8/0x3e0 
> > >> net/sctp/socket.c:1010
> > >>  [] sctp_setsockopt+0x1493/0x3630 
> > >> net/sctp/socket.c:3711
> > >>  [] sock_common_setsockopt+0x97/0xd0 
> > >> net/core/sock.c:2620
> > >>  [< inline >] SYSC_setsockopt net/socket.c:1752
> > >>  [] SyS_setsockopt+0x15b/0x250 net/socket.c:1731
> > >>  [] entry_SYSCALL_64_fastpath+0x16/0x7a
> > >> arch/x86/entry/entry_64.S:185
> > >>
> > >> Memory state around the buggy address:
> > >>  88006c636080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > >>  88006c636100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > >> >88006c636180: fc fc fc fc fc fc fc fc fc fc fc fc fc 00 00 fc
> > >> ^
> > >>  88006c636200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > >>  88006c636280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > >> ==
> > >>
> > >>
> > >> sctp_setsockopt_bindx verifies that the user-passed address has valid
> > >> len for the specified family, but then sctp_add_bind_addr copies whole
> > >> sctp_addr from there. This causes heap out-of-bounds access and can
> > >> crash kernel. Not sure if it is possible to copy out the trailing
> > >> garbage to user-space later.
> > >>
> > >
> > > It does more than that though.  sctp_setsockopt_bindx checks the
> > > following:
> > > 1) That passed addr_size is greater than zero
> > > 2) That the entire range of memory between addrs and addrs+addr_size is
> > > readable
> > > 3) That at least one address structure worth of data is available
> > > (implicit in the while (walk_size < addr_size) loop).
> > >
> > > Could one of the sockaddr_len fields in one of the addresses have been
> > > mangled so that it appeared shorter in the while loop from (3), so that
> > > a copy of sizeof(sctp_addr) in sctp_add_bind_addr overran the allocated
> > > memory?
> > 
> > I may be missing something, but what I see is:
> > 
> > 1. we check that there is at least family:
> > if (walk_size + sizeof(sa_family_t) > addrs_size) {
> > 
> > 2. get family descriptor:
> > af = sctp_get_af_specific(sa_addr->sa_family);
> > 
> > 3. check that the address size is enough to hold the declared family:
> > if (!af || (walk_size + 

Re: [PATCH] brcmfmac: sdio: Increase the default timeouts a bit

2016-01-25 Thread Doug Anderson
Hi,

On Mon, Jan 25, 2016 at 12:07 PM, Arend van Spriel  wrote:
>> In any case, in my experience the Broadcom firmware is fairly
>> complicated and has numerous cases where it stretches SDIO more than
>> the other SDIO WiFi chip I've worked with.  It wouldn't terribly
>> surprise me if there was a period of time during bootup where it was
>> non-responsive for 2 seconds.  As unrelated "evidence" showing some of
>> the Broadcom SDIO limitations, you can see
>>  and also the
>> fact that Broadcom often holds the SDIO "busy" signal whereas the
>> other SDIO WiFi chip I've worked with never did that.  Also, even with all
>> fixes the Broadcom WiFi module will still show periodic SDIO errors
>> that the higher level driver just knows to ignore.
>
> The busy signal is in accordance with the SDIO spec.

Sorry, didn't mean to imply that it wasn't.  That particular instance
is definitely legal, but it was also just showing a case where the
Broadcom WiFi needed something that wasn't needed elsewhere.

AKA: it's kinda like saying that other modules work on a UART without
flow control lines and the Broadcom needs the flow control lines.
UARTs are designed with flow control lines and it's totally legal to
need them.  ...but if you saw that one serial device didn't need them
then you would presume that maybe the device on the other end had very
low interrupt latency or had some other hardware help (large FIFOs,
DMA, fast processor, hard real time OS, etc).  A device with low
interrupt latency, large FIFOs, DMA, fast processor, etc would
presumably be less likely to have other pauses in communication.


> It would be good to
> know if that is what is happening. Unfortunately I do have an SDIO
> analyzer, but not reproduced it. May retry on veyron device.

You might just be able to put a trace to see if it hits the busy
signal.  Depending on your Linux version you might just be able to put
a trace in __mmc_start_request().  If not and if you're on dw_mmc you
can put a trace in dw_mci_wait_while_busy().


>> My old debugging from the (sorry, private) bug
>> http://crosbug.com/p/36975 showed this periodically even with all
>> known fixes:
>>
>> [21310.271635] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
>> [21550.583598] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
>> [21550.616035] brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
>> [21550.648460] brcmfmac: brcmf_sdio_rxfail: abort command, terminate
>> frame, send NAK
>> [21550.683502] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
>> [21550.691214] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0100
>> [22671.121329] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
>> [22671.153167] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x01000104
>> [22671.184581] brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
>> [22671.192600] brcmfmac: brcmf_sdio_rxfail: abort command, terminate
>> frame, send NAK
>> [22671.201929] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0114
>> [22671.209536] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0100
>> [28463.941736] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
>>
>> At the time dekim@ responded:
>>
>>> There are several sleep/wake control at different level. The one we're 
>>> talking
>>> about here is controlled by brcmf_sdio_bus_sleep() in the host driver to 
>>> turn
>>> on/off bus core on the chip. There can be a period of time when chip is not
>>> paying attention to the host command (cmd52 to the
>>> SBSDIO_FUNC1_SLEEPCSR).
>>
>> ...and we decided that the periodic SDIO errors weren't causing any
>> huge problems (since they were retried).  As far as I know, they still
>> happen today.
>
> Were these true periodic errors or random at interval.

I believe they were random, and I believe that I needed "power save"
mode turned on for WiFi.  I believe errors were made worse if I had a
TCP/IP ping running (AKA bring out of power save and go back in
constantly).


Re: [PATCH 16/26] libceph: Use skcipher

2016-01-25 Thread Ilya Dryomov
On Sun, Jan 24, 2016 at 2:18 PM, Herbert Xu  wrote:
> This patch replaces uses of blkcipher with skcipher.
>
> Signed-off-by: Herbert Xu 
> ---
>
>  net/ceph/crypto.c |   97 
> +++---
>  1 file changed, 56 insertions(+), 41 deletions(-)

Could you get rid of ivsize instead of assigning to it - see the
attached diff?

Otherwise:

Acked-by: Ilya Dryomov 

Thanks,

Ilya
diff --git a/net/ceph/crypto.c b/net/ceph/crypto.c
index 42e8649c6e79..db2847ac5f12 100644
--- a/net/ceph/crypto.c
+++ b/net/ceph/crypto.c
@@ -4,7 +4,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
+#include 
 #include 
 
 #include 
@@ -79,9 +80,9 @@ int ceph_crypto_key_unarmor(struct ceph_crypto_key *key, 
const char *inkey)
return 0;
 }
 
-static struct crypto_blkcipher *ceph_crypto_alloc_cipher(void)
+static struct crypto_skcipher *ceph_crypto_alloc_cipher(void)
 {
-   return crypto_alloc_blkcipher("cbc(aes)", 0, CRYPTO_ALG_ASYNC);
+   return crypto_alloc_skcipher("cbc(aes)", 0, CRYPTO_ALG_ASYNC);
 }
 
 static const u8 *aes_iv = (u8 *)CEPH_AES_IV;
@@ -162,11 +163,10 @@ static int ceph_aes_encrypt(const void *key, int key_len,
 {
struct scatterlist sg_in[2], prealloc_sg;
struct sg_table sg_out;
-   struct crypto_blkcipher *tfm = ceph_crypto_alloc_cipher();
-   struct blkcipher_desc desc = { .tfm = tfm, .flags = 0 };
+   struct crypto_skcipher *tfm = ceph_crypto_alloc_cipher();
+   SKCIPHER_REQUEST_ON_STACK(req, tfm);
int ret;
-   void *iv;
-   int ivsize;
+   char iv[AES_BLOCK_SIZE];
size_t zero_padding = (0x10 - (src_len & 0x0f));
char pad[16];
 
@@ -184,10 +184,13 @@ static int ceph_aes_encrypt(const void *key, int key_len,
if (ret)
goto out_tfm;
 
-   crypto_blkcipher_setkey((void *)tfm, key, key_len);
-   iv = crypto_blkcipher_crt(tfm)->iv;
-   ivsize = crypto_blkcipher_ivsize(tfm);
-   memcpy(iv, aes_iv, ivsize);
+   crypto_skcipher_setkey((void *)tfm, key, key_len);
+   memcpy(iv, aes_iv, AES_BLOCK_SIZE);
+
+   skcipher_request_set_tfm(req, tfm);
+   skcipher_request_set_callback(req, 0, NULL, NULL);
+   skcipher_request_set_crypt(req, sg_in, sg_out.sgl,
+  src_len + zero_padding, iv);
 
/*
print_hex_dump(KERN_ERR, "enc key: ", DUMP_PREFIX_NONE, 16, 1,
@@ -197,8 +200,8 @@ static int ceph_aes_encrypt(const void *key, int key_len,
print_hex_dump(KERN_ERR, "enc pad: ", DUMP_PREFIX_NONE, 16, 1,
pad, zero_padding, 1);
*/
-   ret = crypto_blkcipher_encrypt(&desc, sg_out.sgl, sg_in,
-src_len + zero_padding);
+   ret = crypto_skcipher_encrypt(req);
+   skcipher_request_zero(req);
if (ret < 0) {
pr_err("ceph_aes_crypt failed %d\n", ret);
goto out_sg;
@@ -211,7 +214,7 @@ static int ceph_aes_encrypt(const void *key, int key_len,
 out_sg:
teardown_sgtable(&sg_out);
 out_tfm:
-   crypto_free_blkcipher(tfm);
+   crypto_free_skcipher(tfm);
return ret;
 }
 
@@ -222,11 +225,10 @@ static int ceph_aes_encrypt2(const void *key, int 
key_len, void *dst,
 {
struct scatterlist sg_in[3], prealloc_sg;
struct sg_table sg_out;
-   struct crypto_blkcipher *tfm = ceph_crypto_alloc_cipher();
-   struct blkcipher_desc desc = { .tfm = tfm, .flags = 0 };
+   struct crypto_skcipher *tfm = ceph_crypto_alloc_cipher();
+   SKCIPHER_REQUEST_ON_STACK(req, tfm);
int ret;
-   void *iv;
-   int ivsize;
+   char iv[AES_BLOCK_SIZE];
size_t zero_padding = (0x10 - ((src1_len + src2_len) & 0x0f));
char pad[16];
 
@@ -245,10 +247,13 @@ static int ceph_aes_encrypt2(const void *key, int 
key_len, void *dst,
if (ret)
goto out_tfm;
 
-   crypto_blkcipher_setkey((void *)tfm, key, key_len);
-   iv = crypto_blkcipher_crt(tfm)->iv;
-   ivsize = crypto_blkcipher_ivsize(tfm);
-   memcpy(iv, aes_iv, ivsize);
+   crypto_skcipher_setkey((void *)tfm, key, key_len);
+   memcpy(iv, aes_iv, AES_BLOCK_SIZE);
+
+   skcipher_request_set_tfm(req, tfm);
+   skcipher_request_set_callback(req, 0, NULL, NULL);
+   skcipher_request_set_crypt(req, sg_in, sg_out.sgl,
+  src1_len + src2_len + zero_padding, iv);
 
/*
print_hex_dump(KERN_ERR, "enc  key: ", DUMP_PREFIX_NONE, 16, 1,
@@ -260,8 +265,8 @@ static int ceph_aes_encrypt2(const void *key, int key_len, 
void *dst,
print_hex_dump(KERN_ERR, "enc  pad: ", DUMP_PREFIX_NONE, 16, 1,
pad, zero_padding, 1);
*/
-   ret = crypto_blkcipher_encrypt(&desc, sg_out.sgl, sg_in,
-src1_len + src2_len + zero_padding);
+   ret = 

Re: [PATCH net-next] hv_netvsc: use skb_get_hash() instead of a homegrown implementation

2016-01-25 Thread Eric Dumazet
On Mon, 2016-01-25 at 16:00 +0100, Vitaly Kuznetsov wrote:
> Recent changes to 'struct flow_keys' (e.g commit d34af823ff40 ("net: Add
> VLAN ID to flow_keys")) introduced a performance regression in netvsc
> driver. The problem is, however, not the above mentioned commit but the
> fact that netvsc_set_hash() function made some assumptions on the struct
> flow_keys data layout and this is wrong.
> 
> Get rid of netvsc_set_hash() by switching to skb_get_hash(). This change
> will also imply switching to Jenkins hash from the currently used Toeplitz
> but it seems there is no good excuse for Toeplitz to stay.
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---

Acked-by: Eric Dumazet 

Thanks !




Re: [PATCH 4.1] [media] media/vivid-osd: fix info leak in ioctl

2016-01-25 Thread Greg KH
On Mon, Jan 25, 2016 at 07:42:18PM +0900, Yuki Machida wrote:
> commit eda98796aff0d9bf41094b06811f5def3b4c333c upstream.
> 
> The vivid_fb_ioctl() code fails to initialize the 16 _reserved bytes of
> struct fb_vblank after the ->hcount member. Add an explicit
> memset(0) before filling the structure to avoid the info leak.
> 
> This fixes CVE-2015-7884.
> 
> Signed-off-by: Salva Peiró 
> Signed-off-by: Hans Verkuil 
> Signed-off-by: Mauro Carvalho Chehab 
> Signed-off-by: Yuki Machida 
> ---
>  drivers/media/platform/vivid/vivid-osd.c | 1 +
>  1 file changed, 1 insertion(+)



This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
for how to do this properly.




net/irda: use-after-free in ircomm_param_request

2016-01-25 Thread Dmitry Vyukov
Hello,

I've hit the following use-after-free report while running syzkaller fuzzer:

==
BUG: KASAN: use-after-free in ircomm_param_request+0x514/0x570 at addr
880035732c78
Read of size 4 by task syz-executor/10736
=
BUG skbuff_head_cache (Not tainted): kasan: bad access detected
-

INFO: Allocated in __alloc_skb+0xba/0x5f0 age=4 cpu=1 pid=10738
[<  none  >] kmem_cache_alloc_node+0x93/0x2f0 mm/slub.c:2632
[<  none  >] __alloc_skb+0xba/0x5f0 net/core/skbuff.c:216
[< inline >] alloc_skb include/linux/skbuff.h:894
[<  none  >] ircomm_param_request+0x34b/0x570
net/irda/ircomm/ircomm_param.c:115
[<  none  >] ircomm_port_raise_dtr_rts+0x6a/0xc0
net/irda/ircomm/ircomm_tty.c:122
[<  none  >] tty_port_raise_dtr_rts+0x6a/0x90 drivers/tty/tty_port.c:313
[< inline >] ircomm_tty_block_til_ready net/irda/ircomm/ircomm_tty.c:291
[<  none  >] ircomm_tty_open+0xad7/0x12f0
net/irda/ircomm/ircomm_tty.c:462
[<  none  >] tty_open+0x34d/0xf80 drivers/tty/tty_io.c:2099
[<  none  >] chrdev_open+0x22a/0x4c0 fs/char_dev.c:388
[<  none  >] do_dentry_open+0x6a2/0xcb0 fs/open.c:736
[<  none  >] vfs_open+0x17b/0x1f0 fs/open.c:853
[< inline >] do_last fs/namei.c:3254
[<  none  >] path_openat+0xde9/0x5e30 fs/namei.c:3386
[<  none  >] do_filp_open+0x18e/0x250 fs/namei.c:3421
[<  none  >] do_sys_open+0x1fc/0x420 fs/open.c:1022
[< inline >] SYSC_open fs/open.c:1040
[<  none  >] SyS_open+0x2d/0x40 fs/open.c:1035

INFO: Freed in kfree_skbmem+0xe6/0x100 age=10 cpu=1 pid=1362
[<  none  >] kmem_cache_free+0x2e4/0x360 mm/slub.c:2844
[<  none  >] kfree_skbmem+0xe6/0x100 net/core/skbuff.c:612
[< inline >] __kfree_skb net/core/skbuff.c:674
[<  none  >] consume_skb+0xe4/0x2c0 net/core/skbuff.c:746
[<  none  >] ircomm_tty_do_softint+0x131/0x280
net/irda/ircomm/ircomm_tty.c:552
[<  none  >] process_one_work+0x796/0x1440 kernel/workqueue.c:2037
[<  none  >] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2171
[<  none  >] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<  none  >] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468

INFO: Slab 0xead5cc00 objects=23 used=0 fp=0x880035732c00
flags=0x1fffc004080
INFO: Object 0x880035732c00 @offset=11264 fp=0x880035731340
CPU: 0 PID: 10736 Comm: syz-executor Tainted: GB   4.5.0-rc1+ #280
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  88b174c8 8299a06d 88003de85200
 880035732c00 88003573 88b174f8 81752854
 88003de85200 ead5cc00 880035732c00 0001

Call Trace:
 [] __asan_report_load4_noabort+0x3e/0x40
mm/kasan/report.c:294
 [] ircomm_param_request+0x514/0x570
net/irda/ircomm/ircomm_param.c:140
 [] ircomm_port_raise_dtr_rts+0x6a/0xc0
net/irda/ircomm/ircomm_tty.c:122
 [] tty_port_raise_dtr_rts+0x6a/0x90
drivers/tty/tty_port.c:313
 [< inline >] ircomm_tty_block_til_ready
net/irda/ircomm/ircomm_tty.c:291
 [] ircomm_tty_open+0xad7/0x12f0
net/irda/ircomm/ircomm_tty.c:462
 [] tty_open+0x34d/0xf80 drivers/tty/tty_io.c:2099
 [] chrdev_open+0x22a/0x4c0 fs/char_dev.c:388
 [] do_dentry_open+0x6a2/0xcb0 fs/open.c:736
 [] vfs_open+0x17b/0x1f0 fs/open.c:853
 [< inline >] do_last fs/namei.c:3254
 [] path_openat+0xde9/0x5e30 fs/namei.c:3386
 [] do_filp_open+0x18e/0x250 fs/namei.c:3421
 [] do_sys_open+0x1fc/0x420 fs/open.c:1022
 [< inline >] SYSC_open fs/open.c:1040
 [] SyS_open+0x2d/0x40 fs/open.c:1035
 [] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185
==

It seems that skb can be freed after skb_put() and spinlock unlock,
but ircomm_param_request reads skb->len afterwards:

int ircomm_param_request(struct ircomm_tty_cb *self, __u8 pi, int flush)
{
...
skb_put(skb, count);
spin_unlock_irqrestore(&self->spinlock, flags);
pr_debug("%s(), skb->len=%d\n", __func__ , skb->len);

On commit 92e963f50fc74041b5e9e744c330dca48e04f08d (Jan 24).
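
The ordering problem described above can be illustrated outside the kernel.
What follows is a minimal userspace sketch, not the irda code and not a
proposed patch: struct buf, struct port and queue_and_get_len() are
hypothetical names, and a pthread mutex stands in for the spinlock. It only
shows the general pattern of copying the length into a local while the lock
is still held, so that nothing dereferences the buffer after ownership may
have passed to another context.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an sk_buff; not the kernel structure. */
struct buf {
	size_t len;
};

/* Hypothetical owner of a queued buffer, protected by a lock. */
struct port {
	pthread_mutex_t lock;
	struct buf *pending;
};

/*
 * Queue a buffer and return its length. The length is copied into a
 * local while the lock is still held. Once the lock is dropped, another
 * thread may consume and free the buffer, so b is not touched again.
 */
size_t queue_and_get_len(struct port *p, struct buf *b, size_t count)
{
	size_t len;

	assert(p != NULL && b != NULL);

	pthread_mutex_lock(&p->lock);
	b->len += count;	/* loosely analogous to skb_put() */
	p->pending = b;		/* ownership passes to the consumer */
	len = b->len;		/* snapshot before unlocking */
	pthread_mutex_unlock(&p->lock);

	return len;		/* uses the local copy only */
}
```

A debug print placed after the unlock would then log the returned value
rather than reading the length through the buffer pointer.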


RE: [PATCH 1/1] bonding: Use notifiers for slave link state detection

2016-01-25 Thread Tantilov, Emil S
>-Original Message-
>From: zyjzyj2...@gmail.com [mailto:zyjzyj2...@gmail.com]
>Sent: Thursday, January 21, 2016 2:16 AM
>To: zyjzyj2...@gmail.com; mkube...@suse.cz; vfal...@gmail.com;
>go...@cumulusnetworks.com; netdev@vger.kernel.org; Shteinbock, Boris (Wind
>River); jay.vosbu...@canonical.com; Tantilov, Emil S
>Subject: [PATCH 1/1] bonding: Use notifiers for slave link state detection
>
>From: Zhu Yanjun 
>
>Bonding will utilize notifier callbacks to detect slave
>link state changes. It is intended to be used with miimon
>set to zero, and does not support the updelay or downdelay
>options to bonding.
>
>Because of link flap from the slave interface, if the notifier
>is NETDEV_UP while the actual link state is down, it is not
>necessary to continue.
>
>Signed-off-by: Jay Vosburgh 
>Tested-by: Tantilov, Emil S 
>Signed-off-by: Zhu Yanjun 
>---
> drivers/net/bonding/bond_main.c |  317 +--
>
> 1 file changed, 170 insertions(+), 147 deletions(-)

Just for the record - this is not a patch that I have tested.

I did run tests with the patch Jay Vosburgh submitted for introducing
notifiers and that is handled in a separate thread.

Why do you keep re-sending Jay's patches?

Thanks,
Emil



Re: net/sctp: out-of-bounds access in sctp_add_bind_addr

2016-01-25 Thread Marcelo Ricardo Leitner
Something like this. Builds, but UNTESTED.
Uses sizeof of the union where possible, except when reading from a buffer
that is not aligned to it, like the user-supplied one. In that case it relies
on af->sockaddr_len.
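
As a rough illustration of the bounded-copy idea, here is a small userspace
sketch. It is not the SCTP code: struct addr_slot, its 28-byte size and
copy_addr_bounded() are made up for the example. The copy length is the
smaller of the destination size and the length the caller says is valid, so
the source is never read past what was validated. Zeroing the rest of the
slot is an extra choice made in this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical fixed-size destination, standing in for a union of
 * address types. The size is arbitrary.
 */
struct addr_slot {
	unsigned char bytes[28];
};

/*
 * Copy at most sizeof(*dst) bytes and never read more than src_len
 * bytes from src. Bytes not covered by the source are left zeroed so
 * no stale data remains in the slot. Returns the number of bytes copied.
 */
size_t copy_addr_bounded(struct addr_slot *dst, const void *src,
			 size_t src_len)
{
	size_t n = src_len < sizeof(*dst) ? src_len : sizeof(*dst);

	assert(dst != NULL && src != NULL);

	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, n);
	return n;
}
```

With a 16-byte source only 16 bytes are read and the remaining 12 bytes of
the slot stay zero; with a 64-byte source the copy is capped at 28 bytes.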

--8<--

---
 include/net/sctp/structs.h |  2 +-
 net/sctp/bind_addr.c   | 14 --
 net/sctp/protocol.c|  1 +
 net/sctp/sm_make_chunk.c   |  2 +-
 net/sctp/socket.c  |  5 +++--
 5 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 
20e72129be1ce0063eeafcbaadcee1f37e0c614c..97ba8a8c466f5c50bdc87ec578792e56553baa91
 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -1099,7 +1099,7 @@ int sctp_bind_addr_dup(struct sctp_bind_addr *dest,
const struct sctp_bind_addr *src,
gfp_t gfp);
 int sctp_add_bind_addr(struct sctp_bind_addr *, union sctp_addr *,
-  __u8 addr_state, gfp_t gfp);
+  int new_size, __u8 addr_state, gfp_t gfp);
 int sctp_del_bind_addr(struct sctp_bind_addr *, union sctp_addr *);
 int sctp_bind_addr_match(struct sctp_bind_addr *, const union sctp_addr *,
 struct sctp_sock *);
diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
index 
871cdf9567e6bc9c13cb1077dc6866a67e6e4367..80129d10a0af9c33e7348b79d010b9e5e948e584
 100644
--- a/net/sctp/bind_addr.c
+++ b/net/sctp/bind_addr.c
@@ -111,7 +111,8 @@ int sctp_bind_addr_dup(struct sctp_bind_addr *dest,
dest->port = src->port;
 
list_for_each_entry(addr, &src->address_list, list) {
-   error = sctp_add_bind_addr(dest, &addr->a, 1, gfp);
+   error = sctp_add_bind_addr(dest, &addr->a, sizeof(addr->a),
+  1, gfp);
if (error < 0)
break;
}
@@ -150,7 +151,7 @@ void sctp_bind_addr_free(struct sctp_bind_addr *bp)
 
 /* Add an address to the bind address list in the SCTP_bind_addr structure. */
 int sctp_add_bind_addr(struct sctp_bind_addr *bp, union sctp_addr *new,
-  __u8 addr_state, gfp_t gfp)
+  int new_size, __u8 addr_state, gfp_t gfp)
 {
struct sctp_sockaddr_entry *addr;
 
@@ -159,7 +160,7 @@ int sctp_add_bind_addr(struct sctp_bind_addr *bp, union 
sctp_addr *new,
if (!addr)
return -ENOMEM;
 
-   memcpy(&addr->a, new, sizeof(*new));
+   memcpy(&addr->a, new, min_t(size_t, sizeof(*new), new_size));
 
/* Fix up the port if it has not yet been set.
 * Both v4 and v6 have the port at the same offset.
@@ -291,7 +292,8 @@ int sctp_raw_to_bind_addrs(struct sctp_bind_addr *bp, __u8 
*raw_addr_list,
}
 
af->from_addr_param(&addr, rawaddr, htons(port), 0);
-   retval = sctp_add_bind_addr(bp, &addr, SCTP_ADDR_SRC, gfp);
+   retval = sctp_add_bind_addr(bp, &addr, sizeof(addr),
+   SCTP_ADDR_SRC, gfp);
if (retval) {
/* Can't finish building the list, clean up. */
sctp_bind_addr_clean(bp);
@@ -453,8 +455,8 @@ static int sctp_copy_one_addr(struct net *net, struct 
sctp_bind_addr *dest,
(((AF_INET6 == addr->sa.sa_family) &&
  (flags & SCTP_ADDR6_ALLOWED) &&
  (flags & SCTP_ADDR6_PEERSUPP
-   error = sctp_add_bind_addr(dest, addr, SCTP_ADDR_SRC,
-   gfp);
+   error = sctp_add_bind_addr(dest, addr, sizeof(addr),
+  SCTP_ADDR_SRC, gfp);
}
 
return error;
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 
ab0d538a74ed593571cfaef02cd1bb7ce872abe6..2fb609008311f51344704d82f21b4de9f08253da
 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -214,6 +214,7 @@ int sctp_copy_local_addr_list(struct net *net, struct 
sctp_bind_addr *bp,
  (copy_flags & SCTP_ADDR6_ALLOWED) &&
  (copy_flags & SCTP_ADDR6_PEERSUPP {
error = sctp_add_bind_addr(bp, &addr->a,
+   sizeof(addr->a),
SCTP_ADDR_SRC, GFP_ATOMIC);
if (error)
goto end_copy;
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 
5d6a03fad3789a12290f5f14c5a7efa69c98f41a..1b91e9760fe514db6d89457e1d5da9e02800745e
 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1830,7 +1830,7 @@ no_hmac:
/* Also, add the destination address. */
if (list_empty(>base.bind_addr.address_list)) {
sctp_add_bind_addr(>base.bind_addr, >dest,
-   SCTP_ADDR_SRC, GFP_ATOMIC);
+  

RE: [PATCH net] lan78xx: changed to use updated phy-ignore-interrupts

2016-01-25 Thread Woojung.Huh
Florian,

> Looks fine, just one nit below:
> 
> > @@ -954,6 +956,7 @@ static int lan78xx_link_reset(struct lan78xx_net
> *dev)
> >
> > ret = lan78xx_update_flowcontrol(dev, ecmd.duplex, ladv,
> radv);
> > netif_carrier_on(dev->net);
> 
> Do you need this netif_carrier_on() call here? PHYLIB should set that
> already.
> 
> > +   phy_mac_interrupt(phydev, 1);
> > }

Thanks for the feedback.
As you pointed out, netif_carrier_on() is redundant, as is netif_carrier_off()
in the same routine.
Will post a new patch.

Thanks,
Woojung


[BISECTED] v4.5-rc1 phylib regression

2016-01-25 Thread Aaro Koskinen
Hi,

I get the below crash on OCTEON (with octeon_mgmt interface, genphy)
always during systemd boot.

Bisected to:

commit a9049e0c513c4521dbfaa302af8ed08b3366b41f
Author: Andrew Lunn 
Date:   Wed Jan 6 20:11:26 2016 +0100

mdio: Add support for mdio drivers.

[  250.179887] CPU 2 Unable to handle kernel paging request at virtual address 
, epc == 81637bac, ra == 81637b7c
[  250.218161] Oops[#1]:
[  250.224970] CPU: 2 PID: 850 Comm: systemd-network Not tainted 
4.5.0-rc1-octeon-distro.git-test #1
[  250.251569] task: 800031188000 ti: 80002f8e8000 task.ti: 
80002f8e8000
[  250.251586] $ 0   :  81639774  

[  250.251595] $ 4   :   8174 
0001
[  250.251604] $ 8   : 0001  81106100 
00010001
[  250.251613] $12   :  813eb81c 8150be18 

[  250.251622] $16   : 800031290fc0  800031290fc4 
800031188000
[  250.251631] $20   : 0002 0001 800031290fc8 
80002f8eba40
[  250.251640] $24   : 0038 81105fb0
  
[  250.251649] $28   : 80002f8e8000 80002f8eb740 00fff7857f88 
81637b7c
[  250.251651] Hi: 431bde82d7b50717
[  250.251653] Lo: c8de2ac3222855ea
[  250.251668] epc   : 81637bac __mutex_lock_slowpath+0x7c/0x190
[  250.251675] ra: 81637b7c __mutex_lock_slowpath+0x4c/0x190
[  250.251684] Status: 10108ce3 KX SX UX KERNEL EXL IE 
[  250.251686] Cause : 008c (ExcCode 03)
[  250.251688] BadVA : 
[  250.251690] PrId  : 000d0409 (Cavium Octeon+)
[  250.251699] Modules linked in: pata_octeon_cf libata autofs4
[  250.251704] Process systemd-network (pid: 850, threadinfo=80002f8e8000, 
task=800031188000, tls=00fff75ba700)
[  250.251782] Stack : 800031290fc8  0001 
83f9bca0
[  250.251782]800031290c00 81856558 800031290fc0 
800031225000
[  250.251782]0001  800031008800 
814b6284
[  250.251782]81639774 81175dc4 800031290c00 
814baf80
[  250.251782]80003121 814b64ac 80002f8eb810 
80002f8eb810
[  250.251782]814f7790 800031290c00 814baf80 
814baf80
[  250.251782]800031225000  80003290ca10 
814b66a4
[  250.251782]800031225000 800031290c00 0001 
814f7b80
[  250.251782]800031225000 800031225788 800031225830 
1002
[  250.251782]80002f8ebb80 814ba81c 800fb7105c32 
8116aab4
[  250.251782]...
[  250.251784] Call Trace:
[  250.251791] [] __mutex_lock_slowpath+0x7c/0x190
[  250.251801] [] phy_probe+0x6c/0x120
[  250.251807] [] phy_attach_direct+0xdc/0x1a8
[  250.251814] [] phy_connect_direct+0x2c/0x98
[  250.251822] [] of_phy_connect+0x60/0xb8
[  250.251830] [] octeon_mgmt_open+0x36c/0xad0
[  250.251838] [] __dev_open+0x11c/0x1b0
[  250.251844] [] __dev_change_flags+0xa0/0x188
[  250.251850] [] dev_change_flags+0x30/0x78
[  250.251856] [] do_setlink+0x374/0xa38
[  250.251862] [] rtnl_setlink+0xdc/0x120
[  250.251868] [] rtnetlink_rcv_msg+0xac/0x2a0
[  250.251875] [] netlink_rcv_skb+0xf0/0x120
[  250.251880] [] rtnetlink_rcv+0x38/0x48
[  250.251886] [] netlink_unicast+0x1c0/0x2d8
[  250.251891] [] netlink_sendmsg+0x424/0x480
[  250.251898] [] sock_sendmsg+0x24/0x40
[  250.251904] [] SyS_sendto+0xc4/0x108
[  250.251914] [] syscall_common+0x34/0x58
[  250.251918] 
[  250.251936] 
[  250.251936] Code: 24150001  ffa20008  24140002  0858def5  ffb30010 
 fe74  0c58e612  0240202d 
[  250.251983] ---[ end trace 1be6b781dce6d4dc ]---

A.


Re: [PATCH] brcmfmac: sdio: Increase the default timeouts a bit

2016-01-25 Thread Sjoerd Simons
On Mon, 2016-01-25 at 16:36 +0100, Arend van Spriel wrote:
> On 25-01-16 11:47, Sjoerd Simons wrote:
> > On a Radxa Rock2 board with a Ampak AP6335 (Broadcom 4339 core) it
> > seems
> > the card responds very quickly most of the time, unfortunately
> > during
> > initialisation it sometimes seems to take just a bit over 2 seconds
> > to
> > respond.
> > 
> > This results in initialization failing with messages like:
> >   brcmf_c_preinit_dcmds: Retreiving cur_etheraddr failed, -52
> >   brcmf_bus_start: failed: -52
> >   brcmf_sdio_firmware_callback: dongle is not responding
> > 
> > Increasing the timeout to allow for a bit more headroom allows the
> > card to initialize reliably.
> 
> I would prefer to know where the 2 second response time comes from.
> Could be sdio retuning. Maybe the chromeos people can comment whether
> this has been root caused.

The only reference I could find for where the timeout came from was in
the bcmdhd driver, which has:
  #define IOCTL_RESP_TIMEOUT  2000  /* In milli second default value for Production FW */

But I'm not sure if that's helpful :).
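
For a feel of what the millisecond values mean in timer ticks, here is a
simplified userspace model of a round-up conversion. It is an illustration
only and not the kernel's msecs_to_jiffies(): ms_to_ticks() is a
hypothetical helper, and it assumes a tick rate that divides 1000 evenly
(for example 100, 250 or 1000).

```c
#include <assert.h>

/*
 * Simplified model of a millisecond-to-tick conversion that rounds up.
 * Assumes hz divides 1000 evenly (e.g. 100, 250, 1000). This is an
 * illustration, not the kernel's msecs_to_jiffies().
 */
unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
	unsigned long ms_per_tick;

	assert(hz > 0 && hz <= 1000 && 1000 % hz == 0);

	ms_per_tick = 1000 / hz;
	return (ms + ms_per_tick - 1) / ms_per_tick;
}
```

Under this model, at an assumed tick rate of 250 a 2000 ms timeout is 500
ticks and a 2500 ms timeout is 625 ticks.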


> There is a mmc patch pending in which retuning procedure can be
> deferred
> by the driver. Using that API may resolve the issue as well and I
> would
> prefer that solution.
> 
> > A quick search online after diagnosing/fixing this showed that
> > Google
> > has a similar patch in their ChromeOS tree, so this doesn't seem
> > specific to the board I'm using.
> 
> As the retuning stuff is not in main line I guess we need this fix
> for
> now so...
> 
> Acked-by: Arend van Spriel 
> > Signed-off-by: Sjoerd Simons 
> > 
> > ---
> Still would like to know whether it is firmware initialization or
> some
> mmc stack procedure. Any suggestions to debug this are welcome.
> 
> Regards,
> Arend
> ---
> > 
> >  drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git
> > a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> > b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> > index dd66143..75ac4bd 100644
> > --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> > +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> > @@ -45,8 +45,8 @@
> >  #include "chip.h"
> >  #include "firmware.h"
> >  
> > -#define DCMD_RESP_TIMEOUT  msecs_to_jiffies(2000)
> > -#define CTL_DONE_TIMEOUT   msecs_to_jiffies(2000)
> > +#define DCMD_RESP_TIMEOUT  msecs_to_jiffies(2500)
> > +#define CTL_DONE_TIMEOUT   msecs_to_jiffies(2500)
> >  
> >  #ifdef DEBUG
> >  
> > 

-- 
Sjoerd Simons
Collabora Ltd.


Re: [BISECTED] v4.5-rc1 phylib regression

2016-01-25 Thread Andrew Lunn
On Mon, Jan 25, 2016 at 05:45:21PM +0200, Aaro Koskinen wrote:
> Hi,
> 
> I get the below crash on OCTEON (with octeon_mgmt interface, genphy)
> always during systemd boot.

Hi Aaro

Olof reported a similar issue with a Marvell Ethernet driver/MDIO
driver. Olof's thinking was that the mutex was used before it was
initialised. I will look into this tonight.

Thanks
Andrew

> 
> Bisected to:
> 
> commit a9049e0c513c4521dbfaa302af8ed08b3366b41f
> Author: Andrew Lunn 
> Date:   Wed Jan 6 20:11:26 2016 +0100
> 
> mdio: Add support for mdio drivers.
> 
> [  250.179887] CPU 2 Unable to handle kernel paging request at virtual 
> address , epc == 81637bac, ra == 81637b7c
> [  250.218161] Oops[#1]:
> [  250.224970] CPU: 2 PID: 850 Comm: systemd-network Not tainted 
> 4.5.0-rc1-octeon-distro.git-test #1
> [  250.251569] task: 800031188000 ti: 80002f8e8000 task.ti: 
> 80002f8e8000
> [  250.251586] $ 0   :  81639774  
> 
> [  250.251595] $ 4   :   8174 
> 0001
> [  250.251604] $ 8   : 0001  81106100 
> 00010001
> [  250.251613] $12   :  813eb81c 8150be18 
> 
> [  250.251622] $16   : 800031290fc0  800031290fc4 
> 800031188000
> [  250.251631] $20   : 0002 0001 800031290fc8 
> 80002f8eba40
> [  250.251640] $24   : 0038 81105fb0  
> 
> [  250.251649] $28   : 80002f8e8000 80002f8eb740 00fff7857f88 
> 81637b7c
> [  250.251651] Hi: 431bde82d7b50717
> [  250.251653] Lo: c8de2ac3222855ea
> [  250.251668] epc   : 81637bac __mutex_lock_slowpath+0x7c/0x190
> [  250.251675] ra: 81637b7c __mutex_lock_slowpath+0x4c/0x190
> [  250.251684] Status: 10108ce3   KX SX UX KERNEL EXL IE 
> [  250.251686] Cause : 008c (ExcCode 03)
> [  250.251688] BadVA : 
> [  250.251690] PrId  : 000d0409 (Cavium Octeon+)
> [  250.251699] Modules linked in: pata_octeon_cf libata autofs4
> [  250.251704] Process systemd-network (pid: 850, 
> threadinfo=80002f8e8000, task=800031188000, tls=00fff75ba700)
> [  250.251782] Stack : 800031290fc8  0001 
> 83f9bca0
> [  250.251782]  800031290c00 81856558 800031290fc0 
> 800031225000
> [  250.251782]  0001  800031008800 
> 814b6284
> [  250.251782]  81639774 81175dc4 800031290c00 
> 814baf80
> [  250.251782]  80003121 814b64ac 80002f8eb810 
> 80002f8eb810
> [  250.251782]  814f7790 800031290c00 814baf80 
> 814baf80
> [  250.251782]  800031225000  80003290ca10 
> 814b66a4
> [  250.251782]  800031225000 800031290c00 0001 
> 814f7b80
> [  250.251782]  800031225000 800031225788 800031225830 
> 1002
> [  250.251782]  80002f8ebb80 814ba81c 800fb7105c32 
> 8116aab4
> [  250.251782]  ...
> [  250.251784] Call Trace:
> [  250.251791] [] __mutex_lock_slowpath+0x7c/0x190
> [  250.251801] [] phy_probe+0x6c/0x120
> [  250.251807] [] phy_attach_direct+0xdc/0x1a8
> [  250.251814] [] phy_connect_direct+0x2c/0x98
> [  250.251822] [] of_phy_connect+0x60/0xb8
> [  250.251830] [] octeon_mgmt_open+0x36c/0xad0
> [  250.251838] [] __dev_open+0x11c/0x1b0
> [  250.251844] [] __dev_change_flags+0xa0/0x188
> [  250.251850] [] dev_change_flags+0x30/0x78
> [  250.251856] [] do_setlink+0x374/0xa38
> [  250.251862] [] rtnl_setlink+0xdc/0x120
> [  250.251868] [] rtnetlink_rcv_msg+0xac/0x2a0
> [  250.251875] [] netlink_rcv_skb+0xf0/0x120
> [  250.251880] [] rtnetlink_rcv+0x38/0x48
> [  250.251886] [] netlink_unicast+0x1c0/0x2d8
> [  250.251891] [] netlink_sendmsg+0x424/0x480
> [  250.251898] [] sock_sendmsg+0x24/0x40
> [  250.251904] [] SyS_sendto+0xc4/0x108
> [  250.251914] [] syscall_common+0x34/0x58
> [  250.251918] 
> [  250.251936] 
> [  250.251936] Code: 24150001  ffa20008  24140002  0858def5  
> ffb30010  fe74  0c58e612  0240202d 
> [  250.251983] ---[ end trace 1be6b781dce6d4dc ]---
> 
> A.


Re: [PATCH] brcmfmac: sdio: Increase the default timeouts a bit

2016-01-25 Thread Arend van Spriel
On 25-01-16 11:47, Sjoerd Simons wrote:
> On a Radxa Rock2 board with a Ampak AP6335 (Broadcom 4339 core) it seems
> the card responds very quickly most of the time, unfortunately during
> initialisation it sometimes seems to take just a bit over 2 seconds to
> respond.
> 
> This results in initialization failing with messages like:
>   brcmf_c_preinit_dcmds: Retreiving cur_etheraddr failed, -52
>   brcmf_bus_start: failed: -52
>   brcmf_sdio_firmware_callback: dongle is not responding
> 
> Increasing the timeout to allow for a bit more headroom allows the
> card to initialize reliably.

I would prefer to know where the 2 second response time comes from.
Could be sdio retuning. Maybe the chromeos people can comment whether
this has been root caused.

There is a mmc patch pending in which retuning procedure can be deferred
by the driver. Using that API may resolve the issue as well and I would
prefer that solution.

> A quick search online after diagnosing/fixing this showed that Google
> has a similar patch in their ChromeOS tree, so this doesn't seem
> specific to the board I'm using.

As the retuning stuff is not in main line I guess we need this fix for
now so...

Acked-by: Arend van Spriel 
> Signed-off-by: Sjoerd Simons 
> 
> ---
Still would like to know whether it is firmware initialization or some
mmc stack procedure. Any suggestions to debug this are welcome.

Regards,
Arend
---
> 
>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c 
> b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> index dd66143..75ac4bd 100644
> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> @@ -45,8 +45,8 @@
>  #include "chip.h"
>  #include "firmware.h"
>  
> -#define DCMD_RESP_TIMEOUT msecs_to_jiffies(2000)
> -#define CTL_DONE_TIMEOUT  msecs_to_jiffies(2000)
> +#define DCMD_RESP_TIMEOUT msecs_to_jiffies(2500)
> +#define CTL_DONE_TIMEOUT  msecs_to_jiffies(2500)
>  
>  #ifdef DEBUG
>  
> 


Re: Bypass at packet-page level (Was: Optimizing instruction-cache, more packets at each stage)

2016-01-25 Thread Tom Herbert
On Mon, Jan 25, 2016 at 5:15 AM, Jesper Dangaard Brouer
 wrote:
>
> After reading John's reply about perfect filters, I want to re-state
> my idea, for this very early RX stage.  And describe a packet-page
> level bypass use-case, that John indirectly mentions.
>
>
> There are two ideas, getting mixed up here.  (1) bundling from the
> RX-ring, (2) allowing to pick up the "packet-page" directly.
>
> Bundling (1) is something that seems natural, and which helps us
> amortize the cost between layers (and utilize the icache better). Let's
> keep that in another thread.
>
> This (2) direct forward of "packet-pages" is a fairly extreme idea,
> BUT it has the potential of being a new integration point for
> "selective" bypass-solutions and bringing RAW/af_packet (RX) up to
> speed with bypass-solutions.
>
>
> Today, the bypass-solutions grab and control the entire NIC HW.  In
> many cases this is not very practical, if you also want to use the NIC
> for something else.
>
> Solutions for bypassing only part of the traffic are starting to show
> up: both a netmap[1] and a DPDK[2] based approach.
>
> [1] https://blog.cloudflare.com/partial-kernel-bypass-merged-netmap/
> [2] http://rhelblog.redhat.com/2015/10/02/getting-the-best-of-both-worlds-with-queue-splitting-bifurcated-driver/
>
> Both approaches install a HW filter in the NIC, and redirect packets
> to a separate RX HW queue (via ethtool ntuple + flow-type).  DPDK
> needs a PCI SR-IOV setup and then runs its own poll-mode driver on top.
> Netmap patches the original ixgbe driver and, since CloudFlare/Gilberto's
> changes[3], supports a single RX queue mode.
>
Jesper, thanks for providing more specifics.

One comment: If you intend to change core code paths or APIs for this,
then I think that we should require up front that the associated HW
support is protocol agnostic (i.e. HW filters must be programmable and
generic). We don't want a promising feature like this to be
undermined by protocol ossification.

Thanks,
Tom

> [3] https://github.com/luigirizzo/netmap/pull/87
>
>
> I'm thinking, why run all this extra driver software on top.  Why
> don't we just pick up the (packet)-page from the RX ring, and
> hand it over to a registered bypass handler?  (as mentioned before,
> the HW descriptor needs to somehow "mark" these packets for us).
>
> I imagine some kind of page ring structure, and I also imagine
> RAW/af_packet being a "bypass" consumer.  I guess the af_packet part
> was also something John and Daniel have been looking at.
>
>
> (top post, but left John's reply below, because it got me thinking)
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   Author of http://www.iptv-analyzer.org
>   LinkedIn: http://www.linkedin.com/in/brouer
>
>
>
>
> On Sun, 24 Jan 2016 09:28:36 -0800
> John Fastabend  wrote:
>
>> On 16-01-24 06:44 AM, Michael S. Tsirkin wrote:
>> > On Sun, Jan 24, 2016 at 03:28:14PM +0100, Jesper Dangaard Brouer wrote:
>> >> On Thu, 21 Jan 2016 10:54:01 -0800 (PST)
>> >> David Miller  wrote:
>> >>
>> >>> From: Jesper Dangaard Brouer 
>> >>> Date: Thu, 21 Jan 2016 12:27:30 +0100
>> >>>
> [...]
>
>> >>
>> >> BUT then I realized, what if we take this even further.  What if we
>> >> actually use this information, for something useful, at this very
>> >> early RX stage.
>> >>
>> >> The information I'm interested in, from the HW descriptor, is if this
>> >> packet is NOT for local delivery.  If so, we can send the packet on a
>> >> "fast-forward" code path.
>> >>
>> >> Think about bridging packets to a guest OS.  Because we know very
>> >> early at RX (from packet HW descriptor) we might even avoid allocating
>> >> a SKB.  We could just "forward" the packet-page to the guest OS.
>> >
>> > OK, so you would build a new kind of rx handler, and then
>> > e.g. macvtap could maybe get packets this way?
>> > Sure - e.g. vhost expects an skb at the moment
>> > but it won't be too hard to teach it that there's
>> > some other option.
>>
>> + Daniel, Vlad
>>
>> If you use the macvtap device with the offload features you can "know"
>> via mac address that all packets on a specific hardware queue set belong
>> to a specific guest. (the queues are bound to a new netdev) This works
>> well with the passthru mode of macvlan. So you can do hardware bridging
>> this way. Supporting similar L3 modes, probably not via macvlan, has been
>> on my todo list for a while but I haven't got there yet. The ixgbe and fm10k
>> Intel drivers support this now, maybe others too, but those are the two I've
>> worked with recently.
>>
>> The idea here is you remove any overhead from running bridge code, etc.
>> but still allowing users to stick netfilter, qos, etc hooks in the
>> datapath.
>>
>> Also Daniel and I started working on a zero-copy RX mode which would
>> further help this by letting vhost-net pass down a set of dma buffers
>> we should probably get this working and 

[PATCH net-next] hv_netvsc: Fix book keeping of skb during batching process

2016-01-25 Thread Haiyang Zhang
Since eliminating send_completion_tid from struct hv_netvsc_packet, we
haven't added proper bookkeeping for the skb of the batched packet. This
patch fixes the issue and ensures the previous skb is properly freed.
Otherwise, a panic may happen.
Thanks to Simon Xiao for bisecting and analysis.

Signed-off-by: Haiyang Zhang 
Reviewed-by: K. Y. Srinivasan 
---
 drivers/net/hyperv/hyperv_net.h |1 +
 drivers/net/hyperv/netvsc.c |   33 ++---
 2 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index f4130af..fcb92c0 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -624,6 +624,7 @@ struct nvsp_message {
 #define RNDIS_PKT_ALIGN_DEFAULT 8
 
 struct multi_send_data {
+   struct sk_buff *skb; /* skb containing the pkt */
struct hv_netvsc_packet *pkt; /* netvsc pkt pending */
u32 count; /* counter of batched packets */
 };
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 059fc52..ec313fc 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -841,6 +841,18 @@ static inline int netvsc_send_pkt(
return ret;
 }
 
+/* Move packet out of multi send data (msd), and clear msd */
+static inline void move_pkt_msd(struct hv_netvsc_packet **msd_send,
+   struct sk_buff **msd_skb,
+   struct multi_send_data *msdp)
+{
+   *msd_skb = msdp->skb;
+   *msd_send = msdp->pkt;
+   msdp->skb = NULL;
+   msdp->pkt = NULL;
+   msdp->count = 0;
+}
+
 int netvsc_send(struct hv_device *device,
struct hv_netvsc_packet *packet,
struct rndis_message *rndis_msg,
@@ -855,6 +867,7 @@ int netvsc_send(struct hv_device *device,
unsigned int section_index = NETVSC_INVALID_INDEX;
struct multi_send_data *msdp;
struct hv_netvsc_packet *msd_send = NULL, *cur_send = NULL;
+   struct sk_buff *msd_skb = NULL;
bool try_batch;
bool xmit_more = (skb != NULL) ? skb->xmit_more : false;
 
@@ -897,10 +910,8 @@ int netvsc_send(struct hv_device *device,
   net_device->send_section_size) {
section_index = netvsc_get_next_send_section(net_device);
if (section_index != NETVSC_INVALID_INDEX) {
-   msd_send = msdp->pkt;
-   msdp->pkt = NULL;
-   msdp->count = 0;
-   msd_len = 0;
+   move_pkt_msd(&msd_send, &msd_skb, msdp);
+   msd_len = 0;
}
}
 
@@ -919,31 +930,31 @@ int netvsc_send(struct hv_device *device,
packet->total_data_buflen += msd_len;
}
 
-   if (msdp->pkt)
-   dev_kfree_skb_any(skb);
+   if (msdp->skb)
+   dev_kfree_skb_any(msdp->skb);
 
if (xmit_more && !packet->cp_partial) {
+   msdp->skb = skb;
msdp->pkt = packet;
msdp->count++;
} else {
cur_send = packet;
+   msdp->skb = NULL;
msdp->pkt = NULL;
msdp->count = 0;
}
} else {
-   msd_send = msdp->pkt;
-   msdp->pkt = NULL;
-   msdp->count = 0;
+   move_pkt_msd(&msd_send, &msd_skb, msdp);
cur_send = packet;
}
 
if (msd_send) {
-   m_ret = netvsc_send_pkt(msd_send, net_device, pb, skb);
+   m_ret = netvsc_send_pkt(msd_send, net_device, NULL, msd_skb);
 
if (m_ret != 0) {
netvsc_free_send_slot(net_device,
  msd_send->send_buf_index);
-   dev_kfree_skb_any(skb);
+   dev_kfree_skb_any(msd_skb);
}
}
 
-- 
1.7.4.1



RE: [PATCH 1/1] bonding: Use notifiers for slave link state detection

2016-01-25 Thread Tantilov, Emil S
>-Original Message-
>From: zyjzyj2...@gmail.com [mailto:zyjzyj2...@gmail.com]
>Sent: Thursday, January 21, 2016 2:16 AM
>To: zyjzyj2...@gmail.com; mkube...@suse.cz; vfal...@gmail.com;
>go...@cumulusnetworks.com; netdev@vger.kernel.org; Shteinbock, Boris (Wind
>River); jay.vosbu...@canonical.com; Tantilov, Emil S
>Subject: [PATCH 1/1] bonding: Use notifiers for slave link state detection
>
>
>Hi, Jay && Emil
>
>Thanks for your hard work. I forgot to send this patch to you. Please help
>to review. Thanks a lot.
>
>I think a similar patch is needed in Linux kernel 4.4. As such, I made this
>patch based on Linux kernel 4.4. Please comment.

The patch you are referring to has not been accepted in net-next yet.
If/when that happens you can request it to be ported to the stable tree.

The last version I tested seemed to work OK, but Jay mentioned some RCU
warnings and I was expecting a follow-up on it.

Thanks,
Emil



[PATCH net] sctp: fix copying more bytes than expected in sctp_add_bind_addr

2016-01-25 Thread Marcelo Ricardo Leitner
Great. Dmitry, please give this a run. Local tests looked good but who
knows what syzkaller may find.

Thanks

--8<--

Dmitry reported that sctp_add_bind_addr may read more bytes than
expected in case the parameter is an IPv4 address supplied by the user
through calls such as sctp_bindx_add(), because it always copies
sizeof(union sctp_addr) while the buffer may be just a struct
sockaddr_in, which is smaller.

This patch fixes it by limiting the memcpy to the minimum of the
union size and a new parameter giving the size of the provided address.
Where possible this parameter is still the size of that union; only when
reading from user-provided buffers does it account for the protocol type.
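
For illustration only, here is a minimal user-space sketch of the bounded
copy described above. It is not the kernel code: union demo_addr and
demo_copy_addr() are hypothetical stand-ins for union sctp_addr and the
copy done inside sctp_add_bind_addr(), and the explicit memset is an
assumption made so that bytes beyond the copied length are well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical stand-in for the kernel's union sctp_addr. */
union demo_addr {
	struct sockaddr_in  v4;
	struct sockaddr_in6 v6;
	struct sockaddr     sa;
};

/*
 * Copy an address whose real size is src_len into the union.
 * At most min(src_len, sizeof(*dst)) bytes are read from src, so a
 * caller that only owns a struct sockaddr_in is never over-read.
 * Returns the number of bytes copied.
 */
static size_t demo_copy_addr(union demo_addr *dst, const void *src,
			     size_t src_len)
{
	size_t n = src_len < sizeof(*dst) ? src_len : sizeof(*dst);

	assert(dst != NULL && src != NULL);
	memset(dst, 0, sizeof(*dst));	/* unused tail stays zeroed */
	memcpy(dst, src, n);
	return n;
}
```

A caller holding a struct sockaddr_in would pass sizeof(struct sockaddr_in)
as src_len; callers that already hold a full union pass sizeof the union,
which mirrors the split the commit message describes.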

Reported-by: Dmitry Vyukov 
Signed-off-by: Marcelo Ricardo Leitner 
---
 include/net/sctp/structs.h |  2 +-
 net/sctp/bind_addr.c   | 14 --
 net/sctp/protocol.c|  1 +
 net/sctp/sm_make_chunk.c   |  2 +-
 net/sctp/socket.c  |  5 +++--
 5 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 20e72129be1ce0063eeafcbaadcee1f37e0c614c..97ba8a8c466f5c50bdc87ec578792e56553baa91 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -1099,7 +1099,7 @@ int sctp_bind_addr_dup(struct sctp_bind_addr *dest,
const struct sctp_bind_addr *src,
gfp_t gfp);
 int sctp_add_bind_addr(struct sctp_bind_addr *, union sctp_addr *,
-  __u8 addr_state, gfp_t gfp);
+  int new_size, __u8 addr_state, gfp_t gfp);
 int sctp_del_bind_addr(struct sctp_bind_addr *, union sctp_addr *);
 int sctp_bind_addr_match(struct sctp_bind_addr *, const union sctp_addr *,
 struct sctp_sock *);
diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
index 871cdf9567e6bc9c13cb1077dc6866a67e6e4367..80129d10a0af9c33e7348b79d010b9e5e948e584 100644
--- a/net/sctp/bind_addr.c
+++ b/net/sctp/bind_addr.c
@@ -111,7 +111,8 @@ int sctp_bind_addr_dup(struct sctp_bind_addr *dest,
dest->port = src->port;
 
list_for_each_entry(addr, &src->address_list, list) {
-   error = sctp_add_bind_addr(dest, &addr->a, 1, gfp);
+   error = sctp_add_bind_addr(dest, &addr->a, sizeof(addr->a),
+  1, gfp);
if (error < 0)
break;
}
@@ -150,7 +151,7 @@ void sctp_bind_addr_free(struct sctp_bind_addr *bp)
 
 /* Add an address to the bind address list in the SCTP_bind_addr structure. */
 int sctp_add_bind_addr(struct sctp_bind_addr *bp, union sctp_addr *new,
-  __u8 addr_state, gfp_t gfp)
+  int new_size, __u8 addr_state, gfp_t gfp)
 {
struct sctp_sockaddr_entry *addr;
 
@@ -159,7 +160,7 @@ int sctp_add_bind_addr(struct sctp_bind_addr *bp, union sctp_addr *new,
if (!addr)
return -ENOMEM;
 
-   memcpy(&addr->a, new, sizeof(*new));
+   memcpy(&addr->a, new, min_t(size_t, sizeof(*new), new_size));
 
/* Fix up the port if it has not yet been set.
 * Both v4 and v6 have the port at the same offset.
@@ -291,7 +292,8 @@ int sctp_raw_to_bind_addrs(struct sctp_bind_addr *bp, __u8 *raw_addr_list,
}
 
af->from_addr_param(&addr, rawaddr, htons(port), 0);
-   retval = sctp_add_bind_addr(bp, &addr, SCTP_ADDR_SRC, gfp);
+   retval = sctp_add_bind_addr(bp, &addr, sizeof(addr),
+   SCTP_ADDR_SRC, gfp);
if (retval) {
/* Can't finish building the list, clean up. */
sctp_bind_addr_clean(bp);
@@ -453,8 +455,8 @@ static int sctp_copy_one_addr(struct net *net, struct sctp_bind_addr *dest,
(((AF_INET6 == addr->sa.sa_family) &&
  (flags & SCTP_ADDR6_ALLOWED) &&
  (flags & SCTP_ADDR6_PEERSUPP))))
-   error = sctp_add_bind_addr(dest, addr, SCTP_ADDR_SRC,
-   gfp);
+   error = sctp_add_bind_addr(dest, addr, sizeof(addr),
+  SCTP_ADDR_SRC, gfp);
}
 
return error;
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index ab0d538a74ed593571cfaef02cd1bb7ce872abe6..2fb609008311f51344704d82f21b4de9f08253da 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -214,6 +214,7 @@ int sctp_copy_local_addr_list(struct net *net, struct sctp_bind_addr *bp,
  (copy_flags & SCTP_ADDR6_ALLOWED) &&
  (copy_flags & SCTP_ADDR6_PEERSUPP))) {
error = sctp_add_bind_addr(bp, &addr->a,
+   sizeof(addr->a),
SCTP_ADDR_SRC, GFP_ATOMIC);

Re: [PATCH v2] net: fec: use CONFIG_ARM instead of CONFIG_ARCH_MXC/SOC_IMX28

2016-01-25 Thread David Miller
From: Johannes Berg 
Date: Mon, 25 Jan 2016 11:40:50 +0100

> As Arnd Bergmann points out, using CONFIG_ARCH_MXC and/or SOC_IMX28
> is wrong if some other ARM platform uses this device - the operation
> of the driver would depend on an unrelated ARM platform that might
> or might not be set for multi-platform kernels.
> 
> Prior to my previous patch, any other platforms using it would have
> been broken already due to having the cbd_datlen/cbd_sc fields in
> the wrong order, but byte ordering correctly, so no such platforms
> can exist and work today.
> 
> In any case, it seems likely that only Freescale SoCs use this part,
> and those are little-endian on ARM, so CONFIG_ARM is safe for them.
> 
> Signed-off-by: Johannes Berg 

Applied.


Re: [PATCH net] sit: set rtnl_link_ops before calling register_netdevice

2016-01-25 Thread David Miller
From: Thadeu Lima de Souza Cascardo 
Date: Mon, 25 Jan 2016 11:29:19 -0200

> When creating a SIT tunnel with ip tunnel, rtnl_link_ops is not set before
> ipip6_tunnel_create is called. When register_netdevice is called, there is
> no linkinfo attribute in the NEWLINK message because of that.
> 
> Setting rtnl_link_ops before calling register_netdevice fixes that.
> 
> Signed-off-by: Thadeu Lima de Souza Cascardo 

Applied.


Re: [PATCH 2/3] net: macb: fix build warning

2016-01-25 Thread David Miller
From: Sudip Mukherjee 
Date: Mon, 25 Jan 2016 11:43:09 +0530

> We are getting build warning about:
> macb.c:2889:13: warning: 'tx_clk' may be used uninitialized in this function
> macb.c:2888:11: warning: 'hclk' may be used uninitialized in this function
> 
> In reality they are not used uninitialized as clk_init() will initialize
> them; this patch just silences the warning.
> 
> Signed-off-by: Sudip Mukherjee 

Applied.


Re: [PATCH v2] defxx: fix build warning

2016-01-25 Thread David Miller
From: Sudip Mukherjee 
Date: Mon, 25 Jan 2016 13:05:20 +0530

> We are getting many build warnings about:
> 'bar_start' may be used uninitialized
> and
> 'bar_len' may be used uninitialized
> 
> They are not actually uninitialized as dfx_get_bars() will initialize
> them properly. Still, let's initialize them just to satisfy the
> compiler (gcc 4.8.2).
> 
> Signed-off-by: Sudip Mukherjee 

Applied.


Re: [PATCH 4/4] dt: binding: Add Qualcomm wcn36xx WiFi binding

2016-01-25 Thread Bjorn Andersson
On Tue, Dec 29, 2015 at 11:03 AM, Bjorn Andersson  wrote:
> On Tue 29 Dec 10:34 PST 2015, Rob Herring wrote:
>
>> On Sun, Dec 27, 2015 at 05:34:27PM -0800, Bjorn Andersson wrote:
>> > Add binding representing the Qualcomm wcn3620/60/80 WiFi block.
>> > Signed-off-by: Bjorn Andersson 
>> > ---
>> >  .../bindings/net/wireless/qcom,wcn36xx-wifi.txt| 76 ++
>> >  1 file changed, 76 insertions(+)
>> >  create mode 100644 Documentation/devicetree/bindings/net/wireless/qcom,wcn36xx-wifi.txt
>> >
>> > diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,wcn36xx-wifi.txt b/Documentation/devicetree/bindings/net/wireless/qcom,wcn36xx-wifi.txt
>> > new file mode 100644
>> > index ..7b314b9f30af
>> > --- /dev/null
>> > +++ b/Documentation/devicetree/bindings/net/wireless/qcom,wcn36xx-wifi.txt
>> > @@ -0,0 +1,76 @@
>> > +Qualcomm WCN36xx WiFi Binding
>> > +
>> > +This binding describes the Qualcomm WCN36xx WiFi hardware. The hardware block
>> > +is part of the Qualcomm WCNSS core, a WiFi/BT/FM combo chip, found in a variety
>> > +of Qualcomm platforms.
>>
>> Are BT/FM functions completely separate? If so, separate bindings are
>> okay. If not, then we need to describe the full chip.
>>
>
> It's three different hardware blocks (WiFi, BT and FM-radio) with shared
> RF-hardware and an ARM core for control logic.
>
> There seems to be some control commands going towards the BT part that
> controls coexistence properties of the RF-hardware, but other than that
> I see it as logically separate blocks.
>
>
> So I think it's fine to model this as separate pieces in DT.
>

After more testing I've concluded that there is a timing dependency
between the WiFi driver and the wcnss_ctrl driver. If the WiFi driver
starts communicating with the WLAN subsystem in the WCNSS block before
we have finished uploading the NV data to the WCNSS core, further
communication will fail.

So it looks like I need to remodel this slightly to take this into account :/

Regards,
Bjorn


Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection

2016-01-25 Thread David Miller
From: "Tantilov, Emil S" 
Date: Mon, 25 Jan 2016 16:33:37 +

> The patch you are referring to has not been accepted in net-next yet.
> If/when that happens you can request it to be ported to the stable tree.

Wrong.

If you want a patch to get submitted to stable, it must be appropriate for
and you must target it to 'net', not 'net-next'.


Re: net/sctp: out-of-bounds access in sctp_add_bind_addr

2016-01-25 Thread Neil Horman
On Mon, Jan 25, 2016 at 02:16:00PM -0200, Marcelo Ricardo Leitner wrote:
> Something like this. Builds, but UNTESTED.
> Uses the union sizeof where possible, except when reading from a buffer
> that may be smaller than the union, like the user-supplied one; there it
> relies on af->sockaddr_len.
> 
> --8<--
> 
> ---
>  include/net/sctp/structs.h |  2 +-
>  net/sctp/bind_addr.c   | 14 --
>  net/sctp/protocol.c|  1 +
>  net/sctp/sm_make_chunk.c   |  2 +-
>  net/sctp/socket.c  |  5 +++--
>  5 files changed, 14 insertions(+), 10 deletions(-)
> 
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index 20e72129be1ce0063eeafcbaadcee1f37e0c614c..97ba8a8c466f5c50bdc87ec578792e56553baa91 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -1099,7 +1099,7 @@ int sctp_bind_addr_dup(struct sctp_bind_addr *dest,
>   const struct sctp_bind_addr *src,
>   gfp_t gfp);
>  int sctp_add_bind_addr(struct sctp_bind_addr *, union sctp_addr *,
> -__u8 addr_state, gfp_t gfp);
> +int new_size, __u8 addr_state, gfp_t gfp);
>  int sctp_del_bind_addr(struct sctp_bind_addr *, union sctp_addr *);
>  int sctp_bind_addr_match(struct sctp_bind_addr *, const union sctp_addr *,
>struct sctp_sock *);
> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
> index 871cdf9567e6bc9c13cb1077dc6866a67e6e4367..80129d10a0af9c33e7348b79d010b9e5e948e584 100644
> --- a/net/sctp/bind_addr.c
> +++ b/net/sctp/bind_addr.c
> @@ -111,7 +111,8 @@ int sctp_bind_addr_dup(struct sctp_bind_addr *dest,
>   dest->port = src->port;
>  
>   list_for_each_entry(addr, &src->address_list, list) {
> - error = sctp_add_bind_addr(dest, &addr->a, 1, gfp);
> + error = sctp_add_bind_addr(dest, &addr->a, sizeof(addr->a),
> +1, gfp);
>   if (error < 0)
>   break;
>   }
> @@ -150,7 +151,7 @@ void sctp_bind_addr_free(struct sctp_bind_addr *bp)
>  
>  /* Add an address to the bind address list in the SCTP_bind_addr structure. */
>  int sctp_add_bind_addr(struct sctp_bind_addr *bp, union sctp_addr *new,
> -__u8 addr_state, gfp_t gfp)
> +int new_size, __u8 addr_state, gfp_t gfp)
>  {
>   struct sctp_sockaddr_entry *addr;
>  
> @@ -159,7 +160,7 @@ int sctp_add_bind_addr(struct sctp_bind_addr *bp, union sctp_addr *new,
>   if (!addr)
>   return -ENOMEM;
>  
> - memcpy(&addr->a, new, sizeof(*new));
> + memcpy(&addr->a, new, min_t(size_t, sizeof(*new), new_size));
>  
>   /* Fix up the port if it has not yet been set.
>* Both v4 and v6 have the port at the same offset.
> @@ -291,7 +292,8 @@ int sctp_raw_to_bind_addrs(struct sctp_bind_addr *bp, __u8 *raw_addr_list,
>   }
>  
>   af->from_addr_param(&addr, rawaddr, htons(port), 0);
> - retval = sctp_add_bind_addr(bp, &addr, SCTP_ADDR_SRC, gfp);
> + retval = sctp_add_bind_addr(bp, &addr, sizeof(addr),
> + SCTP_ADDR_SRC, gfp);
>   if (retval) {
>   /* Can't finish building the list, clean up. */
>   sctp_bind_addr_clean(bp);
> @@ -453,8 +455,8 @@ static int sctp_copy_one_addr(struct net *net, struct sctp_bind_addr *dest,
>   (((AF_INET6 == addr->sa.sa_family) &&
> (flags & SCTP_ADDR6_ALLOWED) &&
> (flags & SCTP_ADDR6_PEERSUPP))))
> - error = sctp_add_bind_addr(dest, addr, SCTP_ADDR_SRC,
> - gfp);
> + error = sctp_add_bind_addr(dest, addr, sizeof(addr),
> +SCTP_ADDR_SRC, gfp);
>   }
>  
>   return error;
> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> index ab0d538a74ed593571cfaef02cd1bb7ce872abe6..2fb609008311f51344704d82f21b4de9f08253da 100644
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -214,6 +214,7 @@ int sctp_copy_local_addr_list(struct net *net, struct sctp_bind_addr *bp,
> (copy_flags & SCTP_ADDR6_ALLOWED) &&
> (copy_flags & SCTP_ADDR6_PEERSUPP))) {
>   error = sctp_add_bind_addr(bp, &addr->a,
> + sizeof(addr->a),
>   SCTP_ADDR_SRC, GFP_ATOMIC);
>   if (error)
>   goto end_copy;
> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
> index 5d6a03fad3789a12290f5f14c5a7efa69c98f41a..1b91e9760fe514db6d89457e1d5da9e02800745e 100644
> --- a/net/sctp/sm_make_chunk.c
> +++ b/net/sctp/sm_make_chunk.c
> @@ -1830,7 +1830,7 @@ no_hmac:
>   /* Also, add the destination address. */
>  

Re: Bypass at packet-page level (Was: Optimizing instruction-cache, more packets at each stage)

2016-01-25 Thread John Fastabend
On 16-01-25 09:09 AM, Tom Herbert wrote:
> On Mon, Jan 25, 2016 at 5:15 AM, Jesper Dangaard Brouer
>  wrote:
>>
>> After reading John's reply about perfect filters, I want to re-state
>> my idea, for this very early RX stage.  And describe a packet-page
>> level bypass use-case, that John indirectly mentions.
>>
>>
>> There are two ideas, getting mixed up here.  (1) bundling from the
>> RX-ring, (2) allowing to pick up the "packet-page" directly.
>>
>> Bundling (1) is something that seems natural, and which helps us
>> amortize the cost between layers (and utilize the icache better). Let's
>> keep that in another thread.
>>
>> This (2) direct forward of "packet-pages" is a fairly extreme idea,
>> BUT it has the potential of being a new integration point for
>> "selective" bypass-solutions and bringing RAW/af_packet (RX) up to
>> speed with bypass-solutions.
>>
>>
>> Today, the bypass-solutions grab and control the entire NIC HW.  In
>> many cases this is not very practical, if you also want to use the NIC
>> for something else.
>>
>> Solutions for bypassing only part of the traffic are starting to show
>> up: both a netmap[1] and a DPDK[2] based approach.
>>
>> [1] https://blog.cloudflare.com/partial-kernel-bypass-merged-netmap/
>> [2] http://rhelblog.redhat.com/2015/10/02/getting-the-best-of-both-worlds-with-queue-splitting-bifurcated-driver/
>>
>> Both approaches install a HW filter in the NIC, and redirect packets
>> to a separate RX HW queue (via ethtool ntuple + flow-type).  DPDK
>> needs a PCI SR-IOV setup and then runs its own poll-mode driver on top.
>> Netmap patches the original ixgbe driver and, since CloudFlare/Gilberto's
>> changes[3], supports a single RX queue mode.
>>

FWIW I wrote a version of the patch discussed in the queue splitting
article that didn't require SR-IOV, and we also talked about it at the
last netconf in Ottawa. The problem is that without SR-IOV, if you map a
queue directly into userspace so you can run the poll mode drivers
there, nothing protects the DMA engine, so userspace can put arbitrary
addresses in the descriptors. There is something called Process Address
Space ID (PASID), also part of the PCI-SIG spec, that could help here,
but I don't know of any hardware that supports it. The other option is
to use system calls and validate the descriptors in the kernel, but this
incurs some overhead; we had it at 15% or so when I did the numbers
last year. However, I'm told there is some interesting work going on
around syscall overhead that may help.

One thing to note is that SR-IOV somewhat limits the number of these
types of interfaces you can support to the max number of VFs, whereas
the queue mechanism, although slower with a function call, would be
limited to the max number of queues. Also, busy polling will help here
if you are worried about pps.

Jesper, at least for your (2) case, what are we missing with the
bifurcated/queue splitting work? Are you really after systems
without SR-IOV support, or are you trying to get this on the order
of queues instead of VFs?

> Jesper, thanks for providing more specifics.
> 
> One comment: If you intend to change core code paths or APIs for this,
> then I think that we should require up front that the associated HW
> support is protocol agnostic (i.e. HW filters must be programmable and
> generic). We don't want a promising feature like this to be
> undermined by protocol ossification.

At the moment we use ethtool ntuple filters, which basically means adding
a new set of enums and structures every time we need a new protocol,
so it's painful: you need your vendor to support you and you need a
new kernel.

The flow API was shot down (it would get you to the point where
the user could specify the protocols for the driver to implement, e.g.
put_parse_graph), and the only new proposals I've seen are bpf
translations in drivers and 'tc'. I plan to take another shot at this in
net-next.

> 
> Thanks,
> Tom
> 
>> [3] https://github.com/luigirizzo/netmap/pull/87
>>



RE: [PATCH 1/1] bonding: Use notifiers for slave link state detection

2016-01-25 Thread Tantilov, Emil S
>-Original Message-
>From: David Miller [mailto:da...@davemloft.net]
>Sent: Monday, January 25, 2016 10:00 AM
>To: Tantilov, Emil S
>Cc: zyjzyj2...@gmail.com; mkube...@suse.cz; vfal...@gmail.com;
>go...@cumulusnetworks.com; netdev@vger.kernel.org; Shteinbock, Boris (Wind
>River); jay.vosbu...@canonical.com
>Subject: Re: [PATCH 1/1] bonding: Use notifiers for slave link state
>detection
>
>From: "Tantilov, Emil S" 
>Date: Mon, 25 Jan 2016 16:33:37 +
>
>> The patch you are referring to has not been accepted in net-next yet.
>> If/when that happens you can request it to be ported to the stable tree.
>
>Wrong.
>
>If you want a patch to get submitted to stable, it must be appropriate for
>and you must target it to 'net', not 'net-next'.

Yeah that came out wrong. I was just trying to point out that this patch
is a port to 4.4 kernel of a test patch from Jay that hasn't even gotten
into the net/net-next trees yet.

Thanks,
Emil



Re: [PATCH] ipv4+ipv6: Make INET*_ESP select CRYPTO_ECHAINIV

2016-01-25 Thread David Miller
From: Herbert Xu 
Date: Mon, 25 Jan 2016 21:56:28 +0800

> On Mon, Jan 25, 2016 at 12:58:44PM +0100, Thomas Egerer wrote:
>> The ESP algorithms using CBC mode require echainiv. Hence INET*_ESP have
>> to select CRYPTO_ECHAINIV in order to work properly. This solves the
>> issues caused by a misconfiguration as described in [1].
>> The original approach, patching crypto/Kconfig was turned down by
>> Herbert Xu [2].
>> 
>> [1] https://lists.strongswan.org/pipermail/users/2015-December/009074.html
>> [2] http://marc.info/?l=linux-crypto-vger&m=145224655809562&w=2
>> 
>> Signed-off-by: Thomas Egerer 
> 
> Acked-by: Herbert Xu 

Applied, thanks.


Re: [PATCH v2] net: dsa: fix mv88e6xxx switches

2016-01-25 Thread David Miller
From: Russell King 
Date: Sun, 24 Jan 2016 09:22:05 +

> Since commit 76e398a62712 ("net: dsa: use switchdev obj for VLAN add/del
> ops"), the Marvell 88E6xxx switch has been unable to pass traffic
> between ports - any received traffic is discarded by the switch.
> Taking a port out of bridge mode and configuring a vlan on it also causes
> the port to start passing traffic.
> 
> With the debugfs files re-instated to allow debug of this issue by
> comparing the register settings between the working and non-working
> case, the reason becomes clear:
> 
>  GLOBAL GLOBAL2 SERDES   0    1    2    3    4    5    6
> - 7:  707f    2001       2    2    2    2    2    0    2
> + 7:  707f    2001       1    1    1    1    1    0    1
> 
> Register 7 for the ports is the default vlan tag register, and in the
> non-working setup, it has been set to 2, despite vlan 2 not being
> configured.  This causes the switch to drop all packets coming in to
> these ports.  The working setup has the default vlan tag register set
> to 1, which is the default vlan when none is configured.
> 
> Inspection of the code reveals why.  The code prior to this commit
> was:
> 
> - for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) {
> ...
> - if (!err && vlan->flags & BRIDGE_VLAN_INFO_PVID)
> - err = ds->drv->port_pvid_set(ds, p->port, vid);
> 
> but the new code is:
> 
> + for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) {
> ...
> + }
> ...
> + if (pvid)
> + err = _mv88e6xxx_port_pvid_set(ds, port, vid);
> 
> This causes the new code to always set the default vlan to one higher
> than the old code.
> 
> Fix this.
> 
> Fixes: 76e398a62712 ("net: dsa: use switchdev obj for VLAN add/del ops")
> Cc: 
> Signed-off-by: Russell King 

Applied and queued up for -stable, thanks Russell.
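
For reference, the off-by-one described in the quoted commit message can be
reproduced in plain C. This is a stand-alone sketch, not the driver code:
demo_vid_after_loop() and demo_pvid_inside_loop() are hypothetical helpers
that only model which VLAN ID ends up being used as the PVID.

```c
#include <assert.h>

/*
 * Models the regressed code: the loop variable is read after the
 * loop has finished, at which point it is already vid_end + 1.
 */
static unsigned int demo_vid_after_loop(unsigned int vid_begin,
					unsigned int vid_end)
{
	unsigned int vid;
	unsigned int visited = 0;

	assert(vid_begin <= vid_end);
	for (vid = vid_begin; vid <= vid_end; ++vid)
		visited++;		/* per-VLAN work would go here */

	return vid;			/* one past the last VLAN visited */
}

/*
 * Models the earlier behaviour: the PVID is taken from the loop
 * variable while the loop is still running, so it never exceeds
 * vid_end.
 */
static unsigned int demo_pvid_inside_loop(unsigned int vid_begin,
					  unsigned int vid_end)
{
	unsigned int vid;
	unsigned int pvid = 0;

	assert(vid_begin <= vid_end);
	for (vid = vid_begin; vid <= vid_end; ++vid)
		pvid = vid;		/* last assignment wins */

	return pvid;
}
```

With a single-VLAN range of 1..1, the first helper yields 2 and the second
yields 1, which matches the default vlan tag values 2 and 1 in the register
dump quoted above.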


Re: [PATCH v2] net: fec: make driver endian-safe

2016-01-25 Thread David Miller
From: Johannes Berg 
Date: Sun, 24 Jan 2016 16:52:37 +0100

> The driver treats the device descriptors as CPU-endian, which appears
> to be correct with the default endianness on both ARM (typically LE)
> and PowerPC (typically BE) SoCs, indicating that the hardware block
> is generated differently. Add endianness annotations and byteswaps as
> necessary.
> 
> It's not clear that the ifdef there really is correct and shouldn't
> just be #ifdef CONFIG_ARM, but I also can't test on anything but the
> i.MX6 HummingBoard where this gets it working with a BE kernel.
> 
> Signed-off-by: Johannes Berg 

Applied.


Re: [PATCH net 1/3] sctp: fix the transport dead race check by using atomic_add_unless on refcnt

2016-01-25 Thread David Miller
From: Vlad Yasevich 
Date: Fri, 22 Jan 2016 13:54:09 -0500

> OK,  I see how that holds together, but I think there might be a hole wrt icmp
> handling.  Some icmp processes assume transport can't disappear on them, but in
> this case that last put(transport) may result in a call to sctp_transport_destroy()
> and that might be bad.  I am looking at it now.

Vlad, this patch series is being held up because of this.  Please resolve
this one way or the other at your earliest possible convenience, thanks.


Re: [PATCH] 82xx: FCC: Fixing a bug causing to FCC port lock-up (second try)

2016-01-25 Thread David Miller
From: Martin Roth 
Date: Sun, 24 Jan 2016 00:56:19 +0200

> This is an additional patch to the one already submitted recently.
> The previous patch was not complete, and the FCC port lock-up scenario
> has been reproduced in the lab.
> I had an opportunity to check the current patch in the lab, and the FCC
> port no longer locks up, while the previous patch still locks up the
> FCC port.
> The current patch fixes a pointer arithmetic bug (the second bug in the same
> line), which leads to FCC port lock-up during underrun/collision handling.
> Within the tx_startup() function in mac-fcc.c, the address of the last BD is
> not calculated correctly. As a result of this miscalculation of the last BD
> address, the next transmitted BD may be set to an area outside the transmit
> BD ring. This causes the port to lock up, and it is not recoverable.
> 
> Signed-off-by: Martin Roth 

Applied, thank you.
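
The exact expression in mac-fcc.c is not shown in this message, so the following is only a generic sketch of this class of mistake, with hypothetical names (`struct demo_bd`, `last_bd()`, `doubly_scaled_index()`). It assumes the error is of the "scaled twice" or "off by one element" kind, where the computed last descriptor falls outside the ring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte buffer descriptor. */
struct demo_bd {
    uint16_t status;
    uint16_t length;
    uint32_t buf;
};

/* Correct: pointer arithmetic already scales by sizeof(struct demo_bd). */
static struct demo_bd *last_bd(struct demo_bd *base, size_t ring_size)
{
    assert(ring_size > 0);
    return base + (ring_size - 1);
}

/*
 * Element index that a mistaken "base + ring_size * sizeof(bd)" would
 * refer to. It is returned as an index so that no out-of-bounds pointer
 * is ever formed in this sketch.
 */
static size_t doubly_scaled_index(size_t ring_size)
{
    return ring_size * sizeof(struct demo_bd);
}

/* True when idx names a descriptor inside the ring. */
static int index_in_ring(size_t idx, size_t ring_size)
{
    return idx < ring_size;
}
```

For an 8-entry ring the correct last descriptor is element 7, while the doubly scaled form points far beyond the ring, which matches the symptom described: the next BD to transmit ends up outside the transmit BD ring.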


Re: Bypass at packet-page level (Was: Optimizing instruction-cache, more packets at each stage)

2016-01-25 Thread John Fastabend
On 16-01-25 01:32 PM, Tom Herbert wrote:
> On Mon, Jan 25, 2016 at 9:50 AM, John Fastabend
>  wrote:
>> On 16-01-25 09:09 AM, Tom Herbert wrote:
>>> On Mon, Jan 25, 2016 at 5:15 AM, Jesper Dangaard Brouer
>>>  wrote:

>>>> After reading John's reply about perfect filters, I want to re-state
>>>> my idea, for this very early RX stage.  And describe a packet-page
>>>> level bypass use-case, that John indirectly mentions.
>>>>
>>>>
>>>> There are two ideas, getting mixed up here.  (1) bundling from the
>>>> RX-ring, (2) allowing to pick up the "packet-page" directly.
>>>>
>>>> Bundling (1) is something that seems natural, and which help us
>>>> amortize the cost between layers (and utilizes icache better). Lets
>>>> keep that in another thread.
>>>>
>>>> This (2) direct forward of "packet-pages" is a fairly extreme idea,
>>>> BUT it have the potential of being an new integration point for
>>>> "selective" bypass-solutions and bringing RAW/af_packet (RX) up-to
>>>> speed with bypass-solutions.
>>>>
>>>>
>>>> Today, the bypass-solutions grab and control the entire NIC HW.  In
>>>> many cases this is not very practical, if you also want to use the NIC
>>>> for something else.
>>>>
>>>> Solutions for bypassing only part of the traffic is starting to show
>>>> up.  Both a netmap[1] and a DPDK[2] based approach.
>>>>
>>>> [1] https://blog.cloudflare.com/partial-kernel-bypass-merged-netmap/
>>>> [2]
>>>> http://rhelblog.redhat.com/2015/10/02/getting-the-best-of-both-worlds-with-queue-splitting-bifurcated-driver/
>>>>
>>>> Both approaches install a HW filter in the NIC, and redirect packets
>>>> to a separate RX HW queue (via ethtool ntuple + flow-type).  DPDK
>>>> needs pci SRIOV setup and then run it own poll-mode driver on top.
>>>> Netmap patch the orig ixgbe driver, and since CloudFlare/Gilberto's
>>>> changes[3] support a single RX queue mode.
>>>>
>>
>> FWIW I wrote a version of the patch talked about in the queue splitting
>> article that didn't require SR-IOV and we also talked about it at last
>> netconf in ottowa. The problem is without SR-IOV if you map a queue
>> directly into userspace so you can run the poll mode drivers there is
>> nothing protecting the DMA engine. So userspace can put arbitrary
>> addresses in there. There is something called Process Address Space ID
>> (PASID) also part of the PCI-SIG spec that could help you here but I
>> don't know of any hardware that supports it. The other option is to
>> use system calls and validate the descriptors in the kernel but this
>> incurs some overhead we had it at 15% or so when I did the numbers
>> last year. However I'm told there is some interesting work going on
>> around syscall overhead that may help.
>>
>> One thing to note is SRIOV does somewhat limit the number of these
>> types of interfaces you can support to the max VFs where as the
>> queue mechanism although slower with a function call would be limited
>> to max number of queues. Also busy polling will help here if you
>> are worried about pps.
>>
> I think you're understating that a bit :-) We know that busy polling
> helps with both pps and latency. IIRC, busy polling in the kernel
> reduced latency by 2/3. Any latency or pps comparison between an
> interrupt driven kernel stack and a userspace stack doing polling
> would be invalid. If this work is all about latency (like burning
> cores is not an issue), maybe busy polling should be assumed for
> all test cases?

Probably. If you're going to try and report pps numbers and chart them,
we might as well play the game and use the best configuration we can.

Although I did want to make busy polling per queue, or maybe create
L3/L4 netdevs like macvlan and put those in busy polling. It's a bit
overkill to put the entire device in busy polling mode when we have
only a couple of sockets doing it. net-next is opening soon, right? ;)

> 
>> Jesper, at least for you (2) case what are we missing with the
>> bifurcated/queue splitting work? Are you really after systems
>> without SR-IOV support or are you trying to get this on the order
>> of queues instead of VFs.
>>
>>> Jepser, thanks for providing more specifics.
>>>
>>> One comment: If you intend to change core code paths or APIs for this,
>>> then I think that we should require up front that the associated HW
>>> support is protocol agnostic (i.e. HW filters must be programmable and
>>> generic ). We don't want a promising feature like this to be
>>> undermined by protocol ossification.
>>
>> At the moment we use ethtool ntuple filters which is basically adding
>> a new set of enums and structures every time we need a new protocol
>> so its painful and you need your vendor to support you and you need a
>> new kernel.
>>
>> The flow api was shot down (which would get you to the point where
>> the user could specify the protocols for the driver to implement e.g.
>> put_parse_graph) and the only new proposals I've seen are bpf
>> translations in 

Re: Bypass at packet-page level (Was: Optimizing instruction-cache, more packets at each stage)

2016-01-25 Thread Tom Herbert
On Mon, Jan 25, 2016 at 9:50 AM, John Fastabend
 wrote:
> On 16-01-25 09:09 AM, Tom Herbert wrote:
>> On Mon, Jan 25, 2016 at 5:15 AM, Jesper Dangaard Brouer
>>  wrote:
>>>
>>> After reading John's reply about perfect filters, I want to re-state
>>> my idea, for this very early RX stage.  And describe a packet-page
>>> level bypass use-case, that John indirectly mentions.
>>>
>>>
>>> There are two ideas, getting mixed up here.  (1) bundling from the
>>> RX-ring, (2) allowing to pick up the "packet-page" directly.
>>>
>>> Bundling (1) is something that seems natural, and which help us
>>> amortize the cost between layers (and utilizes icache better). Lets
>>> keep that in another thread.
>>>
>>> This (2) direct forward of "packet-pages" is a fairly extreme idea,
>>> BUT it have the potential of being an new integration point for
>>> "selective" bypass-solutions and bringing RAW/af_packet (RX) up-to
>>> speed with bypass-solutions.
>>>
>>>
>>> Today, the bypass-solutions grab and control the entire NIC HW.  In
>>> many cases this is not very practical, if you also want to use the NIC
>>> for something else.
>>>
>>> Solutions for bypassing only part of the traffic is starting to show
>>> up.  Both a netmap[1] and a DPDK[2] based approach.
>>>
>>> [1] https://blog.cloudflare.com/partial-kernel-bypass-merged-netmap/
>>> [2] 
>>> http://rhelblog.redhat.com/2015/10/02/getting-the-best-of-both-worlds-with-queue-splitting-bifurcated-driver/
>>>
>>> Both approaches install a HW filter in the NIC, and redirect packets
>>> to a separate RX HW queue (via ethtool ntuple + flow-type).  DPDK
>>> needs pci SRIOV setup and then run it own poll-mode driver on top.
>>> Netmap patch the orig ixgbe driver, and since CloudFlare/Gilberto's
>>> changes[3] support a single RX queue mode.
>>>
>
> FWIW I wrote a version of the patch talked about in the queue splitting
> article that didn't require SR-IOV and we also talked about it at last
> netconf in ottowa. The problem is without SR-IOV if you map a queue
> directly into userspace so you can run the poll mode drivers there is
> nothing protecting the DMA engine. So userspace can put arbitrary
> addresses in there. There is something called Process Address Space ID
> (PASID) also part of the PCI-SIG spec that could help you here but I
> don't know of any hardware that supports it. The other option is to
> use system calls and validate the descriptors in the kernel but this
> incurs some overhead we had it at 15% or so when I did the numbers
> last year. However I'm told there is some interesting work going on
> around syscall overhead that may help.
>
> One thing to note is SRIOV does somewhat limit the number of these
> types of interfaces you can support to the max VFs where as the
> queue mechanism although slower with a function call would be limited
> to max number of queues. Also busy polling will help here if you
> are worried about pps.
>
I think you're understating that a bit :-) We know that busy polling
helps with both pps and latency. IIRC, busy polling in the kernel
reduced latency by 2/3. Any latency or pps comparison between an
interrupt driven kernel stack and a userspace stack doing polling
would be invalid. If this work is all about latency (like burning
cores is not an issue), maybe busy polling should be assumed for
all test cases?

> Jesper, at least for you (2) case what are we missing with the
> bifurcated/queue splitting work? Are you really after systems
> without SR-IOV support or are you trying to get this on the order
> of queues instead of VFs.
>
>> Jepser, thanks for providing more specifics.
>>
>> One comment: If you intend to change core code paths or APIs for this,
>> then I think that we should require up front that the associated HW
>> support is protocol agnostic (i.e. HW filters must be programmable and
>> generic ). We don't want a promising feature like this to be
>> undermined by protocol ossification.
>
> At the moment we use ethtool ntuple filters which is basically adding
> a new set of enums and structures every time we need a new protocol
> so its painful and you need your vendor to support you and you need a
> new kernel.
>
> The flow api was shot down (which would get you to the point where
> the user could specify the protocols for the driver to implement e.g.
> put_parse_graph) and the only new proposals I've seen are bpf
> translations in drivers and 'tc'. I plan to take another shot at this in
> net-next.
>
>>
>> Thanks,
>> Tom
>>
>>> [3] https://github.com/luigirizzo/netmap/pull/87
>>>
>


Re: [PATCHv2 3/4] ARM: tegra: use build-in device properties with rfkill_gpio

2016-01-25 Thread Marc Dietrich
Am Montag 25 Januar 2016, 13:18:40 schrieb Thierry Reding:
> On Mon, Jan 25, 2016 at 12:03:48PM +0300, Heikki Krogerus wrote:
> > Pass the rfkill name and type to the device with properties
> > instead of driver specific platform data.
> > 
> > Signed-off-by: Heikki Krogerus 
> > CC: Alexandre Courbot 
> > CC: Thierry Reding 
> > CC: Stephen Warren 
> > ---
> > 
> >  arch/arm/mach-tegra/board-paz00.c | 17 ++---
> >  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> Looks fine to me. We might want to wait for Marc (Cc'ed) to give this a
> spin, since I don't have the hardware. For reference, the series can be
> found here:
> 
>   http://patchwork.ozlabs.org/patch/572640/
>   http://patchwork.ozlabs.org/patch/572644/
>   http://patchwork.ozlabs.org/patch/572643/
>   http://patchwork.ozlabs.org/patch/572642/
> 
> Johannes, I assume that you'll want to take this through your tree
> because of the dependency? In that case:
> 
> Acked-by: Thierry Reding 

Seems to work fine. I wish we could instantiate this from device-tree so we
can finally get rid of this file.

Tested-by: Marc Dietrich 


Re: [BISECTED] v4.5-rc1 phylib regression

2016-01-25 Thread Andrew Lunn
On Mon, Jan 25, 2016 at 05:45:21PM +0200, Aaro Koskinen wrote:
> Hi,
> 
> I get the below crash on OCTEON (with octeon_mgmt interface, genphy)
> always during systemd boot.
> 
> Bisected to:
> 
> commit a9049e0c513c4521dbfaa302af8ed08b3366b41f
> Author: Andrew Lunn 
> Date:   Wed Jan 6 20:11:26 2016 +0100
> 
> mdio: Add support for mdio drivers.
> 
> [  250.179887] CPU 2 Unable to handle kernel paging request at virtual 
> address , epc == 81637bac, ra == 81637b7c
> [  250.218161] Oops[#1]:
> [  250.224970] CPU: 2 PID: 850 Comm: systemd-network Not tainted 
> 4.5.0-rc1-octeon-distro.git-test #1
> [  250.251569] task: 800031188000 ti: 80002f8e8000 task.ti: 
> 80002f8e8000
> [  250.251586] $ 0   :  81639774  
> 
> [  250.251595] $ 4   :   8174 
> 0001
> [  250.251604] $ 8   : 0001  81106100 
> 00010001
> [  250.251613] $12   :  813eb81c 8150be18 
> 
> [  250.251622] $16   : 800031290fc0  800031290fc4 
> 800031188000
> [  250.251631] $20   : 0002 0001 800031290fc8 
> 80002f8eba40
> [  250.251640] $24   : 0038 81105fb0  
> 
> [  250.251649] $28   : 80002f8e8000 80002f8eb740 00fff7857f88 
> 81637b7c
> [  250.251651] Hi: 431bde82d7b50717
> [  250.251653] Lo: c8de2ac3222855ea
> [  250.251668] epc   : 81637bac __mutex_lock_slowpath+0x7c/0x190
> [  250.251675] ra: 81637b7c __mutex_lock_slowpath+0x4c/0x190
> [  250.251684] Status: 10108ce3   KX SX UX KERNEL EXL IE 
> [  250.251686] Cause : 008c (ExcCode 03)
> [  250.251688] BadVA : 
> [  250.251690] PrId  : 000d0409 (Cavium Octeon+)
> [  250.251699] Modules linked in: pata_octeon_cf libata autofs4
> [  250.251704] Process systemd-network (pid: 850, 
> threadinfo=80002f8e8000, task=800031188000, tls=00fff75ba700)
> [  250.251782] Stack : 800031290fc8  0001 
> 83f9bca0
> [  250.251782]  800031290c00 81856558 800031290fc0 
> 800031225000
> [  250.251782]  0001  800031008800 
> 814b6284
> [  250.251782]  81639774 81175dc4 800031290c00 
> 814baf80
> [  250.251782]  80003121 814b64ac 80002f8eb810 
> 80002f8eb810
> [  250.251782]  814f7790 800031290c00 814baf80 
> 814baf80
> [  250.251782]  800031225000  80003290ca10 
> 814b66a4
> [  250.251782]  800031225000 800031290c00 0001 
> 814f7b80
> [  250.251782]  800031225000 800031225788 800031225830 
> 1002
> [  250.251782]  80002f8ebb80 814ba81c 800fb7105c32 
> 8116aab4
> [  250.251782]  ...
> [  250.251784] Call Trace:
> [  250.251791] [] __mutex_lock_slowpath+0x7c/0x190
> [  250.251801] [] phy_probe+0x6c/0x120
> [  250.251807] [] phy_attach_direct+0xdc/0x1a8
> [  250.251814] [] phy_connect_direct+0x2c/0x98
> [  250.251822] [] of_phy_connect+0x60/0xb8
> [  250.251830] [] octeon_mgmt_open+0x36c/0xad0

Hi Aaro, Olof

I've not been able to reproduce this. I've tried a Kirkwood Qnap, a
Kirkwood DIR665, an Armada XP WRT1900AC and a Freescale Vybrid. And
I've turned on mutex debugging. Nothing: it always boots to user
space, no complaints.

 Andrew


[PATCH net 3/4] lan78xx: Add to handle mux control per chip id

2016-01-25 Thread Woojung.Huh

Depending on the chip, some EEPROM pins are muxed with the LED function.
Disable the LED function while accessing the EEPROM, then restore it.

Signed-off-by: Woojung Huh 
---
 drivers/net/usb/lan78xx.c | 92 ++-
 1 file changed, 67 insertions(+), 25 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 7a8391b..9b95333 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -463,32 +463,53 @@ static int lan78xx_read_raw_eeprom(struct lan78xx_net 
*dev, u32 offset,
   u32 length, u8 *data)
 {
u32 val;
+   u32 saved;
int i, ret;
+   int retval;
 
-   ret = lan78xx_eeprom_confirm_not_busy(dev);
-   if (ret)
-   return ret;
+   /* depends on chip, some EEPROM pins are muxed with LED function.
+* disable & restore LED function to access EEPROM.
+*/
+   ret = lan78xx_read_reg(dev, HW_CFG, &val);
+   saved = val;
+   if (dev->chipid == ID_REV_CHIP_ID_7800_) {
+   val &= ~(HW_CFG_LED1_EN_ | HW_CFG_LED0_EN_);
+   ret = lan78xx_write_reg(dev, HW_CFG, val);
+   }
+
+   retval = lan78xx_eeprom_confirm_not_busy(dev);
+   if (retval)
+   goto exit;
 
for (i = 0; i < length; i++) {
val = E2P_CMD_EPC_BUSY_ | E2P_CMD_EPC_CMD_READ_;
val |= (offset & E2P_CMD_EPC_ADDR_MASK_);
ret = lan78xx_write_reg(dev, E2P_CMD, val);
-   if (unlikely(ret < 0))
-   return -EIO;
+   if (unlikely(ret < 0)) {
+   retval = -EIO;
+   goto exit;
+   }
 
-   ret = lan78xx_wait_eeprom(dev);
-   if (ret < 0)
-   return ret;
+   retval = lan78xx_wait_eeprom(dev);
+   if (retval < 0)
+   goto exit;
 
ret = lan78xx_read_reg(dev, E2P_DATA, &val);
-   if (unlikely(ret < 0))
-   return -EIO;
+   if (unlikely(ret < 0)) {
+   retval = -EIO;
+   goto exit;
+   }
 
data[i] = val & 0xFF;
offset++;
}
 
-   return 0;
+   retval = 0;
+exit:
+   if (dev->chipid == ID_REV_CHIP_ID_7800_)
+   ret = lan78xx_write_reg(dev, HW_CFG, saved);
+
+   return retval;
 }
 
 static int lan78xx_read_eeprom(struct lan78xx_net *dev, u32 offset,
@@ -510,11 +531,23 @@ static int lan78xx_write_raw_eeprom(struct lan78xx_net 
*dev, u32 offset,
u32 length, u8 *data)
 {
u32 val;
+   u32 saved;
int i, ret;
+   int retval;
 
-   ret = lan78xx_eeprom_confirm_not_busy(dev);
-   if (ret)
-   return ret;
+   /* depends on chip, some EEPROM pins are muxed with LED function.
+* disable & restore LED function to access EEPROM.
+*/
+   ret = lan78xx_read_reg(dev, HW_CFG, &val);
+   saved = val;
+   if (dev->chipid == ID_REV_CHIP_ID_7800_) {
+   val &= ~(HW_CFG_LED1_EN_ | HW_CFG_LED0_EN_);
+   ret = lan78xx_write_reg(dev, HW_CFG, val);
+   }
+
+   retval = lan78xx_eeprom_confirm_not_busy(dev);
+   if (retval)
+   goto exit;
 
/* Issue write/erase enable command */
val = E2P_CMD_EPC_BUSY_ | E2P_CMD_EPC_CMD_EWEN_;
@@ -522,32 +555,41 @@ static int lan78xx_write_raw_eeprom(struct lan78xx_net 
*dev, u32 offset,
if (unlikely(ret < 0))
return -EIO;
 
-   ret = lan78xx_wait_eeprom(dev);
-   if (ret < 0)
-   return ret;
+   retval = lan78xx_wait_eeprom(dev);
+   if (retval < 0)
+   goto exit;
 
for (i = 0; i < length; i++) {
/* Fill data register */
val = data[i];
ret = lan78xx_write_reg(dev, E2P_DATA, val);
-   if (ret < 0)
-   return ret;
+   if (ret < 0) {
+   retval = -EIO;
+   goto exit;
+   }
 
/* Send "write" command */
val = E2P_CMD_EPC_BUSY_ | E2P_CMD_EPC_CMD_WRITE_;
val |= (offset & E2P_CMD_EPC_ADDR_MASK_);
ret = lan78xx_write_reg(dev, E2P_CMD, val);
-   if (ret < 0)
-   return ret;
+   if (ret < 0) {
+   retval = -EIO;
+   goto exit;
+   }
 
-   ret = lan78xx_wait_eeprom(dev);
-   if (ret < 0)
-   return ret;
+   retval = lan78xx_wait_eeprom(dev);
+   if (retval < 0)
+   goto exit;
 
offset++;
}
 
-   return 0;
+   retval = 0;
+exit:
+   if (dev->chipid == ID_REV_CHIP_ID_7800_)
+ 

Re: Bypass at packet-page level (Was: Optimizing instruction-cache, more packets at each stage)

2016-01-25 Thread Jesper Dangaard Brouer

On Mon, 25 Jan 2016 09:50:16 -0800 John Fastabend  
wrote:

> On 16-01-25 09:09 AM, Tom Herbert wrote:
> > On Mon, Jan 25, 2016 at 5:15 AM, Jesper Dangaard Brouer
> >  wrote:  
> >>
[...]
> >>
> >> There are two ideas, getting mixed up here.  (1) bundling from the
> >> RX-ring, (2) allowing to pick up the "packet-page" directly.
> >>
> >> Bundling (1) is something that seems natural, and which help us
> >> amortize the cost between layers (and utilizes icache better). Lets
> >> keep that in another thread.
> >>
> >> This (2) direct forward of "packet-pages" is a fairly extreme idea,
> >> BUT it have the potential of being an new integration point for
> >> "selective" bypass-solutions and bringing RAW/af_packet (RX) up-to
> >> speed with bypass-solutions.
>
[...]
> 
> Jesper, at least for you (2) case what are we missing with the
> bifurcated/queue splitting work? Are you really after systems
> without SR-IOV support or are you trying to get this on the order
> of queues instead of VFs.

I'm not saying something is missing for bifurcated/queue splitting work.
I'm not trying to work-around SR-IOV.

This is an extreme idea, which I got while looking at the lowest RX layer.


Before working any further on this idea/path, I need/want to evaluate
if it makes sense from a performance point of view.  I need to evaluate
if "pulling" out these "packet-pages" is fast enough to compete with
DPDK/netmap.  Else it makes no sense to work on this path.

As a first step to evaluate this lowest RX layer, I'm simply hacking
the drivers (ixgbe and mlx5) to drop/discard packets within-the-driver.
For now, simply replacing napi_gro_receive() with dev_kfree_skb(), and
measuring the "RX-drop" performance.

Next step was to avoid the skb alloc+free calls, but doing so is more
complicated than I first anticipated, as the SKB is tied in fairly
heavily.  Thus, right now I'm instead hooking in my bulk alloc+free
API, as that will remove/mitigate most of the overhead of the
kmem_cache/slab-allocators.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


[PATCH net 4/4] lan78xx: throttle tx path per usb speed

2016-01-25 Thread Woojung.Huh

Throttle TX path when slower than SUPER SPEED USB to avoid
choking CPU.

Signed-off-by: Woojung Huh 
---
 drivers/net/usb/lan78xx.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 9b95333..e0cbc5a 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -2264,7 +2264,9 @@ netdev_tx_t lan78xx_start_xmit(struct sk_buff *skb, 
struct net_device *net)
if (skb2) {
skb_queue_tail(&dev->txq_pend, skb2);
 
-   if (skb_queue_len(&dev->txq_pend) > 10)
+   /* throttle TX path at slower than SUPER SPEED USB */
+   if ((dev->udev->speed < USB_SPEED_SUPER) &&
+   (skb_queue_len(&dev->txq_pend) > 10))
netif_stop_queue(net);
} else {
netif_dbg(dev, tx_err, dev->net,
-- 
2.1.4


[RESEND][RFC][PATCH] vsock: Fix blocking ops call in prepare_to_wait

2016-01-25 Thread Laura Abbott
We received a bug report from someone using VMware:

WARNING: CPU: 3 PID: 660 at kernel/sched/core.c:7389
__might_sleep+0x7d/0x90()
do not call blocking ops when !TASK_RUNNING; state=1 set at
[] prepare_to_wait+0x2d/0x90
Modules linked in: vmw_vsock_vmci_transport vsock snd_seq_midi
snd_seq_midi_event snd_ens1371 iosf_mbi gameport snd_rawmidi
snd_ac97_codec ac97_bus snd_seq coretemp snd_seq_device snd_pcm
snd_timer snd soundcore ppdev crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel vmw_vmci vmw_balloon i2c_piix4 shpchp parport_pc
parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc btrfs
xor raid6_pq 8021q garp stp llc mrp crc32c_intel serio_raw mptspi vmwgfx
drm_kms_helper ttm drm scsi_transport_spi mptscsih e1000 ata_generic
mptbase pata_acpi
CPU: 3 PID: 660 Comm: vmtoolsd Not tainted
4.2.0-0.rc1.git3.1.fc23.x86_64 #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop
Reference Platform, BIOS 6.00 05/20/2014
  49e617f3 88006ac37ac8 818641f5
  88006ac37b20 88006ac37b08 810ab446
 880068009f40 81c63bc0 0061 
Call Trace:
 [] dump_stack+0x4c/0x65
 [] warn_slowpath_common+0x86/0xc0
 [] warn_slowpath_fmt+0x55/0x70
 [] ? debug_lockdep_rcu_enabled+0x1d/0x20
 [] ? prepare_to_wait+0x2d/0x90
 [] ? prepare_to_wait+0x2d/0x90
 [] __might_sleep+0x7d/0x90
 [] __might_fault+0x43/0xa0
 [] copy_from_iter+0x87/0x2a0
 [] __qp_memcpy_to_queue+0x9a/0x1b0 [vmw_vmci]
 [] ? qp_memcpy_to_queue+0x20/0x20 [vmw_vmci]
 [] qp_memcpy_to_queue_iov+0x17/0x20 [vmw_vmci]
 [] qp_enqueue_locked+0xa0/0x140 [vmw_vmci]
 [] vmci_qpair_enquev+0x4f/0xd0 [vmw_vmci]
 [] vmci_transport_stream_enqueue+0x1b/0x20
[vmw_vsock_vmci_transport]
 [] vsock_stream_sendmsg+0x2c5/0x320 [vsock]
 [] ? wake_atomic_t_function+0x70/0x70
 [] sock_sendmsg+0x38/0x50
 [] SYSC_sendto+0x104/0x190
 [] ? vfs_read+0x8a/0x140
 [] SyS_sendto+0xe/0x10
 [] entry_SYSCALL_64_fastpath+0x12/0x76

transport->stream_enqueue may call copy_to_user so it should
not be called inside a prepare_to_wait. Narrow the scope of
the prepare_to_wait to avoid the bad call.

Signed-off-by: Laura Abbott 
---
Resending since I never heard back. This has been reported a couple
more times, but nobody ever gets back to me about whether this
actually works. It still seems to be an issue as well.
---
 net/vmw_vsock/af_vsock.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index df5fc6b..fd68c88 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1558,8 +1558,6 @@ static int vsock_stream_sendmsg(struct socket *sock, 
struct msghdr *msg,
if (err < 0)
goto out;
 
-   prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
-
while (total_written < len) {
ssize_t written;
 
@@ -1579,7 +1577,9 @@ static int vsock_stream_sendmsg(struct socket *sock, 
struct msghdr *msg,
goto out_wait;
 
release_sock(sk);
+   prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
timeout = schedule_timeout(timeout);
+   finish_wait(sk_sleep(sk), &wait);
lock_sock(sk);
if (signal_pending(current)) {
err = sock_intr_errno(timeout);
@@ -1589,8 +1589,6 @@ static int vsock_stream_sendmsg(struct socket *sock, 
struct msghdr *msg,
goto out_wait;
}
 
-   prepare_to_wait(sk_sleep(sk), &wait,
-   TASK_INTERRUPTIBLE);
}
 
/* These checks occur both as part of and after the loop
@@ -1636,7 +1634,6 @@ static int vsock_stream_sendmsg(struct socket *sock, 
struct msghdr *msg,
 out_wait:
if (total_written > 0)
err = total_written;
-   finish_wait(sk_sleep(sk), &wait);
 out:
release_sock(sk);
return err;
-- 
2.4.3
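
The reasoning in the patch description can be modelled with a toy state machine. This is only a user-space sketch with hypothetical `demo_*` names; it does not use the kernel wait-queue API and it simplifies the loop (the real code only waits when there is no space). It shows why an operation that may sleep must not run between "prepare to wait" and "finish wait".

```c
#include <assert.h>

/* Toy task states; the value 1 mirrors "state=1" in the warning above. */
enum demo_state { DEMO_RUNNING = 0, DEMO_INTERRUPTIBLE = 1 };

static enum demo_state demo_cur_state = DEMO_RUNNING;
static int demo_warnings;

static void demo_prepare_to_wait(void) { demo_cur_state = DEMO_INTERRUPTIBLE; }
static void demo_finish_wait(void)     { demo_cur_state = DEMO_RUNNING; }

/* Stand-in for a might-sleep check: complain if the task is not running. */
static void demo_might_sleep(void)
{
    if (demo_cur_state != DEMO_RUNNING)
        demo_warnings++;
}

/* Stand-in for an enqueue that may fault on user memory and sleep. */
static void demo_enqueue(void) { demo_might_sleep(); }

/* Old shape: the whole loop runs inside the prepared-to-wait region. */
static int demo_send_wide_scope(int chunks)
{
    int i;

    assert(chunks >= 0);
    demo_warnings = 0;
    demo_cur_state = DEMO_RUNNING;
    demo_prepare_to_wait();
    for (i = 0; i < chunks; i++) {
        demo_enqueue();
        demo_prepare_to_wait(); /* re-armed at the end of each pass */
    }
    demo_finish_wait();
    return demo_warnings;
}

/* New shape: the prepared region only brackets the actual wait. */
static int demo_send_narrow_scope(int chunks)
{
    int i;

    assert(chunks >= 0);
    demo_warnings = 0;
    demo_cur_state = DEMO_RUNNING;
    for (i = 0; i < chunks; i++) {
        demo_enqueue();
        demo_prepare_to_wait();
        /* the timed sleep would sit here */
        demo_finish_wait();
    }
    return demo_warnings;
}
```

In this model the wide-scope variant records one warning per enqueue, while the narrow-scope variant records none.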



[PATCH 13/22] net: Fix dependencies for !HAS_IOMEM archs

2016-01-25 Thread Richard Weinberger
Not every arch has io memory.
So, unbreak the build by fixing the dependencies.

Signed-off-by: Richard Weinberger 
---
 drivers/net/ethernet/ezchip/Kconfig | 1 +
 drivers/net/phy/Kconfig | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/ezchip/Kconfig 
b/drivers/net/ethernet/ezchip/Kconfig
index 48ecbc8..b423ad3 100644
--- a/drivers/net/ethernet/ezchip/Kconfig
+++ b/drivers/net/ethernet/ezchip/Kconfig
@@ -18,6 +18,7 @@ if NET_VENDOR_EZCHIP
 config EZCHIP_NPS_MANAGEMENT_ENET
tristate "EZchip NPS management enet support"
depends on OF_IRQ && OF_NET
+   depends on HAS_IOMEM
---help---
  Simple LAN device for debug or management purposes.
  Device supports interrupts for RX and TX(completion).
diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 60994a8..f0a7702 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -186,6 +186,7 @@ config MDIO_GPIO
 config MDIO_OCTEON
tristate "Support for MDIO buses on Octeon and ThunderX SOCs"
depends on 64BIT
+   depends on HAS_IOMEM
help
 
  This module provides a driver for the Octeon and ThunderX MDIO
-- 
1.8.4.5



[PATCH 0/4 net] lan78xx: updates & fixes

2016-01-25 Thread Woojung.Huh
Woojung Huh (4):
  lan78xx: change to use updated phy-ignore-interrupts 
  lan78xx: replace devid to chipid & chiprev
  lan78xx: add to handle mux control per chip id
  lan78xx: throttle tx path per usb speed

 drivers/net/usb/lan78xx.c | 137 ++
 drivers/net/usb/lan78xx.h |   1 +
 2 files changed, 92 insertions(+), 46 deletions(-)

-- 
2.1.4


[PATCH net 1/4] lan78xx: change to use updated phy-ignore-interrupts

2016-01-25 Thread Woojung.Huh
Update lan78xx to use patch of commit 4f2aaf7dd95b
 (fix-phy-ignore-interrupts) by Florian Fainelli.

Signed-off-by: Woojung Huh 
---
 drivers/net/usb/lan78xx.c | 30 ++
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 2ed5333..027ee37 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -36,7 +36,7 @@
 #define DRIVER_AUTHOR  "WOOJUNG HUH "
 #define DRIVER_DESC"LAN78XX USB 3.0 Gigabit Ethernet Devices"
 #define DRIVER_NAME"lan78xx"
-#define DRIVER_VERSION "1.0.1"
+#define DRIVER_VERSION "1.0.2"
 
 #define TX_TIMEOUT_JIFFIES (5 * HZ)
 #define THROTTLE_JIFFIES   (HZ / 8)
@@ -904,7 +904,6 @@ static int lan78xx_link_reset(struct lan78xx_net *dev)
 
if (!phydev->link && dev->link_on) {
dev->link_on = false;
-   netif_carrier_off(dev->net);
 
/* reset MAC */
ret = lan78xx_read_reg(dev, MAC_CR, &buf);
@@ -914,6 +913,8 @@ static int lan78xx_link_reset(struct lan78xx_net *dev)
ret = lan78xx_write_reg(dev, MAC_CR, buf);
if (unlikely(ret < 0))
return -EIO;
+
+   phy_mac_interrupt(phydev, 0);
} else if (phydev->link && !dev->link_on) {
dev->link_on = true;
 
@@ -953,7 +954,7 @@ static int lan78xx_link_reset(struct lan78xx_net *dev)
  ethtool_cmd_speed(&ecmd), ecmd.duplex, ladv, radv);
 
ret = lan78xx_update_flowcontrol(dev, ecmd.duplex, ladv, radv);
-   netif_carrier_on(dev->net);
+   phy_mac_interrupt(phydev, 1);
}
 
return ret;
@@ -1495,7 +1496,6 @@ done:
 static int lan78xx_mdio_init(struct lan78xx_net *dev)
 {
int ret;
-   int i;
 
dev->mdiobus = mdiobus_alloc();
if (!dev->mdiobus) {
@@ -1511,10 +1511,6 @@ static int lan78xx_mdio_init(struct lan78xx_net *dev)
snprintf(dev->mdiobus->id, MII_BUS_ID_SIZE, "usb-%03d:%03d",
 dev->udev->bus->busnum, dev->udev->devnum);
 
-   /* handle our own interrupt */
-   for (i = 0; i < PHY_MAX_ADDR; i++)
-   dev->mdiobus->irq[i] = PHY_IGNORE_INTERRUPT;
-
switch (dev->devid & ID_REV_CHIP_ID_MASK_) {
case 0x7800:
case 0x7850:
@@ -1558,6 +1554,16 @@ static int lan78xx_phy_init(struct lan78xx_net *dev)
return -EIO;
}
 
+   /* Enable PHY interrupts.
+* We handle our own interrupt
+*/
+   ret = phy_read(phydev, LAN88XX_INT_STS);
+   ret = phy_write(phydev, LAN88XX_INT_MASK,
+   LAN88XX_INT_MASK_MDINTPIN_EN_ |
+   LAN88XX_INT_MASK_LINK_CHANGE_);
+
+   phydev->irq = PHY_IGNORE_INTERRUPT;
+
ret = phy_connect_direct(dev->net, phydev,
 lan78xx_link_status_change,
 PHY_INTERFACE_MODE_GMII);
@@ -1580,14 +1586,6 @@ static int lan78xx_phy_init(struct lan78xx_net *dev)
  SUPPORTED_Pause | SUPPORTED_Asym_Pause);
genphy_config_aneg(phydev);
 
-   /* Workaround to enable PHY interrupt.
-* phy_start_interrupts() is API for requesting and enabling
-* PHY interrupt. However, USB-to-Ethernet device can't use
-* request_irq() called in phy_start_interrupts().
-* Set PHY to PHY_HALTED and call phy_start()
-* to make a call to phy_enable_interrupts()
-*/
-   phy_stop(phydev);
phy_start(phydev);
 
netif_dbg(dev, ifup, dev->net, "phy initialised successfully");
-- 
2.1.4


[PATCH net 2/4] lan78xx: replace devid to chipid & chiprev

2016-01-25 Thread Woojung.Huh
Replace devid in struct lan78xx_net with chipid & chiprev for easy access.

Signed-off-by: Woojung Huh 
---
 drivers/net/usb/lan78xx.c | 11 +++
 drivers/net/usb/lan78xx.h |  1 +
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 027ee37..7a8391b 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -278,7 +278,8 @@ struct lan78xx_net {
int link_on;
u8  mdix_ctrl;
 
-   u32 devid;
+   u32 chipid;
+   u32 chiprev;
struct mii_bus  *mdiobus;
 };
 
@@ -1511,8 +1512,9 @@ static int lan78xx_mdio_init(struct lan78xx_net *dev)
snprintf(dev->mdiobus->id, MII_BUS_ID_SIZE, "usb-%03d:%03d",
 dev->udev->bus->busnum, dev->udev->devnum);
 
-   switch (dev->devid & ID_REV_CHIP_ID_MASK_) {
-   case 0x7800:
+   switch (dev->chipid) {
+   case ID_REV_CHIP_ID_7800_:
+   case ID_REV_CHIP_ID_7850_:
case 0x7850:
/* set to internal PHY id */
dev->mdiobus->phy_mask = ~(1 << 1);
@@ -1874,7 +1876,8 @@ static int lan78xx_reset(struct lan78xx_net *dev)
 
/* save DEVID for later usage */
ret = lan78xx_read_reg(dev, ID_REV, &buf);
-   dev->devid = buf;
+   dev->chipid = (buf & ID_REV_CHIP_ID_MASK_) >> 16;
+   dev->chiprev = buf & ID_REV_CHIP_REV_MASK_;
 
/* Respond to the IN token with a NAK */
ret = lan78xx_read_reg(dev, USB_CFG0, &buf);
diff --git a/drivers/net/usb/lan78xx.h b/drivers/net/usb/lan78xx.h
index a93fb65..4092790 100644
--- a/drivers/net/usb/lan78xx.h
+++ b/drivers/net/usb/lan78xx.h
@@ -107,6 +107,7 @@
 #define ID_REV_CHIP_ID_MASK_   (0xFFFF0000)
 #define ID_REV_CHIP_REV_MASK_  (0x0000FFFF)
 #define ID_REV_CHIP_ID_7800_   (0x7800)
+#define ID_REV_CHIP_ID_7850_   (0x7850)
 
 #define FPGA_REV   (0x04)
 #define FPGA_REV_MINOR_MASK_   (0xFF00)
-- 
2.1.4
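
A small stand-alone sketch of the ID_REV split performed in lan78xx_reset() above. The `DEMO_*` names and `struct demo_id` are hypothetical, and the mask values are an assumption inferred from the `>> 16` shift in the patch (upper 16 bits for the chip ID, lower 16 bits for the revision).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: chip ID in bits 31..16, revision in bits 15..0. */
#define DEMO_CHIP_ID_MASK   0xFFFF0000u
#define DEMO_CHIP_REV_MASK  0x0000FFFFu
#define DEMO_CHIP_ID_7800   0x7800u
#define DEMO_CHIP_ID_7850   0x7850u

struct demo_id {
    uint32_t chipid;
    uint32_t chiprev;
};

/* Split a raw ID_REV register value into chip ID and revision. */
static void split_id_rev(uint32_t id_rev, struct demo_id *out)
{
    assert(out != 0);
    out->chipid = (id_rev & DEMO_CHIP_ID_MASK) >> 16;
    out->chiprev = id_rev & DEMO_CHIP_REV_MASK;
}
```

Storing the two fields separately lets later code compare `chipid` directly against a named constant instead of masking the raw register value at every use.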


Re: [PATCH net 2/4] lan78xx: replace devid to chipid & chiprev

2016-01-25 Thread David Miller

Such cleanups are not appropriate for the 'net' tree.

And the 'net-next' tree is closed.

You _MUST_ separate out the pure bug fixes and submit only those
changes targeting the 'net' tree.

You must learn how to submit changes properly.



[PATCH net] tcp: fix tcp_mark_head_lost to check skb len before fragmenting

2016-01-25 Thread Yuchung Cheng
From: Neal Cardwell 

This commit fixes a corner case in tcp_mark_head_lost() which was
causing the WARN_ON(len > skb->len) in tcp_fragment() to fire.

tcp_mark_head_lost() was assuming that if a packet has
tcp_skb_pcount(skb) of N, then it's safe to fragment off a prefix of
M*mss bytes, for any M < N. But with the tricky way TCP pcounts are
maintained, this is not always true.

For example, suppose the sender sends 4 1-byte packets and has the
last 3 packets SACKed. It will merge the last 3 packets in the write
queue into an skb with pcount = 3 and len = 3 bytes. If another
recovery happens after a sack reneging event, tcp_mark_head_lost()
may attempt to split the skb assuming it has more than 2*MSS bytes.

This sounds very counterintuitive, but as the commit description for
the related commit c0638c247f55 ("tcp: don't fragment SACKed skbs in
tcp_mark_head_lost()") notes, this is because tcp_shifted_skb()
coalesces adjacent regions of SACKed skbs, and when doing this it
preserves the sum of their packet counts in order to reflect the
real-world dynamics on the wire. The c0638c247f55 commit tried to
avoid problems by not fragmenting SACKed skbs, since SACKed skbs are
where the non-proportionality between pcount and skb->len/mss is known
to be possible. However, that commit did not handle the case where
during a reneging event one of these weird SACKed skbs becomes an
un-SACKed skb, which tcp_mark_head_lost() can then try to fragment.

The fix is to simply mark the entire skb lost when this happens.
This makes the recovery slightly more aggressive in such corner
cases before we detect reordering. But once we detect reordering
this code path is by-passed because FACK is disabled.

Signed-off-by: Neal Cardwell 
Signed-off-by: Yuchung Cheng 
Signed-off-by: Eric Dumazet 
---
 net/ipv4/tcp_input.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 0003d40..d2ad433 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2164,8 +2164,7 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int mark_head)
 {
struct tcp_sock *tp = tcp_sk(sk);
struct sk_buff *skb;
-   int cnt, oldcnt;
-   int err;
+   int cnt, oldcnt, lost;
unsigned int mss;
/* Use SACK to deduce losses of new sequences sent during recovery */
const u32 loss_high = tcp_is_sack(tp) ?  tp->snd_nxt : tp->high_seq;
@@ -2205,9 +2204,10 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int mark_head)
break;
 
mss = tcp_skb_mss(skb);
-   err = tcp_fragment(sk, skb, (packets - oldcnt) * mss,
-  mss, GFP_ATOMIC);
-   if (err < 0)
+   /* If needed, chop off the prefix to mark as lost. */
+   lost = (packets - oldcnt) * mss;
+   if (lost < skb->len &&
+   tcp_fragment(sk, skb, lost, mss, GFP_ATOMIC) < 0)
break;
cnt = packets;
}
-- 
2.7.0.rc3.207.g0ac5344
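
To make the corner case concrete, here is a minimal user-space sketch
of the length check the patch adds. struct fake_skb and
prefix_to_split() are hypothetical stand-ins, not kernel API; the real
code operates on struct sk_buff and calls tcp_fragment().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two sk_buff properties that matter here. */
struct fake_skb {
	unsigned int len;	/* payload bytes actually held */
	unsigned int pcount;	/* packet count, may exceed len / mss */
};

/*
 * Return the number of prefix bytes to fragment off, or 0 when the
 * requested prefix does not fit inside the skb and the whole skb
 * should be marked lost without fragmenting.
 */
static unsigned int prefix_to_split(const struct fake_skb *skb,
				    unsigned int packets_to_mark,
				    unsigned int mss)
{
	unsigned int lost;

	assert(skb != NULL && mss > 0);

	lost = packets_to_mark * mss;
	if (lost < skb->len)
		return lost;	/* safe: the prefix is shorter than the skb */
	return 0;		/* pcount overstated the length: do not split */
}
```

With an skb of len = 3 and pcount = 3, asking for 2 packets at an MSS
of 1460 yields 2920 bytes, which is more than the skb holds, so the
sketch returns 0 instead of attempting an impossible split.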



RE: [PATCH net 2/4] lan78xx: replace devid to chipid & chiprev

2016-01-25 Thread Woojung.Huh
> Such cleanups are not appropriate for the 'net' tree.
> 
> And the 'net-next' tree is closed.
> 
> You _MUST_ separate out the pure bug fixes and submit only those
> changes targeting the 'net' tree.
> 
> You must learn how to submit changes properly.

Thanks for pointing out. Will resubmit after modification.
And, will post this cleanup to net-next when it opens.


Re: [PATCH] brcmfmac: sdio: Increase the default timeouts a bit

2016-01-25 Thread Julian Calaby
Hi Arend,

On Tue, Jan 26, 2016 at 2:39 AM, Arend van Spriel  wrote:
> On 25-01-16 12:06, Julian Calaby wrote:
>> Hi Sjoerd,
>>
>> On Mon, Jan 25, 2016 at 9:47 PM, Sjoerd Simons
>>  wrote:
>>> On a Radxa Rock2 board with a Ampak AP6335 (Broadcom 4339 core) it seems
>>> the card responds very quickly most of the time, unfortunately during
>>> initialisation it sometimes seems to take just a bit over 2 seconds to
>>> respond.
>>>
>>> This results in initialization failing with messages like:
>>>   brcmf_c_preinit_dcmds: Retreiving cur_etheraddr failed, -52
>>>   brcmf_bus_start: failed: -52
>>>   brcmf_sdio_firmware_callback: dongle is not responding
>>>
>>> Increasing the timeout to allow for a bit more headroom allows the
>>> card to initialize reliably.
>>>
>>> A quick search online after diagnosing/fixing this showed that Google
>>> has a similar patch in their ChromeOS tree, so this doesn't seem
>>> specific to the board I'm using.
>>>
>>> Signed-off-by: Sjoerd Simons 
>>
>> Looks sane to me.
>>
>> Reviewed-by: Julian Calaby 
>
> Not really a cleanup patch :-p , but thanks for the review.

I'm trying to review any "small" patch from (relatively) new people.

Thanks,

-- 
Julian Calaby

Email: julian.cal...@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/


Re: [PATCH 4.1] [media] media/vivid-osd: fix info leak in ioctl

2016-01-25 Thread Yuki Machida

This was sent to the wrong mailing list.
Sorry.

On 2016-01-25 19:42, Yuki Machida wrote:

commit eda98796aff0d9bf41094b06811f5def3b4c333c upstream.

The vivid_fb_ioctl() code fails to initialize the 16 _reserved bytes of
struct fb_vblank after the ->hcount member. Add an explicit
memset(0) before filling the structure to avoid the info leak.

This fixes CVE-2015-7884.

Signed-off-by: Salva Peiró 
Signed-off-by: Hans Verkuil 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Yuki Machida 
---
  drivers/media/platform/vivid/vivid-osd.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/media/platform/vivid/vivid-osd.c b/drivers/media/platform/vivid/vivid-osd.c
index 084d346..e15eef6 100644
--- a/drivers/media/platform/vivid/vivid-osd.c
+++ b/drivers/media/platform/vivid/vivid-osd.c
@@ -85,6 +85,7 @@ static int vivid_fb_ioctl(struct fb_info *info, unsigned cmd, unsigned long arg)
case FBIOGET_VBLANK: {
struct fb_vblank vblank;

+   memset(&vblank, 0, sizeof(vblank));
vblank.flags = FB_VBLANK_HAVE_COUNT | FB_VBLANK_HAVE_VCOUNT |
FB_VBLANK_HAVE_VSYNC;
vblank.count = 0;
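
As a rough user-space illustration of why the memset() matters: a
struct on the stack keeps whatever bytes were there before unless it
is cleared, and fields that are never assigned would be copied out
as-is. struct vblank_like below is a hypothetical stand-in that only
mimics the shape described in the commit message (named members
followed by 16 reserved bytes); the flag value is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in: named members followed by 16 reserved bytes. */
struct vblank_like {
	uint32_t flags;
	uint32_t count;
	uint32_t vcount;
	uint32_t hcount;
	uint32_t reserved[4];	/* never assigned field by field */
};

/* Clear the whole structure first, then fill in the known members. */
static void fill_vblank(struct vblank_like *vb)
{
	assert(vb != NULL);

	memset(vb, 0, sizeof(*vb));
	vb->flags = 0x7u;	/* made-up flag value for the sketch */
	vb->count = 0;
	vb->vcount = 0;
	vb->hcount = 0;
}

/* Return 1 if no stale bytes remain in the reserved area. */
static int reserved_is_clear(const struct vblank_like *vb)
{
	size_t i;

	for (i = 0; i < sizeof(vb->reserved) / sizeof(vb->reserved[0]); i++)
		if (vb->reserved[i] != 0)
			return 0;
	return 1;
}
```

Without the memset() call, the reserved words would keep their previous
contents, which is the leak the patch closes.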



Re: [PATCH net-next] hv_netvsc: use skb_get_hash() instead of a homegrown implementation

2016-01-25 Thread David Miller
From: Vitaly Kuznetsov 
Date: Mon, 25 Jan 2016 16:00:41 +0100

> Recent changes to 'struct flow_keys' (e.g. commit d34af823ff40 ("net: Add
> VLAN ID to flow_keys")) introduced a performance regression in the netvsc
> driver. The problem is, however, not the above-mentioned commit but the
> fact that the netvsc_set_hash() function made some assumptions about the
> struct flow_keys data layout, and this is wrong.
> 
> Get rid of netvsc_set_hash() by switching to skb_get_hash(). This change
> will also imply switching to Jenkins hash from the currently used Toeplitz
> but it seems there is no good excuse for Toeplitz to stay.
> 
> Signed-off-by: Vitaly Kuznetsov 

Applied.


Re: [PATCH] brcmfmac: sdio: Increase the default timeouts a bit

2016-01-25 Thread Doug Anderson
Hi,

On Mon, Jan 25, 2016 at 7:36 AM, Arend van Spriel  wrote:
> On 25-01-16 11:47, Sjoerd Simons wrote:
>> On a Radxa Rock2 board with a Ampak AP6335 (Broadcom 4339 core) it seems
>> the card responds very quickly most of the time, unfortunately during
>> initialisation it sometimes seems to take just a bit over 2 seconds to
>> respond.
>>
>> This results in initialization failing with messages like:
>>   brcmf_c_preinit_dcmds: Retreiving cur_etheraddr failed, -52
>>   brcmf_bus_start: failed: -52
>>   brcmf_sdio_firmware_callback: dongle is not responding
>>
>> Increasing the timeout to allow for a bit more headroom allows the
>> card to initialize reliably.
>
> I would prefer to know where the 2 second response time comes from.
> Could be sdio retuning. Maybe the chromeos people can comment whether
> this has been root caused.

I reviewed Paul's change here but didn't do any root causing.

I think that, like Sjoerd saw, we were seeing this problem at boot
time.  Certainly at boot time lots of things are happening all at the
same time in the system and there are often delays, so anything that
might have been close to timing out in the past may now be actually
timing out.

This is the kind of thing that, IMHO, should have a real timeout that
is 10x what was expected and a non-fatal warning whenever we go over
the expected time.  ...but maybe that's overdesign.  :-P
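
A minimal sketch of that idea, assuming nothing about the brcmfmac
code itself: classify_wait() and the enum are hypothetical names, and
the 10x factor is just the number suggested above.

```c
#include <assert.h>
#include <limits.h>

enum wait_result {
	WAIT_OK,	/* answered within the expected time */
	WAIT_SLOW,	/* answered late: warn, but carry on */
	WAIT_TIMED_OUT,	/* exceeded the hard limit: fail */
};

/*
 * Separate the "expected" time from the hard timeout. Anything between
 * the two is treated as a non-fatal warning rather than an error.
 */
static enum wait_result classify_wait(unsigned int elapsed_ms,
				      unsigned int expected_ms)
{
	unsigned int hard_limit_ms;

	/* Guard against overflow when scaling the expected time. */
	assert(expected_ms > 0 && expected_ms <= UINT_MAX / 10);

	hard_limit_ms = expected_ms * 10;
	if (elapsed_ms > hard_limit_ms)
		return WAIT_TIMED_OUT;
	if (elapsed_ms > expected_ms)
		return WAIT_SLOW;
	return WAIT_OK;
}
```

With an expected time of 2000 ms, a 2100 ms response would be reported
as slow instead of failing initialisation.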

Kinda curious: do we get one or two really slow responses on every
bootup, or just some bootups?  Do we ever succeed even with a slow
(like 1.8 or 1.9 seconds) response, or is it always either "fast" or
"2.1" seconds?


In any case, in my experience the Broadcom firmware is fairly
complicated and has numerous cases where it stretches SDIO more than
the other SDIO WiFi chip I've worked with.  It wouldn't terribly
surprise me if there was a period of time during bootup where it was
non-responsive for 2 seconds.  As unrelated "evidence" showing some of
the Broadcom SDIO limitations, you can see
 and also the
fact that Broadcom often holds the SDIO "busy" signal whereas the
other SDIO WiFi chip I've worked with never did that.  Also, even with all
fixes the Broadcom WiFi module will still show periodic SDIO errors
that the higher level driver just knows to ignore.

My old debugging from the (sorry, private) bug
http://crosbug.com/p/36975 showed this periodically even with all
known fixes:

[21310.271635] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
[21550.583598] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
[21550.616035] brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
[21550.648460] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
[21550.683502] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
[21550.691214] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0100
[22671.121329] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104
[22671.153167] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x01000104
[22671.184581] brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
[22671.192600] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
[22671.201929] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0114
[22671.209536] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0100
[28463.941736] dwmmc_rockchip ff0d.dwmmc: CMD ERR: 0x0104

At the time dekim@ responded:

> There are several sleep/wake controls at different levels. The one we're talking
> about here is controlled by brcmf_sdio_bus_sleep() in the host driver to turn
> on/off bus core on the chip. There can be a period of time when chip is not
> paying attention to the host command (cmd52 to the
> SBSDIO_FUNC1_SLEEPCSR).

...and we decided that the periodic SDIO errors weren't causing any
huge problems (since they were retried).  As far as I know, they still
happen today.


All of the above may not help you, but it serves as evidence that the
SDIO communication to Broadcom isn't terribly amazing and apparently
that's just the way that the module (or perhaps its firmware) is
designed.  It doesn't seem to affect anything in the real world, so I
suppose it is just something we need to live with.


Obviously if you have access to the firmware source code and can debug
further, that would be awesome.  I'm just not hopeful.


In any case:

Reviewed-by: Douglas Anderson 


Re: [PATCH] af_packet: Raw socket destruction warning fix

2016-01-25 Thread Daniel Borkmann

On 01/21/2016 12:40 PM, Maninder Singh wrote:

The other sock_put() in packet_release() to drop the final ref and call into
sk_free(), which drops the 1 ref on the sk_wmem_alloc from init time. Since you
got into __sk_free() via sock_wfree() destructor, your socket must have invoked
packet_release() prior to this (perhaps kernel destroying the process).

What kernel do you use?


Issue is coming for 3.10.58.


[ sorry for late reply ]

What driver are you using (is that in-tree)? Can you reproduce the same issue
with a latest -net kernel, for example (or a 'reasonably' recent one like 4.3
or 4.4)? There have been quite a few changes in err queue handling (which also
accounts rmem) as well. How reliably can you trigger the issue? Does it trigger
with a completely different in-tree network driver as well with your tests?
It would be useful to track/debug sk_rmem_alloc increases/decreases to see from
which path new rmem is being charged in the time between packet_release() and
packet_sock_destruct() for that socket ...


Driver calls dev_kfree_skb_any->dev_kfree_skb_irq
and it adds buffer in completion queue to free and raises softirq NET_TX_SOFTIRQ

net_tx_action->__kfree_skb->skb_release_all->skb_release_head_state->sock_wfree->
__sk_free->packet_sock_destruct

Also purging of receive queue has been taken care in other protocols.


Re: [PATCH net] inet: frag: Always orphan skbs inside ip_defrag()

2016-01-25 Thread Joe Stringer
On 22 January 2016 at 17:22, Eric Dumazet  wrote:
> On Fri, 2016-01-22 at 15:49 -0800, Joe Stringer wrote:
>> Later parts of the stack (including fragmentation) expect that there is
>> never a socket attached to frag in a frag_list, however this invariant
>> was not enforced on all defrag paths. This could lead to the
>> BUG_ON(skb->sk) during ip_do_fragment(), as per the call stack at the
>> end of this commit message.
>>
>> While the call could be added to openvswitch to fix this particular
>> error, the head and tail of the frags list are already orphaned
>> indirectly inside ip_defrag(), so it seems like the remaining fragments
>> should all be orphaned in all circumstances.
>
>
> Yes, it looks we have a problem, and even IP early demux apparently does
> not check if incoming packet is a fragment.
>
> Your patch could also remove some socket leaks in this respect.
>
> I guess we also could add a safety check (ipv4 only, but ipv6 needs care
> as well)
>
> diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
> index b1209b63381f..99513c829213 100644
> --- a/net/ipv4/ip_input.c
> +++ b/net/ipv4/ip_input.c
> @@ -316,7 +316,9 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
> const struct iphdr *iph = ip_hdr(skb);
> struct rtable *rt;
>
> -   if (sysctl_ip_early_demux && !skb_dst(skb) && !skb->sk) {
> +   if (sysctl_ip_early_demux &&
> +   !skb_dst(skb) && !skb->sk &&
> +   !ip_is_fragment(iph)) {
> const struct net_protocol *ipprot;
> int protocol = iph->protocol;

Thanks, I can roll this into a v2 (or keep as a separate patch?). I
got sidetracked on the IPv6 side, some other issues are blocking me on
that but I intend to continue following up there as well.
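
For reference, the extra condition in Eric's snippet can be sketched
in user space as below. This is only an illustration: struct pkt_state
and both helpers are hypothetical, and the fragment field is taken in
host byte order here, whereas the kernel works on the network-order
header field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IPv4 flags/fragment-offset field, host byte order. */
#define FRAG_MF		0x2000u	/* "more fragments" flag */
#define FRAG_OFFSET	0x1FFFu	/* 13-bit fragment offset */

/* Hypothetical stand-in for the skb state the check looks at. */
struct pkt_state {
	int has_dst;		/* a route is already attached */
	int has_sk;		/* a socket is already attached */
	uint16_t frag_off;	/* flags + offset, host order */
};

/* A packet is a fragment if MF is set or its offset is non-zero. */
static int is_fragment(uint16_t frag_off)
{
	return (frag_off & (FRAG_MF | FRAG_OFFSET)) != 0;
}

/* Early demux is only attempted for whole, not yet classified packets. */
static int may_early_demux(int enabled, const struct pkt_state *pkt)
{
	assert(pkt != NULL);

	return enabled && !pkt->has_dst && !pkt->has_sk &&
	       !is_fragment(pkt->frag_off);
}
```

A packet that only has the DF bit (0x4000) set is not a fragment, so it
still qualifies for early demux in this sketch.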


[net 2/2] net: i40e: shut up uninitialized variable warnings

2016-01-25 Thread Jeff Kirsher
From: Arnd Bergmann 

intel/i40e/i40e_txrx.c: In function 'i40e_xmit_frame_ring':
intel/i40e/i40e_txrx.c:2367:20: error: 'oiph' may be used uninitialized in this function [-Werror=maybe-uninitialized]
intel/i40e/i40e_txrx.c:2317:16: note: 'oiph' was declared here
intel/i40e/i40e_txrx.c:2367:17: error: 'oudph' may be used uninitialized in this function [-Werror=maybe-uninitialized]
intel/i40e/i40e_txrx.c:2316:17: note: 'oudph' was declared here

Signed-off-by: Arnd Bergmann 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 720516b..47bd8b3 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2313,8 +2313,8 @@ static void i40e_tx_enable_csum(struct sk_buff *skb, u32 *tx_flags,
struct iphdr *this_ip_hdr;
u32 network_hdr_len;
u8 l4_hdr = 0;
-   struct udphdr *oudph;
-   struct iphdr *oiph;
+   struct udphdr *oudph = NULL;
+   struct iphdr *oiph = NULL;
u32 l4_tunnel = 0;
 
if (skb->encapsulation) {
-- 
2.5.0



[net 1/2] i40e: fix build warnings

2016-01-25 Thread Jeff Kirsher
From: Eric Dumazet 

Fixes following build warnings :

drivers/net/ethernet/intel/i40e/i40e_main.c:7057:13: warning:
'i40e_sync_udp_filters_subtask' defined but not used [-Wunused-function]
drivers/net/ethernet/intel/i40e/i40e_main.c:8524:13: warning:
'i40e_add_vxlan_port' defined but not used [-Wunused-function]
drivers/net/ethernet/intel/i40e/i40e_main.c:8569:13: warning:
'i40e_del_vxlan_port' defined but not used [-Wunused-function]
drivers/net/ethernet/intel/i40e/i40e_main.c:8604:13: warning:
'i40e_add_geneve_port' defined but not used [-Wunused-function]
drivers/net/ethernet/intel/i40e/i40e_main.c:8651:13: warning:
'i40e_del_geneve_port' defined but not used [-Wunused-function]

Fixes: 6a899024058d ("i40e: geneve tunnel offload support")
Signed-off-by: Eric Dumazet 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index bb4612c..8f3b53e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -7117,9 +7117,7 @@ static void i40e_service_task(struct work_struct *work)
i40e_watchdog_subtask(pf);
i40e_fdir_reinit_subtask(pf);
i40e_sync_filters_subtask(pf);
-#if IS_ENABLED(CONFIG_VXLAN) || IS_ENABLED(CONFIG_GENEVE)
i40e_sync_udp_filters_subtask(pf);
-#endif
i40e_clean_adminq_subtask(pf);
 
i40e_service_event_complete(pf);
@@ -8515,6 +8513,8 @@ static u8 i40e_get_udp_port_idx(struct i40e_pf *pf, __be16 port)
 }
 
 #endif
+
+#if IS_ENABLED(CONFIG_VXLAN)
 /**
  * i40e_add_vxlan_port - Get notifications about VXLAN ports that come up
  * @netdev: This physical port's netdev
@@ -8524,7 +8524,6 @@ static u8 i40e_get_udp_port_idx(struct i40e_pf *pf, __be16 port)
 static void i40e_add_vxlan_port(struct net_device *netdev,
sa_family_t sa_family, __be16 port)
 {
-#if IS_ENABLED(CONFIG_VXLAN)
struct i40e_netdev_priv *np = netdev_priv(netdev);
struct i40e_vsi *vsi = np->vsi;
struct i40e_pf *pf = vsi->back;
@@ -8557,7 +8556,6 @@ static void i40e_add_vxlan_port(struct net_device *netdev,
pf->udp_ports[next_idx].type = I40E_AQC_TUNNEL_TYPE_VXLAN;
pf->pending_udp_bitmap |= BIT_ULL(next_idx);
pf->flags |= I40E_FLAG_UDP_FILTER_SYNC;
-#endif
 }
 
 /**
@@ -8569,7 +8567,6 @@ static void i40e_add_vxlan_port(struct net_device *netdev,
 static void i40e_del_vxlan_port(struct net_device *netdev,
sa_family_t sa_family, __be16 port)
 {
-#if IS_ENABLED(CONFIG_VXLAN)
struct i40e_netdev_priv *np = netdev_priv(netdev);
struct i40e_vsi *vsi = np->vsi;
struct i40e_pf *pf = vsi->back;
@@ -8592,9 +8589,10 @@ static void i40e_del_vxlan_port(struct net_device *netdev,
netdev_warn(netdev, "vxlan port %d was not found, not deleting\n",
ntohs(port));
}
-#endif
 }
+#endif
 
+#if IS_ENABLED(CONFIG_GENEVE)
 /**
  * i40e_add_geneve_port - Get notifications about GENEVE ports that come up
  * @netdev: This physical port's netdev
@@ -8604,7 +8602,6 @@ static void i40e_del_vxlan_port(struct net_device *netdev,
 static void i40e_add_geneve_port(struct net_device *netdev,
 sa_family_t sa_family, __be16 port)
 {
-#if IS_ENABLED(CONFIG_GENEVE)
struct i40e_netdev_priv *np = netdev_priv(netdev);
struct i40e_vsi *vsi = np->vsi;
struct i40e_pf *pf = vsi->back;
@@ -8639,7 +8636,6 @@ static void i40e_add_geneve_port(struct net_device *netdev,
pf->flags |= I40E_FLAG_UDP_FILTER_SYNC;
 
dev_info(&pf->pdev->dev, "adding geneve port %d\n", ntohs(port));
-#endif
 }
 
 /**
@@ -8651,7 +8647,6 @@ static void i40e_add_geneve_port(struct net_device *netdev,
 static void i40e_del_geneve_port(struct net_device *netdev,
 sa_family_t sa_family, __be16 port)
 {
-#if IS_ENABLED(CONFIG_GENEVE)
struct i40e_netdev_priv *np = netdev_priv(netdev);
struct i40e_vsi *vsi = np->vsi;
struct i40e_pf *pf = vsi->back;
@@ -8677,8 +8672,8 @@ static void i40e_del_geneve_port(struct net_device *netdev,
netdev_warn(netdev, "geneve port %d was not found, not deleting\n",
ntohs(port));
}
-#endif
 }
+#endif
 
 static int i40e_get_phys_port_id(struct net_device *netdev,
 struct netdev_phys_item_id *ppid)
-- 
2.5.0



[net 0/2][pull request] Intel Wired LAN Driver Updates 2016-01-25

2016-01-25 Thread Jeff Kirsher
This series contains updates to i40e only, sent now so that I stop receiving
patches to fix the same issue (again).

Arnd fixes the compiler warnings about uninitialized variables in the driver
by initializing those variables.

Eric fixes the build errors/warnings which were introduced by Anjali
when she added geneve support to i40e.


The following are changes since commit c85e4924452ae8225c8829f3fa8a2f7baa34bc5c:
  hv_netvsc: Fix book keeping of skb during batching process
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue master

Arnd Bergmann (1):
  net: i40e: shut up uninitialized variable warnings

Eric Dumazet (1):
  i40e: fix build warnings

 drivers/net/ethernet/intel/i40e/i40e_main.c | 15 +--
 drivers/net/ethernet/intel/i40e/i40e_txrx.c |  4 ++--
 2 files changed, 7 insertions(+), 12 deletions(-)

-- 
2.5.0



Re: [PATCH 1/3] i40e: fix build warning

2016-01-25 Thread Jeff Kirsher
On Mon, 2016-01-25 at 11:40 +0530, Sudip Mukherjee wrote:
> While building we are getting warnings about:
> i40e_main.c:8604:13: warning: 'i40e_add_geneve_port' defined but not used
> and
> i40e_main.c:8651:13: warning: 'i40e_del_geneve_port' defined but not used
> 
> The contents of these functions are defined under CONFIG_GENEVE, so if
> CONFIG_GENEVE is not defined then we have unused empty functions.
> Let's put these functions under CONFIG_GENEVE; as the callback is already
> defined under CONFIG_GENEVE, there is no chance of any failure.
> 
> Signed-off-by: Sudip Mukherjee 
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)

This is not a complete fix for the issue and Eric Dumazet has already
submitted a fix for the issue, which I have just sent to David Miller
in a pull-request.

So I have dropped this patch, in favor of Eric's solution.



Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection

2016-01-25 Thread Jay Vosburgh
 wrote:

>From: Zhu Yanjun 
>
>Bonding will utilize notifier callbacks to detect slave
>link state changes. It is intended to be used with miimon
>set to zero, and does not support the updelay or downdelay
>options to bonding.
>
>Because of link flap from the slave interface, if the notifier
>is NETDEV_UP while the actual link state is down, it is not
>necessary to continue.
>
>Signed-off-by: Jay Vosburgh 

I haven't signed off on this patch.

I've just started some testing, but as before immediately get an
RCU warning; it looks to be coming from bond_miimon_inspect_slave();

[  316.473050] bond1: Enslaving eth1 as a backup interface with an up link
[  316.473059] 
[  316.473806] ===
[  316.475630] [ INFO: suspicious RCU usage. ]
[  316.477519] 4.4.0+ #38 Not tainted
[  316.479094] ---
[  316.480765] drivers/net/bonding/bond_main.c:2024 suspicious rcu_dereference_check() usage!

This is presumably because the "case NETDEV_DOWN" call to
bond_miimon_inspect_slave does not hold RCU.  It does hold RTNL, though,
which should be safe for this usage (RTNL mutexes changes to the active
slave).  The appended patch on top of the original makes the warning go
away.

I'm still testing the patch and have no comment about its
functionality as yet.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9f67948..e3faee9 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2014,14 +2014,14 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
 
 /* Monitoring ---*/
 
-/* called with rcu_read_lock() */
+/* called with rcu_read_lock() or RTNL */
 static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave,
 unsigned long event)
 {
int link_state;
bool ignore_updelay;
 
-   ignore_updelay = !rcu_dereference(bond->curr_active_slave);
+   ignore_updelay = !rcu_dereference_rtnl(bond->curr_active_slave);
 
slave->new_link = BOND_LINK_NOCHANGE;
 

-J

---
-Jay Vosburgh, jay.vosbu...@canonical.com


Re: [PATCH/RFC v4 net-next] ravb: Add dma queue interrupt support

2016-01-25 Thread Simon Horman
On Mon, Jan 25, 2016 at 12:52:55AM +0900, Yoshihiro Kaneko wrote:
> From: Kazuya Mizuguchi 
> 
> This patch supports the following interrupts.
> 
> - One interrupt for multiple (descriptor, error, management)
> - One interrupt for emac
> - Four interrupts for dma queue (best effort rx/tx, network control rx/tx)
> 
> This patch improves the efficiency of interrupt handling by adding an
> interrupt handler for each of the interrupt sources described
> above. Additionally, it reduces the number of accesses to the
> EthernetAVB IF.
> 
> Signed-off-by: Kazuya Mizuguchi 
> Signed-off-by: Yoshihiro Kaneko 

I have tested this patch and the result seems positive.
Please let me know if any more/different testing would help.

My test was to examine /proc/interrupts after booting a Salvator-X board
using NFS root. The test used net-next merged with v4.5-rc1 (for
r8a7795/Salvator-X support). I then applied this patch.

Without this patch:
# grep eth /proc/interrupts
 74:  13002  0  0  0 GIC-0  93 Level eth0
 76:  3  0  0  0 GIC-0  95 Level eth0

With this patch:

# grep eth /proc/interrupts
 52:   8744  0  0  0 GIC-0  71 Level eth0:ch0:rx_be
 53:  0  0  0  0 GIC-0  72 Level eth0:ch1:rx_nc
 70:   4277  0  0  0 GIC-0  89 Level eth0:ch18:tx_be
 71:  0  0  0  0 GIC-0  90 Level eth0:ch19:tx_nc
 74:  0  0  0  0 GIC-0  93 Level eth0:ch22:multi
 76:  3  0  0  0 GIC-0  95 Level eth0:ch24:emac

Please feel free to add:

Tested-by: Simon Horman 


[PATCHv2 2/4] net: rfkill: gpio: get the name and type from device property

2016-01-25 Thread Heikki Krogerus
This prepares the driver for removal of platform data.

Signed-off-by: Heikki Krogerus 
---
 net/rfkill/rfkill-gpio.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/net/rfkill/rfkill-gpio.c b/net/rfkill/rfkill-gpio.c
index 4b1e3f3..1a9c031 100644
--- a/net/rfkill/rfkill-gpio.c
+++ b/net/rfkill/rfkill-gpio.c
@@ -81,7 +81,6 @@ static int rfkill_gpio_acpi_probe(struct device *dev,
if (!id)
return -ENODEV;
 
-   rfkill->name = dev_name(dev);
rfkill->type = (unsigned)id->driver_data;
 
return acpi_dev_add_driver_gpios(ACPI_COMPANION(dev),
@@ -93,12 +92,21 @@ static int rfkill_gpio_probe(struct platform_device *pdev)
struct rfkill_gpio_platform_data *pdata = pdev->dev.platform_data;
struct rfkill_gpio_data *rfkill;
struct gpio_desc *gpio;
+   const char *type_name;
int ret;
 
rfkill = devm_kzalloc(&pdev->dev, sizeof(*rfkill), GFP_KERNEL);
if (!rfkill)
return -ENOMEM;
 
+   device_property_read_string(&pdev->dev, "name", &rfkill->name);
+   device_property_read_string(&pdev->dev, "type", &type_name);
+
+   if (!rfkill->name)
+   rfkill->name = dev_name(&pdev->dev);
+
+   rfkill->type = rfkill_find_type(type_name);
+
if (ACPI_HANDLE(&pdev->dev)) {
ret = rfkill_gpio_acpi_probe(&pdev->dev, rfkill);
if (ret)
@@ -124,10 +132,8 @@ static int rfkill_gpio_probe(struct platform_device *pdev)
 
rfkill->shutdown_gpio = gpio;
 
-   /* Make sure at-least one of the GPIO is defined and that
-* a name is specified for this instance
-*/
-   if ((!rfkill->reset_gpio && !rfkill->shutdown_gpio) || !rfkill->name) {
+   /* Make sure at-least one GPIO is defined for this instance */
+   if (!rfkill->reset_gpio && !rfkill->shutdown_gpio) {
dev_err(&pdev->dev, "invalid platform data\n");
return -EINVAL;
}
-- 
2.7.0.rc3



[PATCHv2 1/4] net: rfkill: add rfkill_find_type function

2016-01-25 Thread Heikki Krogerus
Helper for finding the type based on name. Useful if the
type needs to be determined based on device property.

Signed-off-by: Heikki Krogerus 
---
 include/linux/rfkill.h | 15 +
 net/rfkill/core.c  | 57 +-
 2 files changed, 44 insertions(+), 28 deletions(-)

diff --git a/include/linux/rfkill.h b/include/linux/rfkill.h
index d901078..522ccbc 100644
--- a/include/linux/rfkill.h
+++ b/include/linux/rfkill.h
@@ -212,6 +212,15 @@ void rfkill_set_states(struct rfkill *rfkill, bool sw, bool hw);
  * @rfkill: rfkill struct to query
  */
 bool rfkill_blocked(struct rfkill *rfkill);
+
+/**
+ * rfkill_find_type - Helper for finding rfkill type by name
+ * @name: the name of the type
+ *
+ * Returns the enum rfkill_type that corresponds to the name.
+ */
+enum rfkill_type rfkill_find_type(const char *name);
+
 #else /* !RFKILL */
 static inline struct rfkill * __must_check
 rfkill_alloc(const char *name,
@@ -268,6 +277,12 @@ static inline bool rfkill_blocked(struct rfkill *rfkill)
 {
return false;
 }
+
+static inline enum rfkill_type rfkill_find_type(const char *name)
+{
+   return RFKILL_TYPE_ALL;
+}
+
 #endif /* RFKILL || RFKILL_MODULE */
 
 
diff --git a/net/rfkill/core.c b/net/rfkill/core.c
index f53bf3b6..e9a5cdf 100644
--- a/net/rfkill/core.c
+++ b/net/rfkill/core.c
@@ -582,6 +582,33 @@ void rfkill_set_states(struct rfkill *rfkill, bool sw, bool hw)
 }
 EXPORT_SYMBOL(rfkill_set_states);
 
+static const char *rfkill_types[NUM_RFKILL_TYPES] = {
+   [RFKILL_TYPE_WLAN]  = "wlan",
+   [RFKILL_TYPE_BLUETOOTH] = "bluetooth",
+   [RFKILL_TYPE_UWB]   = "ultrawideband",
+   [RFKILL_TYPE_WIMAX] = "wimax",
+   [RFKILL_TYPE_WWAN]  = "wwan",
+   [RFKILL_TYPE_GPS]   = "gps",
+   [RFKILL_TYPE_FM]= "fm",
+   [RFKILL_TYPE_NFC]   = "nfc",
+};
+
+enum rfkill_type rfkill_find_type(const char *name)
+{
+   int i;
+
+   BUILD_BUG_ON(!rfkill_types[NUM_RFKILL_TYPES - 1]);
+
+   if (!name)
+   return RFKILL_TYPE_ALL;
+
+   for (i = 1; i < NUM_RFKILL_TYPES; i++)
+   if (!strcmp(name, rfkill_types[i]))
+   return i;
+   return RFKILL_TYPE_ALL;
+}
+EXPORT_SYMBOL(rfkill_find_type);
+
 static ssize_t name_show(struct device *dev, struct device_attribute *attr,
 char *buf)
 {
@@ -591,38 +618,12 @@ static ssize_t name_show(struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR_RO(name);
 
-static const char *rfkill_get_type_str(enum rfkill_type type)
-{
-   BUILD_BUG_ON(NUM_RFKILL_TYPES != RFKILL_TYPE_NFC + 1);
-
-   switch (type) {
-   case RFKILL_TYPE_WLAN:
-   return "wlan";
-   case RFKILL_TYPE_BLUETOOTH:
-   return "bluetooth";
-   case RFKILL_TYPE_UWB:
-   return "ultrawideband";
-   case RFKILL_TYPE_WIMAX:
-   return "wimax";
-   case RFKILL_TYPE_WWAN:
-   return "wwan";
-   case RFKILL_TYPE_GPS:
-   return "gps";
-   case RFKILL_TYPE_FM:
-   return "fm";
-   case RFKILL_TYPE_NFC:
-   return "nfc";
-   default:
-   BUG();
-   }
-}
-
 static ssize_t type_show(struct device *dev, struct device_attribute *attr,
 char *buf)
 {
struct rfkill *rfkill = to_rfkill(dev);
 
-   return sprintf(buf, "%s\n", rfkill_get_type_str(rfkill->type));
+   return sprintf(buf, "%s\n", rfkill_types[rfkill->type]);
 }
 static DEVICE_ATTR_RO(type);
 
@@ -768,7 +769,7 @@ static int rfkill_dev_uevent(struct device *dev, struct kobj_uevent_env *env)
if (error)
return error;
error = add_uevent_var(env, "RFKILL_TYPE=%s",
-  rfkill_get_type_str(rfkill->type));
+  rfkill_types[rfkill->type]);
if (error)
return error;
spin_lock_irqsave(&rfkill->lock, flags);
-- 
2.7.0.rc3
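
As a rough user-space illustration of the table-driven lookup this
patch introduces, the sketch below uses one string table for both
directions (name to type, and type to name). The enum, the table and
both function names are hypothetical and shortened; they are not the
rfkill API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shortened, hypothetical type list; index 0 means "all types". */
enum radio_type {
	RADIO_ALL = 0,
	RADIO_WLAN,
	RADIO_BLUETOOTH,
	RADIO_GPS,
	NUM_RADIO_TYPES,
};

/* One table serves both the lookup by name and the printing by type. */
static const char *const radio_names[NUM_RADIO_TYPES] = {
	[RADIO_WLAN]		= "wlan",
	[RADIO_BLUETOOTH]	= "bluetooth",
	[RADIO_GPS]		= "gps",
};

/* Map a name to its type; unknown or missing names fall back to "all". */
static enum radio_type find_radio_type(const char *name)
{
	int i;

	if (!name)
		return RADIO_ALL;

	/* Start at 1: slot 0 has no name. */
	for (i = 1; i < NUM_RADIO_TYPES; i++)
		if (!strcmp(name, radio_names[i]))
			return (enum radio_type)i;
	return RADIO_ALL;
}

/* Reverse direction: only valid for concrete types, not RADIO_ALL. */
static const char *radio_type_name(enum radio_type type)
{
	assert(type > RADIO_ALL && type < NUM_RADIO_TYPES);
	return radio_names[type];
}
```

Falling back to the "all" value for a NULL or unknown name matches the
behaviour of rfkill_find_type() in the diff above.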



Re: net: GPF in netlink_getsockbyportid

2016-01-25 Thread Herbert Xu
On Sun, Jan 24, 2016 at 01:11:03AM +0100, Florian Westphal wrote:
> Daniel Borkmann  wrote:
> > On 01/23/2016 08:25 PM, Florian Westphal wrote:
> > >Dmitry Vyukov  wrote:
> > >
> > >[ CC nf-devel, not sure if its nfnetlink fault or NETLINK_MMAP ]
> > >
> > >>The following program causes GPF in netlink_getsockbyportid:
> [..]
> 
> > >CONFIG_NETLINK_MMAP and nfnetlink batching strike in unison :-/
> > >
> > >root cause is in nfnetlink_rcv_batch():
> > >
> > >296 replay:
> > >297 status = 0;
> > >298
> > >299 skb = netlink_skb_clone(oskb, GFP_KERNEL);
> > >
> > >The clone op doesn't copy oskb->sk, so we oops in
> > >__netlink_alloc_skb -> netlink_getsockbyportid() when nfnetlink_rcv_batch
> > >tries to send netlink ack.
> > 
> > If indeed oskb is the mmap'ed netlink skb, then it's not even allowed
> > to call into skb_clone()
> 
> Right, but in this case there is no mmap'd netlink sk involved -- we
> crash when we try to look up dst netlink socket to see if there is an
> mmap'd ring attached.
> 
> [ and that code isn't there with CONFIG_NETLINK_MMAP=n ].

Let's CC Pablo since he wrote the code in question.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH V2 0/3] basic busy polling support for vhost_net

2016-01-25 Thread Jason Wang


On 01/25/2016 03:58 PM, Michael Rapoport wrote:
> (restored 'CC, sorry for dropping it originally, Notes is still hard
> for me)
>
> > Jason Wang  wrote on 01/25/2016 05:00:05 AM:
> > On 01/24/2016 05:00 PM, Mike Rapoport wrote:
> > > Hi Jason,
> > >
> > >> Jason Wang  redhat.com> writes:
> > >>
> > >> Hi all:
> > >>
> > >> This series tries to add basic busy polling for vhost net. The idea is
> > >> simple: at the end of tx/rx processing, busy poll for newly added tx
> > >> descriptors and the rx receive socket for a while.
> > > There were several concerns Michael raised on Razya's attempt to add
> > > polling to vhost-net ([1], [2]). Some of them seem relevant for these
> > > patches as well:
> > >
> > > - What happens in overcommit scenarios?
> >
> > We have an optimization here: busy polling will end if more than one
> > process is runnable on the local cpu. This is done by checking
> > single_task_running() in each iteration. So in the worst case, busy
> > polling should be as fast as, or only a minor regression compared to,
> > the normal case. You can see this from the last test result.
> >
> > > - Have you checked the effect of polling on some macro benchmarks?
> >
> > I'm not sure I get the question. The cover letter shows some netperf
> > benchmark results. What do you mean by "macro benchmarks"?
>
> Back then, when Razya posted her polling implementation, Michael had a
> concern about the macro effect ([3]),
> so I was wondering if this concern is also valid for your implementation.
> Now, after I've reread your changes, I think it's not that relevant...

More benchmarks are good, but lots of kernel patches have been accepted with
only simple netperf results. Anyway, busy polling is disabled by default; I
will try to do macro benchmarks in the future if I have time.

>
>
> > >> The maximum amount of time (in us) that can be spent on busy polling is
> > >> specified via ioctl.
> > > Although ioctl is definitely a more appropriate interface to allow the
> > > user to tune polling, it's still not clear to me how the *end user* will
> > > interact with it and how easy it would be for him/her.
> >
> > There will be a qemu part of the code for the end user, e.g. a
> > vhost_poll_us parameter for tap like:
> >
> > -netdev tap,id=hn0,vhost=on,vhost_poll_us=20
>
> Not strictly related, I'd like to give a try to polling + vhost thread
> sharing and polling + workqueues.
> Do you mind sharing the scripts you used to test the polling?

Sure, it was a subtest of autotest[1].

[1]
https://github.com/autotest/tp-qemu/blob/7cf589b490aff7511eccbf2e1336ecf8d9fa9cb9/generic/tests/netperf.py

>
>  
> Thanks,
> Mike.
>
> > Thanks
> >
> > >
> > > [1] http://thread.gmane.org/gmane.linux.kernel/1765593
> > > [2] http://thread.gmane.org/gmane.comp.emulators.kvm.devel/131343
> > >
> > > --
> > > Sincerely yours,
> > > Mike.
> > >
>
> [3] https://www.mail-archive.com/kvm@vger.kernel.org/msg109703.html
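
The bail-out rule discussed in this thread (stop spinning once the time budget is used up or another task becomes runnable on the CPU) can be sketched in userspace as below. This is only a simulation under stated assumptions: struct sim_cpu, its fields, and the helper names are hypothetical, the clock advances by one microsecond per iteration, and single_task_running_sim() merely stands in for the kernel's single_task_running(). It is not the vhost_net code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simulated CPU state; all times are in microseconds. */
struct sim_cpu {
	uint64_t now_us;	/* simulated clock, +1 per poll iteration */
	uint64_t work_at_us;	/* time at which new work becomes visible */
	uint64_t contend_at_us;	/* time at which a second task is runnable */
};

/* Stand-in for "a new tx descriptor or rx data is available". */
static bool work_pending(const struct sim_cpu *c)
{
	return c->now_us >= c->work_at_us;
}

/* Stand-in for single_task_running(): true while we are alone on the CPU. */
static bool single_task_running_sim(const struct sim_cpu *c)
{
	return c->now_us < c->contend_at_us;
}

/*
 * Spin for at most budget_us. Returns true if work was found, false if
 * the budget ran out or the CPU became contended first.
 */
static bool busy_poll(struct sim_cpu *c, uint64_t budget_us)
{
	uint64_t deadline;

	assert(c != NULL);
	deadline = c->now_us + budget_us;

	while (c->now_us < deadline) {
		if (work_pending(c))
			return true;
		if (!single_task_running_sim(c))
			return false;	/* yield to the other runnable task */
		c->now_us++;
	}

	/* One last look, in case work arrived exactly at the deadline. */
	return work_pending(c);
}
```

Checking for contention on every iteration is what bounds the cost in the overcommit case: the loop gives up as soon as someone else wants the CPU instead of burning the whole budget.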



Re: [PATCH 2/3] net: macb: fix build warning

2016-01-25 Thread Nicolas Ferre
On 25/01/2016 07:13, Sudip Mukherjee wrote:
> We are getting build warnings about:
> macb.c:2889:13: warning: 'tx_clk' may be used uninitialized in this function
> macb.c:2888:11: warning: 'hclk' may be used uninitialized in this function
> 
> In reality they are not used uninitialized, as clk_init() will initialize
> them; this patch just silences the warning.
> 
> Signed-off-by: Sudip Mukherjee 

Acked-by: Nicolas Ferre 

Thanks for your patch.

Bye,

> ---
>  drivers/net/ethernet/cadence/macb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> index 9d9984a..50c9410 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -2823,7 +2823,7 @@ static int macb_probe(struct platform_device *pdev)
>   struct device_node *np = pdev->dev.of_node;
>   struct device_node *phy_node;
>   const struct macb_config *macb_config = NULL;
> - struct clk *pclk, *hclk, *tx_clk;
> + struct clk *pclk, *hclk = NULL, *tx_clk = NULL;
>   unsigned int queue_mask, num_queues;
>   struct macb_platform_data *pdata;
>   bool native_io;
> 


-- 
Nicolas Ferre
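
For readers unfamiliar with this class of warning, here is a small sketch of the pattern, assuming a helper that fills its out-parameters only on success. Everything in it (struct clk as a one-field stand-in, clk_init_sketch(), probe_sketch()) is hypothetical and is not the macb driver code; it only shows why giving the pointers an explicit NULL initial value keeps every path well defined when the compiler cannot see through the helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a clock handle. */
struct clk {
	int rate;
};

static struct clk pclk_dev = { 100 };
static struct clk hclk_dev = { 200 };

/* Fills both out-parameters on success and leaves them untouched on failure. */
static int clk_init_sketch(int fail, struct clk **pclk, struct clk **hclk)
{
	if (fail)
		return -1;

	*pclk = &pclk_dev;
	*hclk = &hclk_dev;
	return 0;
}

/* Returns the summed rates on success, or the helper's error code. */
static int probe_sketch(int fail)
{
	/* Explicit NULL: the pointers have a known value on every path. */
	struct clk *pclk = NULL, *hclk = NULL;
	int err;

	err = clk_init_sketch(fail, &pclk, &hclk);
	if (err)
		return err;	/* pointers are still NULL here, never garbage */

	assert(pclk && hclk);
	return pclk->rate + hclk->rate;
}
```

The initialization costs nothing at runtime in practice and also makes a later error path that tests the pointers safe.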


[PATCHv2 0/4] net: rfkill: gpio: replace platform data with build-in property

2016-01-25 Thread Heikki Krogerus
Hi,

The changes to the unified properties interface that I have been
waiting for are finally available in v4.5-rc1.


Heikki Krogerus (4):
  net: rfkill: add rfkill_find_type function
  net: rfkill: gpio: get the name and type from device property
  ARM: tegra: use build-in device properties with rfkill_gpio
  net: rfkill: gpio: remove rfkill_gpio_platform_data

 arch/arm/mach-tegra/board-paz00.c | 17 +++-
 include/linux/rfkill-gpio.h   | 37 -
 include/linux/rfkill.h| 15 +++
 net/rfkill/Kconfig|  3 +--
 net/rfkill/core.c | 57 ---
 net/rfkill/rfkill-gpio.c  | 24 -
 6 files changed, 66 insertions(+), 87 deletions(-)
 delete mode 100644 include/linux/rfkill-gpio.h

-- 
2.7.0.rc3



[PATCHv2 4/4] net: rfkill: gpio: remove rfkill_gpio_platform_data

2016-01-25 Thread Heikki Krogerus
No more users for it.

Signed-off-by: Heikki Krogerus 
---
 include/linux/rfkill-gpio.h | 37 -
 net/rfkill/Kconfig  |  3 +--
 net/rfkill/rfkill-gpio.c|  8 
 3 files changed, 1 insertion(+), 47 deletions(-)
 delete mode 100644 include/linux/rfkill-gpio.h

diff --git a/include/linux/rfkill-gpio.h b/include/linux/rfkill-gpio.h
deleted file mode 100644
index 20bcb55..000
--- a/include/linux/rfkill-gpio.h
+++ /dev/null
@@ -1,37 +0,0 @@
-/*
- * Copyright (c) 2011, NVIDIA Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
- * more details.
- *
- * You should have received a copy of the GNU General Public License along
- * with this program; if not, write to the Free Software Foundation, Inc.,
- * 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
- */
-
-
-#ifndef __RFKILL_GPIO_H
-#define __RFKILL_GPIO_H
-
-#include 
-#include 
-
-/**
- * struct rfkill_gpio_platform_data - platform data for rfkill gpio device.
- * for unused gpio's, the expected value is -1.
- * @name:  name for the gpio rf kill instance
- */
-
-struct rfkill_gpio_platform_data {
-   char*name;
-   enum rfkill_typetype;
-};
-
-#endif /* __RFKILL_GPIO_H */
diff --git a/net/rfkill/Kconfig b/net/rfkill/Kconfig
index 598d374..868f1ad 100644
--- a/net/rfkill/Kconfig
+++ b/net/rfkill/Kconfig
@@ -41,5 +41,4 @@ config RFKILL_GPIO
default n
help
  If you say yes here you get support of a generic gpio RFKILL
- driver. The platform should fill in the appropriate fields in the
- rfkill_gpio_platform_data structure and pass that to the driver.
+ driver.
diff --git a/net/rfkill/rfkill-gpio.c b/net/rfkill/rfkill-gpio.c
index 1a9c031..76c01cb 100644
--- a/net/rfkill/rfkill-gpio.c
+++ b/net/rfkill/rfkill-gpio.c
@@ -27,8 +27,6 @@
 #include 
 #include 
 
-#include 
-
 struct rfkill_gpio_data {
const char  *name;
enum rfkill_typetype;
@@ -89,7 +87,6 @@ static int rfkill_gpio_acpi_probe(struct device *dev,
 
 static int rfkill_gpio_probe(struct platform_device *pdev)
 {
-   struct rfkill_gpio_platform_data *pdata = pdev->dev.platform_data;
struct rfkill_gpio_data *rfkill;
struct gpio_desc *gpio;
const char *type_name;
@@ -111,11 +108,6 @@ static int rfkill_gpio_probe(struct platform_device *pdev)
ret = rfkill_gpio_acpi_probe(&pdev->dev, rfkill);
if (ret)
return ret;
-   } else if (pdata) {
-   rfkill->name = pdata->name;
-   rfkill->type = pdata->type;
-   } else {
-   return -ENODEV;
}
 
rfkill->clk = devm_clk_get(&pdev->dev, NULL);
-- 
2.7.0.rc3



[PATCHv2 3/4] ARM: tegra: use build-in device properties with rfkill_gpio

2016-01-25 Thread Heikki Krogerus
Pass the rfkill name and type to the device with properties
instead of driver specific platform data.

Signed-off-by: Heikki Krogerus 
CC: Alexandre Courbot 
CC: Thierry Reding 
CC: Stephen Warren 
---
 arch/arm/mach-tegra/board-paz00.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/arm/mach-tegra/board-paz00.c b/arch/arm/mach-tegra/board-paz00.c
index 49d1110..52db8bf 100644
--- a/arch/arm/mach-tegra/board-paz00.c
+++ b/arch/arm/mach-tegra/board-paz00.c
@@ -17,23 +17,25 @@
  *
  */
 
+#include 
 #include 
 #include 
-#include 
 
 #include "board.h"
 
-static struct rfkill_gpio_platform_data wifi_rfkill_platform_data = {
-   .name   = "wifi_rfkill",
-   .type   = RFKILL_TYPE_WLAN,
+static struct property_entry __initdata wifi_rfkill_prop[] = {
+   PROPERTY_ENTRY_STRING("name", "wifi_rfkill"),
+   PROPERTY_ENTRY_STRING("type", "wlan"),
+   { },
+};
+
+static struct property_set __initdata wifi_rfkill_pset = {
+   .properties = wifi_rfkill_prop,
 };
 
 static struct platform_device wifi_rfkill_device = {
.name   = "rfkill_gpio",
.id = -1,
-   .dev= {
-   .platform_data = &wifi_rfkill_platform_data,
-   },
 };
 
 static struct gpiod_lookup_table wifi_gpio_lookup = {
@@ -47,6 +49,7 @@ static struct gpiod_lookup_table wifi_gpio_lookup = {
 
 void __init tegra_paz00_wifikill_init(void)
 {
+   platform_device_add_properties(&wifi_rfkill_device, &wifi_rfkill_pset);
gpiod_add_lookup_table(&wifi_gpio_lookup);
platform_device_register(&wifi_rfkill_device);
 }
-- 
2.7.0.rc3



Supermicro AOC-STGN-i2S w intel 82599ES on Brocade ICX6610 - random link failures

2016-01-25 Thread Nikola Ciprich
Hello netdev readers,

I'd like to ask about the following problem we're dealing with:

I have a cluster of three nodes connected to stacked Brocade ICX6610
switches using bonded AOC-STGN-i2S adapters (they're using 82599ES
chipsets).

The problem is, I see random link failures on practically all
interfaces. The link always goes down for a very short time, then the
adapter is reset and the link goes up again.

Here's dmesg snippet:

[Jan22 22:09] ixgbe :03:00.0 eth0: NIC Link is Down
[  +0.005610] ixgbe :03:00.0 eth0: initiating reset to clear Tx work after link loss
[  +0.012792] bond0: link status definitely down for interface eth0, disabling it
[  +1.105826] ixgbe :03:00.0 eth0: Reset adapter
[  +0.307518] ixgbe :03:00.0 eth0: detected SFP+: 3
[  +0.145881] ixgbe :03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX

Since I'm using bonding, it doesn't disrupt traffic, but I'd still like to
resolve it. We're using 5m passive SFP cables; we tried replacing one with a
3m piece, to no avail.

All three boxes are Supermicro X10DRW, running a vanilla x86_64 4.0.5 kernel
(I'll upgrade it to 4.1.16 soon).

We were using Broadcom adapters before and they were working without such
problems (except for one particular port, which showed mysterious packet
drops every few months; that's why we switched to Intel-based adapters), so
I think the cables and switches should be fine, but I'm not sure of course.

I think I've seen similar problems and they were PM related, but I'm not sure.

Has anyone seen a similar problem?

Or are there any tips on how I could debug it?

If I can provide more information, please let me know.

BR

nik

-- 
-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:+420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-




Re: [PATCH] ipv4+ipv6: Make INET*_ESP select CRYPTO_ECHAINIV

2016-01-25 Thread Herbert Xu
Thomas Egerer  wrote:
> The ESP algorithms using CBC mode require echainiv. Hence INET*_ESP have
> to select CRYPTO_ECHAINIV in order to work properly. This solves the
> issues caused by a misconfiguration as described in [1].
> The original approach, patching crypto/Kconfig was turned down by
> Herbert Xu [2].
> 
> [1] https://lists.strongswan.org/pipermail/users/2015-December/009074.html
> [2] http://marc.info/?l=linux-crypto-vger=145224655809562=2
> 
> Signed-off-by: Thomas Egerer 
> ---
> net/ipv4/Kconfig | 1 +
> net/ipv6/Kconfig | 1 +
> 2 files changed, 2 insertions(+)
> 
> diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
> index c229205..7758247 100644
> --- a/net/ipv4/Kconfig
> +++ b/net/ipv4/Kconfig
> @@ -353,6 +353,7 @@ config INET_ESP
>select CRYPTO_CBC
>select CRYPTO_SHA1
>select CRYPTO_DES
> +   select CRYPTO_ECHAINIV
>---help---
>  Support for IPsec ESP.
> 
> diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
> index bb7dabe..40c8975 100644
> --- a/net/ipv6/Kconfig
> +++ b/net/ipv6/Kconfig
> @@ -69,6 +69,7 @@ config INET6_ESP
>select CRYPTO_CBC
>select CRYPTO_SHA1
>select CRYPTO_DES

Your patch seems to be missing a few lines at the end.

Otherwise it looks good to me.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


6lowpan: presenting my stateful compression work

2016-01-25 Thread Alexander Aring
Hi,

Maybe somebody is nearby and has time to visit me at [0].
I will be there on 27.01.2016 from 12:45 until 14:15 (CET) for a
poster session. The poster session is public, so everybody is welcome
to join as a visitor.

Background:
I am currently working toward my master's degree and need to do a "project".
I combined it with my work at Pengutronix to implement stateful
compression support for Linux 6LoWPAN.

You can also get the poster online at [1].

- Alex

[0] https://goo.gl/maps/Vr9frb6sQSq
[1] http://wpan.cakelab.org/doc/aring_poster.pdf


[PATCH 4.1] [media] media/vivid-osd: fix info leak in ioctl

2016-01-25 Thread Yuki Machida
commit eda98796aff0d9bf41094b06811f5def3b4c333c upstream.

The vivid_fb_ioctl() code fails to initialize the 16 _reserved bytes of
struct fb_vblank after the ->hcount member. Add an explicit
memset(0) before filling the structure to avoid the info leak.

This fixes CVE-2015-7884.

Signed-off-by: Salva Peiró 
Signed-off-by: Hans Verkuil 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Yuki Machida 
---
 drivers/media/platform/vivid/vivid-osd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/media/platform/vivid/vivid-osd.c b/drivers/media/platform/vivid/vivid-osd.c
index 084d346..e15eef6 100644
--- a/drivers/media/platform/vivid/vivid-osd.c
+++ b/drivers/media/platform/vivid/vivid-osd.c
@@ -85,6 +85,7 @@ static int vivid_fb_ioctl(struct fb_info *info, unsigned cmd, unsigned long arg)
case FBIOGET_VBLANK: {
struct fb_vblank vblank;
 
+   memset(&vblank, 0, sizeof(vblank));
vblank.flags = FB_VBLANK_HAVE_COUNT | FB_VBLANK_HAVE_VCOUNT |
FB_VBLANK_HAVE_VSYNC;
vblank.count = 0;
-- 
1.9.1
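
The fix above follows a common pattern: zero a stack structure in full before filling the fields you care about, so that reserved members and padding never carry stale stack contents to the caller. A minimal userspace sketch follows. The struct layout is modelled on the commit description (four fields followed by 16 reserved bytes), the names are hypothetical, and memcpy() merely stands in for a copy to user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: meaningful fields followed by reserved space. */
struct vblank_info {
	uint32_t flags;
	uint32_t count;
	uint32_t vcount;
	uint32_t hcount;
	uint32_t reserved[4];	/* 16 bytes the handler never sets by hand */
};

/* Builds the reply on the stack and copies it out in full. */
static void fill_vblank(struct vblank_info *out, uint32_t flags)
{
	struct vblank_info v;

	assert(out != NULL);

	/* Without this, reserved[] would hold whatever was on the stack. */
	memset(&v, 0, sizeof(v));

	v.flags = flags;
	v.count = 0;
	v.vcount = 0;
	v.hcount = 0;

	/* Stand-in for the copy to the caller; the whole struct is copied. */
	memcpy(out, &v, sizeof(v));
}
```

Because the whole structure is copied out, every byte of it has to be initialized, not only the named fields.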



Re: [PATCH v2] net: fec: use CONFIG_ARM instead of CONFIG_ARCH_MXC/SOC_IMX28

2016-01-25 Thread Arnd Bergmann
On Monday 25 January 2016 11:40:50 Johannes Berg wrote:
> As Arnd Bergmann points out, using CONFIG_ARCH_MXC and/or SOC_IMX28
> is wrong if some other ARM platform uses this device - the operation
> of the driver would depend on an unrelated ARM platform that might
> or might not be set for multi-platform kernels.
> 
> Prior to my previous patch, any other platforms using it would have
> been broken already due to having the cbd_datlen/cbd_sc fields in
> the wrong order, but byte ordering correctly, so no such platforms
> can exist and work today.
> 
> In any case, it seems likely that only Freescale SoCs use this part,
> and those are little-endian on ARM, so CONFIG_ARM is safe for them.
> 
> Signed-off-by: Johannes Berg 
> 

Thanks, looks good.

Reviewed-by: Arnd Bergmann 


Re: Supermicro AOC-STGN-i2S w intel 82599ES on Brocade ICX6610 - random link failures

2016-01-25 Thread zhuyj

https://www.mail-archive.com/netdev@vger.kernel.org/msg94109.html

Maybe this link can help you. If it works, please let me know.

Thanks a lot.
Zhu Yanjun

On 01/25/2016 06:08 PM, Nikola Ciprich wrote:

Hello netdev readers,

I'd like to ask about the following problem we're dealing with:

I have a cluster of three nodes connected to stacked Brocade ICX6610
switches using bonded AOC-STGN-i2S adapters (they're using 82599ES
chipsets).

The problem is, I see random link failures on practically all
interfaces. The link always goes down for a very short time, then the
adapter is reset and the link goes up again.

Here's dmesg snippet:

[Jan22 22:09] ixgbe :03:00.0 eth0: NIC Link is Down
[  +0.005610] ixgbe :03:00.0 eth0: initiating reset to clear Tx work after link loss
[  +0.012792] bond0: link status definitely down for interface eth0, disabling it
[  +1.105826] ixgbe :03:00.0 eth0: Reset adapter
[  +0.307518] ixgbe :03:00.0 eth0: detected SFP+: 3
[  +0.145881] ixgbe :03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX

Since I'm using bonding, it doesn't disrupt traffic, but I'd still like to
resolve it. We're using 5m passive SFP cables; we tried replacing one with a
3m piece, to no avail.

All three boxes are Supermicro X10DRW, running a vanilla x86_64 4.0.5 kernel
(I'll upgrade it to 4.1.16 soon).

We were using Broadcom adapters before and they were working without such
problems (except for one particular port, which showed mysterious packet
drops every few months; that's why we switched to Intel-based adapters), so
I think the cables and switches should be fine, but I'm not sure of course.

I think I've seen similar problems and they were PM related, but I'm not sure.

Has anyone seen a similar problem?

Or are there any tips on how I could debug it?

If I can provide more information, please let me know.

BR

nik





[PATCH] brcmfmac: sdio: Increase the default timeouts a bit

2016-01-25 Thread Sjoerd Simons
On a Radxa Rock2 board with an Ampak AP6335 (Broadcom 4339 core) it seems
the card responds very quickly most of the time; unfortunately, during
initialisation it sometimes seems to take just a bit over 2 seconds to
respond.

This results in initialization failing with messages like:
  brcmf_c_preinit_dcmds: Retreiving cur_etheraddr failed, -52
  brcmf_bus_start: failed: -52
  brcmf_sdio_firmware_callback: dongle is not responding

Increasing the timeout to allow for a bit more headroom allows the
card to initialize reliably.

A quick search online after diagnosing/fixing this showed that Google
has a similar patch in their ChromeOS tree, so this doesn't seem
specific to the board I'm using.

Signed-off-by: Sjoerd Simons 

---

 drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
index dd66143..75ac4bd 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
@@ -45,8 +45,8 @@
 #include "chip.h"
 #include "firmware.h"
 
-#define DCMD_RESP_TIMEOUT  msecs_to_jiffies(2000)
-#define CTL_DONE_TIMEOUT   msecs_to_jiffies(2000)
+#define DCMD_RESP_TIMEOUT  msecs_to_jiffies(2500)
+#define CTL_DONE_TIMEOUT   msecs_to_jiffies(2500)
 
 #ifdef DEBUG
 
-- 
2.7.0
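
To make the 2000 ms to 2500 ms change concrete, the sketch below converts milliseconds to timer ticks with round-up. It is a simplified model, not the kernel's msecs_to_jiffies() implementation: SKETCH_HZ is an assumed tick rate chosen for illustration (real kernels fix HZ at build time) and ms_to_ticks() is a hypothetical helper.

```c
#include <assert.h>
#include <limits.h>

/* Assumed tick rate for illustration only. */
#define SKETCH_HZ 250UL

/* Rounds up, so the resulting timeout is never shorter than requested. */
static unsigned long ms_to_ticks(unsigned long ms)
{
	/* Guard against overflow in the multiplication below. */
	assert(ms <= (ULONG_MAX - 999UL) / SKETCH_HZ);

	return (ms * SKETCH_HZ + 999UL) / 1000UL;
}
```

Under this assumed tick rate the old timeout is 500 ticks and the new one 625, which gives 125 extra ticks (half a second) of headroom for a card that sometimes answers just past the 2 second mark.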


