date:20160730

Re: [PATCH] net/mlx5_core/pagealloc: Remove deprecated create_singlethread_workqueue

2016-07-30 Thread Leon Romanovsky

On Fri, Jul 29, 2016 at 08:22:37AM -0400, Tejun Heo wrote:
> Hello,
> 
> On Thu, Jul 28, 2016 at 12:37:35PM +0300, Leon Romanovsky wrote:
> > Did you test this patch? Did you notice the memory reclaim path nature
> > of this work?
> 
> The conversion uses WQ_MEM_RECLAIM, which is standard for all
> workqueues which can stall packet processing if stalled.  The
> requirement comes from nfs or block devices over network.

The title stays "remove deprecated create_singlethread_workqueue" and if
we put aside the word "deprecated", the code moves from ordered
workqueue to unordered workqueue and changes max_active count which
isn't expressed in commit message.

For reclaim paths, I want to be extra caution and want to see
justification for doing that along with understanding if it is tested or
not.

Right now, I'm feeling that I'm participating in soapie where one sends
patch for every line, waits and sends the same patch for another file.
It is worth to send one patch set and let us to test it all in once.

Thanks.

> 
> Thanks.
> 
> -- 
> tejun
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

signature.asc
Description: Digital signature

Re: [PATCH] net: dsa: bcm_sf2: Unwind errors in correct order

2016-07-30 Thread David Miller

From: Florian Fainelli 
Date: Fri, 29 Jul 2016 12:35:57 -0700

> In case we cannot complete bcm_sf2_sw_setup() for any reason, and we
> go to the out_unmap label, but the MDIO bus has not been registered yet,
> we will hit the BUG condition in drivers/net/phy/mdio_bus.c about the
> bus not being registered. Fix this by dedicating a specific lable for
> when we fail after the MDIO bus has been successfully registered.
> 
> Fixes: 461cd1b03e32 ("net: dsa: bcm_sf2: Register our slave MDIO bus")
> Signed-off-by: Florian Fainelli 

Applied, thanks Florian.

Re: [PATCH] net: tulip: fix spelling mistake: "attemping" -> "attempting"

2016-07-30 Thread David Miller

From: Colin King 
Date: Sat, 30 Jul 2016 17:47:07 +0100

> From: Colin Ian King 
> 
> trivial fix to spelling mistake in printk message
> 
> Signed-off-by: Colin Ian King 

Applied, thanks.

Re: [PATCH net] sctp: change to use TCP_CLOSE_WAIT as SCTP_SS_CLOSING

2016-07-30 Thread David Miller

From: Marcelo Ricardo Leitner 
Date: Sat, 30 Jul 2016 10:25:35 -0300

> On Sat, Jul 30, 2016 at 08:00:45PM +0800, Xin Long wrote:
>> Prior to this patch, sctp defined TCP_CLOSING as SCTP_SS_CLOSING.
>> TCP_CLOSING is such a special sk state in TCP that inet common codes
>> even exclude it.
>> 
>> For instance, inet_accept thinks the accept sk's state never be
>> TCP_CLOSING, or it will give a WARN_ON. TCP works well with that
>> while SCTP may trigger the call trace, as CLOSING state in SCTP
>> has different meaning from TCP.
>> 
>> This fix is to change to use TCP_CLOSE_WAIT as SCTP_SS_CLOSING,
>> instead of TCP_CLOSING. Some side-effects could be expected,
>> regardless of not being used before. inet_accept will accept it
>> now.
>> 
>> I did all the func_tests in lksctp-tools and ran sctp codnomicon
>> fuzzer tests against this patch, no regression or failure found.
>> 
>> Signed-off-by: Xin Long 
> 
> I don't think this is -net material. It's a one line change but a core
> one.
> Dave please consider it for net-next instead.
> Though, Xin you may need to re-post later..
> 
> Acked-by: Marcelo Ricardo Leitner 

But, the commit log message says that inet_accept() will generate
a WARN_ON() call trace without this change.  That makes it sound
like it's 'net' material to me.

Re: [PATCH net] sctp: allow receiving msg when TCP-style sk is in CLOSED state

2016-07-30 Thread David Miller

From: Xin Long 
Date: Sat, 30 Jul 2016 14:14:41 +0800

> Commit 141ddefce7c8 ("sctp: change sk state to CLOSED instead of
> CLOSING in sctp_sock_migrate") changed sk state to CLOSED if the
> assoc is closed when sctp_accept clones a new sk.
> 
> If there is still data in sk receive queue, users will not be able
> to read it any more, as sctp_recvmsg returns directly if sk state
> is CLOSED.
> 
> This patch is to add CLOSED state check in sctp_recvmsg to allow
> reading data from TCP-style sk with CLOSED state as what TCP does.
> 
> Signed-off-by: Xin Long 

Applied.

Re: [PATCH net] sctp: allow delivering notifications after receiving SHUTDOWN

2016-07-30 Thread David Miller

From: Xin Long 
Date: Sat, 30 Jul 2016 14:09:09 +0800

> Prior to this patch, once sctp received SHUTDOWN or shutdown with RD,
> sk->sk_shutdown would be set with RCV_SHUTDOWN, and all events would
> be dropped in sctp_ulpq_tail_event(). It would cause:
> 
> 1. some notifications couldn't be received by users. like
>SCTP_SHUTDOWN_COMP generated by sctp_sf_do_4_C().
> 
> 2. sctp would also never trigger sk_data_ready when the association
>was closed, making it harder to identify the end of the association
>by calling recvmsg() and getting an EOF. It was not convenient for
>kernel users.
> 
> The check here should be stopping delivering DATA chunks after receiving
> SHUTDOWN, and stopping delivering ANY chunks after sctp_close().
> 
> So this patch is to allow notifications to enqueue into receive queue
> even if sk->sk_shutdown is set to RCV_SHUTDOWN in sctp_ulpq_tail_event,
> but if sk->sk_shutdown == RCV_SHUTDOWN | SEND_SHUTDOWN, it drops all
> events.
> 
> Signed-off-by: Xin Long 

Applied.

Re: [PATCH net] sctp: fix the issue sctp requeue auth chunk incorrectly

2016-07-30 Thread David Miller

From: Xin Long 
Date: Sat, 30 Jul 2016 13:58:35 +0800

> sctp needs to queue auth chunk back when we know that we are going
> to generate another segment. But commit f1533cce60d1 ("sctp: fix
> panic when sending auth chunks") requeues the last chunk processed
> which is probably not the auth chunk.
> 
> It causes panic when calculating the MAC in sctp_auth_calculate_hmac(),
> as the incorrect offset of the auth chunk in skb->data.
> 
> This fix is to requeue it by using packet->auth.
> 
> Fixes: f1533cce60d1 ("sctp: fix panic when sending auth chunks")
> Signed-off-by: Xin Long 

Applied.

[PATCH v2] Bluetooth: Add LED triggers for HCI frames tx and rx

2016-07-30 Thread Guodong Xu

Two LED triggers are added into hci_dev: tx_led and rx_led. Upon ACL/SCO
packets available in tx or rx, the LEDs will blink.

For each hci registration, two triggers are added into LED subsystem:
[hdev->name]-tx and [hdev-name]-rx.
Refer to Documentation/leds/leds-class.txt for usage.

Verified on HiKey 96boards, which uses HiSilicon hi6220 SoC and TI
WL1835 WiFi/BT combo chip.

Signed-off-by: Guodong Xu 
---
 include/net/bluetooth/hci_core.h |  1 +
 net/bluetooth/hci_core.c |  6 ++
 net/bluetooth/leds.c | 17 +
 net/bluetooth/leds.h |  2 ++
 4 files changed, 26 insertions(+)

diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index dc71473..37b8dd9 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -398,6 +398,7 @@ struct hci_dev {
bdaddr_trpa;
 
struct led_trigger  *power_led;
+   struct led_trigger  *tx_led, *rx_led;
 
int (*open)(struct hci_dev *hdev);
int (*close)(struct hci_dev *hdev);
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 45a9fc6..956dce1 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -3833,6 +3833,7 @@ static void hci_sched_acl(struct hci_dev *hdev)
if (!hci_conn_num(hdev, AMP_LINK) && hdev->dev_type == HCI_AMP)
return;
 
+   hci_leds_blink_oneshot(hdev->tx_led);
switch (hdev->flow_ctl_mode) {
case HCI_FLOW_CTL_MODE_PACKET_BASED:
hci_sched_acl_pkt(hdev);
@@ -3856,6 +3857,7 @@ static void hci_sched_sco(struct hci_dev *hdev)
if (!hci_conn_num(hdev, SCO_LINK))
return;
 
+   hci_leds_blink_oneshot(hdev->tx_led);
while (hdev->sco_cnt && (conn = hci_low_sent(hdev, SCO_LINK, "e))) {
while (quote-- && (skb = skb_dequeue(&conn->data_q))) {
BT_DBG("skb %p len %d", skb, skb->len);
@@ -3879,6 +3881,7 @@ static void hci_sched_esco(struct hci_dev *hdev)
if (!hci_conn_num(hdev, ESCO_LINK))
return;
 
+   hci_leds_blink_oneshot(hdev->tx_led);
while (hdev->sco_cnt && (conn = hci_low_sent(hdev, ESCO_LINK,
 "e))) {
while (quote-- && (skb = skb_dequeue(&conn->data_q))) {
@@ -3911,6 +3914,7 @@ static void hci_sched_le(struct hci_dev *hdev)
hci_link_tx_to(hdev, LE_LINK);
}
 
+   hci_leds_blink_oneshot(hdev->tx_led);
cnt = hdev->le_pkts ? hdev->le_cnt : hdev->acl_cnt;
tmp = cnt;
while (cnt && (chan = hci_chan_sent(hdev, LE_LINK, "e))) {
@@ -3990,6 +3994,7 @@ static void hci_acldata_packet(struct hci_dev *hdev, 
struct sk_buff *skb)
 
if (conn) {
hci_conn_enter_active_mode(conn, BT_POWER_FORCE_ACTIVE_OFF);
+   hci_leds_blink_oneshot(hdev->rx_led);
 
/* Send to upper protocol */
l2cap_recv_acldata(conn, skb, flags);
@@ -4022,6 +4027,7 @@ static void hci_scodata_packet(struct hci_dev *hdev, 
struct sk_buff *skb)
hci_dev_unlock(hdev);
 
if (conn) {
+   hci_leds_blink_oneshot(hdev->rx_led);
/* Send to upper protocol */
sco_recv_scodata(conn, skb);
return;
diff --git a/net/bluetooth/leds.c b/net/bluetooth/leds.c
index 8319c84..ae10c5d 100644
--- a/net/bluetooth/leds.c
+++ b/net/bluetooth/leds.c
@@ -19,6 +19,8 @@ struct hci_basic_led_trigger {
 #define to_hci_basic_led_trigger(arg) container_of(arg, \
struct hci_basic_led_trigger, led_trigger)
 
+#define BLUETOOTH_BLINK_DELAY  50 /* ms */
+
 void hci_leds_update_powered(struct hci_dev *hdev, bool enabled)
 {
if (hdev->power_led)
@@ -37,6 +39,17 @@ static void power_activate(struct led_classdev *led_cdev)
led_trigger_event(led_cdev->trigger, powered ? LED_FULL : LED_OFF);
 }
 
+void hci_leds_blink_oneshot(struct led_trigger *trig)
+{
+   unsigned long led_delay = BLUETOOTH_BLINK_DELAY;
+
+   if (!trig)
+   return;
+
+   BT_DBG("led_trig %p", trig);
+   led_trigger_blink_oneshot(trig, &led_delay, &led_delay, 0);
+}
+
 static struct led_trigger *led_allocate_basic(struct hci_dev *hdev,
void (*activate)(struct led_classdev *led_cdev),
const char *name)
@@ -71,4 +84,8 @@ void hci_leds_init(struct hci_dev *hdev)
 {
/* initialize power_led */
hdev->power_led = led_allocate_basic(hdev, power_activate, "power");
+   /* initialize tx_led */
+   hdev->tx_led = led_allocate_basic(hdev, NULL, "tx");
+   /* initialize rx_led */
+   hdev->rx_led = led_allocate_basic(hdev, NULL, "rx");
 }
diff --git a/net/bluetooth/leds.h b/net/bluetooth/leds.h
index a9c4d6e..9b1cccd 100644
--- a/net/bluetooth/leds.h
+++ b/net/bluetooth/leds.h
@@ -9,8 +9,10 @@
 #if IS_ENABLED(CONFIG_BT_LEDS)
 void h

Re: [PATCH v2 net] tcp: consider recv buf for the initial window scale

2016-07-30 Thread David Miller

From: Soheil Hassas Yeganeh 
Date: Fri, 29 Jul 2016 09:34:02 -0400

> From: Soheil Hassas Yeganeh 
> 
> tcp_select_initial_window() intends to advertise a window
> scaling for the maximum possible window size. To do so,
> it considers the maximum of net.ipv4.tcp_rmem[2] and
> net.core.rmem_max as the only possible upper-bounds.
> However, users with CAP_NET_ADMIN can use SO_RCVBUFFORCE
> to set the socket's receive buffer size to values
> larger than net.ipv4.tcp_rmem[2] and net.core.rmem_max.
> Thus, SO_RCVBUFFORCE is effectively ignored by
> tcp_select_initial_window().
> 
> To fix this, consider the maximum of net.ipv4.tcp_rmem[2],
> net.core.rmem_max and socket's initial buffer space.
> 
> Fixes: b0573dea1fb3 ("[NET]: Introduce SO_{SND,RCV}BUFFORCE socket options")
> Signed-off-by: Soheil Hassas Yeganeh 
> Suggested-by: Neal Cardwell 

Applied, thanks.

Re: [PATCH net 0/3] macsec: reference counting fixes

2016-07-30 Thread David Miller

From: Sabrina Dubroca 
Date: Fri, 29 Jul 2016 15:37:52 +0200

> Patch 1 adds explicit reference counting on RXSCs, instead of the
> current implicit reference counting using the RXSA's refcount.
> 
> Patch 2 fixes possible kernel panics during module unload caused by an
> RCU callback that schedules another RCU callback, which the
> rcu_barrier() added in b196c22af5c3 ("macsec: add rcu_barrier() on
> module exit") didn't protect against.
> 
> Patch 3 fixes a refcounting issue with the underlying device for a
> macsec device when link creation fails.

Series applied, thanks Sabrina.

Re: [PATCH v2 0/4] drivers: net: cpsw: fix driver loading/unloading

2016-07-30 Thread David Miller

From: Grygorii Strashko 
Date: Thu, 28 Jul 2016 20:50:33 +0300

> This series fixes set of isssues observed when CPSW driver module is 
> unloaded/loaded:
> 1) rmmod: deadlock in cpdma_ctlr_destroy
> 2) rmmod: L3 back-trace and crash if all net interfaces are down, because CPSW
> can be powerred down by PM runtime in this case.
> 3) insmod: mdio device is not recreated on next insmod
>  - need to use of_platform_depopulate() in cpsw_remove().
> 4) rmmod: system crash on omap_device removal
> 
> Tested on: am437x-idk, am57xx-beagle-x15
> 
> Changes in v2:
> - build warning fixed
> - added fix for correct omap_device removal
> 
> Link on v1:
>  https://lkml.org/lkml/2016/7/22/240

Series applied, thanks.

Re: [PATCH -next] net: ipv6: use list_move instead of list_del/list_add

2016-07-30 Thread David Miller

From: Wei Yongjun 
Date: Thu, 28 Jul 2016 16:19:03 +

> Using list_move() instead of list_del() + list_add().
> 
> Signed-off-by: Wei Yongjun 

Applied.

Re: [PATCH net] cxgb4/cxgb4vf: Fixes regression in perf when tx vlan offload is disabled

2016-07-30 Thread David Miller

From: Hariprasad Shenai 
Date: Thu, 28 Jul 2016 13:28:57 +0530

> The commit 637d3e997351 ("cxgb4: Discard the packet if the length is
> greater than mtu") introduced a regression in the VLAN interface
> performance when Tx VLAN offload is disabled.
> 
> Check if skb is tagged, regardless of whether it is hardware accelerated
> or not. Presently we were checking only for hardware acclereated one,
> which caused performance to drop to ~0.17Mbps on a 10GbE adapter for
> VLAN interface, when tx vlan offload is turned off using ethtool.
> The ethernet head length calculation was going wrong in this case, and
> driver ended up dropping packets.
> 
> Fixes: 637d3e997351 ("cxgb4: Discard the packet if the length is greater than 
> mtu")
> 
> Signed-off-by: Hariprasad Shenai 

Applied.

Re: [PATCH -next] drivers: net: phy: xgene: Remove redundant dev_err call in xgene_mdio_probe()

2016-07-30 Thread David Miller

From: Wei Yongjun 
Date: Thu, 28 Jul 2016 02:12:17 +

> There is a error message within devm_ioremap_resource
> already, so remove the dev_err call to avoid redundant
> error message.
> 
> Signed-off-by: Wei Yongjun 

Applied.

Re: [PATCH -next] tipc: fix imbalance read_unlock_bh in __tipc_nl_add_monitor()

2016-07-30 Thread David Miller

From: Wei Yongjun 
Date: Thu, 28 Jul 2016 02:07:49 +

> In the error handling case of nla_nest_start() failed read_unlock_bh()
> is called  to unlock a lock that had not been taken yet. sparse warns
> about the context imbalance as the following:
> 
> net/tipc/monitor.c:799:23: warning:
>  context imbalance in '__tipc_nl_add_monitor' - different lock contexts for 
> basic block
> 
> Fixes: cf6f7e1d5109 ('tipc: dump monitor attributes')
> Signed-off-by: Wei Yongjun 

Applied.

Re: [PATCH net] 8139too:fix system hang when there is a tx timeout event.

2016-07-30 Thread David Miller

From: Chunhao Lin 
Date: Thu, 28 Jul 2016 02:39:57 +0800

> If tx timeout event occur, kernel will call rtl8139_tx_timeout_task() to reset
> hardware. But in this function, driver does not stop tx and rx function before
> reset hardware, that will cause system hang.
> 
> In this patch, add stop tx and rx function before reset hardware.
> 
> Signed-off-by: Chunhao Lin 

First, please always place a space after the subsystem prefix in your Subject
lines "8139too: ".

> @@ -1667,6 +1667,10 @@ static void rtl8139_tx_timeout_task (struct 
> work_struct *work)
>   int i;
>   u8 tmp8;
>  
> + napi_disable(&tp->napi);
> + netif_stop_queue(dev);
> + synchronize_sched();
> +
>   netdev_dbg(dev, "Transmit timeout, status %02x %04x %04x media %02x\n",
>  RTL_R8(ChipCmd), RTL_R16(IntrStatus),
>  RTL_R16(IntrMask), RTL_R8(MediaStatus));
> @@ -1696,10 +1700,10 @@ static void rtl8139_tx_timeout_task (struct 
> work_struct *work)
>   spin_unlock_irq(&tp->lock);
>  
>   /* ...and finally, reset everything */
> - if (netif_running(dev)) {
> - rtl8139_hw_start (dev);
> - netif_wake_queue (dev);
> - }
> + napi_enable(&tp->napi);
> + rtl8139_hw_start (dev);
> + netif_wake_queue (dev);
> +

Get rid of these spaces in these function calls before the openning parenthesis.

Re: [PATCH net 0/6] qed*: Small fixes series

2016-07-30 Thread David Miller

From: Yuval Mintz 
Date: Wed, 27 Jul 2016 14:45:18 +0300

> This contains several small [and straight-forward] fixes to qed*
> drivers.
> 
> Please consider applying this to `net'.

Series applied, thanks.

[PATCH] net: thunderx: correct bound check in nic_config_loopback

2016-07-30 Thread Levin, Alexander

Off by one in nic_config_loopback would access an invalid arrat variable when
vf id == MAX_LMAC.

Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/cavium/thunder/nic_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c 
b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 16ed203..a70f50d 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -615,7 +615,7 @@ static int nic_config_loopback(struct nicpf *nic, struct 
set_loopback *lbk)
 {
int bgx_idx, lmac_idx;
 
-   if (lbk->vf_id > MAX_LMAC)
+   if (lbk->vf_id >= MAX_LMAC)
return -1;
 
bgx_idx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[lbk->vf_id]);
-- 
2.7.4

Re: [PATCH] qed: Add and use specific logging functions to reduce object size

2016-07-30 Thread kbuild test robot

Hi,

[auto build test ERROR on net-next/master]
[also build test ERROR on v4.7 next-20160729]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Joe-Perches/qed-Add-and-use-specific-logging-functions-to-reduce-object-size/20160727-063300
config: i386-randconfig-sb0-07310616 (attached as .config)
compiler: gcc-5 (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

>> ERROR: "qed_err" [drivers/net/ethernet/qlogic/qede/qede.ko] undefined!
>> ERROR: "qed_verbose" [drivers/net/ethernet/qlogic/qede/qede.ko] undefined!
>> ERROR: "qed_info" [drivers/net/ethernet/qlogic/qede/qede.ko] undefined!
>> ERROR: "qed_notice" [drivers/net/ethernet/qlogic/qede/qede.ko] undefined!

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: Binary data

Re: [PATCH v5] net: sched: convert qdisc linked list to hashtable

2016-07-30 Thread kbuild test robot

Hi,

[auto build test ERROR on v4.7-rc7]
[also build test ERROR on next-20160729]
[cannot apply to net/master net-next/master ipsec-next/master]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Jiri-Kosina/net-sched-convert-qdisc-linked-list-to-hashtable/20160729-155412
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
wget 
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
 -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm 

All errors (new ones prefixed by >>):

>> drivers/net/ethernet/ti/davinci_emac.c:736:57: error: macro "hash_add" 
>> requires 3 arguments, but only 2 given
static int hash_add(struct emac_priv *priv, u8 *mac_addr)
^
>> drivers/net/ethernet/ti/davinci_emac.c:737:1: error: expected '=', ',', ';', 
>> 'asm' or '__attribute__' before '{' token
{
^
>> drivers/net/ethernet/ti/davinci_emac.c:778:12: error: conflicting types for 
>> 'hash_del'
static int hash_del(struct emac_priv *priv, u8 *mac_addr)
   ^
   In file included from include/linux/netdevice.h:55:0,
from drivers/net/ethernet/ti/davinci_emac.c:44:
   include/linux/hashtable.h:104:20: note: previous definition of 'hash_del' 
was here
static inline void hash_del(struct hlist_node *node)
   ^
   drivers/net/ethernet/ti/davinci_emac.c: In function 'emac_add_mcast':
   drivers/net/ethernet/ti/davinci_emac.c:828:35: error: macro "hash_add" 
requires 3 arguments, but only 2 given
  update = hash_add(priv, mac_addr);
  ^
>> drivers/net/ethernet/ti/davinci_emac.c:828:12: error: 'hash_add' undeclared 
>> (first use in this function)
  update = hash_add(priv, mac_addr);
   ^
   drivers/net/ethernet/ti/davinci_emac.c:828:12: note: each undeclared 
identifier is reported only once for each function it appears in

vim +/hash_add +736 drivers/net/ethernet/ti/davinci_emac.c

a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  730   
* @priv: The DaVinci EMAC private adapter structure
49ce9c2c drivers/net/ethernet/ti/davinci_emac.c Ben Hutchings 2012-07-10  731   
* @mac_addr: mac address to delete from hash table
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  732   
*
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  733   
* Adds mac address to the internal hash table
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  734   
*
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  735   
*/
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18 @736  
static int hash_add(struct emac_priv *priv, u8 *mac_addr)
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18 @737  {
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  738   
struct device *emac_dev = &priv->ndev->dev;
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  739   
u32 rc = 0;
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  740   
u32 hash_bit;
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  741   
u32 hash_value = hash_get(mac_addr);
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  742  
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  743   
if (hash_value >= EMAC_NUM_MULTICAST_BITS) {
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  744   
if (netif_msg_drv(priv)) {
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  745   
dev_err(emac_dev, "DaVinci EMAC: hash_add(): Invalid "\
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  746   
"Hash %08x, should not be greater than %08x",
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  747   
hash_value, (EMAC_NUM_MULTICAST_BITS - 1));
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  748   
}
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  749   
return -1;
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  750   
}
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  751  
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  752   
/* set the hash bit only if not previously set */
a6286ee6 drivers/net/davinci_emac.c Anant Gole2009-05-18  753   
if (priv->multicast_hash_cnt[hash_value] == 0) {
a6286ee6 drivers/net/davinci_emac.c Anant Gole

Re: [PATCH 1/3] netfilter: nat: update hash bucket if nat changed after ct confirmed

2016-07-30 Thread Florian Westphal

fxp2001640...@gmail.com  wrote:
> From: Xiaoping Fan 
> 
> In some situations, NAT information is created after connection is
> confirmed.

That sounds like a bug.

How can this happen?

nf_nat_setup_info() is only safe for non-confirmed conntracks
(not in hash table).

Re: [PATCH] Networking: Core: netpoll: Fixed a missing spin_unlock

2016-07-30 Thread Francois Romieu

Salil Kapur  :
[...]
> diff --git a/net/core/netpoll.c b/net/core/netpoll.c
> index fc75c9e..9124f76 100644
> --- a/net/core/netpoll.c
> +++ b/net/core/netpoll.c
> @@ -386,8 +386,10 @@ void netpoll_send_skb_on_dev(struct netpoll *np, struct 
> sk_buff *skb,
>   
> !vlan_hw_offload_capable(netif_skb_features(skb),
>
> skb->vlan_proto)) {
>   skb = __vlan_put_tag(skb, 
> skb->vlan_proto, vlan_tx_tag_get(skb));
> - if (unlikely(!skb))
> + if (unlikely(!skb)) {
> + __netif_tx_unlock(txq);
>   break;
> + }
>   skb->vlan_tci = 0;
>   }
>  

Your kernel is outdated: __vlan_put_tag has disappeared from net/core/netpoll.c
since 62749e2cb3c4a7da3eaa5c01a7e787aebeff8536 ("vlan: rename __vlan_put_tag to
vlan_insert_tag_set_proto") by Jiri Pirko somewhere in 2014.

-- 
Ueimor

[PATCH 3/3] netfilter: nat: don't assign a null snat rule to bridged traffic if no matching

2016-07-30 Thread fxp2001640163

From: Xiaoping Fan 

In some case, bridged packet will come back again for routing. When bridge
netfilter is enabled, a null snat rule is assigned to bridged packet if no
matching in nat chain. Then nat rule matching is skipped when packet comes
back for routing. This result in private IP address exported to public
network. So we don't assign a null snat rule to bridged traffic if no
matching.

Signed-off-by: Xiaoping Fan 
---
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c | 3 +++
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c 
b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
index 41c7992..151eee6 100644
--- a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
@@ -315,6 +315,9 @@ nf_nat_ipv4_fn(void *priv, struct sk_buff *skb,
break;
}
 
+   if (nf_nat_is_bridged_pkt(skb))
+   break;
+
ret = nf_nat_alloc_null_binding(ct, state->hook);
if (ret != NF_ACCEPT)
return ret;
diff --git a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c 
b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
index dc8df3a..c94eae4 100644
--- a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
@@ -324,6 +324,9 @@ nf_nat_ipv6_fn(void *priv, struct sk_buff *skb,
break;
}
 
+   if (nf_nat_is_bridged_pkt(skb))
+   break;
+
ret = nf_nat_alloc_null_binding(ct, state->hook);
if (ret != NF_ACCEPT)
return ret;
-- 
1.9.1

[PATCH 2/3] netfilter: nat: snat created in route process just apply to routed traffic

2016-07-30 Thread fxp2001640163

From: Xiaoping Fan 

In some situations, packet goes through Linux twice, one for bridging,
another for routing. If snat is created in bridging process, that means
snat rule only matches bridged traffic. If snat is created in routing
process, that means snat rule only matches routed traffic. If we apply
snat to both bridged and routed traffic, traffic will be translated
unexpectedly. So we limit snat created in bridge process to bridged
traffic, snat created in route process to routed traffic.

Signed-off-by: Xiaoping Fan 
---
 include/net/netfilter/nf_nat.h   |  9 +
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c | 15 ++-
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c | 15 ++-
 3 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/include/net/netfilter/nf_nat.h b/include/net/netfilter/nf_nat.h
index c327a43..ef6e06c 100644
--- a/include/net/netfilter/nf_nat.h
+++ b/include/net/netfilter/nf_nat.h
@@ -18,12 +18,21 @@ enum nf_nat_manip_type {
 #include 
 #include 
 
+/* Check if packet is a bridged packet when do SNAT */
+#if defined(CONFIG_BRIDGE_NETFILTER)
+#define nf_nat_is_bridged_pkt(SKB) ((SKB)->nf_bridge && \
+   ((SKB)->nf_bridge->physoutdev != NULL))
+#else
+#define nf_nat_is_bridged_pkt(SKB) 0
+#endif
+
 /* per conntrack: nat application helper private data */
 union nf_conntrack_nat_help {
/* insert nat helper private data here */
 #if defined(CONFIG_NF_NAT_PPTP) || defined(CONFIG_NF_NAT_PPTP_MODULE)
struct nf_nat_pptp nat_pptp_info;
 #endif
+   bool snat_in_bridge;
 };
 
 struct nf_conn;
diff --git a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c 
b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
index f8aad03..41c7992 100644
--- a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -277,6 +278,12 @@ nf_nat_ipv4_fn(void *priv, struct sk_buff *skb,
if (nat == NULL)
return NF_ACCEPT;
 
+   if ((maniptype == NF_NAT_MANIP_SRC) &&
+   nf_nat_initialized(ct, maniptype) &&
+   (nat->help.snat_in_bridge != nf_nat_is_bridged_pkt(skb))) {
+   return NF_ACCEPT;
+   }
+
switch (ctinfo) {
case IP_CT_RELATED:
case IP_CT_RELATED_REPLY:
@@ -299,8 +306,14 @@ nf_nat_ipv4_fn(void *priv, struct sk_buff *skb,
if (ret != NF_ACCEPT)
return ret;
 
-   if (nf_nat_initialized(ct, HOOK2MANIP(state->hook)))
+   if (nf_nat_initialized(ct, HOOK2MANIP(state->hook))) {
+   if (maniptype == NF_NAT_MANIP_SRC) {
+   nfct_nat(ct)->help.snat_in_bridge =
+   nf_nat_is_bridged_pkt(skb);
+   }
+
break;
+   }
 
ret = nf_nat_alloc_null_binding(ct, state->hook);
if (ret != NF_ACCEPT)
diff --git a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c 
b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
index e0be97e..dc8df3a 100644
--- a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -281,6 +282,12 @@ nf_nat_ipv6_fn(void *priv, struct sk_buff *skb,
if (nat == NULL)
return NF_ACCEPT;
 
+   if ((maniptype == NF_NAT_MANIP_SRC) &&
+   nf_nat_initialized(ct, maniptype) &&
+   (nat->help.snat_in_bridge != nf_nat_is_bridged_pkt(skb))) {
+   return NF_ACCEPT;
+   }
+
switch (ctinfo) {
case IP_CT_RELATED:
case IP_CT_RELATED_REPLY:
@@ -308,8 +315,14 @@ nf_nat_ipv6_fn(void *priv, struct sk_buff *skb,
if (ret != NF_ACCEPT)
return ret;
 
-   if (nf_nat_initialized(ct, HOOK2MANIP(state->hook)))
+   if (nf_nat_initialized(ct, HOOK2MANIP(state->hook))) {
+   if (maniptype == NF_NAT_MANIP_SRC) {
+   nfct_nat(ct)->help.snat_in_bridge =
+   nf_nat_is_bridged_pkt(skb);
+   }
+
break;
+   }
 
ret = nf_nat_alloc_null_binding(ct, state->hook);
if (ret != NF_ACCEPT)
-- 
1.9.1

[PATCH 1/3] netfilter: nat: update hash bucket if nat changed after ct confirmed

2016-07-30 Thread fxp2001640163

From: Xiaoping Fan 

In some situations, NAT information is created after connection is
confirmed. Since 5 tuple for reply direction is changed when creating
NAT information, so we need to update hash bucket of connection.

Signed-off-by: Xiaoping Fan 
---
 include/net/netfilter/nf_conntrack.h |  5 
 net/netfilter/nf_conntrack_core.c| 51 ++--
 net/netfilter/nf_nat_core.c  |  9 +++
 3 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack.h 
b/include/net/netfilter/nf_conntrack.h
index 445b019..cc9ba66 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -191,6 +191,9 @@ void *nf_ct_alloc_hashtable(unsigned int *sizep, int nulls);
 void nf_ct_free_hashtable(void *hash, unsigned int size);
 
 int nf_conntrack_hash_check_insert(struct nf_conn *ct);
+void nf_conntrack_ct_hash_bucket_update(struct nf_conn *ct,
+   unsigned int old_hash,
+   unsigned int old_reply_hash);
 bool nf_ct_delete(struct nf_conn *ct, u32 pid, int report);
 
 bool nf_ct_get_tuplepr(const struct sk_buff *skb, unsigned int nhoff,
@@ -305,6 +308,8 @@ int nf_conntrack_set_hashsize(const char *val, struct 
kernel_param *kp);
 int nf_conntrack_hash_resize(unsigned int hashsize);
 extern unsigned int nf_conntrack_htable_size;
 extern unsigned int nf_conntrack_max;
+u_int32_t hash_conntrack(const struct net *net,
+const struct nf_conntrack_tuple *tuple);
 
 struct nf_conn *nf_ct_tmpl_alloc(struct net *net,
 const struct nf_conntrack_zone *zone,
diff --git a/net/netfilter/nf_conntrack_core.c 
b/net/netfilter/nf_conntrack_core.c
index dd2c43a..d4ee145 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -202,11 +202,12 @@ static u32 __hash_conntrack(const struct net *net,
return reciprocal_scale(hash_conntrack_raw(tuple, net), size);
 }
 
-static u32 hash_conntrack(const struct net *net,
- const struct nf_conntrack_tuple *tuple)
+u32 hash_conntrack(const struct net *net,
+  const struct nf_conntrack_tuple *tuple)
 {
return scale_hash(hash_conntrack_raw(tuple, net));
 }
+EXPORT_SYMBOL(hash_conntrack);
 
 bool
 nf_ct_get_tuple(const struct sk_buff *skb,
@@ -636,6 +637,52 @@ out:
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_hash_check_insert);
 
+/* Sometimes reply tuple of ct is changed by nat after ct is confirmed,
+ * hash bucket of ct has to be updated in this situation.
+ */
+void nf_conntrack_ct_hash_bucket_update(struct nf_conn *ct,
+   unsigned int old_hash,
+   unsigned int old_reply_hash)
+{
+   struct net *net;
+   unsigned int hash, reply_hash;
+   unsigned int sequence;
+
+   if (!ct || nf_ct_is_untracked(ct) || !nf_ct_is_confirmed(ct))
+   return;
+
+   net = nf_ct_net(ct);
+
+   local_bh_disable();
+   do {
+   sequence = read_seqcount_begin(&nf_conntrack_generation);
+   } while (nf_conntrack_double_lock(net, old_hash, old_reply_hash, 
sequence));
+
+   /* Remove from confirmed list */
+   hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
+   hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_REPLY].hnnode);
+
+   nf_conntrack_double_unlock(old_hash, old_reply_hash);
+
+   /* Make changes visible in other cores */
+   smp_wmb();
+
+   do {
+   sequence = read_seqcount_begin(&nf_conntrack_generation);
+   hash = hash_conntrack(net,
+ &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
+   reply_hash = hash_conntrack(net,
+   
&ct->tuplehash[IP_CT_DIR_REPLY].tuple);
+   } while (nf_conntrack_double_lock(net, hash, reply_hash, sequence));
+
+   /* Insert to confirmed list again */
+   __nf_conntrack_hash_insert(ct, hash, reply_hash);
+
+   nf_conntrack_double_unlock(hash, reply_hash);
+   local_bh_enable();
+}
+EXPORT_SYMBOL_GPL(nf_conntrack_ct_hash_bucket_update);
+
 static inline void nf_ct_acct_update(struct nf_conn *ct,
 enum ip_conntrack_info ctinfo,
 unsigned int len)
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index de31818..612d8d57 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -405,8 +405,10 @@ nf_nat_setup_info(struct nf_conn *ct,
  const struct nf_nat_range *range,
  enum nf_nat_manip_type maniptype)
 {
+   struct net *net = nf_ct_net(ct);
struct nf_conntrack_tuple curr_tuple, new_tuple;
struct nf_conn_nat *nat;
+   unsigned int old_hash, old_reply_hash;
 
/* nat helper or nfctnetlink also setup binding */

Re: [PATCH 2/3] sctp_diag: export timer value only if it is active

2016-07-30 Thread Phil Sutter

On Sat, Jul 30, 2016 at 10:33:48AM -0300, Marcelo Ricardo Leitner wrote:
> 
> 
> Em 30-07-2016 10:25, Xin Long escreveu:
> >>> diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c
> >>> index f69edcf219e51..0ad6033a7330c 100644
> >>> --- a/net/sctp/sctp_diag.c
> >>> +++ b/net/sctp/sctp_diag.c
> >>> @@ -40,10 +40,12 @@ static void inet_diag_msg_sctpasoc_fill(struct 
> >>> inet_diag_msg *r,
> >>>   }
> >>>
> >>>   r->idiag_state = asoc->state;
> >>> - r->idiag_timer = SCTP_EVENT_TIMEOUT_T3_RTX;
> >>> - r->idiag_retrans = asoc->rtx_data_chunks;
> >>> - r->idiag_expires = jiffies_to_msecs(
> >>> - asoc->timeouts[SCTP_EVENT_TIMEOUT_T3_RTX] - jiffies);
> >>
> >> I think we have two issues here, prior to your patch, but I noticed
> >> while reviewing it :-)
> >>
> >> This array is actually not based on jiffies but on intervals instead, as
> >> per:
> >>
> >> sm_sideeffect.c:
> >> case SCTP_CMD_TIMER_START:   [1]
> >> timer = &asoc->timers[cmd->obj.to];
> >> timeout = asoc->timeouts[cmd->obj.to];   <---
> >> BUG_ON(!timeout);
> >>
> >> timer->expires = jiffies + timeout;  <---
> > understood.
> >
> >>
> >> But more importantly, this array is actually not used for this timeout
> >> and the timeout is sctp_transport dependant, as per:
> >>
> >> /* Schedule retransmission on the given transport */
> >> void sctp_transport_immediate_rtx(struct sctp_transport *t)
> >> {
> >> /* Stop pending T3_rtx_timer */
> >> if (del_timer(&t->T3_rtx_timer))
> >> sctp_transport_put(t);
> >>
> >> sctp_retransmit(&t->asoc->outqueue, t, SCTP_RTXR_T3_RTX);
> >> if (!timer_pending(&t->T3_rtx_timer)) {
> >> if (!mod_timer(&t->T3_rtx_timer, jiffies + t->rto))
> >>  
> >> sctp_transport_hold(t);
> >>
> >> Note how on sctp_get_sctp_info() it fetches the RTO (which is T3_RTX)
> >> this way:
> >> info->sctpi_p_rto = jiffies_to_msecs(prim->rto);
> >> If we want to know how long is left for the timer to expire, we have to
> >> read directly from it.
> > you are right, 3 timers (T3_tx, hb, rtx_data_chunks) are per transport.
> >
> >>
> >> With git grep -A 1 TIMER_START we can confirm that [1] is never hit for
> >> SCTP_EVENT_TIMEOUT_T3_RTX. Yet, the asoc is allocated with kzalloc(), so
> >> I guess you were just reading -jiffies in there.
> >>
> >> Note however that the stats rtx_data_chunks is the accumulated stats,
> >> it's good, and that we may have multiple T3 timers running at once, with
> >> different timeouts.
> >>
> >> Xin, ideas on how we can fix this? I'm not sure if we can dump
> >> per-transport info in there. Not as it is now, I guess.
> > It's not that easy to dump all transports info besed on current sctp_diag 
> > codes.
> 
> Okay
> 
> >
> > Now for the transport's info,  we only choose primary_path to dump.
> 
> Okay
> 
> > It means we should fix this by getting the left time to expire from
> > primary transport t->T3_rtx_timer. like:
> >
> > r->idiag_expires = jiffies_to_msecs(
> > -   asoc->timeouts[SCTP_EVENT_TIMEOUT_T3_RTX] - jiffies);
> > +   asoc->peer.primary_path->T3_rtx_timer.expires - jiffies);
> >
> > but yes, need to check with timer_pending firstly.
> 
> Yes :)
> 
> >
> > what do  you think ?
> >
> 
> Makes sense, LGTM.
> 
> Phil, not sure how you want to proceed here. Wanna handle the change above?

I'll look into this next week. One early question: Does the above mean
we are printing the primary path's timer value for every assoc? If so,
shouldn't we do that for just the EP or the primary path's assoc even?

Thanks, Phil

[PATCH] net: tulip: fix spelling mistake: "attemping" -> "attempting"

2016-07-30 Thread Colin King

From: Colin Ian King 

trivial fix to spelling mistake in printk message

Signed-off-by: Colin Ian King 
---
 drivers/net/ethernet/dec/tulip/de4x5.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/dec/tulip/de4x5.c 
b/drivers/net/ethernet/dec/tulip/de4x5.c
index cbe8497..f0e9e2e 100644
--- a/drivers/net/ethernet/dec/tulip/de4x5.c
+++ b/drivers/net/ethernet/dec/tulip/de4x5.c
@@ -1319,7 +1319,7 @@ de4x5_open(struct net_device *dev)
 
 if (request_irq(dev->irq, de4x5_interrupt, IRQF_SHARED,
 lp->adapter_name, dev)) {
-   printk("de4x5_open(): Requested IRQ%d is busy - attemping 
FAST/SHARE...", dev->irq);
+   printk("de4x5_open(): Requested IRQ%d is busy - attempting 
FAST/SHARE...", dev->irq);
if (request_irq(dev->irq, de4x5_interrupt, IRQF_SHARED,
 lp->adapter_name, dev)) {
printk("\n  Cannot get IRQ- reconfigure your 
hardware.\n");
-- 
2.8.1

[PATCH 2/2] net: ethernet: marvell: mvneta: use new api ethtool_{get|set}_link_ksettings

2016-07-30 Thread Philippe Reynes

The ethtool api {get|set}_settings is deprecated.
We move the mvneta driver to new api {get|set}_link_ksettings.

We use the generic function phy_ethtool_get_link_ksettings,
and update old mvneta_ethtool_set_settings to the new api.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/marvell/mvneta.c |   34 
 1 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c 
b/drivers/net/ethernet/marvell/mvneta.c
index 410adc7..2021863 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3505,30 +3505,22 @@ static int mvneta_ioctl(struct net_device *dev, struct 
ifreq *ifr, int cmd)
 
 /* Ethtool methods */
 
-/* Get settings (phy address, speed) for ethtools */
-int mvneta_ethtool_get_settings(struct net_device *dev, struct ethtool_cmd 
*cmd)
+/* Set link ksettings (phy address, speed) for ethtools */
+int mvneta_ethtool_set_link_ksettings(struct net_device *ndev,
+ const struct ethtool_link_ksettings *cmd)
 {
-   if (!dev->phydev)
-   return -ENODEV;
-
-   return phy_ethtool_gset(dev->phydev, cmd);
-}
-
-/* Set settings (phy address, speed) for ethtools */
-int mvneta_ethtool_set_settings(struct net_device *dev, struct ethtool_cmd 
*cmd)
-{
-   struct mvneta_port *pp = netdev_priv(dev);
-   struct phy_device *phydev = dev->phydev;
+   struct mvneta_port *pp = netdev_priv(ndev);
+   struct phy_device *phydev = ndev->phydev;
 
if (!phydev)
return -ENODEV;
 
-   if ((cmd->autoneg == AUTONEG_ENABLE) != pp->use_inband_status) {
+   if ((cmd->base.autoneg == AUTONEG_ENABLE) != pp->use_inband_status) {
u32 val;
 
-   mvneta_set_autoneg(pp, cmd->autoneg == AUTONEG_ENABLE);
+   mvneta_set_autoneg(pp, cmd->base.autoneg == AUTONEG_ENABLE);
 
-   if (cmd->autoneg == AUTONEG_DISABLE) {
+   if (cmd->base.autoneg == AUTONEG_DISABLE) {
val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
val &= ~(MVNETA_GMAC_CONFIG_MII_SPEED |
 MVNETA_GMAC_CONFIG_GMII_SPEED |
@@ -3545,17 +3537,17 @@ int mvneta_ethtool_set_settings(struct net_device *dev, 
struct ethtool_cmd *cmd)
mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
}
 
-   pp->use_inband_status = (cmd->autoneg == AUTONEG_ENABLE);
+   pp->use_inband_status = (cmd->base.autoneg == AUTONEG_ENABLE);
netdev_info(pp->dev, "autoneg status set to %i\n",
pp->use_inband_status);
 
-   if (netif_running(dev)) {
+   if (netif_running(ndev)) {
mvneta_port_down(pp);
mvneta_port_up(pp);
}
}
 
-   return phy_ethtool_sset(dev->phydev, cmd);
+   return phy_ethtool_ksettings_set(ndev->phydev, cmd);
 }
 
 /* Set interrupt coalescing for ethtools */
@@ -3819,8 +3811,6 @@ static const struct net_device_ops mvneta_netdev_ops = {
 
 const struct ethtool_ops mvneta_eth_tool_ops = {
.get_link   = ethtool_op_get_link,
-   .get_settings   = mvneta_ethtool_get_settings,
-   .set_settings   = mvneta_ethtool_set_settings,
.set_coalesce   = mvneta_ethtool_set_coalesce,
.get_coalesce   = mvneta_ethtool_get_coalesce,
.get_drvinfo= mvneta_ethtool_get_drvinfo,
@@ -3833,6 +3823,8 @@ const struct ethtool_ops mvneta_eth_tool_ops = {
.get_rxnfc  = mvneta_ethtool_get_rxnfc,
.get_rxfh   = mvneta_ethtool_get_rxfh,
.set_rxfh   = mvneta_ethtool_set_rxfh,
+   .get_link_ksettings = phy_ethtool_get_link_ksettings,
+   .set_link_ksettings = mvneta_ethtool_set_link_ksettings,
 };
 
 /* Initialize hw */
-- 
1.7.4.4

[PATCH 1/2] net: ethernet: marvell: mvneta: use phydev from struct net_device

2016-07-30 Thread Philippe Reynes

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy_dev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/marvell/mvneta.c |   34 +++-
 1 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c 
b/drivers/net/ethernet/marvell/mvneta.c
index f92018b..410adc7 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -399,7 +399,6 @@ struct mvneta_port {
u16 rx_ring_size;
 
struct mii_bus *mii_bus;
-   struct phy_device *phy_dev;
phy_interface_t phy_interface;
struct device_node *phy_node;
unsigned int link;
@@ -2651,6 +2650,7 @@ static int mvneta_poll(struct napi_struct *napi, int 
budget)
u32 cause_rx_tx;
int rx_queue;
struct mvneta_port *pp = netdev_priv(napi->dev);
+   struct net_device *ndev = pp->dev;
struct mvneta_pcpu_port *port = this_cpu_ptr(pp->ports);
 
if (!netif_running(pp->dev)) {
@@ -2668,7 +2668,7 @@ static int mvneta_poll(struct napi_struct *napi, int 
budget)
(MVNETA_CAUSE_PHY_STATUS_CHANGE |
 MVNETA_CAUSE_LINK_CHANGE |
 MVNETA_CAUSE_PSC_SYNC_CHANGE))) {
-   mvneta_fixed_link_update(pp, pp->phy_dev);
+   mvneta_fixed_link_update(pp, ndev->phydev);
}
}
 
@@ -2963,6 +2963,7 @@ static int mvneta_setup_txqs(struct mvneta_port *pp)
 static void mvneta_start_dev(struct mvneta_port *pp)
 {
int cpu;
+   struct net_device *ndev = pp->dev;
 
mvneta_max_rx_size_set(pp, pp->pkt_size);
mvneta_txq_max_tx_size_set(pp, pp->pkt_size);
@@ -2985,15 +2986,16 @@ static void mvneta_start_dev(struct mvneta_port *pp)
MVNETA_CAUSE_LINK_CHANGE |
MVNETA_CAUSE_PSC_SYNC_CHANGE);
 
-   phy_start(pp->phy_dev);
+   phy_start(ndev->phydev);
netif_tx_start_all_queues(pp->dev);
 }
 
 static void mvneta_stop_dev(struct mvneta_port *pp)
 {
unsigned int cpu;
+   struct net_device *ndev = pp->dev;
 
-   phy_stop(pp->phy_dev);
+   phy_stop(ndev->phydev);
 
for_each_online_cpu(cpu) {
struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu);
@@ -3166,7 +3168,7 @@ static int mvneta_set_mac_addr(struct net_device *dev, 
void *addr)
 static void mvneta_adjust_link(struct net_device *ndev)
 {
struct mvneta_port *pp = netdev_priv(ndev);
-   struct phy_device *phydev = pp->phy_dev;
+   struct phy_device *phydev = ndev->phydev;
int status_change = 0;
 
if (phydev->link) {
@@ -3244,7 +3246,6 @@ static int mvneta_mdio_probe(struct mvneta_port *pp)
phy_dev->supported &= PHY_GBIT_FEATURES;
phy_dev->advertising = phy_dev->supported;
 
-   pp->phy_dev = phy_dev;
pp->link= 0;
pp->duplex  = 0;
pp->speed   = 0;
@@ -3254,8 +3255,9 @@ static int mvneta_mdio_probe(struct mvneta_port *pp)
 
 static void mvneta_mdio_remove(struct mvneta_port *pp)
 {
-   phy_disconnect(pp->phy_dev);
-   pp->phy_dev = NULL;
+   struct net_device *ndev = pp->dev;
+
+   phy_disconnect(ndev->phydev);
 }
 
 /* Electing a CPU must be done in an atomic way: it should be done
@@ -3495,12 +3497,10 @@ static int mvneta_stop(struct net_device *dev)
 
 static int mvneta_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 {
-   struct mvneta_port *pp = netdev_priv(dev);
-
-   if (!pp->phy_dev)
+   if (!dev->phydev)
return -ENOTSUPP;
 
-   return phy_mii_ioctl(pp->phy_dev, ifr, cmd);
+   return phy_mii_ioctl(dev->phydev, ifr, cmd);
 }
 
 /* Ethtool methods */
@@ -3508,19 +3508,17 @@ static int mvneta_ioctl(struct net_device *dev, struct 
ifreq *ifr, int cmd)
 /* Get settings (phy address, speed) for ethtools */
 int mvneta_ethtool_get_settings(struct net_device *dev, struct ethtool_cmd 
*cmd)
 {
-   struct mvneta_port *pp = netdev_priv(dev);
-
-   if (!pp->phy_dev)
+   if (!dev->phydev)
return -ENODEV;
 
-   return phy_ethtool_gset(pp->phy_dev, cmd);
+   return phy_ethtool_gset(dev->phydev, cmd);
 }
 
 /* Set settings (phy address, speed) for ethtools */
 int mvneta_ethtool_set_settings(struct net_device *dev, struct ethtool_cmd 
*cmd)
 {
struct mvneta_port *pp = netdev_priv(dev);
-   struct phy_device *phydev = pp->phy_dev;
+   struct phy_device *phydev = dev->phydev;
 
if (!phydev)
return -ENODEV;
@@ -3557,7 +3555,7 @@ int mvneta_ethtool_set_settings(struct net_device *dev, 
struct ethtool_cmd *cmd)
}
}
 
-   return phy_ethtool_sset(pp->phy_dev, cmd);
+   return phy

[PATCH 2/2] net: ethernet: greth: use phy_ethtool_{get|set}_link_ksettings

2016-07-30 Thread Philippe Reynes

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/aeroflex/greth.c |   23 ++-
 1 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/aeroflex/greth.c 
b/drivers/net/ethernet/aeroflex/greth.c
index 010978b..f8df824 100644
--- a/drivers/net/ethernet/aeroflex/greth.c
+++ b/drivers/net/ethernet/aeroflex/greth.c
@@ -1105,25 +1105,6 @@ static void greth_set_msglevel(struct net_device *dev, 
u32 value)
struct greth_private *greth = netdev_priv(dev);
greth->msg_enable = value;
 }
-static int greth_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
-{
-   struct phy_device *phy = dev->phydev;
-
-   if (!phy)
-   return -ENODEV;
-
-   return phy_ethtool_gset(phy, cmd);
-}
-
-static int greth_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
-{
-   struct phy_device *phy = dev->phydev;
-
-   if (!phy)
-   return -ENODEV;
-
-   return phy_ethtool_sset(phy, cmd);
-}
 
 static int greth_get_regs_len(struct net_device *dev)
 {
@@ -1155,12 +1136,12 @@ static void greth_get_regs(struct net_device *dev, 
struct ethtool_regs *regs, vo
 static const struct ethtool_ops greth_ethtool_ops = {
.get_msglevel   = greth_get_msglevel,
.set_msglevel   = greth_set_msglevel,
-   .get_settings   = greth_get_settings,
-   .set_settings   = greth_set_settings,
.get_drvinfo= greth_get_drvinfo,
.get_regs_len   = greth_get_regs_len,
.get_regs   = greth_get_regs,
.get_link   = ethtool_op_get_link,
+   .get_link_ksettings = phy_ethtool_get_link_ksettings,
+   .set_link_ksettings = phy_ethtool_set_link_ksettings,
 };
 
 static struct net_device_ops greth_netdev_ops = {
-- 
1.7.4.4

[PATCH 1/2] net: ethernet: greth: use phydev from struct net_device

2016-07-30 Thread Philippe Reynes

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/aeroflex/greth.c |   23 +++
 drivers/net/ethernet/aeroflex/greth.h |1 -
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/aeroflex/greth.c 
b/drivers/net/ethernet/aeroflex/greth.c
index bca07c5..010978b 100644
--- a/drivers/net/ethernet/aeroflex/greth.c
+++ b/drivers/net/ethernet/aeroflex/greth.c
@@ -1107,8 +1107,7 @@ static void greth_set_msglevel(struct net_device *dev, 
u32 value)
 }
 static int greth_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 {
-   struct greth_private *greth = netdev_priv(dev);
-   struct phy_device *phy = greth->phy;
+   struct phy_device *phy = dev->phydev;
 
if (!phy)
return -ENODEV;
@@ -1118,8 +1117,7 @@ static int greth_get_settings(struct net_device *dev, 
struct ethtool_cmd *cmd)
 
 static int greth_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 {
-   struct greth_private *greth = netdev_priv(dev);
-   struct phy_device *phy = greth->phy;
+   struct phy_device *phy = dev->phydev;
 
if (!phy)
return -ENODEV;
@@ -1224,7 +1222,7 @@ static int greth_mdio_write(struct mii_bus *bus, int phy, 
int reg, u16 val)
 static void greth_link_change(struct net_device *dev)
 {
struct greth_private *greth = netdev_priv(dev);
-   struct phy_device *phydev = greth->phy;
+   struct phy_device *phydev = dev->phydev;
unsigned long flags;
int status_change = 0;
u32 ctrl;
@@ -1307,7 +1305,6 @@ static int greth_mdio_probe(struct net_device *dev)
greth->link = 0;
greth->speed = 0;
greth->duplex = -1;
-   greth->phy = phy;
 
return 0;
 }
@@ -1325,6 +1322,7 @@ static int greth_mdio_init(struct greth_private *greth)
 {
int ret;
unsigned long timeout;
+   struct net_device *ndev = greth->netdev;
 
greth->mdio = mdiobus_alloc();
if (!greth->mdio) {
@@ -1349,15 +1347,16 @@ static int greth_mdio_init(struct greth_private *greth)
goto unreg_mdio;
}
 
-   phy_start(greth->phy);
+   phy_start(ndev->phydev);
 
/* If Ethernet debug link is used make autoneg happen right away */
if (greth->edcl && greth_edcl == 1) {
-   phy_start_aneg(greth->phy);
+   phy_start_aneg(ndev->phydev);
timeout = jiffies + 6*HZ;
-   while (!phy_aneg_done(greth->phy) && time_before(jiffies, 
timeout)) {
+   while (!phy_aneg_done(ndev->phydev) &&
+  time_before(jiffies, timeout)) {
}
-   phy_read_status(greth->phy);
+   phy_read_status(ndev->phydev);
greth_link_change(greth->netdev);
}
 
@@ -1569,8 +1568,8 @@ static int greth_of_remove(struct platform_device *of_dev)
 
dma_free_coherent(&of_dev->dev, 1024, greth->tx_bd_base, 
greth->tx_bd_base_phys);
 
-   if (greth->phy)
-   phy_stop(greth->phy);
+   if (ndev->phydev)
+   phy_stop(ndev->phydev);
mdiobus_unregister(greth->mdio);
 
unregister_netdev(ndev);
diff --git a/drivers/net/ethernet/aeroflex/greth.h 
b/drivers/net/ethernet/aeroflex/greth.h
index 92dd918..9c07140 100644
--- a/drivers/net/ethernet/aeroflex/greth.h
+++ b/drivers/net/ethernet/aeroflex/greth.h
@@ -123,7 +123,6 @@ struct greth_private {
struct napi_struct napi;
spinlock_t devlock;
 
-   struct phy_device *phy;
struct mii_bus *mdio;
unsigned int link;
unsigned int speed;
-- 
1.7.4.4

[PATCH 1/2] net: ethernet: octeon: use phydev from struct net_device

2016-07-30 Thread Philippe Reynes

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes 
---
 drivers/staging/octeon/ethernet-mdio.c   |   48 -
 drivers/staging/octeon/ethernet-rgmii.c  |2 +-
 drivers/staging/octeon/ethernet.c|   12 +++
 drivers/staging/octeon/octeon-ethernet.h |1 -
 4 files changed, 26 insertions(+), 37 deletions(-)

diff --git a/drivers/staging/octeon/ethernet-mdio.c 
b/drivers/staging/octeon/ethernet-mdio.c
index e13a4ab..661b97b 100644
--- a/drivers/staging/octeon/ethernet-mdio.c
+++ b/drivers/staging/octeon/ethernet-mdio.c
@@ -36,36 +36,30 @@ static void cvm_oct_get_drvinfo(struct net_device *dev,
 
 static int cvm_oct_get_settings(struct net_device *dev, struct ethtool_cmd 
*cmd)
 {
-   struct octeon_ethernet *priv = netdev_priv(dev);
-
-   if (priv->phydev)
-   return phy_ethtool_gset(priv->phydev, cmd);
+   if (dev->phydev)
+   return phy_ethtool_gset(dev->phydev, cmd);
 
return -EINVAL;
 }
 
 static int cvm_oct_set_settings(struct net_device *dev, struct ethtool_cmd 
*cmd)
 {
-   struct octeon_ethernet *priv = netdev_priv(dev);
-
if (!capable(CAP_NET_ADMIN))
return -EPERM;
 
-   if (priv->phydev)
-   return phy_ethtool_sset(priv->phydev, cmd);
+   if (dev->phydev)
+   return phy_ethtool_sset(dev->phydev, cmd);
 
return -EINVAL;
 }
 
 static int cvm_oct_nway_reset(struct net_device *dev)
 {
-   struct octeon_ethernet *priv = netdev_priv(dev);
-
if (!capable(CAP_NET_ADMIN))
return -EPERM;
 
-   if (priv->phydev)
-   return phy_start_aneg(priv->phydev);
+   if (dev->phydev)
+   return phy_start_aneg(dev->phydev);
 
return -EINVAL;
 }
@@ -88,15 +82,13 @@ const struct ethtool_ops cvm_oct_ethtool_ops = {
  */
 int cvm_oct_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 {
-   struct octeon_ethernet *priv = netdev_priv(dev);
-
if (!netif_running(dev))
return -EINVAL;
 
-   if (!priv->phydev)
+   if (!dev->phydev)
return -EINVAL;
 
-   return phy_mii_ioctl(priv->phydev, rq, cmd);
+   return phy_mii_ioctl(dev->phydev, rq, cmd);
 }
 
 void cvm_oct_note_carrier(struct octeon_ethernet *priv,
@@ -119,9 +111,9 @@ void cvm_oct_adjust_link(struct net_device *dev)
cvmx_helper_link_info_t link_info;
 
link_info.u64   = 0;
-   link_info.s.link_up = priv->phydev->link ? 1 : 0;
-   link_info.s.full_duplex = priv->phydev->duplex ? 1 : 0;
-   link_info.s.speed   = priv->phydev->speed;
+   link_info.s.link_up = dev->phydev->link ? 1 : 0;
+   link_info.s.full_duplex = dev->phydev->duplex ? 1 : 0;
+   link_info.s.speed   = dev->phydev->speed;
priv->link_info = link_info.u64;
 
/*
@@ -130,8 +122,8 @@ void cvm_oct_adjust_link(struct net_device *dev)
if (priv->poll)
priv->poll(dev);
 
-   if (priv->last_link != priv->phydev->link) {
-   priv->last_link = priv->phydev->link;
+   if (priv->last_link != dev->phydev->link) {
+   priv->last_link = dev->phydev->link;
cvmx_helper_link_set(priv->port, link_info);
cvm_oct_note_carrier(priv, link_info);
}
@@ -151,9 +143,8 @@ int cvm_oct_common_stop(struct net_device *dev)
 
priv->poll = NULL;
 
-   if (priv->phydev)
-   phy_disconnect(priv->phydev);
-   priv->phydev = NULL;
+   if (dev->phydev)
+   phy_disconnect(dev->phydev);
 
if (priv->last_link) {
link_info.u64 = 0;
@@ -176,6 +167,7 @@ int cvm_oct_phy_setup_device(struct net_device *dev)
 {
struct octeon_ethernet *priv = netdev_priv(dev);
struct device_node *phy_node;
+   struct phy_device *phydev = NULL;
 
if (!priv->of_node)
goto no_phy;
@@ -193,14 +185,14 @@ int cvm_oct_phy_setup_device(struct net_device *dev)
if (!phy_node)
goto no_phy;
 
-   priv->phydev = of_phy_connect(dev, phy_node, cvm_oct_adjust_link, 0,
- PHY_INTERFACE_MODE_GMII);
+   phydev = of_phy_connect(dev, phy_node, cvm_oct_adjust_link, 0,
+   PHY_INTERFACE_MODE_GMII);
 
-   if (!priv->phydev)
+   if (!phydev)
return -ENODEV;
 
priv->last_link = 0;
-   phy_start_aneg(priv->phydev);
+   phy_start_aneg(phydev);
 
return 0;
 no_phy:
diff --git a/drivers/staging/octeon/ethernet-rgmii.c 
b/drivers/staging/octeon/ethernet-rgmii.c
index 91b148c..48846df 100644
--- a/drivers/staging/octeon/ethernet-rgmii.c
+++ b/drivers/staging/octeon/ethernet-rgmii.c

[PATCH 2/2] net: ethernet: octeon: use phy_ethtool_{get|set}_link_ksettings

2016-07-30 Thread Philippe Reynes

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

There was a check on CAP_NET_ADMIN in cvm_oct_set_settings, but this
check is already done in dev_ethtool, so no need to repeat it before
calling the generic function.

Signed-off-by: Philippe Reynes 
---
 drivers/staging/octeon/ethernet-mdio.c |   23 ++-
 1 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/drivers/staging/octeon/ethernet-mdio.c 
b/drivers/staging/octeon/ethernet-mdio.c
index 661b97b..1fde9c8 100644
--- a/drivers/staging/octeon/ethernet-mdio.c
+++ b/drivers/staging/octeon/ethernet-mdio.c
@@ -34,25 +34,6 @@ static void cvm_oct_get_drvinfo(struct net_device *dev,
strlcpy(info->bus_info, "Builtin", sizeof(info->bus_info));
 }
 
-static int cvm_oct_get_settings(struct net_device *dev, struct ethtool_cmd 
*cmd)
-{
-   if (dev->phydev)
-   return phy_ethtool_gset(dev->phydev, cmd);
-
-   return -EINVAL;
-}
-
-static int cvm_oct_set_settings(struct net_device *dev, struct ethtool_cmd 
*cmd)
-{
-   if (!capable(CAP_NET_ADMIN))
-   return -EPERM;
-
-   if (dev->phydev)
-   return phy_ethtool_sset(dev->phydev, cmd);
-
-   return -EINVAL;
-}
-
 static int cvm_oct_nway_reset(struct net_device *dev)
 {
if (!capable(CAP_NET_ADMIN))
@@ -66,10 +47,10 @@ static int cvm_oct_nway_reset(struct net_device *dev)
 
 const struct ethtool_ops cvm_oct_ethtool_ops = {
.get_drvinfo = cvm_oct_get_drvinfo,
-   .get_settings = cvm_oct_get_settings,
-   .set_settings = cvm_oct_set_settings,
.nway_reset = cvm_oct_nway_reset,
.get_link = ethtool_op_get_link,
+   .get_link_ksettings = phy_ethtool_get_link_ksettings,
+   .set_link_ksettings = phy_ethtool_set_link_ksettings,
 };
 
 /**
-- 
1.7.4.4

Re: [PATCH RESEND nf] netfilter: avoid a race between nf_register_hook() and cleanup_net()

2016-07-30 Thread Eric W. Biederman

Michal Kubecek  writes:

> There is a race condition between nf_{,un}register_hook() and
> cleanup_net() which can either trigger WARN check or cause a memory
> leak. The scenario is like this (2a and 2b are alternatives):
>
> 1.  cleanup_net() removes one or more struct net from net_namespace_list
> 2a. nf_register_hook() adds per-netns hooks to all netns (but not those
> removed in step 1) and adds the hook to global nf_hook_list
> 2b. nf_unregister_hook() deletes per-netns hooks from all netns (but not
> those removed in step 1) and removes the hook from nf_hook_list
> 3.  cleanup_net() calls pernet subsystem exit functions for netns being
> removed; one of them is netfilter_net_exit() which (among others)
> calls nf_unregister_net_hook() to unregister per-netns hooks for all
> hooks in nf_hook_list.
>
> In case (a), per-netns hooks are never added as the namespace was
> already invisible to for_each_net() in step 2a but an attempt to remove
> them in step 3 (the hook is already in nf_hook_list) triggers a WARN
> check in nf_unregister_net_hook() (no real harm done, however). In case
> (b), the per-netns hook is removed neither in step 2b (netns is already
> invisible to for_each_net()) nor in step 3 (the hook is already removed
> from nf_hook_list), causing a memory leak.
>
> Prevent the race by protecting the for_each_net() loop in
> nf_{,un}register_hook() (also) by net_mutex. There is already a
> precendens for this in rtnl_link_unregister() which addresses similar
> race.

So this analysis of a problem appears to be spot on.

Reviewed-by: "Eric W. Biederman" 


I really really want there to be a better way to do this, but it is
really not ok for a hook to continue it's life past
nf_unregister_net_hook as after that point the code may be removed
from the kernel (sigh).

Although keeping with the precedent and minimizing net_mutex
we could remove the WARN and keep nf_register_hook as it is.
But that sounds entirely too clever for a fix that will
probably be backported.

But that sounds entirely too clever for a fix that likely needs to be
backported.

Eric

> Fixes: 085db2c04557 ("netfilter: Per network namespace netfilter hooks.")
> Signed-off-by: Michal Kubecek 
> ---
>  net/netfilter/core.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/net/netfilter/core.c b/net/netfilter/core.c
> index f39276d1c2d7..860978c9f82e 100644
> --- a/net/netfilter/core.c
> +++ b/net/netfilter/core.c
> @@ -193,6 +193,8 @@ int nf_register_hook(struct nf_hook_ops *reg)
>   struct net *net, *last;
>   int ret;
>  
> + /* prevent race with cleanup_net() */
> + mutex_lock(&net_mutex);
>   rtnl_lock();
>   for_each_net(net) {
>   ret = nf_register_net_hook(net, reg);
> @@ -201,6 +203,7 @@ int nf_register_hook(struct nf_hook_ops *reg)
>   }
>   list_add_tail(®->list, &nf_hook_list);
>   rtnl_unlock();
> + mutex_unlock(&net_mutex);
>  
>   return 0;
>  rollback:
> @@ -211,6 +214,7 @@ rollback:
>   nf_unregister_net_hook(net, reg);
>   }
>   rtnl_unlock();
> + mutex_unlock(&net_mutex);
>   return ret;
>  }
>  EXPORT_SYMBOL(nf_register_hook);
> @@ -219,11 +223,14 @@ void nf_unregister_hook(struct nf_hook_ops *reg)
>  {
>   struct net *net;
>  
> + /* prevent race with cleanup_net() */
> + mutex_lock(&net_mutex);
>   rtnl_lock();
>   list_del(®->list);
>   for_each_net(net)
>   nf_unregister_net_hook(net, reg);
>   rtnl_unlock();
> + mutex_unlock(&net_mutex);
>  }
>  EXPORT_SYMBOL(nf_unregister_hook);

Re: [PATCH 2/3] sctp_diag: export timer value only if it is active

2016-07-30 Thread Marcelo Ricardo Leitner




Em 30-07-2016 10:25, Xin Long escreveu:

diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c
index f69edcf219e51..0ad6033a7330c 100644
--- a/net/sctp/sctp_diag.c
+++ b/net/sctp/sctp_diag.c
@@ -40,10 +40,12 @@ static void inet_diag_msg_sctpasoc_fill(struct 
inet_diag_msg *r,
  }

  r->idiag_state = asoc->state;
- r->idiag_timer = SCTP_EVENT_TIMEOUT_T3_RTX;
- r->idiag_retrans = asoc->rtx_data_chunks;
- r->idiag_expires = jiffies_to_msecs(
- asoc->timeouts[SCTP_EVENT_TIMEOUT_T3_RTX] - jiffies);


I think we have two issues here, prior to your patch, but I noticed
while reviewing it :-)

This array is actually not based on jiffies but on intervals instead, as
per:

sm_sideeffect.c:
case SCTP_CMD_TIMER_START:   [1]
timer = &asoc->timers[cmd->obj.to];
timeout = asoc->timeouts[cmd->obj.to];   <---
BUG_ON(!timeout);

timer->expires = jiffies + timeout;  <---

understood.



But more importantly, this array is actually not used for this timeout
and the timeout is sctp_transport dependant, as per:

/* Schedule retransmission on the given transport */
void sctp_transport_immediate_rtx(struct sctp_transport *t)
{
/* Stop pending T3_rtx_timer */
if (del_timer(&t->T3_rtx_timer))
sctp_transport_put(t);

sctp_retransmit(&t->asoc->outqueue, t, SCTP_RTXR_T3_RTX);
if (!timer_pending(&t->T3_rtx_timer)) {
if (!mod_timer(&t->T3_rtx_timer, jiffies + t->rto))
 
sctp_transport_hold(t);

Note how on sctp_get_sctp_info() it fetches the RTO (which is T3_RTX)
this way:
info->sctpi_p_rto = jiffies_to_msecs(prim->rto);
If we want to know how long is left for the timer to expire, we have to
read directly from it.

you are right, 3 timers (T3_tx, hb, rtx_data_chunks) are per transport.



With git grep -A 1 TIMER_START we can confirm that [1] is never hit for
SCTP_EVENT_TIMEOUT_T3_RTX. Yet, the asoc is allocated with kzalloc(), so
I guess you were just reading -jiffies in there.

Note however that the stats rtx_data_chunks is the accumulated stats,
it's good, and that we may have multiple T3 timers running at once, with
different timeouts.

Xin, ideas on how we can fix this? I'm not sure if we can dump
per-transport info in there. Not as it is now, I guess.

It's not that easy to dump all transports info besed on current sctp_diag codes.


Okay



Now for the transport's info,  we only choose primary_path to dump.


Okay


It means we should fix this by getting the left time to expire from
primary transport t->T3_rtx_timer. like:

r->idiag_expires = jiffies_to_msecs(
-   asoc->timeouts[SCTP_EVENT_TIMEOUT_T3_RTX] - jiffies);
+   asoc->peer.primary_path->T3_rtx_timer.expires - jiffies);

but yes, need to check with timer_pending firstly.


Yes :)



what do  you think ?



Makes sense, LGTM.

Phil, not sure how you want to proceed here. Wanna handle the change above?

Re: [PATCH 2/3] sctp_diag: export timer value only if it is active

2016-07-30 Thread Xin Long

>> diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c
>> index f69edcf219e51..0ad6033a7330c 100644
>> --- a/net/sctp/sctp_diag.c
>> +++ b/net/sctp/sctp_diag.c
>> @@ -40,10 +40,12 @@ static void inet_diag_msg_sctpasoc_fill(struct 
>> inet_diag_msg *r,
>>   }
>>
>>   r->idiag_state = asoc->state;
>> - r->idiag_timer = SCTP_EVENT_TIMEOUT_T3_RTX;
>> - r->idiag_retrans = asoc->rtx_data_chunks;
>> - r->idiag_expires = jiffies_to_msecs(
>> - asoc->timeouts[SCTP_EVENT_TIMEOUT_T3_RTX] - jiffies);
>
> I think we have two issues here, prior to your patch, but I noticed
> while reviewing it :-)
>
> This array is actually not based on jiffies but on intervals instead, as
> per:
>
> sm_sideeffect.c:
> case SCTP_CMD_TIMER_START:   [1]
> timer = &asoc->timers[cmd->obj.to];
> timeout = asoc->timeouts[cmd->obj.to];   <---
> BUG_ON(!timeout);
>
> timer->expires = jiffies + timeout;  <---
understood.

>
> But more importantly, this array is actually not used for this timeout
> and the timeout is sctp_transport dependant, as per:
>
> /* Schedule retransmission on the given transport */
> void sctp_transport_immediate_rtx(struct sctp_transport *t)
> {
> /* Stop pending T3_rtx_timer */
> if (del_timer(&t->T3_rtx_timer))
> sctp_transport_put(t);
>
> sctp_retransmit(&t->asoc->outqueue, t, SCTP_RTXR_T3_RTX);
> if (!timer_pending(&t->T3_rtx_timer)) {
> if (!mod_timer(&t->T3_rtx_timer, jiffies + t->rto))
>  
> sctp_transport_hold(t);
>
> Note how on sctp_get_sctp_info() it fetches the RTO (which is T3_RTX)
> this way:
> info->sctpi_p_rto = jiffies_to_msecs(prim->rto);
> If we want to know how long is left for the timer to expire, we have to
> read directly from it.
you are right, 3 timers (T3_tx, hb, rtx_data_chunks) are per transport.

>
> With git grep -A 1 TIMER_START we can confirm that [1] is never hit for
> SCTP_EVENT_TIMEOUT_T3_RTX. Yet, the asoc is allocated with kzalloc(), so
> I guess you were just reading -jiffies in there.
>
> Note however that the stats rtx_data_chunks is the accumulated stats,
> it's good, and that we may have multiple T3 timers running at once, with
> different timeouts.
>
> Xin, ideas on how we can fix this? I'm not sure if we can dump
> per-transport info in there. Not as it is now, I guess.
It's not that easy to dump all transports info besed on current sctp_diag codes.

Now for the transport's info,  we only choose primary_path to dump.
It means we should fix this by getting the left time to expire from
primary transport t->T3_rtx_timer. like:

r->idiag_expires = jiffies_to_msecs(
-   asoc->timeouts[SCTP_EVENT_TIMEOUT_T3_RTX] - jiffies);
+   asoc->peer.primary_path->T3_rtx_timer.expires - jiffies);

but yes, need to check with timer_pending firstly.

what do  you think ?

Re: [PATCH net] sctp: change to use TCP_CLOSE_WAIT as SCTP_SS_CLOSING

2016-07-30 Thread Marcelo Ricardo Leitner

On Sat, Jul 30, 2016 at 08:00:45PM +0800, Xin Long wrote:
> Prior to this patch, sctp defined TCP_CLOSING as SCTP_SS_CLOSING.
> TCP_CLOSING is such a special sk state in TCP that inet common codes
> even exclude it.
> 
> For instance, inet_accept thinks the accept sk's state never be
> TCP_CLOSING, or it will give a WARN_ON. TCP works well with that
> while SCTP may trigger the call trace, as CLOSING state in SCTP
> has different meaning from TCP.
> 
> This fix is to change to use TCP_CLOSE_WAIT as SCTP_SS_CLOSING,
> instead of TCP_CLOSING. Some side-effects could be expected,
> regardless of not being used before. inet_accept will accept it
> now.
> 
> I did all the func_tests in lksctp-tools and ran sctp codnomicon
> fuzzer tests against this patch, no regression or failure found.
> 
> Signed-off-by: Xin Long 

I don't think this is -net material. It's a one line change but a core
one.
Dave please consider it for net-next instead.
Though, Xin you may need to re-post later..

Acked-by: Marcelo Ricardo Leitner 

> ---
>  include/net/sctp/constants.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h
> index 8c337cd..5b847e4 100644
> --- a/include/net/sctp/constants.h
> +++ b/include/net/sctp/constants.h
> @@ -214,7 +214,7 @@ typedef enum {
>   SCTP_SS_LISTENING  = TCP_LISTEN,
>   SCTP_SS_ESTABLISHING   = TCP_SYN_SENT,
>   SCTP_SS_ESTABLISHED= TCP_ESTABLISHED,
> - SCTP_SS_CLOSING= TCP_CLOSING,
> + SCTP_SS_CLOSING= TCP_CLOSE_WAIT,
>  } sctp_sock_state_t;
>  
>  /* These functions map various type to printable names.  */
> -- 
> 2.1.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Re: [PATCH net] sctp: fix the issue sctp requeue auth chunk incorrectly

2016-07-30 Thread Marcelo Ricardo Leitner

On Sat, Jul 30, 2016 at 01:58:35PM +0800, Xin Long wrote:
> sctp needs to queue auth chunk back when we know that we are going
> to generate another segment. But commit f1533cce60d1 ("sctp: fix
> panic when sending auth chunks") requeues the last chunk processed
> which is probably not the auth chunk.
> 
> It causes panic when calculating the MAC in sctp_auth_calculate_hmac(),
> as the incorrect offset of the auth chunk in skb->data.
> 
> This fix is to requeue it by using packet->auth.
> 
> Fixes: f1533cce60d1 ("sctp: fix panic when sending auth chunks")
> Signed-off-by: Xin Long 

Acked-by: Marcelo Ricardo Leitner 

> ---
>  net/sctp/output.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 7425f6c..1f1682b 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -610,7 +610,8 @@ int sctp_packet_transmit(struct sctp_packet *packet, 
> gfp_t gfp)
>   /* We will generate more packets, so re-queue
>* auth chunk.
>*/
> - list_add(&chunk->list, &packet->chunk_list);
> + list_add(&packet->auth->list,
> +  &packet->chunk_list);
>   } else {
>   sctp_chunk_free(packet->auth);
>   packet->auth = NULL;
> -- 
> 2.1.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Re: [PATCH net] sctp: allow delivering notifications after receiving SHUTDOWN

2016-07-30 Thread Marcelo Ricardo Leitner

On Sat, Jul 30, 2016 at 02:09:09PM +0800, Xin Long wrote:
> Prior to this patch, once sctp received SHUTDOWN or shutdown with RD,
> sk->sk_shutdown would be set with RCV_SHUTDOWN, and all events would
> be dropped in sctp_ulpq_tail_event(). It would cause:
> 
> 1. some notifications couldn't be received by users. like
>SCTP_SHUTDOWN_COMP generated by sctp_sf_do_4_C().
> 
> 2. sctp would also never trigger sk_data_ready when the association
>was closed, making it harder to identify the end of the association
>by calling recvmsg() and getting an EOF. It was not convenient for
>kernel users.
> 
> The check here should be stopping delivering DATA chunks after receiving
> SHUTDOWN, and stopping delivering ANY chunks after sctp_close().
> 
> So this patch is to allow notifications to enqueue into receive queue
> even if sk->sk_shutdown is set to RCV_SHUTDOWN in sctp_ulpq_tail_event,
> but if sk->sk_shutdown == RCV_SHUTDOWN | SEND_SHUTDOWN, it drops all
> events.
> 
> Signed-off-by: Xin Long 

Acked-by: Marcelo Ricardo Leitner 

> ---
>  net/sctp/ulpqueue.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
> index ec166d2..877e550 100644
> --- a/net/sctp/ulpqueue.c
> +++ b/net/sctp/ulpqueue.c
> @@ -204,7 +204,9 @@ int sctp_ulpq_tail_event(struct sctp_ulpq *ulpq, struct 
> sctp_ulpevent *event)
>   /* If the socket is just going to throw this away, do not
>* even try to deliver it.
>*/
> - if (sock_flag(sk, SOCK_DEAD) || (sk->sk_shutdown & RCV_SHUTDOWN))
> + if (sk->sk_shutdown & RCV_SHUTDOWN &&
> + (sk->sk_shutdown & SEND_SHUTDOWN ||
> +  !sctp_ulpevent_is_notification(event)))
>   goto out_free;
>  
>   if (!sctp_ulpevent_is_notification(event)) {
> -- 
> 2.1.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Re: [PATCH net] sctp: allow receiving msg when TCP-style sk is in CLOSED state

2016-07-30 Thread Marcelo Ricardo Leitner

On Sat, Jul 30, 2016 at 02:14:41PM +0800, Xin Long wrote:
> Commit 141ddefce7c8 ("sctp: change sk state to CLOSED instead of
> CLOSING in sctp_sock_migrate") changed sk state to CLOSED if the
> assoc is closed when sctp_accept clones a new sk.
> 
> If there is still data in sk receive queue, users will not be able
> to read it any more, as sctp_recvmsg returns directly if sk state
> is CLOSED.
> 
> This patch is to add CLOSED state check in sctp_recvmsg to allow
> reading data from TCP-style sk with CLOSED state as what TCP does.
> 
> Signed-off-by: Xin Long 

So scripts can track this later more easily,

Fixes: 141ddefce7c8 ("sctp: change sk state to CLOSED instead of CLOSING in 
sctp_sock_migrate")
Acked-by: Marcelo Ricardo Leitner 

> ---
>  net/sctp/socket.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 8812e1b..9fc417a 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -2079,7 +2079,7 @@ static int sctp_recvmsg(struct sock *sk, struct msghdr 
> *msg, size_t len,
>   lock_sock(sk);
>  
>   if (sctp_style(sk, TCP) && !sctp_sstate(sk, ESTABLISHED) &&
> - !sctp_sstate(sk, CLOSING)) {
> + !sctp_sstate(sk, CLOSING) && !sctp_sstate(sk, CLOSED)) {
>   err = -ENOTCONN;
>   goto out;
>   }
> -- 
> 2.1.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

[PATCH net] sctp: change to use TCP_CLOSE_WAIT as SCTP_SS_CLOSING

2016-07-30 Thread Xin Long

Prior to this patch, sctp defined TCP_CLOSING as SCTP_SS_CLOSING.
TCP_CLOSING is such a special sk state in TCP that inet common codes
even exclude it.

For instance, inet_accept thinks the accept sk's state never be
TCP_CLOSING, or it will give a WARN_ON. TCP works well with that
while SCTP may trigger the call trace, as CLOSING state in SCTP
has different meaning from TCP.

This fix is to change to use TCP_CLOSE_WAIT as SCTP_SS_CLOSING,
instead of TCP_CLOSING. Some side-effects could be expected,
regardless of not being used before. inet_accept will accept it
now.

I did all the func_tests in lksctp-tools and ran sctp codnomicon
fuzzer tests against this patch, no regression or failure found.

Signed-off-by: Xin Long 
---
 include/net/sctp/constants.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h
index 8c337cd..5b847e4 100644
--- a/include/net/sctp/constants.h
+++ b/include/net/sctp/constants.h
@@ -214,7 +214,7 @@ typedef enum {
SCTP_SS_LISTENING  = TCP_LISTEN,
SCTP_SS_ESTABLISHING   = TCP_SYN_SENT,
SCTP_SS_ESTABLISHED= TCP_ESTABLISHED,
-   SCTP_SS_CLOSING= TCP_CLOSING,
+   SCTP_SS_CLOSING= TCP_CLOSE_WAIT,
 } sctp_sock_state_t;
 
 /* These functions map various type to printable names.  */
-- 
2.1.0

Re: [PATCH] [v6] net: emac: emac gigabit ethernet controller driver

2016-07-30 Thread Lino Sanfilippo

On 28.07.2016 21:12, Timur Tabi wrote:

> 
>>> +   if (ret) {
>>> +   netdev_err(adpt->netdev,
>>> +  "error:%d on request_irq(%d:%s flags:0)\n", ret,
>>> +  irq->irq, EMAC_MAC_IRQ_RES);
>>
>> freeing the irq is missing
> 
> Will fix.
> 
>>> +   /* disable mac irq */
>>> +   writel(DIS_INT, adpt->base + EMAC_INT_STATUS);
>>> +   writel(0, adpt->base + EMAC_INT_MASK);
>>> +   synchronize_irq(adpt->irq.irq);
>>> +   free_irq(adpt->irq.irq, &adpt->irq);
>>> +   clear_bit(EMAC_STATUS_TASK_REINIT_REQ, &adpt->status);
>>> +
>>> +   cancel_work_sync(&adpt->tx_ts_task);
>>> +   spin_lock_irqsave(&adpt->tx_ts_lock, flags);
>>
>> Maybe I am missing something but AFAICS tx_ts_lock is never called from irq 
>> context, so
>> there is no reason to disable irqs.
> 
> It might have been that way in an older version of the code, but it 
> appears you are correct.  I will change it to a normal spinlock.  Thanks.

By looking closer to the code, the lock seems to serve the protection of a list 
of skbs that
are queued to be timestamped. However there is nothing that ever enqueues those 
skbs, is it? 
I assume that this is a leftover of a previous version of that driver. Code to 
be merged into 
mailine has to be completely cleaned up and must not contains any functions 
which are never 
called. So either you implement the timestamping feature completely or you 
remove the concerning 
code altogether for the initial mainline driver version. You can add that 
feature any time later. 

> 
>>> +/* Push the received skb to upper layers */
>>> +static void emac_receive_skb(struct emac_rx_queue *rx_q,
>>> +struct sk_buff *skb,
>>> +u16 vlan_tag, bool vlan_flag)
>>> +{
>>> +   if (vlan_flag) {
>>> +   u16 vlan;
>>> +
>>> +   EMAC_TAG_TO_VLAN(vlan_tag, vlan);
>>> +   __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan);
>>> +   }
>>> +
>>> +   napi_gro_receive(&rx_q->napi, skb);
>>
>> napi_gro_receive requires rx checksum offload. However emac_receive_skb() is 
>> also called if
>> hardware checksumming is disabled.
> 
> So the hardware is a little weird here.  Apparently, there is a bug in 
> the parsing of the packet headers that is avoided if we disable hardware 
> checksumming.
> 
> In emac_mac_rx_process(), right before it calls emac_receive_skb(), it 
> does this:
> 
> if (netdev->features & NETIF_F_RXCSUM)
>   skb->ip_summed = RRD_L4F(&rrd) ?
> CHECKSUM_NONE : CHECKSUM_UNNECESSARY;
> else
>   skb_checksum_none_assert(skb);
> 
> RRD_L4F(&rrd) is always zero and NETIF_F_RXCSUM is set by default, so 
> ip_summed is set to CHECKSUM_UNNECESSARY.
> 
> So you're saying that if NETIF_F_RXCSUM is not set, then 
> napi_gro_receive() should not be called?

This requirement seems indeed to be obsolete now so you can ignore my former 
complaint and
leave it as it is. 

> In fact, I have not been able to find any clear example of a driver that 
> intentionally avoids calling napi_gro_receive() if hardware checksumming 
> is disabled.

Some drivers still make the differentiation:
http://lxr.free-electrons.com/source/drivers/net/ethernet/marvell/sky2.c#L2656
http://lxr.free-electrons.com/source/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c#L1548
and several more.

>>> +/* Transmit the packet using specified transmit queue */
>>> +int emac_mac_tx_buf_send(struct emac_adapter *adpt, struct emac_tx_queue 
>>> *tx_q,
>>> +struct sk_buff *skb)
>>> +{
>>> +   struct emac_tpd tpd;
>>> +   u32 prod_idx;
>>> +
>>> +   if (!emac_tx_has_enough_descs(tx_q, skb)) {
>>
>> Drivers should avoid this situation right from the start by checking after 
>> each transmission if the max number
>>   of possible descriptors is still available for a further transmission and 
>> stop the queue if there are not.
> 
> Ok, to be clear, you're saying I should do what bcmgenet_xmit() does.
> 
>   if (ring->free_bds <= (MAX_SKB_FRAGS + 1))
>   netif_tx_stop_queue(txq);
> 
> At the end of emac_mac_tx_buf_send(), I should call 
> emac_tpd_num_free_descs() and check to see whether the number of free 
> descriptors is <= (MAX_SKB_FRAGS + 1).

Right. The current implemented approach to check the number of descs at the 
beginning instead of at the
end of a transmission is not wrong but it leads to one extra call of xmit if 
the driver is running
out of descs. So most drivers avoid this by stopping the queue as soon as they 
detect that the max 
required number of descs for a further transmission is not available. 

> 
>> Furthermore there does not seem to be any function that wakes the queue up 
>> again once it has been stopped.
> 
> If I make the above fix, won't that also fix this bug?

No, how should it? There still is nothing that wakes up the queue once it is 
stopped. Stopping and 
restarting/waking up a queue is up to the driver, since the network stack cant 
know if the hw is
ready to

Re: [PATCH net-next 01/10] drivers: net: xgene: Fix kbuild warning

2016-07-30 Thread Arnd Bergmann

On Friday, July 29, 2016 5:33:54 PM CEST Iyappan Subramanian wrote:
> This patch fixes the following kbuild warning, when ACPI was not enabled.
> 
> >> drivers/net/ethernet/apm/xgene/xgene_enet_hw.c:878:23: warning: 'phy_dev' 
> >> may be used uninitialized in this function [-Wmaybe-uninitialized]
>  phy_dev->advertising = phy_dev->supported;

>From looking at the patch, I don't think it addresses the warning.
Note that Linus added a patch to disable the warning in the latest
mainline kernel, so you won't see it any more, but I think the bug
is still there.

> diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c 
> b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
> index 7714b7d..b6bc6fa 100644
> --- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
> +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
> @@ -776,15 +776,11 @@ int xgene_enet_phy_connect(struct net_device *ndev)
> netdev_err(ndev, "Could not connect to PHY\n");
> return -ENODEV;
> }
> -
> -   pdata->phy_dev = phy_dev;
> } else {
>  #ifdef CONFIG_ACPI
> struct acpi_device *adev = acpi_phy_find_device(dev);
> if (adev)
> -   pdata->phy_dev =  adev->driver_data;
> -
> -   phy_dev = pdata->phy_dev;
> +   phy_dev =  adev->driver_data;
>  
> if (!phy_dev ||
> phy_connect_direct(ndev, phy_dev, &xgene_enet_adjust_link,
> @@ -795,6 +791,7 @@ int xgene_enet_phy_connect(struct net_device *ndev)
>  #endif
> }
>  
> +   pdata->phy_dev = phy_dev;

phy_dev is not initialized anywhere if CONFIG_ACPI is not set
and dev->of_node is NULL (which should not happen in practice,
but the compiler doesn't know that).

I think you want this instead:

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
index 7714b7d4026a..98779fe2d558 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
@@ -792,6 +792,8 @@ int xgene_enet_phy_connect(struct net_device *ndev)
netdev_err(ndev, "Could not connect to PHY\n");
return  -ENODEV;
}
+#else
+   return -ENODEV;
 #endif
}
 

ARnd

Arnd

Re: [PATCH v2 1/5] ethernet: add sun8i-emac driver

2016-07-30 Thread Maxime Ripard

On Sat, Jul 30, 2016 at 09:30:01AM +0800, Chen-Yu Tsai wrote:
> >> > > +static void sun8i_emac_unset_syscon(struct net_device *ndev)
> >> > > +{
> >> > > + struct sun8i_emac_priv *priv = netdev_priv(ndev);
> >> > > + u32 reg = 0;
> >> > > +
> >> > > + if (priv->variant == H3_EMAC)
> >> > > + reg = H3_EPHY_DEFAULT_VALUE;
> >> >
> >> > Why do you need that?
> >> >
> >> For resetting the syscon to the factory default.
> >
> > Yes, but does it matter? Does it have any side effect? Is that
> > register shared with another device?
> >
> > Otherwise, either it won't be used anymore, and you don't care, or you
> > will reload the driver later, and the driver should work whatever
> > state is programmed in there. In both cases, you don't need to reset
> > that value.
> 
> The "default" setting also disables and powers down the internal PHY.
> I think that's a good thing? The naming could be better.

Ah, it might, and that would obviously be the right thing to do. Using
a define for those enable and power down bits would be better though.

> >> > > +static irqreturn_t sun8i_emac_dma_interrupt(int irq, void *dev_id)
> >> > > +{
> >> > > + struct net_device *ndev = dev_id;
> >> > > + struct sun8i_emac_priv *priv = netdev_priv(ndev);
> >> > > + u32 v, u;
> >> > > +
> >> > > + v = readl(priv->base + SUN8I_EMAC_INT_STA);
> >> > > +
> >> > > + /* When this bit is asserted, a frame transmission is completed. */
> >> > > + if (v & BIT(0)) {
> >> > > + priv->estats.tx_int++;
> >> > > + writel(0, priv->base + SUN8I_EMAC_INT_EN);
> >> > > + napi_schedule(&priv->napi);
> >> > > + }
> >> > > +
> >> > > + /* When this bit is asserted, the TX DMA FSM is stopped. */
> >> > > + if (v & BIT(1))
> >> > > + priv->estats.tx_dma_stop++;
> >> > > +
> >> > > + /* When this asserted, the TX DMA can not acquire next TX descriptor
> >> > > +  * and TX DMA FSM is suspended.
> >> > > + */
> >> > > + if (v & BIT(2))
> >> > > + priv->estats.tx_dma_ua++;
> >> > > +
> >> > > + if (v & BIT(3))
> >> > > + netif_dbg(priv, intr, ndev, "Unhandled interrupt TX 
> >> > > TIMEOUT\n");
> >> >
> >> > Why do you enable that interrupt if you can't handle it?
> >>
> >> Some interrupt fire even when not enabled (like 
> >> RX_BUF_UA_INT/TX_BUF_UA_INT)
> >
> > So the bits 9 and 2, respectively, in the interrupt enable register
> > are useless?
> 
> Does it actually fire, i.e. pull the interrupt line on the GIC? Or is it just
> the interrupt state showing an event? IIRC some other hardware blocks have 
> this
> behavior, such as the timer.

That's quite easy to implement, you can do a bitwise and on the status
and enable registers.

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


signature.asc
Description: PGP signature

44 matches

Mail list logo