Re: [PATCH net-next] net: hns: optimize XGE capability by reducing cpu usage

2015-12-07 Thread Yankejian (Hackim Yim)


On 2015/12/8 14:30, Du, Fan wrote:
>
>
> On 2015/12/8 14:22, Yankejian (Hackim Yim) wrote:
>>
>> On 2015/12/7 16:58, Du, Fan wrote:
>>> >
>>> >
>>> >On 2015/12/5 15:32, yankejian wrote:
>>> >>here is the patch raising the performance of XGE by:
>>> >>1)changes the way page management method for enet memory, and
>>> >>2)reduces the count of rmb, and
>>> >>3)adds Memory prefetching
>>> >
>>> >Any numbers on how much it boost performance?
>>> >
>> it is almost the same as 82599.
>
> I mean how much it improves performance *BEFORE* and *AFTER* this patch
> for Huawei XGE chip, because the commit log states it "raising the 
> performance",
> but did not give any numbers from the testing.
>
> .
>
Hi Du Fan,
the bandwidth improvement is not obvious, but CPU usage drops by up to 50%.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: MPLS decap with iproute2

2015-12-07 Thread Sam Russell
Thanks all for your help, I got it working with Robert and Roopa's
sysctl settings, the following works:

ip route -f mpls add 100 dev lo

On 8 December 2015 at 15:37, roopa  wrote:
> On 12/7/15, 11:42 AM, Sam Russell wrote:
>> Hi,
>>
>> I've had success with the iproute2 manpage example for encapsulating
>> outgoing traffic in MPLS, but I've not found a way to add decap routes
>> inbound.
>>
>> I've tried "ip route -f mpls add 100 dev lo" and other variations, but I
>> get netlink errors back.
>>
>> Has this been built yet? Is there sample config that I can try out? I'm
>> running a home-built 4.3 kernel + iproute2 built from head (on ubuntu
>> 15.10) and am comfortable with perf and splashing around in the codebase if
>> need be.
>>
> Example below should work
>  ip -f mpls route add 100 as 200 via inet 10.1.1.2 dev eth0
>
> You have to enable mpls on the interface first:
> echo 1 > /proc/sys/net/mpls/conf/eth0/input
>
> multipath iproute2 patches are not in yet. Will submit them soon.
>
> let me know if you still get errors.


netlink: Add missing goto statement to netlink_insert

2015-12-07 Thread Herbert Xu
On Mon, Dec 07, 2015 at 07:58:25AM +0100, Stefan Priebe - Profihost AG wrote:
>
> Thanks, good. Can you help me to get this fix upstream into the stable
> lines?

Sure.  Greg, please apply this patch to fix up the backport for 4.1.

---8<---
The backport of 1f770c0a09da855a2b51af6d19de97fb955eca85 ("netlink:
Fix autobind race condition that leads to zero port ID") missed a
goto statement, which causes netlink to break subtly.

This was discovered by Stefan Priebe.

Fixes: 4e2776241766 ("netlink: Fix autobind race condition that...")
Reported-by: Stefan Priebe 
Reported-by: Philipp Hahn 
Signed-off-by: Herbert Xu 

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index d139c43..0d6038c 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1118,6 +1118,7 @@ static int netlink_insert(struct sock *sk, u32 portid)
 		if (err == -EEXIST)
 			err = -EADDRINUSE;
 		sock_put(sk);
+		goto err;
 	}
 
 	/* We need to ensure that the socket is hashed and visible. */

-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
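
To make the effect of the missing goto concrete, here is a minimal user-space sketch of the control flow the commit message describes. It is not the kernel code: fake_sock, fake_insert and the with_goto switch are hypothetical names invented for illustration. It only shows that, without the jump, the error branch falls through into the success path and marks the socket as bound even though the insert failed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a socket: only tracks whether it is bound. */
struct fake_sock {
    bool bound;
};

/*
 * Model of an insert path. hash_err plays the role of the hash-table
 * insert result; with_goto selects the fixed (true) or broken (false)
 * control flow.
 */
static int fake_insert(struct fake_sock *sk, int hash_err, bool with_goto)
{
    int err = hash_err;

    assert(sk);

    if (err) {
        if (err == -EEXIST)
            err = -EADDRINUSE;
        if (with_goto)
            goto err;   /* the statement the backport lost */
    }

    /* Success path: must only run when the insert succeeded. */
    sk->bound = true;

err:
    return err;
}
```

With with_goto set to false the function still returns -EADDRINUSE, but the socket ends up marked as bound, which is the kind of subtle breakage described above.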


Re: [PATCH net-next] net: hns: optimize XGE capability by reducing cpu usage

2015-12-07 Thread Du, Fan



On 2015/12/8 14:22, Yankejian (Hackim Yim) wrote:
>
> On 2015/12/7 16:58, Du, Fan wrote:
>>
>> On 2015/12/5 15:32, yankejian wrote:
>>> here is the patch raising the performance of XGE by:
>>> 1)changes the way page management method for enet memory, and
>>> 2)reduces the count of rmb, and
>>> 3)adds Memory prefetching
>>
>> Any numbers on how much it boost performance?
>>
> it is almost the same as 82599.

I mean how much it improves performance *BEFORE* and *AFTER* this patch
for Huawei XGE chip, because the commit log states it "raising the
performance", but did not give any numbers from the testing.


[PATCH v2] net: thunderx: Correctly distinguish between VF and LMAC count

2015-12-07 Thread Pavel Fedin
Commit bc69fdfc6c13
("net: thunderx: Enable BGX LMAC's RX/TX only after VF is up")
introduces lmac_cnt member and starts verifying VF number against it.
This is plain wrong, and works only because currently we have hardcoded
1:1 mapping between VFs and LMACs, and in this case num_vf_en and
lmac_cnt are always equal. However in future this may change, and the
code will badly misbehave. The worst consequence of this is failure to
deliver link status messages, causing VFs to go defunct because since
commit 0b72a9a1060e ("net: thunderx: Switchon carrier only upon
interface link up") VF will not fully bring itself up without it.

This patch fixes the potential problem by doing VF number checks against
the num_vf_en. Since lmac_cnt is not used anywhere else, it is removed.

Additionally some duplicated code is factored out into nic_enable_vf()

Signed-off-by: Pavel Fedin 
---
v1 => v2:
- Updated subject and commit message for better understanding
---
 drivers/net/ethernet/cavium/thunder/nic_main.c | 39 --
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 4b7fd63..5f24d11 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -37,7 +37,6 @@ struct nicpf {
 #define NIC_GET_BGX_FROM_VF_LMAC_MAP(map)   ((map >> 4) & 0xF)
 #define NIC_GET_LMAC_FROM_VF_LMAC_MAP(map)  (map & 0xF)
u8  vf_lmac_map[MAX_LMAC];
-   u8  lmac_cnt;
struct delayed_work dwork;
struct workqueue_struct *check_link;
u8  link[MAX_LMAC];
@@ -280,7 +279,6 @@ static void nic_set_lmac_vf_mapping(struct nicpf *nic)
u64 lmac_credit;
 
nic->num_vf_en = 0;
-   nic->lmac_cnt = 0;
 
for (bgx = 0; bgx < NIC_MAX_BGX; bgx++) {
if (!(bgx_map & (1 << bgx)))
@@ -290,7 +288,6 @@ static void nic_set_lmac_vf_mapping(struct nicpf *nic)
nic->vf_lmac_map[next_bgx_lmac++] =
NIC_SET_VF_LMAC_MAP(bgx, lmac);
nic->num_vf_en += lmac_cnt;
-   nic->lmac_cnt += lmac_cnt;
 
/* Program LMAC credits */
lmac_credit = (1ull << 1); /* channel credit enable */
@@ -618,6 +615,21 @@ static int nic_config_loopback(struct nicpf *nic, struct set_loopback *lbk)
return 0;
 }
 
+static void nic_enable_vf(struct nicpf *nic, int vf, bool enable)
+{
+   int bgx, lmac;
+
+   nic->vf_enabled[vf] = enable;
+
+   if (vf >= nic->num_vf_en)
+   return;
+
+   bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
+   lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
+
+   bgx_lmac_rx_tx_enable(nic->node, bgx, lmac, enable);
+}
+
 /* Interrupt handler to handle mailbox messages from VFs */
 static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
 {
@@ -717,29 +729,14 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
break;
case NIC_MBOX_MSG_CFG_DONE:
/* Last message of VF config msg sequence */
-   nic->vf_enabled[vf] = true;
-   if (vf >= nic->lmac_cnt)
-   goto unlock;
-
-   bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
-   lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
-
-   bgx_lmac_rx_tx_enable(nic->node, bgx, lmac, true);
+   nic_enable_vf(nic, vf, true);
goto unlock;
case NIC_MBOX_MSG_SHUTDOWN:
/* First msg in VF teardown sequence */
-   nic->vf_enabled[vf] = false;
if (vf >= nic->num_vf_en)
nic->sqs_used[vf - nic->num_vf_en] = false;
nic->pqs_vf[vf] = 0;
-
-   if (vf >= nic->lmac_cnt)
-   break;
-
-   bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
-   lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
-
-   bgx_lmac_rx_tx_enable(nic->node, bgx, lmac, false);
+   nic_enable_vf(nic, vf, false);
break;
case NIC_MBOX_MSG_ALLOC_SQS:
nic_alloc_sqs(nic, &mbx.sqs_alloc);
@@ -958,7 +955,7 @@ static void nic_poll_for_link(struct work_struct *work)
 
mbx.link_status.msg = NIC_MBOX_MSG_BGX_LINK_CHANGE;
 
-   for (vf = 0; vf < nic->lmac_cnt; vf++) {
+   for (vf = 0; vf < nic->num_vf_en; vf++) {
/* Poll only if VF is UP */
if (!nic->vf_enabled[vf])
continue;
-- 
2.4.4
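
As an illustration of the check this patch centralises, the following user-space sketch models the VF-to-LMAC map and the bounds test against num_vf_en. It is not the driver code: fake_nic, fake_enable_vf, MAX_VFS and the last_bgx/last_lmac fields are hypothetical, and the packing used by SET_VF_LMAC_MAP is an assumption inferred from the two GET macros shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VFS 8   /* hypothetical size, for the sketch only */

/* Pack/unpack helpers; the SET layout is assumed from the GET macros. */
#define SET_VF_LMAC_MAP(bgx, lmac)      ((((bgx) & 0xF) << 4) | ((lmac) & 0xF))
#define GET_BGX_FROM_VF_LMAC_MAP(map)   (((map) >> 4) & 0xF)
#define GET_LMAC_FROM_VF_LMAC_MAP(map)  ((map) & 0xF)

struct fake_nic {
    uint8_t vf_lmac_map[MAX_VFS];
    uint8_t num_vf_en;          /* number of VFs that own an LMAC */
    bool vf_enabled[MAX_VFS];
    int last_bgx, last_lmac;    /* records the last modelled hardware call */
};

/* Returns true if the LMAC RX/TX state would have been touched. */
static bool fake_enable_vf(struct fake_nic *nic, int vf, bool enable)
{
    assert(vf >= 0 && vf < MAX_VFS);

    /* The enabled flag is tracked for every VF. */
    nic->vf_enabled[vf] = enable;

    /* VFs beyond num_vf_en have no LMAC of their own. */
    if (vf >= nic->num_vf_en)
        return false;

    nic->last_bgx = GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
    nic->last_lmac = GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
    return true;
}
```

The point of the sketch is only that the enabled flag is always recorded, while the LMAC is touched solely for VF numbers below num_vf_en.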


Re: [PATCH net-next] net: hns: optimize XGE capability by reducing cpu usage

2015-12-07 Thread Yankejian (Hackim Yim)


On 2015/12/7 16:58, Du, Fan wrote:
>
>
> On 2015/12/5 15:32, yankejian wrote:
>> here is the patch raising the performance of XGE by:
> >> 1)changes the way page management method for enet memory, and
>> 2)reduces the count of rmb, and
>> 3)adds Memory prefetching
>
> Any numbers on how much it boost performance?
>

it is almost the same as 82599.

>> Signed-off-by: yankejian 
>> ---
>>   drivers/net/ethernet/hisilicon/hns/hnae.h |  5 +-
>>   drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c |  1 -
>>   drivers/net/ethernet/hisilicon/hns/hns_enet.c | 79 
>> +++
>>   3 files changed, 55 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.h 
>> b/drivers/net/ethernet/hisilicon/hns/hnae.h
>> index d1f3316..6ca94dc 100644
>> --- a/drivers/net/ethernet/hisilicon/hns/hnae.h
>> +++ b/drivers/net/ethernet/hisilicon/hns/hnae.h
>> @@ -341,7 +341,8 @@ struct hnae_queue {
>>   void __iomem *io_base;
>>   phys_addr_t phy_base;
>>   struct hnae_ae_dev *dev;/* the device who use this queue */
>> -struct hnae_ring rx_ring, tx_ring;
>> +struct hnae_ring rx_ring cacheline_internodealigned_in_smp;
>> +struct hnae_ring tx_ring cacheline_internodealigned_in_smp;
>>   struct hnae_handle *handle;
>>   };
>>
>> @@ -597,11 +598,9 @@ static inline void hnae_replace_buffer(struct hnae_ring 
>> *ring, int i,
>>  struct hnae_desc_cb *res_cb)
>>   {
>>   struct hnae_buf_ops *bops = ring->q->handle->bops;
>> -struct hnae_desc_cb tmp_cb = ring->desc_cb[i];
>>
>>   bops->unmap_buffer(ring, &ring->desc_cb[i]);
>>   ring->desc_cb[i] = *res_cb;
>> -*res_cb = tmp_cb;
>>   ring->desc[i].addr = (__le64)ring->desc_cb[i].dma;
>>   ring->desc[i].rx.ipoff_bnum_pid_flag = 0;
>>   }
>> diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c 
>> b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
>> index 77c6edb..522b264 100644
>> --- a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
>> +++ b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
>> @@ -341,7 +341,6 @@ void hns_ae_toggle_ring_irq(struct hnae_ring *ring, u32 
>> mask)
>>   else
>>   flag = RCB_INT_FLAG_RX;
>>
>> -hns_rcb_int_clr_hw(ring->q, flag);
>>   hns_rcb_int_ctrl_hw(ring->q, flag, mask);
>>   }
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c 
>> b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
>> index cad2663..e2be510 100644
>> --- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
>> +++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
>> @@ -33,6 +33,7 @@
>>
>>   #define RCB_IRQ_NOT_INITED 0
>>   #define RCB_IRQ_INITED 1
>> +#define HNS_BUFFER_SIZE_2048 2048
>>
>>   #define BD_MAX_SEND_SIZE 8191
>>   #define SKB_TMP_LEN(SKB) \
>> @@ -491,13 +492,51 @@ static unsigned int hns_nic_get_headlen(unsigned char 
>> *data, u32 flag,
>>   return max_size;
>>   }
>>
>> -static void
>> -hns_nic_reuse_page(struct hnae_desc_cb *desc_cb, int tsize, int last_offset)
>> +static void hns_nic_reuse_page(struct sk_buff *skb, int i,
>> +   struct hnae_ring *ring, int pull_len,
>> +   struct hnae_desc_cb *desc_cb)
>>   {
>> +struct hnae_desc *desc;
>> +int truesize, size;
>> +int last_offset = 0;
>> +
>> +desc = &ring->desc[ring->next_to_clean];
>> +size = le16_to_cpu(desc->rx.size);
>> +
>> +#if (PAGE_SIZE < 8192)
>> +if (hnae_buf_size(ring) == HNS_BUFFER_SIZE_2048) {
>> +truesize = hnae_buf_size(ring);
>> +} else {
>> +truesize = ALIGN(size, L1_CACHE_BYTES);
>> +last_offset = hnae_page_size(ring) - hnae_buf_size(ring);
>> +}
>> +
>> +#else
>> +truesize = ALIGN(size, L1_CACHE_BYTES);
>> +last_offset = hnae_page_size(ring) - hnae_buf_size(ring);
>> +#endif
>> +
>> +skb_add_rx_frag(skb, i, desc_cb->priv, desc_cb->page_offset + pull_len,
>> +size - pull_len, truesize - pull_len);
>> +
>>/* avoid re-using remote pages,flag default unreuse */
>>   if (likely(page_to_nid(desc_cb->priv) == numa_node_id())) {
>> +#if (PAGE_SIZE < 8192)
>> +if (hnae_buf_size(ring) == HNS_BUFFER_SIZE_2048) {
>> +/* if we are only owner of page we can reuse it */
>> +if (likely(page_count(desc_cb->priv) == 1)) {
>> +/* flip page offset to other buffer */
>> +desc_cb->page_offset ^= truesize;
>> +
>> +desc_cb->reuse_flag = 1;
>> +/* bump ref count on page before it is given*/
>> +get_page(desc_cb->priv);
>> +}
>> +return;
>> +}
>> +#endif
>>   /* move offset up to the next cache line */
>> -desc_cb->page_offset += tsize;
>> +desc_cb->page_offset += truesize;
>>
>>   if (desc_cb->page_offset <= last_offset) {
>>   desc_cb->reuse_flag = 1;
>> @@ -529,11 +568,10 @@ static int 
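
The half-page "flip" in the quoted hns_nic_reuse_page() hunk can be sketched in user space as follows. This is a simplified model under stated assumptions, not the driver: fake_desc_cb, fake_reuse_page and SKETCH_BUF_SIZE are hypothetical, page_refs merely stands in for page_count(), and the NUMA-node check from the patch is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_BUF_SIZE 2048u   /* half of an assumed 4096-byte page */

struct fake_desc_cb {
    unsigned int page_offset;   /* 0 or SKETCH_BUF_SIZE */
    unsigned int page_refs;     /* stand-in for page_count() */
    bool reuse_flag;
};

/* Try to recycle the page after one half of it has been handed up. */
static bool fake_reuse_page(struct fake_desc_cb *cb)
{
    assert(cb->page_offset == 0 || cb->page_offset == SKETCH_BUF_SIZE);
    cb->reuse_flag = false;

    /* Only the sole owner may give the other half back to hardware. */
    if (cb->page_refs != 1)
        return false;

    cb->page_offset ^= SKETCH_BUF_SIZE;  /* flip to the other half */
    cb->reuse_flag = true;
    cb->page_refs++;                     /* models get_page() */
    return true;
}
```

Because the buffer size is exactly half the page, XOR-ing the offset with it alternates between the two halves without any further bookkeeping.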

Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

2015-12-07 Thread John Fastabend
On 15-12-02 04:15 PM, Tom Herbert wrote:
> On Wed, Dec 2, 2015 at 3:35 PM, John Fastabend wrote:
>> [...]
>>

>>>> I wonder why we need protocol generic offloads? I know there are
>>>> currently a lot of overlay encapsulation protocols. Are there many more
>>>> coming?

>>> Yes, and assume that there are more coming with an unbounded limit
>>> (for instance I just noticed today that there is a netdev1.1 talk on
>>> supporting GTP in the kernel). Besides, this problem space is not just
>>> limited to offload of encapsulation protocols, but how to generalize
>>> offload of any transport, IPv[46], application protocols, protocol
>>> implemented in user space, security protocols, etc.
>>>
>>>> Besides, this offload is about TSO and RSS and they do need to parse the
>>>> packet to get the information where the inner header starts. It is not
>>>> only about checksum offloading.

>>> RSS does not require the device to parse the inner header. All the UDP
>>> encapsulations protocols being defined set the source port to entropy
>>> flow value and most devices already support RSS+UDP (just needs to be
>>> enabled) so this works just fine with dumb NICs. In fact, this is one
>>> of the main motivations of encapsulating UDP in the first place, to
>>> leverage existing RSS and ECMP mechanisms. The more general solution
>>> is to use IPv6 flow label (RFC6438). We need HW support to include the
>>> flow label into the hash for ECMP and RSS, but once we have that much
>>> of the motivation for using UDP goes away and we can get back to just
>>> doing GRE/IP, IPIP, MPLS/IP, etc. (hence eliminate overhead and
>>> complexity of UDP encap).
>>>
>>>> Please provide a sketch up for a protocol generic api that can tell
>>>> hardware where a inner protocol header starts that supports vxlan,
>>>> vxlan-gpe, geneve and ipv6 extension headers and knows which protocol is
>>>> starting at that point.

>>> BPF. Implementing protocol generic offloads are not just a HW concern
>>> either, adding kernel GRO code for every possible protocol that comes
>>> along doesn't scale well. This becomes especially obvious when we
>>> consider how to provide offloads for applications protocols. If the
>>> kernel provides a programmable framework for the offloads then
>>> application protocols, such as QUIC, could use that without
>>> needing to hack the kernel to support the specific protocol (which no
>>> one wants!). Application protocol parsing in KCM and some other use
>>> cases of BPF have already foreshadowed this, and we are working on a
>>> prototype for a BPF programmable engine in the kernel. Presumably,
>>> this same model could eventually be applied as the HW API to
>>> programmable offload.
>>
>> Just keying off the last statement there...
>>
>> I think BPF programs are going to be hard to translate into hardware
>> for most devices. The problem is the BPF programs in general lack
>> structure. A parse graph would be much more friendly for hardware or
>> at minimum the BPF program would need to be a some sort of
>> well-structured program so a driver could turn that into a parse graph.
>>
> This might be relevant:
> http://richard.systems/research/pdf/IEEE_HPSR_BPF_OPENFLOW.pdf
> 

Thanks Tom, interesting read, but they seem to argue for a BPF engine in
hardware, which I'm still not convinced is necessary, and the numbers
provided are for a 1Gbps link, where 10Gbps/100Gbps+ would be more
valuable.

I am still leaning towards a fully programmable parse graph and a set
of basic actions push/pop/set/fwd/etc. This would be useful for other
features not just checksum offloads. I guess it doesn't necessarily
exclude also having 1s complement logic though.

.John
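
For readers following the RSS argument quoted above (UDP encapsulations put a flow entropy value in the source port, so a NIC that hashes on the outer UDP header already spreads flows), here is a small user-space sketch of that idea. It is an illustration only: sketch_flow_src_port and the port range constants are assumptions, and the 32-bit flow hash is taken as given rather than computed from a real packet.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed port range; a real stack would take this from configuration. */
#define SKETCH_PORT_MIN 49152u
#define SKETCH_PORT_MAX 65535u

/* Map a 32-bit inner-flow hash onto [SKETCH_PORT_MIN, SKETCH_PORT_MAX]. */
static uint16_t sketch_flow_src_port(uint32_t flow_hash)
{
    uint32_t span = SKETCH_PORT_MAX - SKETCH_PORT_MIN + 1;
    /* Scale the hash into the span without using a modulo. */
    uint32_t port = SKETCH_PORT_MIN +
                    (uint32_t)(((uint64_t)flow_hash * span) >> 32);

    assert(port >= SKETCH_PORT_MIN && port <= SKETCH_PORT_MAX);
    return (uint16_t)port;
}
```

Packets of the same inner flow get the same outer source port, so they land on the same receive queue, while different flows are likely to differ in the outer 4-tuple even though the tunnel endpoints are identical.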



Re: [P.A. Semi] Does the ethernet interface work on your Electra, Chitra, Nemo, and Athena board?

2015-12-07 Thread Denis Kirjanov
On 12/7/15, Christian Zigotzky  wrote:
> Hi all,
>
> I have some good news for you. I was able to fix the issue with the P.A.
> Semi Ethernet. It was a problem with the new DMA handling. The patch '
> [RFC/PATCH,v2] powerpc/iommu: Support "hybrid" iommu/direct DMA ops for
> coherent_mask < dma_mask (https://patchwork.ozlabs.org/patch/472535/)'
> is the problem.
>
> I had patched the following files before I compiled a kernel.
>
> arch/powerpc/Kconfig
> arch/powerpc/include/asm/device.h
> arch/powerpc/include/asm/dma-mapping.h
> arch/powerpc/include/asm/iommu.h
> arch/powerpc/kernel/dma-iommu.c
> arch/powerpc/kernel/dma-swiotlb.c
> arch/powerpc/kernel/dma.c
> arch/powerpc/platforms/powernv/pci-ioda.c
> arch/powerpc/platforms/pseries/iommu.c
> arch/powerpc/sysdev/dart_iommu.c
> include/asm-generic/dma-mapping-common.h
>
> The P.A. Semi Ethernet works again with the patched kernel.

Hi Ben,

Could you please take a look..

Thanks!

>
> I am happy. :-)
>
> Please fix the issue in the kernel source code.
>
> Thanks in advance,
>
> Christian
>
>
>


Re: Fixing full name in patchwork

2015-12-07 Thread Kalle Valo
Sudip Mukherjee  writes:

> On Mon, Dec 07, 2015 at 08:03:54PM +0200, Kalle Valo wrote:
>> Hi Sudip,
>> 
>> Sudip Mukherjee  writes:
>> 
>> > We were dereferencing cmd first and checking for NULL later. Lets first
>> > check for NULL.
>> >
>> > Signed-off-by: Sudip Mukherjee 
>> 
>> I noticed that your name in git log is not your full name:
>> 
>> commit 0a38c8e1b592c16d959da456f425053e323a5153
>> Author: sudip 
>> Date:   Tue Nov 24 13:51:38 2015 +0530
>> 
>> This is because for some reason in patchwork your fullname is just
>> "sudip":
>> 
>> https://patchwork.kernel.org/patch/7688171/
>> 
>> Could you please fix your name in patchwork so that in the future we can
>> use your correct full name? The problem is that I don't know exactly how
>> to do this but it should be possible because I remember someone else
>> having a similar problem and he was able to fix it.
>
> I have also noticed the patch. Anyway, I have created a profile in
> patchwork and given full name. Hopefully that should solve the problem.

At least now your name in the patchwork link above looks correct:

Sudip Mukherjee - Nov. 24, 2015, 8:21 a.m.

Thanks for fixing this.

-- 
Kalle Valo


Re: use-after-free in sctp_do_sm

2015-12-07 Thread Dmitry Vyukov
On Mon, Dec 7, 2015 at 2:15 PM, Marcelo Ricardo Leitner
 wrote:
> On Mon, Dec 07, 2015 at 12:26:09PM +0100, Dmitry Vyukov wrote:
>> On Sat, Dec 5, 2015 at 5:39 PM, Vlad Yasevich  wrote:
>> > On 12/04/2015 04:34 PM, Marcelo Ricardo Leitner wrote:
>> >> On Fri, Dec 04, 2015 at 09:25:35PM +0100, Dmitry Vyukov wrote:
>> >>> On Fri, Dec 4, 2015 at 6:48 PM, Marcelo Ricardo Leitner
>> >>>  wrote:
>> >>>> Hi Dmitry,
>> >>>>
>> >>>> Can you please test this patch?
>> >>>> I'll re-post with proper subject if it works.
>> >>>
>> >>> Still happening with the same stacks.
>> >>
>> >> Then there may be another one, I'm afraid.
>> >>
>> >> I'm using the testapp you shared in the first email, with that debug line
>> >> enabled and added a new one:
>> >> +   pr_debug("%p %d\n", asoc, asoc ? asoc->state : 0);
>> >> debug_post_sfx();
>> >> (should have used %x, but ok)
>> >>
>> >> Also enabled slub_debug=PUZ, and I get:
>> >>
>> >> without the patch:
>> >> [   87.873640] sctp: 8800b71533d8 1
>> >> [   87.873647] sctp: sctp_do_sm[post-sfx]: error:0,
>> >> asoc:8800b71533d8[STATE_CLOSED]
>> >> [   87.873739] sctp: 8800b71533d8 1
>> >> [   87.873742] sctp: sctp_do_sm[post-sfx]: error:0,
>> >> asoc:8800b71533d8[STATE_CLOSED]
>> >> [   87.875149] sctp: 8800b71533d8 1802201963
>> >> [   87.875238] sctp: sctp_do_sm[post-sfx]: error:0,
>> >> asoc:8800b71533d8[STATE_CLOSED]
>> >>
>> >> 1802201963 = 0x6b6b6b6b, poison
>> >>
>> >> with the patch:
>> >> [   81.071265] sctp: 880137571148 1
>> >> [   81.071273] sctp: sctp_do_sm[post-sfx]: error:0,
>> >> asoc:880137571148[STATE_CLOSED]
>> >> [   81.071372] sctp: 880137571148 1
>> >> [   81.071375] sctp: sctp_do_sm[post-sfx]: error:0,
>> >> asoc:880137571148[STATE_CLOSED]
>> >> [   81.072423] sctp:   (null) 0
>> >> [   81.072427] sctp: sctp_do_sm[post-sfx]: error:0, asoc:
>> >> (null)[STATE_CLOSED]
>> >>
>> >> This one, at least, is gone with this patch.
>> >>
>> >>   Marcelo
>> >>
>> >
>> > Hi Marcelo
>> >
>> > I think you also need to catch the SCTP_DISPOSITION_ABORT and update
>> > the pointer.  There are some issues there though as some functions report
>> > that code without actually destroying the association.  This happens when
>> > the ABORT chunk may be dropped.
>> >
>> > I think this might be why we still see the issue.
>>
>>
>> Marcelo,
>>
>> Is this info enough for you to cook another fix?
>
> Hi, I think so. I was really wondering how you could trigger that issue
> without the timestamp fix and Vlad's comment does shed some light on it.
>
> I'll do more tests later today, but what did you have connecting to the
> listening socket? Somehow you made that accept() call to return..

Local connect in another thread I guess.
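
As a side note on the "1802201963 = 0x6b6b6b6b, poison" remark quoted above, the following user-space sketch checks whether a 32-bit field consists entirely of the 0x6b byte that slab poisoning writes into freed objects. The helper name sketch_is_poisoned32 is hypothetical; it is only meant to show why that decimal value in the debug output points at a read from freed memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_POISON_BYTE 0x6b /* byte written into freed slab objects */

/* Return nonzero if every byte of a 32-bit field is the poison byte. */
static int sketch_is_poisoned32(uint32_t value)
{
    unsigned char bytes[sizeof(value)];
    size_t i;

    assert(sizeof(bytes) == 4);
    memcpy(bytes, &value, sizeof(value));

    for (i = 0; i < sizeof(bytes); i++)
        if (bytes[i] != SKETCH_POISON_BYTE)
            return 0;
    return 1;
}
```

Calling it with the state value 1802201963 from the log reports a poisoned field, while an ordinary state value such as 1 does not.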


Re: [PATCH v2 4/5] stmmac: Add ptp debugfs entry.

2015-12-07 Thread Arnd Bergmann
On Monday 07 December 2015 21:37:52 Phil Reid wrote:
> On 7/12/2015 5:03 PM, Arnd Bergmann wrote:
> > On Monday 07 December 2015 09:38:43 Phil Reid wrote:
> >> This adds a debugfs entry to view the current status of the ptp
> >> registers.
> >>
> >> Signed-off-by: Phil Reid 
> >>
> >
> > Your description should explain what this is good for. Why do you
> > need to look at this through debugfs?
> >
> 
> Happy to drop this one. I found it helpful when debugging the ptp
> behaviour. Allowing quick access to monitor the register updates
> by the ptp driver. Is there an alternative method that is preferred?

For tracing what the driver does, I'd add a trace event, which also
lets you see when any updates happen.

Arnd


Re: [PATCH v2 5/5] stmmac: socfpga: Provide dt node to config ptp clk source.

2015-12-07 Thread Phil Reid

On 7/12/2015 5:05 PM, Arnd Bergmann wrote:
> On Monday 07 December 2015 09:38:44 Phil Reid wrote:
>> Signed-off-by: Phil Reid
>> ---
>>   Documentation/devicetree/bindings/net/socfpga-dwmac.txt | 2 ++
>>   drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c | 9 +
>>   2 files changed, 11 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/net/socfpga-dwmac.txt b/Documentation/devicetree/bindings/net/socfpga-dwmac.txt
>> index 3a9d679..72d82d6 100644
>> --- a/Documentation/devicetree/bindings/net/socfpga-dwmac.txt
>> +++ b/Documentation/devicetree/bindings/net/socfpga-dwmac.txt
>> @@ -11,6 +11,8 @@ Required properties:
>>    designware version numbers documented in stmmac.txt
>>    - altr,sysmgr-syscon : Should be the phandle to the system manager node that
>>  encompasses the glue register, the register offset, and the register shift.
>> + - altr,f2h_ptp_ref_clk use f2h_ptp_ref_clk instead of default eosc1 clock
>> +   for ptp ref clk. This affects all emacs as the clock is common.
>
> Is this feature specific to the Altera glue logic, or would it be possible
> to do the same thing on another dwmac implementation?

I think it is specific to Altera's glue logic. It selects either a clock
connected directly to the ARM HPS core or a clock routed from Altera FPGA
fabric. Control register is in the altera sysmgr.

--
Regards
Phil Reid



[PATCH net] vxlan: interpret IP headers for ECN correctly

2015-12-07 Thread Jiri Benc
When looking for outer IP header, use the actual socket address family, not
the address family of the default destination which is not set for metadata
based interfaces (and doesn't have to match the address family of the
received packet even if it was set).

Fix also the misleading comment.

Signed-off-by: Jiri Benc 
---
 drivers/net/vxlan.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 6369a5734d4c..2718b836c1e7 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1158,7 +1158,6 @@ static void vxlan_rcv(struct vxlan_sock *vs, struct sk_buff *skb,
struct pcpu_sw_netstats *stats;
union vxlan_addr saddr;
int err = 0;
-   union vxlan_addr *remote_ip;
 
/* For flow based devices, map all packets to VNI 0 */
if (vs->flags & VXLAN_F_COLLECT_METADATA)
@@ -1169,7 +1168,6 @@ static void vxlan_rcv(struct vxlan_sock *vs, struct sk_buff *skb,
if (!vxlan)
goto drop;
 
-   remote_ip = &vxlan->default_dst.remote_ip;
skb_reset_mac_header(skb);
skb_scrub_packet(skb, !net_eq(vxlan->net, dev_net(vxlan->dev)));
skb->protocol = eth_type_trans(skb, vxlan->dev);
@@ -1179,8 +1177,8 @@ static void vxlan_rcv(struct vxlan_sock *vs, struct sk_buff *skb,
if (ether_addr_equal(eth_hdr(skb)->h_source, vxlan->dev->dev_addr))
goto drop;
 
-   /* Re-examine inner Ethernet packet */
-   if (remote_ip->sa.sa_family == AF_INET) {
+   /* Get data from the outer IP header */
+   if (vxlan_get_sk_family(vs) == AF_INET) {
oip = ip_hdr(skb);
saddr.sin.sin_addr.s_addr = oip->saddr;
saddr.sa.sa_family = AF_INET;
-- 
1.8.3.1
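
The difference between the old and new family check can be modelled in a few lines of user-space C. This is only a sketch of the reasoning in the commit message: the sketch_* types and both helper functions are hypothetical, and AF_UNSPEC is used here as an assumed stand-in for "default destination not set" on a metadata based interface.

```c
#include <assert.h>
#include <sys/socket.h> /* AF_INET, AF_INET6, AF_UNSPEC */

/* Family the receiving UDP socket was created with. */
struct sketch_vxlan_sock {
    int sk_family;
};

/* Family of the configured default destination (may be unset). */
struct sketch_vxlan_dev {
    int default_dst_family;
};

enum sketch_outer { SKETCH_OUTER_IPV4, SKETCH_OUTER_IPV6 };

/* Old behaviour: trust the default destination, even when it is unset. */
static enum sketch_outer outer_by_default_dst(const struct sketch_vxlan_dev *dev)
{
    return dev->default_dst_family == AF_INET ? SKETCH_OUTER_IPV4
                                              : SKETCH_OUTER_IPV6;
}

/* New behaviour: the socket family says how the packet really arrived. */
static enum sketch_outer outer_by_socket(const struct sketch_vxlan_sock *vs)
{
    assert(vs->sk_family == AF_INET || vs->sk_family == AF_INET6);
    return vs->sk_family == AF_INET ? SKETCH_OUTER_IPV4
                                    : SKETCH_OUTER_IPV6;
}
```

For an IPv4 socket attached to a device without a default destination, the first helper picks the IPv6 branch and the second picks IPv4, which is the mismatch the patch removes.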



[P.A. Semi] Does the ethernet interface work on your Electra, Chitra, Nemo, and Athena board?

2015-12-07 Thread Christian Zigotzky

Hi all,

I have some good news for you. I was able to fix the issue with the P.A. 
Semi Ethernet. It was a problem with the new DMA handling. The patch ' 
[RFC/PATCH,v2] powerpc/iommu: Support "hybrid" iommu/direct DMA ops for 
coherent_mask < dma_mask (https://patchwork.ozlabs.org/patch/472535/)' 
is the problem.


I had patched the following files before I compiled a kernel.

arch/powerpc/Kconfig
arch/powerpc/include/asm/device.h
arch/powerpc/include/asm/dma-mapping.h
arch/powerpc/include/asm/iommu.h
arch/powerpc/kernel/dma-iommu.c
arch/powerpc/kernel/dma-swiotlb.c
arch/powerpc/kernel/dma.c
arch/powerpc/platforms/powernv/pci-ioda.c
arch/powerpc/platforms/pseries/iommu.c
arch/powerpc/sysdev/dart_iommu.c
include/asm-generic/dma-mapping-common.h

The P.A. Semi Ethernet works again with the patched kernel.

I am happy. :-)

Please fix the issue in the kernel source code.

Thanks in advance,

Christian




Re: [PATCH v3 net-next 0/4] Further fix for dsa unbinding

2015-12-07 Thread Andrew Lunn
On Mon, Dec 07, 2015 at 01:57:31PM +0100, Neil Armstrong wrote:
> This series fixes further issues for DSA dynamic unbinding.
> The first patch completely removes the PHY link state polling.
> The two following patches clean up the dsa state upon removal.
> The last patch moves the slave destroy code into a slave function and
> adds missing netdev and phy cleanup calls.
> 
> v1: http://lkml.kernel.org/r/562f8ecb.6050...@baylibre.com
> v2: http://lkml.kernel.org/r/56321d9a.8010...@baylibre.com
> remove phy fix and add missing calls in dsa_switch_destroy
> then add dedicated dsa_slave_destroy
> 
> v3: remove polling instead of fixing it, make single patch for
> dsa slave destroy

Acked-by: Andrew Lunn 

Thanks
Andrew


Re: [PATCH net-next] ravb: ptp: Add CONFIG mode support

2015-12-07 Thread Yoshihiro Kaneko
Hello Sergei,

2015-12-07 4:19 GMT+09:00 Sergei Shtylyov :
> Hello.
>
> On 12/05/2015 01:01 PM, Yoshihiro Kaneko wrote:
>
>> Thanks for your review.
>
>
>    From now on, it'll be my duty. :-)

Thank you always for your help.

>
>> 2015-12-04 6:09 GMT+09:00 Sergei Shtylyov
>> :
>>>
>>> Hello.
>>>
>>> On 12/01/2015 08:04 PM, Yoshihiro Kaneko wrote:
>>>
>>>> From: Kazuya Mizuguchi
>>>>
>>>> This patch makes PTP support active in CONFIG mode on R-Car Gen3.
>>>>
>>>> Signed-off-by: Kazuya Mizuguchi
>>>> Signed-off-by: Yoshihiro Kaneko
>>>> ---
>>>>
>>>> This patch is based on the master branch of David Miller's next
>>>> networking
>>>> tree.
>>>>
>>>>    drivers/net/ethernet/renesas/ravb.h  |  1 +
>>>>    drivers/net/ethernet/renesas/ravb_main.c | 33
>>>> +++-
>>>>    2 files changed, 29 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/renesas/ravb.h
>>>> b/drivers/net/ethernet/renesas/ravb.h
>>>> index f9dee74..9fbe92a 100644
>>>> --- a/drivers/net/ethernet/renesas/ravb.h
>>>> +++ b/drivers/net/ethernet/renesas/ravb.h
>
> [...]
>
>>>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c
>>>> b/drivers/net/ethernet/renesas/ravb_main.c
>>>> index 990dc55..293046d 100644
>>>> --- a/drivers/net/ethernet/renesas/ravb_main.c
>>>> +++ b/drivers/net/ethernet/renesas/ravb_main.c
>
> [...]

>>>> @@ -1855,6 +1870,10 @@ out_napi_del:
>>>>    out_dma_free:
>>>>   dma_free_coherent(ndev->dev.parent, priv->desc_bat_size,
>>>> priv->desc_bat,
>>>> priv->desc_bat_dma);
>>>> +
>>>> +   /* Stop PTP Clock driver */
>>>> +   if (chip_id != RCAR_GEN2)
>>>> +   ravb_ptp_stop(ndev);
>>>
>>>
>>>
>>> This is clearly misplaced.
>>
>>
>> It's my fault.
>
>
>    Should we expect a new patch fixing this issue?

Sure, I will do.

>
> [...]
>
>> Regards,
>> Kaneko
>
>
> MBR, Sergei
>

Thanks,
Kaneko


[PATCH 1/4] batman-adv: fix speedy join for DAT cache replies

2015-12-07 Thread Antonio Quartulli
From: Simon Wunderlich 

DAT cache replies are answered on behalf of other clients which are not
connected to the answering originator. Therefore, we shouldn't add these
clients to the answering originator's TT table through speedy join, to
avoid bogus entries.

Reported-by: Alessandro Bolletta 
Signed-off-by: Simon Wunderlich 
Acked-by: Antonio Quartulli 
Signed-off-by: Marek Lindner 
Signed-off-by: Antonio Quartulli 
---
 net/batman-adv/routing.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c
index 8d990b0..3207667 100644
--- a/net/batman-adv/routing.c
+++ b/net/batman-adv/routing.c
@@ -836,6 +836,7 @@ int batadv_recv_unicast_packet(struct sk_buff *skb,
u8 *orig_addr;
struct batadv_orig_node *orig_node = NULL;
int check, hdr_size = sizeof(*unicast_packet);
+   enum batadv_subtype subtype;
bool is4addr;
 
unicast_packet = (struct batadv_unicast_packet *)skb->data;
@@ -863,10 +864,20 @@ int batadv_recv_unicast_packet(struct sk_buff *skb,
/* packet for me */
if (batadv_is_my_mac(bat_priv, unicast_packet->dest)) {
if (is4addr) {
-   batadv_dat_inc_counter(bat_priv,
-  unicast_4addr_packet->subtype);
-   orig_addr = unicast_4addr_packet->src;
-   orig_node = batadv_orig_hash_find(bat_priv, orig_addr);
+   subtype = unicast_4addr_packet->subtype;
+   batadv_dat_inc_counter(bat_priv, subtype);
+
+   /* Only payload data should be considered for speedy
+* join. For example, DAT also uses unicast 4addr
+* types, but those packets should not be considered
+* for speedy join, since the clients do not actually
+* reside at the sending originator.
+*/
+   if (subtype == BATADV_P_DATA) {
+   orig_addr = unicast_4addr_packet->src;
+   orig_node = batadv_orig_hash_find(bat_priv,
+ orig_addr);
+   }
}
 
if (batadv_dat_snoop_incoming_arp_request(bat_priv, skb,
-- 
2.6.3
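
The gating logic of the patch above can be illustrated with a minimal
userspace sketch. The enum values and the helper name
speedy_join_allowed() are hypothetical and are not batman-adv
identifiers; the sketch only assumes that a 4addr unicast packet should
qualify for speedy join when it carries payload data.

```c
#include <stdbool.h>

/* Hypothetical subtypes of a 4addr unicast packet (values illustrative). */
enum pkt_subtype {
	SUBTYPE_DATA,            /* regular payload traffic */
	SUBTYPE_DAT_CACHE_REPLY, /* ARP reply answered from a DAT cache */
};

/*
 * Return true if the sender may be recorded as the client's originator.
 * Packets without a 4addr header carry no subtype and never qualify here.
 */
static bool speedy_join_allowed(bool is4addr, enum pkt_subtype subtype)
{
	return is4addr && subtype == SUBTYPE_DATA;
}
```

With this check, a DAT cache reply no longer causes the replying node to
be looked up as the originator of the client it answered for.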



[PATCH 3/4] batman-adv: fix erroneous client entry duplicate detection

2015-12-07 Thread Antonio Quartulli
From: Marek Lindner 

The translation table implementation, namely batadv_compare_tt(),
is used to compare two client entries and decide whether they hold
the same information. Each client entry is identified by
its MAC address and its VLAN id (VID).
Consequently, batadv_compare_tt() has to compare not only the MAC
addresses but also the VIDs.

Without this fix, adding a new client entry that has the same
MAC address as another client but operates on a different VID will
fail because both client entries will be considered identical.

Signed-off-by: Marek Lindner 
Signed-off-by: Antonio Quartulli 
---
 net/batman-adv/translation-table.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/batman-adv/translation-table.c 
b/net/batman-adv/translation-table.c
index a3fc9033..76f19ba 100644
--- a/net/batman-adv/translation-table.c
+++ b/net/batman-adv/translation-table.c
@@ -68,13 +68,15 @@ static void batadv_tt_global_del(struct batadv_priv 
*bat_priv,
 unsigned short vid, const char *message,
 bool roaming);
 
-/* returns 1 if they are the same mac addr */
+/* returns 1 if they are the same mac addr and vid */
 static int batadv_compare_tt(const struct hlist_node *node, const void *data2)
 {
const void *data1 = container_of(node, struct batadv_tt_common_entry,
 hash_entry);
+   const struct batadv_tt_common_entry *tt1 = data1;
+   const struct batadv_tt_common_entry *tt2 = data2;
 
-   return batadv_compare_eth(data1, data2);
+   return (tt1->vid == tt2->vid) && batadv_compare_eth(data1, data2);
 }
 
 /**
-- 
2.6.3
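
The effect of the fix above can be sketched in plain userspace C. The
struct and function names below (tt_entry, tt_same_mac, tt_same_client)
are hypothetical stand-ins, not the batman-adv definitions; the sketch
only assumes that a client entry is keyed by a 6-byte MAC address plus a
VLAN id.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6 /* length of an Ethernet MAC address in bytes */

/* Hypothetical, simplified client entry: MAC address plus VLAN id. */
struct tt_entry {
	uint8_t addr[MAC_LEN];
	uint16_t vid;
};

/* Old behaviour: only the MAC address is compared. */
static bool tt_same_mac(const struct tt_entry *a, const struct tt_entry *b)
{
	return memcmp(a->addr, b->addr, MAC_LEN) == 0;
}

/* Fixed behaviour: entries match only if both VID and MAC are equal. */
static bool tt_same_client(const struct tt_entry *a, const struct tt_entry *b)
{
	return a->vid == b->vid && tt_same_mac(a, b);
}
```

Two entries with the same MAC address on VID 10 and VID 20 are treated
as duplicates by tt_same_mac() but as distinct clients by
tt_same_client().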



[PATCH 2/4] batman-adv: avoid keeping false temporary entry

2015-12-07 Thread Antonio Quartulli
From: Simon Wunderlich 

In the case when a temporary entry is added first and a proper tt entry
is added after that, the temporary tt entry is kept in the orig list.
However, the temporary flag is removed at this point, and therefore the
purge function cannot find this temporary entry anymore.

Therefore, remove the previous temp entry before adding the new proper
one.

This case can happen if a client behind a given originator moves before
the TT announcement is sent out. It can also be
triggered by bogus or malicious payload frames for VLANs which do not
exist on the sending originator.

Reported-by: Alessandro Bolletta 
Signed-off-by: Simon Wunderlich 
Acked-by: Antonio Quartulli 
Signed-off-by: Marek Lindner 
Signed-off-by: Antonio Quartulli 
---
 net/batman-adv/translation-table.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/batman-adv/translation-table.c 
b/net/batman-adv/translation-table.c
index 4228b10..a3fc9033 100644
--- a/net/batman-adv/translation-table.c
+++ b/net/batman-adv/translation-table.c
@@ -1427,9 +1427,15 @@ static bool batadv_tt_global_add(struct batadv_priv 
*bat_priv,
}
 
/* if the client was temporary added before receiving the first
-* OGM announcing it, we have to clear the TEMP flag
+* OGM announcing it, we have to clear the TEMP flag. Also,
+* remove the previous temporary orig node and re-add it
+* if required. If the orig entry changed, the new one which
+* is a non-temporary entry is preferred.
 */
-   common->flags &= ~BATADV_TT_CLIENT_TEMP;
+   if (common->flags & BATADV_TT_CLIENT_TEMP) {
+   batadv_tt_global_del_orig_list(tt_global_entry);
+   common->flags &= ~BATADV_TT_CLIENT_TEMP;
+   }
 
/* the change can carry possible "attribute" flags like the
 * TT_CLIENT_WIFI, therefore they have to be copied in the
-- 
2.6.3
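
The ordering that the patch above introduces can be shown with a small
userspace sketch. All names here (CLIENT_TEMP, global_entry,
promote_entry) are hypothetical, and the originator list is reduced to a
counter; the only assumption is that stale originators must be dropped
while the entry is still marked temporary.

```c
#include <stdbool.h>
#include <stdint.h>

#define CLIENT_TEMP (1u << 0) /* hypothetical "temporary entry" flag bit */

/* Hypothetical global entry; orig_count stands in for the orig list. */
struct global_entry {
	uint32_t flags;
	unsigned int orig_count;
};

/*
 * Promote a temporary entry to a proper one: drop the originators that
 * were recorded while it was temporary, then clear the TEMP flag.
 * Returns true if the entry was temporary.
 */
static bool promote_entry(struct global_entry *e)
{
	if (!(e->flags & CLIENT_TEMP))
		return false;

	e->orig_count = 0;        /* mirrors deleting the old orig list */
	e->flags &= ~CLIENT_TEMP; /* other flag bits are left untouched */
	return true;
}
```

After promotion the caller would add the announcing originator again,
so only the non-temporary one remains attached to the entry.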



[PATCH 4/4] batman-adv: Fix invalid stack access in batadv_dat_select_candidates

2015-12-07 Thread Antonio Quartulli
From: Sven Eckelmann 

batadv_dat_select_candidates passes a pointer to a u32 to batadv_hash_dat,
but batadv_hash_dat needs a batadv_dat_entry with at least ip and vid
filled in.

Fixes: 3e26722bc9f2 ("batman-adv: make the Distributed ARP Table vlan aware")
Signed-off-by: Sven Eckelmann 
Acked-by: Antonio Quartulli 
Signed-off-by: Marek Lindner 
Signed-off-by: Antonio Quartulli 
---
 net/batman-adv/distributed-arp-table.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/batman-adv/distributed-arp-table.c 
b/net/batman-adv/distributed-arp-table.c
index 83bc1aa..a49c705 100644
--- a/net/batman-adv/distributed-arp-table.c
+++ b/net/batman-adv/distributed-arp-table.c
@@ -566,6 +566,7 @@ batadv_dat_select_candidates(struct batadv_priv *bat_priv, 
__be32 ip_dst)
int select;
batadv_dat_addr_t last_max = BATADV_DAT_ADDR_MAX, ip_key;
struct batadv_dat_candidate *res;
+   struct batadv_dat_entry dat;
 
if (!bat_priv->orig_hash)
return NULL;
@@ -575,7 +576,9 @@ batadv_dat_select_candidates(struct batadv_priv *bat_priv, 
__be32 ip_dst)
if (!res)
return NULL;
 
-   ip_key = (batadv_dat_addr_t)batadv_hash_dat(&ip_dst,
+   dat.ip = ip_dst;
+   dat.vid = 0;
+   ip_key = (batadv_dat_addr_t)batadv_hash_dat(&dat,
BATADV_DAT_ADDR_MAX);
 
batadv_dbg(BATADV_DBG_DAT, bat_priv,
-- 
2.6.3
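
The calling pattern of the fix above can be sketched as follows. The
type dat_key and the functions dat_hash() and dat_hash_ip() are
hypothetical, and the hash itself is a toy; the sketch only assumes a
hash callback that takes an untyped pointer and reads both an IP address
and a VLAN id through it.

```c
#include <stdint.h>

/* Hypothetical key type: the hash reads both members. */
struct dat_key {
	uint32_t ip;
	uint16_t vid;
};

/* Toy hash over ip and vid; not the batman-adv hash function. */
static uint32_t dat_hash(const void *data, uint32_t size)
{
	const struct dat_key *key = data;
	uint32_t hash = key->ip * 2654435761u;

	hash ^= key->vid;
	return hash % size;
}

/* Build a complete key on the stack instead of passing &ip directly. */
static uint32_t dat_hash_ip(uint32_t ip, uint32_t size)
{
	struct dat_key key = { .ip = ip, .vid = 0 };

	return dat_hash(&key, size);
}
```

Passing the address of a bare 32-bit value to dat_hash() would make it
read the vid member from memory beyond that value, which is the kind of
invalid stack access the patch describes.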



Re: [PATCH v2 4/5] stmmac: Add ptp debugfs entry.

2015-12-07 Thread Phil Reid

On 7/12/2015 5:03 PM, Arnd Bergmann wrote:

On Monday 07 December 2015 09:38:43 Phil Reid wrote:

This adds a debugfs entry to view the current status of the ptp
registers.

Signed-off-by: Phil Reid 



Your description should explain what this is good for. Why do you
need to look at this through debugfs?



Happy to drop this one. I found it helpful when debugging the ptp
behaviour, as it allows quick access to monitor the register updates
made by the ptp driver. Is there an alternative method that is preferred?

--
Regards
Phil Reid




Re: use-after-free in ip6_xmit

2015-12-07 Thread Dmitry Vyukov
Yes, it seems to be fixed on master of
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git. I just
can't pull in all fixes from all trees. Sorry.
When will it be merged into Linus's tree?



On Mon, Dec 7, 2015 at 3:39 PM, Eric Dumazet  wrote:
> On Mon, 2015-12-07 at 06:36 -0800, Eric Dumazet wrote:
>
>
>> Thanks
>>
>
> Also note that Dave Jones reported a SCTP problem fixed by :
>
> https://patchwork.ozlabs.org/patch/553068/
>
>


[PATCH 4.3 078/125] can: Use correct type in sizeof() in nla_put()

2015-12-07 Thread Greg Kroah-Hartman
4.3-stable review patch.  If anyone has any objections, please let me know.

--

From: Marek Vasut 

commit 562b103a21974c2f9cd67514d110f918bb3e1796 upstream.

The sizeof() is invoked on an incorrect variable, likely due to some
copy-paste error, and this might result in memory corruption. Fix this.

Signed-off-by: Marek Vasut 
Cc: Wolfgang Grandegger 
Cc: netdev@vger.kernel.org
Signed-off-by: Marc Kleine-Budde 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/net/can/dev.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -915,7 +915,7 @@ static int can_fill_info(struct sk_buff
 nla_put(skb, IFLA_CAN_BITTIMING_CONST,
 sizeof(*priv->bittiming_const), priv->bittiming_const)) ||
 
-   nla_put(skb, IFLA_CAN_CLOCK, sizeof(cm), &priv->clock) ||
+   nla_put(skb, IFLA_CAN_CLOCK, sizeof(priv->clock), &priv->clock) ||
nla_put_u32(skb, IFLA_CAN_STATE, state) ||
nla_put(skb, IFLA_CAN_CTRLMODE, sizeof(cm), &cm) ||
nla_put_u32(skb, IFLA_CAN_RESTART_MS, priv->restart_ms) ||
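
The class of bug fixed above can be sketched in userspace C. The structs
clock_info and ctrl_mode and the helpers put_attr() and put_clock() are
hypothetical; put_attr() is only a stand-in for an attribute writer that
copies a caller-supplied number of bytes, and is not the netlink API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for two differently sized attribute payloads. */
struct clock_info { uint32_t freq; };
struct ctrl_mode  { uint32_t mask; uint32_t flags; };

static_assert(sizeof(struct clock_info) != sizeof(struct ctrl_mode),
	      "the sketch relies on the two payloads differing in size");

/*
 * Stand-in for an attribute writer: copies len bytes from data into buf
 * and returns the number of bytes copied.
 */
static size_t put_attr(uint8_t *buf, size_t len, const void *data)
{
	memcpy(buf, data, len);
	return len;
}

/* Correct pattern: take sizeof from the object that is actually passed. */
static size_t put_clock(uint8_t *buf, const struct clock_info *clk)
{
	return put_attr(buf, sizeof(*clk), clk);
}
```

In this sketch, passing sizeof(struct ctrl_mode) together with a pointer
to a clock_info would make the copy read past the end of the clock data,
because the length and the pointer describe different objects.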




Re: [RFC PATCH net] vxlan: fix incorrect RCO bit in VXLAN header

2015-12-07 Thread Thomas Graf
On 12/04/15 at 01:54pm, Jiri Benc wrote:
> Commit 3511494ce2f3d ("vxlan: Group Policy extension") changed definition of
> VXLAN_HF_RCO from 0x0020 to BIT(24). This is obviously incorrect. It's
> also in violation with the RFC draft.
> 
> Fixes: 3511494ce2f3d ("vxlan: Group Policy extension")
> Cc: Thomas Graf 
> Cc: Tom Herbert 
> Signed-off-by: Jiri Benc 

Thanks for fixing this up Jiri. Sorry about the mess Tom.


Re: [PATCH net-next 2/2] net: hns: enet specisies a reference to dsaf (config and documents)

2015-12-07 Thread Rob Herring
On Sat, Dec 05, 2015 at 03:59:16PM +0800, yankejian wrote:
> when enet specisies a reference to dsaf, the correlative config and

s/when/When/

> documents needs to update. this patch updates the correlative dtsi file

s/this/This/

> and bindings documents .
^
extra space

This change breaks compatibility with old dtbs. IIRC, this is all new, 
so maybe it doesn't matter, but you should be explicit that you are 
doing that.


> 
> Signed-off-by: yankejian 
> ---
>  .../devicetree/bindings/net/hisilicon-hns-dsaf.txt|  5 +
>  .../devicetree/bindings/net/hisilicon-hns-nic.txt |  7 ---
>  arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi  | 19 
> +--
>  3 files changed, 14 insertions(+), 17 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt 
> b/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
> index 80411b2..ecacfa4 100644
> --- a/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
> +++ b/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
> @@ -4,8 +4,6 @@ Required properties:
>  - compatible: should be "hisilicon,hns-dsaf-v1" or "hisilicon,hns-dsaf-v2".
>"hisilicon,hns-dsaf-v1" is for hip05.
>"hisilicon,hns-dsaf-v2" is for Hi1610 and Hi1612.
> -- dsa-name: dsa fabric name who provide this interface.
> -  should be "dsafX", X is the dsaf id.
>  - mode: dsa fabric mode string. only support one of dsaf modes like these:
>   "2port-64vf",
>   "6port-16rss",
> @@ -26,9 +24,8 @@ Required properties:
>  
>  Example:
>  
> -dsa: dsa@c700 {
> +dsaf0: dsa@c700 {
>   compatible = "hisilicon,hns-dsaf-v1";
> - dsa_name = "dsaf0";
>   mode = "6port-16rss";
>   interrupt-parent = <_dsa>;
>   reg = <0x0 0xC000 0x0 0x42
> diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt 
> b/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
> index 41d19be..e6a9d1c 100644
> --- a/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
> +++ b/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
> @@ -4,8 +4,9 @@ Required properties:
>  - compatible: "hisilicon,hns-nic-v1" or "hisilicon,hns-nic-v2".
>"hisilicon,hns-nic-v1" is for hip05.
>"hisilicon,hns-nic-v2" is for Hi1610 and Hi1612.
> -- ae-name: accelerator name who provides this interface,
> -  is simply a name referring to the name of name in the accelerator node.
> +- ae-handle: accelerator engine handle for hns,
> +  specifies a reference to the associating hardware driver node.
> +  see Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
>  - port-id: is the index of port provided by DSAF (the accelerator). DSAF can
>connect to 8 PHYs. Port 0 to 1 are both used for adminstration purpose. 
> They
>are called debug ports.
> @@ -41,7 +42,7 @@ Example:
>  
>   ethernet@0{
>   compatible = "hisilicon,hns-nic-v1";
> - ae-name = "dsaf0";
> + ae-handle = <>;
>   port-id = <0>;
>   local-mac-address = [a2 14 e4 4b 56 76];
>   };
> diff --git a/arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi 
> b/arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi
> index 606dd5a..89c883e 100644
> --- a/arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi
> +++ b/arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi
> @@ -23,9 +23,8 @@ soc0: soc@0 {
>   };
>   };
>  
> - dsa: dsa@c700 {
> + dsaf0: dsa@c700 {
>   compatible = "hisilicon,hns-dsaf-v1";
> - dsa_name = "dsaf0";
>   mode = "6port-16rss";
>   interrupt-parent = <_dsa>;
>  
> @@ -127,7 +126,7 @@ soc0: soc@0 {
>  
>   eth0: ethernet@0{
>   compatible = "hisilicon,hns-nic-v1";
> - ae-name = "dsaf0";
> + ae-handle = <>;
>   port-id = <0>;
>   local-mac-address = [00 00 00 01 00 58];
>   status = "disabled";
> @@ -135,14 +134,14 @@ soc0: soc@0 {
>   };
>   eth1: ethernet@1{
>   compatible = "hisilicon,hns-nic-v1";
> - ae-name = "dsaf0";
> + ae-handle = <>;
>   port-id = <1>;
>   status = "disabled";
>   dma-coherent;
>   };
>   eth2: ethernet@2{
>   compatible = "hisilicon,hns-nic-v1";
> - ae-name = "dsaf0";
> + ae-handle = <>;
>   port-id = <2>;
>   local-mac-address = [00 00 00 01 00 5a];
>   status = "disabled";
> @@ -150,7 +149,7 @@ soc0: soc@0 {
>   };
>   eth3: ethernet@3{
>   compatible = "hisilicon,hns-nic-v1";
> - ae-name = "dsaf0";
> + ae-handle = <>;
>   port-id = <3>;
>   local-mac-address = [00 00 00 01 00 5b];
>   status = "disabled";
> @@ -158,7 +157,7 @@ soc0: soc@0 {
>   };
>

[PATCH 4.1 62/95] can: Use correct type in sizeof() in nla_put()

2015-12-07 Thread Greg Kroah-Hartman
4.1-stable review patch.  If anyone has any objections, please let me know.

--

From: Marek Vasut 

commit 562b103a21974c2f9cd67514d110f918bb3e1796 upstream.

The sizeof() is invoked on an incorrect variable, likely due to some
copy-paste error, and this might result in memory corruption. Fix this.

Signed-off-by: Marek Vasut 
Cc: Wolfgang Grandegger 
Cc: netdev@vger.kernel.org
Signed-off-by: Marc Kleine-Budde 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/net/can/dev.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -915,7 +915,7 @@ static int can_fill_info(struct sk_buff
 nla_put(skb, IFLA_CAN_BITTIMING_CONST,
 sizeof(*priv->bittiming_const), priv->bittiming_const)) ||
 
-   nla_put(skb, IFLA_CAN_CLOCK, sizeof(cm), &priv->clock) ||
+   nla_put(skb, IFLA_CAN_CLOCK, sizeof(priv->clock), &priv->clock) ||
nla_put_u32(skb, IFLA_CAN_STATE, state) ||
nla_put(skb, IFLA_CAN_CTRLMODE, sizeof(cm), &cm) ||
nla_put_u32(skb, IFLA_CAN_RESTART_MS, priv->restart_ms) ||




Re: [PATCH RESEND net-next 3/3] arm64: hip05-d02: Document devicetree bindings for Hisilicon D02 Board

2015-12-07 Thread Bintian

On 2015/12/7 21:16, Rob Herring wrote:

On Sat, Dec 05, 2015 at 03:54:48PM +0800, yankejian wrote:

This patch adds documentation for the devicetree bindings used by the
DT files of Hisilicon Hip05-D02 development board.

Signed-off-by: yankejian 

You may need to configure as  "Kejian Yan " :)

BR,

Bintian

---
  .../devicetree/bindings/arm/hisilicon/hisilicon.txt  | 16 
  1 file changed, 16 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt 
b/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt
index 6ac7c00..5318d78 100644
--- a/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt
+++ b/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt
@@ -187,6 +187,22 @@ Example:
reg = <0xb000 0x1>;
};
  
+Hisilicon HiP05 PERISUB system controller

+
+Required properties:
+- compatible : "hisilicon,peri-c-subctrl", "syscon";

This should be more specific and have the SOC name in it.


+- reg : Register address and size
+
+The HiP05 PERISUB system controller is shared by peripheral controllers in
+HiP05 Soc to implement some basic configurations. the peripheral
+ controllers include mdio, ddr, iic, uart, timer and so on.
+
+Example:
+   /* for HiP05 PCIe-SAS system */
+   pcie_sas: system_controller@0xb000 {
+   compatible = "hisilicon,pcie-sas-subctrl", "syscon";

The example doesn't match.


+   reg = <0xb000 0x1>;
+   };
  ---
  Hisilicon CPU controller
  
--

1.9.1

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

.






Re: [PATCH] I.MX6: Fix Ethernet PHY mode on Ventana boards

2015-12-07 Thread Tim Harvey
On Mon, Dec 7, 2015 at 4:56 AM, Krzysztof Hałasa  wrote:
> Gateworks Ventana boards seem to need "RGMII-ID" (internal delay)
> PHY mode, instead of simple "RGMII", for their Marvell 88E1510
> transceiver. Otherwise, the Ethernet MAC doesn't work with Marvell PHY
> driver (TX doesn't seem to work correctly).
>
> Tested on GW5400 rev. C.
>
> This bug affects ARM Fedora 23.
>
> Signed-off-by: Krzysztof Hałasa 
>
> diff --git a/arch/arm/boot/dts/imx6q-gw5400-a.dts 
> b/arch/arm/boot/dts/imx6q-gw5400-a.dts
> index 822ffb2..6c168dc 100644
> --- a/arch/arm/boot/dts/imx6q-gw5400-a.dts
> +++ b/arch/arm/boot/dts/imx6q-gw5400-a.dts
> @@ -154,7 +154,7 @@
>   {
> pinctrl-names = "default";
> pinctrl-0 = <_enet>;
> -   phy-mode = "rgmii";
> +   phy-mode = "rgmii-id";
> phy-reset-gpios = < 30 GPIO_ACTIVE_HIGH>;
> status = "okay";
>  };
> diff --git a/arch/arm/boot/dts/imx6qdl-gw51xx.dtsi 
> b/arch/arm/boot/dts/imx6qdl-gw51xx.dtsi
> index f2867c4..90496aa 100644
> --- a/arch/arm/boot/dts/imx6qdl-gw51xx.dtsi
> +++ b/arch/arm/boot/dts/imx6qdl-gw51xx.dtsi
> @@ -94,7 +94,7 @@
>   {
> pinctrl-names = "default";
> pinctrl-0 = <_enet>;
> -   phy-mode = "rgmii";
> +   phy-mode = "rgmii-id";
> phy-reset-gpios = < 30 GPIO_ACTIVE_LOW>;
> status = "okay";
>  };
> diff --git a/arch/arm/boot/dts/imx6qdl-gw52xx.dtsi 
> b/arch/arm/boot/dts/imx6qdl-gw52xx.dtsi
> index 4493f6e..0a6730b 100644
> --- a/arch/arm/boot/dts/imx6qdl-gw52xx.dtsi
> +++ b/arch/arm/boot/dts/imx6qdl-gw52xx.dtsi
> @@ -154,7 +154,7 @@
>   {
> pinctrl-names = "default";
> pinctrl-0 = <_enet>;
> -   phy-mode = "rgmii";
> +   phy-mode = "rgmii-id";
> phy-reset-gpios = < 30 GPIO_ACTIVE_LOW>;
> status = "okay";
>  };
> diff --git a/arch/arm/boot/dts/imx6qdl-gw53xx.dtsi 
> b/arch/arm/boot/dts/imx6qdl-gw53xx.dtsi
> index cfad214..2c549ad 100644
> --- a/arch/arm/boot/dts/imx6qdl-gw53xx.dtsi
> +++ b/arch/arm/boot/dts/imx6qdl-gw53xx.dtsi
> @@ -174,7 +174,7 @@
>   {
> pinctrl-names = "default";
> pinctrl-0 = <_enet>;
> -   phy-mode = "rgmii";
> +   phy-mode = "rgmii-id";
> phy-reset-gpios = < 30 GPIO_ACTIVE_LOW>;
> status = "okay";
>  };
> diff --git a/arch/arm/boot/dts/imx6qdl-gw54xx.dtsi 
> b/arch/arm/boot/dts/imx6qdl-gw54xx.dtsi
> index 535b536..b4ea087 100644
> --- a/arch/arm/boot/dts/imx6qdl-gw54xx.dtsi
> +++ b/arch/arm/boot/dts/imx6qdl-gw54xx.dtsi
> @@ -164,7 +164,7 @@
>   {
> pinctrl-names = "default";
> pinctrl-0 = <_enet>;
> -   phy-mode = "rgmii";
> +   phy-mode = "rgmii-id";
> phy-reset-gpios = < 30 GPIO_ACTIVE_LOW>;
> status = "okay";
>  };
>
> --
> Krzysztof Halasa
>
> Industrial Research Institute for Automation and Measurements PIAP
> Al. Jerozolimskie 202, 02-486 Warsaw, Poland

Krzysztof,

It sounds like you're saying this controls whether the PHY is in charge
of the delay vs the MAC. I have never needed to set this and haven't found
where it's actually used (in at least 4.3). Is this caused by something
new in the kernel I haven't seen yet, or is it possible you have a board
that has an Ethernet issue?

Regards,

Tim

Tim Harvey - Principal Software Engineer
Gateworks Corporation - http://www.gateworks.com/
3026 S. Higuera St. San Luis Obispo CA 93401
805-781-2000


Re: [PATCH v2 5/5] stmmac: socfpga: Provide dt node to config ptp clk source.

2015-12-07 Thread Arnd Bergmann
On Monday 07 December 2015 21:34:29 Phil Reid wrote:
> On 7/12/2015 5:05 PM, Arnd Bergmann wrote:
> > On Monday 07 December 2015 09:38:44 Phil Reid wrote:
> >> Signed-off-by: Phil Reid 
> >> ---
> >>   Documentation/devicetree/bindings/net/socfpga-dwmac.txt | 2 ++
> >>   drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c | 9 +
> >>   2 files changed, 11 insertions(+)
> >>
> >> diff --git a/Documentation/devicetree/bindings/net/socfpga-dwmac.txt 
> >> b/Documentation/devicetree/bindings/net/socfpga-dwmac.txt
> >> index 3a9d679..72d82d6 100644
> >> --- a/Documentation/devicetree/bindings/net/socfpga-dwmac.txt
> >> +++ b/Documentation/devicetree/bindings/net/socfpga-dwmac.txt
> >> @@ -11,6 +11,8 @@ Required properties:
> >>designware version numbers documented in stmmac.txt
> >>- altr,sysmgr-syscon : Should be the phandle to the system manager node 
> >> that
> >>  encompasses the glue register, the register offset, and the register 
> >> shift.
> >> + - altr,f2h_ptp_ref_clk use f2h_ptp_ref_clk instead of default eosc1 clock
> >> +   for ptp ref clk. This affects all emacs as the clock is common.
> >>
> >
> > Is this feature specific to the Altera glue logic, or would it be possible
> > to do the same thing on another dwmac implementation?
> >
> I think it is specific to Altera's glue logic. It selects either a clock 
> connected
> directly to the ARM HPS core or a clock routed from Altera FPGA fabric.
> Control register is in the altera sysmgr.
> 
> 

Ok, makes sense.

Arnd


Re: [PATCH v2 5/5] stmmac: socfpga: Provide dt node to config ptp clk source.

2015-12-07 Thread Phil Reid

On 7/12/2015 7:59 PM, Sergei Shtylyov wrote:

On 12/07/2015 04:38 AM, Phil Reid wrote:

+if(dwmac->f2h_ptp_ref_clk)


Please run your patches thru scripts/checkpatch.pl (space needed after 
*if*).

[...]

MBR, Sergei



Will do.

--
Regards
Phil Reid



pull request [net]: batman-adv 20151207

2015-12-07 Thread Antonio Quartulli
Hello David,

long time no see :)

I know it starts to be a bit late in the release cycle, but I think that
these 4 small bug-fixes are still worth being merged.

Patch 1 fixes a compatibility issue between our Distributed ARP Table
mechanism and the "early client detection" feature. This issue creates
an inconsistency in the global translation table, leading to user clients
not being reachable anymore. Fix provided by Simon Wunderlich.

Patch 2 is again provided by Simon Wunderlich and fixes another bug
related to the "early client detection" feature. The fix consists of
ensuring that temporary client entries not claimed by any originator
are properly purged instead of being kept until
shutdown.

Patch 3 fixes the function used by the TT hash tables to detect
duplicate entries. At the moment two clients with the same MAC address
but lying in different VLANs are considered the same, while this should
not be the case. Bugfix by Marek Lindner.

Patch 4 fixes an invalid stack access in the batadv_dat_select_candidates()
function where a u32 pointer is accessed as if it were a pointer to a
batadv_dat_entry struct (larger than u32). Bugfix provided by Sven
Eckelmann.


Please pull or let me know of any problem!

Thanks a lot,
Antonio


The following changes since commit 326fcfa5acca446b3f71e99f6d19881145556e5c:

  net: remove unnecessary semicolon in netdev_alloc_pcpu_stats() (2015-12-06 
22:32:32 -0500)

are available in the git repository at:

  git://git.open-mesh.org/linux-merge.git tags/batman-adv-fix-for-davem

for you to fetch changes up to b7fe3d4f4a65bc675e737d88071300ea9c4bcddd:

  batman-adv: Fix invalid stack access in batadv_dat_select_candidates 
(2015-12-07 22:40:21 +0800)


Included changes:
- prevent compatibility issue between DAT and speedy join from creating
  inconsistencies in the global translation table
- make sure temporary TT entries are purged out if not claimed
- fix comparison function used for TT hash table
- fix invalid stack access in batadv_dat_select_candidates()


Marek Lindner (1):
  batman-adv: fix erroneous client entry duplicate detection

Simon Wunderlich (2):
  batman-adv: fix speedy join for DAT cache replies
  batman-adv: avoid keeping false temporary entry

Sven Eckelmann (1):
  batman-adv: Fix invalid stack access in batadv_dat_select_candidates

 net/batman-adv/distributed-arp-table.c |  5 -
 net/batman-adv/routing.c   | 19 +++
 net/batman-adv/translation-table.c | 16 
 3 files changed, 31 insertions(+), 9 deletions(-)


Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation

2015-12-07 Thread Thomas Graf
On 12/05/15 at 03:06pm, Herbert Xu wrote:
> On Fri, Dec 04, 2015 at 07:15:55PM +0100, Phil Sutter wrote:
> >
> > > Only one should really do this, while others are waiting.
> > 
> > Sure, that was my previous understanding of how this thing works.
> 
> Yes that's clearly how it should be.  Unfortunately while adding
> the locking to do this, I found out that you can't actually call
> __vmalloc with BH disabled so this is a no-go.
> 
> Unless we can make __vmalloc work with BH disabled, I guess we'll
> have to go back to multi-level lookups unless someone has a better
> suggestion.

Thanks for fixing the race.

As for the remaining problem, I think we'll have to find a way to
serve a hard pounding user if we want to convert TCP hashtables
later on.

Did you look into what prevents __vmalloc from working with BH disabled?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH V2 0/3] IXGBE/VFIO: Add live migration support for SRIOV NIC

2015-12-07 Thread Lan, Tianyu

On 12/5/2015 1:07 AM, Alexander Duyck wrote:


We still need to support Windows guest for migration and this is why our
patches keep all changes in the driver since it's impossible to change
Windows kernel.


That is a poor argument.  I highly doubt Microsoft is interested in
having to modify all of the drivers that will support direct assignment
in order to support migration.  They would likely request something
similar to what I have in that they will want a way to do DMA tracking
with minimal modification required to the drivers.


This totally depends on the vendors of the NIC or other devices; they
should decide whether to support migration or not. If yes, they would
modify the driver.

If the target is just to call suspend/resume during migration, the feature
will be meaningless. In most cases we don't want migration to affect the
user much, so the service downtime is vital. Our target is to apply
SRIOV NIC passthrough to cloud service and NFV (network functions
virtualization) projects, which are sensitive to network performance
and stability. In my opinion, we should give the device driver a chance
to implement its own migration job, and call the suspend and resume
callbacks in the driver if it doesn't care about performance during
migration.




Following is my idea to do DMA tracking.

Inject event to VF driver after memory iterate stage
and before stop VCPU and then VF driver marks dirty all
using DMA memory. The new allocated pages also need to
be marked dirty before stopping VCPU. All dirty memory
in this time slot will be migrated until stop-and-copy
stage. We also need to make sure to disable VF via clearing the
bus master enable bit for VF before migrating these memory.


The ordering of your explanation here doesn't quite work.  What needs to
happen is that you have to disable DMA and then mark the pages as dirty.
  What the disabling of the BME does is signal to the hypervisor that
the device is now stopped.  The ixgbevf_suspend call already supported
by the driver is almost exactly what is needed to take care of something
like this.


This is why I hope to reserve a piece of space in the DMA page to do a
dummy write. This can help to mark the page dirty without requiring DMA
to be stopped and without racing with DMA data.


If we can't do that, we have to stop DMA for a short time to mark all DMA
pages dirty and then re-enable it. I am not sure how much we can gain
this way by tracking all DMA memory with the device running during
migration. I need to do some tests and compare the results with stopping
DMA directly at the last stage of migration.




The question is how we would go about triggering it.  I really don't
think the PCI configuration space approach is the right idea.  I wonder
if we couldn't get away with some sort of ACPI event instead.  We
already require ACPI support in order to shut down the system
gracefully, I wonder if we couldn't get away with something similar in
order to suspend/resume the direct assigned devices gracefully.



I don't think there are such events in the current spec.
Besides, there are two kinds of suspend/resume callbacks:
1) System suspend/resume, called during S2RAM and S2DISK.
2) Runtime suspend/resume, called by the PM core when the device is idle.
If you want to do what you mentioned, you have to change the PM core and
the ACPI spec.


The dma page allocated by VF driver also needs to reserve space
to do dummy write.


No, this will not work.  If for example you have a VF driver allocating
memory for a 9K receive how will that work?  It isn't as if you can poke
a hole in the contiguous memory.



[PATCH] wlcore/wl12xx: spi: fix oops on firmware load

2015-12-07 Thread Uri Mashiach
The maximum number of chunks used by the function is
(SPI_AGGR_BUFFER_SIZE / WSPI_MAX_CHUNK_SIZE + 1).
The original commands array had space for
(SPI_AGGR_BUFFER_SIZE / WSPI_MAX_CHUNK_SIZE) commands.
When the last chunk is used (len > 4 * WSPI_MAX_CHUNK_SIZE), the last
command is stored outside the bounds of the commands array.
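
The arithmetic above can be checked with a small userspace sketch. It is
only an illustration: PAGE_SIZE = 4096 and WSPI_MAX_CHUNK_SIZE = 4092 are
assumed values that this message does not state, and count_spi_commands()
is a hypothetical helper that mirrors the chunking pattern described (one
command per full chunk, plus one for the remainder), not the driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration; not taken from this message. */
#define DEMO_PAGE_SIZE            4096u
#define DEMO_SPI_AGGR_BUFFER_SIZE (4u * DEMO_PAGE_SIZE)
#define DEMO_WSPI_MAX_CHUNK_SIZE  4092u

/* Old and new array sizes, following the two formulas in the text. */
#define DEMO_OLD_MAX_CHUNKS \
	(DEMO_SPI_AGGR_BUFFER_SIZE / DEMO_WSPI_MAX_CHUNK_SIZE)
#define DEMO_NEW_MAX_CHUNKS (DEMO_OLD_MAX_CHUNKS + 1u)

/*
 * Hypothetical helper: number of command words needed to write len bytes
 * when every chunk but the last carries DEMO_WSPI_MAX_CHUNK_SIZE bytes.
 */
static unsigned int count_spi_commands(size_t len)
{
	unsigned int n = 0;

	while (len > DEMO_WSPI_MAX_CHUNK_SIZE) {
		len -= DEMO_WSPI_MAX_CHUNK_SIZE;
		n++;
	}
	return n + 1; /* the final (possibly short) chunk */
}

/* Self-check: a full buffer overflows the old bound but fits the new one. */
static void demo_check_bounds(void)
{
	assert(count_spi_commands(DEMO_SPI_AGGR_BUFFER_SIZE) > DEMO_OLD_MAX_CHUNKS);
	assert(count_spi_commands(DEMO_SPI_AGGR_BUFFER_SIZE) <= DEMO_NEW_MAX_CHUNKS);
}
```

Under these assumed values the integer division yields 4, while a full
aggregation buffer needs 5 commands, which is the off-by-one the patch
addresses.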

Oops 5 (page fault) is generated during the current wl1271 firmware load
attempt:

root@debian-armhf:~# ifconfig wlan0 up
[  294.312399] Unable to handle kernel paging request at virtual address
00203fc4
[  294.320173] pgd = de528000
[  294.323028] [00203fc4] *pgd=
[  294.326916] Internal error: Oops: 5 [#1] SMP ARM
[  294.331789] Modules linked in: bnep rfcomm bluetooth ipv6 arc4 wl12xx
wlcore mac80211 musb_dsps cfg80211 musb_hdrc usbcore usb_common
wlcore_spi omap_rng rng_core musb_am335x omap_wdt cpufreq_dt thermal_sys
hwmon
[  294.351838] CPU: 0 PID: 1827 Comm: ifconfig Not tainted
4.2.0-2-g3e9ad27-dirty #78
[  294.360154] Hardware name: Generic AM33XX (Flattened Device Tree)
[  294.366557] task: dc9d6d40 ti: de55 task.ti: de55
[  294.372236] PC is at __spi_validate+0xa8/0x2ac
[  294.376902] LR is at __spi_sync+0x78/0x210
[  294.381200] pc : []lr : []psr: 6013
[  294.381200] sp : de551998  ip : de5519d8  fp : 0020
[  294.393242] r10: de551c8c  r9 : de5519d8  r8 : de3a9000
[  294.398730] r7 : de3a9258  r6 : de3a9400  r5 : de551a48  r4 :
00203fbc
[  294.405577] r3 :   r2 :   r1 :   r0 :
de3a9000
[  294.412420] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM
Segment user
[  294.419918] Control: 10c5387d  Table: 9e528019  DAC: 0015
[  294.425954] Process ifconfig (pid: 1827, stack limit = 0xde550218)
[  294.432437] Stack: (0xde551998 to 0xde552000)

...

[  294.883613] [] (__spi_validate) from []
(__spi_sync+0x78/0x210)
[  294.891670] [] (__spi_sync) from []
(wl12xx_spi_raw_write+0xfc/0x148 [wlcore_spi])
[  294.901661] [] (wl12xx_spi_raw_write [wlcore_spi]) from
[] (wlcore_boot_upload_firmware+0x1ec/0x458 [wlcore])
[  294.914038] [] (wlcore_boot_upload_firmware [wlcore]) from
[] (wl12xx_boot+0xc10/0xfac [wl12xx])
[  294.925161] [] (wl12xx_boot [wl12xx]) from []
(wl1271_op_add_interface+0x5b0/0x910 [wlcore])
[  294.936364] [] (wl1271_op_add_interface [wlcore]) from
[] (ieee80211_do_open+0x44c/0xf7c [mac80211])
[  294.947963] [] (ieee80211_do_open [mac80211]) from
[] (__dev_open+0xa8/0x110)
[  294.957307] [] (__dev_open) from []
(__dev_change_flags+0x88/0x148)
[  294.965713] [] (__dev_change_flags) from []
(dev_change_flags+0x18/0x48)
[  294.974576] [] (dev_change_flags) from []
(devinet_ioctl+0x6b4/0x7d0)
[  294.983191] [] (devinet_ioctl) from []
(sock_ioctl+0x1e4/0x2bc)
[  294.991244] [] (sock_ioctl) from []
(do_vfs_ioctl+0x420/0x6b0)
[  294.999208] [] (do_vfs_ioctl) from []
(SyS_ioctl+0x6c/0x7c)
[  295.006880] [] (SyS_ioctl) from []
(ret_fast_syscall+0x0/0x54)
[  295.014835] Code: e1550004 e2444034 0a7d e5953018 (e5942008)
[  295.021544] ---[ end trace 66ed188198f4e24e ]---

Signed-off-by: Uri Mashiach 
---
 drivers/net/wireless/ti/wlcore/spi.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ti/wlcore/spi.c b/drivers/net/wireless/ti/wlcore/spi.c
index f1ac283..720e4e4 100644
--- a/drivers/net/wireless/ti/wlcore/spi.c
+++ b/drivers/net/wireless/ti/wlcore/spi.c
@@ -73,7 +73,10 @@
  */
 #define SPI_AGGR_BUFFER_SIZE (4 * PAGE_SIZE)
 
-#define WSPI_MAX_NUM_OF_CHUNKS (SPI_AGGR_BUFFER_SIZE / WSPI_MAX_CHUNK_SIZE)
+/* Maximum number of SPI write chunks */
+#define WSPI_MAX_NUM_OF_CHUNKS \
+   ((SPI_AGGR_BUFFER_SIZE / WSPI_MAX_CHUNK_SIZE) + 1)
+
 
 struct wl12xx_spi_glue {
struct device *dev;
@@ -268,9 +271,10 @@ static int __must_check wl12xx_spi_raw_write(struct device *child, int addr,
 void *buf, size_t len, bool fixed)
 {
struct wl12xx_spi_glue *glue = dev_get_drvdata(child->parent);
-   struct spi_transfer t[2 * (WSPI_MAX_NUM_OF_CHUNKS + 1)];
+   /* SPI write buffers - 2 for each chunk */
+   struct spi_transfer t[2 * WSPI_MAX_NUM_OF_CHUNKS];
struct spi_message m;
-   u32 commands[WSPI_MAX_NUM_OF_CHUNKS];
+   u32 commands[WSPI_MAX_NUM_OF_CHUNKS]; /* 1 command per chunk */
u32 *cmd;
u32 chunk_len;
int i;
-- 
2.5.0



Re: use-after-free in ip6_xmit

2015-12-07 Thread Eric Dumazet
On Mon, 2015-12-07 at 06:36 -0800, Eric Dumazet wrote:


> Thanks
> 

Also note that Dave Jones reported a SCTP problem fixed by : 

https://patchwork.ozlabs.org/patch/553068/




Re: [PATCH] rtlwifi: fix gigantic memleak in rtl_usb

2015-12-07 Thread Kalle Valo
Peter Wu  writes:

> Originally I had the Cc: stable line added, but the SubmittingPatches
> document seems to discourage that for networking. Added it again.

Yeah, stable wireless patches are handled differently from the rest of the
networking subsystem. It would be great if somebody could update the
documentation.

-- 
Kalle Valo


Checksum offload queries

2015-12-07 Thread Edward Cree
Having decided to take Dave Miller's advice to push our hardware guys in the 
direction of generic checksum offload, I found I wasn't quite sure exactly 
what's being encouraged.  After discussing the subject with a colleague, some 
questions crystallised.  I expect it's mostly a result of misunderstandings on 
my part, but here goes:

1) Receive checksums.  Given that CHECKSUM_UNNECESSARY conversion exists (and 
is a cheap operation), what is the advantage to the stack of using 
CHECKSUM_COMPLETE if the packet happens to be a protocol which 
CHECKSUM_UNNECESSARY conversion can handle?  As I see it, CHECKSUM_UNNECESSARY 
is strictly better as the stack is told "the first csum_level+1 checksums are 
good" *and* (indirectly) "here is the whole-packet checksum, which you can use 
to help with anything beyond csum_level+1".  Is it not, then, best for a device 
only to use CHECKSUM_COMPLETE for protocols the conversion doesn't handle?  (I 
agree that having that fallback of CHECKSUM_COMPLETE is a good thing, sadly I 
don't think our new chip does that.  (But maybe firmware can fix it.))

2) Transmit checksums.  While many protocols permit using 0 in the outer 
checksum, it doesn't seem prudent to assume all will.  Besides, many NICs will 
still have IP and TCP/UDP checksum offload hardware, if only to support less 
enlightened operating systems; why not use it?  Would it not be better for a 
device to have both NETIF_F_HW_CSUM *and* NETIF_F_IP[|V6]_CSUM, and be smart 
enough to fill in IP checksum, TCP/UDP checksum and one encapsulated checksum 
of your choice (i.e. whatever csum_start and friends asked for)?  (Again, I 
agree that having a NETIF_F_IP_CSUM device do specific magic for a list of 
specific encapsulation protocols is unsatisfactory.  Sadly, guess what our new 
chip does!  (But maybe firmware can fix it.))

3) Related to the above, what does a NETIF_F_HW_CSUM device do when 
transmitting an unencapsulated packet (let's say it's UDP) currently?  Will it 
simply get no checksum offload at all?  Will csum_start point at the regular 
UDP checksum (and the stack will do the IP header checksum)?  Again, a device 
that does both HW_ and IP_CSUM could cope with this (do the IP and UDP 
checksums as per NETIF_F_IP_CSUM, and just don't ask for a 'generic' HW_CSUM), 
though that would require more checksum flags (there's no way for 
CHECKSUM_PARTIAL to say "do your IP-specific stuff but ignore csum_start and 
friends").

4) Where, precisely, should I tell our hardware guys to stuff the 
protocol-specific encapsulated checksum offloads they're so proud of having 
added to our new chip? ;)

--
Edward Cree, not speaking for Solarflare Communications


[PATCH 4.2 085/124] can: Use correct type in sizeof() in nla_put()

2015-12-07 Thread Greg Kroah-Hartman
4.2-stable review patch.  If anyone has any objections, please let me know.

--

From: Marek Vasut 

commit 562b103a21974c2f9cd67514d110f918bb3e1796 upstream.

The sizeof() is invoked on an incorrect variable, likely due to some
copy-paste error, and this might result in memory corruption. Fix this.
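
As an illustration of this class of bug, here is a minimal userspace
sketch. The struct layouts and the demo_put() helper are assumptions made
up for the example; they are not the CAN or netlink definitions. The point
is only that the length should come from the object whose address is
passed, otherwise the copy reads past the end of the smaller object.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the two payloads; these layouts are assumptions. */
struct demo_clock    { uint32_t freq; };
struct demo_ctrlmode { uint32_t mask; uint32_t flags; };

/* Using sizeof(struct demo_ctrlmode) for a demo_clock would over-read. */
static_assert(sizeof(struct demo_ctrlmode) > sizeof(struct demo_clock),
	      "ctrlmode stand-in is larger than clock stand-in");

/*
 * Copy len bytes starting at data into buf, the way an attribute put
 * helper copies its payload. Returns the number of bytes copied.
 */
static size_t demo_put(unsigned char *buf, size_t len, const void *data)
{
	memcpy(buf, data, len);
	return len;
}

/* Correct pattern: take sizeof() from the object being passed. */
static size_t demo_put_clock(unsigned char *buf, const struct demo_clock *clock)
{
	return demo_put(buf, sizeof(*clock), clock);
}
```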

Signed-off-by: Marek Vasut 
Cc: Wolfgang Grandegger 
Cc: netdev@vger.kernel.org
Signed-off-by: Marc Kleine-Budde 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/net/can/dev.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -915,7 +915,7 @@ static int can_fill_info(struct sk_buff
 nla_put(skb, IFLA_CAN_BITTIMING_CONST,
 sizeof(*priv->bittiming_const), priv->bittiming_const)) ||
 
-   nla_put(skb, IFLA_CAN_CLOCK, sizeof(cm), &priv->clock) ||
+   nla_put(skb, IFLA_CAN_CLOCK, sizeof(priv->clock), &priv->clock) ||
nla_put_u32(skb, IFLA_CAN_STATE, state) ||
nla_put(skb, IFLA_CAN_CTRLMODE, sizeof(cm), &cm) ||
nla_put_u32(skb, IFLA_CAN_RESTART_MS, priv->restart_ms) ||




Re: use-after-free in ip6_xmit

2015-12-07 Thread Eric Dumazet
On Mon, Dec 7, 2015 at 6:44 AM, Dmitry Vyukov  wrote:
> Yes, seems to be fixed on master of
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git. Just
> can't pull in all fixes from all trees. Sorry.
> When will it be merged into Linus tree?
>

As I said, they are already in Linus tree, part of linux-4.4-rc4,
released yesterday.

I would like to remind you lkml and netdev have thousands of subscribers,
so please do not flood them with duplicates, especially considering
you were CC'ed on all the patches I cooked to address your original report.

So you knew patches were on their way.

commit 071f5d105a0ae93aeb02197c4ee3557e8cc57a21
Merge: 2873d32ff493 e3c9b1ef78eb
Author: Linus Torvalds 
Date:   Thu Dec 3 16:02:46 2015 -0800

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net


Re: [RFC PATCH V2 0/3] IXGBE/VFIO: Add live migration support for SRIOV NIC

2015-12-07 Thread Alexander Duyck
On Mon, Dec 7, 2015 at 7:40 AM, Lan, Tianyu  wrote:
> On 12/5/2015 1:07 AM, Alexander Duyck wrote:
>>>
>>>
>>> We still need to support Windows guest for migration and this is why our
>>> patches keep all changes in the driver since it's impossible to change
>>> Windows kernel.
>>
>>
>> That is a poor argument.  I highly doubt Microsoft is interested in
>> having to modify all of the drivers that will support direct assignment
>> in order to support migration.  They would likely request something
>> similar to what I have in that they will want a way to do DMA tracking
>> with minimal modification required to the drivers.
>
>
> This totally depends on the vendors of the NIC or other devices; they
> should decide whether to support migration or not. If yes, they would
> modify the driver.

Having to modify every driver that wants to support live migration is
a bit much.  In addition I don't see this being limited only to NIC
devices.  You can direct assign a number of different devices, your
solution cannot be specific to NICs.

> If the target is just to call suspend/resume during migration, the feature
> will be meaningless. In most cases we don't want to affect the user much
> during migration, so the service downtime is vital. Our target is to apply
> SRIOV NIC passthrough to cloud service and NFV (network functions
> virtualization) projects, which are sensitive to network performance
> and stability. In my opinion, we should give the device driver a chance
> to implement its own migration handling. Call the suspend and resume
> callbacks in the driver if it doesn't care about performance during migration.

The suspend/resume callback should be efficient in terms of time.
After all we don't want the system to stall for a long period of time
when it should be either running or asleep.  Having it burn cycles in
a power state limbo doesn't do anyone any good.  If nothing else maybe
it will help to push the vendors to speed up those functions which
then benefit migration and the system sleep states.

Also you keep assuming you can keep the device running while you do
the migration and you can't.  You are going to corrupt the memory if
you do, and you have yet to provide any means to explain how you are
going to solve that.


>
>>
>>> Following is my idea to do DMA tracking.
>>>
>>> Inject an event into the VF driver after the memory iteration stage
>>> and before stopping the VCPU; the VF driver then marks all DMA memory
>>> in use as dirty. Newly allocated pages also need to
>>> be marked dirty before stopping the VCPU. All memory dirtied
>>> in this time slot will be migrated in the stop-and-copy
>>> stage. We also need to make sure to disable the VF by clearing its
>>> bus master enable bit before migrating this memory.
>>
>>
>> The ordering of your explanation here doesn't quite work.  What needs to
>> happen is that you have to disable DMA and then mark the pages as dirty.
>>   What the disabling of the BME does is signal to the hypervisor that
>> the device is now stopped.  The ixgbevf_suspend call already supported
>> by the driver is almost exactly what is needed to take care of something
>> like this.
>
>
> This is why I hope to reserve a small area in the DMA page for a dummy
> write. This marks the page dirty without requiring DMA to be stopped and
> without racing with the DMA data.

You can't and it will still race.  What concerns me is that your
patches and the document you referenced earlier show a considerable
lack of understanding about how DMA and device drivers work.  There is
a reason why device drivers have so many memory barriers and the like
in them.  The fact is when you have CPU and a device both accessing
memory things have to be done in a very specific order and you cannot
violate that.

If you have a contiguous block of memory you expect the device to
write into you cannot just poke a hole in it.  Such a situation is not
supported by any hardware that I am aware of.

As far as writing to dirty the pages it only works so long as you halt
the DMA and then mark the pages dirty.  It has to be in that order.
Any other order will result in data corruption and I am sure the NFV
customers definitely don't want that.

> If we can't do that, we have to stop DMA for a short time to mark all DMA
> pages dirty and then re-enable it. I am not sure how much we gain by
> tracking all DMA memory this way with the device running during migration. I
> need to do some tests and compare the results with stopping DMA directly at
> the last stage of migration.

We have to halt the DMA before we can complete the migration.  So
please feel free to test this.

In addition I still feel you would be better off taking this in
smaller steps.  I still say your first step would be to come up with a
generic solution for the dirty page tracking like the dma_mark_clean()
approach I had mentioned earlier.  If I get time I might try to take
care of it myself later this week since you don't seem to agree with
that approach.

>>
>> The question is how we would go about 

pull-request: wireless-drivers-next 2015-12-07

2015-12-07 Thread Kalle Valo
Hi Dave,

here's the first "real" pull request after the wireless directory
reorganisation. Nothing really out of the ordinary, new features and bugfixes
as usual. This time there's a regression in ath10k because of a bugfix
in wireless-drivers.git which conflicted with a patch in
wireless-drivers-next.git. But it should be easy to fix, just follow
what Stephen did in linux-next:

http://article.gmane.org/gmane.linux.kernel.next/37391

Please let me know if you have any problems.

Kalle

The following changes since commit 6d808eba602b00f77f26191f45328774ff057cc0:

  mac80211_hwsim: move Kconfig entry for sorting alphabetically (2015-11-18 
15:23:36 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git 
tags/wireless-drivers-next-for-davem-2015-12-07

for you to fetch changes up to 2abcd3d40d2cae8d4698ba4b0f4d6c793dda6f8b:

  Merge tag 'iwlwifi-next-for-kalle-2015-12-01' of 
https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next 
(2015-12-03 17:23:43 +0200)



brcmfmac

* support bcm4359 which can operate in two bands concurrently
* disable runtime pm for USB avoiding issues
* use generic pm callback in PCIe driver
* support wowlan wake indication reporting
* add beamforming support
* unified handling of firmware files

ath10k

* support Management Frame Protection (MFP)
* add thermal throttling support for 10.4 firmware
* add support for pktlog in QCA99X0
* add debugfs file to enable Bluetooth coexistence feature
* use firmware's native mesh interface type instead of raw mode

iwlwifi

* BT coex improvements
* D3 operation bugfixes
* rate control improvements
* firmware debugging infra improvements
* ground work for multi Rx
* various security fixes


Andy Green (5):
  wcn36xx: introduce WCN36XX_HAL_AVOID_FREQ_RANGE_IND
  wcn36xx: swallow two wcn3620 IND messages
  wcn36xx: handle new hal response format
  wcn36xx: use new response format for wcn3620 trigger_ba
  wcn36xx: use new response format for wcn3620 remove_bsskey

Andy Shevchenko (2):
  rtlwifi: btcoexist: re-use %*ph specifier to hexdump
  wireless: airo: re-use mac_pton()

Arend van Spriel (1):
  brcmfmac: assure net_ratelimit() is declared before use

Avraham Stern (1):
  iwlwifi: mvm: Configure fragmented scan for scheduled scan

Avri Altman (2):
  iwlwifi: mvm: Enable MPLUT only on supported hw
  iwlwifi: mvm: Align bt-coex priority with requirements

Dan Carpenter (5):
  ath9k_htc: check for underflow in ath9k_htc_rx_msg()
  rt2x00: type bug in _rt2500usb_register_read()
  libertas: cleanup a variable name
  brcm80211: fix error code in brcmf_pcie_exit_download_state()
  iwlwifi: mvm: rs: fix a warning message

Derek Basehore (1):
  iwlwifi: mvm: report wakeup for wowlan

Dreyfuss, Haim (1):
  iwlwifi: Add new PCI IDs for 9260 and 5165 series

Eliad Peller (2):
  iwlwifi: mvm: refactor d3 key update functions
  iwlwifi: remove IWL_DL_LED

Emmanuel Grumbach (4):
  Merge remote-tracking branch 'iwlwifi-fixes/master' into next
  iwlwifi: add support for 12K Receive Buffers
  iwlwifi: mvm: change name of iwl_mvm_d3_update_gtk
  iwlwifi: change the Intel Wireless email address

Eyal Shapira (1):
  iwlwifi: mvm: drop low_latency_agg_frame_cnt_limit

Felix Fietkau (1):
  ath10k: stop abusing GFP_DMA

Franky Lin (1):
  brcmfmac: no retries on rxglom superframe errors

Golan Ben Ami (1):
  iwlwifi: mvm: Support setting continuous recording debug mode

Golan Ben-Ami (4):
  iwlwifi: mvm: add trigger for firmware dump upon TDLS events
  iwlwifi: export the _no_grab version of PRPH IO functions
  iwlwifi: dump prph registers in a common place for all transports
  iwlwifi: mvm: move fw-dbg code to separate file

Hante Meuleman (17):
  brcmfmac: Add support for the BCM4359 11ac RSDB PCIE device.
  brcmfmac: Simplify and fix usage of brcmf_ifname.
  brcmfmac: Remove unnecessary check from start_xmit.
  brcmfmac: Remove unncessary variable irq_requested.
  brcmfmac: Disable runtime pm for USB.
  brcmfmac: Add RSDB support.
  brcmfmac: Use consistent naming for bsscfgidx.
  brcmfmac: Use new methods for pcie Power Management.
  brcmfmac: Add wowl wake indication report.
  brcmfmac: Cleanup ssid storage.
  brcmfmac: Return actual error by fwil.
  brcmfmac: Change error print on wlan0 existence.
  brcmfmac: Remove redundant parameter action from scan.
  brcmfmac: Cleanup roaming configuration.
  brcmfmac: Add beamforming support.
  brcmfmac: Unify methods to define and map firmware files.
  brcmfmac: Fix double free on exception at module load.

Johannes Berg (11):
  iwlwifi: nvm: fix up phy section when reading it
  iwlwifi: dvm: remove Kconfig default

[PATCH] r8152: fix lockup when runtime PM is enabled

2015-12-07 Thread Peter Wu
When an interface is brought up which was previously suspended (via
runtime PM), it would hang. This happens because napi_disable is called
before napi_enable.

Solve this by avoiding napi_disable before the device is fully up.

While at it, remove WORK_ENABLE check from rtl8152_open (introduced with
the original change) because it cannot happen:

 - After this patch, runtime resume will not set it during rtl8152_open.
 - When link is up, rtl8152_open is not called.
 - When link is down during system/auto suspend/resume, it is not set.

Fixes: 41cec84cf285 ("r8152: don't enable napi before rx ready")
Link: https://lkml.kernel.org/r/20151205105912.GA1766@al
Signed-off-by: Peter Wu 
---
 drivers/net/usb/r8152.c | 23 +++
 1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index d9427ca..b8b083e 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -3067,17 +3067,6 @@ static int rtl8152_open(struct net_device *netdev)
 
mutex_lock(&tp->control);
 
-   /* The WORK_ENABLE may be set when autoresume occurs */
-   if (test_bit(WORK_ENABLE, &tp->flags)) {
-   clear_bit(WORK_ENABLE, &tp->flags);
-   usb_kill_urb(tp->intr_urb);
-   cancel_delayed_work_sync(&tp->schedule);
-
-   /* disable the tx/rx, if the workqueue has enabled them. */
-   if (netif_carrier_ok(netdev))
-   tp->rtl_ops.disable(tp);
-   }
-
tp->rtl_ops.up(tp);
 
rtl8152_set_speed(tp, AUTONEG_ENABLE,
@@ -3516,11 +3505,13 @@ static int rtl8152_resume(struct usb_interface *intf)
if (test_bit(SELECTIVE_SUSPEND, &tp->flags)) {
rtl_runtime_suspend_enable(tp, false);
clear_bit(SELECTIVE_SUSPEND, &tp->flags);
-   napi_disable(&tp->napi);
-   set_bit(WORK_ENABLE, &tp->flags);
-   if (netif_carrier_ok(tp->netdev))
-   rtl_start_rx(tp);
-   napi_enable(&tp->napi);
+   if (tp->netdev->flags & IFF_UP) {
+   napi_disable(&tp->napi);
+   set_bit(WORK_ENABLE, &tp->flags);
+   if (netif_carrier_ok(tp->netdev))
+   rtl_start_rx(tp);
+   napi_enable(&tp->napi);
+   }
} else {
tp->rtl_ops.up(tp);
rtl8152_set_speed(tp, AUTONEG_ENABLE,
-- 
2.6.3



Re: [PATCH net] ravb: Remove clear unhandled interrupt

2015-12-07 Thread Yoshihiro Kaneko
Hello,

2015-12-07 4:18 GMT+09:00 Sergei Shtylyov :
> Hello.
>
> On 12/06/2015 02:42 PM, Yoshihiro Kaneko wrote:
>
>> From: Kazuya Mizuguchi 
>>
>> AVB-DMAC Reception Warning interrupt is not enabled, so it is not
>> necessary to clear the interrupt.
>>
>> Signed-off-by: Kazuya Mizuguchi 
>> Signed-off-by: Yoshihiro Kaneko 
>
>
>In principle I agree but perhaps we should clear RIC1 in probe() to not
> depend on a state left from a bootloader?

I think that it is a good idea.
I'll add it to v2.

>
> MBR, Sergei
>

Thanks,
Kaneko


Re: [RFC PATCH net-next 1/2] tcp: RTO Restart (RTOR)

2015-12-07 Thread Eric Dumazet
On Mon, 2015-12-07 at 10:00 +0100, Per Hurtig wrote:
>  
> +static u32 tcp_unsent_pkts(const struct sock *sk)
> +{
> + struct sk_buff *skb = tcp_send_head(sk);
> + u32 pkts = 0;
> +
> + if (skb)
> + tcp_for_write_queue_from(skb, sk)
> + pkts += tcp_skb_pcount(skb);
> +
> + return pkts;
> +}

write queue can be very big (consider GSO/TSO being off for example)

Parsing it just to implement later :

(tp->packets_out + tcp_unsent_pkts(sk) <
> + TCP_RTORESTART_THRESH)

is probably not very efficient.

I would rather implement a different helper, aborting the loop as soon
as the condition can not be met anymore.
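
A userspace sketch of what such an early-exit helper could look like. The
demo_buf list and demo_pkts_below() are hypothetical stand-ins invented for
this example; they only model walking a queue and summing per-buffer packet
counts, and are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued buffer; pcount models its packet count. */
struct demo_buf {
	unsigned int pcount;
	struct demo_buf *next;
};

/*
 * Return true if in_flight plus the packets queued from head stays below
 * thresh. The walk stops as soon as the running total reaches thresh, so
 * the number of buffers visited is bounded by thresh (assuming each buffer
 * holds at least one packet), however long the queue is.
 */
static bool demo_pkts_below(const struct demo_buf *head,
			    unsigned int in_flight, unsigned int thresh)
{
	unsigned int total = in_flight;

	for (; head; head = head->next) {
		if (total >= thresh)
			return false;
		total += head->pcount;
	}
	return total < thresh;
}

/* Usage sketch: three queued buffers holding 1 + 1 + 2 packets. */
static void demo_pkts_usage(void)
{
	struct demo_buf c = { 2, NULL }, b = { 1, &c }, a = { 1, &b };

	assert(demo_pkts_below(&a, 0, 5));   /* 4 packets in total, below 5 */
	assert(!demo_pkts_below(&a, 1, 5));  /* 5 packets in total, not below 5 */
}
```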

> +
>  /* Restart timer after forward progress on connection.
>   * RFC2988 recommends to restart timer to now+rto.
>   */
> @@ -3027,6 +3040,17 @@ void tcp_rearm_rto(struct sock *sk)
>*/
>   if (delta > 0)
>   rto = delta;
> + } else if (icsk->icsk_pending == ICSK_TIME_RETRANS &&
> +(sysctl_tcp_timer_restart == 1 ||
> + sysctl_tcp_timer_restart == 3) &&
> +(tp->packets_out + tcp_unsent_pkts(sk) <
> + TCP_RTORESTART_THRESH)) {
> + struct sk_buff *skb = tcp_write_queue_head(sk);
> + const u32 rto_time_stamp = tcp_skb_timestamp(skb);
> + s32 delta = (s32)(tcp_time_stamp - rto_time_stamp);
> +
> + if (delta > 0 && rto > delta)
> + rto -= delta;
>   }
>   inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, rto,
> TCP_RTO_MAX);




Re: [PATCH v2 8/8] treewide: Remove newlines inside DEFINE_PER_CPU() macros

2015-12-07 Thread Joe Perches
On Mon, 2015-12-07 at 17:53 +0100, Michal Marek wrote:
> On 2015-12-07 17:33, David Laight wrote:
> > From: Michal Marek
> > > Sent: 04 December 2015 15:26
> > > Otherwise make tags can't parse them:
> > > 
> > > ctags: Warning: arch/ia64/kernel/smp.c:60: null expansion of name pattern 
> > > "\1"
> > ...
> > 
> > Seems to me you need to fix ctags.
> 
> I'm sure the maintainers of ctags and etags would accept patches to
> describe a custom context-free grammar via commandline options, but
> until then, let's continue using the regular expressions in tags.sh and
> remove newlines in macros that tags.sh is trying to expand.
> 

Do you have a list of the most common macros?

Perhaps it'd be good to add exceptions to checkpatch
80 column line rules for them.


[PATCH RFC 07/26] phy: marvell: 88E1512: add flow control support

2015-12-07 Thread Russell King
The Marvell PHYs support pause frame advertisements, so we should not be
masking their support off.  Add the necessary flag to the Marvell PHY
to allow any MAC level pause frame support to be advertised.

Signed-off-by: Russell King 
---
 drivers/net/phy/marvell.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index 5de8d5827536..9a5329bfd0fd 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -1142,7 +1142,7 @@ static struct phy_driver marvell_drivers[] = {
.phy_id = MARVELL_PHY_ID_88E1510,
.phy_id_mask = MARVELL_PHY_ID_MASK,
.name = "Marvell 88E1510",
-   .features = PHY_GBIT_FEATURES,
+   .features = PHY_GBIT_FEATURES | SUPPORTED_Pause,
.flags = PHY_HAS_INTERRUPT,
.config_aneg = &marvell_config_aneg,
.read_status = &marvell_read_status,
-- 
2.1.0



[PATCH RFC 06/26] phy: provide a hook for link up/link down events

2015-12-07 Thread Russell King
Sometimes, we need to do additional work between the PHY coming up and
marking the carrier present - for example, we may need to wait for the
PHY to MAC link to finish negotiation.  This changes phylib to provide
a notification function pointer which avoids the built-in
netif_carrier_on() and netif_carrier_off() functions.

Standard ->adjust_link functionality is provided by hooking a helper
into the new ->phy_link_change method.
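
The hook arrangement can be modelled in userspace roughly as follows. This
is a sketch under assumptions: struct demo_phy and the default helper are
invented for illustration, and the helper's body (update the carrier only
when asked, then run the adjust_link step) is inferred from the call sites
in this patch rather than copied from it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a PHY with a link-change hook. */
struct demo_phy {
	bool carrier;       /* models the netdev carrier state */
	int adjust_calls;   /* counts modelled ->adjust_link invocations */
	void (*phy_link_change)(struct demo_phy *phy, bool up, bool do_carrier);
};

/* Assumed default hook: optional carrier update, then adjust_link. */
static void demo_default_link_change(struct demo_phy *phy, bool up,
				     bool do_carrier)
{
	if (do_carrier)
		phy->carrier = up;
	phy->adjust_calls++;
}

static void demo_link_up(struct demo_phy *phy)
{
	phy->phy_link_change(phy, true, true);
}

static void demo_link_down(struct demo_phy *phy, bool do_carrier)
{
	phy->phy_link_change(phy, false, do_carrier);
}

/* Usage sketch: link up, then a "down" that leaves the carrier alone. */
static void demo_phy_usage(void)
{
	struct demo_phy phy = { false, 0, demo_default_link_change };

	demo_link_up(&phy);
	assert(phy.carrier && phy.adjust_calls == 1);
	demo_link_down(&phy, false);
	assert(phy.carrier && phy.adjust_calls == 2);
}
```

A caller that needs different behaviour would install its own function in
place of the default hook.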

Signed-off-by: Russell King 
---
 drivers/net/phy/phy.c| 42 ++
 drivers/net/phy/phy_device.c | 14 ++
 include/linux/phy.h  |  1 +
 3 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index adb48abafc87..150497246922 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -802,6 +802,16 @@ void phy_start(struct phy_device *phydev)
 }
 EXPORT_SYMBOL(phy_start);
 
+static void phy_link_up(struct phy_device *phydev)
+{
+   phydev->phy_link_change(phydev, true, true);
+}
+
+static void phy_link_down(struct phy_device *phydev, bool do_carrier)
+{
+   phydev->phy_link_change(phydev, false, do_carrier);
+}
+
 /**
  * phy_state_machine - Handle the state machine
  * @work: work_struct that describes the work to be done
@@ -843,8 +853,7 @@ void phy_state_machine(struct work_struct *work)
/* If the link is down, give up on negotiation for now */
if (!phydev->link) {
phydev->state = PHY_NOLINK;
-   netif_carrier_off(phydev->attached_dev);
-   phydev->adjust_link(phydev->attached_dev);
+   phy_link_down(phydev, true);
break;
}
 
@@ -856,9 +865,7 @@ void phy_state_machine(struct work_struct *work)
/* If AN is done, we're running */
if (err > 0) {
phydev->state = PHY_RUNNING;
-   netif_carrier_on(phydev->attached_dev);
-   phydev->adjust_link(phydev->attached_dev);
-
+   phy_link_up(phydev);
} else if (0 == phydev->link_timeout--)
needs_aneg = true;
break;
@@ -880,8 +887,7 @@ void phy_state_machine(struct work_struct *work)
}
}
phydev->state = PHY_RUNNING;
-   netif_carrier_on(phydev->attached_dev);
-   phydev->adjust_link(phydev->attached_dev);
+   phy_link_up(phydev);
}
break;
case PHY_FORCING:
@@ -891,13 +897,12 @@ void phy_state_machine(struct work_struct *work)
 
if (phydev->link) {
phydev->state = PHY_RUNNING;
-   netif_carrier_on(phydev->attached_dev);
+   phy_link_up(phydev);
} else {
if (0 == phydev->link_timeout--)
needs_aneg = true;
+   phy_link_down(phydev, false);
}
-
-   phydev->adjust_link(phydev->attached_dev);
break;
case PHY_RUNNING:
/* Only register a CHANGE if we are polling or ignoring
@@ -920,14 +925,12 @@ void phy_state_machine(struct work_struct *work)
 
if (phydev->link) {
phydev->state = PHY_RUNNING;
-   netif_carrier_on(phydev->attached_dev);
+   phy_link_up(phydev);
} else {
phydev->state = PHY_NOLINK;
-   netif_carrier_off(phydev->attached_dev);
+   phy_link_down(phydev, true);
}
 
-   phydev->adjust_link(phydev->attached_dev);
-
if (phy_interrupt_is_valid(phydev))
err = phy_config_interrupt(phydev,
   PHY_INTERRUPT_ENABLED);
@@ -935,8 +938,7 @@ void phy_state_machine(struct work_struct *work)
case PHY_HALTED:
if (phydev->link) {
phydev->link = 0;
-   netif_carrier_off(phydev->attached_dev);
-   phydev->adjust_link(phydev->attached_dev);
+   phy_link_down(phydev, true);
do_suspend = true;
}
break;
@@ -956,11 +958,11 @@ void phy_state_machine(struct work_struct *work)
 
if (phydev->link) {
phydev->state = PHY_RUNNING;
-   netif_carrier_on(phydev->attached_dev);
+   phy_link_up(phydev);
} else  {
phydev->state = 

[PATCH RFC 11/26] phylink: add phylink infrastructure

2015-12-07 Thread Russell King
The link between the ethernet MAC and its PHY has become more complex
as the interface evolves.  This is especially true with serdes links,
where part of the PHY is effectively integrated into the MAC.

Serdes links can be connected to a variety of devices, including SFF
modules soldered down onto the board with the MAC, an SFP cage with
a hotpluggable SFP module which may contain a PHY or directly modulate
the serdes signals onto optical media with or without a PHY, or even
a classical PHY connection.

Moreover, the negotiation information on serdes links comes in two
varieties - SGMII mode, where the PHY provides its speed/duplex/flow
control information to the MAC, and 1000base-X mode where both ends
exchange their abilities and each resolves the link capabilities.

This means we need a more flexible means to support these arrangements,
particularly with the hotpluggable nature of SFP, where the PHY can
be attached or detached after the network device has been brought up.

Ethtool information can come from multiple sources:
- we may have a PHY operating in either SGMII or 1000base-X mode, in
  which case we take ethtool/mii data directly from the PHY.
- we may have an optical SFP module without a PHY, with the MAC
  operating in 1000base-X mode - the ethtool/mii data needs to come
  from the MAC.
- we may have a copper SFP module with a PHY which can't be accessed,
  which means we need to take ethtool/mii data from the MAC.

Phylink aims to solve this by providing an intermediary between the
MAC and PHY, providing a safe way for PHYs to be hotplugged, and
allowing an SFP driver to reconfigure the serdes connection.
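
The choice of ethtool data source listed earlier boils down to a small
decision.  A minimal sketch in plain C follows; the names (enum
ethtool_source, struct link_setup, pick_ethtool_source) are hypothetical
and are not part of this patch:

```c
#include <stdbool.h>

/* Hypothetical illustration only; none of these names exist in phylink. */
enum ethtool_source {
	ETHTOOL_FROM_PHY,
	ETHTOOL_FROM_MAC,
};

struct link_setup {
	bool has_phy;		/* a PHY is present on the link */
	bool phy_accessible;	/* the PHY's registers can be read */
};

/* Use PHY data only when a PHY is present and readable, else the MAC. */
static enum ethtool_source pick_ethtool_source(const struct link_setup *ls)
{
	if (ls->has_phy && ls->phy_accessible)
		return ETHTOOL_FROM_PHY;

	return ETHTOOL_FROM_MAC;
}
```

An optical module without a PHY and a copper module with an inaccessible
PHY both end up on the MAC path in this model.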

Phylink also takes over support of fixed link connections, where
the speed/duplex/flow control are fixed, but link status may be
controlled by a GPIO signal.  By avoiding the fixed-phy implementation,
phylink can provide a faster response to link events: fixed-phy has
to wait for phylib to operate its state machine, which can take
several seconds.  In comparison, phylink takes milliseconds.

Signed-off-by: Russell King 
---
 drivers/net/phy/Kconfig  |  10 +
 drivers/net/phy/Makefile |   1 +
 drivers/net/phy/phy_device.c |   1 +
 drivers/net/phy/phylink.c| 787 +++
 include/linux/phy.h  |   2 +
 include/linux/phylink.h  |  70 
 6 files changed, 871 insertions(+)
 create mode 100644 drivers/net/phy/phylink.c
 create mode 100644 include/linux/phylink.h

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index c475531a542e..5c634b4bc9bd 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -10,6 +10,16 @@ menuconfig PHYLIB
  devices.  This option provides infrastructure for
  managing PHY devices.
 
+config PHYLINK
+   tristate
+   depends on NETDEVICES
+   select PHYLIB
+   select SWPHY
+   help
+ PHYlink models the link between the PHY and MAC, allowing fixed
+ configuration links, PHYs, and Serdes links with MAC level
+ autonegotiation modes.
+
 if PHYLIB
 
 config SWPHY
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index d5c3ff625fbe..bc052bb6cee0 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -3,6 +3,7 @@
 libphy-y   := phy.o phy_device.o mdio_bus.o
 libphy-$(CONFIG_SWPHY) += swphy.o
 
+obj-$(CONFIG_PHYLINK)  += phylink.o
 obj-$(CONFIG_PHYLIB)   += libphy.o
 obj-$(CONFIG_AQUANTIA_PHY) += aquantia.o
 obj-$(CONFIG_MARVELL_PHY)  += marvell.o
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 67a654a1179b..34f2ac29dbed 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -751,6 +751,7 @@ void phy_detach(struct phy_device *phydev)
phydev->attached_dev->phydev = NULL;
phydev->attached_dev = NULL;
phy_suspend(phydev);
+   phydev->phylink = NULL;
 
/* If the device had no specific driver before (i.e. - it
 * was using the generic driver), we unbind the device
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
new file mode 100644
index ..d385eb7c4147
--- /dev/null
+++ b/drivers/net/phy/phylink.c
@@ -0,0 +1,787 @@
+/*
+ * phylink models the MAC to optional PHY connection, supporting
+ * technologies such as SFP cages where the PHY is hot-pluggable.
+ *
+ * Copyright (C) 2015 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "swphy.h"
+
+#define SUPPORTED_INTERFACES \
+   (SUPPORTED_TP | SUPPORTED_MII | SUPPORTED_FIBRE | \
+SUPPORTED_BNC | SUPPORTED_AUI | SUPPORTED_Backplane)
+#define 

[PATCH RFC 12/26] phylink: add hooks for SFP support

2015-12-07 Thread Russell King
Add support to phylink for SFP, which needs to control and configure
the ethernet MAC link state.  Specifically, SFP needs to:

1. set the negotiation mode between SGMII and 1000base-X
2. attach and detach the module PHY
3. prevent the link coming up when errors are reported

In the absence of a PHY, we also need to set the ethtool port type
according to the module plugged in.
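
To illustrate point 3, here is a rough userspace model of how a set of
"disable" bits can gate the carrier.  It assumes the resolver only
reports link up while no disable bit is set; the resolver itself is not
shown in this mail, and struct link_model and the model_* helpers are
made up for the example:

```c
#include <stdbool.h>

/* Bit numbers, loosely mirroring the enum extended by this patch. */
enum {
	MODEL_DISABLE_STOPPED,
	MODEL_DISABLE_LINK,
};

/* Hypothetical stand-in for the relevant parts of struct phylink. */
struct link_model {
	unsigned long disable_state;	/* one bit per reason to hold the link down */
	bool mac_link_up;		/* what the hardware currently reports */
};

static void model_disable(struct link_model *m, int bit)
{
	m->disable_state |= 1UL << bit;
}

static void model_enable(struct link_model *m, int bit)
{
	m->disable_state &= ~(1UL << bit);
}

/* Carrier is only reported when nothing is holding the link down. */
static bool model_carrier(const struct link_model *m)
{
	return m->disable_state == 0 && m->mac_link_up;
}
```

The real phylink_disable() additionally flushes the resolve work and
calls netif_carrier_off(); the model only covers the gating.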

Signed-off-by: Russell King 
---
 drivers/net/phy/phylink.c | 89 +++
 include/linux/phylink.h   |  6 
 2 files changed, 95 insertions(+)

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index d385eb7c4147..7d56e5895087 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -29,11 +30,16 @@
(ADVERTISED_TP | ADVERTISED_MII | ADVERTISED_FIBRE | \
 ADVERTISED_BNC | ADVERTISED_AUI | ADVERTISED_Backplane)
 
+static LIST_HEAD(phylinks);
+static DEFINE_MUTEX(phylink_mutex);
+
 enum {
PHYLINK_DISABLE_STOPPED,
+   PHYLINK_DISABLE_LINK,
 };
 
 struct phylink {
+   struct list_head node;
struct net_device *netdev;
const struct phylink_mac_ops *ops;
struct mutex config_mutex;
@@ -313,12 +319,20 @@ struct phylink *phylink_create(struct net_device *ndev, 
struct device_node *np,
return ERR_PTR(ret);
}
 
+   mutex_lock(&phylink_mutex);
+   list_add_tail(&pl->node, &phylinks);
+   mutex_unlock(&phylink_mutex);
+
return pl;
 }
 EXPORT_SYMBOL_GPL(phylink_create);
 
 void phylink_destroy(struct phylink *pl)
 {
+   mutex_lock(&phylink_mutex);
+   list_del(&pl->node);
+   mutex_unlock(&phylink_mutex);
+
cancel_work_sync(&pl->resolve);
kfree(pl);
 }
@@ -784,4 +798,79 @@ int phylink_mii_ioctl(struct phylink *pl, struct ifreq 
*ifr, int cmd)
 }
 EXPORT_SYMBOL_GPL(phylink_mii_ioctl);
 
+
+
+void phylink_disable(struct phylink *pl)
+{
+   set_bit(PHYLINK_DISABLE_LINK, &pl->phylink_disable_state);
+   flush_work(&pl->resolve);
+
+   netif_carrier_off(pl->netdev);
+}
+EXPORT_SYMBOL_GPL(phylink_disable);
+
+void phylink_enable(struct phylink *pl)
+{
+   clear_bit(PHYLINK_DISABLE_LINK, &pl->phylink_disable_state);
+   phylink_run_resolve(pl);
+}
+EXPORT_SYMBOL_GPL(phylink_enable);
+
+void phylink_set_link_port(struct phylink *pl, u32 support, u8 port)
+{
+   WARN_ON(support & ~SUPPORTED_INTERFACES);
+
+   mutex_lock(&pl->config_mutex);
+   pl->link_port_support = support;
+   pl->link_port = port;
+   mutex_unlock(&pl->config_mutex);
+}
+EXPORT_SYMBOL_GPL(phylink_set_link_port);
+
+int phylink_set_link_an_mode(struct phylink *pl, unsigned int mode)
+{
+   struct phylink_link_state state;
+   int ret = 0;
+
+   mutex_lock(&pl->config_mutex);
+   if (pl->link_an_mode != mode) {
+   netdev_info(pl->netdev, "switching to link AN mode %s\n",
+   phylink_an_mode_str(mode));
+
+   state = pl->link_config;
+   ret = pl->ops->mac_get_support(pl->netdev, mode, &state);
+   if (ret == 0) {
+   pl->link_an_mode = mode;
+   pl->link_config = state;
+
+   if (!test_bit(PHYLINK_DISABLE_STOPPED,
+ &pl->phylink_disable_state))
+   pl->ops->mac_config(pl->netdev,
+   pl->link_an_mode,
+   &pl->link_config);
+   }
+   }
+   mutex_unlock(&pl->config_mutex);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(phylink_set_link_an_mode);
+
+struct phylink *phylink_lookup_by_netdev(struct net_device *ndev)
+{
+   struct phylink *pl, *found = NULL;
+
+   mutex_lock(&phylink_mutex);
+   list_for_each_entry(pl, &phylinks, node)
+   if (pl->netdev == ndev) {
+   found = pl;
+   break;
+   }
+
+   mutex_unlock(&phylink_mutex);
+
+   return found;
+}
+EXPORT_SYMBOL_GPL(phylink_lookup_by_netdev);
+
 MODULE_LICENSE("GPL");
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index 05953c8abc70..c7a665a538c1 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -67,4 +67,10 @@ int phylink_ethtool_get_settings(struct phylink *, struct 
ethtool_cmd *);
 int phylink_ethtool_set_settings(struct phylink *, struct ethtool_cmd *);
 int phylink_mii_ioctl(struct phylink *, struct ifreq *, int);
 
+void phylink_set_link_port(struct phylink *pl, u32 support, u8 port);
+int phylink_set_link_an_mode(struct phylink *pl, unsigned int mode);
+void phylink_disable(struct phylink *pl);
+void phylink_enable(struct phylink *pl);
+struct phylink *phylink_lookup_by_netdev(struct net_device *ndev);
+
 #endif
-- 
2.1.0


[PATCH RFC 10/26] phy: add I2C mdio bus

2015-12-07 Thread Russell King
Add an I2C MDIO bus bridge library, to allow phylib to access PHYs which
are connected to an I2C bus instead of the more conventional MDIO bus.
Such PHYs can be found in SFP adapters and SFF modules.
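
As a rough illustration of the bridging, the sketch below shows the
address mapping and byte packing used by the code in this patch, pulled
out into standalone helpers.  The helper names are invented for the
example.  The remark about why 0x50 and 0x51 are skipped is an
assumption (they are the conventional SFP EEPROM and diagnostics
addresses); the patch itself does not say.

```c
#include <stdbool.h>
#include <stdint.h>

/* PHY address n is reached at I2C bus address 0x40 + n, as in the patch. */
static int sketch_i2c_addr(int phy_id)
{
	return 0x40 + phy_id;
}

/*
 * The patch never touches 0x50 and 0x51.  Assumption: this avoids the
 * SFP module EEPROM and diagnostics devices that answer there.
 */
static bool sketch_addr_skipped(int bus_addr)
{
	return bus_addr == 0x50 || bus_addr == 0x51;
}

/* A register read returns two bytes, most significant byte first. */
static uint16_t sketch_unpack(const uint8_t data[2])
{
	return (uint16_t)(data[0] << 8 | data[1]);
}

/* A register write sends the register number followed by the value, MSB first. */
static void sketch_pack(uint8_t reg, uint16_t val, uint8_t out[3])
{
	out[0] = reg;
	out[1] = val >> 8;
	out[2] = val & 0xff;
}
```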

Signed-off-by: Russell King 
---
 drivers/net/phy/Kconfig| 10 ++
 drivers/net/phy/Makefile   |  1 +
 drivers/net/phy/mdio-i2c.c | 90 ++
 drivers/net/phy/mdio-i2c.h | 19 ++
 4 files changed, 120 insertions(+)
 create mode 100644 drivers/net/phy/mdio-i2c.c
 create mode 100644 drivers/net/phy/mdio-i2c.h

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index c59f957fc282..c475531a542e 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -187,6 +187,16 @@ config MDIO_GPIO
  To compile this driver as a module, choose M here: the module
  will be called mdio-gpio.
 
+config MDIO_I2C
+   tristate
+   depends on I2C
+   help
+ Support I2C based PHYs.  This provides a MDIO bus bridged
+ to I2C to allow PHYs connected in I2C mode to be accessed
+ using the existing infrastructure.
+
+ This is library mode.
+
 config MDIO_OCTEON
tristate "Support for MDIO buses on Octeon and ThunderX SOCs"
depends on 64BIT
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index 31bdf193adbd..d5c3ff625fbe 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_LSI_ET1011C_PHY) += et1011c.o
 obj-$(CONFIG_FIXED_PHY)+= fixed_phy.o
 obj-$(CONFIG_MDIO_BITBANG) += mdio-bitbang.o
 obj-$(CONFIG_MDIO_GPIO)+= mdio-gpio.o
+obj-$(CONFIG_MDIO_I2C) += mdio-i2c.o
 obj-$(CONFIG_NATIONAL_PHY) += national.o
 obj-$(CONFIG_DP83640_PHY)  += dp83640.o
 obj-$(CONFIG_DP83848_PHY)  += dp83848.o
diff --git a/drivers/net/phy/mdio-i2c.c b/drivers/net/phy/mdio-i2c.c
new file mode 100644
index ..57b5de8c5a3e
--- /dev/null
+++ b/drivers/net/phy/mdio-i2c.c
@@ -0,0 +1,90 @@
+/*
+ * MDIO I2C bridge
+ *
+ * Copyright (C) 2015 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include 
+#include 
+
+#include "mdio-i2c.h"
+
+static int i2c_mii_read(struct mii_bus *bus, int phy_id, int reg)
+{
+   struct i2c_adapter *i2c = bus->priv;
+   struct i2c_msg msgs[2];
+   u8 data[2], dev_addr = reg;
+   int bus_addr, ret;
+
+   bus_addr = 0x40 + phy_id;
+   if (bus_addr == 0x50 || bus_addr == 0x51)
+   return 0x;
+
+   msgs[0].addr = bus_addr;
+   msgs[0].flags = 0;
+   msgs[0].len = 1;
+   msgs[0].buf = &dev_addr;
+   msgs[1].addr = bus_addr;
+   msgs[1].flags = I2C_M_RD;
+   msgs[1].len = sizeof(data);
+   msgs[1].buf = data;
+
+   ret = i2c_transfer(i2c, msgs, ARRAY_SIZE(msgs));
+   if (ret != ARRAY_SIZE(msgs))
+   return 0x;
+
+   return data[0] << 8 | data[1];
+}
+
+static int i2c_mii_write(struct mii_bus *bus, int phy_id, int reg, u16 val)
+{
+   struct i2c_adapter *i2c = bus->priv;
+   struct i2c_msg msg;
+   int bus_addr, ret;
+   u8 data[3];
+
+   bus_addr = 0x40 + phy_id;
+   if (bus_addr == 0x50 || bus_addr == 0x51)
+   return 0;
+
+   data[0] = reg;
+   data[1] = val >> 8;
+   data[2] = val;
+
+   msg.addr = bus_addr;
+   msg.flags = 0;
+   msg.len = 3;
+   msg.buf = data;
+
+   ret = i2c_transfer(i2c, &msg, 1);
+
+   return ret < 0 ? ret : 0;
+}
+
+struct mii_bus *mdio_i2c_alloc(struct device *parent, struct i2c_adapter *i2c)
+{
+   struct mii_bus *mii;
+
+   if (!i2c_check_functionality(i2c, I2C_FUNC_I2C))
+   return ERR_PTR(-EINVAL);
+
+   mii = mdiobus_alloc();
+   if (!mii)
+   return ERR_PTR(-ENOMEM);
+
+   snprintf(mii->id, MII_BUS_ID_SIZE, "i2c:%s", dev_name(parent));
+   mii->parent = parent;
+   mii->read = i2c_mii_read;
+   mii->write = i2c_mii_write;
+   mii->priv = i2c;
+
+   return mii;
+}
+EXPORT_SYMBOL_GPL(mdio_i2c_alloc);
+
+MODULE_AUTHOR("Russell King");
+MODULE_DESCRIPTION("MDIO I2C bridge library");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/net/phy/mdio-i2c.h b/drivers/net/phy/mdio-i2c.h
new file mode 100644
index ..889ab57d7f3e
--- /dev/null
+++ b/drivers/net/phy/mdio-i2c.h
@@ -0,0 +1,19 @@
+/*
+ * MDIO I2C bridge
+ *
+ * Copyright (C) 2015 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef MDIO_I2C_H
+#define MDIO_I2C_H
+
+struct device;
+struct i2c_adapter;
+struct mii_bus;
+
+struct mii_bus *mdio_i2c_alloc(struct device *parent, struct i2c_adapter *i2c);
+

[PATCH RFC 04/26] phy: generate swphy registers on the fly

2015-12-07 Thread Russell King
Generate software phy registers as and when requested, rather than
duplicating the state in fixed_phy.  This allows us to eliminate
the duplicate storage of the same data, which is only different
in format.

As fixed_phy_update_regs() no longer updates register state, rename
it to fixed_phy_update().
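
The idea can be sketched in userspace as follows.  This is a simplified
model, not the swphy code: it only covers BMCR and the link-related BMSR
bits, struct sw_state and sketch_read_reg() are invented names, and
returning 0xffff for unhandled registers is an assumption.  The bit
values are those of the standard MII register definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define MII_BMCR		0x00
#define MII_BMSR		0x01
#define BMCR_SPEED1000		0x0040
#define BMCR_FULLDPLX		0x0100
#define BMCR_SPEED100		0x2000
#define BMSR_LSTATUS		0x0004
#define BMSR_ANEGCAPABLE	0x0008
#define BMSR_ANEGCOMPLETE	0x0020

/* Hypothetical reduced form of the fixed phy status. */
struct sw_state {
	bool link;
	int speed;
	bool duplex;
};

/* Compute the requested register from the state; nothing is cached. */
static int sketch_read_reg(int reg, const struct sw_state *st)
{
	uint16_t bmcr = 0;
	uint16_t bmsr = BMSR_ANEGCAPABLE;

	if (st->link) {
		bmsr |= BMSR_LSTATUS | BMSR_ANEGCOMPLETE;
		if (st->duplex)
			bmcr |= BMCR_FULLDPLX;
		if (st->speed == 1000)
			bmcr |= BMCR_SPEED1000;
		else if (st->speed == 100)
			bmcr |= BMCR_SPEED100;
	}

	switch (reg) {
	case MII_BMCR:
		return bmcr;
	case MII_BMSR:
		return bmsr;
	default:
		return 0xffff;	/* assumption: unhandled registers read as all ones */
	}
}
```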

Signed-off-by: Russell King 
---
 drivers/net/phy/fixed_phy.c | 31 +-
 drivers/net/phy/swphy.c | 47 -
 drivers/net/phy/swphy.h |  2 +-
 3 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c
index fe4ca05afc77..66555da2ef27 100644
--- a/drivers/net/phy/fixed_phy.c
+++ b/drivers/net/phy/fixed_phy.c
@@ -26,8 +26,6 @@
 
 #include "swphy.h"
 
-#define MII_REGS_NUM 29
-
 struct fixed_mdio_bus {
int irqs[PHY_MAX_ADDR];
struct mii_bus *mii_bus;
@@ -36,7 +34,6 @@ struct fixed_mdio_bus {
 
 struct fixed_phy {
int addr;
-   u16 regs[MII_REGS_NUM];
struct phy_device *phydev;
struct fixed_phy_status status;
int (*link_update)(struct net_device *, struct fixed_phy_status *);
@@ -49,12 +46,10 @@ static struct fixed_mdio_bus platform_fmb = {
.phys = LIST_HEAD_INIT(platform_fmb.phys),
 };
 
-static void fixed_phy_update_regs(struct fixed_phy *fp)
+static void fixed_phy_update(struct fixed_phy *fp)
 {
if (gpio_is_valid(fp->link_gpio))
fp->status.link = !!gpio_get_value_cansleep(fp->link_gpio);
-
-   swphy_update_regs(fp->regs, &fp->status);
 }
 
 static int fixed_mdio_read(struct mii_bus *bus, int phy_addr, int reg_num)
@@ -62,29 +57,15 @@ static int fixed_mdio_read(struct mii_bus *bus, int 
phy_addr, int reg_num)
struct fixed_mdio_bus *fmb = bus->priv;
struct fixed_phy *fp;
 
-   if (reg_num >= MII_REGS_NUM)
-   return -1;
-
-   /* We do not support emulating Clause 45 over Clause 22 register reads
-* return an error instead of bogus data.
-*/
-   switch (reg_num) {
-   case MII_MMD_CTRL:
-   case MII_MMD_DATA:
-   return -1;
-   default:
-   break;
-   }
-
list_for_each_entry(fp, &fmb->phys, node) {
if (fp->addr == phy_addr) {
/* Issue callback if user registered it. */
if (fp->link_update) {
fp->link_update(fp->phydev->attached_dev,
&fp->status);
-   fixed_phy_update_regs(fp);
+   fixed_phy_update(fp);
}
-   return fp->regs[reg_num];
+   return swphy_read_reg(reg_num, &fp->status);
}
}
 
@@ -144,7 +125,7 @@ int fixed_phy_update_state(struct phy_device *phydev,
_UPD(pause);
_UPD(asym_pause);
 #undef _UPD
-   fixed_phy_update_regs(fp);
+   fixed_phy_update(fp);
return 0;
}
}
@@ -169,8 +150,6 @@ int fixed_phy_add(unsigned int irq, int phy_addr,
if (!fp)
return -ENOMEM;
 
-   memset(fp->regs, 0xFF,  sizeof(fp->regs[0]) * MII_REGS_NUM);
-
fmb->irqs[phy_addr] = irq;
 
fp->addr = phy_addr;
@@ -184,7 +163,7 @@ int fixed_phy_add(unsigned int irq, int phy_addr,
goto err_regs;
}
 
-   fixed_phy_update_regs(fp);
+   fixed_phy_update(fp);
 
list_add_tail(&fp->node, &fmb->phys);
 
diff --git a/drivers/net/phy/swphy.c b/drivers/net/phy/swphy.c
index 21a9bd8a7830..34f58f2349e9 100644
--- a/drivers/net/phy/swphy.c
+++ b/drivers/net/phy/swphy.c
@@ -20,6 +20,8 @@
 
 #include "swphy.h"
 
+#define MII_REGS_NUM 29
+
 struct swmii_regs {
u16 bmcr;
u16 bmsr;
@@ -110,14 +112,13 @@ int swphy_validate_state(const struct fixed_phy_status 
*state)
 EXPORT_SYMBOL_GPL(swphy_validate_state);
 
 /**
- * swphy_update_regs - update MII register array with fixed phy state
- * @regs: array of 32 registers to update
+ * swphy_read_reg - return a MII register from the fixed phy state
+ * @reg: MII register
  * @state: fixed phy status
  *
- * Update the array of MII registers with the fixed phy link, speed,
- * duplex and pause mode settings.
+ * Return the MII @reg register generated from the fixed phy state @state.
  */
-void swphy_update_regs(u16 *regs, const struct fixed_phy_status *state)
+int swphy_read_reg(int reg, const struct fixed_phy_status *state)
 {
int speed_index, duplex_index;
u16 bmsr = BMSR_ANEGCAPABLE;
@@ -125,9 +126,12 @@ void swphy_update_regs(u16 *regs, const struct 
fixed_phy_status *state)
u16 lpagb = 0;
u16 lpa = 0;
 
+   if (reg > MII_REGS_NUM)
+   return -1;
+
speed_index = swphy_decode_speed(state->speed);
if 

[PATCH RFC 02/26] phy: convert swphy register generation to tabular form

2015-12-07 Thread Russell King
Convert the swphy register generation to tabular form which allows us
to eliminate multiple switch() statements.  This results in smaller
object code, is more efficient, and makes it easier to add support
for faster speeds.
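
The technique can be sketched in isolation like this.  The example
mirrors only the BMSR column of the two tables added below; the array
and function names are invented, and the bit values are those of the
standard MII register definitions.

```c
#include <stdint.h>

#define BMSR_ESTATEN	0x0100
#define BMSR_100HALF	0x2000
#define BMSR_100FULL	0x4000

enum { IDX_SPEED_10, IDX_SPEED_100, IDX_SPEED_1000 };
enum { IDX_DUPLEX_HALF, IDX_DUPLEX_FULL };

/* Per speed: every bit that speed could set, for either duplex. */
static const uint16_t speed_bmsr[] = {
	[IDX_SPEED_10]   = 0,
	[IDX_SPEED_100]  = BMSR_100FULL | BMSR_100HALF,
	[IDX_SPEED_1000] = BMSR_ESTATEN,
};

/* Per duplex: every bit that duplex could set, for any speed. */
static const uint16_t duplex_bmsr[] = {
	[IDX_DUPLEX_HALF] = BMSR_ESTATEN | BMSR_100HALF,
	[IDX_DUPLEX_FULL] = BMSR_ESTATEN | BMSR_100FULL,
};

/* The AND of the two entries leaves exactly the bits valid for both. */
static uint16_t sketch_bmsr(int speed_idx, int duplex_idx)
{
	return speed_bmsr[speed_idx] & duplex_bmsr[duplex_idx];
}
```

Adding a new speed then means adding one row to the speed table and the
matching bits to both duplex rows, instead of touching several switch()
statements.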

Before:

Idx Name  Size  VMA   LMA   File off  Algn
  0 .text 0164      0034  2**2

   text    data     bss     dec     hex filename
    388       0       0     388     184 swphy.o

After:

Idx Name  Size  VMA   LMA   File off  Algn
  0 .text 00fc      0034  2**2
  5 .rodata   0028      0138  2**2

   text    data     bss     dec     hex filename
    324       0       0     324     144 swphy.o

Signed-off-by: Russell King 
---
 drivers/net/phy/swphy.c | 143 ++--
 1 file changed, 78 insertions(+), 65 deletions(-)

diff --git a/drivers/net/phy/swphy.c b/drivers/net/phy/swphy.c
index 0551a79a2454..c88a194b4cb6 100644
--- a/drivers/net/phy/swphy.c
+++ b/drivers/net/phy/swphy.c
@@ -20,6 +20,72 @@
 
 #include "swphy.h"
 
+struct swmii_regs {
+   u16 bmcr;
+   u16 bmsr;
+   u16 lpa;
+   u16 lpagb;
+};
+
+enum {
+   SWMII_SPEED_10 = 0,
+   SWMII_SPEED_100,
+   SWMII_SPEED_1000,
+   SWMII_DUPLEX_HALF = 0,
+   SWMII_DUPLEX_FULL,
+};
+
+/*
+ * These two tables get bitwise-anded together to produce the final result.
+ * This means the speed table must contain both duplex settings, and the
+ * duplex table must contain all speed settings.
+ */
+static const struct swmii_regs speed[] = {
+   [SWMII_SPEED_10] = {
+   .bmcr  = BMCR_FULLDPLX,
+   .lpa   = LPA_10FULL | LPA_10HALF,
+   },
+   [SWMII_SPEED_100] = {
+   .bmcr  = BMCR_FULLDPLX | BMCR_SPEED100,
+   .bmsr  = BMSR_100FULL | BMSR_100HALF,
+   .lpa   = LPA_100FULL | LPA_100HALF,
+   },
+   [SWMII_SPEED_1000] = {
+   .bmcr  = BMCR_FULLDPLX | BMCR_SPEED1000,
+   .bmsr  = BMSR_ESTATEN,
+   .lpagb = LPA_1000FULL | LPA_1000HALF,
+   },
+};
+
+static const struct swmii_regs duplex[] = {
+   [SWMII_DUPLEX_HALF] = {
+   .bmcr  = ~BMCR_FULLDPLX,
+   .bmsr  = BMSR_ESTATEN | BMSR_100HALF,
+   .lpa   = LPA_10HALF | LPA_100HALF,
+   .lpagb = LPA_1000HALF,
+   },
+   [SWMII_DUPLEX_FULL] = {
+   .bmcr  = ~0,
+   .bmsr  = BMSR_ESTATEN | BMSR_100FULL,
+   .lpa   = LPA_10FULL | LPA_100FULL,
+   .lpagb = LPA_1000FULL,
+   },
+};
+
+static int swphy_decode_speed(int speed)
+{
+   switch (speed) {
+   case 1000:
+   return SWMII_SPEED_1000;
+   case 100:
+   return SWMII_SPEED_100;
+   case 10:
+   return SWMII_SPEED_10;
+   default:
+   return -EINVAL;
+   }
+}
+
 /**
  * swphy_update_regs - update MII register array with fixed phy state
  * @regs: array of 32 registers to update
@@ -30,81 +96,28 @@
  */
 int swphy_update_regs(u16 *regs, const struct fixed_phy_status *state)
 {
+   int speed_index, duplex_index;
u16 bmsr = BMSR_ANEGCAPABLE;
u16 bmcr = 0;
u16 lpagb = 0;
u16 lpa = 0;
 
-   if (state->duplex) {
-   switch (state->speed) {
-   case 1000:
-   bmsr |= BMSR_ESTATEN;
-   break;
-   case 100:
-   bmsr |= BMSR_100FULL;
-   break;
-   case 10:
-   bmsr |= BMSR_10FULL;
-   break;
-   default:
-   break;
-   }
-   } else {
-   switch (state->speed) {
-   case 1000:
-   bmsr |= BMSR_ESTATEN;
-   break;
-   case 100:
-   bmsr |= BMSR_100HALF;
-   break;
-   case 10:
-   bmsr |= BMSR_10HALF;
-   break;
-   default:
-   break;
-   }
+   speed_index = swphy_decode_speed(state->speed);
+   if (speed_index < 0) {
+   pr_warn("swphy: unknown speed\n");
+   return -EINVAL;
}
 
+   duplex_index = state->duplex ? SWMII_DUPLEX_FULL : SWMII_DUPLEX_HALF;
+
+   bmsr |= speed[speed_index].bmsr & duplex[duplex_index].bmsr;
+
if (state->link) {
bmsr |= BMSR_LSTATUS | BMSR_ANEGCOMPLETE;
 
-   if (state->duplex) {
-   bmcr |= BMCR_FULLDPLX;
-
-   switch (state->speed) {
-   case 1000:
-   bmcr |= BMCR_SPEED1000;
-   lpagb |= LPA_1000FULL;
-   break;
-  

[PATCH RFC 09/26] phy: export phy_speed_to_str() for phylink

2015-12-07 Thread Russell King
phylink would like to reuse phy_speed_to_str() to convert the speed
to a string.  Add a prototype and export this helper function.

Signed-off-by: Russell King 
---
 drivers/net/phy/phy.c | 3 ++-
 include/linux/phy.h   | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index ec9953202f58..c1be21a84f1d 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -38,7 +38,7 @@
 
 #include 
 
-static const char *phy_speed_to_str(int speed)
+const char *phy_speed_to_str(int speed)
 {
switch (speed) {
case SPEED_10:
@@ -57,6 +57,7 @@ static const char *phy_speed_to_str(int speed)
return "Unsupported (update phy.c)";
}
 }
+EXPORT_SYMBOL_GPL(phy_speed_to_str);
 
 #define PHY_STATE_STR(_state)  \
case PHY_##_state:  \
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 63e52af00493..6b1ec2b99051 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -799,6 +799,7 @@ int phy_ethtool_sset(struct phy_device *phydev, struct 
ethtool_cmd *cmd);
 int phy_ethtool_gset(struct phy_device *phydev, struct ethtool_cmd *cmd);
 int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd);
 int phy_start_interrupts(struct phy_device *phydev);
+const char *phy_speed_to_str(int speed);
 void phy_print_status(struct phy_device *phydev);
 void phy_device_free(struct phy_device *phydev);
 int phy_set_max_speed(struct phy_device *phydev, u32 max_speed);
-- 
2.1.0


[PATCH RFC 03/26] phy: separate swphy state validation from register generation

2015-12-07 Thread Russell King
Separate out the generation of MII registers from the state validation.
This allows us to simplify the error handling in fixed_phy() by allowing
earlier error detection.
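
The benefit of validating first can be shown with a small userspace
model.  All names here (struct sw_status, struct sw_phy,
sketch_validate, sketch_add) are hypothetical; the point is only that a
check made before any resource is acquired needs no unwind path.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reduced form of the fixed phy status. */
struct sw_status {
	bool link;
	int speed;
};

struct sw_phy {
	struct sw_status status;
};

/* Only a link that is up needs a speed the registers can express. */
static int sketch_validate(const struct sw_status *st)
{
	if (!st->link)
		return 0;

	switch (st->speed) {
	case 10:
	case 100:
	case 1000:
		return 0;
	default:
		return -EINVAL;
	}
}

static int sketch_add(const struct sw_status *st, struct sw_phy **out)
{
	struct sw_phy *p;
	int ret;

	ret = sketch_validate(st);
	if (ret < 0)
		return ret;	/* nothing allocated yet, so nothing to undo */

	p = malloc(sizeof(*p));
	if (!p)
		return -ENOMEM;

	p->status = *st;
	*out = p;
	return 0;
}
```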

Signed-off-by: Russell King 
---
 drivers/net/phy/fixed_phy.c | 15 +++
 drivers/net/phy/swphy.c | 33 ++---
 drivers/net/phy/swphy.h |  3 ++-
 3 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c
index 9a448e7f8f4e..fe4ca05afc77 100644
--- a/drivers/net/phy/fixed_phy.c
+++ b/drivers/net/phy/fixed_phy.c
@@ -49,12 +49,12 @@ static struct fixed_mdio_bus platform_fmb = {
.phys = LIST_HEAD_INIT(platform_fmb.phys),
 };
 
-static int fixed_phy_update_regs(struct fixed_phy *fp)
+static void fixed_phy_update_regs(struct fixed_phy *fp)
 {
if (gpio_is_valid(fp->link_gpio))
fp->status.link = !!gpio_get_value_cansleep(fp->link_gpio);
 
-   return swphy_update_regs(fp->regs, &fp->status);
+   swphy_update_regs(fp->regs, &fp->status);
 }
 
 static int fixed_mdio_read(struct mii_bus *bus, int phy_addr, int reg_num)
@@ -161,6 +161,10 @@ int fixed_phy_add(unsigned int irq, int phy_addr,
struct fixed_mdio_bus *fmb = &platform_fmb;
struct fixed_phy *fp;
 
+   ret = swphy_validate_state(status);
+   if (ret < 0)
+   return ret;
+
fp = kzalloc(sizeof(*fp), GFP_KERNEL);
if (!fp)
return -ENOMEM;
@@ -180,17 +184,12 @@ int fixed_phy_add(unsigned int irq, int phy_addr,
goto err_regs;
}
 
-   ret = fixed_phy_update_regs(fp);
-   if (ret)
-   goto err_gpio;
+   fixed_phy_update_regs(fp);
 
list_add_tail(&fp->node, &fmb->phys);
 
return 0;
 
-err_gpio:
-   if (gpio_is_valid(fp->link_gpio))
-   gpio_free(fp->link_gpio);
 err_regs:
kfree(fp);
return ret;
diff --git a/drivers/net/phy/swphy.c b/drivers/net/phy/swphy.c
index c88a194b4cb6..21a9bd8a7830 100644
--- a/drivers/net/phy/swphy.c
+++ b/drivers/net/phy/swphy.c
@@ -87,6 +87,29 @@ static int swphy_decode_speed(int speed)
 }
 
 /**
+ * swphy_validate_state - validate the software phy status
+ * @state: software phy status
+ *
+ * This checks that the state stored in @state can be represented in the
+ * emulated MII registers.  Returns 0 if it can, otherwise returns
+ * -EINVAL.
+ */
+int swphy_validate_state(const struct fixed_phy_status *state)
+{
+   int err;
+
+   if (state->link) {
+   err = swphy_decode_speed(state->speed);
+   if (err < 0) {
+   pr_warn("swphy: unknown speed\n");
+   return -EINVAL;
+   }
+   }
+   return 0;
+}
+EXPORT_SYMBOL_GPL(swphy_validate_state);
+
+/**
  * swphy_update_regs - update MII register array with fixed phy state
  * @regs: array of 32 registers to update
  * @state: fixed phy status
@@ -94,7 +117,7 @@ static int swphy_decode_speed(int speed)
  * Update the array of MII registers with the fixed phy link, speed,
  * duplex and pause mode settings.
  */
-int swphy_update_regs(u16 *regs, const struct fixed_phy_status *state)
+void swphy_update_regs(u16 *regs, const struct fixed_phy_status *state)
 {
int speed_index, duplex_index;
u16 bmsr = BMSR_ANEGCAPABLE;
@@ -103,10 +126,8 @@ int swphy_update_regs(u16 *regs, const struct 
fixed_phy_status *state)
u16 lpa = 0;
 
speed_index = swphy_decode_speed(state->speed);
-   if (speed_index < 0) {
-   pr_warn("swphy: unknown speed\n");
-   return -EINVAL;
-   }
+   if (WARN_ON(speed_index < 0))
+   return;
 
duplex_index = state->duplex ? SWMII_DUPLEX_FULL : SWMII_DUPLEX_HALF;
 
@@ -133,7 +154,5 @@ int swphy_update_regs(u16 *regs, const struct 
fixed_phy_status *state)
regs[MII_BMCR] = bmcr;
regs[MII_LPA] = lpa;
regs[MII_STAT1000] = lpagb;
-
-   return 0;
 }
 EXPORT_SYMBOL_GPL(swphy_update_regs);
diff --git a/drivers/net/phy/swphy.h b/drivers/net/phy/swphy.h
index feaa38ff86a2..33d2e061896e 100644
--- a/drivers/net/phy/swphy.h
+++ b/drivers/net/phy/swphy.h
@@ -3,6 +3,7 @@
 
 struct fixed_phy_status;
 
-int swphy_update_regs(u16 *regs, const struct fixed_phy_status *state);
+int swphy_validate_state(const struct fixed_phy_status *state);
+void swphy_update_regs(u16 *regs, const struct fixed_phy_status *state);
 
 #endif
-- 
2.1.0


[PATCH RFC 05/26] phy: improve safety of fixed-phy MII register reading

2015-12-07 Thread Russell King
Nothing prevents fixed_mdio_read() and fixed_phy_update_state() from
running concurrently, which can result in the state being modified
while it's being inspected.  Fix this by using a seqcount to detect
modifications, and memcpy()ing the state.

We remain slightly naughty here, calling link_update() and updating
the link status within the read-side loop - which would need rework
of the design to change.
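
For readers unfamiliar with the pattern, the read/retry protocol looks
roughly like this in userspace C11.  It is a sketch of the idea only,
not the kernel's seqcount_t: the names are invented, a single writer is
assumed, and the plain struct copy relies on the retry check rather than
on the memory-ordering guarantees the kernel primitives provide.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reduced form of the fixed phy status. */
struct sketch_state {
	bool link;
	int speed;
	bool duplex;
};

struct sketch_phy {
	atomic_uint seq;		/* odd while an update is in progress */
	struct sketch_state status;
};

/* Writer: bump the counter before and after changing the state. */
static void sketch_write(struct sketch_phy *p, const struct sketch_state *ns)
{
	atomic_fetch_add(&p->seq, 1);	/* counter becomes odd */
	p->status = *ns;
	atomic_fetch_add(&p->seq, 1);	/* counter becomes even again */
}

/* Reader: copy the state, retry if a writer was active or intervened. */
static struct sketch_state sketch_read(struct sketch_phy *p)
{
	struct sketch_state snap;
	unsigned int start, end;

	do {
		start = atomic_load(&p->seq);
		snap = p->status;
		end = atomic_load(&p->seq);
	} while ((start & 1) || start != end);

	return snap;
}
```

The register value is then computed from the private copy, so a
concurrent update can no longer change it halfway through.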

Signed-off-by: Russell King 
---
 drivers/net/phy/fixed_phy.c | 28 +---
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c
index 66555da2ef27..474cc39a5457 100644
--- a/drivers/net/phy/fixed_phy.c
+++ b/drivers/net/phy/fixed_phy.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "swphy.h"
 
@@ -35,6 +36,7 @@ struct fixed_mdio_bus {
 struct fixed_phy {
int addr;
struct phy_device *phydev;
+   seqcount_t seqcount;
struct fixed_phy_status status;
int (*link_update)(struct net_device *, struct fixed_phy_status *);
struct list_head node;
@@ -59,13 +61,21 @@ static int fixed_mdio_read(struct mii_bus *bus, int 
phy_addr, int reg_num)
 
list_for_each_entry(fp, &fmb->phys, node) {
if (fp->addr == phy_addr) {
-   /* Issue callback if user registered it. */
-   if (fp->link_update) {
-   fp->link_update(fp->phydev->attached_dev,
-   &fp->status);
-   fixed_phy_update(fp);
-   }
-   return swphy_read_reg(reg_num, &fp->status);
+   struct fixed_phy_status state;
+   int s;
+
+   do {
+   s = read_seqcount_begin(&fp->seqcount);
+   /* Issue callback if user registered it. */
+   if (fp->link_update) {
+   fp->link_update(fp->phydev->attached_dev,
+   &fp->status);
+   fixed_phy_update(fp);
+   }
+   state = fp->status;
+   } while (read_seqcount_retry(&fp->seqcount, s));
+
+   return swphy_read_reg(reg_num, &state);
}
}
 
@@ -117,6 +127,7 @@ int fixed_phy_update_state(struct phy_device *phydev,
 
list_for_each_entry(fp, &fmb->phys, node) {
if (fp->addr == phydev->addr) {
+   write_seqcount_begin(&fp->seqcount);
 #define _UPD(x) if (changed->x) \
fp->status.x = status->x
_UPD(link);
@@ -126,6 +137,7 @@ int fixed_phy_update_state(struct phy_device *phydev,
_UPD(asym_pause);
 #undef _UPD
fixed_phy_update(fp);
+   write_seqcount_end(&fp->seqcount);
return 0;
}
}
@@ -150,6 +162,8 @@ int fixed_phy_add(unsigned int irq, int phy_addr,
if (!fp)
return -ENOMEM;
 
+   seqcount_init(&fp->seqcount);
+
fmb->irqs[phy_addr] = irq;
 
fp->addr = phy_addr;
-- 
2.1.0


[PATCH RFC 01/26] phy: move fixed_phy MII register generation to a library

2015-12-07 Thread Russell King
Move the fixed_phy MII register generation to a library to allow other
software phy implementations to use this code.

Signed-off-by: Russell King 
---
 drivers/net/phy/Kconfig |   4 ++
 drivers/net/phy/Makefile|   3 +-
 drivers/net/phy/fixed_phy.c |  95 ++---
 drivers/net/phy/swphy.c | 126 
 drivers/net/phy/swphy.h |   8 +++
 5 files changed, 143 insertions(+), 93 deletions(-)
 create mode 100644 drivers/net/phy/swphy.c
 create mode 100644 drivers/net/phy/swphy.h

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 60994a83a0d6..c59f957fc282 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -12,6 +12,9 @@ menuconfig PHYLIB
 
 if PHYLIB
 
+config SWPHY
+   bool
+
 comment "MII PHY device drivers"
 
 config AQUANTIA_PHY
@@ -159,6 +162,7 @@ config MICROCHIP_PHY
 config FIXED_PHY
tristate "Driver for MDIO Bus/PHY emulation with fixed speed/link PHYs"
depends on PHYLIB
+   select SWPHY
---help---
  Adds the platform "fixed" MDIO Bus to cover the boards that use
  PHYs that are not connected to the real MDIO bus.
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index f31a4e25cf15..31bdf193adbd 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -1,6 +1,7 @@
 # Makefile for Linux PHY drivers
 
-libphy-objs:= phy.o phy_device.o mdio_bus.o
+libphy-y   := phy.o phy_device.o mdio_bus.o
+libphy-$(CONFIG_SWPHY) += swphy.o
 
 obj-$(CONFIG_PHYLIB)   += libphy.o
 obj-$(CONFIG_AQUANTIA_PHY) += aquantia.o
diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c
index e23bf5b90e17..9a448e7f8f4e 100644
--- a/drivers/net/phy/fixed_phy.c
+++ b/drivers/net/phy/fixed_phy.c
@@ -24,6 +24,8 @@
 #include 
 #include 
 
+#include "swphy.h"
+
 #define MII_REGS_NUM 29
 
 struct fixed_mdio_bus {
@@ -49,101 +51,10 @@ static struct fixed_mdio_bus platform_fmb = {
 
 static int fixed_phy_update_regs(struct fixed_phy *fp)
 {
-   u16 bmsr = BMSR_ANEGCAPABLE;
-   u16 bmcr = 0;
-   u16 lpagb = 0;
-   u16 lpa = 0;
-
if (gpio_is_valid(fp->link_gpio))
fp->status.link = !!gpio_get_value_cansleep(fp->link_gpio);
 
-   if (fp->status.duplex) {
-   switch (fp->status.speed) {
-   case 1000:
-   bmsr |= BMSR_ESTATEN;
-   break;
-   case 100:
-   bmsr |= BMSR_100FULL;
-   break;
-   case 10:
-   bmsr |= BMSR_10FULL;
-   break;
-   default:
-   break;
-   }
-   } else {
-   switch (fp->status.speed) {
-   case 1000:
-   bmsr |= BMSR_ESTATEN;
-   break;
-   case 100:
-   bmsr |= BMSR_100HALF;
-   break;
-   case 10:
-   bmsr |= BMSR_10HALF;
-   break;
-   default:
-   break;
-   }
-   }
-
-   if (fp->status.link) {
-   bmsr |= BMSR_LSTATUS | BMSR_ANEGCOMPLETE;
-
-   if (fp->status.duplex) {
-   bmcr |= BMCR_FULLDPLX;
-
-   switch (fp->status.speed) {
-   case 1000:
-   bmcr |= BMCR_SPEED1000;
-   lpagb |= LPA_1000FULL;
-   break;
-   case 100:
-   bmcr |= BMCR_SPEED100;
-   lpa |= LPA_100FULL;
-   break;
-   case 10:
-   lpa |= LPA_10FULL;
-   break;
-   default:
-   pr_warn("fixed phy: unknown speed\n");
-   return -EINVAL;
-   }
-   } else {
-   switch (fp->status.speed) {
-   case 1000:
-   bmcr |= BMCR_SPEED1000;
-   lpagb |= LPA_1000HALF;
-   break;
-   case 100:
-   bmcr |= BMCR_SPEED100;
-   lpa |= LPA_100HALF;
-   break;
-   case 10:
-   lpa |= LPA_10HALF;
-   break;
-   default:
-   pr_warn("fixed phy: unknown speed\n");
-   return -EINVAL;
-   }
-   }
-
-   if (fp->status.pause)
-   lpa |= 

[PATCH RFC 14/26] sfp: display SFP module information

2015-12-07 Thread Russell King
Signed-off-by: Russell King 
---
 drivers/net/phy/sfp.c | 245 +-
 1 file changed, 244 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 70a375403e55..678298844203 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -248,6 +248,180 @@ static unsigned int sfp_check(void *buf, size_t len)
return check;
 }
 
+static const char *sfp_link_len(char *buf, size_t size, unsigned int length,
+   unsigned int multiplier)
+{
+   if (length == 0)
+   return "unsupported/unspecified";
+
+   if (length == 255) {
+   *buf++ = '>';
+   size -= 1;
+   length -= 1;
+   }
+
+   length *= multiplier;
+
+   if (length >= 1000)
+   snprintf(buf, size, "%u.%0*ukm",
+   length / 1000,
+   multiplier > 100 ? 1 :
+   multiplier > 10 ? 2 : 3,
+   length % 1000);
+   else
+   snprintf(buf, size, "%um", length);
+
+   return buf;
+}
+
+struct bitfield {
+   unsigned int mask;
+   unsigned int val;
+   const char *str;
+};
+
+static const struct bitfield sfp_options[] = {
+   {
+   .mask = SFP_OPTIONS_HIGH_POWER_LEVEL,
+   .val = SFP_OPTIONS_HIGH_POWER_LEVEL,
+   .str = "hpl",
+   }, {
+   .mask = SFP_OPTIONS_PAGING_A2,
+   .val = SFP_OPTIONS_PAGING_A2,
+   .str = "paginga2",
+   }, {
+   .mask = SFP_OPTIONS_RETIMER,
+   .val = SFP_OPTIONS_RETIMER,
+   .str = "retimer",
+   }, {
+   .mask = SFP_OPTIONS_COOLED_XCVR,
+   .val = SFP_OPTIONS_COOLED_XCVR,
+   .str = "cooled",
+   }, {
+   .mask = SFP_OPTIONS_POWER_DECL,
+   .val = SFP_OPTIONS_POWER_DECL,
+   .str = "powerdecl",
+   }, {
+   .mask = SFP_OPTIONS_RX_LINEAR_OUT,
+   .val = SFP_OPTIONS_RX_LINEAR_OUT,
+   .str = "rxlinear",
+   }, {
+   .mask = SFP_OPTIONS_RX_DECISION_THRESH,
+   .val = SFP_OPTIONS_RX_DECISION_THRESH,
+   .str = "rxthresh",
+   }, {
+   .mask = SFP_OPTIONS_TUNABLE_TX,
+   .val = SFP_OPTIONS_TUNABLE_TX,
+   .str = "tunabletx",
+   }, {
+   .mask = SFP_OPTIONS_RATE_SELECT,
+   .val = SFP_OPTIONS_RATE_SELECT,
+   .str = "ratesel",
+   }, {
+   .mask = SFP_OPTIONS_TX_DISABLE,
+   .val = SFP_OPTIONS_TX_DISABLE,
+   .str = "txdisable",
+   }, {
+   .mask = SFP_OPTIONS_TX_FAULT,
+   .val = SFP_OPTIONS_TX_FAULT,
+   .str = "txfault",
+   }, {
+   .mask = SFP_OPTIONS_LOS_INVERTED,
+   .val = SFP_OPTIONS_LOS_INVERTED,
+   .str = "los-",
+   }, {
+   .mask = SFP_OPTIONS_LOS_NORMAL,
+   .val = SFP_OPTIONS_LOS_NORMAL,
+   .str = "los+",
+   }, { }
+};
+
+static const struct bitfield diagmon[] = {
+   {
+   .mask = SFP_DIAGMON_DDM,
+   .val = SFP_DIAGMON_DDM,
+   .str = "ddm",
+   }, {
+   .mask = SFP_DIAGMON_INT_CAL,
+   .val = SFP_DIAGMON_INT_CAL,
+   .str = "intcal",
+   }, {
+   .mask = SFP_DIAGMON_EXT_CAL,
+   .val = SFP_DIAGMON_EXT_CAL,
+   .str = "extcal",
+   }, {
+   .mask = SFP_DIAGMON_RXPWR_AVG,
+   .val = SFP_DIAGMON_RXPWR_AVG,
+   .str = "rxpwravg",
+   }, { }
+};
+
+static const char *sfp_bitfield(char *out, size_t outsz, const struct bitfield *bits, unsigned int val)
+{
+   char *p = out;
+   int n;
+
+   *p = '\0';
+   while (bits->mask) {
+   if ((val & bits->mask) == bits->val) {
+   n = snprintf(p, outsz, "%s%s",
+out != p ? ", " : "",
+bits->str);
+   if (n >= outsz)
+   break;
+   p += n;
+   outsz -= n;
+   }
+   bits++;
+   }
+
+   return out;
+}
+
+static const char *sfp_connector(unsigned int connector)
+{
+   switch (connector) {
+   case SFP_CONNECTOR_UNSPEC:
+   return "unknown/unspecified";
+   case SFP_CONNECTOR_FIBERJACK:
+   return "Fiberjack";
+   case SFP_CONNECTOR_LC:
+   return "LC";
+   case SFP_CONNECTOR_MT_RJ:
+   return "MT-RJ";
+   case SFP_CONNECTOR_MU:
+   return "MU";
+   case SFP_CONNECTOR_SG:
+   return "SG";
+   case SFP_CONNECTOR_OPTICAL_PIGTAIL:
+   return "Optical 

Re: [RFC PATCH net-next 1/2] tcp: RTO Restart (RTOR)

2015-12-07 Thread Marcelo Ricardo Leitner
On Mon, Dec 07, 2015 at 10:00:11AM +0100, Per Hurtig wrote:
> This patch implements the RTO restart modification (RTOR). When data is
> ACKed, and the RTO timer is restarted, the time elapsed since the last
> outstanding segment was transmitted is subtracted from the calculated RTO
> value. This way, the RTO timer will expire after exactly RTO seconds, and
> not RTO + RTT [+ delACK] seconds.
> 
> This patch also implements a new sysctl (tcp_timer_restart) that is used
> to control the timer restart behavior.
> 
> Signed-off-by: Per Hurtig 
> ---
>  Documentation/networking/ip-sysctl.txt | 12 
>  include/net/tcp.h  |  4 
>  net/ipv4/sysctl_net_ipv4.c | 10 ++
>  net/ipv4/tcp_input.c   | 24 
>  4 files changed, 50 insertions(+)
> 
> diff --git a/Documentation/networking/ip-sysctl.txt 
> b/Documentation/networking/ip-sysctl.txt
> index 2ea4c45..4094128 100644
> --- a/Documentation/networking/ip-sysctl.txt
> +++ b/Documentation/networking/ip-sysctl.txt

(snip)

> @@ -2997,6 +2998,18 @@ static void tcp_cong_avoid(struct sock *sk, u32 ack, 
> u32 acked)
>   tcp_sk(sk)->snd_cwnd_stamp = tcp_time_stamp;
>  }
>  
> +static u32 tcp_unsent_pkts(const struct sock *sk)
> +{
> + struct sk_buff *skb = tcp_send_head(sk);
> + u32 pkts = 0;
> +
> + if (skb)
> + tcp_for_write_queue_from(skb, sk)
> + pkts += tcp_skb_pcount(skb);
> +
> + return pkts;
> +}
> +
>  /* Restart timer after forward progress on connection.
>   * RFC2988 recommends to restart timer to now+rto.
>   */
> @@ -3027,6 +3040,17 @@ void tcp_rearm_rto(struct sock *sk)
>*/
>   if (delta > 0)
>   rto = delta;
> + } else if (icsk->icsk_pending == ICSK_TIME_RETRANS &&
> +(sysctl_tcp_timer_restart == 1 ||
> + sysctl_tcp_timer_restart == 3) &&
> +(tp->packets_out + tcp_unsent_pkts(sk) <
> + TCP_RTORESTART_THRESH)) {

(snip)

By the time this gets hit, you could have a big write queue.
What about wrapping at least this condition
tp->packets_out + tcp_unsent_pkts(sk) < TCP_RTORESTART_THRESH
in its own check function? Like:

+static bool tcp_can_rtor(const struct sock *sk)
+{
+   const struct tcp_sock *tp = tcp_sk(sk);
+   struct sk_buff *skb = tcp_send_head(sk);
+   s32 target = TCP_RTORESTART_THRESH - tp->packets_out;
+
+   if (target <= 0)
+   return false;
+
+   if (skb) {
+   tcp_for_write_queue_from(skb, sk) {
+   target -= tcp_skb_pcount(skb);
+   if (target <= 0)
+   return false;
+   }
+   }
+
+   return true;
+}

This way it will only traverse what is needed for the check itself.
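
The early-exit idea can be modelled outside the kernel. The sketch below is
a user-space illustration only: struct seg, count_all() and
below_threshold() are hypothetical stand-ins for the write queue,
tcp_unsent_pkts() and the proposed tcp_can_rtor(), and the threshold is
passed in as a parameter rather than taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one queued skb; pcount models tcp_skb_pcount(). */
struct seg {
	unsigned int pcount;
	struct seg *next;
};

/* Models the original approach: walk the whole queue and sum every segment. */
static unsigned int count_all(const struct seg *head, unsigned int *visited)
{
	unsigned int pkts = 0;

	for (; head; head = head->next) {
		pkts += head->pcount;
		(*visited)++;
	}
	return pkts;
}

/* Models the suggested check: stop walking once the budget is used up. */
static bool below_threshold(const struct seg *head, unsigned int packets_out,
			    unsigned int thresh, unsigned int *visited)
{
	long target = (long)thresh - (long)packets_out;

	if (target <= 0)
		return false;

	for (; head; head = head->next) {
		(*visited)++;
		target -= head->pcount;
		if (target <= 0)
			return false;
	}
	return true;
}
```

The visited counter exists only to make the difference observable: the
bounded walk never looks at more segments than the threshold allows,
however long the queue is.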

  Marcelo



[PATCH RFC 00/26] Phylink & SFP support

2015-12-07 Thread Russell King - ARM Linux
Hi,

SFP modules are hot-pluggable ethernet transceivers; they can be
detected at runtime and configured accordingly.  There is a range of
modules offering many different features.

Some SFP modules have conventional PHYs integrated into them, others
drive a laser diode from the Serdes bus.  Some have monitoring, others
do not.

Some SFP modules want to use SGMII over the Serdes link, others want
to use 1000base-X over the Serdes link.

This makes it non-trivial to support with the existing code structure.
Not wanting to write something specific to the mvneta driver, I decided
to have a go at coming up with something more generic.

My initial attempts were to provide a PHY driver, but I found that
phylib's state machine got in the way, and it was hard to support two
chained PHYs.  Conversely, having a fixed DT specified setup (via
the fixed phy infrastructure) would allow some SFP modules to work, but
not others.  The same is true of the "managed" in-band status (which
is SGMII.)

The result is that I came up with phylink - an infrastructure layer
which sits between the network driver and any attached PHY, and an
SFP module layer which detects the SFP module and configures phylink
accordingly.

Overall, this supports:

* switching the serdes mode at the NIC driver
* controlling autonegotiation and autoneg results
* allowing PHYs to be hotplugged
* allowing SFP modules to be hotplugged with proper link indication
* fixed-mode links without involving phylib
* flow control
* EEE support
* reading SFP module EEPROMs

Overall, phylink supports several link modes, with dynamic switching
possible between these:
* A true fixed link mode, where the parameters are set by DT.
* PHY mode, where we read the negotiation results from the PHY registers
  and pass them to the NIC driver.
* SGMII mode, where the in-band status indicates the speed, duplex and
  flow control settings of the link partner.
* 1000base-X mode, where the in-band status indicates only duplex and
  flow control settings (different, incompatible bit layout from SGMII.)

Ethtool support is included, as well as emulation of the MII registers
for situations where a PHY is not attached, giving compatible emulation
of existing user interfaces where required.
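
As a rough illustration of what such MII register emulation involves, here
is a standalone user-space sketch that derives BMCR and BMSR values from a
link state, following the shape of the code being removed from fixed_phy.c
in patch 01/26.  struct link_state, struct mii_regs and emulate_mii() are
hypothetical names, not the swphy interface, and the link partner
registers are left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values from IEEE 802.3 clause 22 (the same values the kernel's
 * mii.h uses), repeated here so the sketch builds on its own.
 */
#define BMCR_SPEED1000		0x0040
#define BMCR_FULLDPLX		0x0100
#define BMCR_SPEED100		0x2000
#define BMSR_LSTATUS		0x0004
#define BMSR_ANEGCAPABLE	0x0008
#define BMSR_ANEGCOMPLETE	0x0020
#define BMSR_ESTATEN		0x0100
#define BMSR_10HALF		0x0800
#define BMSR_10FULL		0x1000
#define BMSR_100HALF		0x2000
#define BMSR_100FULL		0x4000

/* Hypothetical description of a fixed link. */
struct link_state {
	bool link;
	bool duplex;	/* true = full duplex */
	int speed;	/* 10, 100 or 1000 */
};

/* Hypothetical holder for the two emulated registers. */
struct mii_regs {
	uint16_t bmcr;
	uint16_t bmsr;
};

/* Returns 0 on success, -1 if the link is up at an unknown speed. */
static int emulate_mii(const struct link_state *s, struct mii_regs *r)
{
	uint16_t bmsr = BMSR_ANEGCAPABLE;
	uint16_t bmcr = 0;

	/* Capability bits are reported whether or not the link is up. */
	switch (s->speed) {
	case 1000:
		bmsr |= BMSR_ESTATEN;
		break;
	case 100:
		bmsr |= s->duplex ? BMSR_100FULL : BMSR_100HALF;
		break;
	case 10:
		bmsr |= s->duplex ? BMSR_10FULL : BMSR_10HALF;
		break;
	default:
		break;
	}

	if (s->link) {
		bmsr |= BMSR_LSTATUS | BMSR_ANEGCOMPLETE;
		if (s->duplex)
			bmcr |= BMCR_FULLDPLX;

		switch (s->speed) {
		case 1000:
			bmcr |= BMCR_SPEED1000;
			break;
		case 100:
			bmcr |= BMCR_SPEED100;
			break;
		case 10:
			break;
		default:
			return -1;
		}
	}

	r->bmcr = bmcr;
	r->bmsr = bmsr;
	return 0;
}
```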

The patches here include modification of mvneta (against 4.4-rc1, so
they probably won't apply to current development tips.)  It basically
hooks into the places where phylib would hook in.

DT wise, the changes needed to support SFP look like this (example
taken from Clearfog):

ethernet@34000 {
+   managed = "in-band-status";
phy-mode = "sgmii";
status = "okay";
-
-   fixed-link {
-   speed = <1000>;
-   full-duplex;
-   };
};
...
+   sfp: sfp {
+   compatible = "sff,sfp";
+   i2c-bus = <>;
+   los-gpio = < 12 GPIO_ACTIVE_HIGH>;
+   moddef0-gpio = < 15 GPIO_ACTIVE_LOW>;
+   sfp,ethernet = <>;
+   tx-disable-gpio = < 14 GPIO_ACTIVE_HIGH>;
+   tx-fault-gpio = < 13 GPIO_ACTIVE_HIGH>;
+   };

These DT changes are omitted from this patch set as the baseline DT
file is not in mainline yet (has been submitted.)

 drivers/net/ethernet/marvell/Kconfig  |2 +-
 drivers/net/ethernet/marvell/mvneta.c |  537 +-
 drivers/net/phy/Kconfig   |   29 +
 drivers/net/phy/Makefile  |6 +-
 drivers/net/phy/fixed_phy.c   |  178 +
 drivers/net/phy/marvell.c |2 +-
 drivers/net/phy/mdio-i2c.c|   90 +++
 drivers/net/phy/mdio-i2c.h|   19 +
 drivers/net/phy/phy.c |   46 +-
 drivers/net/phy/phy_device.c  |   15 +
 drivers/net/phy/phylink.c | 1138 ++
 drivers/net/phy/sfp.c | 1235 +
 drivers/net/phy/swphy.c   |  179 +
 drivers/net/phy/swphy.h   |9 +
 include/linux/phy.h   |4 +
 include/linux/phy_fixed.h |9 -
 include/linux/phylink.h   |  102 +++
 include/linux/sfp.h   |  338 +
 18 files changed, 3577 insertions(+), 361 deletions(-)

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: IPsec workshop/BoF at netdev1.1?

2015-12-07 Thread Eric Dumazet
On Mon, 2015-12-07 at 13:00 +0100, Steffen Klassert wrote:
> Is there any interest in doing an IPsec workshop/BoF at netdev1.1?
> 
> This mail is to probe if we can gather enough discussion topics to run
> such a workshop/BoF. So if someone is interested to attend and/or has a
> related discussion topic, please let me know.
> 
> My current topics:
> 
> - Adding a software GRO/GSO codepath for IPsec. I have working example code,
>   a RFC version could be ready before the netdev conference.


- Remove rwlock usage (xfrm_policy_lock) from fast path and switch to
RCU?




> - Can we support IPsec offloading to NICs? There are NICs that can handle
>   IPsec, so what are the requirements to support this?
> 




[PATCH net] xfrm: take care of request sockets

2015-12-07 Thread Eric Dumazet
From: Eric Dumazet 

TCP SYNACK messages might now be attached to request sockets.

XFRM needs to get back to a listener socket.

Add new helpers that might be used elsewhere:
sk_to_full_sk() and sk_const_to_full_sk()

Note: We also need to add RCU protection for xfrm lookups,
now that TCP/DCCP have lockless listener processing. This will
be addressed in separate patches.
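
For readers unfamiliar with the pattern, the following user-space sketch
shows the idea behind the helpers: a request socket embeds a minimal
socket as its first member and carries a pointer to its listener, so code
that needs a full socket follows that pointer.  All names here
(sock_model, reqsk_model, to_full_sk) are hypothetical and only mimic the
kernel structures.

```c
#include <stddef.h>

/* Hypothetical socket states; SK_NEW_SYN_RECV models TCP_NEW_SYN_RECV. */
enum sk_state { SK_LISTEN, SK_ESTABLISHED, SK_NEW_SYN_RECV };

/* Hypothetical minimal socket. */
struct sock_model {
	enum sk_state state;
	int policy;
};

/* Hypothetical request socket: the embedded socket must stay first so a
 * pointer to it can be converted back to the containing structure.
 */
struct reqsk_model {
	struct sock_model sk;
	struct sock_model *listener;
};

/* Return the listener for a request socket, the socket itself otherwise. */
static struct sock_model *to_full_sk(struct sock_model *sk)
{
	if (sk && sk->state == SK_NEW_SYN_RECV)
		sk = ((struct reqsk_model *)sk)->listener;
	return sk;
}
```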

Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener")
Reported-by: Dave Jones 
Signed-off-by: Eric Dumazet 
Cc: Steffen Klassert 
---
 include/net/inet_sock.h |   27 +++
 net/xfrm/xfrm_policy.c  |2 ++
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 2134e6d815bc..625bdf95d673 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -210,18 +210,37 @@ struct inet_sock {
 #define IP_CMSG_ORIGDSTADDR BIT(6)
 #define IP_CMSG_CHECKSUM   BIT(7)
 
-/* SYNACK messages might be attached to request sockets.
+/**
+ * sk_to_full_sk - Access to a full socket
+ * @sk: pointer to a socket
+ *
+ * SYNACK messages might be attached to request sockets.
  * Some places want to reach the listener in this case.
  */
-static inline struct sock *skb_to_full_sk(const struct sk_buff *skb)
+static inline struct sock *sk_to_full_sk(struct sock *sk)
 {
-   struct sock *sk = skb->sk;
-
+#ifdef CONFIG_INET
if (sk && sk->sk_state == TCP_NEW_SYN_RECV)
sk = inet_reqsk(sk)->rsk_listener;
+#endif
+   return sk;
+}
+
+/* sk_to_full_sk() variant with a const argument */
+static inline const struct sock *sk_const_to_full_sk(const struct sock *sk)
+{
+#ifdef CONFIG_INET
+   if (sk && sk->sk_state == TCP_NEW_SYN_RECV)
+   sk = ((const struct request_sock *)sk)->rsk_listener;
+#endif
return sk;
 }
 
+static inline struct sock *skb_to_full_sk(const struct sk_buff *skb)
+{
+   return sk_to_full_sk(skb->sk);
+}
+
 static inline struct inet_sock *inet_sk(const struct sock *sk)
 {
return (struct inet_sock *)sk;
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 09bfcbac63bb..18276f0cc32b 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2198,6 +2198,7 @@ struct dst_entry *xfrm_lookup(struct net *net, struct dst_entry *dst_orig,
xdst = NULL;
route = NULL;
 
+   sk = sk_const_to_full_sk(sk);
if (sk && sk->sk_policy[XFRM_POLICY_OUT]) {
num_pols = 1;
pols[0] = xfrm_sk_policy_lookup(sk, XFRM_POLICY_OUT, fl);
@@ -2477,6 +2478,7 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb,
}
 
pol = NULL;
+   sk = sk_to_full_sk(sk);
if (sk && sk->sk_policy[dir]) {
pol = xfrm_sk_policy_lookup(sk, dir, &fl);
if (IS_ERR(pol)) {




Re: [PATCH v2 8/8] treewide: Remove newlines inside DEFINE_PER_CPU() macros

2015-12-07 Thread Michal Marek
On 2015-12-07 17:33, David Laight wrote:
> From: Michal Marek
>> Sent: 04 December 2015 15:26
>> Otherwise make tags can't parse them:
>>
>> ctags: Warning: arch/ia64/kernel/smp.c:60: null expansion of name pattern 
>> "\1"
> ...
> 
> Seems to me you need to fix ctags.

I'm sure the maintainers of ctags and etags would accept patches to
describe a custom context-free grammar via commandline options, but
until then, let's continue using the regular expressions in tags.sh and
remove newlines in macros that tags.sh is trying to expand.

Michal


Re: [PATCH v2 8/8] treewide: Remove newlines inside DEFINE_PER_CPU() macros

2015-12-07 Thread Michal Marek
On 2015-12-07 18:04, Joe Perches wrote:
> On Mon, 2015-12-07 at 17:53 +0100, Michal Marek wrote:
>> On 2015-12-07 17:33, David Laight wrote:
>>> From: Michal Marek
>>>> Sent: 04 December 2015 15:26
>>>> Otherwise make tags can't parse them:
>>>>
>>>> ctags: Warning: arch/ia64/kernel/smp.c:60: null expansion of name pattern
>>>> "\1"
>>> ...
>>>
>>> Seems to me you need to fix ctags.
>>
>> I'm sure the maintainers of ctags and etags would accept patches to
>> describe a custom context-free grammar via commandline options, but
>> until then, let's continue using the regular expressions in tags.sh and
>> remove newlines in macros that tags.sh is trying to expand.
>>
> 
> Do you have a list of the most common macros?

In practice, it's only DEFINE_PER_CPU and its sibling
DEFINE_PER_CPU_SHARED_ALIGNED, where we try to pick the second argument
to the macro and the first argument can be lengthy.
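
To make the failure mode concrete, here is a small user-space sketch using
POSIX regular expressions.  The pattern and the helper percpu_name() are
illustrative assumptions, not the expression used in tags.sh; they only
show why a line-based match yields an empty name when the macro is split
after its first argument.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Illustrative pattern only: group 1 is the type argument, group 2 the
 * variable name that a tags generator would want to record.
 */
#define PERCPU_RE "DEFINE_PER_CPU\\(([^,]*), *([a-zA-Z0-9_]*)"

/* Copy the second macro argument found on one line into out.
 * Returns the length of the name, or -1 if the line does not match.
 * A return of 0 corresponds to the "null expansion" case.
 */
static int percpu_name(const char *line, char *out, size_t outsz)
{
	regex_t re;
	regmatch_t m[3];
	int len = -1;

	if (outsz == 0 || regcomp(&re, PERCPU_RE, REG_EXTENDED))
		return -1;

	if (regexec(&re, line, 3, m, 0) == 0) {
		len = (int)(m[2].rm_eo - m[2].rm_so);
		if ((size_t)len >= outsz)
			len = (int)outsz - 1;
		memcpy(out, line + m[2].rm_so, (size_t)len);
		out[len] = '\0';
	}

	regfree(&re);
	return len;
}
```

With the whole macro on one line the name is found; when the line ends
after the first argument, the second group matches nothing and the name
comes back empty.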


> Perhaps it'd be good to add exceptions to checkpatch
> 80 column line rules for them.

Your call. But this is a fairly rare occurrence -- 10 cases so far.

Michal


[PATCH RFC 17/26] phylink: add ethtool nway_reset support

2015-12-07 Thread Russell King
Add ethtool nway_reset support to phylink, to allow userspace to
request a re-negotiation of the link.

Signed-off-by: Russell King 
---
 drivers/net/phy/phylink.c | 14 ++
 include/linux/phylink.h   |  1 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 7d56e5895087..908235fb16c1 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -659,6 +659,20 @@ int phylink_ethtool_set_settings(struct phylink *pl, struct ethtool_cmd *cmd)
 }
 EXPORT_SYMBOL_GPL(phylink_ethtool_set_settings);
 
+int phylink_ethtool_nway_reset(struct phylink *pl)
+{
+   int ret = 0;
+
+   mutex_lock(&pl->config_mutex);
+   if (pl->phydev)
+   ret = genphy_restart_aneg(pl->phydev);
+   pl->ops->mac_an_restart(pl->netdev, pl->link_an_mode);
+   mutex_unlock(&pl->config_mutex);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(phylink_ethtool_nway_reset);
+
 /* This emulates MII registers for a fixed-mode phy operating as per the
  * passed in state. "aneg" defines if we report negotiation is possible.
  *
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index c7a665a538c1..ad3c85508d19 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -65,6 +65,7 @@ void phylink_stop(struct phylink *);
 
 int phylink_ethtool_get_settings(struct phylink *, struct ethtool_cmd *);
 int phylink_ethtool_set_settings(struct phylink *, struct ethtool_cmd *);
+int phylink_ethtool_nway_reset(struct phylink *);
 int phylink_mii_ioctl(struct phylink *, struct ifreq *, int);
 
 void phylink_set_link_port(struct phylink *pl, u32 support, u8 port);
-- 
2.1.0



[PATCH RFC 23/26] net: mvneta: add EEE support

2015-12-07 Thread Russell King
Add EEE support to mvneta.  This allows us to enable the low power idle
support at MAC level if there is a PHY attached through phylink which
supports LPI.  The appropriate ethtool support is provided to allow the
feature to be controlled, including ethtool statistics for EEE wakeup
errors.
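
The register handling described above amounts to a read-modify-write of
one enable bit, gated on three conditions.  The sketch below models that
in user space; struct fake_reg, set_lpi_request() and lpi_wanted() are
hypothetical names, and the register is a plain variable rather than real
MMIO.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors MVNETA_LPI_REQUEST_ENABLE, which the patch defines as BIT(0). */
#define LPI_REQUEST_ENABLE	(1u << 0)

/* Hypothetical stand-in for a memory-mapped register. */
struct fake_reg {
	uint32_t val;
};

static uint32_t reg_read(const struct fake_reg *r)
{
	return r->val;
}

static void reg_write(struct fake_reg *r, uint32_t v)
{
	r->val = v;
}

/* Set or clear only the LPI request bit, preserving all other bits. */
static void set_lpi_request(struct fake_reg *lpi_ctl1, bool enable)
{
	uint32_t v = reg_read(lpi_ctl1);

	if (enable)
		v |= LPI_REQUEST_ENABLE;
	else
		v &= ~LPI_REQUEST_ENABLE;
	reg_write(lpi_ctl1, v);
}

/* LPI is requested only when EEE is enabled by the user, the PHY
 * negotiated it, and transmit LPI has not been switched off.
 */
static bool lpi_wanted(bool eee_enabled, bool phy_eee_ok, bool tx_lpi_enabled)
{
	return eee_enabled && phy_eee_ok && tx_lpi_enabled;
}
```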

Signed-off-by: Russell King 
---
 drivers/net/ethernet/marvell/mvneta.c | 87 +++
 1 file changed, 87 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 165dfab134b7..3de2aa9335b2 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -224,6 +224,12 @@
 #define MVNETA_TXQ_TOKEN_SIZE_REG(q) (0x3e40 + ((q) << 2))
 #define  MVNETA_TXQ_TOKEN_SIZE_MAX   0x7fff
 
+#define MVNETA_LPI_CTRL_0                 0x2cc0
+#define MVNETA_LPI_CTRL_1                 0x2cc4
+#define  MVNETA_LPI_REQUEST_ENABLE   BIT(0)
+#define MVNETA_LPI_CTRL_2                 0x2cc8
+#define MVNETA_LPI_STATUS                 0x2ccc
+
 #define MVNETA_CAUSE_TXQ_SENT_DESC_ALL_MASK 0xff
 
 /* Descriptor ring Macros */
@@ -288,6 +294,11 @@
 
 #define MVNETA_RX_BUF_SIZE(pkt_size)   ((pkt_size) + NET_SKB_PAD)
 
+enum {
+   ETHTOOL_STAT_EEE_WAKEUP,
+   ETHTOOL_MAX_STATS,
+};
+
 struct mvneta_statistic {
unsigned short offset;
unsigned short type;
@@ -296,6 +307,7 @@ struct mvneta_statistic {
 
 #define T_REG_32   32
 #define T_REG_64   64
+#define T_SW   1
 
 static const struct mvneta_statistic mvneta_statistics[] = {
{ 0x3000, T_REG_64, "good_octets_received", },
@@ -330,6 +342,7 @@ static const struct mvneta_statistic mvneta_statistics[] = {
{ 0x304c, T_REG_32, "broadcast_frames_sent", },
{ 0x3054, T_REG_32, "fc_sent", },
{ 0x300c, T_REG_32, "internal_mac_transmit_err", },
+   { ETHTOOL_STAT_EEE_WAKEUP, T_SW, "eee_wakeup_errors", },
 };
 
 struct mvneta_pcpu_stats {
@@ -373,6 +386,10 @@ struct mvneta_port {
unsigned int tx_csum_limit;
struct phylink *phylink;
 
+   bool eee_enabled;
+   bool eee_active;
+   bool tx_lpi_enabled;
+
u64 ethtool_stats[ARRAY_SIZE(mvneta_statistics)];
 };
 
@@ -2750,6 +2767,18 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, new_an);
 }
 
+static void mvneta_set_eee(struct mvneta_port *pp, bool enable)
+{
+   u32 lpi_ctl1;
+
+   lpi_ctl1 = mvreg_read(pp, MVNETA_LPI_CTRL_1);
+   if (enable)
+   lpi_ctl1 |= MVNETA_LPI_REQUEST_ENABLE;
+   else
+   lpi_ctl1 &= ~MVNETA_LPI_REQUEST_ENABLE;
+   mvreg_write(pp, MVNETA_LPI_CTRL_1, lpi_ctl1);
+}
+
 static void mvneta_mac_link_down(struct net_device *ndev, unsigned int mode)
 {
struct mvneta_port *pp = netdev_priv(ndev);
@@ -2763,6 +2792,9 @@ static void mvneta_mac_link_down(struct net_device *ndev, unsigned int mode)
val |= MVNETA_GMAC_FORCE_LINK_DOWN;
mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
}
+
+   pp->eee_active = false;
+   mvneta_set_eee(pp, false);
 }
 
 static void mvneta_mac_link_up(struct net_device *ndev, unsigned int mode,
@@ -2779,6 +2811,11 @@ static void mvneta_mac_link_up(struct net_device *ndev, unsigned int mode,
}
 
mvneta_port_up(pp);
+
+   if (phy && pp->eee_enabled) {
+   pp->eee_active = phy_init_eee(phy, 0) >= 0;
+   mvneta_set_eee(pp, pp->eee_active && pp->tx_lpi_enabled);
+   }
 }
 
 static const struct phylink_mac_ops mvneta_phylink_ops = {
@@ -3175,6 +3212,13 @@ static void mvneta_ethtool_update_stats(struct mvneta_port *pp)
high = readl_relaxed(base + s->offset + 4);
val = (u64)high << 32 | low;
break;
+   case T_SW:
+   switch (s->offset) {
+   case ETHTOOL_STAT_EEE_WAKEUP:
+   val = phylink_get_eee_err(pp->phylink);
+   break;
+   }
+   break;
}
 
pp->ethtool_stats[i] += val;
@@ -3200,6 +3244,47 @@ static int mvneta_ethtool_get_sset_count(struct net_device *dev, int sset)
return -EOPNOTSUPP;
 }
 
+static int mvneta_ethtool_get_eee(struct net_device *dev,
+ struct ethtool_eee *eee)
+{
+   struct mvneta_port *pp = netdev_priv(dev);
+   u32 lpi_ctl0;
+
+   lpi_ctl0 = mvreg_read(pp, MVNETA_LPI_CTRL_0);
+
+   eee->eee_enabled = pp->eee_enabled;
+   eee->eee_active = pp->eee_active;
+   eee->tx_lpi_enabled = pp->tx_lpi_enabled;
+   eee->tx_lpi_timer = (lpi_ctl0) >> 8; // * scale;
+
+   return phylink_ethtool_get_eee(pp->phylink, eee);
+}
+
+static int 

Re: [RFC PATCH V2 0/3] IXGBE/VFIO: Add live migration support for SRIOV NIC

2015-12-07 Thread Michael S. Tsirkin
On Mon, Dec 07, 2015 at 09:12:08AM -0800, Alexander Duyck wrote:
> On Mon, Dec 7, 2015 at 7:40 AM, Lan, Tianyu  wrote:
> > On 12/5/2015 1:07 AM, Alexander Duyck wrote:
> >>>
> >>>
> >>> We still need to support Windows guest for migration and this is why our
> >>> patches keep all changes in the driver since it's impossible to change
> >>> Windows kernel.
> >>
> >>
> >> That is a poor argument.  I highly doubt Microsoft is interested in
> >> having to modify all of the drivers that will support direct assignment
> >> in order to support migration.  They would likely request something
> >> similar to what I have in that they will want a way to do DMA tracking
> >> with minimal modification required to the drivers.
> >
> >
> > This totally depends on the NIC or other devices' vendors and they
> > should make decision to support migration or not. If yes, they would
> > modify driver.
> 
> Having to modify every driver that wants to support live migration is
> a bit much.  In addition I don't see this being limited only to NIC
> devices.  You can direct assign a number of different devices, your
> solution cannot be specific to NICs.
> 
> > If just target to call suspend/resume during migration, the feature will
> > be meaningless. Most cases don't want to affect user during migration
> > a lot and so the service down time is vital. Our target is to apply
> > SRIOV NIC passthrough to cloud service and NFV (network functions
> > virtualization) projects which are sensitive to network performance
> > and stability. In my opinion, we should give the device driver a chance
> > to implement its own migration job. Call suspend and resume
> > callback in the driver if it doesn't care the performance during migration.
> 
> The suspend/resume callback should be efficient in terms of time.
> After all we don't want the system to stall for a long period of time
> when it should be either running or asleep.  Having it burn cycles in
> a power state limbo doesn't do anyone any good.  If nothing else maybe
> it will help to push the vendors to speed up those functions which
> then benefit migration and the system sleep states.
> 
> Also you keep assuming you can keep the device running while you do
> the migration and you can't.  You are going to corrupt the memory if
> you do, and you have yet to provide any means to explain how you are
> going to solve that.
> 
> 
> >
> >>
> >>> Following is my idea to do DMA tracking.
> >>>
> >>> Inject event to VF driver after memory iterate stage
> >>> and before stop VCPU and then VF driver marks dirty all
> >>> using DMA memory. The new allocated pages also need to
> >>> be marked dirty before stopping VCPU. All dirty memory
> >>> in this time slot will be migrated until stop-and-copy
> >>> stage. We also need to make sure to disable VF via clearing the
> >>> bus master enable bit for VF before migrating these memory.
> >>
> >>
> >> The ordering of your explanation here doesn't quite work.  What needs to
> >> happen is that you have to disable DMA and then mark the pages as dirty.
> >>   What the disabling of the BME does is signal to the hypervisor that
> >> the device is now stopped.  The ixgbevf_suspend call already supported
> >> by the driver is almost exactly what is needed to take care of something
> >> like this.
> >
> >
> > This is why I hope to reserve a piece of space in the dma page to do dummy
> > write. This can help to mark page dirty while not require to stop DMA and
> > not race with DMA data.
> 
> You can't and it will still race.  What concerns me is that your
> patches and the document you referenced earlier show a considerable
> lack of understanding about how DMA and device drivers work.  There is
> a reason why device drivers have so many memory barriers and the like
> in them.  The fact is when you have CPU and a device both accessing
> memory things have to be done in a very specific order and you cannot
> violate that.
> 
> If you have a contiguous block of memory you expect the device to
> write into you cannot just poke a hole in it.  Such a situation is not
> supported by any hardware that I am aware of.
> 
> As far as writing to dirty the pages it only works so long as you halt
> the DMA and then mark the pages dirty.  It has to be in that order.
> Any other order will result in data corruption and I am sure the NFV
> customers definitely don't want that.
> 
> > If can't do that, we have to stop DMA in a short time to mark all dma
> > pages dirty and then reenable it. I am not sure how much we can get by
> > this way to track all DMA memory with device running during migration. I
> > need to do some tests and compare results with stop DMA directly at last
> > stage during migration.
> 
> We have to halt the DMA before we can complete the migration.  So
> please feel free to test this.
> 
> In addition I still feel you would be better off taking this in
> smaller steps.  I still say your first step would be to come up with a
> generic 

[PATCH RFC 16/26] phy: fixed-phy: remove fixed_phy_update_state()

2015-12-07 Thread Russell King
mvneta is the only user of fixed_phy_update_state(), which has been
converted to use phylink instead.  Remove fixed_phy_update_state().

Signed-off-by: Russell King 
---
 drivers/net/phy/fixed_phy.c | 31 ---
 include/linux/phy_fixed.h   |  9 -
 2 files changed, 40 deletions(-)

diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c
index 474cc39a5457..19ac47aab0ba 100644
--- a/drivers/net/phy/fixed_phy.c
+++ b/drivers/net/phy/fixed_phy.c
@@ -115,37 +115,6 @@ int fixed_phy_set_link_update(struct phy_device *phydev,
 }
 EXPORT_SYMBOL_GPL(fixed_phy_set_link_update);
 
-int fixed_phy_update_state(struct phy_device *phydev,
-  const struct fixed_phy_status *status,
-  const struct fixed_phy_status *changed)
-{
-   struct fixed_mdio_bus *fmb = &platform_fmb;
-   struct fixed_phy *fp;
-
-   if (!phydev || phydev->bus != fmb->mii_bus)
-   return -EINVAL;
-
-   list_for_each_entry(fp, &fmb->phys, node) {
-   if (fp->addr == phydev->addr) {
-   write_seqcount_begin(&fp->seqcount);
-#define _UPD(x) if (changed->x) \
-   fp->status.x = status->x
-   _UPD(link);
-   _UPD(speed);
-   _UPD(duplex);
-   _UPD(pause);
-   _UPD(asym_pause);
-#undef _UPD
-   fixed_phy_update(fp);
-   write_seqcount_end(&fp->seqcount);
-   return 0;
-   }
-   }
-
-   return -ENOENT;
-}
-EXPORT_SYMBOL(fixed_phy_update_state);
-
 int fixed_phy_add(unsigned int irq, int phy_addr,
  struct fixed_phy_status *status,
  int link_gpio)
diff --git a/include/linux/phy_fixed.h b/include/linux/phy_fixed.h
index 2400d2ea4f34..cf3f718c62b9 100644
--- a/include/linux/phy_fixed.h
+++ b/include/linux/phy_fixed.h
@@ -23,9 +23,6 @@ extern void fixed_phy_del(int phy_addr);
 extern int fixed_phy_set_link_update(struct phy_device *phydev,
int (*link_update)(struct net_device *,
   struct fixed_phy_status *));
-extern int fixed_phy_update_state(struct phy_device *phydev,
-  const struct fixed_phy_status *status,
-  const struct fixed_phy_status *changed);
 #else
 static inline int fixed_phy_add(unsigned int irq, int phy_id,
struct fixed_phy_status *status,
@@ -50,12 +47,6 @@ static inline int fixed_phy_set_link_update(struct phy_device *phydev,
 {
return -ENODEV;
 }
-static inline int fixed_phy_update_state(struct phy_device *phydev,
-  const struct fixed_phy_status *status,
-  const struct fixed_phy_status *changed)
-{
-   return -ENODEV;
-}
 #endif /* CONFIG_FIXED_PHY */
 
 #endif /* __PHY_FIXED_H */
-- 
2.1.0



[PATCH RFC 13/26] sfp: add phylink based SFP module support

2015-12-07 Thread Russell King
Add support for SFP hotpluggable modules via phylink.  This supports
both copper and optical SFP modules, which require different Serdes
modes in order to properly negotiate the link.

Optical SFP modules typically require the Serdes link to be talking
1000base-X mode - this is the gigabit ethernet mode defined by the
802.3 standard.

Copper SFP modules typically integrate a PHY in the module to convert
from Serdes to copper, and the PHY will be configured by the vendor
to either present a 1000base-X Serdes link (for fixed 1000base-T) or
a SGMII Serdes link.  However, this is vendor defined, so we instead
detect the PHY, switch the link to SGMII mode, and use traditional
PHY based negotiation.

Signed-off-by: Russell King 
---
 drivers/net/phy/Kconfig  |   5 +
 drivers/net/phy/Makefile |   1 +
 drivers/net/phy/sfp.c| 989 +++
 include/linux/sfp.h  | 338 
 4 files changed, 1333 insertions(+)
 create mode 100644 drivers/net/phy/sfp.c
 create mode 100644 include/linux/sfp.h
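
As a reading aid for the description above, here is a minimal userspace
sketch of the probe decision: look for a PHY inside the module and pick the
Serdes mode accordingly.  The enum, the function names and the 0x40 I2C
offset are hypothetical and chosen for illustration only; the offset is
merely consistent with the SFP_PHY_ADDR comment in the diff (bus address
0x56 corresponding to PHY address 22).  It is not the code from sfp.c.

```c
#include <stdbool.h>

/* Assumption: PHY address N is reached at I2C address 0x40 + N. */
#define MDIO_I2C_BASE	0x40
/* From the patch: the module PHY is expected at PHY address 22. */
#define SFP_PHY_ADDR	22

/* Hypothetical names for the two Serdes modes the description mentions. */
enum serdes_mode {
	SERDES_1000BASE_X,	/* optical modules, or copper without a visible PHY */
	SERDES_SGMII,		/* copper modules where a PHY was detected */
};

/* Translate a PHY address into the I2C bus address it would answer on. */
unsigned int phy_addr_to_i2c(unsigned int phy_addr)
{
	return MDIO_I2C_BASE + phy_addr;
}

/* If a PHY was found, switch to SGMII and let the PHY negotiate;
 * otherwise stay with 1000base-X.
 */
enum serdes_mode sfp_pick_mode(bool phy_detected)
{
	return phy_detected ? SERDES_SGMII : SERDES_1000BASE_X;
}
```

With these assumptions phy_addr_to_i2c(SFP_PHY_ADDR) gives 0x56, matching
the bus address quoted in the patch comment.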

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 5c634b4bc9bd..5bdf5c24c6ef 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -179,6 +179,11 @@ config FIXED_PHY
 
  Currently tested with mpc866ads and mpc8349e-mitx.
 
+config SFP
+   tristate "SFP cage support"
+   depends on I2C && PHYLINK
+   select MDIO_I2C
+
 config MDIO_BITBANG
tristate "Support for bitbanged MDIO buses"
help
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index bc052bb6cee0..a7be372fb123 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -45,3 +45,4 @@ obj-$(CONFIG_MDIO_MOXART) += mdio-moxart.o
 obj-$(CONFIG_MDIO_BCM_UNIMAC)  += mdio-bcm-unimac.o
 obj-$(CONFIG_MICROCHIP_PHY)    += microchip.o
 obj-$(CONFIG_MDIO_BCM_IPROC)   += mdio-bcm-iproc.o
+obj-$(CONFIG_SFP)  += sfp.o
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
new file mode 100644
index 000000000000..70a375403e55
--- /dev/null
+++ b/drivers/net/phy/sfp.c
@@ -0,0 +1,989 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mdio-i2c.h"
+#include "swphy.h"
+
+enum {
+   GPIO_MODDEF0,
+   GPIO_LOS,
+   GPIO_TX_FAULT,
+   GPIO_TX_DISABLE,
+   GPIO_RATE_SELECT,
+   GPIO_MAX,
+
+   SFP_F_PRESENT = BIT(GPIO_MODDEF0),
+   SFP_F_LOS = BIT(GPIO_LOS),
+   SFP_F_TX_FAULT = BIT(GPIO_TX_FAULT),
+   SFP_F_TX_DISABLE = BIT(GPIO_TX_DISABLE),
+   SFP_F_RATE_SELECT = BIT(GPIO_RATE_SELECT),
+
+   SFP_E_INSERT = 0,
+   SFP_E_REMOVE,
+   SFP_E_DEV_DOWN,
+   SFP_E_DEV_UP,
+   SFP_E_TX_FAULT,
+   SFP_E_TX_CLEAR,
+   SFP_E_LOS_HIGH,
+   SFP_E_LOS_LOW,
+   SFP_E_TIMEOUT,
+
+   SFP_MOD_EMPTY = 0,
+   SFP_MOD_PROBE,
+   SFP_MOD_PRESENT,
+   SFP_MOD_ERROR,
+
+   SFP_DEV_DOWN = 0,
+   SFP_DEV_UP,
+
+   SFP_S_DOWN = 0,
+   SFP_S_INIT,
+   SFP_S_WAIT_LOS,
+   SFP_S_LINK_UP,
+   SFP_S_TX_FAULT,
+   SFP_S_REINIT,
+   SFP_S_TX_DISABLE,
+};
+
+static const char *gpio_of_names[] = {
+   "moddef0",
+   "los",
+   "tx-fault",
+   "tx-disable",
+   "rate-select",
+};
+
+static const enum gpiod_flags gpio_flags[] = {
+   GPIOD_IN,
+   GPIOD_IN,
+   GPIOD_IN,
+   GPIOD_ASIS,
+   GPIOD_ASIS,
+};
+
+#define T_INIT_JIFFIES msecs_to_jiffies(300)
+#define T_RESET_US 10
+#define T_FAULT_RECOVER msecs_to_jiffies(1000)
+
+/* SFP module presence detection is poor: the three MOD DEF signals are
+ * the same length on the PCB, which means it's possible for MOD DEF 0 to
+ * connect before the I2C bus on MOD DEF 1/2.  Try to work around this
+ * design bug by waiting 50ms before probing, and then retry every 250ms.
+ */
+#define T_PROBE_INIT   msecs_to_jiffies(50)
+#define T_PROBE_RETRY  msecs_to_jiffies(250)
+
+/*
+ * SFP modules appear to always have their PHY configured for bus address
+ * 0x56 (which with mdio-i2c, translates to a PHY address of 22).
+ */
+#define SFP_PHY_ADDR   22
+
+/*
+ * Give this long for the PHY to reset.
+ */
+#define T_PHY_RESET_MS 50
+
+static DEFINE_MUTEX(sfp_mutex);
+
+struct sfp {
+   struct device *dev;
+   struct i2c_adapter *i2c;
+   struct mii_bus *i2c_mii;
+   struct net_device *ndev;
+   struct phylink *phylink;
+   struct phy_device *mod_phy;
+
+   unsigned int (*get_state)(struct sfp *);
+   void (*set_state)(struct sfp *, unsigned int);
+   int (*read)(struct sfp *, bool, u8, void *, size_t);
+
+   struct gpio_desc *gpio[GPIO_MAX];
+
+   unsigned int state;
+   struct delayed_work poll;
+   struct delayed_work timeout;
+   struct mutex sm_mutex;
+   unsigned char sm_mod_state;
+   unsigned char sm_dev_state;
+   

[PATCH RFC 24/26] phylink: add module EEPROM support

2015-12-07 Thread Russell King
Add support for reading module EEPROMs through phylink.

Signed-off-by: Russell King 
---
 drivers/net/phy/phylink.c | 66 +++
 include/linux/phylink.h   | 12 +
 2 files changed, 78 insertions(+)
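
The registration semantics in this patch (a single module per phylink
instance, -EBUSY on a second registration, -EINVAL when unregistering with
the wrong cookie, -EOPNOTSUPP when nothing is registered) can be modelled in
userspace as below.  This is only a sketch: struct link, the fake_sfp
provider and the simplified get_module_info signature are hypothetical, and
the config_mutex locking done by the real functions is left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct phylink_module_ops. */
struct module_ops {
	int (*get_module_info)(void *priv, int *type_out);
};

/* Hypothetical stand-in for the two fields the patch adds to struct phylink. */
struct link {
	const struct module_ops *module_ops;
	void *module_data;
};

/* Only one module may be registered at a time. */
int link_register_module(struct link *l, void *data,
			 const struct module_ops *ops)
{
	if (l->module_ops)
		return -EBUSY;
	l->module_ops = ops;
	l->module_data = data;
	return 0;
}

/* The caller must pass the same cookie it registered with. */
int link_unregister_module(struct link *l, void *data)
{
	if (l->module_data != data)
		return -EINVAL;
	l->module_ops = NULL;
	l->module_data = NULL;
	return 0;
}

/* The ethtool entry point forwards to the module, if there is one. */
int link_get_module_info(struct link *l, int *type_out)
{
	if (!l->module_ops)
		return -EOPNOTSUPP;
	return l->module_ops->get_module_info(l->module_data, type_out);
}

/* Hypothetical provider, standing in for the SFP driver. */
struct fake_sfp {
	int type;
};

int fake_sfp_module_info(void *priv, int *type_out)
{
	struct fake_sfp *sfp = priv;

	*type_out = sfp->type;
	return 0;
}

const struct module_ops fake_sfp_ops = {
	.get_module_info = fake_sfp_module_info,
};
```

Passing the private pointer back as an opaque cookie keeps phylink unaware
of the module driver's types, which is presumably why the ops take void *.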

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 82492db0fd04..cd56d454c587 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -60,6 +60,9 @@ struct phylink {
struct work_struct resolve;
 
bool mac_link_up;
+
+   const struct phylink_module_ops *module_ops;
+   void *module_data;
 };
 
 static const char *phylink_an_mode_str(unsigned int mode)
@@ -800,6 +803,36 @@ int phylink_ethtool_set_pauseparam(struct phylink *pl,
 }
 EXPORT_SYMBOL_GPL(phylink_ethtool_set_pauseparam);
 
+int phylink_ethtool_get_module_info(struct phylink *pl,
+   struct ethtool_modinfo *modinfo)
+{
+   int ret = -EOPNOTSUPP;
+
+   mutex_lock(&pl->config_mutex);
+   if (pl->module_ops)
+   ret = pl->module_ops->get_module_info(pl->module_data,
+ modinfo);
+   mutex_unlock(&pl->config_mutex);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(phylink_ethtool_get_module_info);
+
+int phylink_ethtool_get_module_eeprom(struct phylink *pl,
+ struct ethtool_eeprom *ee, u8 *buf)
+{
+   int ret = -EOPNOTSUPP;
+
+   mutex_lock(&pl->config_mutex);
+   if (pl->module_ops)
+   ret = pl->module_ops->get_module_eeprom(pl->module_data, ee,
+   buf);
+   mutex_unlock(&pl->config_mutex);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(phylink_ethtool_get_module_eeprom);
+
 int phylink_init_eee(struct phylink *pl, bool clk_stop_enable)
 {
int ret = -EPROTONOSUPPORT;
@@ -996,6 +1029,39 @@ EXPORT_SYMBOL_GPL(phylink_mii_ioctl);
 
 
 
+int phylink_register_module(struct phylink *pl, void *data,
+   const struct phylink_module_ops *ops)
+{
+   int ret = -EBUSY;
+
+   mutex_lock(&pl->config_mutex);
+   if (!pl->module_ops) {
+   pl->module_ops = ops;
+   pl->module_data = data;
+   ret = 0;
+   }
+   mutex_unlock(&pl->config_mutex);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(phylink_register_module);
+
+int phylink_unregister_module(struct phylink *pl, void *data)
+{
+   int ret = -EINVAL;
+
+   mutex_lock(&pl->config_mutex);
+   if (pl->module_data == data) {
+   pl->module_ops = NULL;
+   pl->module_data = NULL;
+   ret = 0;
+   }
+   mutex_unlock(&pl->config_mutex);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(phylink_unregister_module);
+
 void phylink_disable(struct phylink *pl)
 {
set_bit(PHYLINK_DISABLE_LINK, &pl->phylink_disable_state);
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index 361fbe9222b2..01d442b08e62 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -55,6 +55,11 @@ struct phylink_mac_ops {
struct phy_device *);
 };
 
+struct phylink_module_ops {
+   int (*get_module_info)(void *, struct ethtool_modinfo *);
+   int (*get_module_eeprom)(void *, struct ethtool_eeprom *, u8 *);
+};
+
 struct phylink *phylink_create(struct net_device *, struct device_node *,
phy_interface_t iface, const struct phylink_mac_ops *ops);
 void phylink_destroy(struct phylink *);
@@ -75,12 +80,19 @@ void phylink_ethtool_get_pauseparam(struct phylink *,
struct ethtool_pauseparam *);
 int phylink_ethtool_set_pauseparam(struct phylink *,
   struct ethtool_pauseparam *);
+int phylink_ethtool_get_module_info(struct phylink *, struct ethtool_modinfo *);
+int phylink_ethtool_get_module_eeprom(struct phylink *,
+ struct ethtool_eeprom *, u8 *);
 int phylink_init_eee(struct phylink *, bool);
 int phylink_get_eee_err(struct phylink *);
 int phylink_ethtool_get_eee(struct phylink *, struct ethtool_eee *);
 int phylink_ethtool_set_eee(struct phylink *, struct ethtool_eee *);
 int phylink_mii_ioctl(struct phylink *, struct ifreq *, int);
 
+int phylink_register_module(struct phylink *, void *,
+   const struct phylink_module_ops *);
+int phylink_unregister_module(struct phylink *, void *);
+
 void phylink_set_link_port(struct phylink *pl, u32 support, u8 port);
 int phylink_set_link_an_mode(struct phylink *pl, unsigned int mode);
 void phylink_disable(struct phylink *pl);
-- 
2.1.0



[PATCH RFC 18/26] net: mvneta: add nway_reset support

2015-12-07 Thread Russell King
Add ethtool nway_reset support to mvneta via phylink, so that userspace
can request that the link be renegotiated, whatever mode it is in, via
ethtool -r ethX.

Signed-off-by: Russell King 
---
 drivers/net/ethernet/marvell/mvneta.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index f19d9a31dccd..82aa2b59a249 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3020,6 +3020,13 @@ int mvneta_ethtool_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
return phylink_ethtool_set_settings(pp->phylink, cmd);
 }
 
+static int mvneta_ethtool_nway_reset(struct net_device *dev)
+{
+   struct mvneta_port *pp = netdev_priv(dev);
+
+   return phylink_ethtool_nway_reset(pp->phylink);
+}
+
 /* Set interrupt coalescing for ethtools */
 static int mvneta_ethtool_set_coalesce(struct net_device *dev,
   struct ethtool_coalesce *c)
@@ -3184,6 +3191,7 @@ const struct ethtool_ops mvneta_eth_tool_ops = {
.get_link   = ethtool_op_get_link,
.get_settings   = mvneta_ethtool_get_settings,
.set_settings   = mvneta_ethtool_set_settings,
+   .nway_reset = mvneta_ethtool_nway_reset,
.set_coalesce   = mvneta_ethtool_set_coalesce,
.get_coalesce   = mvneta_ethtool_get_coalesce,
.get_drvinfo    = mvneta_ethtool_get_drvinfo,
-- 
2.1.0



[PATCH RFC 21/26] net: mvneta: enable flow control for PHY connections

2015-12-07 Thread Russell King
Enable flow control support for PHY connections by indicating our
support via the ethtool capabilities.  phylink takes care of the
appropriate handling.

Signed-off-by: Russell King 
---
 drivers/net/ethernet/marvell/mvneta.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 00cfb120e324..165dfab134b7 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2600,12 +2600,14 @@ static int mvneta_mac_support(struct net_device *ndev, unsigned int mode,
state->supported = PHY_10BT_FEATURES |
   PHY_100BT_FEATURES |
   SUPPORTED_1000baseT_Full |
+  SUPPORTED_Pause |
   SUPPORTED_Autoneg;
state->advertising = ADVERTISED_10baseT_Half |
 ADVERTISED_10baseT_Full |
 ADVERTISED_100baseT_Half |
 ADVERTISED_100baseT_Full |
 ADVERTISED_1000baseT_Full |
+ADVERTISED_Pause |
 ADVERTISED_Autoneg;
state->an_enabled = 1;
break;
-- 
2.1.0



Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

2015-12-07 Thread Jesse Gross
On Sun, Dec 6, 2015 at 7:02 PM, David Ahern  wrote:
> On 12/6/15 6:20 PM, Alexander Duyck wrote:
>>
>> That works for Linux to Linux, but what about the cases where you have
>> a non-Linux endpoint on the other end such as something like a Cisco
>> switch?
>
>
> Why does it matter what kind of switch the NIC is connected to?

I think Cisco was just an example, not anything particular about their
switches. But there are two general problems:

 * Some protocols, like VXLAN, recommend that the UDP checksum be zero
so this is what pretty much everyone implements. As a result,
independent of the merits of using the checksum, most non-Linux
endpoints won't support it.

* The reason why this recommendation exists in the first place is that
most ASIC based switches can't compute/verify UDP checksums. They
slice off the headers and only run that through the chip's core
memory, so the rest of the packet isn't available to compute a
checksum over.
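
To make the second point concrete: the Internet checksum used by UDP is a
one's-complement sum over every 16-bit word of the datagram, so hardware
that only sees the headers cannot produce or verify it.  A minimal sketch of
that sum follows (RFC 1071 style; the function name is made up, and the
IPv4/IPv6 pseudo-header that the real UDP checksum also covers is left out
for brevity).

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 style 16-bit one's-complement checksum over a byte buffer. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* Sum the buffer as big-endian 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	/* An odd trailing byte is padded with a zero low byte. */
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

Flipping any single payload bit changes the result, which is why the whole
packet, not just the sliced-off headers, has to pass through whatever
computes it.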


Re: use-after-free in ip6_xmit

2015-12-07 Thread Eric Dumazet
On Mon, 2015-12-07 at 11:22 +0100, Dmitry Vyukov wrote:
> Hello,
> 
> The following program triggers use-after-free in ip6_xmit:
> 
> // autogenerated by syzkaller (http://github.com/google/syzkaller)
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> 
> void *thr0(void *arg)
> {
> *(uint32_t*)0x2000 = 0x5;
> long r5 = syscall(SYS_setsockopt, arg, 0x29ul, 0x6ul,
> 0x2000ul, 0x4ul, 0);
> return 0;
> }
> 
> void *thr1(void *arg)
> {
> struct sockaddr_in6 sa = {};
> sa.sin6_family = AF_INET6;
> sa.sin6_port = getpid();
> sa.sin6_addr.s6_addr[15] = 1;
> syscall(SYS_bind, arg, &sa, sizeof(sa), 0, 0, 0);
> return 0;
> }
> 
> void *thr2(void *arg)
> {
> struct sockaddr_in6 sa = {};
> sa.sin6_family = AF_INET6;
> sa.sin6_port = getpid();
> sa.sin6_addr.s6_addr[15] = 1;
> syscall(SYS_connect, arg, &sa, sizeof(sa), 0, 0, 0);
> return 0;
> }
> 
> void *thr3(void *arg)
> {
> syscall(SYS_listen, arg, 0x3ul, 0, 0, 0, 0);
> return 0;
> }
> 
> void *thr4(void *arg)
> {
> syscall(SYS_accept4, arg, 0, 0, 0x800ul, 0, 0);
> return 0;
> }
> 
> int main()
> {
> srand(getpid());
> syscall(SYS_mmap, 0x2000ul, 0x1ul, 0x2ul, 0x32ul,
> 0xul, 0x0ul);
> int fd[2];
> fd[0] = syscall(SYS_socket, PF_INET6, SOCK_STREAM, IPPROTO_TCP);
> fd[1] = syscall(SYS_socket, PF_INET6, SOCK_STREAM, IPPROTO_TCP);
> pthread_t th;
> pthread_create(&th, 0, thr0, (void*)(long)fd[0]);
> pthread_create(&th, 0, thr1, (void*)(long)fd[0]);
> pthread_create(&th, 0, thr2, (void*)(long)fd[0]);
> pthread_create(&th, 0, thr3, (void*)(long)fd[0]);
> pthread_create(&th, 0, thr4, (void*)(long)fd[0]);
> pthread_create(&th, 0, thr0, (void*)(long)fd[0]);
> pthread_create(&th, 0, thr1, (void*)(long)fd[0]);
> pthread_create(&th, 0, thr2, (void*)(long)fd[0]);
> pthread_create(&th, 0, thr3, (void*)(long)fd[0]);
> pthread_create(&th, 0, thr4, (void*)(long)fd[0]);
> pthread_create(&th, 0, thr0, (void*)(long)fd[1]);
> pthread_create(&th, 0, thr1, (void*)(long)fd[1]);
> pthread_create(&th, 0, thr2, (void*)(long)fd[1]);
> pthread_create(&th, 0, thr3, (void*)(long)fd[1]);
> pthread_create(&th, 0, thr4, (void*)(long)fd[1]);
> pthread_create(&th, 0, thr0, (void*)(long)fd[1]);
> pthread_create(&th, 0, thr1, (void*)(long)fd[1]);
> pthread_create(&th, 0, thr2, (void*)(long)fd[1]);
> pthread_create(&th, 0, thr3, (void*)(long)fd[1]);
> pthread_create(&th, 0, thr4, (void*)(long)fd[1]);
> 
> usleep(2);
> return 0;
> }
> 
> 
> 
> ==
> BUG: KASAN: use-after-free in ip6_xmit+0x15e3/0x1f50 at addr 88003991c786
> Read of size 2 by task a.out/6864
> =
> BUG kmalloc-64 (Not tainted): kasan: bad access detected
> -
> 
> Disabling lock debugging due to kernel taint
> INFO: Allocated in sock_kmalloc+0x93/0x100 age=40 cpu=1 pid=6850
> [<  none  >] ___slab_alloc+0x648/0x8c0 mm/slub.c:2438
> [<  none  >] __slab_alloc+0x4c/0x90 mm/slub.c:2467
> [< inline >] slab_alloc_node mm/slub.c:2530
> [< inline >] slab_alloc mm/slub.c:2572
> [<  none  >] __kmalloc+0x2d9/0x480 mm/slub.c:3532
> [< inline >] kmalloc include/linux/slab.h:463
> [<  none  >] sock_kmalloc+0x93/0x100 net/core/sock.c:1772
> [<  none  >] do_ipv6_setsockopt.isra.5+0xf4d/0x2d30
> net/ipv6/ipv6_sockglue.c:483
> [<  none  >] ipv6_setsockopt+0x4f/0x150 net/ipv6/ipv6_sockglue.c:885
> [<  none  >] sctp_setsockopt+0x194/0x4020 net/sctp/socket.c:3702
> [<  none  >] sock_common_setsockopt+0xb4/0x140 net/core/sock.c:2643
> [< inline >] SYSC_setsockopt net/socket.c:1757
> [<  none  >] SyS_setsockopt+0x161/0x290 net/socket.c:1736
> [<  none  >] entry_SYSCALL_64_fastpath+0x16/0x7a
> arch/x86/entry/entry_64.S:185
> 
> INFO: Freed in sock_kfree_s+0x29/0x90 age=34 cpu=0 pid=6855
> [<  none  >] __slab_free+0x21e/0x3e0 mm/slub.c:2648
> [< inline >] slab_free mm/slub.c:2803
> [<  none  >] kfree+0x26f/0x3e0 mm/slub.c:3632
> [< inline >] __sock_kfree_s net/core/sock.c:1793
> [<  none  >] sock_kfree_s+0x29/0x90 net/core/sock.c:1799
> [<  none  >] do_ipv6_setsockopt.isra.5+0x100f/0x2d30
> net/ipv6/ipv6_sockglue.c:506
> [<  none  >] ipv6_setsockopt+0x4f/0x150 net/ipv6/ipv6_sockglue.c:885
> [<  none  >] sctp_setsockopt+0x194/0x4020 net/sctp/socket.c:3702
> [<  none  >] sock_common_setsockopt+0xb4/0x140 net/core/sock.c:2643
> [< inline >] 

[PATCH RFC 20/26] net: mvneta: add flow control support via phylink

2015-12-07 Thread Russell King
Add flow control support to mvneta, including the ethtool hooks.  This
uses the phylink code to calculate the result of autonegotiation where
a phy is attached, and to handle the ethtool settings.

Signed-off-by: Russell King 
---
 drivers/net/ethernet/marvell/mvneta.c | 29 +++--
 1 file changed, 27 insertions(+), 2 deletions(-)
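
For readers following the first hunk, the bit selection it ends up with can
be sketched as a pure function.  The register bit positions are taken from
the MVNETA_GMAC_* defines added in patch 15/26 of this series; the PAUSE_*
values and the function name are assumptions made for this illustration,
since the MLO_PAUSE_* definitions are not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)	(1u << (n))

/* Bit positions as defined for MVNETA_GMAC_AUTONEG_CONFIG in patch 15/26. */
#define GMAC_CONFIG_FLOW_CTRL		BIT(8)
#define GMAC_ADVERT_SYM_FLOW_CTRL	BIT(9)
#define GMAC_AN_FLOW_CTRL_EN		BIT(11)
#define GMAC_CONFIG_FULL_DUPLEX		BIT(12)
#define GMAC_AN_DUPLEX_EN		BIT(13)

/* Assumed values standing in for MLO_PAUSE_*; illustration only. */
#define PAUSE_RX	BIT(2)
#define PAUSE_TX	BIT(3)
#define PAUSE_TXRX_MASK	(PAUSE_RX | PAUSE_TX)
#define PAUSE_AN	BIT(4)

/* Compute the flow control and duplex bits the way the first hunk does. */
uint32_t inband_an_bits(unsigned int pause, bool an_enabled,
			bool full_duplex, bool adv_pause)
{
	uint32_t new_an = 0;

	if (adv_pause)
		new_an |= GMAC_ADVERT_SYM_FLOW_CTRL;

	/* Let the hardware negotiate pause only if both AN knobs are on;
	 * otherwise force flow control when any direction is requested.
	 */
	if ((pause & PAUSE_AN) && an_enabled)
		new_an |= GMAC_AN_FLOW_CTRL_EN;
	else if (pause & PAUSE_TXRX_MASK)
		new_an |= GMAC_CONFIG_FLOW_CTRL;

	if (an_enabled)
		new_an |= GMAC_AN_DUPLEX_EN;
	else if (full_duplex)
		new_an |= GMAC_CONFIG_FULL_DUPLEX;

	return new_an;
}
```

The visible change from the old code is that MVNETA_GMAC_AN_FLOW_CTRL_EN is
no longer tied to an_enabled alone, so pause autonegotiation can be turned
off while speed and duplex autonegotiation stay on.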

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 82aa2b59a249..00cfb120e324 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2701,9 +2701,13 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
if (state->advertising & ADVERTISED_Pause)
new_an |= MVNETA_GMAC_ADVERT_SYM_FLOW_CTRL;
 
+   if (state->pause & MLO_PAUSE_AN && state->an_enabled)
+   new_an |= MVNETA_GMAC_AN_FLOW_CTRL_EN;
+   else if (state->pause & MLO_PAUSE_TXRX_MASK)
+   new_an |= MVNETA_GMAC_CONFIG_FLOW_CTRL;
+
if (state->an_enabled)
-   new_an |= MVNETA_GMAC_AN_FLOW_CTRL_EN |
- MVNETA_GMAC_AN_DUPLEX_EN;
+   new_an |= MVNETA_GMAC_AN_DUPLEX_EN;
else if (state->duplex)
new_an |= MVNETA_GMAC_CONFIG_FULL_DUPLEX;
break;
@@ -2717,6 +2721,9 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
new_an |= MVNETA_GMAC_CONFIG_GMII_SPEED;
else if (state->speed == SPEED_100)
new_an |= MVNETA_GMAC_CONFIG_MII_SPEED;
+
+   if (state->pause & MLO_PAUSE_TXRX_MASK)
+   new_an |= MVNETA_GMAC_CONFIG_FLOW_CTRL;
break;
}
 
@@ -3116,6 +3123,22 @@ static int mvneta_ethtool_set_ringparam(struct net_device *dev,
return 0;
 }
 
+static void mvneta_ethtool_get_pauseparam(struct net_device *dev,
+ struct ethtool_pauseparam *pause)
+{
+   struct mvneta_port *pp = netdev_priv(dev);
+
+   phylink_ethtool_get_pauseparam(pp->phylink, pause);
+}
+
+static int mvneta_ethtool_set_pauseparam(struct net_device *dev,
+struct ethtool_pauseparam *pause)
+{
+   struct mvneta_port *pp = netdev_priv(dev);
+
+   return phylink_ethtool_set_pauseparam(pp->phylink, pause);
+}
+
 static void mvneta_ethtool_get_strings(struct net_device *netdev, u32 sset,
   u8 *data)
 {
@@ -3197,6 +3220,8 @@ const struct ethtool_ops mvneta_eth_tool_ops = {
.get_drvinfo    = mvneta_ethtool_get_drvinfo,
.get_ringparam  = mvneta_ethtool_get_ringparam,
.set_ringparam  = mvneta_ethtool_set_ringparam,
+   .get_pauseparam = mvneta_ethtool_get_pauseparam,
+   .set_pauseparam = mvneta_ethtool_set_pauseparam,
.get_strings    = mvneta_ethtool_get_strings,
.get_ethtool_stats = mvneta_ethtool_get_stats,
.get_sset_count = mvneta_ethtool_get_sset_count,
-- 
2.1.0



[PATCH RFC 26/26] sfp/phylink: hook up eeprom functions

2015-12-07 Thread Russell King
Signed-off-by: Russell King 
---
 drivers/net/phy/sfp.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 678298844203..feb5f7062b2c 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -902,11 +902,9 @@ static void sfp_sm_event(struct sfp *sfp, unsigned int event)
mutex_unlock(&sfp->sm_mutex);
 }
 
-#if 0
-static int sfp_phy_module_info(struct phy_device *phy,
-  struct ethtool_modinfo *modinfo)
+static int sfp_module_info(void *priv, struct ethtool_modinfo *modinfo)
 {
-   struct sfp *sfp = phy->priv;
+   struct sfp *sfp = priv;
 
/* locking... and check module is present */
 
@@ -920,10 +918,9 @@ static int sfp_phy_module_info(struct phy_device *phy,
return 0;
 }
 
-static int sfp_phy_module_eeprom(struct phy_device *phy,
-   struct ethtool_eeprom *ee, u8 *data)
+static int sfp_module_eeprom(void *priv, struct ethtool_eeprom *ee, u8 *data)
 {
-   struct sfp *sfp = phy->priv;
+   struct sfp *sfp = priv;
unsigned int first, last, len;
int ret;
 
@@ -954,7 +951,11 @@ static int sfp_phy_module_eeprom(struct phy_device *phy,
}
return 0;
 }
-#endif
+
+static const struct phylink_module_ops sfp_module_ops = {
+   .get_module_info = sfp_module_info,
+   .get_module_eeprom = sfp_module_eeprom,
+};
 
 static void sfp_timeout(struct work_struct *work)
 {
@@ -1030,6 +1031,7 @@ static int sfp_netdev_notify(struct notifier_block *nb, unsigned long act, void
case NETDEV_UNREGISTER:
if (sfp->mod_phy && sfp->phylink)
phylink_disconnect_phy(sfp->phylink);
+   phylink_unregister_module(sfp->phylink, sfp);
sfp->phylink = NULL;
dev_put(sfp->ndev);
sfp->ndev = NULL;
@@ -1146,6 +1148,7 @@ static int sfp_probe(struct platform_device *pdev)
}
 
phylink_disable(sfp->phylink);
+   phylink_register_module(sfp->phylink, sfp, &sfp_module_ops);
}
 
sfp->state = sfp_get_state(sfp);
-- 
2.1.0



[PATCH RFC 15/26] net: mvneta: convert to phylink

2015-12-07 Thread Russell King
Convert mvneta to use phylink, which models the MAC to PHY link in
a generic, reusable form.

Signed-off-by: Russell King 
---
 drivers/net/ethernet/marvell/Kconfig  |   2 +-
 drivers/net/ethernet/marvell/mvneta.c | 399 +++---
 2 files changed, 224 insertions(+), 177 deletions(-)

diff --git a/drivers/net/ethernet/marvell/Kconfig b/drivers/net/ethernet/marvell/Kconfig
index a1c862b4664d..d59fb29f28b3 100644
--- a/drivers/net/ethernet/marvell/Kconfig
+++ b/drivers/net/ethernet/marvell/Kconfig
@@ -44,7 +44,7 @@ config MVNETA
tristate "Marvell Armada 370/38x/XP network interface support"
depends on PLAT_ORION
select MVMDIO
-   select FIXED_PHY
+   select PHYLINK
---help---
  This driver supports the network interface units in the
  Marvell ARMADA XP, ARMADA 370 and ARMADA 38x SoC family.
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index e84c7f2634d3..f19d9a31dccd 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -169,6 +170,7 @@
 #define MVNETA_GMAC_CTRL_0   0x2c00
 #define  MVNETA_GMAC_MAX_RX_SIZE_SHIFT   2
 #define  MVNETA_GMAC_MAX_RX_SIZE_MASK    0x7ffc
+#define  MVNETA_GMAC0_PORT_1000BASE_X    BIT(1)
 #define  MVNETA_GMAC0_PORT_ENABLE        BIT(0)
 #define MVNETA_GMAC_CTRL_2   0x2c08
 #define  MVNETA_GMAC2_INBAND_AN_ENABLE   BIT(0)
@@ -184,13 +186,19 @@
 #define  MVNETA_GMAC_TX_FLOW_CTRL_ENABLE BIT(5)
 #define  MVNETA_GMAC_RX_FLOW_CTRL_ACTIVE BIT(6)
 #define  MVNETA_GMAC_TX_FLOW_CTRL_ACTIVE BIT(7)
+#define  MVNETA_GMAC_AN_COMPLETE BIT(11)
+#define  MVNETA_GMAC_SYNC_OK BIT(14)
 #define MVNETA_GMAC_AUTONEG_CONFIG   0x2c0c
 #define  MVNETA_GMAC_FORCE_LINK_DOWN BIT(0)
 #define  MVNETA_GMAC_FORCE_LINK_PASS BIT(1)
 #define  MVNETA_GMAC_INBAND_AN_ENABLE    BIT(2)
+#define  MVNETA_GMAC_AN_BYPASS_ENABLE    BIT(3)
+#define  MVNETA_GMAC_INBAND_RESTART_AN   BIT(4)
 #define  MVNETA_GMAC_CONFIG_MII_SPEED    BIT(5)
 #define  MVNETA_GMAC_CONFIG_GMII_SPEED   BIT(6)
 #define  MVNETA_GMAC_AN_SPEED_EN         BIT(7)
+#define  MVNETA_GMAC_CONFIG_FLOW_CTRL    BIT(8)
+#define  MVNETA_GMAC_ADVERT_SYM_FLOW_CTRL BIT(9)
 #define  MVNETA_GMAC_AN_FLOW_CTRL_EN     BIT(11)
 #define  MVNETA_GMAC_CONFIG_FULL_DUPLEX  BIT(12)
 #define  MVNETA_GMAC_AN_DUPLEX_EN        BIT(13)
@@ -361,15 +369,9 @@ struct mvneta_port {
u16 tx_ring_size;
u16 rx_ring_size;
 
-   struct mii_bus *mii_bus;
-   struct phy_device *phy_dev;
-   phy_interface_t phy_interface;
-   struct device_node *phy_node;
-   unsigned int link;
-   unsigned int duplex;
-   unsigned int speed;
+   struct device_node *dn;
unsigned int tx_csum_limit;
-   int use_inband_status:1;
+   struct phylink *phylink;
 
u64 ethtool_stats[ARRAY_SIZE(mvneta_statistics)];
 };
@@ -1056,26 +1058,6 @@ static void mvneta_defaults_set(struct mvneta_port *pp)
val &= ~MVNETA_PHY_POLLING_ENABLE;
mvreg_write(pp, MVNETA_UNIT_CONTROL, val);
 
-   if (pp->use_inband_status) {
-   val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
-   val &= ~(MVNETA_GMAC_FORCE_LINK_PASS |
-MVNETA_GMAC_FORCE_LINK_DOWN |
-MVNETA_GMAC_AN_FLOW_CTRL_EN);
-   val |= MVNETA_GMAC_INBAND_AN_ENABLE |
-  MVNETA_GMAC_AN_SPEED_EN |
-  MVNETA_GMAC_AN_DUPLEX_EN;
-   mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
-   val = mvreg_read(pp, MVNETA_GMAC_CLOCK_DIVIDER);
-   val |= MVNETA_GMAC_1MS_CLOCK_ENABLE;
-   mvreg_write(pp, MVNETA_GMAC_CLOCK_DIVIDER, val);
-   } else {
-   val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
-   val &= ~(MVNETA_GMAC_INBAND_AN_ENABLE |
-  MVNETA_GMAC_AN_SPEED_EN |
-  MVNETA_GMAC_AN_DUPLEX_EN);
-   mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
-   }
-
mvneta_set_ucast_table(pp, -1);
mvneta_set_special_mcast_table(pp, -1);
mvneta_set_other_mcast_table(pp, -1);
@@ -2115,26 +2097,11 @@ static irqreturn_t mvneta_isr(int irq, void *dev_id)
return IRQ_HANDLED;
 }
 
-static int mvneta_fixed_link_update(struct mvneta_port *pp,
-   struct phy_device *phy)
+static void mvneta_link_change(struct mvneta_port *pp)
 {
-   struct fixed_phy_status status;
-   struct fixed_phy_status changed = {};
u32 gmac_stat = mvreg_read(pp, MVNETA_GMAC_STATUS);
 
-  

[PATCH RFC 22/26] phylink: add EEE support

2015-12-07 Thread Russell King
Add EEE hooks to phylink to allow the phylib EEE functions for the
connected phy to be safely accessed.

Signed-off-by: Russell King 
---
 drivers/net/phy/phylink.c | 58 ++-
 include/linux/phylink.h   |  7 +-
 2 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 22c8b0711252..82492db0fd04 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -314,7 +314,8 @@ static void phylink_resolve(struct work_struct *w)
if (pl->link_an_mode == MLO_AN_PHY)
pl->ops->mac_config(ndev, MLO_AN_PHY, &link_state);
 
-   pl->ops->mac_link_up(ndev, pl->link_an_mode);
+   pl->ops->mac_link_up(ndev, pl->link_an_mode,
+pl->phydev);
 
netif_carrier_on(ndev);
 
@@ -799,6 +800,61 @@ int phylink_ethtool_set_pauseparam(struct phylink *pl,
 }
 EXPORT_SYMBOL_GPL(phylink_ethtool_set_pauseparam);
 
+int phylink_init_eee(struct phylink *pl, bool clk_stop_enable)
+{
+   int ret = -EPROTONOSUPPORT;
+
+   mutex_lock(&pl->config_mutex);
+   if (pl->phydev)
+   ret = phy_init_eee(pl->phydev, clk_stop_enable);
+   mutex_unlock(&pl->config_mutex);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(phylink_init_eee);
+
+int phylink_get_eee_err(struct phylink *pl)
+{
+   int ret = 0;
+
+   mutex_lock(&pl->config_mutex);
+   if (pl->phydev)
+   ret = phy_get_eee_err(pl->phydev);
+   mutex_unlock(&pl->config_mutex);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(phylink_get_eee_err);
+
+int phylink_ethtool_get_eee(struct phylink *pl, struct ethtool_eee *eee)
+{
+   int ret = -EOPNOTSUPP;
+
+   mutex_lock(&pl->config_mutex);
+   if (pl->phydev)
+   ret = phy_ethtool_get_eee(pl->phydev, eee);
+   mutex_unlock(&pl->config_mutex);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(phylink_ethtool_get_eee);
+
+int phylink_ethtool_set_eee(struct phylink *pl, struct ethtool_eee *eee)
+{
+   int ret = -EOPNOTSUPP;
+
+   mutex_lock(&pl->config_mutex);
+   if (pl->phydev) {
+   ret = phy_ethtool_set_eee(pl->phydev, eee);
+   if (ret == 0 && eee->eee_enabled)
+   phy_start_aneg(pl->phydev);
+   }
+   mutex_unlock(&pl->config_mutex);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(phylink_ethtool_set_eee);
+
 /* This emulates MII registers for a fixed-mode phy operating as per the
  * passed in state. "aneg" defines if we report negotiation is possible.
  *
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index a23c772cc3f9..361fbe9222b2 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -51,7 +51,8 @@ struct phylink_mac_ops {
void (*mac_an_restart)(struct net_device *, unsigned int mode);
 
void (*mac_link_down)(struct net_device *, unsigned int mode);
-   void (*mac_link_up)(struct net_device *, unsigned int mode);
+   void (*mac_link_up)(struct net_device *, unsigned int mode,
+   struct phy_device *);
 };
 
 struct phylink *phylink_create(struct net_device *, struct device_node *,
@@ -74,6 +75,10 @@ void phylink_ethtool_get_pauseparam(struct phylink *,
struct ethtool_pauseparam *);
 int phylink_ethtool_set_pauseparam(struct phylink *,
   struct ethtool_pauseparam *);
+int phylink_init_eee(struct phylink *, bool);
+int phylink_get_eee_err(struct phylink *);
+int phylink_ethtool_get_eee(struct phylink *, struct ethtool_eee *);
+int phylink_ethtool_set_eee(struct phylink *, struct ethtool_eee *);
 int phylink_mii_ioctl(struct phylink *, struct ifreq *, int);
 
 void phylink_set_link_port(struct phylink *pl, u32 support, u8 port);
-- 
2.1.0



[PATCH RFC 19/26] phylink: add flow control support

2015-12-07 Thread Russell King
Add flow control support, including ethtool support, to phylink.  This
lets ethtool get and set the current flow control settings, and
implements the 802.3-specified resolution of the local and link
partner abilities.

Signed-off-by: Russell King 
---
 drivers/net/phy/phylink.c | 128 +-
 include/linux/phylink.h   |   8 +++
 2 files changed, 135 insertions(+), 1 deletion(-)
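
The resolution step this patch adds can be modelled outside the kernel as
below.  This is a sketch only: the PAUSE_* values and the function name are
made up for illustration (the MLO_PAUSE_* definitions are not shown in this
thread), and the two asymmetric rows simply mirror the table and code in the
patch as posted, so their direction is worth checking against Annex 28B of
802.3 before reusing this anywhere.

```c
/* Hypothetical flag values standing in for MLO_PAUSE_*. */
enum {
	PAUSE_ASYM	= 1 << 0,	/* AsymDir ability */
	PAUSE_SYM	= 1 << 1,	/* Pause ability */
	PAUSE_RX	= 1 << 2,	/* resolved: receive direction */
	PAUSE_TX	= 1 << 3,	/* resolved: transmit direction */
	PAUSE_AN	= 1 << 4,	/* resolve via autonegotiation */
	PAUSE_TXRX_MASK	= PAUSE_RX | PAUSE_TX,
};

/* local_cfg: PAUSE_AN, or forced PAUSE_TX/PAUSE_RX bits.
 * local_adv: PAUSE_SYM/PAUSE_ASYM bits we advertise.
 * partner:   PAUSE_SYM/PAUSE_ASYM bits the link partner advertises.
 * Returns the resolved PAUSE_TX/PAUSE_RX bits.
 */
int resolve_flow(int local_cfg, int local_adv, int partner)
{
	int result = 0;

	if (local_cfg & PAUSE_AN) {
		/* Abilities advertised by both ends. */
		int common = local_adv & partner & (PAUSE_SYM | PAUSE_ASYM);

		if (common & PAUSE_SYM)
			result = PAUSE_TX | PAUSE_RX;
		else if (common & PAUSE_ASYM)
			/* Direction as in the patch as posted. */
			result = (partner & PAUSE_SYM) ? PAUSE_RX : PAUSE_TX;
	} else {
		/* Autonegotiation of pause is off: use the forced setting. */
		result = local_cfg & PAUSE_TXRX_MASK;
	}

	return result;
}
```

Keeping the forced TX/RX bits and the PAUSE_AN flag in the same word is what
lets a single ethtool pauseparam request switch between negotiated and
forced flow control.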

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 908235fb16c1..22c8b0711252 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -95,6 +95,9 @@ static int phylink_parse_fixedlink(struct phylink *pl, struct device_node *np)
 
if (of_property_read_bool(fixed_node, "full-duplex"))
pl->link_config.duplex = DUPLEX_FULL;
+
+   /* We treat the "pause" and "asym-pause" terminology as
+* defining the link partner's ability. */
if (of_property_read_bool(fixed_node, "pause"))
pl->link_config.pause |= MLO_PAUSE_SYM;
if (of_property_read_bool(fixed_node, "asym-pause"))
@@ -216,6 +219,56 @@ static void phylink_get_fixed_state(struct phylink *pl, struct phylink_link_stat
state->link = !!gpiod_get_value(pl->link_gpio);
 }
 
+/* Flow control is resolved according to our and the link partner's
+ * advertisements using the following drawn from the 802.3 specs:
+ *  Local device  Link partner
+ *  Pause AsymDir Pause AsymDir Result
+ *    1     X       1     X     TX+RX
+ *    0     1       1     1     RX
+ *    1     1       0     1     TX
+ */
+static void phylink_resolve_flow(struct phylink *pl,
+   struct phylink_link_state *state)
+{
+   int new_pause = 0;
+
+   if (pl->link_config.pause & MLO_PAUSE_AN) {
+   int pause = 0;
+
+   if (pl->link_config.advertising & ADVERTISED_Pause)
+   pause |= MLO_PAUSE_SYM;
+   if (pl->link_config.advertising & ADVERTISED_Asym_Pause)
+   pause |= MLO_PAUSE_ASYM;
+
+   pause &= state->pause;
+
+   if (pause & MLO_PAUSE_SYM)
+   new_pause = MLO_PAUSE_TX | MLO_PAUSE_RX;
+   else if (pause & MLO_PAUSE_ASYM)
+   new_pause = state->pause & MLO_PAUSE_SYM ?
+MLO_PAUSE_RX : MLO_PAUSE_TX;
+   } else {
+   new_pause = pl->link_config.pause & MLO_PAUSE_TXRX_MASK;
+   }
+
+   state->pause &= ~MLO_PAUSE_TXRX_MASK;
+   state->pause |= new_pause;
+}
+
+static const char *phylink_pause_to_str(int pause)
+{
+   switch (pause & MLO_PAUSE_TXRX_MASK) {
+   case MLO_PAUSE_TX | MLO_PAUSE_RX:
+   return "rx/tx";
+   case MLO_PAUSE_TX:
+   return "tx";
+   case MLO_PAUSE_RX:
+   return "rx";
+   default:
+   return "off";
+   }
+}
+
 extern const char *phy_speed_to_str(int speed);
 
 static void phylink_resolve(struct work_struct *w)
@@ -231,6 +284,7 @@ static void phylink_resolve(struct work_struct *w)
switch (pl->link_an_mode) {
case MLO_AN_PHY:
link_state = pl->phy_state;
+   phylink_resolve_flow(pl, &link_state);
break;
 
case MLO_AN_FIXED:
@@ -268,7 +322,7 @@ static void phylink_resolve(struct work_struct *w)
"Link is Up - %s/%s - flow control %s\n",
phy_speed_to_str(link_state.speed),
link_state.duplex ? "Full" : "Half",
-   link_state.pause ? "rx/tx" : "off");
+   phylink_pause_to_str(link_state.pause));
}
}
 mutex_unlock(&pl->state_mutex);
@@ -297,6 +351,7 @@ struct phylink *phylink_create(struct net_device *ndev, struct device_node *np,
pl->link_interface = iface;
pl->link_port_support = SUPPORTED_MII;
pl->link_port = PORT_MII;
+   pl->link_config.pause = MLO_PAUSE_AN;
pl->ops = ops;
 __set_bit(PHYLINK_DISABLE_STOPPED, &pl->phylink_disable_state);
 
@@ -479,6 +534,7 @@ void phylink_start(struct phylink *pl)
 * a fixed-link to start with the correct parameters, and also
 * ensures that we set the appropriate advertisment for Serdes links.
 */
+   phylink_resolve_flow(pl, &pl->link_config);
 pl->ops->mac_config(pl->netdev, pl->link_an_mode, &pl->link_config);
 
 clear_bit(PHYLINK_DISABLE_STOPPED, &pl->phylink_disable_state);
@@ -673,6 +729,76 @@ int phylink_ethtool_nway_reset(struct phylink *pl)
 }
 EXPORT_SYMBOL_GPL(phylink_ethtool_nway_reset);
 
+void phylink_ethtool_get_pauseparam(struct phylink *pl,
+   struct ethtool_pauseparam *pause)
+{
+   
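
The resolution table in phylink_resolve_flow() above can be tried outside the kernel. What follows is only a minimal sketch that mirrors the logic of the patch as posted. The PAUSE_* constants and resolve_pause() are hypothetical stand-ins, not the kernel's MLO_PAUSE_* definitions, and the sketch does not attempt to validate the table against IEEE 802.3.

```c
/* Hypothetical stand-ins for the kernel's MLO_PAUSE_* flags. */
enum {
	PAUSE_SYM  = 1 << 0,	/* "Pause" ability bit */
	PAUSE_ASYM = 1 << 1,	/* "AsymDir" ability bit */
	PAUSE_RX   = 1 << 2,	/* resolved: pause enabled in the RX direction */
	PAUSE_TX   = 1 << 3,	/* resolved: pause enabled in the TX direction */
};

/*
 * Resolve flow control from the local and link partner abilities,
 * following the same steps as the autonegotiation branch of the patch:
 * keep only the abilities both sides advertise, then pick a result.
 */
static int resolve_pause(int local, int partner)
{
	int common = local & partner & (PAUSE_SYM | PAUSE_ASYM);

	if (common & PAUSE_SYM)
		return PAUSE_TX | PAUSE_RX;
	if (common & PAUSE_ASYM)
		return (partner & PAUSE_SYM) ? PAUSE_RX : PAUSE_TX;
	return 0;	/* nothing in common: flow control stays off */
}
```

Each row of the table corresponds to one call, for example resolve_pause(PAUSE_ASYM, PAUSE_SYM | PAUSE_ASYM) for the second row.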

Re: [PATCH] rtlwifi: fix gigantic memleak in rtl_usb

2015-12-07 Thread Bruno Randolf
On 12/06/2015 09:39 PM, Peter Wu wrote:
>>> While using the rtl8192cu driver in monitor mode, somehow 5G of memory
>>> was permanently lost (observable via the Available column in `free -m`).
>>>
>>> This issue has existed since the introduction of this driver in v2.6.x,

One more reason to switch to rtl8xxxu as soon as possible...

bruno



[PATCH net] ipv6: sctp: fix lockdep splat in sctp_v6_get_dst()

2015-12-07 Thread Eric Dumazet
From: Eric Dumazet 

While cooking the sctp np->opt rcu fixes, I forgot to move
one rcu_read_unlock() after the added rcu_dereference() in
sctp_v6_get_dst()

This gave lockdep warnings reported by Dave Jones.

Fixes: c836a8ba9386 ("ipv6: sctp: add rcu protection around np->opt")
Reported-by: Dave Jones 
Signed-off-by: Eric Dumazet 
---
 net/sctp/ipv6.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index acb45b8c2a9d..d28c0b4c9128 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -323,14 +323,13 @@ static void sctp_v6_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
}
}
}
-   rcu_read_unlock();
-
if (baddr) {
fl6->saddr = baddr->v6.sin6_addr;
fl6->fl6_sport = baddr->v6.sin6_port;
 final_p = fl6_update_dst(fl6, rcu_dereference(np->opt), &final);
dst = ip6_dst_lookup_flow(sk, fl6, final_p);
}
+   rcu_read_unlock();
 
 out:
if (!IS_ERR_OR_NULL(dst)) {
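
The ordering problem this patch fixes can be illustrated with a toy model. This is only a sketch under loud assumptions: the toy_rcu_* helpers are hypothetical userspace stand-ins, not the kernel RCU API. A nesting counter plays the role of the read-side critical section and a violation counter plays the role of the lockdep splat.

```c
#include <assert.h>

/* Toy stand-ins, NOT the kernel RCU API. */
static int toy_rcu_depth;	/* read-side nesting level */
static int toy_rcu_violations;	/* dereferences made outside a section */

static void toy_rcu_read_lock(void)
{
	toy_rcu_depth++;
}

static void toy_rcu_read_unlock(void)
{
	assert(toy_rcu_depth > 0);
	toy_rcu_depth--;
}

static const void *toy_rcu_dereference(const void *p)
{
	if (toy_rcu_depth == 0)
		toy_rcu_violations++;	/* lockdep would complain here */
	return p;
}

/* Ordering before the fix: unlock first, then dereference. */
static int lookup_unlock_too_early(const void *opt)
{
	toy_rcu_violations = 0;
	toy_rcu_read_lock();
	toy_rcu_read_unlock();
	(void)toy_rcu_dereference(opt);
	return toy_rcu_violations;
}

/* Ordering after the fix: dereference inside the read-side section. */
static int lookup_unlock_after_use(const void *opt)
{
	toy_rcu_violations = 0;
	toy_rcu_read_lock();
	(void)toy_rcu_dereference(opt);
	toy_rcu_read_unlock();
	return toy_rcu_violations;
}
```

In this model the first function records one violation and the second records none, which matches the intent of moving rcu_read_unlock() below the rcu_dereference() call.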




Re: [PATCH net] ravb: Remove clear unhandled interrupt

2015-12-07 Thread Sergei Shtylyov

Hello.

On 12/07/2015 07:59 PM, Yoshihiro Kaneko wrote:

>>> From: Kazuya Mizuguchi 
>>>
>>> AVB-DMAC Reception Warning interrupt is not enabled, so it is not
>>> necessary to clear the interrupt.
>>>
>>> Signed-off-by: Kazuya Mizuguchi 
>>> Signed-off-by: Yoshihiro Kaneko 
>>
>> In principle I agree but perhaps we should clear RIC1 in probe() to not
>> depend on a state left from a bootloader?
>
> I think that it is a good idea.
> I'll add it to v2.

   Perhaps the ndo_open() method would be a better place...

>> MBR, Sergei
>
> Thanks,
> Kaneko

MBR, Sergei



RE: [PATCH v2 8/8] treewide: Remove newlines inside DEFINE_PER_CPU() macros

2015-12-07 Thread David Laight
From: Michal Marek
> Sent: 04 December 2015 15:26
> Otherwise make tags can't parse them:
> 
> ctags: Warning: arch/ia64/kernel/smp.c:60: null expansion of name pattern "\1"
...

Seems to me you need to fix ctags.

David


Re: Checksum offload queries

2015-12-07 Thread Tom Herbert
On Mon, Dec 7, 2015 at 7:39 AM, Edward Cree  wrote:
> Having decided to take Dave Miller's advice to push our hardware guys in the 
> direction of generic checksum offload, I found I wasn't quite sure exactly 
> what's being encouraged.  After discussing the subject with a colleague, some 
> questions crystallised.  I expect it's mostly a result of misunderstandings 
> on my part, but here goes:
>

Hi Edward, thanks for looking into this.

> 1) Receive checksums.  Given that CHECKSUM_UNNECESSARY conversion exists (and 
> is a cheap operation), what is the advantage to the stack of using 
> CHECKSUM_COMPLETE if the packet happens to be a protocol which 
> CHECKSUM_UNNECESSARY conversion can handle?  As I see it, 
> CHECKSUM_UNNECESSARY is strictly better as the stack is told "the first 
> csum_level+1 checksums are good" *and* (indirectly) "here is the whole-packet 
> checksum, which you can use to help with anything beyond csum_level+1".  Is 
> it not, then, best for a device only to use CHECKSUM_COMPLETE for protocols 
> the conversion doesn't handle?  (I agree that having that fallback of 
> CHECKSUM_COMPLETE is a good thing, sadly I don't think our new chip does 
> that.  (But maybe firmware can fix it.))
>
Checksum unnecessary conversion works great, but its applicability is
limited. It only helps in encapsulation when the UDP checksum can be
enabled, and due to restrictions of other devices we may need to
communicate with (e.g. Cisco switches) it might not be usable. Also,
checksum conversion is not relevant to many other protocols we want to
run, like GRE, IPIP, SIT, MPLS/IP, etc., and does not help with IPv6
extension headers. CHECKSUM_COMPLETE is really the closest thing to a
universal solution.

> 2) Transmit checksums.  While many protocols permit using 0 in the outer 
> checksum, it doesn't seem prudent to assume all will.  Besides, many NICs 
> will still have IP and TCP/UDP checksum offload hardware, if only to support 
> less enlightened operating systems; why not use it?  Would it not be better 
> for a device to have both NETIF_F_HW_CSUM *and* NETIF_F_IP[|V6]_CSUM, and be 
> smart enough to fill in IP checksum, TCP/UDP checksum and one encapsulated 
> checksum of your choice (i.e. whatever csum_start and friends asked for)?  
> (Again, I agree that having a NETIF_F_IP_CSUM device do specific magic for a 
> list of specific encapsulation protocols is unsatisfactory.  Sadly, guess 
> what our new chip does!  (But maybe firmware can fix it.))
>
This is analogous to CHECKSUM_COMPLETE: NETIF_F_HW_CSUM works for all
cases of checksum offload and any combination of protocol layering.
NETIF_F_IP[V6]_CSUM is limited and requires a lot of logic in both
driver and HW to implement correctly. I can't help you with these less
enlightened operating systems; maybe if they see the advantages of a
protocol-generic offload model they'll "get with the program", as Dave
might say? In any case I do not believe we should be at all
constrained in building Linux interfaces or capabilities by the design
decisions made in other operating systems.

> 3) Related to the above, what does a NETIF_F_HW_CSUM device do when 
> transmitting an unencapsulated packet (let's say it's UDP) currently?  Will 
> it simply get no checksum offload at all?  Will csum_start point at the 
> regular UDP checksum (and the stack will do the IP header checksum)?  Again, 
> a device that does both HW_ and IP_CSUM could cope with this (do the IP and 
> UDP checksums as per NETIF_F_IP_CSUM, and just don't ask for a 'generic' 
> HW_CSUM), though that would require more checksum flags (there's no way for 
> CHECKSUM_PARTIAL to say "do your IP-specific stuff but ignore csum_start and 
> friends).
>
The NETIF_F_*_CSUM flags only describe the capabilities of the device;
the interface between the stack and the driver to offload the checksum
of a particular packet is solely the csum_start and csum_offset fields
in the skb (non-GSO case). It is up to the driver to decide whether
the particular checksum can actually be offloaded, but in any case it
must set the correct checksum (through skb_checksum_help if
necessary).
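
To make the csum_start/csum_offset contract concrete, here is a rough userspace sketch of what a software fallback has to do when the device cannot offload a given packet. It is an illustration under assumptions, not the kernel's skb_checksum_help(): the function names are hypothetical, the packet is a flat byte array instead of an skb, and the checksum field is assumed to sit at an even offset from csum_start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of 16-bit big-endian words, folded to 16 bits. */
static uint16_t csum_fold_sum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;	/* pad the odd byte */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Checksum everything from csum_start to the end of the packet and
 * store the complemented result at csum_start + csum_offset. Whatever
 * the stack left in the checksum field (for example a pseudo-header
 * seed) is part of the summed data.
 */
static void sw_checksum_fallback(uint8_t *pkt, size_t len,
				 size_t csum_start, size_t csum_offset)
{
	uint16_t csum;

	assert(csum_start + csum_offset + 2 <= len);

	csum = (uint16_t)~csum_fold_sum(pkt + csum_start, len - csum_start);
	pkt[csum_start + csum_offset] = (uint8_t)(csum >> 8);
	pkt[csum_start + csum_offset + 1] = (uint8_t)(csum & 0xff);
}
```

Nothing in this sketch depends on which protocol owns the checksum field, which is the point being made about the generic interface: the two offsets are all that is needed.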

> 4) Where, precisely, should I tell our hardware guys to stuff the 
> protocol-specific encapsulated checksum offloads they're so proud of having 
> added to our new chip? ;)

Tell them that if they support generic checksum offload, your marketing
guys will be able to list at least fifteen protocols in the product specs
that can be checksum offloaded to the device, instead of just one or
two like the competition ;-)

Thanks,
Tom
.
>
> --
> Edward Cree, not speaking for Solarflare Communications
> The information contained in this message is confidential and is intended for 
> the addressee(s) only. If you have received this message in error, please 
> notify the sender immediately and delete the message. Unless you are an 
> addressee (or authorized to receive for an addressee), you may not use, copy 
> or disclose to anyone 

[PATCH RFC 08/26] phy: export phy_start_machine() for phylink

2015-12-07 Thread Russell King
phylink will need phy_start_machine exported, so let's export it as a
GPL symbol.  Documentation/networking/phy.txt indicates that this
should be a PHY API function.

Signed-off-by: Russell King 
---
 drivers/net/phy/phy.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 150497246922..ec9953202f58 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -523,6 +523,7 @@ void phy_start_machine(struct phy_device *phydev)
 {
 queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, HZ);
 }
+EXPORT_SYMBOL_GPL(phy_start_machine);
 
 /**
  * phy_stop_machine - stop the PHY state machine tracking
-- 
2.1.0



[PATCH v2 next 0/2] dts: hisi: fixes no syscon error when init mdio

2015-12-07 Thread yankejian
This patchset fixes a bug where the Ethernet device fails to initialize
on hip05-D02 because the dts files do not match the source code.

yankejian (2):
  dts: hisi: fixes no syscon error when init mdio
  net: hns: fixes no syscon error when init mdio

---
change log:
v2:
 1) update the related documentation in the binding as well
 2) use the normal naming conventions using '-' instead of '_', and
update the related *.c files.

v1:
 initial version

 .../devicetree/bindings/arm/hisilicon/hisilicon.txt  | 16 
 arch/arm64/boot/dts/hisilicon/hip05.dtsi |  5 +
 arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi |  4 ++--
 drivers/net/ethernet/hisilicon/hns_mdio.c|  2 +-
 4 files changed, 24 insertions(+), 3 deletions(-)

-- 
1.9.1



[PATCH v2 next 1/2] dts: hisi: fixes no syscon error when init mdio

2015-12-07 Thread yankejian
When Linux starts up, we get the log below:
"Hi-HNS_MDIO 803c.mdio: no syscon hisilicon,peri-c-subctrl
 mdio_bus mdio@803c: mdio sys ctl reg has not maped   "

The driver accesses the subctrl registers through syscon, but the dts
files do not describe a syscon node, which causes this fault. This patch
adds the syscon node to the dts files to fix it, and adds documentation
for the devicetree bindings used by the DT files of the Hisilicon
Hip05-D02 development board.

Signed-off-by: yankejian 
---
change log:
v2:
 1) update the related documentation in the binding as well
 2) use the normal naming conventions using '-' instead of '_'

v1:
 initial version
---
 .../devicetree/bindings/arm/hisilicon/hisilicon.txt  | 16 
 arch/arm64/boot/dts/hisilicon/hip05.dtsi |  5 +
 arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi |  4 ++--
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt b/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt
index 6ac7c00..9f05767 100644
--- a/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt
+++ b/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt
@@ -187,6 +187,22 @@ Example:
reg = <0xb000 0x1>;
};
 
+Hisilicon HiP05 PERISUB system controller
+
+Required properties:
+- compatible : "hisilicon,peri-c-subctrl", "syscon";
+- reg : Register address and size
+
+The HiP05 PERISUB system controller is shared by peripheral controllers in
+HiP05 Soc to implement some basic configurations. the peripheral
+controllers include mdio, ddr, iic, uart, timer and so on.
+
+Example:
+   /* for HiP05 PCIe-SAS system */
+   peri-c-subctrl: sub-ctrl-c@8000 {
+   compatible = "hisilicon,peri-c-subctrl", "syscon";
+   reg = <0x0 0x8000 0x0 0x1>;
+   };
 ---
 Hisilicon CPU controller
 
diff --git a/arch/arm64/boot/dts/hisilicon/hip05.dtsi b/arch/arm64/boot/dts/hisilicon/hip05.dtsi
index 4ff16d0..5fec740 100644
--- a/arch/arm64/boot/dts/hisilicon/hip05.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hip05.dtsi
@@ -246,6 +246,11 @@
clock-frequency = <2>;
};
 
+   peri_c_subctrl: sub-ctrl-c@8000 {
+   compatible = "hisilicon,peri-c-subctrl", "syscon";
+   reg = < 0x0 0x8000 0x0 0x1>;
+   };
+
uart0: uart@8030 {
compatible = "snps,dw-apb-uart";
reg = <0x0 0x8030 0x0 0x1>;
diff --git a/arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi b/arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi
index 606dd5a..da7b6e6 100644
--- a/arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi
@@ -10,8 +10,8 @@ soc0: soc@0 {
#address-cells = <1>;
#size-cells = <0>;
compatible = "hisilicon,hns-mdio";
-   reg = <0x0 0x803c 0x0 0x1
-  0x0 0x8000 0x0 0x1>;
+   reg = <0x0 0x803c 0x0 0x1>;
+   subctrl-vbase = <&peri_c_subctrl>;
 
soc0_phy0: ethernet-phy@0 {
reg = <0x0>;
-- 
1.9.1



[RFC PATCH net-next 1/2] tcp: RTO Restart (RTOR)

2015-12-07 Thread Per Hurtig
This patch implements the RTO restart modification (RTOR). When data is
ACKed, and the RTO timer is restarted, the time elapsed since the last
outstanding segment was transmitted is subtracted from the calculated RTO
value. This way, the RTO timer will expire after exactly RTO seconds, and
not RTO + RTT [+ delACK] seconds.

This patch also implements a new sysctl (tcp_timer_restart) that is used
to control the timer restart behavior.

Signed-off-by: Per Hurtig 
---
 Documentation/networking/ip-sysctl.txt | 12 
 include/net/tcp.h  |  4 
 net/ipv4/sysctl_net_ipv4.c | 10 ++
 net/ipv4/tcp_input.c   | 24 
 4 files changed, 50 insertions(+)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 2ea4c45..4094128 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -591,6 +591,18 @@ tcp_syn_retries - INTEGER
with the current initial RTO of 1second. With this the final timeout
for an active TCP connection attempt will happen after 127seconds.
 
+tcp_timer_restart - INTEGER
+   Controls how the RTO and PTO timers are restarted (RTOR and TLPR).
+   If set (per timer or combined) the timers are restarted with
+   respect to the earliest outstanding segment, to not extend tail loss
+   latency unnecessarily.
+   Possible values:
+   0 disables RTOR and TLPR.
+   1 enables RTOR.
+   2 enables TLPR.
+   3 enables RTOR and TLPR.
+   Default: 3
+
 tcp_timestamps - BOOLEAN
Enable timestamps as defined in RFC1323.
 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index f80e74c..bf98768 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -76,6 +76,9 @@ void tcp_time_wait(struct sock *sk, int state, int timeo);
 /* After receiving this amount of duplicate ACKs fast retransmit starts. */
 #define TCP_FASTRETRANS_THRESH 3
 
+/* Disable RTO Restart if the number of outstanding segments is at least. */
+#define TCP_RTORESTART_THRESH  4
+
 /* Maximal number of ACKs sent quickly to accelerate slow-start. */
 #define TCP_MAX_QUICKACKS  16U
 
@@ -284,6 +287,7 @@ extern int sysctl_tcp_autocorking;
 extern int sysctl_tcp_invalid_ratelimit;
 extern int sysctl_tcp_pacing_ss_ratio;
 extern int sysctl_tcp_pacing_ca_ratio;
+extern int sysctl_tcp_timer_restart;
 
 extern atomic_long_t tcp_memory_allocated;
 extern struct percpu_counter tcp_sockets_allocated;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index a0bd7a5..dfb6968 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -28,6 +28,7 @@
 
 static int zero;
 static int one = 1;
+static int three = 3;
 static int four = 4;
 static int thousand = 1000;
 static int gso_max_segs = GSO_MAX_SEGS;
@@ -745,6 +746,15 @@ static struct ctl_table ipv4_table[] = {
.extra2 = ,
},
{
+   .procname   = "tcp_timer_restart",
+   .data   = &sysctl_tcp_timer_restart,
+   .maxlen = sizeof(int),
+   .mode   = 0644,
+   .proc_handler   = proc_dointvec_minmax,
+   .extra1 = &zero,
+   .extra2 = &three,
+   },
+   {
 .procname   = "tcp_autocorking",
 .data   = &sysctl_tcp_autocorking,
.maxlen = sizeof(int),
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index fdd88c3..66e0425 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -101,6 +101,7 @@ int sysctl_tcp_thin_dupack __read_mostly;
 
 int sysctl_tcp_moderate_rcvbuf __read_mostly = 1;
 int sysctl_tcp_early_retrans __read_mostly = 3;
+int sysctl_tcp_timer_restart __read_mostly = 3;
 int sysctl_tcp_invalid_ratelimit __read_mostly = HZ/2;
 
 #define FLAG_DATA  0x01 /* Incoming frame contained data. */
@@ -2997,6 +2998,18 @@ static void tcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
tcp_sk(sk)->snd_cwnd_stamp = tcp_time_stamp;
 }
 
+static u32 tcp_unsent_pkts(const struct sock *sk)
+{
+   struct sk_buff *skb = tcp_send_head(sk);
+   u32 pkts = 0;
+
+   if (skb)
+   tcp_for_write_queue_from(skb, sk)
+   pkts += tcp_skb_pcount(skb);
+
+   return pkts;
+}
+
 /* Restart timer after forward progress on connection.
  * RFC2988 recommends to restart timer to now+rto.
  */
@@ -3027,6 +3040,17 @@ void tcp_rearm_rto(struct sock *sk)
 */
if (delta > 0)
rto = delta;
+   } else if (icsk->icsk_pending == ICSK_TIME_RETRANS &&
+  (sysctl_tcp_timer_restart == 1 ||
+   sysctl_tcp_timer_restart == 3) &&
+  (tp->packets_out + tcp_unsent_pkts(sk) <
+  

[RFC PATCH net-next 0/2] tcp: timer restart for tail loss

2015-12-07 Thread Per Hurtig
This is a request for comments.

RTO and TLP restart is a modification to the restart process of the RTO and
TLP timers in TCP. Currently, each timer is restarted with its
corresponding timeout value when an acknowledgment (ACK) for correctly
received data arrives. In many situations, resetting the timers on
incoming ACKs adds an implicit offset of at least one RTT to the
loss recovery process.

The goal of the modified restart is to provide quicker loss recovery for
segments lost at the end of a burst/connection, where the limited feedback
from a receiver inhibits the use of fast/early retransmit. To accomplish
this, the algorithm adjusts the RTO and PTO (TLP's timer) values on each
rearm of the timers so that they expire exactly RTO and PTO ms,
respectively, after the earliest outstanding segment was transmitted.

The restart behavior is controlled by the new tcp_timer_restart sysctl.
tcp_timer_restart==0; disables RTOR and TLPR.
                 ==1; enables RTOR.
                 ==2; enables TLPR.
                 ==3; enables both RTOR and TLPR. [DEFAULT]
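
The core adjustment can be sketched in a few lines of plain C. This is a simplified illustration modelled on the TLPR hunk in patch 2/2, not the patch itself: restart_timeout(), RESTART_RTOR and RESTART_TLPR are hypothetical names, the sysctl is treated as a two-bit mask, and the extra conditions in the real code (the outstanding-segment threshold for RTOR and the minimum timeout clamp) are left out.

```c
#include <stdint.h>

/* Hypothetical names for the two bits of tcp_timer_restart. */
#define RESTART_RTOR 0x1
#define RESTART_TLPR 0x2

/*
 * Return the value to arm the timer with.
 *
 * timeout: the freshly calculated RTO or PTO, in ticks
 * now:     current timestamp, in ticks
 * sent:    timestamp of the earliest outstanding segment
 * mode:    value of the sysctl (0..3)
 * which:   RESTART_RTOR or RESTART_TLPR, the timer being rearmed
 */
static uint32_t restart_timeout(uint32_t timeout, uint32_t now,
				uint32_t sent, int mode, int which)
{
	/* Signed difference so that timestamp wraparound is handled. */
	int32_t delta = (int32_t)(now - sent);

	if (!(mode & which))
		return timeout;		/* classic behaviour: full timeout */

	/* Subtract the time the segment has already been outstanding. */
	if (delta > 0 && timeout > (uint32_t)delta)
		timeout -= (uint32_t)delta;

	return timeout;
}
```

For example, with a 300-tick timeout and a segment sent 100 ticks ago, the timer is armed with 200 ticks when the relevant bit is set and with the full 300 ticks otherwise.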

The new restart behavior has been approved by the IETF for publication as
an experimental RFC (for the RTO part of the mechanism) [1], and
experiments have been conducted to show both the benefit of this strategy
(in terms of reduced loss-recovery delays) and the limited negative impact
(in terms of spurious retransmissions) [2]. More information regarding the
mechanism can be found at [3].

Basic functionality tests, using packetdrill, are available at:
https://github.com/perhurt/packetdrill/tree/master/gtests/net/packetdrill/tests/linux/timer_restart


[1] https://datatracker.ietf.org/doc/draft-ietf-tcpm-rtorestart/
[2] 
http://www.sigcomm.org/sites/default/files/ccr/papers/2015/January/000-000.pdf
[3] http://riteproject.eu/resources/rto-restart/

Per Hurtig (2):
  tcp: RTO Restart (RTOR)
  tcp: TLP restart (TLPR)

 Documentation/networking/ip-sysctl.txt | 12 
 include/net/tcp.h  |  6 +-
 net/ipv4/sysctl_net_ipv4.c | 10 ++
 net/ipv4/tcp_input.c   | 26 +-
 net/ipv4/tcp_output.c  | 12 ++--
 5 files changed, 62 insertions(+), 4 deletions(-)

-- 
1.9.1




[RFC PATCH net-next 2/2] tcp: TLP restart (TLPR)

2015-12-07 Thread Per Hurtig
This patch implements the TLP restart modification (TLPR). When data is
ACKed, and TLP's PTO timer is restarted, the time elapsed since the last
outstanding segment was transmitted is subtracted from the calculated PTO
value to not unnecessarily delay loss probes.

Signed-off-by: Per Hurtig 
---
 include/net/tcp.h |  2 +-
 net/ipv4/tcp_input.c  |  2 +-
 net/ipv4/tcp_output.c | 12 ++--
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index bf98768..8ac4118 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -566,7 +566,7 @@ void tcp_push_one(struct sock *, unsigned int mss_now);
 void tcp_send_ack(struct sock *sk);
 void tcp_send_delayed_ack(struct sock *sk);
 void tcp_send_loss_probe(struct sock *sk);
-bool tcp_schedule_loss_probe(struct sock *sk);
+bool tcp_schedule_loss_probe(struct sock *sk, bool restart);
 
 /* tcp_input.c */
 void tcp_resume_early_retransmit(struct sock *sk);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 66e0425..28d3b21 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3655,7 +3655,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
}
 
if (icsk->icsk_pending == ICSK_TIME_RETRANS)
-   tcp_schedule_loss_probe(sk);
+   tcp_schedule_loss_probe(sk, true);
tcp_update_pacing_rate(sk);
return 1;
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index cb7ca56..752db3d 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2135,7 +2135,7 @@ repair:
 
/* Send one loss probe per tail loss episode. */
if (push_one != 2)
-   tcp_schedule_loss_probe(sk);
+   tcp_schedule_loss_probe(sk, false);
is_cwnd_limited |= (tcp_packets_in_flight(tp) >= tp->snd_cwnd);
tcp_cwnd_validate(sk, is_cwnd_limited);
return false;
@@ -2143,10 +2143,11 @@ repair:
return !tp->packets_out && tcp_send_head(sk);
 }
 
-bool tcp_schedule_loss_probe(struct sock *sk)
+bool tcp_schedule_loss_probe(struct sock *sk, bool restart)
 {
struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_sock *tp = tcp_sk(sk);
+   struct sk_buff *skb = tcp_write_queue_head(sk);
u32 timeout, tlp_time_stamp, rto_time_stamp;
u32 rtt = usecs_to_jiffies(tp->srtt_us >> 3);
 
@@ -2186,6 +2187,13 @@ bool tcp_schedule_loss_probe(struct sock *sk)
if (tp->packets_out == 1)
timeout = max_t(u32, timeout,
(rtt + (rtt >> 1) + TCP_DELACK_MAX));
+   if (sysctl_tcp_timer_restart > 1 && restart && skb) {
+   const u32 rto_time_stamp = tcp_skb_timestamp(skb);
+   s32 delta = (s32)(tcp_time_stamp - rto_time_stamp);
+
+   if (delta > 0 && timeout > delta)
+   timeout -= delta;
+   }
timeout = max_t(u32, timeout, msecs_to_jiffies(10));
 
/* If RTO is shorter, just schedule TLP in its place. */
-- 
1.9.1




Re: [PATCH v2 5/5] stmmac: socfpga: Provide dt node to config ptp clk source.

2015-12-07 Thread Arnd Bergmann
On Monday 07 December 2015 09:38:44 Phil Reid wrote:
> Signed-off-by: Phil Reid 
> ---
>  Documentation/devicetree/bindings/net/socfpga-dwmac.txt | 2 ++
>  drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c | 9 +
>  2 files changed, 11 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/net/socfpga-dwmac.txt 
> b/Documentation/devicetree/bindings/net/socfpga-dwmac.txt
> index 3a9d679..72d82d6 100644
> --- a/Documentation/devicetree/bindings/net/socfpga-dwmac.txt
> +++ b/Documentation/devicetree/bindings/net/socfpga-dwmac.txt
> @@ -11,6 +11,8 @@ Required properties:
>   designware version numbers documented in stmmac.txt
>   - altr,sysmgr-syscon : Should be the phandle to the system manager node that
> encompasses the glue register, the register offset, and the register 
> shift.
> + - altr,f2h_ptp_ref_clk use f2h_ptp_ref_clk instead of default eosc1 clock
> +   for ptp ref clk. This affects all emacs as the clock is common.
> 

Is this feature specific to the Altera glue logic, or would it be possible
to do the same thing on another dwmac implementation?

Arnd


Re: [PATCH net-next] net: hns: optimize XGE capability by reducing cpu usage

2015-12-07 Thread Joe Perches
On Mon, 2015-12-07 at 16:58 +0800, Yankejian (Hackim Yim) wrote:
> On 2015/12/7 11:32, Joe Perches wrote:
> > On Sun, 2015-12-06 at 22:29 -0500, David Miller wrote:
> > > > From: yankejian 
> > > > Date: Sat, 5 Dec 2015 15:32:29 +0800
> > > > 
> > > > > > +#if (PAGE_SIZE < 8192)
> > > > > > + if (hnae_buf_size(ring) == HNS_BUFFER_SIZE_2048) {
> > > > > > + truesize = hnae_buf_size(ring);
> > > > > > + } else {
> > > > > > + truesize = ALIGN(size, L1_CACHE_BYTES);
> > > > > > + last_offset = hnae_page_size(ring) - 
> > > > > > hnae_buf_size(ring);
> > > > > > + }
> > > > > > +
> > > > > > +#else
> > > > > > + truesize = ALIGN(size, L1_CACHE_BYTES);
> > > > > > + last_offset = hnae_page_size(ring) - 
> > > > > > hnae_buf_size(ring);
> > > > > > +#endif
> > > > 
> > > > This is not indented properly, and it looks terrible.
> > And it makes one curious as to why last_offset isn't set
> > in the first block.
> 
> Hi Joe,

Hello.

> if hnae_buf_size is equal to HNS_BUFFER_SIZE, last_offset is useless in the 
> routines of this function.
> so it is ignored in the first block. thanks for your suggestion.

More to the point, last_offset is initialized to 0.

It'd be clearer not to initialize it at all and
set it to 0 in the first block and not overwrite
the initialization in each subsequent block.
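
A rough sketch of the structure being suggested, where every branch assigns both values and nothing relies on an initializer at the declaration. The names here are hypothetical stand-ins for the hns driver's helpers and constants, and the compile-time PAGE_SIZE check is turned into a runtime parameter only to keep the example self-contained.

```c
#include <stddef.h>

#define SKETCH_BUF_SIZE_2048 2048	/* stand-in for HNS_BUFFER_SIZE_2048 */
#define SKETCH_CACHE_LINE    64		/* stand-in for L1_CACHE_BYTES */
#define SKETCH_ALIGN(x, a)   (((x) + (a) - 1) & ~((size_t)(a) - 1))

struct reuse_params {
	size_t truesize;
	size_t last_offset;
};

static struct reuse_params pick_reuse_params(size_t page_size,
					     size_t buf_size, size_t size)
{
	struct reuse_params p;

	if (page_size < 8192 && buf_size == SKETCH_BUF_SIZE_2048) {
		p.truesize = buf_size;
		p.last_offset = 0;	/* set explicitly in this block */
	} else {
		p.truesize = SKETCH_ALIGN(size, SKETCH_CACHE_LINE);
		p.last_offset = page_size - buf_size;
	}
	return p;
}
```

Written this way, a reader does not have to look back at the declaration to know what last_offset holds in the 2048-byte case.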



Re: [PATCH v2 4/5] stmmac: Add ptp debugfs entry.

2015-12-07 Thread Arnd Bergmann
On Monday 07 December 2015 09:38:43 Phil Reid wrote:
> This adds a debugfs entry to view the current status of the ptp
> registers.
> 
> Signed-off-by: Phil Reid 
> 

Your description should explain what this is good for. Why do you
need to look at this through debugfs?

Arnd


Re: [PATCH net-next] net: hns: optimize XGE capability by reducing cpu usage

2015-12-07 Thread Yankejian (Hackim Yim)


On 2015/12/7 17:05, Joe Perches wrote:
> On Mon, 2015-12-07 at 16:58 +0800, Yankejian (Hackim Yim) wrote:
>> On 2015/12/7 11:32, Joe Perches wrote:
>>> On Sun, 2015-12-06 at 22:29 -0500, David Miller wrote:
> From: yankejian 
> Date: Sat, 5 Dec 2015 15:32:29 +0800
>
>>> +#if (PAGE_SIZE < 8192)
>>> + if (hnae_buf_size(ring) == HNS_BUFFER_SIZE_2048) {
>>> + truesize = hnae_buf_size(ring);
>>> + } else {
>>> + truesize = ALIGN(size, L1_CACHE_BYTES);
>>> + last_offset = hnae_page_size(ring) - hnae_buf_size(ring);
>>> + }
>>> +
>>> +#else
>>> + truesize = ALIGN(size, L1_CACHE_BYTES);
>>> + last_offset = hnae_page_size(ring) - hnae_buf_size(ring);
>>> +#endif
> This is not indented properly, and it looks terrible.
>>> And it makes one curious as to why last_offset isn't set
>>> in the first block.
>> Hi Joe,
> Hello.
>
>> if hnae_buf_size is equal to HNS_BUFFER_SIZE, last_offset is useless in the 
>> routines of this function.
>> so it is ignored in the first block. thanks for your suggestion.
> More to the point, last_offset is initialized to 0.
>
> It'd be clearer not to initialize it at all and
thanks, that is a good idea.

> set it to 0 in the first block and not overwrite
> the initialization in each subsequent block.
Because it is useless in that case, I think we had better ignore it.

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
> .
>




Re: (4.3.0) r8152: deadlock related to runtime suspend?

2015-12-07 Thread Peter Wu
On Mon, Dec 07, 2015 at 05:11:50PM +0800, Lu Baolu wrote:
> Hi Peter,
> 
> Have you ever tried disabling auto-pm? Did things go smoothly if auto-pm is 
> disabled?
> 
> I always disable usb auto-pm in below way.
> 
> # echo on | tee /sys/bus/usb/devices/*/power/control
> # echo on > /sys/bus/pci/devices//power/control
> 
> Thanks,
> Baolu

Hi Baolu,

The deadlock does not seem to occur with auto-PM disabled, but that is a
workaround for the issue. The hang can always be reproduced under this
test:

 - Start a QEMU VM, passing through the USB adapter
 - This VM boots to a busybox shell with no other services running or
   udev magic (to reduce interference).
 - Enable runtime PM for all devices by default (see script below)
 - From the console, invoke "ip link set eth1 up" (eth0 is a virtio
   adapter).

# somewhere in /init after mounting filesystems
echo /sbin/hotplug > /proc/sys/kernel/hotplug
echo auto | tee /sys/bus/pci/devices/*/power/control \
    /sys/bus/usb/devices/*/power/control >/dev/null

#!/bin/sh
# /sbin/hotplug
path="/sys/$DEVPATH/power/control"
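# Nothing to do for devices without a runtime PM control file.
# ("exit" rather than "return": this runs as a script, not a function.)
[ -e "$path" ] || exit 0
newval=auto
read -r status < "$path"
if [ "x$status" != "x$newval" ]; then
    echo "$DEVPATH: $status -> $newval" >/dev/kmsg
    echo "$newval" > "$path"
fi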

With "auto", the ip command hangs (a trace can be found at the bottom of
this mail). With "on", it does not.

If I keep a loop spinning that invokes `ethtool eth1`, the command
returns immediately without issues (presumably because the device is not
suspended through runtime PM).

Under some circumstances I get a lockdep warning (when trying to bring
an interface down, if I remember correctly). Its trace can be found at
the bottom of this mail.

I'll keep testing. For the lockdep warning, my initial guess is that
calling cancel_delayed_work_sync under tp->lock is a bad idea because
the scheduled work can execute and try to claim tp->lock too.
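
A minimal sketch of that guess, as a toy model rather than kernel code. It
assumes two lock classes named after the entries in the splat at the bottom
of this mail (the control mutex and the delayed work item) and records every
"acquired B while holding A" edge; an edge that would close a cycle is
rejected, which is roughly the kind of check lockdep performs. All names
below are hypothetical, and the model only covers what the truncated trace
shows.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the two entries in the splat. */
enum lock_class { LOCK_CONTROL, LOCK_SCHEDULE_WORK, NR_LOCKS };

/* dep[a][b] is true once some path acquired b while already holding a. */
static bool dep[NR_LOCKS][NR_LOCKS];

static void reset_deps(void)
{
    memset(dep, 0, sizeof(dep));
}

/* Depth-first search: is 'to' reachable from 'from' over recorded edges? */
static bool reaches(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_LOCKS; next++)
        if (dep[from][next] && !seen[next] && reaches(next, to, seen))
            return true;
    return false;
}

/*
 * Record "acquire 'wanted' while holding 'held'".
 * Returns false, and records nothing, if that edge would close a cycle.
 */
static bool acquire_while_holding(int held, int wanted)
{
    bool seen[NR_LOCKS] = { false };

    assert(held >= 0 && held < NR_LOCKS);
    assert(wanted >= 0 && wanted < NR_LOCKS);
    if (reaches(wanted, held, seen))
        return false;
    dep[held][wanted] = true;
    return true;
}
```

Feeding it the two paths from the trace, open (control held, then waiting for
the work in cancel_delayed_work_sync) and the worker (work running, then
taking control in rtl8152_resume), accepts the first edge and rejects the
second.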

Maybe there are two different lockup cases here.

Kind regards,
Peter

> On 12/05/2015 06:59 PM, Peter Wu wrote:
> > Hi,
> >
> > I rarely use a Realtek USB 3.0 Gigabit Ethernet adapter (vid/pid
> > 0bda:8153), but when I did last night, it resulted in a lockup of
> > processes doing networking ("ip link", "ping", "ethtool", ...).
> >
> > A (few) minute(s) before that event, I noticed that there was no network
> > connectivity (ping hung) which was somehow solved by invoking "ethtool
> > eth1" (triggering runtime pm wakeup?). This same trick did not work at
> > the next event. Invoking "ethtool eth1", "ip link", etc. hung completely
> > and interrupt (^C) did not work at all.
> >
> > Since that did not work, I pulled the USB adapter and re-inserted it,
> > hoping it would reset things. That did not work at all, there was a
> > "usb disconnect" message, but no further driver messages.
> >
> > Fast forward an hour, and it has become a disaster. I have terminated
> > and killed many programs via SysRq but am still unable to get a stable
> > system that does not hang on network I/O. Even the suspend process
> > fails so in the end I attempted to shutdown the system. After half an
> > hour after getting the poweroff message, I issued SysRq + B to reboot
> > (since SysRq + O did not shut down either).
> >
> > Attached are logs with various backtraces from SysRq and failed
> > suspend. Let me know if you need more information!
> >
> > By the way, often I have to rmmod xhci and re-insert it, otherwise
> > plugging it in does not result in a detection. A USB 2.0 port does not
> > have this problem (runtime PM is enabled for all devices). This is the
> > USB 3.0 port:
> >
> > 02:00.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0
> > Host Controller [1033:0194] (rev 03)
> 

-- 

lockdep splat from the bare machine:

==
[ INFO: possible circular locking dependency detected ]
4.3.0-custom #1 Tainted: G   O   
---
kworker/0:1/38 is trying to acquire lock:
 (>control){+.+.+.}, at: [] rtl8152_resume+0x24/0x130 
[r8152]

but task is already holding lock:
 ((&(>schedule)->work)){+.+.+.}, at: [] 
process_one_work+0x15c/0x660

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 ((&(>schedule)->work)){+.+.+.}:

   [] lock_acquire+0xc3/0x1d0
   [] flush_work+0x3d/0x290
   [] __cancel_work_timer+0xfe/0x1c0
   [] cancel_delayed_work_sync+0x13/0x20
   [] rtl8152_set_speed+0x2a/0x260 [r8152]
   [] rtl8152_open+0x396/0x4f0 [r8152]
   [] __dev_open+0xaf/0x120
   [] __dev_change_flags+0x9d/0x160
   [] dev_change_flags+0x29/0x70
   [] do_setlink+0x5ba/0xb00
   [] rtnl_newlink+0x5a9/0x8a0
   [] rtnetlink_rcv_msg+0x84/0x210
   [] netlink_rcv_skb+0x97/0xb0
   

Re: ipsec impact on performance

2015-12-07 Thread Steffen Klassert
On Thu, Dec 03, 2015 at 06:38:20AM -0500, Sowmini Varadhan wrote:
> On (12/03/15 09:45), Steffen Klassert wrote:
> > pcrypt(echainiv(authenc(hmac(sha1-ssse3),cbc-aes-aesni)))
> > 
> > Result:
> > 
> > iperf -c 10.0.0.12 -t 60
> > 
> > Client connecting to 10.0.0.12, TCP port 5001
> > TCP window size: 45.0 KByte (default)
> > 
> > [  3] local 192.168.0.12 port 39380 connected with 10.0.0.12 port 5001
> > [ ID] Interval   Transfer Bandwidth
> > [  3]  0.0-60.0 sec  32.8 GBytes  4.70 Gbits/sec
> > 
> > I'll provide more information as soon as the code is available.
> 
> that's pretty good compared to the baseline. 

This is GRO in combination with a pcrypt-parallelized
crypto algorithm. Without the parallelization, GRO/GSO
does not help because crypto is then the bottleneck.

> I'd like to try out your patches, when they are ready.

I've pushed it to

https://git.kernel.org/cgit/linux/kernel/git/klassert/linux-stk.git/log/?h=net-next-ipsec-offload

It is just example code, nothing that I would usually show.
But you asked for it, so here it is :)

The GRO part seems to work well; the GSO part is just a hack at the
moment.



Re: (4.3.0) r8152: deadlock related to runtime suspend?

2015-12-07 Thread Lu Baolu
Hi Peter,

Have you tried disabling auto-PM? Do things go smoothly when auto-PM is
disabled?

I always disable USB auto-PM in the following way:

# echo on | tee /sys/bus/usb/devices/*/power/control
# echo on > /sys/bus/pci/devices//power/control

Thanks,
Baolu

On 12/05/2015 06:59 PM, Peter Wu wrote:
> Hi,
>
> I rarely use a Realtek USB 3.0 Gigabit Ethernet adapter (vid/pid
> 0bda:8153), but when I did last night, it resulted in a lockup of
> processes doing networking ("ip link", "ping", "ethtool", ...).
>
> A (few) minute(s) before that event, I noticed that there was no network
> connectivity (ping hung) which was somehow solved by invoking "ethtool
> eth1" (triggering runtime pm wakeup?). This same trick did not work at
> the next event. Invoking "ethtool eth1", "ip link", etc. hung completely
> and interrupt (^C) did not work at all.
>
> Since that did not work, I pulled the USB adapter and re-inserted it,
> hoping it would reset things. That did not work at all, there was a
> "usb disconnect" message, but no further driver messages.
>
> Fast forward an hour, and it has become a disaster. I have terminated
> and killed many programs via SysRq but am still unable to get a stable
> system that does not hang on network I/O. Even the suspend process
> fails so in the end I attempted to shutdown the system. After half an
> hour after getting the poweroff message, I issued SysRq + B to reboot
> (since SysRq + O did not shut down either).
>
> Attached are logs with various backtraces from SysRq and failed
> suspend. Let me know if you need more information!
>
> By the way, often I have to rmmod xhci and re-insert it, otherwise
> plugging it in does not result in a detection. A USB 2.0 port does not
> have this problem (runtime PM is enabled for all devices). This is the
> USB 3.0 port:
>
> 02:00.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0
> Host Controller [1033:0194] (rev 03)



Re: [net-next v5 0/8] dpaa_eth: Add the Freescale DPAA Ethernet driver

2015-12-07 Thread Madalin-Cristian Bucur

Hi Timur,

I somehow managed to make git send-email move the From: line into the body
instead of the header; I probably typed something wrong when asked to confirm
the sender. I've resent the series.

Regards,
Madalin

From: Timur Tabi 
Sent: Saturday, December 5, 2015 6:40:11 AM
To: Bucur Madalin-Cristian-B32716
Cc: netdev@vger.kernel.org; linuxppc-...@lists.ozlabs.org; lkml; David Miller; 
Wood Scott-B07421; Liberman Igal-B31950; p...@mindchasers.com; Joe Perches; 
pebo...@tiscali.nl; Joakim Tjernlund; Greg Kroah-Hartman
Subject: Re: [net-next v5 0/8] dpaa_eth: Add the Freescale DPAA Ethernet driver

On Thu, Dec 3, 2015 at 6:08 AM,  <> wrote:
> From: Madalin Bucur 
>
> This patch series adds the Ethernet driver for the Freescale
> QorIQ Data Path Acceleration Architecture (DPAA).

Please fix your git-send-email configuration, so that your emails are
formatted properly.  This is the From: header:

From: <>

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


Re: [PATCH net-next 2/2] net: hns: enet specifies a reference to dsaf (config and documents)

2015-12-07 Thread Arnd Bergmann
On Monday 07 December 2015 15:14:13 Yankejian wrote:
> On 2015/12/6 6:19, Arnd Bergmann wrote:
> > On Saturday 05 December 2015 14:10:56 yankejian wrote:
> >> diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt 
> >> b/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
> >> index 80411b2..ecacfa4 100644
> >> --- a/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
> >> +++ b/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
> >> @@ -4,8 +4,6 @@ Required properties:
> >>  - compatible: should be "hisilicon,hns-dsaf-v1" or 
> >> "hisilicon,hns-dsaf-v2".
> >>"hisilicon,hns-dsaf-v1" is for hip05.
> >>"hisilicon,hns-dsaf-v2" is for Hi1610 and Hi1612.
> >> -- dsa-name: dsa fabric name who provide this interface.
> >> -  should be "dsafX", X is the dsaf id.
> >>  - mode: dsa fabric mode string. only support one of dsaf modes like these:
> >> "2port-64vf",
> >> "6port-16rss",
> >> @@ -26,9 +24,8 @@ Required properties:
> >>  
> >>  Example:
> >>  
> >> -dsa: dsa@c700 {
> >> +dsaf0: dsa@c700 {
> >> compatible = "hisilicon,hns-dsaf-v1";
> >> -   dsa_name = "dsaf0";
> >> mode = "6port-16rss";
> >> interrupt-parent = <_dsa>;
> >> reg = <0x0 0xC000 0x0 0x42
> >> diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt 
> >> b/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
> >> index 41d19be..e6a9d1c 100644
> >> --- a/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
> >> +++ b/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
> >> @@ -4,8 +4,9 @@ Required properties:
> >>  - compatible: "hisilicon,hns-nic-v1" or "hisilicon,hns-nic-v2".
> >>"hisilicon,hns-nic-v1" is for hip05.
> >>"hisilicon,hns-nic-v2" is for Hi1610 and Hi1612.
> >> -- ae-name: accelerator name who provides this interface,
> >> -  is simply a name referring to the name of name in the accelerator node.
> >> +- ae-handle: accelerator engine handle for hns,
> >> +  specifies a reference to the associating hardware driver node.
> >> +  see Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
> >>  - port-id: is the index of port provided by DSAF (the accelerator). DSAF 
> >> can
> >>connect to 8 PHYs. Port 0 to 1 are both used for adminstration purpose. 
> >> They
> >>are called debug ports.
> >> @@ -41,7 +42,7 @@ Example:
> >>  
> >>
> > This looks like an incompatible change, as you add and remove
> > required properties. Is there a way to support both the old and
> > the new style?
> >
> >   Arnd
> >
> > .
> 
> Hi Arnd,
> Thanks for your suggestions. Previously, the same string had to be set in the
> dsaf node and in every enet node, which seems inappropriate. Rob Herring
> suggested a way to associate enet with a particular dsaf, so we discussed the
> solution with Yisen Zhuang and decided to use the new way instead of the old
> one.

I agree the new form looks better than the original way, but I'm worried
about the migration path. You don't explain in the patch description
how you want to ensure that nothing breaks for existing systems.

We generally try to avoid doing incompatible changes altogether and
prefer to keep backwards compatibility, unless we can prove that no
other systems exist that would get impacted by the change.

Are you sure that nobody ships a DTB file for this hardware with their
firmware that would now require an incompatible update which in turn
breaks old kernels?

Are you sure that there is no hardware using the same dsa hardware
with out-of-tree dts files that need to make the same change but might
not be aware of the change?

Arnd


[PATCH] prism54: fix compare_const_fl.cocci warnings (fwd)

2015-12-07 Thread Julia Lawall
Move constants to the right of binary operators.

Generated by: scripts/coccinelle/misc/compare_const_fl.cocci

Signed-off-by: Fengguang Wu 
Signed-off-by: Julia Lawall 
---

It looks a little nicer to me because n is the thing we care about.
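
For anyone checking that this is a pure reordering, here is a small sketch.
It assumes a hypothetical three-entry enum in place of the driver's
oid_num_t and made-up helper names; it only illustrates that both spellings
reject exactly the same indices.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's enum; only the sentinel matters. */
enum example_oid { EX_OID_FIRST, EX_OID_SECOND, EX_OID_NUM_LAST };

/* Old spelling: constant on the left of the comparison. */
static bool out_of_range_old(unsigned int n)
{
    return EX_OID_NUM_LAST <= n;
}

/* New spelling: the value being checked comes first. */
static bool out_of_range_new(unsigned int n)
{
    return n >= EX_OID_NUM_LAST;
}
```

Both helpers return the same result for every n, so the BUG_ON conditions
trigger on the same inputs before and after the change.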

 oid_mgt.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/drivers/net/wireless/intersil/prism54/oid_mgt.c
+++ b/drivers/net/wireless/intersil/prism54/oid_mgt.c
@@ -424,7 +424,7 @@ mgt_set_request(islpci_private *priv, en
void *cache, *_data = data;
u32 oid;

-   BUG_ON(OID_NUM_LAST <= n);
+   BUG_ON(n >= OID_NUM_LAST);
BUG_ON(extra > isl_oid[n].range);

if (!priv->mib)
@@ -485,7 +485,7 @@ mgt_set_varlen(islpci_private *priv, enu
int dlen;
u32 oid;

-   BUG_ON(OID_NUM_LAST <= n);
+   BUG_ON(n >= OID_NUM_LAST);

dlen = isl_oid[n].size;
oid = isl_oid[n].oid;
@@ -524,7 +524,7 @@ mgt_get_request(islpci_private *priv, en
void *cache, *_res = NULL;
u32 oid;

-   BUG_ON(OID_NUM_LAST <= n);
+   BUG_ON(n >= OID_NUM_LAST);
BUG_ON(extra > isl_oid[n].range);

res->ptr = NULL;
@@ -626,7 +626,7 @@ mgt_commit_list(islpci_private *priv, en
 void
 mgt_set(islpci_private *priv, enum oid_num_t n, void *data)
 {
-   BUG_ON(OID_NUM_LAST <= n);
+   BUG_ON(n >= OID_NUM_LAST);
BUG_ON(priv->mib[n] == NULL);

memcpy(priv->mib[n], data, isl_oid[n].size);
@@ -636,7 +636,7 @@ mgt_set(islpci_private *priv, enum oid_n
 void
 mgt_get(islpci_private *priv, enum oid_num_t n, void *res)
 {
-   BUG_ON(OID_NUM_LAST <= n);
+   BUG_ON(n >= OID_NUM_LAST);
BUG_ON(priv->mib[n] == NULL);
BUG_ON(res == NULL);



[PATCH] netfilter: nfnetlink_queue: Unregister pernet subsys in case of init failure

2015-12-07 Thread Nikolay Borisov
Commit 3bfe049807c2403 ('netfilter: nfnetlink_{log,queue}:
Register pernet in first place') reorganised the initialisation
order of the pernet_subsys to avoid "use-before-initialised"
condition. However, in doing so the cleanup logic in nfnetlink_queue
got botched in that the pernet_subsys wasn't cleaned in case
nfnetlink_subsys_register failed. This patch adds the necessary
cleanup routine call.

Fixes: 3bfe049807c2403 ('netfilter: nfnetlink_{log,queue}: Register
pernet in first place')

Signed-off-by: Nikolay Borisov 
---
 net/netfilter/nfnetlink_queue.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index 7d81d280cb4f..2e94603c2dec 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -1417,6 +1417,7 @@ static int __init nfnetlink_queue_init(void)
 
 cleanup_netlink_notifier:
netlink_unregister_notifier(_rtnl_notifier);
+   unregister_pernet_subsys(_queue_net_ops);
 out:
return status;
 }
-- 
2.5.0
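
The unwind order this fix restores can be sketched in userspace as follows.
This is a hedged illustration, not the nfnetlink_queue code: do_register(),
do_unregister(), fail_at and the step names are hypothetical stand-ins, and
the sketch uses one label per step where the patch reuses the existing
cleanup_netlink_notifier label.

```c
/* Hypothetical registration steps, in the order they are set up. */
enum step { STEP_PERNET, STEP_NOTIFIER, STEP_SUBSYS, NR_STEPS };

static int registered[NR_STEPS]; /* 1 while the step is registered */
static int fail_at = -1;         /* step that should fail, or -1 for none */

static int do_register(int step)
{
    if (step == fail_at)
        return -1;
    registered[step] = 1;
    return 0;
}

static void do_unregister(int step)
{
    registered[step] = 0;
}

/* Each failure jumps to the label that undoes everything done so far. */
static int example_init(void)
{
    int status;

    status = do_register(STEP_PERNET);
    if (status < 0)
        goto out;

    status = do_register(STEP_NOTIFIER);
    if (status < 0)
        goto cleanup_pernet;

    status = do_register(STEP_SUBSYS);
    if (status < 0)
        goto cleanup_notifier;

    return 0;

cleanup_notifier:
    do_unregister(STEP_NOTIFIER);
cleanup_pernet:
    do_unregister(STEP_PERNET); /* the kind of call the patch adds */
out:
    return status;
}
```

With fail_at set to STEP_SUBSYS, example_init() returns an error and leaves
nothing registered, which is the behaviour the missing
unregister_pernet_subsys() call broke.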
