Re: [PATCH net v2 1/1] net: tcp: Permit user set TCP_MAXSEG to default value

2017-03-20 Thread Feng Gao
On Tue, Mar 21, 2017 at 9:23 AM, Neal Cardwell  wrote:
> On Mon, Mar 20, 2017 at 8:45 PM,  wrote:
>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>> index 1e319a5..4f7f163 100644
>> --- a/net/ipv4/tcp.c
>> +++ b/net/ipv4/tcp.c
>> @@ -2470,7 +2470,7 @@ static int do_tcp_setsockopt(struct sock *sk, int 
>> level,
>> /* Values greater than interface MTU won't take effect. 
>> However
>>  * at the point when this call is done we typically don't yet
>>  * know which interface is going to be used */
>> -   if (val < TCP_MIN_MSS || val > MAX_TCP_WINDOW) {
>> +   if (!val && (val < TCP_MIN_MSS || val > MAX_TCP_WINDOW)) {
>> err = -EINVAL;
>> break;
>
> I believe the sense of the val check is flipped in the proposed patch.
>
> I believe Eric suggested:
>
>   if (val && (val < TCP_MIN_MSS || val > MAX_TCP_WINDOW)) {
>
> Has this been tested?
>
> neal

Sorry, I missed the test this time because of the minor fix.
As a result, wrote the wrong logic.

Regards
Feng


Re: [PATCH nf v2 1/1] netfilter: snmp: Fix one possible panic when snmp_trap_helper fail to register

2017-03-20 Thread Feng Gao
On Tue, Mar 21, 2017 at 12:35 AM, Sergei Shtylyov
<sergei.shtyl...@cogentembedded.com> wrote:
> On 03/20/2017 01:15 PM, Feng Gao wrote:
>
>>>> From: Gao Feng <f...@ikuai8.com>
>>>>
>>>> In the commit <93557f53e1fb> ("netfilter: nf_conntrack: nf_conntrack
>>>> snmp
>>>
>>>
>>>
>>>Angle brackets not needed. :-)
>>>The commit citing style is the same as for the Fixes: tag.
>>
>>
>> The checkpatch.pl reports the following error, if remove the angle
>> brackets.
>
>
>Because it stops recognizing the commit ID! :-)
>
>> ERROR: Please use git commit description style 'commit <12+ chars of
>> sha1> ("")' - ie: 'commit fatal: ambig ("evision or path
>> not in the working tree.")'
>
>
>So check the patch in the correct tree because that seems to be the
> problem... Angle brackets are surely not required.

Actually I didn't add the angle brackets firstly, but it fail to pass
the check_patch.pl check.
So I had to modify it.

Ok, I removed the angle brackets now, just ignored the error report of
check_patch.pl.

Best Regards
Feng

>
>> #7:
>> In the commit 93557f53e1fb ("netfilter: nf_conntrack: nf_conntrack snmp
>>
>> total: 1 errors, 0 warnings, 0 checks, 27 lines checked
>>
>>
>> Regards
>> Feng
>
>
> [...]
>
> MBR, Sergei
>


Re: [PATCH nf 1/1] netfilter: snmp: Fix one possible panic when snmp_trap_helper fail to register

2017-03-20 Thread Feng Gao
On Mon, Mar 20, 2017 at 5:37 PM, Sergei Shtylyov
 wrote:
> Hello!
>
> On 3/20/2017 4:44 AM, f...@ikuai8.com wrote:
>
>> From: Gao Feng 
>>
>> In the commit ("netfilter: nf_conntrack: nf_conntrack snmp helper"),
>
>
>Need to specify the 12-digit SHA1 ID as well.

Thanks Sergei.
I didn't know it is asked to add SHA1 ID into description, thanks your reminder.

I have sent the v2 patch.

Regards
Feng

>
>> the snmp_helper is replaced by nf_nat_snmp_hook. So the snmp_helper
>> is never registered. But it still tries to unregister the snmp_helper,
>> it could cause the panic.
>>
>> Now remove the useless snmp_helper and the unregister call in the
>> error handler.
>>
>> Fixes: 93557f53e1fb ("netfilter: nf_conntrack: nf_conntrack snmp helper")
>>
>> Signed-off-by: Gao Feng 
>
> [...]
>
> MBR, Sergei
>


Re: [PATCH nf v2 1/1] netfilter: snmp: Fix one possible panic when snmp_trap_helper fail to register

2017-03-20 Thread Feng Gao
On Mon, Mar 20, 2017 at 6:09 PM, Sergei Shtylyov
 wrote:
> On 3/20/2017 12:55 PM, f...@ikuai8.com wrote:
>
>> From: Gao Feng 
>>
>> In the commit <93557f53e1fb> ("netfilter: nf_conntrack: nf_conntrack snmp
>
>
>Angle brackets not needed. :-)
>The commit citing style is the same as for the Fixes: tag.

The checkpatch.pl reports the following error, if remove the angle brackets.

ERROR: Please use git commit description style 'commit <12+ chars of
sha1> ("")' - ie: 'commit fatal: ambig ("evision or path
not in the working tree.")'
#7:
In the commit 93557f53e1fb ("netfilter: nf_conntrack: nf_conntrack snmp

total: 1 errors, 0 warnings, 0 checks, 27 lines checked


Regards
Feng

>
>> helper"), the snmp_helper is replaced by nf_nat_snmp_hook. So the
>> snmp_helper is never registered. But it still tries to unregister the
>> snmp_helper, it could cause the panic.
>>
>> Now remove the useless snmp_helper and the unregister call in the
>> error handler.
>>
>> Fixes: 93557f53e1fb ("netfilter: nf_conntrack: nf_conntrack snmp helper")
>>
>> Signed-off-by: Gao Feng 
>
> [...]
>
> MBR, Sergei
>


Re: [PATCH v3 net-next 1/2] net: Avoid unnessary loop when master_idx is invalid in inet_select_addr

2017-03-09 Thread Feng Gao
Hi David,

On Fri, Mar 10, 2017 at 11:58 AM, David Ahern  wrote:
> On 3/9/17 7:52 PM, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> When master_idx is invalid, it is zero. It is unnecessary to iterate
>> all netdevs. Because l3mdev_master_ifindex_rcu(dev) != master_idx must
>> be true.
>> Now put this loop into the condition block when master_idx is valid.
>
> you are significantly changing how this loop works, not just the
> master_idx == 0 case. Basically, if dev does not have an address, you
> are not going to look at the other interfaces.
>
> The
>if (l3mdev_master_ifindex_rcu(dev) != master_idx)
>
> is actually relevant.
>
> If dev is enslaved to an L3 device, only consider devices in that
> domain. Conversely, if dev is NOT enslaved to an L3 device, do NOT
> consider devices that are enslaved to one. Both of those are equally
> important.

Thanks. I didn't consider about the case that dev is enslaved to an l3
device you pointed.

Regards
Feng


Re: [PATCH net-next 2/2] net: Eliminate duplicated codes by creating one new function in_dev_select_addr

2017-03-09 Thread Feng Gao
Hi Sergei,

On Tue, Mar 7, 2017 at 11:46 PM, Sergei Shtylyov
 wrote:
> Hello!
>
> On 03/07/2017 05:51 PM, f...@ikuai8.com wrote:
>
>> From: Gao Feng 
>>
>> There are two duplicated loop codes which used to select right
>
>
>Just "loops".
>
>> address in current codes. Now eliminate these codes by creating
>> one new function in_dev_select_addr.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>  net/ipv4/devinet.c | 34 +++---
>>  1 file changed, 19 insertions(+), 15 deletions(-)
>>
>> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
>> index 1a9e550..d0964c5 100644
>> --- a/net/ipv4/devinet.c
>> +++ b/net/ipv4/devinet.c
>> @@ -1191,6 +1191,19 @@ static int inet_gifconf(struct net_device *dev,
>> char __user *buf, int len)
>> return done;
>>  }
>>
>> +static __be32 in_dev_select_addr(const struct in_device *in_dev,
>> +int scope)
>> +{
>> +   for_primary_ifa(in_dev) {
>> +   if (ifa->ifa_scope != RT_SCOPE_LINK &&
>> +   ifa->ifa_scope <= scope) {
>> +   return ifa->ifa_local;
>> +   }
>
>
>Could drop the useless {} here, while at it.

Thanks, I would correct it in v2 patch.

The checkpatch.pl ignores this issue when I checked this patch.
It seems one bug of checkpatch.pl

Regards
Feng

>
>> +   } endfor_ifa(in_dev);
>> +
>> +   return 0;
>> +}
>> +
>>  __be32 inet_select_addr(const struct net_device *dev, __be32 dst, int
>> scope)
>>  {
>> __be32 addr = 0;
>
> [...]
>
> MBR, Sergei
>


Re: [PATCH net-next 1/1] net: sock: Use USEC_PER_SEC macro instead of literal 1000000

2017-02-20 Thread Feng Gao
On Tue, Feb 21, 2017 at 3:13 PM, Joe Perches  wrote:
> On Mon, 2017-02-20 at 22:33 +0800, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> The USEC_PER_SEC is used once in sock_set_timeout as the max value of
>> tv_usec. But there are other similar codes which use the literal
>> 100 in this file.
>> It is minor cleanup to keep consitent.
> []
>> diff --git a/net/core/sock.c b/net/core/sock.c
> []
>> @@ -367,7 +367,7 @@ static int sock_set_timeout(long *timeo_p, char __user 
>> *optval, int optlen)
>>   if (tv.tv_sec == 0 && tv.tv_usec == 0)
>>   return 0;
>>   if (tv.tv_sec < (MAX_SCHEDULE_TIMEOUT/HZ - 1))
>> - *timeo_p = tv.tv_sec*HZ + 
>> (tv.tv_usec+(100/HZ-1))/(100/HZ);
>> + *timeo_p = tv.tv_sec * HZ + (tv.tv_usec + (USEC_PER_SEC / HZ - 
>> 1)) / (USEC_PER_SEC / HZ);
>
> Maybe convert this to DIV_ROUND_UP one day too?
>
> *timeo_p = tv.tv_sec * HZ + DIV_ROUND_UP(tv.tv_usec, USEC_PER_SEC / 
> HZ);
>

Good idea, thanks.
I would send v2 update.

Regards
Feng


Re: [B.A.T.M.A.N.] [PATCH net-next 1/2] net: batman-adv: Treat NET_XMIT_CN as transmit successfully

2016-12-16 Thread Feng Gao
On Fri, Dec 16, 2016 at 5:19 PM, Sven Eckelmann  wrote:
> On Montag, 21. November 2016 23:00:32 CET f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> The tc could return NET_XMIT_CN as one congestion notification, but
>> it does not mean the packet is lost. Other modules like ipvlan,
>> macvlan, and others treat NET_XMIT_CN as success too.
>>
>> So batman-adv should add the NET_XMIT_CN check.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>  net/batman-adv/distributed-arp-table.c |  2 +-
>>  net/batman-adv/fragmentation.c |  2 +-
>>  net/batman-adv/routing.c   | 10 +-
>>  net/batman-adv/soft-interface.c|  2 +-
>>  net/batman-adv/tp_meter.c  |  2 +-
>>  5 files changed, 9 insertions(+), 9 deletions(-)
>
> David marked your patches as "derefered" after "under review" and did not
> apply them directly. Also Florian Westphal didn't continue the discussion
> about the direction you should choose.
>
> The patches were therefore queued up in the in batman-adv
> 671630d6aad0..eab7617142d2. They will be submitted later(tm) in a pull
> request to David.

I get it. Thanks Sven.

Regards
Feng

>
> Thanks,
> Sven


Re: [PATCH net-next 1/1] driver: ipvlan: Define common functions to decrease duplicated codes used to add or del IP address

2016-12-14 Thread Feng Gao
On Wed, Dec 14, 2016 at 10:52 PM,   wrote:
> From: Gao Feng 
>
> There are some duplicated codes in ipvlan_add_addr6/4 and
> ipvlan_del_addr6/4. Now define two common functions ipvlan_add_addr
> and ipvlan_del_addr to decrease the duplicated codes.
> It could be helful to maintain the codes.
>
> Signed-off-by: Gao Feng 
> ---
>  drivers/net/ipvlan/ipvlan_main.c | 68 
> +---
>  1 file changed, 29 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/net/ipvlan/ipvlan_main.c 
> b/drivers/net/ipvlan/ipvlan_main.c
> index 693ec5b..5874d30 100644
> --- a/drivers/net/ipvlan/ipvlan_main.c
> +++ b/drivers/net/ipvlan/ipvlan_main.c
> @@ -669,23 +669,22 @@ static int ipvlan_device_event(struct notifier_block 
> *unused,
> return NOTIFY_DONE;
>  }
>
> -static int ipvlan_add_addr6(struct ipvl_dev *ipvlan, struct in6_addr 
> *ip6_addr)
> +static int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
>  {
> struct ipvl_addr *addr;
>
> -   if (ipvlan_addr_busy(ipvlan->port, ip6_addr, true)) {
> -   netif_err(ipvlan, ifup, ipvlan->dev,
> - "Failed to add IPv6=%pI6c addr for %s intf\n",
> - ip6_addr, ipvlan->dev->name);
> -   return -EINVAL;
> -   }
> addr = kzalloc(sizeof(struct ipvl_addr), GFP_ATOMIC);
> if (!addr)
> return -ENOMEM;
>
> addr->master = ipvlan;
> -   memcpy(>ip6addr, ip6_addr, sizeof(struct in6_addr));
> -   addr->atype = IPVL_IPV6;
> +   if (is_v6) {
> +   memcpy(>ip6addr, iaddr, sizeof(struct in6_addr));
> +   addr->atype = IPVL_IPV6;
> +   } else {
> +   memcpy(>ip4addr, iaddr, sizeof(struct in_addr));
> +   addr->atype = IPVL_IPV4;
> +   }
> list_add_tail(>anode, >addrs);
>
> /* If the interface is not up, the address will be added to the hash
> @@ -697,11 +696,11 @@ static int ipvlan_add_addr6(struct ipvl_dev *ipvlan, 
> struct in6_addr *ip6_addr)
> return 0;
>  }
>
> -static void ipvlan_del_addr6(struct ipvl_dev *ipvlan, struct in6_addr 
> *ip6_addr)
> +static void ipvlan_del_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
>  {
> struct ipvl_addr *addr;
>
> -   addr = ipvlan_find_addr(ipvlan, ip6_addr, true);
> +   addr = ipvlan_find_addr(ipvlan, iaddr, is_v6);
> if (!addr)
> return;
>
> @@ -712,6 +711,23 @@ static void ipvlan_del_addr6(struct ipvl_dev *ipvlan, 
> struct in6_addr *ip6_addr)
> return;
>  }
>
> +static int ipvlan_add_addr6(struct ipvl_dev *ipvlan, struct in6_addr 
> *ip6_addr)
> +{
> +   if (ipvlan_addr_busy(ipvlan->port, ip6_addr, true)) {
> +   netif_err(ipvlan, ifup, ipvlan->dev,
> + "Failed to add IPv6=%pI6c addr for %s intf\n",
> + ip6_addr, ipvlan->dev->name);
> +   return -EINVAL;
> +   }
> +
> +   return ipvlan_add_addr(ipvlan, ip6_addr, true);
> +}
> +
> +static void ipvlan_del_addr6(struct ipvl_dev *ipvlan, struct in6_addr 
> *ip6_addr)
> +{
> +   return ipvlan_del_addr(ipvlan, ip6_addr, true);
> +}
> +
>  static int ipvlan_addr6_event(struct notifier_block *unused,
>   unsigned long event, void *ptr)
>  {
> @@ -745,45 +761,19 @@ static int ipvlan_addr6_event(struct notifier_block 
> *unused,
>
>  static int ipvlan_add_addr4(struct ipvl_dev *ipvlan, struct in_addr 
> *ip4_addr)
>  {
> -   struct ipvl_addr *addr;
> -
> if (ipvlan_addr_busy(ipvlan->port, ip4_addr, false)) {
> netif_err(ipvlan, ifup, ipvlan->dev,
>   "Failed to add IPv4=%pI4 on %s intf.\n",
>   ip4_addr, ipvlan->dev->name);
> return -EINVAL;
> }
> -   addr = kzalloc(sizeof(struct ipvl_addr), GFP_KERNEL);
> -   if (!addr)
> -   return -ENOMEM;
> -
> -   addr->master = ipvlan;
> -   memcpy(>ip4addr, ip4_addr, sizeof(struct in_addr));
> -   addr->atype = IPVL_IPV4;
> -   list_add_tail(>anode, >addrs);
> -
> -   /* If the interface is not up, the address will be added to the hash
> -* list by ipvlan_open.
> -*/
> -   if (netif_running(ipvlan->dev))
> -   ipvlan_ht_addr_add(ipvlan, addr);
>
> -   return 0;
> +   return ipvlan_add_addr(ipvlan, ip4_addr, false);
>  }
>
>  static void ipvlan_del_addr4(struct ipvl_dev *ipvlan, struct in_addr 
> *ip4_addr)
>  {
> -   struct ipvl_addr *addr;
> -
> -   addr = ipvlan_find_addr(ipvlan, ip4_addr, false);
> -   if (!addr)
> -   return;
> -
> -   ipvlan_ht_addr_del(addr);
> -   list_del(>anode);
> -   kfree_rcu(addr, rcu);
> -
> -   return;
> +   return ipvlan_del_addr(ipvlan, ip4_addr, false);
>  }
>
>  static int ipvlan_addr4_event(struct notifier_block 

Re: [PATCH net-next 1/1] driver: ipvlan: Add the sanity check for ipvlan mode

2016-11-28 Thread Feng Gao
Hi David & Mahesh,

On Tue, Nov 29, 2016 at 4:08 AM, David Miller  wrote:
> From: Mahesh Bandewar (महेश बंडेवार) 
> Date: Mon, 28 Nov 2016 11:02:45 -0800
>
>> On Mon, Nov 28, 2016 at 5:23 AM,  wrote:
>>
>>> From: Gao Feng 
>>>
>>> The ipvlan mode variable "nval" is from userspace, so the ipvlan codes
>>> should check if the mode variable "nval" is valid.
>>>
>>> Signed-off-by: Gao Feng 
>>> ---
>>>  drivers/net/ipvlan/ipvlan_main.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_
>>> main.c
>>> index ab90b22..537b5a9 100644
>>> --- a/drivers/net/ipvlan/ipvlan_main.c
>>> +++ b/drivers/net/ipvlan/ipvlan_main.c
>>> @@ -65,6 +65,9 @@ static int ipvlan_set_port_mode(struct ipvl_port *port,
>>> u16 nval)
>>> struct net_device *mdev = port->dev;
>>> int err = 0;
>>>
>>> +   if (nval >= IPVLAN_MODE_MAX)
>>> +   return -EINVAL;
>>> +
>>>
>> I'm curious to know how you encountered this issue? The values are
>> validated in ipvlan_nl_validate() and it should fail at that time itself.
>
> I'm not applying this without at least a better explanation.

Sorry, I didn't find the function "ipvlan_nl_validate" during reading
the ipvlan codes.

Regards
Feng


Re: [PATCH net 1/1] net: batman-adv: Treat NET_XMIT_CN as transmit successfully

2016-11-21 Thread Feng Gao
Hi Sergei,

On Mon, Nov 21, 2016 at 6:44 PM, Sergei Shtylyov
 wrote:
> Hello.
>
> On 11/21/2016 3:39 AM, f...@ikuai8.com wrote:
>
>> From: Gao Feng 
>>
>> The tc could return NET_XMIT_CN as one congestion notification, but
>> it does not mean the packe is lost. Other modules like ipvlan,
>
>
>Packet.

Thanks, it was typo.

>
>> macvlan, and others treat NET_XMIT_CN as success too.
>>
>> So batman-adv should add the NET_XMIT_CN check.
>>
>> Signed-off-by: Gao Feng 
>
>
> [...]
>
>> diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c
>> index 7e8dc64..8edd324 100644
>> --- a/net/batman-adv/routing.c
>> +++ b/net/batman-adv/routing.c
>> @@ -706,7 +706,7 @@ static int batadv_route_unicast_packet(struct sk_buff
>> *skb,
>> goto out;
>>
>> /* translate transmit result into receive result */
>> -   if (res == NET_XMIT_SUCCESS) {
>> +   if (res == NET_XMIT_SUCCESS || ret == NET_XMIT_CN) {

Thanks again.
I didn't find it during myself's review and compile process.

>
>
>Not 'res == NET_XMIT_CN'?
>
>> /* skb was transmitted and consumed */
>> batadv_inc_counter(bat_priv, BATADV_CNT_FORWARD);
>> batadv_add_counter(bat_priv, BATADV_CNT_FORWARD_BYTES,
>
> [...]
>
> MBR, Sergei
>

I have sent the v2 patch which corrects these two typos.

Best Regards
Feng


Re: [B.A.T.M.A.N.] [PATCH net 1/1] net: batman-adv: Treat NET_XMIT_CN as transmit successfully

2016-11-21 Thread Feng Gao
Hi Sven,

On Mon, Nov 21, 2016 at 4:31 PM, Sven Eckelmann <s...@narfation.org> wrote:
> On Montag, 21. November 2016 16:21:52 CET Feng Gao wrote:
>> Hi Sven,
>>
>> On Mon, Nov 21, 2016 at 4:16 PM, Sven Eckelmann <s...@narfation.org> wrote:
>> > On Montag, 21. November 2016 08:39:39 CET f...@ikuai8.com wrote:
>> >> From: Gao Feng <f...@ikuai8.com>
>> >>
>> >> The tc could return NET_XMIT_CN as one congestion notification, but
>> >> it does not mean the packe is lost. Other modules like ipvlan,
>> >
>> > s/packe/packet/
>>
>> What's this mean?
>
> That there is a minor typo (*t* is missing) and this sed statement (when
> applied only to the commit message) would fix it.

Thanks. I didn't thought it was sed statement.

>
> David already marked this patch as "Under Review" in his patchwork. So I would
> guess that he will accept this patch and not the batman-adv maintainers. And
> maybe he will fix this small typo - or maybe not.
>
> Kind regards,
> Sven

I would correct the typo in the patch for net-next.

Best Regards
Feng


Re: [B.A.T.M.A.N.] [PATCH net 1/1] net: batman-adv: Treat NET_XMIT_CN as transmit successfully

2016-11-21 Thread Feng Gao
Hi Sven,

On Mon, Nov 21, 2016 at 4:16 PM, Sven Eckelmann  wrote:
> On Montag, 21. November 2016 08:39:39 CET f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> The tc could return NET_XMIT_CN as one congestion notification, but
>> it does not mean the packe is lost. Other modules like ipvlan,
>
> s/packe/packet/

What's this mean?

>
>> macvlan, and others treat NET_XMIT_CN as success too.
>>
>> So batman-adv should add the NET_XMIT_CN check.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>  net/batman-adv/distributed-arp-table.c | 2 +-
>>  net/batman-adv/fragmentation.c | 2 +-
>>  net/batman-adv/routing.c   | 2 +-
>>  net/batman-adv/tp_meter.c  | 2 +-
>>  4 files changed, 4 insertions(+), 4 deletions(-)
> [...]
>
> Not sure how the batman-adv maintainers see this - but this patch needs
> an additional patch for net-next to also add it to the parts which were
> rewritten.
>
> Kind regards,
> Sven

Ok. I would commit another patch to net-next.

Best Regards
Feng


Re: [PATCH net-next 1/1] driver: macvlan: Remove duplicated IFF_UP condition check in macvlan_forward_source

2016-11-20 Thread Feng Gao
On Mon, Nov 21, 2016 at 8:26 AM,   wrote:
> From: Gao Feng 
>
> The function macvlan_forward_source_one has already checked the flag
> IFF_UP, so needn't check it outside in macvlan_forward_source too.
>
> Signed-off-by: Gao Feng 
> ---
>  v2: Remove the IFF_UP check in macvlan_forward_source instead of 
> macvlan_forward_source_one
>  v1: Initial patch
>
>  drivers/net/macvlan.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
> index 13b7e0b..7ddfd2c 100644
> --- a/drivers/net/macvlan.c
> +++ b/drivers/net/macvlan.c
> @@ -400,8 +400,7 @@ static void macvlan_forward_source(struct sk_buff *skb,
>
> hlist_for_each_entry_rcu(entry, h, hlist) {
> if (ether_addr_equal_64bits(entry->addr, addr))
> -   if (entry->vlan->dev->flags & IFF_UP)
> -   macvlan_forward_source_one(skb, entry->vlan);
> +   macvlan_forward_source_one(skb, entry->vlan);
> }
>  }
>
> --
> 1.9.1
>
>

Sorry, I forget add the "v2" in the title.

Regards
Feng


Re: [PATCH net 1/1] driver: macvlan: Destroy new macvlan port if macvlan_common_newlink failed.

2016-11-07 Thread Feng Gao
On Fri, Nov 4, 2016 at 10:28 AM,   wrote:
> From: Gao Feng 
>
> When there is no existing macvlan port in lowdev, one new macvlan port
> would be created. But it doesn't be destoried when something failed later.
> It casues some memleak.
>
> Now add one flag to indicate if new macvlan port is created.
>
> Signed-off-by: Gao Feng 
> ---
>  drivers/net/macvlan.c | 31 ++-
>  1 file changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
> index 3234fcd..d2d6f12 100644
> --- a/drivers/net/macvlan.c
> +++ b/drivers/net/macvlan.c
> @@ -1278,6 +1278,7 @@ int macvlan_common_newlink(struct net *src_net, struct 
> net_device *dev,
> struct net_device *lowerdev;
> int err;
> int macmode;
> +   bool create = false;
>
> if (!tb[IFLA_LINK])
> return -EINVAL;
> @@ -1304,12 +1305,18 @@ int macvlan_common_newlink(struct net *src_net, 
> struct net_device *dev,
> err = macvlan_port_create(lowerdev);
> if (err < 0)
> return err;
> +   create = true;
> }
> port = macvlan_port_get_rtnl(lowerdev);
>
> /* Only 1 macvlan device can be created in passthru mode */
> -   if (port->passthru)
> -   return -EINVAL;
> +   if (port->passthru) {
> +   /* The macvlan port must be not created this time,
> +* still goto destroy_macvlan_port for readability.
> +*/
> +   err = -EINVAL;
> +   goto destroy_macvlan_port;
> +   }
>
> vlan->lowerdev = lowerdev;
> vlan->dev  = dev;
> @@ -1325,24 +1332,28 @@ int macvlan_common_newlink(struct net *src_net, 
> struct net_device *dev,
> vlan->flags = nla_get_u16(data[IFLA_MACVLAN_FLAGS]);
>
> if (vlan->mode == MACVLAN_MODE_PASSTHRU) {
> -   if (port->count)
> -   return -EINVAL;
> +   if (port->count) {
> +   err = -EINVAL;
> +   goto destroy_macvlan_port;
> +   }
> port->passthru = true;
> eth_hw_addr_inherit(dev, lowerdev);
> }
>
> if (data && data[IFLA_MACVLAN_MACADDR_MODE]) {
> -   if (vlan->mode != MACVLAN_MODE_SOURCE)
> -   return -EINVAL;
> +   if (vlan->mode != MACVLAN_MODE_SOURCE) {
> +   err = -EINVAL;
> +   goto destroy_macvlan_port;
> +   }
> macmode = nla_get_u32(data[IFLA_MACVLAN_MACADDR_MODE]);
> err = macvlan_changelink_sources(vlan, macmode, data);
> if (err)
> -   return err;
> +   goto destroy_macvlan_port;
> }
>
> err = register_netdevice(dev);
> if (err < 0)
> -   return err;
> +   goto destroy_macvlan_port;
>
> dev->priv_flags |= IFF_MACVLAN;
> err = netdev_upper_dev_link(lowerdev, dev);
> @@ -1357,7 +1368,9 @@ int macvlan_common_newlink(struct net *src_net, struct 
> net_device *dev,
>
>  unregister_netdev:
> unregister_netdevice(dev);
> -
> +destroy_macvlan_port:
> +   if (create)
> +   macvlan_port_destroy(port->dev);
> return err;
>  }
>  EXPORT_SYMBOL_GPL(macvlan_common_newlink);
> --
> 1.9.1
>
>

Hi all,

How about this fix ? Is there any issue ?

Regards
Feng


Re: [PATCH net-next 1/1] driver: veth: Refine the statistics codes of veth driver

2016-11-03 Thread Feng Gao
It is deprecated.

On Thu, Nov 3, 2016 at 8:50 PM,   wrote:
> From: Gao Feng 
>
> The dropped count of veth is located in struct veth_priv, but other
> statistics like packets and bytes are in another struct pcpu_vstats.
> Now keep these three counters in the same struct.
>
> Signed-off-by: Gao Feng 
> ---
>  drivers/net/veth.c | 32 ++--
>  1 file changed, 18 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index 0520952a..3d8326f 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -26,12 +26,12 @@
>  struct pcpu_vstats {
> u64 packets;
> u64 bytes;
> +   u64 dropped;
> struct u64_stats_sync   syncp;
>  };
>
>  struct veth_priv {
> struct net_device __rcu *peer;
> -   atomic64_t  dropped;
> unsignedrequested_headroom;
>  };
>
> @@ -108,6 +108,8 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct 
> net_device *dev)
> struct veth_priv *priv = netdev_priv(dev);
> struct net_device *rcv;
> int length = skb->len;
> +   struct pcpu_vstats *stats = this_cpu_ptr(dev->vstats);
> +   int ret = NET_RX_DROP;
>
> rcu_read_lock();
> rcv = rcu_dereference(priv->peer);
> @@ -116,17 +118,16 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, 
> struct net_device *dev)
> goto drop;
> }
>
> -   if (likely(dev_forward_skb(rcv, skb) == NET_RX_SUCCESS)) {
> -   struct pcpu_vstats *stats = this_cpu_ptr(dev->vstats);
> -
> -   u64_stats_update_begin(>syncp);
> +   ret = dev_forward_skb(rcv, skb);
> +drop:
> +   u64_stats_update_begin(>syncp);
> +   if (likely(ret == NET_RX_SUCCESS)) {
> stats->bytes += length;
> stats->packets++;
> -   u64_stats_update_end(>syncp);
> } else {
> -drop:
> -   atomic64_inc(>dropped);
> +   stats->dropped++;
> }
> +   u64_stats_update_end(>syncp);
> rcu_read_unlock();
> return NETDEV_TX_OK;
>  }
> @@ -135,27 +136,28 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, 
> struct net_device *dev)
>   * general routines
>   */
>
> -static u64 veth_stats_one(struct pcpu_vstats *result, struct net_device *dev)
> +static void veth_stats_one(struct pcpu_vstats *result, struct net_device 
> *dev)
>  {
> -   struct veth_priv *priv = netdev_priv(dev);
> int cpu;
>
> result->packets = 0;
> result->bytes = 0;
> +   result->dropped = 0;
> for_each_possible_cpu(cpu) {
> struct pcpu_vstats *stats = per_cpu_ptr(dev->vstats, cpu);
> -   u64 packets, bytes;
> +   u64 packets, bytes, dropped;
> unsigned int start;
>
> do {
> start = u64_stats_fetch_begin_irq(>syncp);
> packets = stats->packets;
> bytes = stats->bytes;
> +   dropped = stats->dropped;
> } while (u64_stats_fetch_retry_irq(>syncp, start));
> result->packets += packets;
> result->bytes += bytes;
> +   result->dropped += dropped;
> }
> -   return atomic64_read(>dropped);
>  }
>
>  static struct rtnl_link_stats64 *veth_get_stats64(struct net_device *dev,
> @@ -165,16 +167,18 @@ static struct rtnl_link_stats64 
> *veth_get_stats64(struct net_device *dev,
> struct net_device *peer;
> struct pcpu_vstats one;
>
> -   tot->tx_dropped = veth_stats_one(, dev);
> +   veth_stats_one(, dev);
> tot->tx_bytes = one.bytes;
> tot->tx_packets = one.packets;
> +   tot->tx_dropped = one.dropped;
>
> rcu_read_lock();
> peer = rcu_dereference(priv->peer);
> if (peer) {
> -   tot->rx_dropped = veth_stats_one(, peer);
> +   veth_stats_one(, dev);

I find there is one issue. It should be peer, not dev.

> tot->rx_bytes = one.bytes;
> tot->rx_packets = one.packets;
> +   tot->rx_dropped = one.dropped;
> }
> rcu_read_unlock();
>
> --
> 1.9.1
>
>

I find this patch contains one error, so please ignore it.

Regards
Feng


Re: [PATCH 1/1] net: vlan: Use sizeof instead of literal number

2016-10-17 Thread Feng Gao
On Tue, Oct 18, 2016 at 8:44 AM,   wrote:
> From: Gao Feng 
>
> Use sizeof variable instead of literal number to enhance the readability.
>
> Signed-off-by: Gao Feng 
> ---
>  net/8021q/vlan.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
> index 8de138d..5a3903b 100644
> --- a/net/8021q/vlan.c
> +++ b/net/8021q/vlan.c
> @@ -515,8 +515,8 @@ static int vlan_ioctl_handler(struct net *net, void 
> __user *arg)
> return -EFAULT;
>
> /* Null terminate this sucker, just in case. */
> -   args.device1[23] = 0;
> -   args.u.device2[23] = 0;
> +   args.device1[sizeof(args.device1) - 1] = 0;
> +   args.u.device2[sizeof(args.u.device2) - 1] = 0;
>
> rtnl_lock();
>
> --
> 1.9.1
>
>

Sorry, I forget add the "net-next" in the title.
Now I have sent another new patch, please ignore this conversation.

Regards
Feng


Re: [PATCH nf-next v2 1/2] netfilter: Fix potential null pointer dereference

2016-09-28 Thread Feng Gao
Hi Liping,

On Wed, Sep 28, 2016 at 11:13 AM, Liping Zhang <zlpnob...@gmail.com> wrote:
> 2016-09-28 11:08 GMT+08:00 Liping Zhang <zlpnob...@gmail.com>:
>> Hi Feng,
>>
>> 2016-09-28 9:23 GMT+08:00 Feng Gao <gfree.w...@gmail.com>:
>>> Hi Aaraon,
>>>
>>> On Tue, Sep 27, 2016 at 9:38 PM, Aaron Conole <acon...@bytheb.org> wrote:
>>>> It's possible for nf_hook_entry_head to return NULL if two
>>>> nf_unregister_net_hook calls happen simultaneously with a single hook
>>>
>>> The critical region of nf_unregister_net_hook is protected by 
>>> _hook_mutex.
>>> When it would be called simultaneously?
>>
>> This is unrelated to race condition.
>>
>> Suppose that only the last nf_hook_entry exist, and two callers want to do
>> un-register work.
>>
>> The first one will remove it successfully, after the end of the work, the
>> second one will enter the critical section, but it will see the NULL pointer.
>> Because the last nf_hook_entry was already removed by the first one.
>>
>>>
>>> Regards
>>> Feng
>>>
>>>> entry in the list.  This fix ensures that no null pointer dereference
>>>> could occur when such a race happens.
>>>>
>>>> Signed-off-by: Aaron Conole <acon...@bytheb.org>
>
> I read the commit log again, I think the description here is a
> little confusing indeed.

Yes. I doesn't check if the list head always exists, just learn the
patch from commit log.
It confuses me indeed.

Regards
Feng


Re: [PATCH nf-next v2 1/2] netfilter: Fix potential null pointer dereference

2016-09-27 Thread Feng Gao
Hi Aaraon,

On Tue, Sep 27, 2016 at 9:38 PM, Aaron Conole  wrote:
> It's possible for nf_hook_entry_head to return NULL if two
> nf_unregister_net_hook calls happen simultaneously with a single hook

The critical region of nf_unregister_net_hook is protected by _hook_mutex.
When it would be called simultaneously?

Regards
Feng

> entry in the list.  This fix ensures that no null pointer dereference
> could occur when such a race happens.
>
> Signed-off-by: Aaron Conole 
> ---
>  net/netfilter/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/netfilter/core.c b/net/netfilter/core.c
> index 360c63d..e58e420 100644
> --- a/net/netfilter/core.c
> +++ b/net/netfilter/core.c
> @@ -160,7 +160,7 @@ void nf_unregister_net_hook(struct net *net, const struct 
> nf_hook_ops *reg)
>
> mutex_lock(_hook_mutex);
> hooks_entry = nf_hook_entry_head(net, reg);
> -   if (hooks_entry->orig_ops == reg) {
> +   if (hooks_entry && hooks_entry->orig_ops == reg) {
> nf_set_hooks_head(net, reg,
>   nf_entry_dereference(hooks_entry->next));
> goto unlock;
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH nf] netfilter: seqadj: Fix the wrong ack adjust for the RST packet without ack

2016-09-05 Thread Feng Gao
On Tue, Sep 6, 2016 at 10:06 AM,   wrote:
> From: Gao Feng 
>
> It is valid that the TCP RST packet which does not set ack flag, and bytes
> of ack number are zero. For these RST packets, seqadj could not adjust the
> ack number.
>
> Signed-off-by: Gao Feng 
> ---
>  v2: Regenerate because the first patch is removed
>  v1: Initial patch
>
>  net/netfilter/nf_conntrack_seqadj.c | 34 +++---
>  1 file changed, 19 insertions(+), 15 deletions(-)
>
> diff --git a/net/netfilter/nf_conntrack_seqadj.c 
> b/net/netfilter/nf_conntrack_seqadj.c
> index dff0f0c..3bd9c7e 100644
> --- a/net/netfilter/nf_conntrack_seqadj.c
> +++ b/net/netfilter/nf_conntrack_seqadj.c
> @@ -179,30 +179,34 @@ int nf_ct_seq_adjust(struct sk_buff *skb,
>
> tcph = (void *)skb->data + protoff;
> spin_lock_bh(>lock);
> +
> if (after(ntohl(tcph->seq), this_way->correction_pos))
> seqoff = this_way->offset_after;
> else
> seqoff = this_way->offset_before;
>
> -   if (after(ntohl(tcph->ack_seq) - other_way->offset_before,
> - other_way->correction_pos))
> -   ackoff = other_way->offset_after;
> -   else
> -   ackoff = other_way->offset_before;
> -
> newseq = htonl(ntohl(tcph->seq) + seqoff);
> -   newack = htonl(ntohl(tcph->ack_seq) - ackoff);
> -
> inet_proto_csum_replace4(>check, skb, tcph->seq, newseq, false);
> -   inet_proto_csum_replace4(>check, skb, tcph->ack_seq, newack,
> -false);
> -
> -   pr_debug("Adjusting sequence number from %u->%u, ack from %u->%u\n",
> -ntohl(tcph->seq), ntohl(newseq), ntohl(tcph->ack_seq),
> -ntohl(newack));
>
> +   pr_debug("Adjusting sequence number from %u->%u\n",
> +ntohl(tcph->seq), ntohl(newseq));
> tcph->seq = newseq;
> -   tcph->ack_seq = newack;
> +
> +   if (likely(tcph->ack)) {
> +   if (after(ntohl(tcph->ack_seq) - other_way->offset_before,
> + other_way->correction_pos))
> +   ackoff = other_way->offset_after;
> +   else
> +   ackoff = other_way->offset_before;
> +
> +   newack = htonl(ntohl(tcph->ack_seq) - ackoff);
> +   inet_proto_csum_replace4(>check, skb, tcph->ack_seq,
> +newack, false);
> +
> +   pr_debug("Adjusting ack number from %u->%u\n",
> +ntohl(tcph->ack_seq), ntohl(newack));
> +   tcph->ack_seq = newack;
> +   }
>
> res = nf_ct_sack_adjust(skb, protoff, tcph, ct, ctinfo);
> spin_unlock_bh(>lock);
> --
> 1.9.1
>
>

Sorry, I forget to add the v2 in the subject.

Best Regards
Feng


Re: [PATCH 2/2 nf] netfilter: seqadj: Fix the wrong ack adjust for the RST packet without ack

2016-09-05 Thread Feng Gao
Hi Pablo,

On Mon, Sep 5, 2016 at 11:02 PM,   wrote:
> From: Gao Feng 
>
> It is valid that the TCP RST packet which does not set ack flag, and bytes
> of ack number are zero. For these RST packets, seqadj could not adjust the
> ack number.
>
> Signed-off-by: Gao Feng 
> ---
>  net/netfilter/nf_conntrack_seqadj.c | 34 +++---
>  1 file changed, 19 insertions(+), 15 deletions(-)
>
> diff --git a/net/netfilter/nf_conntrack_seqadj.c 
> b/net/netfilter/nf_conntrack_seqadj.c
> index 7f8d814..65bb4a6 100644
> --- a/net/netfilter/nf_conntrack_seqadj.c
> +++ b/net/netfilter/nf_conntrack_seqadj.c
> @@ -182,30 +182,34 @@ int nf_ct_seq_adjust(struct sk_buff *skb,
>
> tcph = (void *)skb->data + protoff;
> spin_lock_bh(>lock);
> +
> if (after(ntohl(tcph->seq), this_way->correction_pos))
> seqoff = this_way->offset_after;
> else
> seqoff = this_way->offset_before;
>
> -   if (after(ntohl(tcph->ack_seq) - other_way->offset_before,
> - other_way->correction_pos))
> -   ackoff = other_way->offset_after;
> -   else
> -   ackoff = other_way->offset_before;
> -
> newseq = htonl(ntohl(tcph->seq) + seqoff);
> -   newack = htonl(ntohl(tcph->ack_seq) - ackoff);
> -
> inet_proto_csum_replace4(>check, skb, tcph->seq, newseq, false);
> -   inet_proto_csum_replace4(>check, skb, tcph->ack_seq, newack,
> -false);
> -
> -   pr_debug("Adjusting sequence number from %u->%u, ack from %u->%u\n",
> -ntohl(tcph->seq), ntohl(newseq), ntohl(tcph->ack_seq),
> -ntohl(newack));
>
> +   pr_debug("Adjusting sequence number from %u->%u\n",
> +ntohl(tcph->seq), ntohl(newseq));
> tcph->seq = newseq;
> -   tcph->ack_seq = newack;
> +
> +   if (likely(tcph->ack)) {
> +   if (after(ntohl(tcph->ack_seq) - other_way->offset_before,
> + other_way->correction_pos))
> +   ackoff = other_way->offset_after;
> +   else
> +   ackoff = other_way->offset_before;
> +
> +   newack = htonl(ntohl(tcph->ack_seq) - ackoff);
> +   inet_proto_csum_replace4(>check, skb, tcph->ack_seq,
> +newack, false);
> +
> +   pr_debug("Adjusting ack number from %u->%u\n",
> +ntohl(tcph->ack_seq), ntohl(newack));
> +   tcph->ack_seq = newack;
> +   }
>
> res = nf_ct_sack_adjust(skb, protoff, tcph, ct, ctinfo);
> spin_unlock_bh(>lock);
> --
> 1.9.1
>
>

This patch is generated base on the patch commit "netfilter: seqadj:
Fix one possible panic in seqadj when mem is exhausted" whose link is
http://patchwork.ozlabs.org/patch/665116/.

So its subject contains "2/2".

Best Regards
Feng



Best Regards
Feng


Re: [PATCH v2 nf] netfilter: log: Check param to avoid overflow in nf_log_set

2016-08-29 Thread Feng Gao
On Mon, Aug 29, 2016 at 6:25 PM,   wrote:
> From: Gao Feng 
>
> The nf_log_set is an interface function, so it should do the strict sanity
> check of parameters. Convert the return value of nf_log_set as int instead
> of void. When the pf is invalid, return -EOPNOTSUPP.
>
> Signed-off-by: Gao Feng 
> ---
>  v2: Use ARRAY_SIZE(net->nf.nf_loggers) instead of NFPROTO_NUMPROTO;
>  Return error code -EOPNOTSUPP when pf is invalid;
>  v1: Initial patch
>
>  include/net/netfilter/nf_log.h   | 2 +-
>  net/bridge/netfilter/nf_log_bridge.c | 3 +--
>  net/ipv4/netfilter/nf_log_arp.c  | 3 +--
>  net/ipv4/netfilter/nf_log_ipv4.c | 3 +--
>  net/ipv6/netfilter/nf_log_ipv6.c | 3 +--
>  net/netfilter/nf_log.c   | 8 +---
>  6 files changed, 10 insertions(+), 12 deletions(-)
>
> diff --git a/include/net/netfilter/nf_log.h b/include/net/netfilter/nf_log.h
> index 83d855b..f4eebd0 100644
> --- a/include/net/netfilter/nf_log.h
> +++ b/include/net/netfilter/nf_log.h
> @@ -60,7 +60,7 @@ struct nf_logger {
>  int nf_log_register(u_int8_t pf, struct nf_logger *logger);
>  void nf_log_unregister(struct nf_logger *logger);
>
> -void nf_log_set(struct net *net, u_int8_t pf,
> +int nf_log_set(struct net *net, u_int8_t pf,
> const struct nf_logger *logger);
>  void nf_log_unset(struct net *net, const struct nf_logger *logger);
>
> diff --git a/net/bridge/netfilter/nf_log_bridge.c 
> b/net/bridge/netfilter/nf_log_bridge.c
> index 5d9953a..1663df5 100644
> --- a/net/bridge/netfilter/nf_log_bridge.c
> +++ b/net/bridge/netfilter/nf_log_bridge.c
> @@ -50,8 +50,7 @@ static struct nf_logger nf_bridge_logger __read_mostly = {
>
>  static int __net_init nf_log_bridge_net_init(struct net *net)
>  {
> -   nf_log_set(net, NFPROTO_BRIDGE, _bridge_logger);
> -   return 0;
> +   return nf_log_set(net, NFPROTO_BRIDGE, _bridge_logger);
>  }
>
>  static void __net_exit nf_log_bridge_net_exit(struct net *net)
> diff --git a/net/ipv4/netfilter/nf_log_arp.c b/net/ipv4/netfilter/nf_log_arp.c
> index e7ad950..73599f2 100644
> --- a/net/ipv4/netfilter/nf_log_arp.c
> +++ b/net/ipv4/netfilter/nf_log_arp.c
> @@ -111,8 +111,7 @@ static struct nf_logger nf_arp_logger __read_mostly = {
>
>  static int __net_init nf_log_arp_net_init(struct net *net)
>  {
> -   nf_log_set(net, NFPROTO_ARP, _arp_logger);
> -   return 0;
> +   return nf_log_set(net, NFPROTO_ARP, _arp_logger);
>  }
>
>  static void __net_exit nf_log_arp_net_exit(struct net *net)
> diff --git a/net/ipv4/netfilter/nf_log_ipv4.c 
> b/net/ipv4/netfilter/nf_log_ipv4.c
> index 076aadd..20f2255 100644
> --- a/net/ipv4/netfilter/nf_log_ipv4.c
> +++ b/net/ipv4/netfilter/nf_log_ipv4.c
> @@ -347,8 +347,7 @@ static struct nf_logger nf_ip_logger __read_mostly = {
>
>  static int __net_init nf_log_ipv4_net_init(struct net *net)
>  {
> -   nf_log_set(net, NFPROTO_IPV4, _ip_logger);
> -   return 0;
> +   return nf_log_set(net, NFPROTO_IPV4, _ip_logger);
>  }
>
>  static void __net_exit nf_log_ipv4_net_exit(struct net *net)
> diff --git a/net/ipv6/netfilter/nf_log_ipv6.c 
> b/net/ipv6/netfilter/nf_log_ipv6.c
> index 8dd8696..c1bcf69 100644
> --- a/net/ipv6/netfilter/nf_log_ipv6.c
> +++ b/net/ipv6/netfilter/nf_log_ipv6.c
> @@ -379,8 +379,7 @@ static struct nf_logger nf_ip6_logger __read_mostly = {
>
>  static int __net_init nf_log_ipv6_net_init(struct net *net)
>  {
> -   nf_log_set(net, NFPROTO_IPV6, _ip6_logger);
> -   return 0;
> +   return nf_log_set(net, NFPROTO_IPV6, _ip6_logger);
>  }
>
>  static void __net_exit nf_log_ipv6_net_exit(struct net *net)
> diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c
> index aa5847a..30a17d6 100644
> --- a/net/netfilter/nf_log.c
> +++ b/net/netfilter/nf_log.c
> @@ -39,12 +39,12 @@ static struct nf_logger *__find_logger(int pf, const char 
> *str_logger)
> return NULL;
>  }
>
> -void nf_log_set(struct net *net, u_int8_t pf, const struct nf_logger *logger)
> +int nf_log_set(struct net *net, u_int8_t pf, const struct nf_logger *logger)
>  {
> const struct nf_logger *log;
>
> -   if (pf == NFPROTO_UNSPEC)
> -   return;
> +   if (pf == NFPROTO_UNSPEC || pf >= ARRAY_SIZE(net->nf.nf_loggers))
> +   return -EOPNOTSUPP;
>
> mutex_lock(_log_mutex);
> log = nft_log_dereference(net->nf.nf_loggers[pf]);
> @@ -52,6 +52,8 @@ void nf_log_set(struct net *net, u_int8_t pf, const struct 
> nf_logger *logger)
> rcu_assign_pointer(net->nf.nf_loggers[pf], logger);
>
> mutex_unlock(_log_mutex);
> +
> +   return 0;
>  }
>  EXPORT_SYMBOL(nf_log_set);
>
> --
> 1.9.1
>

Sorry, this patch does not fix any bug.
The subject should be "nf-next".

Regards
Feng


Re: [PATCH nf-next] netfilter: log: Check param to avoid overflow in nf_log_set

2016-08-28 Thread Feng Gao
On Sun, Aug 28, 2016 at 10:30 PM,   wrote:
> From: Gao Feng 
>
> The nf_log_set is an interface function, so it should do the strict sanity
> check of parameters. Add  one sanity check for pf, it could not exceed
> NFPROTO_NUMPROTO, and print error log when pf is invalid.
>
> Signed-off-by: Gao Feng 
> ---
>  net/netfilter/nf_log.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c
> index aa5847a..02ce0b9 100644
> --- a/net/netfilter/nf_log.c
> +++ b/net/netfilter/nf_log.c
> @@ -43,8 +43,10 @@ void nf_log_set(struct net *net, u_int8_t pf, const struct 
> nf_logger *logger)
>  {
> const struct nf_logger *log;
>
> -   if (pf == NFPROTO_UNSPEC)
> +   if (pf == NFPROTO_UNSPEC || pf >= NFPROTO_NUMPROTO) {
> +   pr_err("Wrong pf(%d) for nf log", pf);
> return;
> +   }
>
> mutex_lock(_log_mutex);
> log = nft_log_dereference(net->nf.nf_loggers[pf]);
> --
> 1.9.1
>
>

BTW, another similar interface function "nf_log_register" checks
sanity of param "pf".
So I think nf_log_set also need to check if param "pf" exceeds the valid range.


Re: [PATCH net 0/2] ppp: fix deadlock upon recursive xmit

2016-08-28 Thread Feng Gao
On Sun, Aug 28, 2016 at 4:20 AM, Guillaume Nault  wrote:
> This series fixes the issue reported by Feng where packets looping
> through a ppp device makes the module deadlock:
> https://marc.info/?l=linux-netdev=147134567319038=2
>
> The problem can occur on virtual interfaces (e.g. PPP over L2TP, or
> PPPoE on vxlan devices), when a PPP packet is routed back to the PPP
> interface.
>
> PPP's xmit path isn't reentrant, so patch #1 uses a per-cpu variable
> to detect and break recursion. Patch #2 sets the NETIF_F_LLTX flag to
> avoid lock inversion issues between ppp and txqueue locks.
>
> There are multiple entry points to the PPP xmit path. This series has
> been tested with lockdep and should address recursion issues no matter
> how the packet entered the path.
>
>
> A similar issue in L2TP is not covered by this series:
> l2tp_xmit_skb() also isn't reentrant, and it can be called as part of
> PPP's xmit path (pppol2tp_xmit()), or directly from the L2TP socket
> (l2tp_ppp_sendmsg()). If a packet is sent by l2tp_ppp_sendmsg() and
> routed to the parent PPP interface, then it's going to hit
> l2tp_xmit_skb() again.
>
> Breaking recursion as done in ppp_generic is not enough, because we'd
> still have a lock inversion issue (locking in l2tp_xmit_skb() can
> happen before or after locking in ppp_generic). The best approach would
> be to use the ip_tunnel functions and remove the socket locking in
> l2tp_xmit_skb(). But that'd be something for net-next.
>
>
> BTW, I hope the commit messages aren't too long. Just let me know if I
> should trim something.
>
>
> Guillaume Nault (2):
>   ppp: avoid dealock on recursive xmit
>   ppp: declare PPP devices as LLTX
>
>  drivers/net/ppp/ppp_generic.c | 54 
> +--
>  1 file changed, 42 insertions(+), 12 deletions(-)
>
> --
> 2.9.3
>

I am learning your codes. It is better than my solution :))

Best Regards
Feng


Re: [PATCH v1 1/1 net-next] 8139cp: Fix one possible deadloop in cp_rx_poll

2016-08-25 Thread Feng Gao
On Thu, Aug 25, 2016 at 9:45 AM,   wrote:
> From: Gao Feng 
>
> When cp_rx_poll does not get enough packet, it will check the rx
> interrupt status again. If so, it will jumpt to rx_status_loop again.
> But the goto jump resets the rx variable as zero too.
>
> As a result, it causes one possible deadloop. Assume this case,
> rx_status_loop only gets the packet count which is less than budget,
> and (cpr16(IntrStatus) & cp_rx_intr_mask) condition is always true.
> It causes the deadloop happens and system is blocked.
>
> Signed-off-by: Gao Feng 
> ---
>  drivers/net/ethernet/realtek/8139cp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/realtek/8139cp.c 
> b/drivers/net/ethernet/realtek/8139cp.c
> index deae10d..5297bf7 100644
> --- a/drivers/net/ethernet/realtek/8139cp.c
> +++ b/drivers/net/ethernet/realtek/8139cp.c
> @@ -467,8 +467,8 @@ static int cp_rx_poll(struct napi_struct *napi, int 
> budget)
> unsigned int rx_tail = cp->rx_tail;
> int rx;
>
> -rx_status_loop:
> rx = 0;
> +rx_status_loop:
> cpw16(IntrStatus, cp_rx_intr_mask);
>
> while (rx < budget) {
> --
> 1.9.1
>
>

Sorry, this commit should be for net.git, not net-next.git
Because it fixed one possible infinite loop.

Best Regards
Feng


Re: [PATCH 1/1] netfilter: gre: Use the consitent GRE and PPTP struct instead of the structures defined in netfilter

2016-08-25 Thread Feng Gao
Hi Pablo,

inline

On Thu, Aug 25, 2016 at 8:16 PM, Pablo Neira Ayuso <pa...@netfilter.org> wrote:
> On Fri, Aug 19, 2016 at 11:03:46PM +0800, Feng Gao wrote:
>> My email server reports the last same patch email failed to send.
>> So I just sent it again.
>>
>> I am sorry, if anyone receives duplicated ones.
>
> git am 
> v2-1-2-net-next-netfilter-gre-Use-consistent-GRE_-macros-instead-of-ones-defined-by-netfilter..patch
> -s
> Applying: netfilter: gre: Use consistent GRE_* macros instead of ones
> defined by netfilter.
> error: patch failed: include/uapi/linux/if_tunnel.h:36
> error: include/uapi/linux/if_tunnel.h: patch does not apply
>
> It seems your base was missing this patch:
>
> commit ab10dccb11608b96b43b557c12a5ad867723e503
> Author: Gao Feng <f...@ikuai8.com>
> Date:   Tue Aug 9 12:38:24 2016 +0800
>
> rps: Inspect PPTP encapsulated by GRE to get flow hash
>
> Since I cannot see GRE_FLAGS in your patch as context.

Oh, it is possible that the patches are generated based on my local
branch which has this commit locally, not the latest net-next.
Now, I am more familiar with the rules than before.

>
> Please rebase and resubmit, thanks!

Ok, I will resubmit. Sorry for this fault.

Regards
Feng


Re: [PATCH v2 1/2 net-next] netfilter: gre: Use consistent GRE_* macros instead of ones defined by netfilter.

2016-08-22 Thread Feng Gao
Sorry to bother you if receive duplicate email whose subject is "
[patch v2 1/2 net-next] netfilter: gre: Use consistent GRE_* macros
instead of
 ones defined by netfilter.".

Because my mail server told it failed to send at the first time.

Regards
Feng

On Tue, Aug 23, 2016 at 10:20 AM,   wrote:
> From: Gao Feng 
>
> There are already some GRE_* macros in kernel, so it is unnecessary
> to define these macros. And remove some useless macros
>
> Signed-off-by: Gao Feng 
> ---
>  v2: Split the original patch to review easily
>  v1: Intial patch
>
>  include/linux/netfilter/nf_conntrack_proto_gre.h | 22 ++
>  include/uapi/linux/if_tunnel.h   |  1 +
>  net/ipv4/netfilter/nf_nat_proto_gre.c|  4 ++--
>  net/netfilter/nf_conntrack_proto_gre.c   |  4 ++--
>  4 files changed, 7 insertions(+), 24 deletions(-)
>
> diff --git a/include/linux/netfilter/nf_conntrack_proto_gre.h 
> b/include/linux/netfilter/nf_conntrack_proto_gre.h
> index df78dc2..0189747 100644
> --- a/include/linux/netfilter/nf_conntrack_proto_gre.h
> +++ b/include/linux/netfilter/nf_conntrack_proto_gre.h
> @@ -1,29 +1,11 @@
>  #ifndef _CONNTRACK_PROTO_GRE_H
>  #define _CONNTRACK_PROTO_GRE_H
>  #include 
> +#include 
> +#include 
>
>  /* GRE PROTOCOL HEADER */
>
> -/* GRE Version field */
> -#define GRE_VERSION_1701   0x0
> -#define GRE_VERSION_PPTP   0x1
> -
> -/* GRE Protocol field */
> -#define GRE_PROTOCOL_PPTP  0x880B
> -
> -/* GRE Flags */
> -#define GRE_FLAG_C 0x80
> -#define GRE_FLAG_R 0x40
> -#define GRE_FLAG_K 0x20
> -#define GRE_FLAG_S 0x10
> -#define GRE_FLAG_A 0x80
> -
> -#define GRE_IS_C(f)((f)_FLAG_C)
> -#define GRE_IS_R(f)((f)_FLAG_R)
> -#define GRE_IS_K(f)((f)_FLAG_K)
> -#define GRE_IS_S(f)((f)_FLAG_S)
> -#define GRE_IS_A(f)((f)_FLAG_A)
> -
>  /* GRE is a mess: Four different standards */
>  struct gre_hdr {
>  #if defined(__LITTLE_ENDIAN_BITFIELD)
> diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
> index 361b9f0..1b27e2c 100644
> --- a/include/uapi/linux/if_tunnel.h
> +++ b/include/uapi/linux/if_tunnel.h
> @@ -36,6 +36,7 @@
>  #define GRE_IS_REC(f)  ((f) & GRE_REC)
>  #define GRE_IS_ACK(f)  ((f) & GRE_ACK)
>
> +#define GRE_VERSION_0  __cpu_to_be16(0x)
>  #define GRE_VERSION_1  __cpu_to_be16(0x0001)
>  #define GRE_PROTO_PPP  __cpu_to_be16(0x880b)
>  #define GRE_PPTP_KEY_MASK  __cpu_to_be32(0x)
> diff --git a/net/ipv4/netfilter/nf_nat_proto_gre.c 
> b/net/ipv4/netfilter/nf_nat_proto_gre.c
> index 9414923..93198d7 100644
> --- a/net/ipv4/netfilter/nf_nat_proto_gre.c
> +++ b/net/ipv4/netfilter/nf_nat_proto_gre.c
> @@ -104,11 +104,11 @@ gre_manip_pkt(struct sk_buff *skb,
> if (maniptype != NF_NAT_MANIP_DST)
> return true;
> switch (greh->version) {
> -   case GRE_VERSION_1701:
> +   case ntohs(GRE_VERSION_0):
> /* We do not currently NAT any GREv0 packets.
>  * Try to behave like "nf_nat_proto_unknown" */
> break;
> -   case GRE_VERSION_PPTP:
> +   case ntohs(GRE_VERSION_1):
> pr_debug("call_id -> 0x%04x\n", ntohs(tuple->dst.u.gre.key));
> pgreh->call_id = tuple->dst.u.gre.key;
> break;
> diff --git a/net/netfilter/nf_conntrack_proto_gre.c 
> b/net/netfilter/nf_conntrack_proto_gre.c
> index a96451a..deb239a 100644
> --- a/net/netfilter/nf_conntrack_proto_gre.c
> +++ b/net/netfilter/nf_conntrack_proto_gre.c
> @@ -200,7 +200,7 @@ static bool gre_pkt_to_tuple(const struct sk_buff *skb, 
> unsigned int dataoff,
>
> /* first only delinearize old RFC1701 GRE header */
> grehdr = skb_header_pointer(skb, dataoff, sizeof(_grehdr), &_grehdr);
> -   if (!grehdr || grehdr->version != GRE_VERSION_PPTP) {
> +   if (!grehdr || grehdr->version != ntohs(GRE_VERSION_1)) {
> /* try to behave like "nf_conntrack_proto_generic" */
> tuple->src.u.all = 0;
> tuple->dst.u.all = 0;
> @@ -212,7 +212,7 @@ static bool gre_pkt_to_tuple(const struct sk_buff *skb, 
> unsigned int dataoff,
> if (!pgrehdr)
> return true;
>
> -   if (ntohs(grehdr->protocol) != GRE_PROTOCOL_PPTP) {
> +   if (grehdr->protocol != GRE_PROTO_PPP) {
> pr_debug("GRE_VERSION_PPTP but unknown proto\n");
> return false;
> }
> --
> 1.9.1
>


Re: [PATCH v4 net-next] l2tp: Refactor the codes with existing macros instead of literal number

2016-08-22 Thread Feng Gao
Sorry, I forget to run the check_patch.pl script this time.
Now I send the v5 update.

Regards
Feng

On Mon, Aug 22, 2016 at 10:43 PM, Guillaume Nault  wrote:
> On Mon, Aug 22, 2016 at 06:59:51PM +0800, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> Use PPP_ALLSTATIONS, PPP_UI, and SEND_SHUTDOWN instead of 0xff,
>> 0x03, and 2 separately.
>>
> Apart from the checkpatch errors,
>
> Acked-by: Guillaume Nault 
>
>> @@ -325,8 +324,8 @@ static int pppol2tp_sendmsg(struct socket *sock, struct 
>> msghdr *m,
>>   skb_reserve(skb, uhlen);
>>
>>   /* Add PPP header */
>> - skb->data[0] = ppph[0];
>> - skb->data[1] = ppph[1];
>> + skb->data[0] = PPP_ALLSTATIONS;
>>
> Trailing whitespace.
>
>> @@ -398,14 +396,14 @@ static int pppol2tp_xmit(struct ppp_channel *chan, 
>> struct sk_buff *skb)
>>  sizeof(struct iphdr) + /* IP header */
>>  uhlen +  /* UDP header (if L2TP_ENCAPTYPE_UDP) 
>> */
>>  session->hdr_len +   /* L2TP header */
>> -sizeof(ppph);/* PPP header */
>> +2;   /* 2 bytes for PPP_ALLSTATIONS & 
>> PPP_UI */
>>   if (skb_cow_head(skb, headroom))
>>   goto abort_put_sess_tun;
>>
>>   /* Setup PPP header */
>> - __skb_push(skb, sizeof(ppph));
>> - skb->data[0] = ppph[0];
>> - skb->data[1] = ppph[1];
>> + __skb_push(skb, 2);
>>
> Here too.


Re: [PATCH v4 net-next] ppp: Fix one deadlock issue of PPP when reentrant

2016-08-22 Thread Feng Gao
It seems a better solution, simple and apparent.
I accept any best solution which could make kernel works well :))

Best Regards
Feng

On Mon, Aug 22, 2016 at 8:35 PM, Guillaume Nault  wrote:
> On Mon, Aug 22, 2016 at 09:20:14AM +0800, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> PPP channel holds one spinlock before send frame. But the skb may
>> select the same PPP channel with wrong route policy. As a result,
>> the skb reaches the same channel path. It tries to get the same
>> spinlock which is held before. Bang, the deadlock comes out.
>>
> Thanks for following up on this case.
> On my side, I've thought a bit more about it in the weekend and cooked
> this patch.
> It's experimental and requires cleanup and further testing, but it
> should fix all issues I could think of (at least for PPP over L2TP).
>
> The main idea is to use a per-cpu variable to detect recursion in
> selected points of PPP and L2TP xmit path.
>
> ---
>  drivers/net/ppp/ppp_generic.c | 49 
> ---
>  net/l2tp/l2tp_core.c  | 28 +
>  2 files changed, 61 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
> index f226db4..c33036bf 100644
> --- a/drivers/net/ppp/ppp_generic.c
> +++ b/drivers/net/ppp/ppp_generic.c
> @@ -1354,6 +1354,8 @@ static void ppp_setup(struct net_device *dev)
> dev->netdev_ops = _netdev_ops;
> SET_NETDEV_DEVTYPE(dev, _type);
>
> +   dev->features |= NETIF_F_LLTX;
> +
> dev->hard_header_len = PPP_HDRLEN;
> dev->mtu = PPP_MRU;
> dev->addr_len = 0;
> @@ -1367,12 +1369,7 @@ static void ppp_setup(struct net_device *dev)
>   * Transmit-side routines.
>   */
>
> -/*
> - * Called to do any work queued up on the transmit side
> - * that can now be done.
> - */
> -static void
> -ppp_xmit_process(struct ppp *ppp)
> +static void __ppp_xmit_process(struct ppp *ppp)
>  {
> struct sk_buff *skb;
>
> @@ -1392,6 +1389,23 @@ ppp_xmit_process(struct ppp *ppp)
> ppp_xmit_unlock(ppp);
>  }
>
> +static DEFINE_PER_CPU(int, ppp_xmit_recursion);
> +
> +/* Called to do any work queued up on the transmit side that can now be done 
> */
> +static void ppp_xmit_process(struct ppp *ppp)
> +{
> +   if (unlikely(__this_cpu_read(ppp_xmit_recursion))) {
> +   WARN(1, "recursion detected\n");
> +   return;
> +   }
> +
> +   __this_cpu_inc(ppp_xmit_recursion);
> +   local_bh_disable();
> +   __ppp_xmit_process(ppp);
> +   local_bh_enable();
> +   __this_cpu_dec(ppp_xmit_recursion);
> +}
> +
>  static inline struct sk_buff *
>  pad_compress_skb(struct ppp *ppp, struct sk_buff *skb)
>  {
> @@ -1847,11 +1861,7 @@ static int ppp_mp_explode(struct ppp *ppp, struct 
> sk_buff *skb)
>  }
>  #endif /* CONFIG_PPP_MULTILINK */
>
> -/*
> - * Try to send data out on a channel.
> - */
> -static void
> -ppp_channel_push(struct channel *pch)
> +static void __ppp_channel_push(struct channel *pch)
>  {
> struct sk_buff *skb;
> struct ppp *ppp;
> @@ -1876,11 +1886,26 @@ ppp_channel_push(struct channel *pch)
> read_lock_bh(>upl);
> ppp = pch->ppp;
> if (ppp)
> -   ppp_xmit_process(ppp);
> +   __ppp_xmit_process(ppp);
> read_unlock_bh(>upl);
> }
>  }
>
> +/* Try to send data out on a channel */
> +static void ppp_channel_push(struct channel *pch)
> +{
> +   if (unlikely(__this_cpu_read(ppp_xmit_recursion))) {
> +   WARN(1, "recursion detected\n");
> +   return;
> +   }
> +
> +   __this_cpu_inc(ppp_xmit_recursion);
> +   local_bh_disable();
> +   __ppp_channel_push(pch);
> +   local_bh_enable();
> +   __this_cpu_dec(ppp_xmit_recursion);
> +}
> +
>  /*
>   * Receive-side routines.
>   */
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 1e40dac..bdfb1be 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1096,10 +1096,8 @@ static int l2tp_xmit_core(struct l2tp_session 
> *session, struct sk_buff *skb,
> return 0;
>  }
>
> -/* If caller requires the skb to have a ppp header, the header must be
> - * inserted in the skb data before calling this function.
> - */
> -int l2tp_xmit_skb(struct l2tp_session *session, struct sk_buff *skb, int 
> hdr_len)
> +static int __l2tp_xmit_skb(struct l2tp_session *session, struct sk_buff *skb,
> +  int hdr_len)
>  {
> int data_len = skb->len;
> struct l2tp_tunnel *tunnel = session->tunnel;
> @@ -1178,6 +1176,28 @@ out_unlock:
>
> return ret;
>  }
> +
> +static DEFINE_PER_CPU(int, l2tp_xmit_recursion);
> +
> +/* If caller requires the skb to have a ppp header, the header must be
> + * inserted in the skb data before calling this function.
> + */
> +int l2tp_xmit_skb(struct l2tp_session *session, 

Re: [PATCH v3 net-next] l2tp: Refactor the codes with existing macros instead of literal number

2016-08-22 Thread Feng Gao
inline

On Mon, Aug 22, 2016 at 5:48 PM, Guillaume Nault  wrote:
> On Sun, Aug 21, 2016 at 04:36:52PM -0600, Philp Prindeville wrote:
>> Inline
>>
>>
>> On 08/20/2016 09:52 AM, f...@48lvckh6395k16k5.yundunddos.com wrote:
>> > From: Gao Feng 
>> >
>> > Use PPP_ALLSTATIONS, PPP_UI, and SEND_SHUTDOWN instead of 0xff,
>> > 0x03, and 2 separately.
>> >
>> > Signed-off-by: Gao Feng 
>> > ---
>> >   v3: Modify the subject;
>> >   v2: Only replace the literal number with macros according to Guillaume's 
>> > advice
>> >   v1: Inital patch
>> >
>> >   net/l2tp/l2tp_ppp.c | 8 
>> >   1 file changed, 4 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
>> > index d9560aa..65e2fd6 100644
>> > --- a/net/l2tp/l2tp_ppp.c
>> > +++ b/net/l2tp/l2tp_ppp.c
>> > @@ -177,7 +177,7 @@ static int pppol2tp_recv_payload_hook(struct sk_buff 
>> > *skb)
>> > if (!pskb_may_pull(skb, 2))
>> > return 1;
>> > -   if ((skb->data[0] == 0xff) && (skb->data[1] == 0x03))
>> > +   if ((skb->data[0] == PPP_ALLSTATIONS) && (skb->data[1] == PPP_UI))
>>
>> This should have used PPP_ADDRESS() and PPP_CONTROL() here.
>>
> Then please justify how would that make the code more readable.
> We're not trying to interpret a known valid PPP header here.
>
>> > skb_pull(skb, 2);
>>
>> This magic number should go away.
>>
> Again, this is *not* a magic number. We've explicitely accessed the
> first _two_ header bytes and want to skip them.
> pskb_may_pull(2), ->data[0], ->data[1] and skb_pull(2) all go together.
>
> There's even a nice comment telling you what is done and why:
> /* Skip PPP header, if present.  In testing, Microsoft L2TP clients
>  * don't send the PPP header (PPP header compression enabled), but
>  * other clients can include the header. So we cope with both cases
>  * here. The PPP header is always FF03 when using L2TP.
>  *
>  * Note that skb->data[] isn't dereferenced from a u16 ptr here since
>  * the field may be unaligned.
>  */
> Apart from the unprecise "PPP header" term, which should be read as
> "address and control fields", things should be quite clear.

If remove the static ppph, may be more clear. Because it will cause
person think about the ppp header.

Regards
Feng

>
>> > @@ -282,7 +282,7 @@ static void pppol2tp_session_sock_put(struct 
>> > l2tp_session *session)
>> >   static int pppol2tp_sendmsg(struct socket *sock, struct msghdr *m,
>> > size_t total_len)
>> >   {
>> > -   static const unsigned char ppph[2] = { 0xff, 0x03 };
>> > +   static const unsigned char ppph[2] = {PPP_ALLSTATIONS, PPP_UI};
>>
>> PPP has a 4-byte header.  Where's the protocol value?
>>
> No, PPP header (whatever you include in it) is of variable length. And
> the protocol has already been set by the PPP layer anyway.
> We're in L2TP here.


Re: [PATCH v3 net-next] l2tp: Refactor the codes with existing macros instead of literal number

2016-08-22 Thread Feng Gao
inline

On Mon, Aug 22, 2016 at 6:07 PM, Guillaume Nault  wrote:
> On Sat, Aug 20, 2016 at 11:52:27PM +0800, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> Use PPP_ALLSTATIONS, PPP_UI, and SEND_SHUTDOWN instead of 0xff,
>> 0x03, and 2 separately.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>  v3: Modify the subject;
>>  v2: Only replace the literal number with macros according to Guillaume's 
>> advice
>>  v1: Inital patch
>>
>>  net/l2tp/l2tp_ppp.c | 8 
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
>> index d9560aa..65e2fd6 100644
>> --- a/net/l2tp/l2tp_ppp.c
>> +++ b/net/l2tp/l2tp_ppp.c
>> @@ -177,7 +177,7 @@ static int pppol2tp_recv_payload_hook(struct sk_buff 
>> *skb)
>>   if (!pskb_may_pull(skb, 2))
>>   return 1;
>>
>> - if ((skb->data[0] == 0xff) && (skb->data[1] == 0x03))
>> + if ((skb->data[0] == PPP_ALLSTATIONS) && (skb->data[1] == PPP_UI))
>>   skb_pull(skb, 2);
>>
>>   return 0;
>> @@ -282,7 +282,7 @@ static void pppol2tp_session_sock_put(struct 
>> l2tp_session *session)
>>  static int pppol2tp_sendmsg(struct socket *sock, struct msghdr *m,
>>   size_t total_len)
>>  {
>> - static const unsigned char ppph[2] = { 0xff, 0x03 };
>> + static const unsigned char ppph[2] = {PPP_ALLSTATIONS, PPP_UI};
>>
> Minor nit: I'd prefer to keep the space after '{' and before '}'.
> I didn't want to bother you with this, but since it seems you'll have
> to repost...

I don't know if it is the coding style of Linux kernel.

>
>>   struct sock *sk = sock->sk;
>>   struct sk_buff *skb;
>>   int error;
>> @@ -369,7 +369,7 @@ error:
>>   */
>>  static int pppol2tp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
>>  {
>> - static const u8 ppph[2] = { 0xff, 0x03 };
>> + static const u8 ppph[2] = {PPP_ALLSTATIONS, PPP_UI};
>>
> Same here.
>
> BTW, I thought you also wanted to remove the static ppph variable
> from pppol2tp_xmit() / pppol2tp_sendmsg(), to directly assign
> skb->data[0/1] with PPP_ALLSTATIONS/PPP_UI.

If removed static ppph, there will be some codes which use literal "2"
instead of sizeof ppph.
Is it ok?

Regards
Feng


Re: [PATCH v3 net-next] l2tp: Refactor the codes with existing macros instead of literal number

2016-08-21 Thread Feng Gao
inline

On Mon, Aug 22, 2016 at 6:36 AM, Philp Prindeville
 wrote:
> Inline
>
>
> On 08/20/2016 09:52 AM, f...@48lvckh6395k16k5.yundunddos.com wrote:
>>
>> From: Gao Feng 
>>
>> Use PPP_ALLSTATIONS, PPP_UI, and SEND_SHUTDOWN instead of 0xff,
>> 0x03, and 2 separately.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>   v3: Modify the subject;
>>   v2: Only replace the literal number with macros according to Guillaume's
>> advice
>>   v1: Inital patch
>>
>>   net/l2tp/l2tp_ppp.c | 8 
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
>> index d9560aa..65e2fd6 100644
>> --- a/net/l2tp/l2tp_ppp.c
>> +++ b/net/l2tp/l2tp_ppp.c
>> @@ -177,7 +177,7 @@ static int pppol2tp_recv_payload_hook(struct sk_buff
>> *skb)
>> if (!pskb_may_pull(skb, 2))
>> return 1;
>>   - if ((skb->data[0] == 0xff) && (skb->data[1] == 0x03))
>> +   if ((skb->data[0] == PPP_ALLSTATIONS) && (skb->data[1] == PPP_UI))
>
>
> This should have used PPP_ADDRESS() and PPP_CONTROL() here.

In my initial patch, I replace them with PPP_ADDRESS() and PPP_CONTROL.
But Guillaume thought it was not clear as before.
So I revert it.

>
>> skb_pull(skb, 2);
>
>
> This magic number should go away.

Same as above.

>
>> return 0;
>> @@ -282,7 +282,7 @@ static void pppol2tp_session_sock_put(struct
>> l2tp_session *session)
>>   static int pppol2tp_sendmsg(struct socket *sock, struct msghdr *m,
>> size_t total_len)
>>   {
>> -   static const unsigned char ppph[2] = { 0xff, 0x03 };
>> +   static const unsigned char ppph[2] = {PPP_ALLSTATIONS, PPP_UI};
>
>
> PPP has a 4-byte header.  Where's the protocol value?

In the original code, I fail to find the code which is used to fill
the protocol value.
So I keep the two bytes header. And I thought the protocol value may be filled
by the upper layer.

>
>> struct sock *sk = sock->sk;
>> struct sk_buff *skb;
>> int error;
>> @@ -369,7 +369,7 @@ error:
>>*/
>>   static int pppol2tp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
>>   {
>> -   static const u8 ppph[2] = { 0xff, 0x03 };
>> +   static const u8 ppph[2] = {PPP_ALLSTATIONS, PPP_UI};
>
>
> Same as above.
>
>
>
>> struct sock *sk = (struct sock *) chan->private;
>> struct sock *sk_tun;
>> struct l2tp_session *session;
>> @@ -440,7 +440,7 @@ static void pppol2tp_session_close(struct l2tp_session
>> *session)
>> BUG_ON(session->magic != L2TP_SESSION_MAGIC);
>> if (sock) {
>> -   inet_shutdown(sock, 2);
>> +   inet_shutdown(sock, SEND_SHUTDOWN);
>> /* Don't let the session go away before our socket does */
>> l2tp_session_inc_refcount(session);
>> }
>
>


Re: [PATCH 1/1] ppp: Fix one deadlock issue of PPP when reentrant

2016-08-20 Thread Feng Gao
On Sat, Aug 20, 2016 at 5:48 AM, Guillaume Nault  wrote:
> On Fri, Aug 19, 2016 at 11:16:41PM +0800, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> PPP channel holds one spinlock before send frame. But the skb may
>> select the same PPP channel with wrong route policy. As a result,
>> the skb reaches the same channel path. It tries to get the same
>> spinlock which is held before. Bang, the deadlock comes out.
>>
>> Now add one lock owner to avoid it like xmit_lock_owner of
>> netdev_queue. Check the lock owner before try to get the spinlock.
>> If the current cpu is already the owner, it means ppp finds there is
>> one reentrant and returns directly. If not owner and hold the spinlock
>> successfully, it sets owner with current CPU ID.
>>
>> The following is the panic stack of 3.3.8. But the same issue
>> should be in the upstream too.
>>
>> [] ? _raw_spin_lock_bh+0x11/0x40
>> [] ppp_unregister_channel+0x1347/0x2170 [ppp_generic]
>> [] ? kmem_cache_free+0xa7/0xc0
>> [] ppp_unregister_channel+0x1db7/0x2170 [ppp_generic]
>> [] ppp_unregister_channel+0x2065/0x2170 [ppp_generic]
>> [] dev_hard_start_xmit+0x4cd/0x620
>> [] sch_direct_xmit+0x74/0x1d0
>> [] dev_queue_xmit+0x1d/0x30
>> [] neigh_direct_output+0xc/0x10
>> [] ip_finish_output+0x25e/0x2b0
>> [] ip_output+0x88/0x90
>> [] ? __ip_local_out+0x9f/0xb0
>> [] ip_local_out+0x24/0x30
>> [] 0xa00b9744
>> [] ppp_unregister_channel+0x20f8/0x2170 [ppp_generic]
>> [] ppp_output_wakeup+0x122/0x11d0 [ppp_generic]
>> [] vfs_write+0xb8/0x160
>> [] sys_write+0x45/0x90
>> [] system_call_fastpath+0x16/0x1b
>>
>> The call flow is like this.
>> ppp_write->ppp_channel_push->start_xmit->select inappropriate route
>>  -> dev_hard_start_xmit->ppp_start_xmit->ppp_xmit_process->
>> ppp_push. Now ppp_push tries to get the same spinlock which is held
>> in ppp_channel_push.
>>
>> Although the PPP deadlock is caused by inappropriate route policy
>> with L2TP, I think it is not accepted the PPP module would cause kernel
>> deadlock with wrong route policy.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>  v3: Change the fix solution. Giveup the send chance instead of recursive 
>> lock
>>  v2: Fix recursive unlock issue
>>  v1: Initial patch
>>
>>  drivers/net/ppp/ppp_generic.c | 95 
>> +--
>>  1 file changed, 73 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
>> index 70cfa06..b653f1f 100644
>> --- a/drivers/net/ppp/ppp_generic.c
>> +++ b/drivers/net/ppp/ppp_generic.c
>> @@ -162,6 +162,46 @@ struct ppp {
>>|SC_MULTILINK|SC_MP_SHORTSEQ|SC_MP_XSHORTSEQ \
>>|SC_COMP_TCP|SC_REJ_COMP_TCP|SC_MUST_COMP)
>>
>> +struct channel_lock {
>> + spinlock_t lock;
>> + int owner;
>> +};
>> +
>> +static inline void ppp_channel_lock_init(struct channel_lock *cl)
> No need for inline in .c files.

OK. I make them as non-inline.

>
>> +{
>> + cl->owner = -1;
>> + spin_lock_init(>lock);
>> +}
>> +
>> +static inline bool ppp_channel_lock_bh(struct channel_lock *cl)
>> +{
>> + int cpu;
>> +
>> + local_bh_disable();
>> + cpu = smp_processor_id();
>> + if (cpu == cl->owner) {
>> + /* The CPU already holds this channel lock and sends. But the
>> +  * channel is selected after inappropriate route. It causes
>> +  * reenter the channel again. It is forbidden by PPP module.
>> +  */
>> + if (net_ratelimit())
>> + pr_err("PPP detects one recursive channel send\n");
>> + local_bh_enable();
> What about calling local_bh_enable() before logging the error?

Ok.

>
>> + return false;
>> + }
>> + spin_lock(>lock);
>> + cl->owner = cpu;
>> +
>> + return true;
>> +}
>> +
>> +static inline void ppp_channel_unlock_bh(struct channel_lock *cl)
>> +{
>> + cl->owner = -1;
>> + spin_unlock(>lock);
>> + local_bh_enable();
>> +}
>> +
>>  /*
>>   * Private data structure for each channel.
>>   * This includes the data structure used for multilink.
>> @@ -171,7 +211,7 @@ struct channel {
>>   struct list_head list;  /* link in all/new_channels list */
>>   struct ppp_channel *chan;   /* public channel data structure */
>>   struct rw_semaphore chan_sem;   /* protects `chan' during chan ioctl */
>> - spinlock_t  downl;  /* protects `chan', file.xq dequeue */
>> + struct channel_lock downl;  /* protects `chan', file.xq dequeue */
>>   struct ppp  *ppp;   /* ppp unit we're connected to */
>>   struct net  *chan_net;  /* the net channel belongs to */
>>   struct list_head clist; /* link in list of channels per unit */
>
>> @@ -1645,6 +1687,7 @@ static int ppp_mp_explode(struct ppp *ppp, struct 
>> sk_buff *skb)
>>   struct channel *pch;
>>   struct sk_buff *frag;
>>  

Re: [PATCH v1 1/1] l2tp: Use existing macros instead of literal number

2016-08-20 Thread Feng Gao
Sorry, I forget to modify the title.
I will sent another update.

On Sat, Aug 20, 2016 at 2:28 AM, Guillaume Nault  wrote:
> On Fri, Aug 19, 2016 at 05:55:26PM +0800, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> 1. Use PPP_ALLSTATIONS/PPP_UI instead of literal 0xff/0x03;
>> 2. Use one static const global fixed_ppphdr instead of two same
>> static variable ppph in two different functions;
>> 3. Use SEND_SHUTDOWN instead of literal 2;
>>
>> Signed-off-by: Gao Feng 
>> ---
>>  v1: Initial patch
>>
> v1 again?
>
> BTW, no need to number your patch for a single patch series.
> But you have to tell which tree is the series for.
>
> So instead of [PATCH v1 1/1], you should send [PATCH v2 net-next].


Re: [PATCH 1/1] netfilter: gre: Use the consitent GRE and PPTP struct instead of the structures defined in netfilter

2016-08-19 Thread Feng Gao
My email server reports the last same patch email failed to send.
So I just sent it again.

I am sorry, if anyone receives duplicated ones.

Regards
Feng

On Fri, Aug 19, 2016 at 11:01 PM,   wrote:
> From: Gao Feng 
>
> There are two structures which define the GRE header and PPTP
> header. So it is unneccessary to define duplicated structures in
> netfilter again.
>
> Signed-off-by: Gao Feng 
> ---
>  v1: Intial patch
>
>  include/linux/netfilter/nf_conntrack_proto_gre.h | 63 
> +---
>  include/uapi/linux/if_tunnel.h   |  1 +
>  net/ipv4/netfilter/nf_nat_proto_gre.c| 15 +++---
>  net/netfilter/nf_conntrack_proto_gre.c   | 14 +++---
>  4 files changed, 19 insertions(+), 74 deletions(-)
>
> diff --git a/include/linux/netfilter/nf_conntrack_proto_gre.h 
> b/include/linux/netfilter/nf_conntrack_proto_gre.h
> index df78dc2..9c741da 100644
> --- a/include/linux/netfilter/nf_conntrack_proto_gre.h
> +++ b/include/linux/netfilter/nf_conntrack_proto_gre.h
> @@ -2,67 +2,8 @@
>  #define _CONNTRACK_PROTO_GRE_H
>  #include 
>
> -/* GRE PROTOCOL HEADER */
> -
> -/* GRE Version field */
> -#define GRE_VERSION_1701   0x0
> -#define GRE_VERSION_PPTP   0x1
> -
> -/* GRE Protocol field */
> -#define GRE_PROTOCOL_PPTP  0x880B
> -
> -/* GRE Flags */
> -#define GRE_FLAG_C 0x80
> -#define GRE_FLAG_R 0x40
> -#define GRE_FLAG_K 0x20
> -#define GRE_FLAG_S 0x10
> -#define GRE_FLAG_A 0x80
> -
> -#define GRE_IS_C(f)((f)_FLAG_C)
> -#define GRE_IS_R(f)((f)_FLAG_R)
> -#define GRE_IS_K(f)((f)_FLAG_K)
> -#define GRE_IS_S(f)((f)_FLAG_S)
> -#define GRE_IS_A(f)((f)_FLAG_A)
> -
> -/* GRE is a mess: Four different standards */
> -struct gre_hdr {
> -#if defined(__LITTLE_ENDIAN_BITFIELD)
> -   __u16   rec:3,
> -   srr:1,
> -   seq:1,
> -   key:1,
> -   routing:1,
> -   csum:1,
> -   version:3,
> -   reserved:4,
> -   ack:1;
> -#elif defined(__BIG_ENDIAN_BITFIELD)
> -   __u16   csum:1,
> -   routing:1,
> -   key:1,
> -   seq:1,
> -   srr:1,
> -   rec:3,
> -   ack:1,
> -   reserved:4,
> -   version:3;
> -#else
> -#error "Adjust your  defines"
> -#endif
> -   __be16  protocol;
> -};
> -
> -/* modified GRE header for PPTP */
> -struct gre_hdr_pptp {
> -   __u8   flags;   /* bitfield */
> -   __u8   version; /* should be GRE_VERSION_PPTP */
> -   __be16 protocol;/* should be GRE_PROTOCOL_PPTP */
> -   __be16 payload_len; /* size of ppp payload, not inc. gre header */
> -   __be16 call_id; /* peer's call_id for this session */
> -   __be32 seq; /* sequence number.  Present if S==1 */
> -   __be32 ack; /* seq number of highest packet received by */
> -   /*  sender in this session */
> -};
> +#include 
> +#include 
>
>  struct nf_ct_gre {
> unsigned int stream_timeout;
> diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
> index 361b9f0..1b27e2c 100644
> --- a/include/uapi/linux/if_tunnel.h
> +++ b/include/uapi/linux/if_tunnel.h
> @@ -36,6 +36,7 @@
>  #define GRE_IS_REC(f)  ((f) & GRE_REC)
>  #define GRE_IS_ACK(f)  ((f) & GRE_ACK)
>
> +#define GRE_VERSION_0  __cpu_to_be16(0x)
>  #define GRE_VERSION_1  __cpu_to_be16(0x0001)
>  #define GRE_PROTO_PPP  __cpu_to_be16(0x880b)
>  #define GRE_PPTP_KEY_MASK  __cpu_to_be32(0x)
> diff --git a/net/ipv4/netfilter/nf_nat_proto_gre.c 
> b/net/ipv4/netfilter/nf_nat_proto_gre.c
> index 9414923..afe81a8 100644
> --- a/net/ipv4/netfilter/nf_nat_proto_gre.c
> +++ b/net/ipv4/netfilter/nf_nat_proto_gre.c
> @@ -88,8 +88,9 @@ gre_manip_pkt(struct sk_buff *skb,
>   const struct nf_conntrack_tuple *tuple,
>   enum nf_nat_manip_type maniptype)
>  {
> -   const struct gre_hdr *greh;
> -   struct gre_hdr_pptp *pgreh;
> +   const struct gre_base_hdr *greh;
> +   struct pptp_gre_header *pgreh;
> +   u16 gre_ver;
>
> /* pgreh includes two optional 32bit fields which are not required
>  * to be there.  That's where the magic '8' comes from */
> @@ -97,18 +98,20 @@ gre_manip_pkt(struct sk_buff *skb,
> return false;
>
> greh = (void *)skb->data + hdroff;
> -   pgreh = (struct gre_hdr_pptp *)greh;
> +   pgreh = (struct pptp_gre_header *)greh;
>
> /* we only have destination manip of a packet, since 'source key'
>  * is not present in the packet itself */
> if (maniptype != NF_NAT_MANIP_DST)
> return true;
> -   switch (greh->version) {
> -   case GRE_VERSION_1701:
> +
> +   gre_ver = ntohs(greh->flags 

Re: [PATCH 1/1] ppp: Fix one deadlock issue of PPP when send frame

2016-08-19 Thread Feng Gao
Hi Philip,

I answer your question inline.

On Thu, Aug 18, 2016 at 10:11 PM, Philp Prindeville
<phil...@redfish-solutions.com> wrote:
> Feng,
>
> If the CPU can already be holding the lock, that implies re-entrancy.
> What's to stop the first flow of code which acquired the lock from releasing
> it again before the 2nd flow is done?  Is the 2nd flow running at a higher
> priority or with interrupts disabled?

There is no preemption happened. It is caused by wrong route policy by l2tp.
For example, the cpu0 get the spinlock of channel1, then the channel1
is selected again after route. As a result, cpu0 tries to get the same
spinlock again.

The call flow is like this.
ppp_write->ppp_channel_push->start_xmit->select inappropriate route
 -> dev_hard_start_xmit->ppp_start_xmit->ppp_xmit_process->
ppp_push. Now ppp_push tries to get the same spinlock which is held
in ppp_channel_push.

Regards
Feng

>
> It seems unlikely to me that this sort of locking problem hasn't existed
> elsewhere before and that an entirely new mechanism for handling it is
> needed...  How are similar re-entrancy issues handled elsewhere?
>
> -Philip
>
>
>
>
> On 08/16/2016 05:36 AM, Feng Gao wrote:
>>
>> Hi Paul,
>>
>> The v1 patch does not handle the recursive lock case. It could cause
>> unlock multiple times.
>> So I send the v2 patch as one update.
>>
>> Best Regards
>> Feng
>>
>> On Tue, Aug 16, 2016 at 7:05 PM,  <f...@ikuai8.com> wrote:
>>>
>>> From: Gao Feng <f...@ikuai8.com>
>>>
>>> PPP channel holds one spinlock before send frame. But the skb may
>>> select the same PPP channel with wrong route policy. As a result,
>>> the skb reaches the same channel path. It tries to get the same
>>> spinlock which is held before. Bang, the deadlock comes out.
>>>
>>> Now add one lock owner to avoid it like xmit_lock_owner of
>>> netdev_queue. Check the lock owner before try to get the spinlock.
>>> If the current cpu is already the owner, needn't lock again. When
>>> PPP channel holds the spinlock at the first time, it sets owner
>>> with current CPU ID.
>>>
>>> The following is the panic stack of 3.3.8. But the same issue
>>> should be in the upstream too.
>>>
>>> [] ? _raw_spin_lock_bh+0x11/0x40
>>> [] ppp_unregister_channel+0x1347/0x2170 [ppp_generic]
>>> [] ? kmem_cache_free+0xa7/0xc0
>>> [] ppp_unregister_channel+0x1db7/0x2170 [ppp_generic]
>>> [] ppp_unregister_channel+0x2065/0x2170 [ppp_generic]
>>> [] dev_hard_start_xmit+0x4cd/0x620
>>> [] sch_direct_xmit+0x74/0x1d0
>>> [] dev_queue_xmit+0x1d/0x30
>>> [] neigh_direct_output+0xc/0x10
>>> [] ip_finish_output+0x25e/0x2b0
>>> [] ip_output+0x88/0x90
>>> [] ? __ip_local_out+0x9f/0xb0
>>> [] ip_local_out+0x24/0x30
>>> [] 0xa00b9744
>>> [] ppp_unregister_channel+0x20f8/0x2170 [ppp_generic]
>>> [] ppp_output_wakeup+0x122/0x11d0 [ppp_generic]
>>> [] vfs_write+0xb8/0x160
>>> [] sys_write+0x45/0x90
>>> [] system_call_fastpath+0x16/0x1b
>>>
>>> The call flow is like this.
>>> ppp_write->ppp_channel_push->start_xmit->select inappropriate route
>>>  -> dev_hard_start_xmit->ppp_start_xmit->ppp_xmit_process->
>>> ppp_push. Now ppp_push tries to get the same spinlock which is held
>>> in ppp_channel_push.
>>>
>>> Although the PPP deadlock is caused by inappropriate route policy
>>> with L2TP, I think it is not accepted the PPP module would cause kernel
>>> deadlock with wrong route policy.
>>>
>>> Signed-off-by: Gao Feng <f...@ikuai8.com>
>>> ---
>>>   v1: Initial Patch
>>>
>>>   drivers/net/ppp/ppp_generic.c | 49
>>> +++
>>>   1 file changed, 36 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/net/ppp/ppp_generic.c
>>> b/drivers/net/ppp/ppp_generic.c
>>> index 70cfa06..ffd0233 100644
>>> --- a/drivers/net/ppp/ppp_generic.c
>>> +++ b/drivers/net/ppp/ppp_generic.c
>>> @@ -162,6 +162,29 @@ struct ppp {
>>>   |SC_MULTILINK|SC_MP_SHORTSEQ|SC_MP_XSHORTSEQ \
>>>   |SC_COMP_TCP|SC_REJ_COMP_TCP|SC_MUST_COMP)
>>>
>>> +struct chennel_lock {
>>> +   spinlock_t lock;
>>> +   u32 owner;
>>> +};
>>> +
>>> +#define PPP_CHANNEL_LOCK_INIT(cl) \
>>> +   cl.owne

Re: [PATCH v2 1/1] ppp: Fix one deadlock issue of PPP when send frame

2016-08-18 Thread Feng Gao
Hi Guillaume,

Thanks your detail analyses.
Now I think it is a good solution that just drop the packet and print
error log instead my original solution that supports reentrant. This
solution will not bring any side effects.

I will send one update according to this new solution.


Regards
Feng


On Fri, Aug 19, 2016 at 12:13 AM, Guillaume Nault <g.na...@alphalink.fr> wrote:
> On Thu, Aug 18, 2016 at 08:58:31AM +0800, Feng Gao wrote:
>> On Thu, Aug 18, 2016 at 1:42 AM, Guillaume Nault <g.na...@alphalink.fr> 
>> wrote:
>> > On Tue, Aug 16, 2016 at 07:33:38PM +0800, f...@ikuai8.com wrote:
>> >> From: Gao Feng <f...@ikuai8.com>
>> >>
>> >> PPP channel holds one spinlock before send frame. But the skb may
>> >> select the same PPP channel with wrong route policy. As a result,
>> >> the skb reaches the same channel path. It tries to get the same
>> >> spinlock which is held before. Bang, the deadlock comes out.
>> >>
>> > Unless I misunderstood the problem you're trying to solve, this patch
>> > doesn't really help: deadlock still occurs if the same IP is used for
>> > L2TP and PPP's peer address.
>> >
>>
>> The deadlock happens because the same cpu try to hold the spinlock
>> which is already held before by itself.
>> Now the PPP_CHANNEL_LOCK_BH sets the lock owner after hold lock, then
>> when the same cpu
>> invokes PPP_CHANNEL_LOCK_BH again. The cl.owner equals current cpu id,
>> so it only increases
>> the lock_cnt without trying to hold the lock again.
>> So it avoids the deadlock.
>>
> I'm sorry but, again, it just _moves_ the deadlock down to L2TP. The
> kernel still oops because, now, l2tp_xmit_skb() is called recursively
> while holding its tunnel socket.
>
>> >> Now add one lock owner to avoid it like xmit_lock_owner of
>> >> netdev_queue. Check the lock owner before try to get the spinlock.
>> >> If the current cpu is already the owner, needn't lock again. When
>> >> PPP channel holds the spinlock at the first time, it sets owner
>> >> with current CPU ID.
>> >>
>> > I think you should forbid lock recursion entirely, and drop the packet
>> > if the owner tries to re-acquire the channel lock. Otherwise you just
>> > move the deadlock down the stack (l2tp_xmit_skb() can't be called
>> > recursively).
>>
>> The reason that fix it in ppp is that there are other layer on the ppp 
>> module.
>> We resolve it in ppp module, it could avoid all similar potential issues.
>>
> Not sure if I understand your point here.
> The xmit path of PPP and its sub-layers hasn't been designed to be
> reentrant. Allowing recursive sends thus require to review the full
> path.
>
> Beyond the L2TP issue discussed above, just consider the locking
> dependencies used in PPP: ppp->wlock has to be held before
> channel->downl. Sending a packet directly on a channel will lock
> channel->downl. If this packet is routed back to the parent unit then
> ppp_xmit_process() will lock ppp->wlock, effectively leading to lock
> inversion and potential deadlock.
>
> So we have two options: adapt the whole xmit path to handle recursive
> sends or forbid recursion entirely. Unfortunately none of these options
> looks easy to achieve:
>
>   * Making PPP xmit path reentrant will be hard and error prone because
> of all the locking dependencies. Looks like simplifying PPP's
> locking scheme will be required first.
>
>   * I can't see any way to reliably prevent settings where a packet sent
> on a given channel would be routed back to the parent unit.
>
>
> OTOH, we can try to limit the impact of recursive sends for simple
> cases:
>
>   * Following your approach, either adapt the lower layers
> (like l2tp_xmit_skb() for L2TP), or drop the packet when
> cl.owner == smp_processor_id(). This is very limited in scope and
> doesn't address issues like locking inversions. But it may let the
> system survive long enough for the PPP to time out.
>
>   * In the lower layer, check where the packet is going to be enqueued
> and drop it if it's the parent device. That should reliably handle
> simple and common cases. However this requires to update at least
> L2TP and PPTP and to get a way to access the parent device. Also,
> it doesn't prevent recursion with stacked interfaces.


Re: [PATCH 1/1] ppp: Fix one deadlock issue of PPP when send frame

2016-08-18 Thread Feng Gao
Hi Philp,

Yes. I am agree with you.
Just drop is better to support recursive lock.

I will send a new patch later.

Regards
Feng


On Fri, Aug 19, 2016 at 12:48 AM, Philp Prindeville
<phil...@redfish-solutions.com> wrote:
>
>
> On 08/18/2016 09:05 AM, Feng Gao wrote:
>>
>> On Thu, Aug 18, 2016 at 10:11 PM, Philp Prindeville
>> <phil...@redfish-solutions.com>  wrote:
>>>
>>> >Feng,
>>> >
>>> >If the CPU can already be holding the lock, that implies re-entrancy.
>>> >What's to stop the first flow of code which acquired the lock from
>>> > releasing
>>> >it again before the 2nd flow is done?  Is the 2nd flow running at a
>>> > higher
>>> >priority or with interrupts disabled?
>>
>> There is no preemption happened. It is caused by wrong route policy by
>> l2tp.
>> For example, the cpu0 get the spinlock of channel1, then the channel1
>> is selected again after route. As a result, cpu0 tries to get the same
>> spinlock again.
>>
>> The call flow is like this.
>> ppp_write->ppp_channel_push->start_xmit->select inappropriate route
>>  -> dev_hard_start_xmit->ppp_start_xmit->ppp_xmit_process->
>> ppp_push. Now ppp_push tries to get the same spinlock which is held
>> in ppp_channel_push.
>>
>> Regards
>> Feng
>>
>
> If we're detecting (through the fact that the lock has already been
> acquired) that the wrong route is being applied, why don't we just punt the
> packet instead?
>
> -Philip
>


Re: [PATCH v1 1/1] pppoe: l2tp: the PPPOX_CONNECTED should be used with bit operation

2016-08-18 Thread Feng Gao
inline.

On Fri, Aug 19, 2016 at 1:44 AM, Guillaume Nault  wrote:
> On Thu, Aug 18, 2016 at 09:59:03AM +0800, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> There are some codes in pppoe and l2tp which use the PPPOX_CONNECTED
>> as the value including assignment and condition check.
>> They should keep consistent with other codes.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>  v1: Initial Patch
>>
>>  drivers/net/ppp/pppoe.c | 2 +-
>>  net/l2tp/l2tp_ppp.c | 4 ++--
>>  2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
>> index 4ddae81..684b773 100644
>> --- a/drivers/net/ppp/pppoe.c
>> +++ b/drivers/net/ppp/pppoe.c
>> @@ -697,7 +697,7 @@ static int pppoe_connect(struct socket *sock, struct 
>> sockaddr *uservaddr,
>>   goto err_put;
>>   }
>>
>> - sk->sk_state = PPPOX_CONNECTED;
>> + sk->sk_state |= PPPOX_CONNECTED;
>>
> Using plain assignment makes it clear for the reader that other flags
> are unset. I see no reason for changing this.

I get you. So I don't modify the PPPOX_DEAD assignment.
But I am afraid if there is some case that the flag PPPOX_BOUND is set
before PPPOX_CONNECTED . Then the assignment of "PPPOX_CONNECTED" will
clear the PPPOX_BOUND flag.

>
>> diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
>> index d9560aa..3984385 100644
>> --- a/net/l2tp/l2tp_ppp.c
>> +++ b/net/l2tp/l2tp_ppp.c
>> @@ -774,7 +774,7 @@ static int pppol2tp_connect(struct socket *sock, struct 
>> sockaddr *uservaddr,
>>  out_no_ppp:
>>   /* This is how we get the session context from the socket. */
>>   sk->sk_user_data = session;
>> - sk->sk_state = PPPOX_CONNECTED;
>> + sk->sk_state |= PPPOX_CONNECTED;
>>
> Same here.
>
>> @@ -856,7 +856,7 @@ static int pppol2tp_getname(struct socket *sock, struct 
>> sockaddr *uaddr,
>>   error = -ENOTCONN;
>>   if (sk == NULL)
>>   goto end;
>> - if (sk->sk_state != PPPOX_CONNECTED)
>> + if (!(sk->sk_state & PPPOX_CONNECTED))
>>
> Looks like it was a bug. This one is worth a separate patch.

Ok, I send another patch for this bug.

Regards
Feng


Re: [PATCH v1 1/1] l2tp: Use existing macros instead of literal number

2016-08-18 Thread Feng Gao
On Fri, Aug 19, 2016 at 2:41 AM, Guillaume Nault  wrote:
> On Thu, Aug 18, 2016 at 03:05:19PM +0800, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> 1. Use PPP_ALLSTATIONS/PPP_UI instead of literal 0xff/0x03;
>> 2. Use one static const global fixed_ppphdr instead of two same
>> static variable ppph in two different functions;
>> 3. Use SEND_SHUTDOWN instead of literal 2;
>>
>> Signed-off-by: Gao Feng 
>> ---
>>  v1: Initial patch
> No need to send 'v1' for the initial series.

OK, I get it.

>
>> --- a/net/l2tp/l2tp_ppp.c
>> +++ b/net/l2tp/l2tp_ppp.c
>> @@ -138,6 +138,8 @@ static const struct ppp_channel_ops pppol2tp_chan_ops = {
>>
>>  static const struct proto_ops pppol2tp_ops;
>>
>> +static const unsigned char fixed_ppphdr[2] = {PPP_ALLSTATIONS, PPP_UI};
>> +
>>  /* Helpers to obtain tunnel/session contexts from sockets.
>>   */
>>  static inline struct l2tp_session *pppol2tp_sock_to_session(struct sock *sk)
>> @@ -174,11 +176,11 @@ static int pppol2tp_recv_payload_hook(struct sk_buff 
>> *skb)
>>* Note that skb->data[] isn't dereferenced from a u16 ptr here since
>>* the field may be unaligned.
>>*/
>> - if (!pskb_may_pull(skb, 2))
>> + if (!pskb_may_pull(skb, sizeof(fixed_ppphdr)))
>>   return 1;
>>
>> - if ((skb->data[0] == 0xff) && (skb->data[1] == 0x03))
>> - skb_pull(skb, 2);
>> + if ((PPP_ADDRESS(skb->data) == PPP_ALLSTATIONS) && 
>> (PPP_CONTROL(skb->data) == PPP_UI))
>> + skb_pull(skb, sizeof(fixed_ppphdr));
>>
> Sorry, but I find the original code clearer. It's important to be
> explicit about what's done with the sk_buff. Hiding skb->data[x] behind
> macros certainly doesn't help.
>
> Same thing for the use of sizeof(fixed_ppphdr) in pskb_may_pull(). The
> size of fixed_ppphdr isn't used aftewards, so it's unclear why its size
> was pulled. 2 was not a magic number here, it was directly related with
> the operations done on the skb (i.e. accessing skb->data[0] and
> skb->data[1]). So pskb_may_pull(skb, 2) makes perfect sense.

Agree now because of detail explanation.
Thanks.

>
> OTOH, replacing 0xff and 0x03 with PPP_ALLSTATIONS and PPP_UI is fine.

OK. get it.

>
>> @@ -312,7 +313,7 @@ static int pppol2tp_sendmsg(struct socket *sock, struct 
>> msghdr *m,
>>   error = -ENOMEM;
>>   skb = sock_wmalloc(sk, NET_SKB_PAD + sizeof(struct iphdr) +
>>  uhlen + session->hdr_len +
>> -sizeof(ppph) + total_len,
>> +sizeof(fixed_ppphdr) + total_len,
>>  0, GFP_KERNEL);
>>   if (!skb)
>>   goto error_put_sess_tun;
>> @@ -325,9 +326,9 @@ static int pppol2tp_sendmsg(struct socket *sock, struct 
>> msghdr *m,
>>   skb_reserve(skb, uhlen);
>>
>>   /* Add PPP header */
>> - skb->data[0] = ppph[0];
>> - skb->data[1] = ppph[1];
>> - skb_put(skb, 2);
>> + PPP_ADDRESS(skb->data) = fixed_ppphdr[0];
>> + PPP_CONTROL(skb->data) = fixed_ppphdr[1];
>> + skb_put(skb, sizeof(fixed_ppphdr));
>>
> Same here. What about
> + skb->data[0] = PPP_ALLSTATIONS;
> + skb->data[1] = PPP_UI;
> + skb_put(skb, 2);
> and removing ppph entirely?

Agree with you.

>
>>   /* Copy user data into skb */
>>   error = memcpy_from_msg(skb_put(skb, total_len), m, total_len);
>> @@ -369,7 +370,6 @@ error:
>>   */
>>  static int pppol2tp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
>>  {
>> - static const u8 ppph[2] = { 0xff, 0x03 };
>>   struct sock *sk = (struct sock *) chan->private;
>>   struct sock *sk_tun;
>>   struct l2tp_session *session;
>> @@ -398,14 +398,14 @@ static int pppol2tp_xmit(struct ppp_channel *chan, 
>> struct sk_buff *skb)
>>  sizeof(struct iphdr) + /* IP header */
>>  uhlen +  /* UDP header (if L2TP_ENCAPTYPE_UDP) 
>> */
>>  session->hdr_len +   /* L2TP header */
>> -sizeof(ppph);/* PPP header */
>> +sizeof(fixed_ppphdr); /* PPP header */
>>   if (skb_cow_head(skb, headroom))
>>   goto abort_put_sess_tun;
>>
>>   /* Setup PPP header */
>> - __skb_push(skb, sizeof(ppph));
>> - skb->data[0] = ppph[0];
>> - skb->data[1] = ppph[1];
>> + __skb_push(skb, sizeof(fixed_ppphdr));
>> + skb->data[0] = fixed_ppphdr[0];
>> + skb->data[1] = fixed_ppphdr[1];
>>
> Same as for pppol2tp_sendmsg().
>
>> @@ -440,7 +440,7 @@ static void pppol2tp_session_close(struct l2tp_session 
>> *session)
>>   BUG_ON(session->magic != L2TP_SESSION_MAGIC);
>>
>>   if (sock) {
>> - inet_shutdown(sock, 2);
>> + inet_shutdown(sock, SEND_SHUTDOWN);
>>
> Ok.

Regards
Feng


Re: [PATCH v2 1/1] ppp: Fix one deadlock issue of PPP when send frame

2016-08-17 Thread Feng Gao
inline answer.

On Thu, Aug 18, 2016 at 1:42 AM, Guillaume Nault  wrote:
> On Tue, Aug 16, 2016 at 07:33:38PM +0800, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> PPP channel holds one spinlock before send frame. But the skb may
>> select the same PPP channel with wrong route policy. As a result,
>> the skb reaches the same channel path. It tries to get the same
>> spinlock which is held before. Bang, the deadlock comes out.
>>
> Unless I misunderstood the problem you're trying to solve, this patch
> doesn't really help: deadlock still occurs if the same IP is used for
> L2TP and PPP's peer address.
>

The deadlock happens because the same cpu try to hold the spinlock
which is already held before by itself.
Now the PPP_CHANNEL_LOCK_BH sets the lock owner after hold lock, then
when the same cpu
invokes PPP_CHANNEL_LOCK_BH again. The cl.owner equals current cpu id,
so it only increases
the lock_cnt without trying to hold the lock again.
So it avoids the deadlock.

>> Now add one lock owner to avoid it like xmit_lock_owner of
>> netdev_queue. Check the lock owner before try to get the spinlock.
>> If the current cpu is already the owner, needn't lock again. When
>> PPP channel holds the spinlock at the first time, it sets owner
>> with current CPU ID.
>>
> I think you should forbid lock recursion entirely, and drop the packet
> if the owner tries to re-acquire the channel lock. Otherwise you just
> move the deadlock down the stack (l2tp_xmit_skb() can't be called
> recursively).

The reason that fix it in ppp is that there are other layer on the ppp module.
We resolve it in ppp module, it could avoid all similar potential issues.

>
>> diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
>> index 70cfa06..6909ab1 100644
>> --- a/drivers/net/ppp/ppp_generic.c
>> +++ b/drivers/net/ppp/ppp_generic.c
>> @@ -162,6 +162,37 @@ struct ppp {
>>|SC_MULTILINK|SC_MP_SHORTSEQ|SC_MP_XSHORTSEQ \
>>|SC_COMP_TCP|SC_REJ_COMP_TCP|SC_MUST_COMP)
>>
>> +struct chennel_lock {
> ^
> I guess you meant 'channel_lock'.

Yes. It is a typo. Thanks.

>
>> + spinlock_t lock;
>> + u32 owner;
> This field's default value is -1. So why not declaring it as 'int'?

OK. I will fix it.

Best Regards
Feng
--
To unsubscribe from this list: send the line "unsubscribe linux-ppp" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] ppp: Fix one deadlock issue of PPP when send frame

2016-08-16 Thread Feng Gao
Hi Paul,

The v1 patch does not handle the recursive lock case. It could cause
unlock multiple times.
So I send the v2 patch as one update.

Best Regards
Feng

On Tue, Aug 16, 2016 at 7:05 PM,   wrote:
> From: Gao Feng 
>
> PPP channel holds one spinlock before send frame. But the skb may
> select the same PPP channel with wrong route policy. As a result,
> the skb reaches the same channel path. It tries to get the same
> spinlock which is held before. Bang, the deadlock comes out.
>
> Now add one lock owner to avoid it like xmit_lock_owner of
> netdev_queue. Check the lock owner before try to get the spinlock.
> If the current cpu is already the owner, needn't lock again. When
> PPP channel holds the spinlock at the first time, it sets owner
> with current CPU ID.
>
> The following is the panic stack of 3.3.8. But the same issue
> should be in the upstream too.
>
> [] ? _raw_spin_lock_bh+0x11/0x40
> [] ppp_unregister_channel+0x1347/0x2170 [ppp_generic]
> [] ? kmem_cache_free+0xa7/0xc0
> [] ppp_unregister_channel+0x1db7/0x2170 [ppp_generic]
> [] ppp_unregister_channel+0x2065/0x2170 [ppp_generic]
> [] dev_hard_start_xmit+0x4cd/0x620
> [] sch_direct_xmit+0x74/0x1d0
> [] dev_queue_xmit+0x1d/0x30
> [] neigh_direct_output+0xc/0x10
> [] ip_finish_output+0x25e/0x2b0
> [] ip_output+0x88/0x90
> [] ? __ip_local_out+0x9f/0xb0
> [] ip_local_out+0x24/0x30
> [] 0xa00b9744
> [] ppp_unregister_channel+0x20f8/0x2170 [ppp_generic]
> [] ppp_output_wakeup+0x122/0x11d0 [ppp_generic]
> [] vfs_write+0xb8/0x160
> [] sys_write+0x45/0x90
> [] system_call_fastpath+0x16/0x1b
>
> The call flow is like this.
> ppp_write->ppp_channel_push->start_xmit->select inappropriate route
>  -> dev_hard_start_xmit->ppp_start_xmit->ppp_xmit_process->
> ppp_push. Now ppp_push tries to get the same spinlock which is held
> in ppp_channel_push.
>
> Although the PPP deadlock is caused by inappropriate route policy
> with L2TP, I think it is not accepted the PPP module would cause kernel
> deadlock with wrong route policy.
>
> Signed-off-by: Gao Feng 
> ---
>  v1: Initial Patch
>
>  drivers/net/ppp/ppp_generic.c | 49 
> +++
>  1 file changed, 36 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
> index 70cfa06..ffd0233 100644
> --- a/drivers/net/ppp/ppp_generic.c
> +++ b/drivers/net/ppp/ppp_generic.c
> @@ -162,6 +162,29 @@ struct ppp {
>  |SC_MULTILINK|SC_MP_SHORTSEQ|SC_MP_XSHORTSEQ \
>  |SC_COMP_TCP|SC_REJ_COMP_TCP|SC_MUST_COMP)
>
> +struct chennel_lock {
> +   spinlock_t lock;
> +   u32 owner;
> +};
> +
> +#define PPP_CHANNEL_LOCK_INIT(cl) \
> +   cl.owner = -1; \
> +   spin_lock_init()
> +
> +#define PPP_CHANNEL_LOCK_BH(cl) \
> +   do { \
> +   local_bh_disable(); \
> +   if (cl.owner != smp_processor_id()) { \
> +   spin_lock(); \
> +   cl.owner = smp_processor_id(); \
> +   } \
> +   } while (0)
> +
> +#define PPP_CHANNEL_UNLOCK_BH(cl) \
> +   cl.owner = -1; \
> +   spin_unlock(); \
> +   local_bh_enable()
> +
>  /*
>   * Private data structure for each channel.
>   * This includes the data structure used for multilink.
> @@ -171,7 +194,7 @@ struct channel {
> struct list_head list;  /* link in all/new_channels list */
> struct ppp_channel *chan;   /* public channel data structure */
> struct rw_semaphore chan_sem;   /* protects `chan' during chan ioctl 
> */
> -   spinlock_t  downl;  /* protects `chan', file.xq dequeue */
> +   struct chennel_lock downl;  /* protects `chan', file.xq dequeue */
> struct ppp  *ppp;   /* ppp unit we're connected to */
> struct net  *chan_net;  /* the net channel belongs to */
> struct list_head clist; /* link in list of channels per unit 
> */
> @@ -1597,7 +1620,7 @@ ppp_push(struct ppp *ppp)
> list = list->next;
> pch = list_entry(list, struct channel, clist);
>
> -   spin_lock_bh(>downl);
> +   PPP_CHANNEL_LOCK_BH(pch->downl);
> if (pch->chan) {
> if (pch->chan->ops->start_xmit(pch->chan, skb))
> ppp->xmit_pending = NULL;
> @@ -1606,7 +1629,7 @@ ppp_push(struct ppp *ppp)
> kfree_skb(skb);
> ppp->xmit_pending = NULL;
> }
> -   spin_unlock_bh(>downl);
> +   PPP_CHANNEL_UNLOCK_BH(pch->downl);
> return;
> }
>
> @@ -1736,7 +1759,7 @@ static int ppp_mp_explode(struct ppp *ppp, struct 
> sk_buff *skb)
> }
>
> /* check the channel's mtu and whether it is still attached. 
> */
> -   spin_lock_bh(>downl);
> +   

Re: [PATCH v5 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-08 Thread Feng Gao
Oh, I think I really get you this time.
Please see my inline comment.

If they are right, I could send the v6 patch.

Best Regards
Feng


On Tue, Aug 9, 2016 at 9:06 AM, Philip Prindeville
 wrote:
> Inline...
>
>
>
> On 08/08/2016 06:37 PM, f...@48lvckh6395k16k5.yundunddos.com wrote:
>>
>> From: Gao Feng 
>>
>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>> zero. So RPS does not work for PPTP traffic.
>>
>> In my test environment, there are four MIPS cores, and all traffic
>> are passed through by PPTP. As a result, only one core is 100% busy
>> while other three cores are very idle. After this patch, the usage
>> of four cores are balanced well.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>   v5: 1) Make fix header of gre_full_hdr as uname struct;
>>   2) Create macro GRE_PPTP_KEY_MASK;
>>   v4: 1) Define struct gre_full_hdr, and use sizeof its member directly;
>>   2) Move version and routing check ahead;
>>   3) Only PPTP in GRE check the ack flag;
>>   v3: 1) Move struct pptp_gre_header defination into new file pptp.h
>>   2) Use sizeof GRE and PPTP type instead of literal value;
>>   3) Remove strict flag check for PPTP to robust;
>>   4) Consolidate the codes again;
>>   v2: Update according to Tom and Philp's advice.
>>   1) Consolidate the codes with GRE version 0 path;
>>   2) Use PPP_PROTOCOL to get ppp protol;
>>   3) Set the FLOW_DIS_ENCAPSULATION flag;
>>   v1: Intial Patch
>>
>>   drivers/net/ppp/pptp.c |  36 +
>>   include/net/gre.h  |  13 -
>>   include/net/pptp.h |  40 +++
>>   include/uapi/linux/if_tunnel.h |   7 ++-
>>   net/core/flow_dissector.c  | 113
>> -
>>   5 files changed, 138 insertions(+), 71 deletions(-)
>>   create mode 100644 include/net/pptp.h
>>
>> diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
>> index ae0905e..3e68dbc 100644
>> --- a/drivers/net/ppp/pptp.c
>> +++ b/drivers/net/ppp/pptp.c
>> @@ -37,6 +37,7 @@
>>   #include 
>>   #include 
>>   #include 
>> +#include 
>> #include 
>>   @@ -53,41 +54,6 @@ static struct proto pptp_sk_proto __read_mostly;
>>   static const struct ppp_channel_ops pptp_chan_ops;
>>   static const struct proto_ops pptp_ops;
>>   -#define PPP_LCP_ECHOREQ 0x09
>> -#define PPP_LCP_ECHOREP 0x0A
>> -#define SC_RCV_BITS(SC_RCV_B7_1|SC_RCV_B7_0|SC_RCV_ODDP|SC_RCV_EVNP)
>> -
>> -#define MISSING_WINDOW 20
>> -#define WRAPPED(curseq, lastseq)\
>> -   curseq) & 0xff00) == 0) &&\
>> -   (((lastseq) & 0xff00) == 0xff00))
>> -
>> -#define PPTP_GRE_PROTO  0x880B
>> -#define PPTP_GRE_VER0x1
>> -
>> -#define PPTP_GRE_FLAG_C0x80
>> -#define PPTP_GRE_FLAG_R0x40
>> -#define PPTP_GRE_FLAG_K0x20
>> -#define PPTP_GRE_FLAG_S0x10
>> -#define PPTP_GRE_FLAG_A0x80
>> -
>> -#define PPTP_GRE_IS_C(f) ((f)_GRE_FLAG_C)
>> -#define PPTP_GRE_IS_R(f) ((f)_GRE_FLAG_R)
>> -#define PPTP_GRE_IS_K(f) ((f)_GRE_FLAG_K)
>> -#define PPTP_GRE_IS_S(f) ((f)_GRE_FLAG_S)
>> -#define PPTP_GRE_IS_A(f) ((f)_GRE_FLAG_A)
>> -
>> -#define PPTP_HEADER_OVERHEAD (2+sizeof(struct pptp_gre_header))
>> -struct pptp_gre_header {
>> -   u8  flags;
>> -   u8  ver;
>> -   __be16 protocol;
>> -   __be16 payload_len;
>> -   __be16 call_id;
>> -   __be32 seq;
>> -   __be32 ack;
>> -} __packed;
>> -
>>   static struct pppox_sock *lookup_chan(u16 call_id, __be32 s_addr)
>>   {
>> struct pppox_sock *sock;
>> diff --git a/include/net/gre.h b/include/net/gre.h
>> index 7a54a31..6347f16 100644
>> --- a/include/net/gre.h
>> +++ b/include/net/gre.h
>> @@ -7,9 +7,20 @@
>>   struct gre_base_hdr {
>> __be16 flags;
>> __be16 protocol;
>> -};
>> +} __packed;
>>   #define GRE_HEADER_SECTION 4
>>   +struct gre_full_hdr {
>> +   struct {
>> +   __be16 flags;
>> +   __be16 protocols;
>> +   };
>
>
> No, you're misunderstanding what I was saying.  I was suggesting using
> "struct gre_base_hdr" here.
>
> Anyway, we can skip it.  Was trying to reflect that the gre_full_hdr was
> derived from the gre_base_hdr.

struct gre_full_hdr {
   struct gre_base_hdr {
__be16 flags;
__be16 protocols;
   } fixed_header;
   .
};
I think you mean it is this.
Sorry about misunderstanding about that always :((

And I have got the advantage now, thanks.

>
>
>
>> +   __be16 csum;
>> +   __be16 reserved1;
>> +   __be32 key;
>> +   __be32 seq;
>> +} __packed;
>> +
>>   #define GREPROTO_CISCO0
>>   #define GREPROTO_PPTP 1
>>   #define GREPROTO_MAX  2
>> diff --git a/include/net/pptp.h b/include/net/pptp.h
>> new file mode 100644
>> index 000..301d3e2
>> --- /dev/null
>> +++ b/include/net/pptp.h
>> 

Re: [PATCH v4 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-08 Thread Feng Gao
Ok, I have sent the v5 patch with two updates:
1. One is make fixed_header of gre_full_header as uname struct;
2. Use one macro instead of the key literal master htonl(0x);
The v5 link is 
https://www.mail-archive.com/netdev@vger.kernel.org/msg122261.html.

But I don't know what's advantage with the uname struct here.
There is one struct gre_base_hdr definition, why could we use the
struct directly?

Philip, could you show me more details about it please?
Thank you.


Best Regards
Feng

On Mon, Aug 8, 2016 at 11:20 PM, Philp Prindeville
<phil...@redfish-solutions.com> wrote:
> No, I was referring to anonymous structures, which is a feature of C11.
>
> Please see the link I sent.
>
>
>
> On 08/08/2016 03:13 AM, Feng Gao wrote:
>>
>> Hi Philip,
>>
>> Do you mean like the following?
>>
>> struct gre_full_hdr {
>>  struct {
>>  __be16 flags;
>>  __be16 protocol;
>>  } fixed_header;
>>  __be16 csum;
>>  __be16 reserved1;
>>  __be32 key;
>>  __be32 seq;
>> } __packed;
>>
>> But we need struct gre_base_hdr to get the fixed header of GRE in
>> function __skb_flow_dissect like the following codes.
>> struct gre_base_hdr *hdr, _hdr;
>> hdr = __skb_header_pointer(skb, nhoff, sizeof(_hdr), data, hlen, &_hdr);
>>
>> BTW, the original codes define one local stuct gre_hdr. Now I use the
>> unified struct gre_base_hdr instead of it.
>>
>> Best Regards
>> Feng
>>
>>
>> On Mon, Aug 8, 2016 at 11:27 AM, Philp Prindeville
>> <phil...@redfish-solutions.com> wrote:
>>>
>>> Feng,
>>>
>>> An anonymous structure is defined here:
>>> https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html
>>>
>>> i.e.:
>>>
>>> struct gre_full_hdr {
>>>  struct gre_base_hdr;
>>>  ...
>>>
>>> so yes, I'm talking about making fixed_header be anonymous instead.
>>>
>>> -Philip
>>>
>>>
>>>
>>> On 08/07/2016 08:50 PM, Feng Gao wrote:
>>>>
>>>> Hi Philp,
>>>>
>>>> Forgive my poor English, I am not clear about the comment #1.
>>>> "Can you make gre_base_hdr be anonymous?".
>>>>
>>>> +struct gre_full_hdr {
>>>> +   struct gre_base_hdr fixed_header;
>>>>
>>>> Do you mean make the member "fixed_header" as anonymous or not?
>>>>
>>>> Best Regards
>>>> Feng
>>>>
>>>>
>>>> On Mon, Aug 8, 2016 at 5:03 AM, Philp Prindeville
>>>> <phil...@redfish-solutions.com> wrote:
>>>>>
>>>>> Inline...
>>>>>
>>>>>
>>>>>
>>>>> On 08/04/2016 01:06 AM, f...@48lvckh6395k16k5.yundunddos.com wrote:
>>>>>>
>>>>>> From: Gao Feng <f...@ikuai8.com>
>>>>>>
>>>>>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>>>>>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>>>>>> zero. So RPS does not work for PPTP traffic.
>>>>>>
>>>>>> In my test environment, there are four MIPS cores, and all traffic
>>>>>> are passed through by PPTP. As a result, only one core is 100% busy
>>>>>> while other three cores are very idle. After this patch, the usage
>>>>>> of four cores are balanced well.
>>>>>>
>>>>>> Signed-off-by: Gao Feng <f...@ikuai8.com>
>>>>>> ---
>>>>>> v4: 1) Define struct gre_full_hdr, and use sizeof its member
>>>>>> directly;
>>>>>> 2) Move version and routing check ahead;
>>>>>> 3) Only PPTP in GRE check the ack flag;
>>>>>> v3: 1) Move struct pptp_gre_header defination into new file pptp.h
>>>>>> 2) Use sizeof GRE and PPTP type instead of literal value;
>>>>>> 3) Remove strict flag check for PPTP to robust;
>>>>>> 4) Consolidate the codes again;
>>>>>> v2: Update according to Tom and Philp's advice.
>>>>>> 1) Consolidate the codes with GRE version 0 path;
>>>>>> 2) Use PPP_PROTOCOL to get ppp protol;
>>>>>> 3) Set the FLOW_DIS_ENCAPSULATION flag;
>>>>>> v1: Intial Patc

Re: [PATCH v4 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-08 Thread Feng Gao
Hi Philip,

Do you mean like the following?

struct gre_full_hdr {
struct {
__be16 flags;
__be16 protocol;
} fixed_header;
__be16 csum;
__be16 reserved1;
__be32 key;
__be32 seq;
} __packed;

But we need struct gre_base_hdr to get the fixed header of GRE in
function __skb_flow_dissect like the following codes.
struct gre_base_hdr *hdr, _hdr;
hdr = __skb_header_pointer(skb, nhoff, sizeof(_hdr), data, hlen, &_hdr);

BTW, the original codes define one local stuct gre_hdr. Now I use the
unified struct gre_base_hdr instead of it.

Best Regards
Feng


On Mon, Aug 8, 2016 at 11:27 AM, Philp Prindeville
<phil...@redfish-solutions.com> wrote:
> Feng,
>
> An anonymous structure is defined here:
> https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html
>
> i.e.:
>
> struct gre_full_hdr {
> struct gre_base_hdr;
> ...
>
> so yes, I'm talking about making fixed_header be anonymous instead.
>
> -Philip
>
>
>
> On 08/07/2016 08:50 PM, Feng Gao wrote:
>>
>> Hi Philp,
>>
>> Forgive my poor English, I am not clear about the comment #1.
>> "Can you make gre_base_hdr be anonymous?".
>>
>> +struct gre_full_hdr {
>> +   struct gre_base_hdr fixed_header;
>>
>> Do you mean make the member "fixed_header" as anonymous or not?
>>
>> Best Regards
>> Feng
>>
>>
>> On Mon, Aug 8, 2016 at 5:03 AM, Philp Prindeville
>> <phil...@redfish-solutions.com> wrote:
>>>
>>> Inline...
>>>
>>>
>>>
>>> On 08/04/2016 01:06 AM, f...@48lvckh6395k16k5.yundunddos.com wrote:
>>>>
>>>> From: Gao Feng <f...@ikuai8.com>
>>>>
>>>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>>>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>>>> zero. So RPS does not work for PPTP traffic.
>>>>
>>>> In my test environment, there are four MIPS cores, and all traffic
>>>> are passed through by PPTP. As a result, only one core is 100% busy
>>>> while other three cores are very idle. After this patch, the usage
>>>> of four cores are balanced well.
>>>>
>>>> Signed-off-by: Gao Feng <f...@ikuai8.com>
>>>> ---
>>>>v4: 1) Define struct gre_full_hdr, and use sizeof its member
>>>> directly;
>>>>2) Move version and routing check ahead;
>>>>3) Only PPTP in GRE check the ack flag;
>>>>v3: 1) Move struct pptp_gre_header defination into new file pptp.h
>>>>2) Use sizeof GRE and PPTP type instead of literal value;
>>>>3) Remove strict flag check for PPTP to robust;
>>>>4) Consolidate the codes again;
>>>>v2: Update according to Tom and Philp's advice.
>>>>1) Consolidate the codes with GRE version 0 path;
>>>>2) Use PPP_PROTOCOL to get ppp protol;
>>>>3) Set the FLOW_DIS_ENCAPSULATION flag;
>>>>v1: Intial Patch
>>>>
>>>>drivers/net/ppp/pptp.c |  36 +
>>>>include/net/gre.h  |  10 +++-
>>>>include/net/pptp.h |  40 +++
>>>>include/uapi/linux/if_tunnel.h |   7 ++-
>>>>net/core/flow_dissector.c  | 113
>>>> -
>>>>5 files changed, 135 insertions(+), 71 deletions(-)
>>>>create mode 100644 include/net/pptp.h
>>>>
>>>> diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
>>>> index ae0905e..3e68dbc 100644
>>>> --- a/drivers/net/ppp/pptp.c
>>>> +++ b/drivers/net/ppp/pptp.c
>>>> @@ -37,6 +37,7 @@
>>>>#include 
>>>>#include 
>>>>#include 
>>>> +#include 
>>>>  #include 
>>>>@@ -53,41 +54,6 @@ static struct proto pptp_sk_proto __read_mostly;
>>>>static const struct ppp_channel_ops pptp_chan_ops;
>>>>static const struct proto_ops pptp_ops;
>>>>-#define PPP_LCP_ECHOREQ 0x09
>>>> -#define PPP_LCP_ECHOREP 0x0A
>>>> -#define SC_RCV_BITS
>>>> (SC_RCV_B7_1|SC_RCV_B7_0|SC_RCV_ODDP|SC_RCV_EVNP)
>>>> -
>>>> -#define MISSING_WINDOW 20
>>>> -#define WRAPPED(curseq, lastseq)\
>>>> -   curseq) & 0xff00) == 0) &&\
>>>> -   (((lastseq) & 0xff00) == 0xff00))
&g

Re: [PATCH v4 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-07 Thread Feng Gao
Hi Philp,

Forgive my poor English, I am not clear about the comment #1.
"Can you make gre_base_hdr be anonymous?".

+struct gre_full_hdr {
+   struct gre_base_hdr fixed_header;

Do you mean make the member "fixed_header" as anonymous or not?

Best Regards
Feng


On Mon, Aug 8, 2016 at 5:03 AM, Philp Prindeville
 wrote:
> Inline...
>
>
>
> On 08/04/2016 01:06 AM, f...@48lvckh6395k16k5.yundunddos.com wrote:
>>
>> From: Gao Feng 
>>
>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>> zero. So RPS does not work for PPTP traffic.
>>
>> In my test environment, there are four MIPS cores, and all traffic
>> are passed through by PPTP. As a result, only one core is 100% busy
>> while other three cores are very idle. After this patch, the usage
>> of four cores are balanced well.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>   v4: 1) Define struct gre_full_hdr, and use sizeof its member directly;
>>   2) Move version and routing check ahead;
>>   3) Only PPTP in GRE check the ack flag;
>>   v3: 1) Move struct pptp_gre_header defination into new file pptp.h
>>   2) Use sizeof GRE and PPTP type instead of literal value;
>>   3) Remove strict flag check for PPTP to robust;
>>   4) Consolidate the codes again;
>>   v2: Update according to Tom and Philp's advice.
>>   1) Consolidate the codes with GRE version 0 path;
>>   2) Use PPP_PROTOCOL to get ppp protol;
>>   3) Set the FLOW_DIS_ENCAPSULATION flag;
>>   v1: Intial Patch
>>
>>   drivers/net/ppp/pptp.c |  36 +
>>   include/net/gre.h  |  10 +++-
>>   include/net/pptp.h |  40 +++
>>   include/uapi/linux/if_tunnel.h |   7 ++-
>>   net/core/flow_dissector.c  | 113
>> -
>>   5 files changed, 135 insertions(+), 71 deletions(-)
>>   create mode 100644 include/net/pptp.h
>>
>> diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
>> index ae0905e..3e68dbc 100644
>> --- a/drivers/net/ppp/pptp.c
>> +++ b/drivers/net/ppp/pptp.c
>> @@ -37,6 +37,7 @@
>>   #include 
>>   #include 
>>   #include 
>> +#include 
>> #include 
>>   @@ -53,41 +54,6 @@ static struct proto pptp_sk_proto __read_mostly;
>>   static const struct ppp_channel_ops pptp_chan_ops;
>>   static const struct proto_ops pptp_ops;
>>   -#define PPP_LCP_ECHOREQ 0x09
>> -#define PPP_LCP_ECHOREP 0x0A
>> -#define SC_RCV_BITS(SC_RCV_B7_1|SC_RCV_B7_0|SC_RCV_ODDP|SC_RCV_EVNP)
>> -
>> -#define MISSING_WINDOW 20
>> -#define WRAPPED(curseq, lastseq)\
>> -   curseq) & 0xff00) == 0) &&\
>> -   (((lastseq) & 0xff00) == 0xff00))
>> -
>> -#define PPTP_GRE_PROTO  0x880B
>> -#define PPTP_GRE_VER0x1
>> -
>> -#define PPTP_GRE_FLAG_C0x80
>> -#define PPTP_GRE_FLAG_R0x40
>> -#define PPTP_GRE_FLAG_K0x20
>> -#define PPTP_GRE_FLAG_S0x10
>> -#define PPTP_GRE_FLAG_A0x80
>> -
>> -#define PPTP_GRE_IS_C(f) ((f)_GRE_FLAG_C)
>> -#define PPTP_GRE_IS_R(f) ((f)_GRE_FLAG_R)
>> -#define PPTP_GRE_IS_K(f) ((f)_GRE_FLAG_K)
>> -#define PPTP_GRE_IS_S(f) ((f)_GRE_FLAG_S)
>> -#define PPTP_GRE_IS_A(f) ((f)_GRE_FLAG_A)
>> -
>> -#define PPTP_HEADER_OVERHEAD (2+sizeof(struct pptp_gre_header))
>> -struct pptp_gre_header {
>> -   u8  flags;
>> -   u8  ver;
>> -   __be16 protocol;
>> -   __be16 payload_len;
>> -   __be16 call_id;
>> -   __be32 seq;
>> -   __be32 ack;
>> -} __packed;
>> -
>>   static struct pppox_sock *lookup_chan(u16 call_id, __be32 s_addr)
>>   {
>> struct pppox_sock *sock;
>> diff --git a/include/net/gre.h b/include/net/gre.h
>> index 7a54a31..c469dcc 100644
>> --- a/include/net/gre.h
>> +++ b/include/net/gre.h
>> @@ -7,9 +7,17 @@
>>   struct gre_base_hdr {
>> __be16 flags;
>> __be16 protocol;
>> -};
>> +} __packed;
>>   #define GRE_HEADER_SECTION 4
>>   +struct gre_full_hdr {
>> +   struct gre_base_hdr fixed_header;
>
>
> Can you make gre_base_hdr be anonymous?
>
>
>
>> +   __be16 csum;
>> +   __be16 reserved1;
>> +   __be32 key;
>> +   __be32 seq;
>> +} __packed;
>> +
>>   #define GREPROTO_CISCO0
>>   #define GREPROTO_PPTP 1
>>   #define GREPROTO_MAX  2
>> diff --git a/include/net/pptp.h b/include/net/pptp.h
>> new file mode 100644
>> index 000..301d3e2
>> --- /dev/null
>> +++ b/include/net/pptp.h
>> @@ -0,0 +1,40 @@
>> +#ifndef _NET_PPTP_H
>> +#define _NET_PPTP_H
>> +
>> +#define PPP_LCP_ECHOREQ 0x09
>> +#define PPP_LCP_ECHOREP 0x0A
>> +#define SC_RCV_BITS (SC_RCV_B7_1|SC_RCV_B7_0|SC_RCV_ODDP|SC_RCV_EVNP)
>> +
>> +#define MISSING_WINDOW 20
>> +#define WRAPPED(curseq, lastseq)\
>> +   curseq) & 0xff00) == 0) &&\
>> +   (((lastseq) & 0xff00) == 0xff00))
>> +
>> +#define PPTP_GRE_PROTO  0x880B
>> +#define PPTP_GRE_VER0x1
>> +
>> +#define 

Re: [PATCH v3 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-04 Thread Feng Gao
The link is https://patchwork.ozlabs.org/patch/655695/

Best Regards
Feng

On Thu, Aug 4, 2016 at 3:37 PM, Feng Gao <gfree.w...@gmail.com> wrote:
> Hi Tom & Philp,
>
> The v4 patch is sent already.
> Could you help review again please?
>
> Tom,
> I follow your modification.
>
> Philp,
> I define one new struct gre_full_hdr which contains the completed gre
> header, for example, csum, key, and so on.
> And these members are not defined in gre_base_hdr.
> It is only used to offset the sizeof type.
>
> BTW, I find the struct and macro about pptp and gre  are redundant.
> I want to refactor them in other patches.
>
> On Thu, Aug 4, 2016 at 8:33 AM, Philp Prindeville
> <phil...@redfish-solutions.com> wrote:
>> Inline
>>
>>
>>
>> On 08/03/2016 05:58 PM, Feng Gao wrote:
>>>
>>> inline comment.
>>> There are two comments that I am not clear.
>>>
>>> Best Regards
>>> Feng
>>>
>>> On Thu, Aug 4, 2016 at 4:43 AM, Philip Prindeville
>>> <phil...@redfish-solutions.com> wrote:
>>>>
>>>> Inline…
>>>>
>>>>> On Aug 3, 2016, at 8:52 AM, f...@48lvckh6395k16k5.yundunddos.com wrote:
>>>>>
>>>>> From: Gao Feng <f...@ikuai8.com>
>>>>>
>>>>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>>>>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>>>>> zero. So RPS does not work for PPTP traffic.
>>>>>
>>>>> In my test environment, there are four MIPS cores, and all traffic
>>>>> are passed through by PPTP. As a result, only one core is 100% busy
>>>>> while other three cores are very idle. After this patch, the usage
>>>>> of four cores are balanced well.
>>>>>
>>>>> Signed-off-by: Gao Feng <f...@ikuai8.com>
>>>>> ---
>>>>> v3: 1) Move struct pptp_gre_header defination into new file pptp.h
>>>>>  2) Use sizeof GRE and PPTP type instead of literal value;
>>>>>  3) Remove strict flag check for PPTP to robust;
>>>>>  4) Consolidate the codes again;
>>>>> v2: Update according to Tom and Philp's advice.
>>>>>  1) Consolidate the codes with GRE version 0 path;
>>>>>  2) Use PPP_PROTOCOL to get ppp protol;
>>>>>  3) Set the FLOW_DIS_ENCAPSULATION flag;
>>>>> v1: Intial patch
>>>>>
>>>>> drivers/net/ppp/pptp.c |  36 +--
>>>>> include/net/pptp.h |  40 
>>>>> include/uapi/linux/if_tunnel.h |   7 +-
>>>>> net/core/flow_dissector.c  | 141
>>>>> +
>>>>> 4 files changed, 134 insertions(+), 90 deletions(-)
>>>>> create mode 100644 include/net/pptp.h
>>>>>
>>>>> diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
>>>>> index ae0905e..3e68dbc 100644
>>>>> --- a/drivers/net/ppp/pptp.c
>>>>> +++ b/drivers/net/ppp/pptp.c
>>>>> @@ -37,6 +37,7 @@
>>>>> #include 
>>>>> #include 
>>>>> #include 
>>>>> +#include 
>>>>>
>>>>> #include 
>>>>>
>>>>> @@ -53,41 +54,6 @@ static struct proto pptp_sk_proto __read_mostly;
>>>>> static const struct ppp_channel_ops pptp_chan_ops;
>>>>> static const struct proto_ops pptp_ops;
>>>>>
>>>>> -#define PPP_LCP_ECHOREQ 0x09
>>>>> -#define PPP_LCP_ECHOREP 0x0A
>>>>> -#define SC_RCV_BITS  (SC_RCV_B7_1|SC_RCV_B7_0|SC_RCV_ODDP|SC_RCV_EVNP)
>>>>> -
>>>>> -#define MISSING_WINDOW 20
>>>>> -#define WRAPPED(curseq, lastseq)\
>>>>> - curseq) & 0xff00) == 0) &&\
>>>>> - (((lastseq) & 0xff00) == 0xff00))
>>>>> -
>>>>> -#define PPTP_GRE_PROTO  0x880B
>>>>> -#define PPTP_GRE_VER0x1
>>>>> -
>>>>> -#define PPTP_GRE_FLAG_C  0x80
>>>>> -#define PPTP_GRE_FLAG_R  0x40
>>>>> -#define PPTP_GRE_FLAG_K  0x20
>>>>> -#define PPTP_GRE_FLAG_S  0x10
>>>>> -#define PPTP_GRE_FLAG_A  0x80
>>>>> -
>>>>> -#define PPTP_GRE_IS_C(f) ((f)_GRE_FLAG_C)
>>>>> -#define PPTP_GRE_IS_R(f) ((f)_GRE_F

Re: [PATCH v3 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-04 Thread Feng Gao
Hi Tom & Philp,

The v4 patch is sent already.
Could you help review again please?

Tom,
I follow your modification.

Philp,
I define one new struct gre_full_hdr which contains the completed gre
header, for example, csum, key, and so on.
And these members are not defined in gre_base_hdr.
It is only used to offset the sizeof type.

BTW, I find the struct and macro about pptp and gre  are redundant.
I want to refactor them in other patches.

On Thu, Aug 4, 2016 at 8:33 AM, Philp Prindeville
<phil...@redfish-solutions.com> wrote:
> Inline
>
>
>
> On 08/03/2016 05:58 PM, Feng Gao wrote:
>>
>> inline comment.
>> There are two comments that I am not clear.
>>
>> Best Regards
>> Feng
>>
>> On Thu, Aug 4, 2016 at 4:43 AM, Philip Prindeville
>> <phil...@redfish-solutions.com> wrote:
>>>
>>> Inline…
>>>
>>>> On Aug 3, 2016, at 8:52 AM, f...@48lvckh6395k16k5.yundunddos.com wrote:
>>>>
>>>> From: Gao Feng <f...@ikuai8.com>
>>>>
>>>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>>>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>>>> zero. So RPS does not work for PPTP traffic.
>>>>
>>>> In my test environment, there are four MIPS cores, and all traffic
>>>> are passed through by PPTP. As a result, only one core is 100% busy
>>>> while other three cores are very idle. After this patch, the usage
>>>> of four cores are balanced well.
>>>>
>>>> Signed-off-by: Gao Feng <f...@ikuai8.com>
>>>> ---
>>>> v3: 1) Move struct pptp_gre_header defination into new file pptp.h
>>>>  2) Use sizeof GRE and PPTP type instead of literal value;
>>>>  3) Remove strict flag check for PPTP to robust;
>>>>  4) Consolidate the codes again;
>>>> v2: Update according to Tom and Philp's advice.
>>>>  1) Consolidate the codes with GRE version 0 path;
>>>>  2) Use PPP_PROTOCOL to get ppp protol;
>>>>  3) Set the FLOW_DIS_ENCAPSULATION flag;
>>>> v1: Intial patch
>>>>
>>>> drivers/net/ppp/pptp.c |  36 +--
>>>> include/net/pptp.h |  40 
>>>> include/uapi/linux/if_tunnel.h |   7 +-
>>>> net/core/flow_dissector.c  | 141
>>>> +
>>>> 4 files changed, 134 insertions(+), 90 deletions(-)
>>>> create mode 100644 include/net/pptp.h
>>>>
>>>> diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
>>>> index ae0905e..3e68dbc 100644
>>>> --- a/drivers/net/ppp/pptp.c
>>>> +++ b/drivers/net/ppp/pptp.c
>>>> @@ -37,6 +37,7 @@
>>>> #include 
>>>> #include 
>>>> #include 
>>>> +#include 
>>>>
>>>> #include 
>>>>
>>>> @@ -53,41 +54,6 @@ static struct proto pptp_sk_proto __read_mostly;
>>>> static const struct ppp_channel_ops pptp_chan_ops;
>>>> static const struct proto_ops pptp_ops;
>>>>
>>>> -#define PPP_LCP_ECHOREQ 0x09
>>>> -#define PPP_LCP_ECHOREP 0x0A
>>>> -#define SC_RCV_BITS  (SC_RCV_B7_1|SC_RCV_B7_0|SC_RCV_ODDP|SC_RCV_EVNP)
>>>> -
>>>> -#define MISSING_WINDOW 20
>>>> -#define WRAPPED(curseq, lastseq)\
>>>> - curseq) & 0xff00) == 0) &&\
>>>> - (((lastseq) & 0xff00) == 0xff00))
>>>> -
>>>> -#define PPTP_GRE_PROTO  0x880B
>>>> -#define PPTP_GRE_VER0x1
>>>> -
>>>> -#define PPTP_GRE_FLAG_C  0x80
>>>> -#define PPTP_GRE_FLAG_R  0x40
>>>> -#define PPTP_GRE_FLAG_K  0x20
>>>> -#define PPTP_GRE_FLAG_S  0x10
>>>> -#define PPTP_GRE_FLAG_A  0x80
>>>> -
>>>> -#define PPTP_GRE_IS_C(f) ((f)_GRE_FLAG_C)
>>>> -#define PPTP_GRE_IS_R(f) ((f)_GRE_FLAG_R)
>>>> -#define PPTP_GRE_IS_K(f) ((f)_GRE_FLAG_K)
>>>> -#define PPTP_GRE_IS_S(f) ((f)_GRE_FLAG_S)
>>>> -#define PPTP_GRE_IS_A(f) ((f)_GRE_FLAG_A)
>>>> -
>>>> -#define PPTP_HEADER_OVERHEAD (2+sizeof(struct pptp_gre_header))
>>>> -struct pptp_gre_header {
>>>> - u8  flags;
>>>> - u8  ver;
>>>> - __be16 protocol;
>>>> - __be16 payload_len;
>>>> - __be16 call_id;
>>>> - __be32 seq;
>>>> - __be32 ack;
>&

Re: [PATCH v3 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-03 Thread Feng Gao
inline comment.
There are two comments that I am not clear.

Best Regards
Feng

On Thu, Aug 4, 2016 at 4:43 AM, Philip Prindeville
 wrote:
> Inline…
>
>> On Aug 3, 2016, at 8:52 AM, f...@48lvckh6395k16k5.yundunddos.com wrote:
>>
>> From: Gao Feng 
>>
>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>> zero. So RPS does not work for PPTP traffic.
>>
>> In my test environment, there are four MIPS cores, and all traffic
>> are passed through by PPTP. As a result, only one core is 100% busy
>> while other three cores are very idle. After this patch, the usage
>> of four cores are balanced well.
>>
>> Signed-off-by: Gao Feng 
>> ---
>> v3: 1) Move struct pptp_gre_header defination into new file pptp.h
>> 2) Use sizeof GRE and PPTP type instead of literal value;
>> 3) Remove strict flag check for PPTP to robust;
>> 4) Consolidate the codes again;
>> v2: Update according to Tom and Philp's advice.
>> 1) Consolidate the codes with GRE version 0 path;
>> 2) Use PPP_PROTOCOL to get ppp protol;
>> 3) Set the FLOW_DIS_ENCAPSULATION flag;
>> v1: Intial patch
>>
>> drivers/net/ppp/pptp.c |  36 +--
>> include/net/pptp.h |  40 
>> include/uapi/linux/if_tunnel.h |   7 +-
>> net/core/flow_dissector.c  | 141 
>> +
>> 4 files changed, 134 insertions(+), 90 deletions(-)
>> create mode 100644 include/net/pptp.h
>>
>> diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
>> index ae0905e..3e68dbc 100644
>> --- a/drivers/net/ppp/pptp.c
>> +++ b/drivers/net/ppp/pptp.c
>> @@ -37,6 +37,7 @@
>> #include 
>> #include 
>> #include 
>> +#include 
>>
>> #include 
>>
>> @@ -53,41 +54,6 @@ static struct proto pptp_sk_proto __read_mostly;
>> static const struct ppp_channel_ops pptp_chan_ops;
>> static const struct proto_ops pptp_ops;
>>
>> -#define PPP_LCP_ECHOREQ 0x09
>> -#define PPP_LCP_ECHOREP 0x0A
>> -#define SC_RCV_BITS  (SC_RCV_B7_1|SC_RCV_B7_0|SC_RCV_ODDP|SC_RCV_EVNP)
>> -
>> -#define MISSING_WINDOW 20
>> -#define WRAPPED(curseq, lastseq)\
>> - curseq) & 0xff00) == 0) &&\
>> - (((lastseq) & 0xff00) == 0xff00))
>> -
>> -#define PPTP_GRE_PROTO  0x880B
>> -#define PPTP_GRE_VER0x1
>> -
>> -#define PPTP_GRE_FLAG_C  0x80
>> -#define PPTP_GRE_FLAG_R  0x40
>> -#define PPTP_GRE_FLAG_K  0x20
>> -#define PPTP_GRE_FLAG_S  0x10
>> -#define PPTP_GRE_FLAG_A  0x80
>> -
>> -#define PPTP_GRE_IS_C(f) ((f)_GRE_FLAG_C)
>> -#define PPTP_GRE_IS_R(f) ((f)_GRE_FLAG_R)
>> -#define PPTP_GRE_IS_K(f) ((f)_GRE_FLAG_K)
>> -#define PPTP_GRE_IS_S(f) ((f)_GRE_FLAG_S)
>> -#define PPTP_GRE_IS_A(f) ((f)_GRE_FLAG_A)
>> -
>> -#define PPTP_HEADER_OVERHEAD (2+sizeof(struct pptp_gre_header))
>> -struct pptp_gre_header {
>> - u8  flags;
>> - u8  ver;
>> - __be16 protocol;
>> - __be16 payload_len;
>> - __be16 call_id;
>> - __be32 seq;
>> - __be32 ack;
>> -} __packed;
>> -
>> static struct pppox_sock *lookup_chan(u16 call_id, __be32 s_addr)
>> {
>>   struct pppox_sock *sock;
>> diff --git a/include/net/pptp.h b/include/net/pptp.h
>> new file mode 100644
>> index 000..301d3e2
>> --- /dev/null
>> +++ b/include/net/pptp.h
>> @@ -0,0 +1,40 @@
>> +#ifndef _NET_PPTP_H
>> +#define _NET_PPTP_H
>> +
>> +#define PPP_LCP_ECHOREQ 0x09
>> +#define PPP_LCP_ECHOREP 0x0A
>> +#define SC_RCV_BITS (SC_RCV_B7_1|SC_RCV_B7_0|SC_RCV_ODDP|SC_RCV_EVNP)
>> +
>> +#define MISSING_WINDOW 20
>> +#define WRAPPED(curseq, lastseq)\
>> + curseq) & 0xff00) == 0) &&\
>> + (((lastseq) & 0xff00) == 0xff00))
>> +
>> +#define PPTP_GRE_PROTO  0x880B
>> +#define PPTP_GRE_VER0x1
>
> What about macros for accessing the lower 3 bits of the version?

There is already one macro "GRE_VERSION" as the mask to get version.

>
>
>> +
>> +#define PPTP_GRE_FLAG_C 0x80
>> +#define PPTP_GRE_FLAG_R 0x40
>> +#define PPTP_GRE_FLAG_K 0x20
>> +#define PPTP_GRE_FLAG_S 0x10
>> +#define PPTP_GRE_FLAG_A 0x80
>> +
>> +#define PPTP_GRE_IS_C(f) ((f)_GRE_FLAG_C)
>> +#define PPTP_GRE_IS_R(f) ((f)_GRE_FLAG_R)
>> +#define PPTP_GRE_IS_K(f) ((f)_GRE_FLAG_K)
>> +#define PPTP_GRE_IS_S(f) ((f)_GRE_FLAG_S)
>> +#define PPTP_GRE_IS_A(f) ((f)_GRE_FLAG_A)
>> +
>> +#define PPTP_HEADER_OVERHEAD (2+sizeof(struct pptp_gre_header))
>> +struct pptp_gre_header {
>> + u8  flags;
>> + u8  ver;
>> + __be16 protocol;
>> + __be16 payload_len;
>> + __be16 call_id;
>> + __be32 seq;
>> + __be32 ack;
>> +} __packed;
>
>
> What about a definition of a V0 (RFC-1701) packet?  We’re handling both, so 
> it makes sense to define both.

I don't get you. The struct "gre_base_hdr" is defined in gre.h. Do you
mean define them in same file ?

>
>
>> +
>> +
>> +#endif
>> diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
>> index 1046f55..7d889db 

Re: [PATCH v3 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-03 Thread Feng Gao
Hi Tom,

inline comments

On Thu, Aug 4, 2016 at 12:15 AM, Tom Herbert  wrote:
> On Wed, Aug 3, 2016 at 7:52 AM,   wrote:
>> From: Gao Feng 
>>
>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>> zero. So RPS does not work for PPTP traffic.
>>
>> In my test environment, there are four MIPS cores, and all traffic
>> are passed through by PPTP. As a result, only one core is 100% busy
>> while other three cores are very idle. After this patch, the usage
>> of four cores are balanced well.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>  v3: 1) Move struct pptp_gre_header defination into new file pptp.h
>>  2) Use sizeof GRE and PPTP type instead of literal value;
>>  3) Remove strict flag check for PPTP to robust;
>>  4) Consolidate the codes again;
>>  v2: Update according to Tom and Philp's advice.
>>  1) Consolidate the codes with GRE version 0 path;
>>  2) Use PPP_PROTOCOL to get ppp protol;
>>  3) Set the FLOW_DIS_ENCAPSULATION flag;
>>  v1: Intial patch
>>
>>  drivers/net/ppp/pptp.c |  36 +--
>>  include/net/pptp.h |  40 
>>  include/uapi/linux/if_tunnel.h |   7 +-
>>  net/core/flow_dissector.c  | 141 
>> +
>>  4 files changed, 134 insertions(+), 90 deletions(-)
>>  create mode 100644 include/net/pptp.h
>>
>> diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
>> index ae0905e..3e68dbc 100644
>> --- a/drivers/net/ppp/pptp.c
>> +++ b/drivers/net/ppp/pptp.c
>> @@ -37,6 +37,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>
>>  #include 
>>
>> @@ -53,41 +54,6 @@ static struct proto pptp_sk_proto __read_mostly;
>>  static const struct ppp_channel_ops pptp_chan_ops;
>>  static const struct proto_ops pptp_ops;
>>
>> -#define PPP_LCP_ECHOREQ 0x09
>> -#define PPP_LCP_ECHOREP 0x0A
>> -#define SC_RCV_BITS(SC_RCV_B7_1|SC_RCV_B7_0|SC_RCV_ODDP|SC_RCV_EVNP)
>> -
>> -#define MISSING_WINDOW 20
>> -#define WRAPPED(curseq, lastseq)\
>> -   curseq) & 0xff00) == 0) &&\
>> -   (((lastseq) & 0xff00) == 0xff00))
>> -
>> -#define PPTP_GRE_PROTO  0x880B
>> -#define PPTP_GRE_VER0x1
>> -
>> -#define PPTP_GRE_FLAG_C0x80
>> -#define PPTP_GRE_FLAG_R0x40
>> -#define PPTP_GRE_FLAG_K0x20
>> -#define PPTP_GRE_FLAG_S0x10
>> -#define PPTP_GRE_FLAG_A0x80
>> -
>> -#define PPTP_GRE_IS_C(f) ((f)_GRE_FLAG_C)
>> -#define PPTP_GRE_IS_R(f) ((f)_GRE_FLAG_R)
>> -#define PPTP_GRE_IS_K(f) ((f)_GRE_FLAG_K)
>> -#define PPTP_GRE_IS_S(f) ((f)_GRE_FLAG_S)
>> -#define PPTP_GRE_IS_A(f) ((f)_GRE_FLAG_A)
>> -
>> -#define PPTP_HEADER_OVERHEAD (2+sizeof(struct pptp_gre_header))
>> -struct pptp_gre_header {
>> -   u8  flags;
>> -   u8  ver;
>> -   __be16 protocol;
>> -   __be16 payload_len;
>> -   __be16 call_id;
>> -   __be32 seq;
>> -   __be32 ack;
>> -} __packed;
>> -
>>  static struct pppox_sock *lookup_chan(u16 call_id, __be32 s_addr)
>>  {
>> struct pppox_sock *sock;
>> diff --git a/include/net/pptp.h b/include/net/pptp.h
>> new file mode 100644
>> index 000..301d3e2
>> --- /dev/null
>> +++ b/include/net/pptp.h
>> @@ -0,0 +1,40 @@
>> +#ifndef _NET_PPTP_H
>> +#define _NET_PPTP_H
>> +
>> +#define PPP_LCP_ECHOREQ 0x09
>> +#define PPP_LCP_ECHOREP 0x0A
>> +#define SC_RCV_BITS (SC_RCV_B7_1|SC_RCV_B7_0|SC_RCV_ODDP|SC_RCV_EVNP)
>> +
>> +#define MISSING_WINDOW 20
>> +#define WRAPPED(curseq, lastseq)\
>> +   curseq) & 0xff00) == 0) &&\
>> +   (((lastseq) & 0xff00) == 0xff00))
>> +
>> +#define PPTP_GRE_PROTO  0x880B
>> +#define PPTP_GRE_VER0x1
>> +
>> +#define PPTP_GRE_FLAG_C 0x80
>> +#define PPTP_GRE_FLAG_R 0x40
>> +#define PPTP_GRE_FLAG_K 0x20
>> +#define PPTP_GRE_FLAG_S 0x10
>> +#define PPTP_GRE_FLAG_A 0x80
>> +
>> +#define PPTP_GRE_IS_C(f) ((f)_GRE_FLAG_C)
>> +#define PPTP_GRE_IS_R(f) ((f)_GRE_FLAG_R)
>> +#define PPTP_GRE_IS_K(f) ((f)_GRE_FLAG_K)
>> +#define PPTP_GRE_IS_S(f) ((f)_GRE_FLAG_S)
>> +#define PPTP_GRE_IS_A(f) ((f)_GRE_FLAG_A)
>> +
>> +#define PPTP_HEADER_OVERHEAD (2+sizeof(struct pptp_gre_header))
>> +struct pptp_gre_header {
>> +   u8  flags;
>> +   u8  ver;
>> +   __be16 protocol;
>> +   __be16 payload_len;
>> +   __be16 call_id;
>> +   __be32 seq;
>> +   __be32 ack;
>> +} __packed;
>> +
>> +
>> +#endif
>> diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
>> index 1046f55..7d889db 100644
>> --- a/include/uapi/linux/if_tunnel.h
>> +++ b/include/uapi/linux/if_tunnel.h
>> @@ -24,9 +24,14 @@
>>  #define GRE_SEQ__cpu_to_be16(0x1000)
>>  #define GRE_STRICT __cpu_to_be16(0x0800)
>>  #define GRE_REC__cpu_to_be16(0x0700)
>> -#define GRE_FLAGS  __cpu_to_be16(0x00F8)
>> +#define GRE_ACK__cpu_to_be16(0x0080)
>> +#define GRE_FLAGS 

Re: [PATCH 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-03 Thread Feng Gao
Hi Philip & Tom,

I have sent the v3 patch now.

Philip,
I just move the definition of pptp_gre_header into new file
include/net/pptp.h without refactoring the pptp codes now.
I think the refactor should be done in another patch.

Tom,
Now I have consolidate the PPTP codes with GRE codes together.

Thanks both of you :-))

On Wed, Aug 3, 2016 at 12:01 PM, Tom Herbert <t...@herbertland.com> wrote:
> On Tue, Aug 2, 2016 at 7:17 PM, Feng Gao <gfree.w...@gmail.com> wrote:
>> Thanks Tom's detail explanation.
>> It is my misinterpret because my mother language is not English.
>>
> No, the RFC is somewhat poorly worded with respect to setting the
> version number. The fact that flag bit must be set to zero on
> transmit, but can be processed on received is permitted by
> "legalistic" interpretation of an IETF RFC :-)
>
> Tom
>
>> Thanks again:))
>>
>> On Wed, Aug 3, 2016 at 9:59 AM, Tom Herbert <t...@herbertland.com> wrote:
>>> On Tue, Aug 2, 2016 at 5:28 PM, Feng Gao <gfree.w...@gmail.com> wrote:
>>>> Hi Tom.
>>>>
>>>> I quote some description in RFC 2637.
>>>>
>>>> /*/
>>>>C
>>>>   (Bit 0) Checksum Present.  Set to zero (0).
>>>>
>>> Note that the requirement is "*Set* to zero"; it is not must check to
>>> be zero on reception. Per the robustness principle (part about be
>>> liberal in what you receive) it is allowable to process the flag if
>>> the bit is set set. In practice this is nice to have because we know
>>> that overloading fields in GRE (seq, csum in particular) with private
>>> data has been done before. Consolidating makes for cleaner code, and
>>> if someone is really violating the protocol the worst thing that would
>>> happen it that they would just get suboptimal RPS for those packets.
>>>
>>>>R
>>>>   (Bit 1) Routing Present.  Set to zero (0).
>>>>
>>>>K
>>>>   (Bit 2) Key Present.  Set to one (1).
>>>>S
>>>>   (Bit 3) Sequence Number Present.  Set to one (1) if a payload
>>>>   (data) packet is present.  Set to zero (0) if payload is not
>>>>   present (GRE packet is an Acknowledgment only).
>>>>
>>>>s
>>>>   (Bit 4) Strict source route present.  Set to zero (0).
>>>>
>>>>Recur
>>>>   (Bits 5-7) Recursion control.  Set to zero (0).
>>>>
>>>>A
>>>>   (Bit 8) Acknowledgment sequence number present.  Set to one (1) if
>>>>   packet contains Acknowledgment Number to be used for acknowledging
>>>>   previously transmitted data.
>>>>
>>>>Flags
>>>>   (Bits 9-12) Must be set to zero (0).
>>>>
>>>>Ver
>>>>   (Bits 13-15) Must contain 1 (enhanced GRE).
>>>>
>>>>Protocol Type
>>>>   Set to hex 880B [8].
>>>>
>>>>Key
>>>>   Use of the Key field is up to the implementation.  PPTP uses it as
>>>>   follows:
>>>>  Payload Length
>>>> (High 2 octets of Key) Size of the payload, not including
>>>> the GRE header
>>>>
>>>>  Call ID
>>>> (Low 2 octets) Contains the Peer's Call ID for the session
>>>> to which this packet belongs.
>>>> /*/
>>>>
>>>> Just a reminder, the version description is "contain 1", not "set 1".
>>>>
>>> I think your misinterpreting that (at least I hope :-) ). "Contains 1"
>>> should be that the contents of the Ver field is the value of "1".
>>> Otherwise, the RFC has appropriated half of the eight possible
>>> versions to be enhanced GRE.
>>>
>>> Tom
>>>
>>>> Best Regards
>>>> Feng
>>>>
>>>> On Wed, Aug 3, 2016 at 4:35 AM, Tom Herbert <t...@herbertland.com> wrote:
>>>>> On Thu, Jul 28, 2016 at 12:14 AM,  <f...@ikuai8.com> wrote:
>>>>>> From: Gao Feng <f...@ikuai8.com>
>>>>>>
>>>>>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>>>>>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>&g

Re: [PATCH 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-02 Thread Feng Gao
Thanks Tom's detail explanation.
It is my misinterpret because my mother language is not English.

Thanks again:))

On Wed, Aug 3, 2016 at 9:59 AM, Tom Herbert <t...@herbertland.com> wrote:
> On Tue, Aug 2, 2016 at 5:28 PM, Feng Gao <gfree.w...@gmail.com> wrote:
>> Hi Tom.
>>
>> I quote some description in RFC 2637.
>>
>> /*/
>>C
>>   (Bit 0) Checksum Present.  Set to zero (0).
>>
> Note that the requirement is "*Set* to zero"; it is not must check to
> be zero on reception. Per the robustness principle (part about be
> liberal in what you receive) it is allowable to process the flag if
> the bit is set set. In practice this is nice to have because we know
> that overloading fields in GRE (seq, csum in particular) with private
> data has been done before. Consolidating makes for cleaner code, and
> if someone is really violating the protocol the worst thing that would
> happen it that they would just get suboptimal RPS for those packets.
>
>>R
>>   (Bit 1) Routing Present.  Set to zero (0).
>>
>>K
>>   (Bit 2) Key Present.  Set to one (1).
>>S
>>   (Bit 3) Sequence Number Present.  Set to one (1) if a payload
>>   (data) packet is present.  Set to zero (0) if payload is not
>>   present (GRE packet is an Acknowledgment only).
>>
>>s
>>   (Bit 4) Strict source route present.  Set to zero (0).
>>
>>Recur
>>   (Bits 5-7) Recursion control.  Set to zero (0).
>>
>>A
>>   (Bit 8) Acknowledgment sequence number present.  Set to one (1) if
>>   packet contains Acknowledgment Number to be used for acknowledging
>>   previously transmitted data.
>>
>>Flags
>>   (Bits 9-12) Must be set to zero (0).
>>
>>Ver
>>   (Bits 13-15) Must contain 1 (enhanced GRE).
>>
>>Protocol Type
>>   Set to hex 880B [8].
>>
>>Key
>>   Use of the Key field is up to the implementation.  PPTP uses it as
>>   follows:
>>  Payload Length
>> (High 2 octets of Key) Size of the payload, not including
>> the GRE header
>>
>>  Call ID
>> (Low 2 octets) Contains the Peer's Call ID for the session
>> to which this packet belongs.
>> /*/
>>
>> Just a reminder, the version description is "contain 1", not "set 1".
>>
> I think your misinterpreting that (at least I hope :-) ). "Contains 1"
> should be that the contents of the Ver field is the value of "1".
> Otherwise, the RFC has appropriated half of the eight possible
> versions to be enhanced GRE.
>
> Tom
>
>> Best Regards
>> Feng
>>
>> On Wed, Aug 3, 2016 at 4:35 AM, Tom Herbert <t...@herbertland.com> wrote:
>>> On Thu, Jul 28, 2016 at 12:14 AM,  <f...@ikuai8.com> wrote:
>>>> From: Gao Feng <f...@ikuai8.com>
>>>>
>>>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>>>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>>>> zero. So RPS does not work for PPTP traffic.
>>>>
>>>> In my test environment, there are four MIPS cores, and all traffic
>>>> are passed through by PPTP. As a result, only one core is 100% busy
>>>> while other three cores are very idle. After this patch, the usage
>>>> of four cores are balanced well.
>>>>
>>>> Signed-off-by: Gao Feng <f...@ikuai8.com>
>>>> ---
>>>>  v2: Update according to Tom and Philp's advice.
>>>>  1) Consolidate the codes with GRE version 0 path;
>>>>  2) Use PPP_PROTOCOL to get ppp protol;
>>>>  3) Set the FLOW_DIS_ENCAPSULATION flag;
>>>>  v1: Initial patch
>>>>
>>>>  include/uapi/linux/if_tunnel.h |   5 +-
>>>>  net/core/flow_dissector.c  | 146 
>>>> ++---
>>>>  2 files changed, 97 insertions(+), 54 deletions(-)
>>>>
>>>> diff --git a/include/uapi/linux/if_tunnel.h 
>>>> b/include/uapi/linux/if_tunnel.h
>>>> index 1046f55..dda4e4b 100644
>>>> --- a/include/uapi/linux/if_tunnel.h
>>>> +++ b/include/uapi/linux/if_tunnel.h
>>>> @@ -24,9 +24,12 @@
>>>>  #define GRE_SEQ__cpu_to_be16(0x1000)
>>>

Re: [PATCH 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-02 Thread Feng Gao
Hi Tom.

I quote some description in RFC 2637.

/*/
   C
  (Bit 0) Checksum Present.  Set to zero (0).

   R
  (Bit 1) Routing Present.  Set to zero (0).

   K
  (Bit 2) Key Present.  Set to one (1).
   S
  (Bit 3) Sequence Number Present.  Set to one (1) if a payload
  (data) packet is present.  Set to zero (0) if payload is not
  present (GRE packet is an Acknowledgment only).

   s
  (Bit 4) Strict source route present.  Set to zero (0).

   Recur
  (Bits 5-7) Recursion control.  Set to zero (0).

   A
  (Bit 8) Acknowledgment sequence number present.  Set to one (1) if
  packet contains Acknowledgment Number to be used for acknowledging
  previously transmitted data.

   Flags
  (Bits 9-12) Must be set to zero (0).

   Ver
  (Bits 13-15) Must contain 1 (enhanced GRE).

   Protocol Type
  Set to hex 880B [8].

   Key
  Use of the Key field is up to the implementation.  PPTP uses it as
  follows:
 Payload Length
(High 2 octets of Key) Size of the payload, not including
the GRE header

 Call ID
(Low 2 octets) Contains the Peer's Call ID for the session
to which this packet belongs.
/*/

Just a reminder, the version description is "contain 1", not "set 1".

Best Regards
Feng

On Wed, Aug 3, 2016 at 4:35 AM, Tom Herbert  wrote:
> On Thu, Jul 28, 2016 at 12:14 AM,   wrote:
>> From: Gao Feng 
>>
>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>> zero. So RPS does not work for PPTP traffic.
>>
>> In my test environment, there are four MIPS cores, and all traffic
>> are passed through by PPTP. As a result, only one core is 100% busy
>> while other three cores are very idle. After this patch, the usage
>> of four cores are balanced well.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>  v2: Update according to Tom and Philp's advice.
>>  1) Consolidate the codes with GRE version 0 path;
>>  2) Use PPP_PROTOCOL to get ppp protol;
>>  3) Set the FLOW_DIS_ENCAPSULATION flag;
>>  v1: Initial patch
>>
>>  include/uapi/linux/if_tunnel.h |   5 +-
>>  net/core/flow_dissector.c  | 146 
>> ++---
>>  2 files changed, 97 insertions(+), 54 deletions(-)
>>
>> diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
>> index 1046f55..dda4e4b 100644
>> --- a/include/uapi/linux/if_tunnel.h
>> +++ b/include/uapi/linux/if_tunnel.h
>> @@ -24,9 +24,12 @@
>>  #define GRE_SEQ__cpu_to_be16(0x1000)
>>  #define GRE_STRICT __cpu_to_be16(0x0800)
>>  #define GRE_REC__cpu_to_be16(0x0700)
>> -#define GRE_FLAGS  __cpu_to_be16(0x00F8)
>> +#define GRE_ACK__cpu_to_be16(0x0080)
>> +#define GRE_FLAGS  __cpu_to_be16(0x0078)
>>  #define GRE_VERSION__cpu_to_be16(0x0007)
>>
>> +#define GRE_PROTO_PPP  __cpu_to_be16(0x880b)
>> +
>>  struct ip_tunnel_parm {
>> charname[IFNAMSIZ];
>> int link;
>> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>> index 61ad43f..33e957b 100644
>> --- a/net/core/flow_dissector.c
>> +++ b/net/core/flow_dissector.c
>> @@ -346,63 +346,103 @@ ip_proto_again:
>> hdr = __skb_header_pointer(skb, nhoff, sizeof(_hdr), data, 
>> hlen, &_hdr);
>> if (!hdr)
>> goto out_bad;
>> -   /*
>> -* Only look inside GRE if version zero and no
>> -* routing
>> -*/
>> -   if (hdr->flags & (GRE_VERSION | GRE_ROUTING))
>> -   break;
>> -
>> -   proto = hdr->proto;
>> -   nhoff += 4;
>> -   if (hdr->flags & GRE_CSUM)
>> -   nhoff += 4;
>> -   if (hdr->flags & GRE_KEY) {
>> -   const __be32 *keyid;
>> -   __be32 _keyid;
>> -
>> -   keyid = __skb_header_pointer(skb, nhoff, 
>> sizeof(_keyid),
>> -data, hlen, &_keyid);
>>
>> -   if (!keyid)
>> -   goto out_bad;
>> +   /* Only look inside GRE without routing */
>> +   if (!(hdr->flags & GRE_ROUTING)) {
>> +   proto = hdr->proto;
>> +
>> +   if (hdr->flags & GRE_VERSION) {
>
> This matches any non-zero GRE version (1-7). Is that really what you want?
>
>> +   /* It should be the PPTP in GRE */
>> +   u8 _ppp_hdr[PPP_HDRLEN];
>> +   u8 *ppp_hdr;
>> +

Re: [PATCH 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-02 Thread Feng Gao
Thanks Philp.

I get it, then i will send another update v3 patch soon.

On Wed, Aug 3, 2016 at 1:33 AM, Philp Prindeville
<phil...@redfish-solutions.com> wrote:
> Inline...
>
>
> On 08/02/2016 12:10 AM, Feng Gao wrote:
>>
>> Thanks.
>> Because the original GRE uses the literal number directly, so I follow
>> this style.
>>
>> Then I have two questions.
>> 1. pptp_gre_header is defined in "drivers/net/ppp/pptp.c"; If we want
>> to use it, need to create one new header file, is it ok ?
>
>
> Yes, I would move the relevant constants and packet structures into
> include/net/pptp.h ...
>
> While you're at it, there are places like:
>
> islcp = ((data[0] << 8) + data[1]) == PPP_LCP && 1 <= data[2] && data[2]
> <= 7;
>
> that should be written as:
>
> islcp = PPP_PROTOCOL(data-2) == PPP_LCP && CP_CONF_REQ <= data[2] &&
> data[2] <= CP_CODE_REJ;
>
> Similarly for:
>
> data[0] = PPP_ALLSTATIONS;
> data[1] = PPP_UI;
>
> that should be using:
>
> PPP_ADDRESS(data) = PPP_ALLSTATIONS;
> PPP_CONTROL(data) = PPP_UI;
>
> etc. but this cleanup can go in a 2nd patch.  The definitions for CONFREQ
> and CONFACK and CP_CONF_REQ and CP_CONF_ACK and PPP_LCP_ECHOREQ and
> PPP_LCP_ECHOREP are redundant and probably should all go into ppp_defs.h as
> well (which would require refactoring drivers/net/ppp/ppp_async.c).
>
>
>> 2. The GRE version 0 patch could use the sizeof pptp_gre_header.seq ?
>
>
> Actually, since you're going to be doing the pptp.h header anyway, I'd add a
> v0 as well as a v1 struct just to be more clear/concise.
>
> -Philip
>
>> Best Regards
>> Feng
>>
>>
> [snip]


Re: [PATCH 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-02 Thread Feng Gao
Thanks.
Because the original GRE uses the literal number directly, so I follow
this style.

Then I have two questions.
1. pptp_gre_header is defined in "drivers/net/ppp/pptp.c"; If we want
to use it, need to create one new header file, is it ok ?
2. The GRE version 0 patch could use the sizeof pptp_gre_header.seq ?

Best Regards
Feng

On Tue, Aug 2, 2016 at 9:36 AM, Philp Prindeville
 wrote:
> Inline...
>
> Main issue in a nutshell is I don't like using "4" instead of
> "sizeof(my_32bit_header_field_here)".
>
>
>
> On 07/28/2016 01:14 AM, f...@48lvckh6395k16k5.yundunddos.com wrote:
>>
>> From: Gao Feng 
>>
>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>> zero. So RPS does not work for PPTP traffic.
>>
>> In my test environment, there are four MIPS cores, and all traffic
>> are passed through by PPTP. As a result, only one core is 100% busy
>> while other three cores are very idle. After this patch, the usage
>> of four cores are balanced well.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>   v2: Update according to Tom and Philp's advice.
>>   1) Consolidate the codes with GRE version 0 path;
>>   2) Use PPP_PROTOCOL to get ppp protol;
>>   3) Set the FLOW_DIS_ENCAPSULATION flag;
>>   v1: Initial patch
>>
>>   include/uapi/linux/if_tunnel.h |   5 +-
>>   net/core/flow_dissector.c  | 146
>> ++---
>>   2 files changed, 97 insertions(+), 54 deletions(-)
>>
>> diff --git a/include/uapi/linux/if_tunnel.h
>> b/include/uapi/linux/if_tunnel.h
>> index 1046f55..dda4e4b 100644
>> --- a/include/uapi/linux/if_tunnel.h
>> +++ b/include/uapi/linux/if_tunnel.h
>> @@ -24,9 +24,12 @@
>>   #define GRE_SEQ   __cpu_to_be16(0x1000)
>>   #define GRE_STRICT__cpu_to_be16(0x0800)
>>   #define GRE_REC   __cpu_to_be16(0x0700)
>> -#define GRE_FLAGS  __cpu_to_be16(0x00F8)
>> +#define GRE_ACK__cpu_to_be16(0x0080)
>> +#define GRE_FLAGS  __cpu_to_be16(0x0078)
>>   #define GRE_VERSION   __cpu_to_be16(0x0007)
>>   +#define GRE_PROTO_PPP__cpu_to_be16(0x880b)
>> +
>>   struct ip_tunnel_parm {
>> charname[IFNAMSIZ];
>> int link;
>> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>> index 61ad43f..33e957b 100644
>> --- a/net/core/flow_dissector.c
>> +++ b/net/core/flow_dissector.c
>> @@ -346,63 +346,103 @@ ip_proto_again:
>> hdr = __skb_header_pointer(skb, nhoff, sizeof(_hdr), data,
>> hlen, &_hdr);
>> if (!hdr)
>> goto out_bad;
>> -   /*
>> -* Only look inside GRE if version zero and no
>> -* routing
>> -*/
>> -   if (hdr->flags & (GRE_VERSION | GRE_ROUTING))
>> -   break;
>> -
>> -   proto = hdr->proto;
>> -   nhoff += 4;
>> -   if (hdr->flags & GRE_CSUM)
>> -   nhoff += 4;
>> -   if (hdr->flags & GRE_KEY) {
>> -   const __be32 *keyid;
>> -   __be32 _keyid;
>> -
>> -   keyid = __skb_header_pointer(skb, nhoff,
>> sizeof(_keyid),
>> -data, hlen, &_keyid);
>>   - if (!keyid)
>> -   goto out_bad;
>> +   /* Only look inside GRE without routing */
>> +   if (!(hdr->flags & GRE_ROUTING)) {
>> +   proto = hdr->proto;
>> +
>> +   if (hdr->flags & GRE_VERSION) {
>> +   /* It should be the PPTP in GRE */
>> +   u8 _ppp_hdr[PPP_HDRLEN];
>> +   u8 *ppp_hdr;
>> +   int offset = 0;
>> +
>> +   /* Check the flags according to RFC 2637*/
>> +   if (!(proto == GRE_PROTO_PPP &&
>> (hdr->flags & GRE_KEY) &&
>> + !(hdr->flags & (GRE_CSUM |
>> GRE_STRICT | GRE_REC | GRE_FLAGS {
>> +   break;
>> +   }
>> +
>> +   /* Skip GRE header */
>> +   offset += 4;
>
>
> Please don't use literal values.  Instead use the size of the field(s):
>
> sizeof(struct gre_base_hdr) or offsetof(pptp_gre_header.payload_len)
>
>> +   /* Skip payload length and call id */
>> +   offset += 4;
>
>
> sizeof(pptp_gre_header.payload_len) + sizeof(pptp_gre_header.call_id)
>
>> +
>> +   if (hdr->flags & GRE_SEQ)
>> +   offset += 4;
>
>
> sizeof(pptp_gre_header.seq)
>
>> +
>> +  

Re: [PATCH 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-08-01 Thread Feng Gao
Hi Philp & Tom,

Thanks your advice.
I have send the update patch some days ago.

The link is http://www.spinics.net/lists/netdev/msg388083.html.
Could you help review it please?

Best Regards
Feng

On Thu, Jul 28, 2016 at 8:46 AM, Philp Prindeville
 wrote:
> Inline...
>
>
>
> On 07/27/2016 06:04 PM, Tom Herbert wrote:
>>
>> On Wed, Jul 27, 2016 at 4:42 PM,   wrote:
>>>
>>> From: Gao Feng 
>>>
>>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>>> zero. So RPS does not work for PPTP traffic.
>>>
>>> In my test environment, there are four MIPS cores, and all traffic
>>> are passed through by PPTP. As a result, only one core is 100% busy
>>> while other three cores are very idle. After this patch, the usage
>>> of four cores are balanced well.
>>>
>>> Signed-off-by: Gao Feng 
>>> ---
>>>   v1: Initial patch
>>>
>>>   include/uapi/linux/if_tunnel.h |   5 +-
>>>   net/core/flow_dissector.c  | 138
>>> ++---
>>>   2 files changed, 92 insertions(+), 51 deletions(-)
>>>
>>> diff --git a/include/uapi/linux/if_tunnel.h
>>> b/include/uapi/linux/if_tunnel.h
>>> index 1046f55..dda4e4b 100644
>>> --- a/include/uapi/linux/if_tunnel.h
>>> +++ b/include/uapi/linux/if_tunnel.h
>>> @@ -24,9 +24,12 @@
>>>   #define GRE_SEQ__cpu_to_be16(0x1000)
>>>   #define GRE_STRICT __cpu_to_be16(0x0800)
>>>   #define GRE_REC__cpu_to_be16(0x0700)
>>> -#define GRE_FLAGS  __cpu_to_be16(0x00F8)
>>> +#define GRE_ACK__cpu_to_be16(0x0080)
>>> +#define GRE_FLAGS  __cpu_to_be16(0x0078)
>>>   #define GRE_VERSION__cpu_to_be16(0x0007)
>>>
>>> +#define GRE_PROTO_PPP  __cpu_to_be16(0x880b)
>>> +
>>>   struct ip_tunnel_parm {
>>>  charname[IFNAMSIZ];
>>>  int link;
>>> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>>> index 61ad43f..d95e060 100644
>>> --- a/net/core/flow_dissector.c
>>> +++ b/net/core/flow_dissector.c
>>> @@ -346,63 +346,101 @@ ip_proto_again:
>>>  hdr = __skb_header_pointer(skb, nhoff, sizeof(_hdr),
>>> data, hlen, &_hdr);
>>>  if (!hdr)
>>>  goto out_bad;
>>> -   /*
>>> -* Only look inside GRE if version zero and no
>>> -* routing
>>> -*/
>>> -   if (hdr->flags & (GRE_VERSION | GRE_ROUTING))
>>> -   break;
>>>
>>> -   proto = hdr->proto;
>>> -   nhoff += 4;
>>> -   if (hdr->flags & GRE_CSUM)
>>> +   /*
>>> +   * Only look inside GRE if version zero and no
>>> +   * routing
>>
>> This comment is no longer true, GRE version 1 is being handled.
>>
>>> +   */
>>> +   if (!(hdr->flags & (GRE_VERSION | GRE_ROUTING))) {
>>> +   proto = hdr->proto;
>>>  nhoff += 4;
>>> -   if (hdr->flags & GRE_KEY) {
>>> -   const __be32 *keyid;
>>> -   __be32 _keyid;
>>> +   if (hdr->flags & GRE_CSUM)
>>> +   nhoff += 4;
>>> +   if (hdr->flags & GRE_KEY) {
>>> +   const __be32 *keyid;
>>> +   __be32 _keyid;
>>> +
>>> +   keyid = __skb_header_pointer(skb, nhoff,
>>> sizeof(_keyid),
>>> +data, hlen,
>>> &_keyid);
>>> +
>>> +   if (!keyid)
>>> +   goto out_bad;
>>> +
>>> +   if (dissector_uses_key(flow_dissector,
>>> +
>>> FLOW_DISSECTOR_KEY_GRE_KEYID)) {
>>> +   key_keyid =
>>> skb_flow_dissector_target(flow_dissector,
>>> +
>>> FLOW_DISSECTOR_KEY_GRE_KEYID,
>>> +
>>> target_container);
>>> +   key_keyid->keyid = *keyid;
>>> +   }
>>> +   nhoff += 4;
>>> +   }
>>> +   if (hdr->flags & GRE_SEQ)
>>> +   nhoff += 4;
>>> +   if (proto == htons(ETH_P_TEB)) {
>>> +   const struct ethhdr *eth;
>>> +   struct ethhdr _eth;
>>> +
>>> +   eth = __skb_header_pointer(skb, nhoff,
>>> +  sizeof(_eth),
>>> +  data, hlen,
>>> &_eth);
>>> +   if (!eth)
>>> +   goto out_bad;
>>> +   proto = eth->h_proto;

Re: [PATCH 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-07-27 Thread Feng Gao
Hi Tom,

I think maybe it is not good idea to consolidate these codes with the
version 0 path.

Because I need to check the flags strictly like this "!(hdr->flags &
(GRE_CSUM | GRE_ROUTING | GRE_STRICT | GRE_REC | GRE_FLAGS))".
These flag must be zero with PPTP GRE according to the RFC, but
original GRE only ask the "GRE_VERSION|GRE_ROUTING" must be zero.
And the inner proto structures are different between with the original
GRE and PPTP GRE.

So I think it would be more awkful if consolidate these codes, unless
we don't check the flags strictly.

On Thu, Jul 28, 2016 at 8:24 AM, Feng Gao <gfree.w...@gmail.com> wrote:
> Thanks Tom, I append some inline comments.
>
> BTW, forgive me reply the email by gmail because the original mail
> server doesn't support kernel email format.
> I will send the update later.
>
> Best Regards
> Feng
>
> On Thu, Jul 28, 2016 at 8:04 AM, Tom Herbert <t...@herbertland.com> wrote:
>>
>> On Wed, Jul 27, 2016 at 4:42 PM,  <f...@ikuai8.com> wrote:
>> > From: Gao Feng <f...@ikuai8.com>
>> >
>> > The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>> > must contain one. But current GRE RPS needs the GRE_VERSION must be
>> > zero. So RPS does not work for PPTP traffic.
>> >
>> > In my test environment, there are four MIPS cores, and all traffic
>> > are passed through by PPTP. As a result, only one core is 100% busy
>> > while other three cores are very idle. After this patch, the usage
>> > of four cores are balanced well.
>> >
>> > Signed-off-by: Gao Feng <f...@ikuai8.com>
>> > ---
>> >  v1: Initial patch
>> >
>> >  include/uapi/linux/if_tunnel.h |   5 +-
>> >  net/core/flow_dissector.c  | 138 
>> > ++---
>> >  2 files changed, 92 insertions(+), 51 deletions(-)
>> >
>> > diff --git a/include/uapi/linux/if_tunnel.h 
>> > b/include/uapi/linux/if_tunnel.h
>> > index 1046f55..dda4e4b 100644
>> > --- a/include/uapi/linux/if_tunnel.h
>> > +++ b/include/uapi/linux/if_tunnel.h
>> > @@ -24,9 +24,12 @@
>> >  #define GRE_SEQ__cpu_to_be16(0x1000)
>> >  #define GRE_STRICT __cpu_to_be16(0x0800)
>> >  #define GRE_REC__cpu_to_be16(0x0700)
>> > -#define GRE_FLAGS  __cpu_to_be16(0x00F8)
>> > +#define GRE_ACK__cpu_to_be16(0x0080)
>> > +#define GRE_FLAGS  __cpu_to_be16(0x0078)
>> >  #define GRE_VERSION__cpu_to_be16(0x0007)
>> >
>> > +#define GRE_PROTO_PPP  __cpu_to_be16(0x880b)
>> > +
>> >  struct ip_tunnel_parm {
>> > charname[IFNAMSIZ];
>> > int link;
>> > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>> > index 61ad43f..d95e060 100644
>> > --- a/net/core/flow_dissector.c
>> > +++ b/net/core/flow_dissector.c
>> > @@ -346,63 +346,101 @@ ip_proto_again:
>> > hdr = __skb_header_pointer(skb, nhoff, sizeof(_hdr), data, 
>> > hlen, &_hdr);
>> > if (!hdr)
>> > goto out_bad;
>> > -   /*
>> > -* Only look inside GRE if version zero and no
>> > -* routing
>> > -*/
>> > -   if (hdr->flags & (GRE_VERSION | GRE_ROUTING))
>> > -   break;
>> >
>> > -   proto = hdr->proto;
>> > -   nhoff += 4;
>> > -   if (hdr->flags & GRE_CSUM)
>> > +   /*
>> > +   * Only look inside GRE if version zero and no
>> > +   * routing
>>
>> This comment is no longer true, GRE version 1 is being handled.
>
> Ok, I get it. Thanks.
>
>>
>> > +   */
>> > +   if (!(hdr->flags & (GRE_VERSION | GRE_ROUTING))) {
>> > +   proto = hdr->proto;
>> > nhoff += 4;
>> > -   if (hdr->flags & GRE_KEY) {
>> > -   const __be32 *keyid;
>> > -   __be32 _keyid;
>> > +   if (hdr->flags & GRE_CSUM)
>> > +   nhoff += 4;
>> > +   if (hdr->flags & GRE_KEY) {
>> >

Re: [PATCH 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash

2016-07-27 Thread Feng Gao
Thanks Tom, I append some inline comments.

BTW, forgive me reply the email by gmail because the original mail
server doesn't support kernel email format.
I will send the update later.

Best Regards
Feng

On Thu, Jul 28, 2016 at 8:04 AM, Tom Herbert  wrote:
>
> On Wed, Jul 27, 2016 at 4:42 PM,   wrote:
> > From: Gao Feng 
> >
> > The PPTP is encapsulated by GRE header with that GRE_VERSION bits
> > must contain one. But current GRE RPS needs the GRE_VERSION must be
> > zero. So RPS does not work for PPTP traffic.
> >
> > In my test environment, there are four MIPS cores, and all traffic
> > are passed through by PPTP. As a result, only one core is 100% busy
> > while other three cores are very idle. After this patch, the usage
> > of four cores are balanced well.
> >
> > Signed-off-by: Gao Feng 
> > ---
> >  v1: Initial patch
> >
> >  include/uapi/linux/if_tunnel.h |   5 +-
> >  net/core/flow_dissector.c  | 138 
> > ++---
> >  2 files changed, 92 insertions(+), 51 deletions(-)
> >
> > diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
> > index 1046f55..dda4e4b 100644
> > --- a/include/uapi/linux/if_tunnel.h
> > +++ b/include/uapi/linux/if_tunnel.h
> > @@ -24,9 +24,12 @@
> >  #define GRE_SEQ__cpu_to_be16(0x1000)
> >  #define GRE_STRICT __cpu_to_be16(0x0800)
> >  #define GRE_REC__cpu_to_be16(0x0700)
> > -#define GRE_FLAGS  __cpu_to_be16(0x00F8)
> > +#define GRE_ACK__cpu_to_be16(0x0080)
> > +#define GRE_FLAGS  __cpu_to_be16(0x0078)
> >  #define GRE_VERSION__cpu_to_be16(0x0007)
> >
> > +#define GRE_PROTO_PPP  __cpu_to_be16(0x880b)
> > +
> >  struct ip_tunnel_parm {
> > charname[IFNAMSIZ];
> > int link;
> > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> > index 61ad43f..d95e060 100644
> > --- a/net/core/flow_dissector.c
> > +++ b/net/core/flow_dissector.c
> > @@ -346,63 +346,101 @@ ip_proto_again:
> > hdr = __skb_header_pointer(skb, nhoff, sizeof(_hdr), data, 
> > hlen, &_hdr);
> > if (!hdr)
> > goto out_bad;
> > -   /*
> > -* Only look inside GRE if version zero and no
> > -* routing
> > -*/
> > -   if (hdr->flags & (GRE_VERSION | GRE_ROUTING))
> > -   break;
> >
> > -   proto = hdr->proto;
> > -   nhoff += 4;
> > -   if (hdr->flags & GRE_CSUM)
> > +   /*
> > +   * Only look inside GRE if version zero and no
> > +   * routing
>
> This comment is no longer true, GRE version 1 is being handled.

Ok, I get it. Thanks.

>
> > +   */
> > +   if (!(hdr->flags & (GRE_VERSION | GRE_ROUTING))) {
> > +   proto = hdr->proto;
> > nhoff += 4;
> > -   if (hdr->flags & GRE_KEY) {
> > -   const __be32 *keyid;
> > -   __be32 _keyid;
> > +   if (hdr->flags & GRE_CSUM)
> > +   nhoff += 4;
> > +   if (hdr->flags & GRE_KEY) {
> > +   const __be32 *keyid;
> > +   __be32 _keyid;
> > +
> > +   keyid = __skb_header_pointer(skb, nhoff, 
> > sizeof(_keyid),
> > +data, hlen, 
> > &_keyid);
> > +
> > +   if (!keyid)
> > +   goto out_bad;
> > +
> > +   if (dissector_uses_key(flow_dissector,
> > +  
> > FLOW_DISSECTOR_KEY_GRE_KEYID)) {
> > +   key_keyid = 
> > skb_flow_dissector_target(flow_dissector,
> > +   
> >   FLOW_DISSECTOR_KEY_GRE_KEYID,
> > +   
> >   target_container);
> > +   key_keyid->keyid = *keyid;
> > +   }
> > +   nhoff += 4;
> > +   }
> > +   if (hdr->flags & GRE_SEQ)
> > +   nhoff += 4;
> > +   if (proto == htons(ETH_P_TEB)) {
> > +   const struct ethhdr *eth;
> > +   struct ethhdr _eth;
> > +
> > +   eth = __skb_header_pointer(skb, nhoff,
> > +  sizeof(_eth),
> > +  data, hlen, 
> > &_eth);
> > +