Re: [PATCHv3 next 3/3] ipvlan: Introduce l3s mode

2016-09-16 Thread महेश बंडेवार
On Thu, Sep 15, 2016 at 6:49 PM, David Ahern  wrote:
> On 9/15/16 6:14 PM, Mahesh Bandewar wrote:
>> diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
>> index 695a5dc9ace3..371f4548c42d 100644
>> --- a/drivers/net/ipvlan/ipvlan.h
>> +++ b/drivers/net/ipvlan/ipvlan.h
>> @@ -23,11 +23,13 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>
>>  #define IPVLAN_DRV   "ipvlan"
>>  #define IPV_DRV_VER  "0.1"
>> @@ -96,6 +98,7 @@ struct ipvl_port {
>>   struct work_struct  wq;
>>   struct sk_buff_head backlog;
>>   int count;
>> + boolhooks_attached;
>
> With a refcnt on the hook registration you don't need this bool and removing 
> simplifies the set_mode logic.
>
not sure it simplifies the logic, but mode change fact can be used
instead of relying on the value of hooks_attached (so it's more
indirect).
>
>> diff --git a/drivers/net/ipvlan/ipvlan_main.c 
>> b/drivers/net/ipvlan/ipvlan_main.c
>> index 18b4e8c7f68a..aca690f41559 100644
>> --- a/drivers/net/ipvlan/ipvlan_main.c
>> +++ b/drivers/net/ipvlan/ipvlan_main.c
>
> 
>
>> +static void ipvlan_unregister_nf_hook(void)
>> +{
>> + BUG_ON(!ipvl_nf_hook_refcnt);
>
> not a panic() worthy issue. just a pr_warn or WARN_ON_ONCE should be ok.
>
sure, I don't think it would ever hit considering that RTNL mutex is
protecting all these updates (for now). But would definitely prefer
something there (probably WARN_ON) as a protection.
>


Re: [PATCHv3 next 3/3] ipvlan: Introduce l3s mode

2016-09-15 Thread David Ahern
On 9/15/16 6:14 PM, Mahesh Bandewar wrote:
> diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
> index 695a5dc9ace3..371f4548c42d 100644
> --- a/drivers/net/ipvlan/ipvlan.h
> +++ b/drivers/net/ipvlan/ipvlan.h
> @@ -23,11 +23,13 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define IPVLAN_DRV   "ipvlan"
>  #define IPV_DRV_VER  "0.1"
> @@ -96,6 +98,7 @@ struct ipvl_port {
>   struct work_struct  wq;
>   struct sk_buff_head backlog;
>   int count;
> + boolhooks_attached;

With a refcnt on the hook registration you don't need this bool and removing 
simplifies the set_mode logic.


> diff --git a/drivers/net/ipvlan/ipvlan_main.c 
> b/drivers/net/ipvlan/ipvlan_main.c
> index 18b4e8c7f68a..aca690f41559 100644
> --- a/drivers/net/ipvlan/ipvlan_main.c
> +++ b/drivers/net/ipvlan/ipvlan_main.c



> +static void ipvlan_unregister_nf_hook(void)
> +{
> + BUG_ON(!ipvl_nf_hook_refcnt);

not a panic() worthy issue. just a pr_warn or WARN_ON_ONCE should be ok.




[PATCHv3 next 3/3] ipvlan: Introduce l3s mode

2016-09-15 Thread Mahesh Bandewar
From: Mahesh Bandewar 

In a typical IPvlan L3 setup where master is in default-ns and
each slave is into different (slave) ns. In this setup egress
packet processing for traffic originating from slave-ns will
hit all NF_HOOKs in slave-ns as well as default-ns. However same
is not true for ingress processing. All these NF_HOOKs are
hit only in the slave-ns skipping them in the default-ns.
IPvlan in L3 mode is restrictive and if admins want to deploy
iptables rules in default-ns, this asymmetric data path makes it
impossible to do so.

This patch makes use of the l3_rcv() (added as part of l3mdev
enhancements) to perform input route lookup on RX packets without
changing the skb->dev and then uses nf_hook at NF_INET_LOCAL_IN
to change the skb->dev just before handing over skb to L4.

Signed-off-by: Mahesh Bandewar 
CC: David Ahern 
---
 Documentation/networking/ipvlan.txt |  7 ++-
 drivers/net/Kconfig |  1 +
 drivers/net/ipvlan/ipvlan.h |  7 +++
 drivers/net/ipvlan/ipvlan_core.c| 94 +
 drivers/net/ipvlan/ipvlan_main.c| 92 +---
 include/uapi/linux/if_link.h|  1 +
 6 files changed, 194 insertions(+), 8 deletions(-)

diff --git a/Documentation/networking/ipvlan.txt 
b/Documentation/networking/ipvlan.txt
index 14422f8fcdc4..24196cef7c91 100644
--- a/Documentation/networking/ipvlan.txt
+++ b/Documentation/networking/ipvlan.txt
@@ -22,7 +22,7 @@ The driver can be built into the kernel (CONFIG_IPVLAN=y) or 
as a module
There are no module parameters for this driver and it can be configured
 using IProute2/ip utility.
 
-   ip link add link   type ipvlan mode { l2 | L3 }
+   ip link add link   type ipvlan mode { l2 | l3 | 
l3s }
 
e.g. ip link add link ipvl0 eth0 type ipvlan mode l2
 
@@ -48,6 +48,11 @@ master device for the L2 processing and routing from that 
instance will be
 used before packets are queued on the outbound device. In this mode the slaves
 will not receive nor can send multicast / broadcast traffic.
 
+4.3 L3S mode:
+   This is very similar to the L3 mode except that iptables (conn-tracking)
+works in this mode and hence it is L3-symmetric (L3s). This will have slightly 
less
+performance but that shouldn't matter since you are choosing this mode over 
plain-L3
+mode to make conn-tracking work.
 
 5. What to choose (macvlan vs. ipvlan)?
These two devices are very similar in many regards and the specific use
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 0c5415b05ea9..8768a625350d 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -149,6 +149,7 @@ config IPVLAN
 tristate "IP-VLAN support"
 depends on INET
 depends on IPV6
+depends on NET_L3_MASTER_DEV
 ---help---
   This allows one to create virtual devices off of a main interface
   and packets will be delivered based on the dest L3 (IPv6/IPv4 addr)
diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 695a5dc9ace3..371f4548c42d 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -23,11 +23,13 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 
 #define IPVLAN_DRV "ipvlan"
 #define IPV_DRV_VER"0.1"
@@ -96,6 +98,7 @@ struct ipvl_port {
struct work_struct  wq;
struct sk_buff_head backlog;
int count;
+   boolhooks_attached;
struct rcu_head rcu;
 };
 
@@ -124,4 +127,8 @@ struct ipvl_addr *ipvlan_find_addr(const struct ipvl_dev 
*ipvlan,
   const void *iaddr, bool is_v6);
 bool ipvlan_addr_busy(struct ipvl_port *port, void *iaddr, bool is_v6);
 void ipvlan_ht_addr_del(struct ipvl_addr *addr);
+struct sk_buff *ipvlan_l3_rcv(struct net_device *dev, struct sk_buff *skb,
+ u16 proto);
+unsigned int ipvlan_nf_input(void *priv, struct sk_buff *skb,
+const struct nf_hook_state *state);
 #endif /* __IPVLAN_H */
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index b5f9511d819e..b4e990743e1d 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -560,6 +560,7 @@ int ipvlan_queue_xmit(struct sk_buff *skb, struct 
net_device *dev)
case IPVLAN_MODE_L2:
return ipvlan_xmit_mode_l2(skb, dev);
case IPVLAN_MODE_L3:
+   case IPVLAN_MODE_L3S:
return ipvlan_xmit_mode_l3(skb, dev);
}
 
@@ -664,6 +665,8 @@ rx_handler_result_t ipvlan_handle_frame(struct sk_buff 
**pskb)
return ipvlan_handle_mode_l2(pskb, port);
case IPVLAN_MODE_L3:
return ipvlan_handle_mode_l3(pskb, port);
+   case IPVLAN_MODE_L3S:
+   return RX_HANDLER_PASS;
}