Re: [PATCH net-next 1/4] net: Add SRIOV VGT+ support

2017-08-30 Thread Saeed Mahameed
On Tue, Aug 29, 2017 at 7:43 PM, Sabrina Dubroca  wrote:
> 2017-08-29, 13:13:09 +0300, Saeed Mahameed wrote:
>> On Mon, Aug 28, 2017 at 6:52 PM, Sabrina Dubroca  
>> wrote:
>> > 2017-08-27, 14:06:15 +0300, Saeed Mahameed wrote:
>> > [...]
>> >> +#define VF_VLAN_BITMAP   DIV_ROUND_UP(VF_VLAN_N_VID, sizeof(__u64) * 
>> >> BITS_PER_BYTE)
>> >> +struct ifla_vf_vlan_trunk {
>> >> + __u32 vf;
>> >> + __u64 allowed_vlans_8021q_bm[VF_VLAN_BITMAP];
>> >> + __u64 allowed_vlans_8021ad_bm[VF_VLAN_BITMAP];
>> >> +};
>> >
>> > This is huge (1032B). And you put one of these in the netlink message
>> > for each VF.  This means that with 51 VF (at least in my environment,
>> > where each VF takes 1296B), you're going to overflow the u16 size of a
>> > single attribute (IFLA_VFINFO_LIST), and you cannot dump the device
>> > anymore. I'm afraid this is going to break existing setups.
>> >
>>
>> Yes ! We will fix this,
>> we are considering to report only a boolean in VFINFO which indecates
>> if VGT+ is enable or not
>> and provide a new attribute per VF to report only the vlan list of specific 
>> VF.
>
> I don't see what this is going to look like. Maybe you can describe
> more precisely what you want to add to the netlink message? (otherwise
> I'll wait for the patches)
>

We are still looking for the best way from user experience perspective.
Currently we think we will only allow to query VF vlan list on one VF
at a time, with vlan list special ip route command.

Still under definition, we will let you know once ready.

> If you add large attributes for each VF, this is still going to break.
>
>
> Thanks.
>
> --
> Sabrina


Re: [PATCH net-next 1/4] net: Add SRIOV VGT+ support

2017-08-29 Thread Sabrina Dubroca
2017-08-29, 13:13:09 +0300, Saeed Mahameed wrote:
> On Mon, Aug 28, 2017 at 6:52 PM, Sabrina Dubroca  wrote:
> > 2017-08-27, 14:06:15 +0300, Saeed Mahameed wrote:
> > [...]
> >> +#define VF_VLAN_BITMAP   DIV_ROUND_UP(VF_VLAN_N_VID, sizeof(__u64) * 
> >> BITS_PER_BYTE)
> >> +struct ifla_vf_vlan_trunk {
> >> + __u32 vf;
> >> + __u64 allowed_vlans_8021q_bm[VF_VLAN_BITMAP];
> >> + __u64 allowed_vlans_8021ad_bm[VF_VLAN_BITMAP];
> >> +};
> >
> > This is huge (1032B). And you put one of these in the netlink message
> > for each VF.  This means that with 51 VF (at least in my environment,
> > where each VF takes 1296B), you're going to overflow the u16 size of a
> > single attribute (IFLA_VFINFO_LIST), and you cannot dump the device
> > anymore. I'm afraid this is going to break existing setups.
> >
> 
> Yes ! We will fix this,
> we are considering to report only a boolean in VFINFO which indecates
> if VGT+ is enable or not
> and provide a new attribute per VF to report only the vlan list of specific 
> VF.

I don't see what this is going to look like. Maybe you can describe
more precisely what you want to add to the netlink message? (otherwise
I'll wait for the patches)

If you add large attributes for each VF, this is still going to break.


Thanks.

-- 
Sabrina


Re: [PATCH net-next 1/4] net: Add SRIOV VGT+ support

2017-08-29 Thread Saeed Mahameed
On Mon, Aug 28, 2017 at 6:52 PM, Sabrina Dubroca  wrote:
> 2017-08-27, 14:06:15 +0300, Saeed Mahameed wrote:
> [...]
>> +#define VF_VLAN_BITMAP   DIV_ROUND_UP(VF_VLAN_N_VID, sizeof(__u64) * 
>> BITS_PER_BYTE)
>> +struct ifla_vf_vlan_trunk {
>> + __u32 vf;
>> + __u64 allowed_vlans_8021q_bm[VF_VLAN_BITMAP];
>> + __u64 allowed_vlans_8021ad_bm[VF_VLAN_BITMAP];
>> +};
>
> This is huge (1032B). And you put one of these in the netlink message
> for each VF.  This means that with 51 VF (at least in my environment,
> where each VF takes 1296B), you're going to overflow the u16 size of a
> single attribute (IFLA_VFINFO_LIST), and you cannot dump the device
> anymore. I'm afraid this is going to break existing setups.
>

Yes ! We will fix this,
we are considering to report only a boolean in VFINFO which indecates
if VGT+ is enable or not
and provide a new attribute per VF to report only the vlan list of specific VF.

Thanks for the input,
we will handle this.


> --
> Sabrina


Re: [PATCH net-next 1/4] net: Add SRIOV VGT+ support

2017-08-28 Thread Sabrina Dubroca
2017-08-27, 14:06:15 +0300, Saeed Mahameed wrote:
[...]
> +#define VF_VLAN_BITMAP   DIV_ROUND_UP(VF_VLAN_N_VID, sizeof(__u64) * 
> BITS_PER_BYTE)
> +struct ifla_vf_vlan_trunk {
> + __u32 vf;
> + __u64 allowed_vlans_8021q_bm[VF_VLAN_BITMAP];
> + __u64 allowed_vlans_8021ad_bm[VF_VLAN_BITMAP];
> +};

This is huge (1032B). And you put one of these in the netlink message
for each VF.  This means that with 51 VF (at least in my environment,
where each VF takes 1296B), you're going to overflow the u16 size of a
single attribute (IFLA_VFINFO_LIST), and you cannot dump the device
anymore. I'm afraid this is going to break existing setups.

-- 
Sabrina


Re: [PATCH net-next 1/4] net: Add SRIOV VGT+ support

2017-08-28 Thread Saeed Mahameed
On Mon, Aug 28, 2017 at 3:38 AM, Jakub Kicinski  wrote:
> On Sun, 27 Aug 2017 14:06:15 +0300, Saeed Mahameed wrote:
>> From: Mohamad Haj Yahia 
>>
>> VGT+ is a security feature that gives the administrator the ability of
>> controlling the allowed vlan-ids list that can be transmitted/received
>> from/to the VF.
>> The allowed vlan-ids list is called "trunk".
>> Admin can add/remove a range of allowed vlan-ids via iptool.
>> Example:
>> After this series of configuration :
>> 1) ip link set eth3 vf 0 trunk add 10 100 (allow vlan-id 10-100, default 
>> tpid 0x8100)
>> 2) ip link set eth3 vf 0 trunk add 105 proto 802.1q (allow vlan-id 105 tpid 
>> 0x8100)
>> 3) ip link set eth3 vf 0 trunk add 105 proto 802.1ad (allow vlan-id 105 tpid 
>> 0x88a8)
>> 4) ip link set eth3 vf 0 trunk rem 90 (block vlan-id 90)
>> 5) ip link set eth3 vf 0 trunk rem 50 60 (block vlan-ids 50-60)
>>
>> The VF 0 can only communicate on vlan-ids: 10-49,61-89,91-100,105 with
>> tpid 0x8100 and vlan-id 105 with tpid 0x88a8.
>>
>> For this purpose we added the following netlink sr-iov commands:
>>
>> 1) IFLA_VF_VLAN_RANGE: used to add/remove allowed vlan-ids range.
>> We added the ifla_vf_vlan_range struct to specify the range we want to
>> add/remove from the userspace.
>> We added ndo_add_vf_vlan_trunk_range and ndo_del_vf_vlan_trunk_range
>> netdev ops to add/remove allowed vlan-ids range in the netdev.
>>
>> 2) IFLA_VF_VLAN_TRUNK: used to query the allowed vlan-ids trunk.
>> We added trunk bitmap to the ifla_vf_info struct to get the current
>> allowed vlan-ids trunk from the netdev.
>> We added ifla_vf_vlan_trunk struct for sending the allowed vlan-ids
>> trunk to the userspace.
>>
>> Signed-off-by: Mohamad Haj Yahia 
>> Signed-off-by: Eugenia Emantayev 
>> Signed-off-by: Saeed Mahameed 
>
> Interesting work, I have some minor questions if you don't mind :)
>

Hi Jakub, Thanks for the review.

> I was under impression that "trunk" is a vendor-specific term, would it
> make sense to drop it from the APIs?
>

Well, the term trunk is widely used in switches APIs and since those
patches refer to SRIOV architecture
where an E-Switch is running in the HW level operating as an L2
switch, I think it makes sense to use the same term for the
same functionality we already have in the switches.

>> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
>> index 8d062c58d5cb..3aa895c5fbc1 100644
>> --- a/include/uapi/linux/if_link.h
>> +++ b/include/uapi/linux/if_link.h
>> @@ -168,6 +168,8 @@ enum {
>>  #ifndef __KERNEL__
>>  #define IFLA_RTA(r)  ((struct rtattr*)(((char*)(r)) + 
>> NLMSG_ALIGN(sizeof(struct ifinfomsg
>>  #define IFLA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ifinfomsg))
>> +#define BITS_PER_BYTE 8
>> +#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
>>  #endif
>>
>>  enum {
>> @@ -645,6 +647,8 @@ enum {
>>   IFLA_VF_IB_NODE_GUID,   /* VF Infiniband node GUID */
>>   IFLA_VF_IB_PORT_GUID,   /* VF Infiniband port GUID */
>>   IFLA_VF_VLAN_LIST,  /* nested list of vlans, option for QinQ */
>> + IFLA_VF_VLAN_RANGE, /* add/delete vlan range filtering */
>> + IFLA_VF_VLAN_TRUNK, /* vlan trunk filtering */
>>   __IFLA_VF_MAX,
>>  };
>>
>> @@ -669,6 +673,7 @@ enum {
>>
>>  #define IFLA_VF_VLAN_INFO_MAX (__IFLA_VF_VLAN_INFO_MAX - 1)
>>  #define MAX_VLAN_LIST_LEN 1
>> +#define VF_VLAN_N_VID 4096
>>
>>  struct ifla_vf_vlan_info {
>>   __u32 vf;
>> @@ -677,6 +682,21 @@ struct ifla_vf_vlan_info {
>>   __be16 vlan_proto; /* VLAN protocol either 802.1Q or 802.1ad */
>>  };
>>
>> +struct ifla_vf_vlan_range {
>> + __u32 vf;
>> + __u32 start_vid;   /* 1 - 4095 */
>> + __u32 end_vid; /* 1 - 4095 */
>> + __u32 setting;
>> + __be16 vlan_proto; /* VLAN protocol either 802.1Q or 802.1ad */
>> +};
>> +
>> +#define VF_VLAN_BITMAP   DIV_ROUND_UP(VF_VLAN_N_VID, sizeof(__u64) * 
>> BITS_PER_BYTE)
>> +struct ifla_vf_vlan_trunk {
>> + __u32 vf;
>> + __u64 allowed_vlans_8021q_bm[VF_VLAN_BITMAP];
>> + __u64 allowed_vlans_8021ad_bm[VF_VLAN_BITMAP];
>> +};
>
> Would you mind explaining why you chose to make the API asymmetrical
> like that?  I mean the set operation is range-based, yet the get
> returns a bitmask.  You seem to solely depend on the bitmasks in the
> driver anyway...
>

It is not about driver dependency, simply we would like to store the
allowed vlan in a simple data structure and bitmap is
the simplest form, otherwise we would complicate the APIs to transfer
Lists and ranges back and forth between userspace and kernel.


>>  struct ifla_vf_tx_rate {
>>   __u32 vf;
>>   __u32 rate; /* Max TX bandwidth in Mbps, 0 disables throttling */
>> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
>> index a78fd61da0ec..56909f11d88e 100644
>> --- a/net/core/rtnetlink.c
>> +++ b/net/core/rtnetlink.c
>> @@ -827,6 +827,7 @@ static inline 

Re: [PATCH net-next 1/4] net: Add SRIOV VGT+ support

2017-08-27 Thread Jakub Kicinski
On Sun, 27 Aug 2017 14:06:15 +0300, Saeed Mahameed wrote:
> From: Mohamad Haj Yahia 
> 
> VGT+ is a security feature that gives the administrator the ability of
> controlling the allowed vlan-ids list that can be transmitted/received
> from/to the VF.
> The allowed vlan-ids list is called "trunk".
> Admin can add/remove a range of allowed vlan-ids via iptool.
> Example:
> After this series of configuration :
> 1) ip link set eth3 vf 0 trunk add 10 100 (allow vlan-id 10-100, default tpid 
> 0x8100)
> 2) ip link set eth3 vf 0 trunk add 105 proto 802.1q (allow vlan-id 105 tpid 
> 0x8100)
> 3) ip link set eth3 vf 0 trunk add 105 proto 802.1ad (allow vlan-id 105 tpid 
> 0x88a8)
> 4) ip link set eth3 vf 0 trunk rem 90 (block vlan-id 90)
> 5) ip link set eth3 vf 0 trunk rem 50 60 (block vlan-ids 50-60)
> 
> The VF 0 can only communicate on vlan-ids: 10-49,61-89,91-100,105 with
> tpid 0x8100 and vlan-id 105 with tpid 0x88a8.
> 
> For this purpose we added the following netlink sr-iov commands:
> 
> 1) IFLA_VF_VLAN_RANGE: used to add/remove allowed vlan-ids range.
> We added the ifla_vf_vlan_range struct to specify the range we want to
> add/remove from the userspace.
> We added ndo_add_vf_vlan_trunk_range and ndo_del_vf_vlan_trunk_range
> netdev ops to add/remove allowed vlan-ids range in the netdev.
> 
> 2) IFLA_VF_VLAN_TRUNK: used to query the allowed vlan-ids trunk.
> We added trunk bitmap to the ifla_vf_info struct to get the current
> allowed vlan-ids trunk from the netdev.
> We added ifla_vf_vlan_trunk struct for sending the allowed vlan-ids
> trunk to the userspace.
> 
> Signed-off-by: Mohamad Haj Yahia 
> Signed-off-by: Eugenia Emantayev 
> Signed-off-by: Saeed Mahameed 

Interesting work, I have some minor questions if you don't mind :)

I was under impression that "trunk" is a vendor-specific term, would it
make sense to drop it from the APIs?

> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> index 8d062c58d5cb..3aa895c5fbc1 100644
> --- a/include/uapi/linux/if_link.h
> +++ b/include/uapi/linux/if_link.h
> @@ -168,6 +168,8 @@ enum {
>  #ifndef __KERNEL__
>  #define IFLA_RTA(r)  ((struct rtattr*)(((char*)(r)) + 
> NLMSG_ALIGN(sizeof(struct ifinfomsg
>  #define IFLA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ifinfomsg))
> +#define BITS_PER_BYTE 8
> +#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
>  #endif
>  
>  enum {
> @@ -645,6 +647,8 @@ enum {
>   IFLA_VF_IB_NODE_GUID,   /* VF Infiniband node GUID */
>   IFLA_VF_IB_PORT_GUID,   /* VF Infiniband port GUID */
>   IFLA_VF_VLAN_LIST,  /* nested list of vlans, option for QinQ */
> + IFLA_VF_VLAN_RANGE, /* add/delete vlan range filtering */
> + IFLA_VF_VLAN_TRUNK, /* vlan trunk filtering */
>   __IFLA_VF_MAX,
>  };
>  
> @@ -669,6 +673,7 @@ enum {
>  
>  #define IFLA_VF_VLAN_INFO_MAX (__IFLA_VF_VLAN_INFO_MAX - 1)
>  #define MAX_VLAN_LIST_LEN 1
> +#define VF_VLAN_N_VID 4096
>  
>  struct ifla_vf_vlan_info {
>   __u32 vf;
> @@ -677,6 +682,21 @@ struct ifla_vf_vlan_info {
>   __be16 vlan_proto; /* VLAN protocol either 802.1Q or 802.1ad */
>  };
>  
> +struct ifla_vf_vlan_range {
> + __u32 vf;
> + __u32 start_vid;   /* 1 - 4095 */
> + __u32 end_vid; /* 1 - 4095 */
> + __u32 setting;
> + __be16 vlan_proto; /* VLAN protocol either 802.1Q or 802.1ad */
> +};
> +
> +#define VF_VLAN_BITMAP   DIV_ROUND_UP(VF_VLAN_N_VID, sizeof(__u64) * 
> BITS_PER_BYTE)
> +struct ifla_vf_vlan_trunk {
> + __u32 vf;
> + __u64 allowed_vlans_8021q_bm[VF_VLAN_BITMAP];
> + __u64 allowed_vlans_8021ad_bm[VF_VLAN_BITMAP];
> +};

Would you mind explaining why you chose to make the API asymmetrical
like that?  I mean the set operation is range-based, yet the get
returns a bitmask.  You seem to solely depend on the bitmasks in the
driver anyway...

>  struct ifla_vf_tx_rate {
>   __u32 vf;
>   __u32 rate; /* Max TX bandwidth in Mbps, 0 disables throttling */
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index a78fd61da0ec..56909f11d88e 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -827,6 +827,7 @@ static inline int rtnl_vfinfo_size(const struct 
> net_device *dev,
>nla_total_size(MAX_VLAN_LIST_LEN *
>   sizeof(struct ifla_vf_vlan_info)) +
>nla_total_size(sizeof(struct ifla_vf_spoofchk)) +
> +  nla_total_size(sizeof(struct ifla_vf_vlan_trunk)) +
>nla_total_size(sizeof(struct ifla_vf_tx_rate)) +
>nla_total_size(sizeof(struct ifla_vf_rate)) +
>nla_total_size(sizeof(struct ifla_vf_link_state)) +
> @@ -1098,31 +1099,43 @@ static noinline_for_stack int rtnl_fill_vfinfo(struct 
> sk_buff *skb,
>   struct ifla_vf_link_state vf_linkstate;
>   struct ifla_vf_vlan_info 

[PATCH net-next 1/4] net: Add SRIOV VGT+ support

2017-08-27 Thread Saeed Mahameed
From: Mohamad Haj Yahia 

VGT+ is a security feature that gives the administrator the ability of
controlling the allowed vlan-ids list that can be transmitted/received
from/to the VF.
The allowed vlan-ids list is called "trunk".
Admin can add/remove a range of allowed vlan-ids via iptool.
Example:
After this series of configuration :
1) ip link set eth3 vf 0 trunk add 10 100 (allow vlan-id 10-100, default tpid 
0x8100)
2) ip link set eth3 vf 0 trunk add 105 proto 802.1q (allow vlan-id 105 tpid 
0x8100)
3) ip link set eth3 vf 0 trunk add 105 proto 802.1ad (allow vlan-id 105 tpid 
0x88a8)
4) ip link set eth3 vf 0 trunk rem 90 (block vlan-id 90)
5) ip link set eth3 vf 0 trunk rem 50 60 (block vlan-ids 50-60)

The VF 0 can only communicate on vlan-ids: 10-49,61-89,91-100,105 with
tpid 0x8100 and vlan-id 105 with tpid 0x88a8.

For this purpose we added the following netlink sr-iov commands:

1) IFLA_VF_VLAN_RANGE: used to add/remove allowed vlan-ids range.
We added the ifla_vf_vlan_range struct to specify the range we want to
add/remove from the userspace.
We added ndo_add_vf_vlan_trunk_range and ndo_del_vf_vlan_trunk_range
netdev ops to add/remove allowed vlan-ids range in the netdev.

2) IFLA_VF_VLAN_TRUNK: used to query the allowed vlan-ids trunk.
We added trunk bitmap to the ifla_vf_info struct to get the current
allowed vlan-ids trunk from the netdev.
We added ifla_vf_vlan_trunk struct for sending the allowed vlan-ids
trunk to the userspace.

Signed-off-by: Mohamad Haj Yahia 
Signed-off-by: Eugenia Emantayev 
Signed-off-by: Saeed Mahameed 
---
 include/linux/if_link.h  |   2 +
 include/linux/netdevice.h|  12 +
 include/uapi/linux/if_link.h |  20 
 net/core/rtnetlink.c | 109 +++
 4 files changed, 114 insertions(+), 29 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 0b17c585b5cd..da70af27e42e 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -25,6 +25,8 @@ struct ifla_vf_info {
__u32 max_tx_rate;
__u32 rss_query_en;
__u32 trusted;
+   __u64 trunk_8021q[VF_VLAN_BITMAP];
+   __u64 trunk_8021ad[VF_VLAN_BITMAP];
__be16 vlan_proto;
 };
 #endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c5475b37a631..10633cabc58f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -959,6 +959,10 @@ struct xfrmdev_ops {
  *  Hash Key. This is needed since on some devices VF share this 
information
  *  with PF and querying it may introduce a theoretical security risk.
  * int (*ndo_set_vf_rss_query_en)(struct net_device *dev, int vf, bool 
setting);
+ * int (*ndo_add_vf_vlan_trunk_range)(struct net_device *dev, int vf,
+ *   u16 start_vid, u16 end_vid, __be16 proto);
+ * int (*ndo_del_vf_vlan_trunk_range)(struct net_device *dev, int vf,
+ *   u16 start_vid, u16 end_vid, __be16 proto);
  * int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb);
  * int (*ndo_setup_tc)(struct net_device *dev, enum tc_setup_type type,
  *void *type_data);
@@ -1208,6 +1212,14 @@ struct net_device_ops {
int (*ndo_set_vf_rss_query_en)(
   struct net_device *dev,
   int vf, bool setting);
+   int (*ndo_add_vf_vlan_trunk_range)(
+  struct net_device *dev,
+  int vf, u16 start_vid,
+  u16 end_vid, __be16 proto);
+   int (*ndo_del_vf_vlan_trunk_range)(
+  struct net_device *dev,
+  int vf, u16 start_vid,
+  u16 end_vid, __be16 proto);
int (*ndo_setup_tc)(struct net_device *dev,
enum tc_setup_type type,
void *type_data);
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 8d062c58d5cb..3aa895c5fbc1 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -168,6 +168,8 @@ enum {
 #ifndef __KERNEL__
 #define IFLA_RTA(r)  ((struct rtattr*)(((char*)(r)) + 
NLMSG_ALIGN(sizeof(struct ifinfomsg
 #define IFLA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ifinfomsg))
+#define BITS_PER_BYTE 8
+#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
 #endif
 
 enum {
@@ -645,6 +647,8 @@ enum {
IFLA_VF_IB_NODE_GUID,   /* VF Infiniband node GUID */
IFLA_VF_IB_PORT_GUID,   /* VF Infiniband port GUID */