from:"Roopa Prabhu"

Re: [net-next PATCH v0 2/5] net: addr_list: add exclusive dev_uc_add

2012-03-25 Thread Roopa Prabhu




On 3/18/12 11:51 PM, John Fastabend john.r.fastab...@intel.com wrote:

 This adds a dev_uc_add_excl() call similar to the original
 dev_uc_add() except it sets the global bit. With this
 change the reference count will not be bumped and -EEXIST
 will be returned if a duplicate address exists.
 
 This is useful for drivers that support SR-IOV and want
 to manage the unicast lists.
 
 Signed-off-by: John Fastabend john.r.fastab...@intel.com
 ---
 
  include/linux/netdevice.h |1 +
  net/core/dev_addr_lists.c |   19 +++
  2 files changed, 20 insertions(+), 0 deletions(-)
 
 diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
 index 4208901..5e43cec 100644
 --- a/include/linux/netdevice.h
 +++ b/include/linux/netdevice.h
 @@ -2571,6 +2571,7 @@ extern int dev_addr_init(struct net_device *dev);
  
  /* Functions used for unicast addresses handling */
  extern int dev_uc_add(struct net_device *dev, unsigned char *addr);
 +extern int dev_uc_add_excl(struct net_device *dev, unsigned char *addr);
  extern int dev_uc_del(struct net_device *dev, unsigned char *addr);
  extern int dev_uc_sync(struct net_device *to, struct net_device *from);
  extern void dev_uc_unsync(struct net_device *to, struct net_device *from);
 diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c
 index 29c07fe..c7d27ad 100644
 --- a/net/core/dev_addr_lists.c
 +++ b/net/core/dev_addr_lists.c
 @@ -377,6 +377,25 @@ EXPORT_SYMBOL(dev_addr_del_multiple);
   */
  
  /**
 + * dev_uc_add_excl - Add a global secondary unicast address
 + * @dev: device
 + * @addr: address to add
 + */
 +int dev_uc_add_excl(struct net_device *dev, unsigned char *addr)
 +{
 + int err;
 +
 + netif_addr_lock_bh(dev);
 + err = __hw_addr_add_ex(&dev->uc, addr, dev->addr_len,
 +  NETDEV_HW_ADDR_T_UNICAST, true);
 + if (!err)
 +  __dev_set_rx_mode(dev);
 + netif_addr_unlock_bh(dev);
 + return err;
 +}
 +EXPORT_SYMBOL(dev_uc_add_excl);
 +
 +/**

ACK.
Will we need a similar function for multicast as well? Macvlan could use
it.

Thanks,
Roopa
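
To make the difference between the refcounted add and the exclusive add concrete, here is a minimal userspace sketch. It is not the kernel implementation: `struct addr_list`, `addr_add()` and `addr_del()` are hypothetical stand-ins for the netdev address list and `__hw_addr_add_ex()`, and the fixed-size array is an assumption made only to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define ADDR_LEN  6   /* Ethernet MAC length */
#define MAX_ADDRS 16  /* arbitrary capacity for this sketch */

struct addr_entry {
    unsigned char addr[ADDR_LEN];
    int refcount;
};

struct addr_list {
    struct addr_entry entries[MAX_ADDRS];
    int count;
};

static struct addr_entry *addr_find(struct addr_list *l,
                                    const unsigned char *addr)
{
    for (int i = 0; i < l->count; i++)
        if (memcmp(l->entries[i].addr, addr, ADDR_LEN) == 0)
            return &l->entries[i];
    return NULL;
}

/*
 * exclusive == 0: a duplicate bumps the reference count (dev_uc_add-like).
 * exclusive != 0: a duplicate is rejected with -EEXIST and the count is
 *                 left alone (the behaviour described for dev_uc_add_excl).
 */
static int addr_add(struct addr_list *l, const unsigned char *addr,
                    int exclusive)
{
    struct addr_entry *e;

    assert(l && addr);
    e = addr_find(l, addr);
    if (e) {
        if (exclusive)
            return -EEXIST;
        e->refcount++;
        return 0;
    }
    if (l->count == MAX_ADDRS)
        return -ENOMEM;
    e = &l->entries[l->count++];
    memcpy(e->addr, addr, ADDR_LEN);
    e->refcount = 1;
    return 0;
}

/* Drop one reference; the entry disappears when the count reaches zero. */
static int addr_del(struct addr_list *l, const unsigned char *addr)
{
    struct addr_entry *e = addr_find(l, addr);

    if (!e)
        return -ENOENT;
    if (--e->refcount > 0)
        return 0;
    *e = l->entries[--l->count];  /* move the last entry into the hole */
    return 0;
}
```

With the shared add, the same address added twice needs two deletes before it leaves the list. With the exclusive add, the second caller learns immediately that somebody else already owns the address.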



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [net-next PATCH v0 3/5] net: add fdb generic dump routine

2012-03-25 Thread Roopa Prabhu




On 3/18/12 11:52 PM, John Fastabend john.r.fastab...@intel.com wrote:

 This adds a generic dump routine drivers can call. It
 should be sufficient to handle any bridging model that
 uses the unicast address list. This should be most SR-IOV
 enabled NICs.
 
 Signed-off-by: John Fastabend john.r.fastab...@intel.com
 ---
 
  net/core/rtnetlink.c |   56 ++
  1 files changed, 56 insertions(+), 0 deletions(-)
 
 diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
 index 8c3278a..35ee2d6 100644
 --- a/net/core/rtnetlink.c
 +++ b/net/core/rtnetlink.c
 @@ -2082,6 +2082,62 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 return err;
  }
  
 +/**
 + * ndo_dflt_fdb_dump: default netdevice operation to dump an FDB table.
 + * @nlh: netlink message header
 + * @dev: netdevice
 + *
 + * Default netdevice operation to dump the existing unicast address list.
 + * Returns zero on success.
 + */
 +int ndo_dflt_fdb_dump(struct sk_buff *skb,
 +struct netlink_callback *cb,
 +struct net_device *dev,
 +int idx)
 +{
 + struct netdev_hw_addr *ha;
 + struct nlmsghdr *nlh;
 + struct ndmsg *ndm;
 + u32 pid, seq;
 +
 + pid = NETLINK_CB(cb->skb).pid;
 + seq = cb->nlh->nlmsg_seq;
 +
 + netif_addr_lock_bh(dev);
 + list_for_each_entry(ha, &dev->uc.list, list) {
 +  if (idx < cb->args[0])
 +   goto skip;

Any reason why it's only uc? What about mc?

 +
 +  nlh = nlmsg_put(skb, pid, seq,
 +RTM_NEWNEIGH, sizeof(*ndm), NLM_F_MULTI);
 +  if (!nlh)
 +   break;
 +
 +  ndm = nlmsg_data(nlh);
 +  ndm->ndm_family  = AF_BRIDGE;
 +  ndm->ndm_pad1  = 0;
 +  ndm->ndm_pad2= 0;
 +  ndm->ndm_flags  = NTF_LOWERDEV;
 +  ndm->ndm_type  = 0;
 +  ndm->ndm_ifindex = dev->ifindex;
 +  ndm->ndm_state   = NUD_PERMANENT;
 +
 +  NLA_PUT(skb, NDA_LLADDR, ETH_ALEN, ha->addr);
 +
 +  nlmsg_end(skb, nlh);
 +skip:
 +  ++idx;
 + }
 + netif_addr_unlock_bh(dev);
 +
 + return idx;
 +nla_put_failure:
 + netif_addr_unlock_bh(dev);
 + nlmsg_cancel(skb, nlh);
 + return idx;
 +}
 +EXPORT_SYMBOL(ndo_dflt_fdb_dump);
 +
  static int rtnl_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb)
  {
 int idx = 0;
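
The `idx` / `cb->args[0]` pattern in the quoted dump routine is what lets a netlink dump resume after the skb fills up. The following is a simplified userspace sketch of that idea only: `dump_resumable()` is a hypothetical name, plain ints stand in for address entries, and an output array with a capacity stands in for the skb.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy entries into out[], skipping everything before position `start`.
 * Returns the index of the next entry to dump. The caller saves that
 * value and passes it back as `start` on the next pass, the way the
 * quoted routine compares idx against cb->args[0].
 */
static int dump_resumable(const int *entries, int n, int start,
                          int *out, int out_cap, int *out_len)
{
    int idx = 0;

    assert(entries && out && out_len);
    *out_len = 0;
    for (int i = 0; i < n; i++) {
        if (idx < start)
            goto skip;        /* already sent in an earlier pass */
        if (*out_len == out_cap)
            break;            /* buffer full: idx marks where to resume */
        out[(*out_len)++] = entries[i];
skip:
        ++idx;
    }
    return idx;
}
```

Calling it repeatedly with the previous return value as `start` walks the whole list in buffer-sized pieces without sending any entry twice, assuming the list does not change between passes.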
 


Re: [RFC PATCH v0 1/2] net: bridge: propagate FDB table into hardware

2012-02-10 Thread Roopa Prabhu




On 2/9/12 9:36 AM, John Fastabend john.r.fastab...@intel.com wrote:

 On 2/8/2012 8:36 PM, Stephen Hemminger wrote:
 On Wed, 08 Feb 2012 19:22:06 -0800
 John Fastabend john.r.fastab...@intel.com wrote:
 
 Propagate software FDB table into hardware uc, mc lists when
 the NETIF_F_HW_FDB is set.
 
 This resolves the case below where an embedded switch is used
 in hardware to do inter-VF or VF-PF switching. This patch
 pushes the FDB entry (specifically the MAC address) into the
 embedded switch with dev_add_uc and dev_add_mc so the switch
 learns about the software bridge.
 
 
    veth0  veth2
      |      |
    ------------
    |  bridge0 |    software bridging
    ------------
         /
         /
    ethx.y  ethx
      VF     PF
       \     \   propagate FDB entries to HW
       \     \
    --------------------
    |  Embedded Bridge | hardware offloaded switching
    --------------------
 
 This is only an RFC; a couple more changes are needed.
 
 (1) Optimize HW FDB set/del to only walk list if an FDB offloaded
 device is attached. Or decide it doesn't matter from unlikely()
 path.
 
 (2) Is it good enough to just call dev_uc_{add|del} or
 dev_mc_{add|del}? Or do some devices really need a new netdev
 callback to do this operation correctly. I think it should be
 good enough as is.
 
 (3) wrapped list walk in rcu_read_lock() just in case maybe every
 case is already inside rcu_read_lock()/unlock().
 
 Also this is in response to this thread regarding the macvlan and
 exposing rx filters posting now to see if folks think this is the
 right idea and if it will resolve at least the bridge case.
 
 http://lists.openwall.net/netdev/2011/11/08/135
 
 Signed-off-by: John Fastabend john.r.fastab...@intel.com
 ---
 
  include/linux/netdev_features.h |2 ++
  net/bridge/br_fdb.c |   34 ++
  2 files changed, 36 insertions(+), 0 deletions(-)
 
 diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
 index 77f5202..5936fae 100644
 
 Rather than yet another device feature, I would rather use netlink_notifier
 callback. The notifier is more general and generic without messing with
 internals of bridge.
 
 
 But the device features makes it easy for user space to learn that the device
 supports this sort of offload. Now if all SR-IOV devices support this then it
 doesn't matter but I thought there were SR-IOV devices that didn't do any
 switching? I'll dig through the SR-IOV drivers to check there are not too
 many of them.

Correct. Our 802.1Qbh sriov device (enic) does not do local switching.

 
 By netlink_notifier do you mean adding a notifier_block and using
 atomic_notifier_call_chain(), probably in rtnl_notify()? Then drivers could
 register with the notifier chain with atomic_notifier_chain_register() and
 receive the events correctly. Or did I miss some notifier chain that
 already exists?
 
 Thanks,
 John
 


Re: [net-next-2.6 PATCH 0/6 v4] macvlan: MAC Address filtering support for passthru mode

2012-02-05 Thread Roopa Prabhu




On 2/3/12 7:32 AM, Roopa Prabhu ropra...@cisco.com wrote:

 
 
 
 On 2/2/12 10:58 AM, John Fastabend john.r.fastab...@intel.com wrote:
snip..

 Are you sure they will be good to have? I'm  not so sure you want to be
 able to manipulate the uc and mc tables from user space. MACVLAN seems to
 be one type of device where it is useful but doing this to a PF or VF seems
 hard to use for any real use case. Fun to test the embedded bridge though.
 
 
 I won't say I am sure. It would be nice to have netlink interfaces to
 ADD/DEL additional uc, mc addrs, filter flags and vlans. I have looked at
 the existing interfaces and nothing seemed straightforward then. But I
 forget and need to take a look again.
 I think vlans and filter flags are somehow possible today. And maybe mc too.
 But if I am right we don't have a way to add additional unicast addresses
 from userspace.
 
 I will dig out my notes and try to list the problems with using the
 existing netlink interfaces for this.

There are kernel APIs/ops to add/del hw uc/mc/vlan filters and filter flags:
ndo_set_rx_mode, add/del_vid, dev_uc_add, dev_mc_add and dev_filter_flags.

But there are no straightforward mechanisms to add these from userspace. L2
mc addresses can be added by SIOCADDMULTI. And filter flags maybe via
netlink. Nothing for uc and vlan as far as I know (correct me if I am
wrong). Setting of hw filters is usually done indirectly by the kernel,
for example during creation of vlan devices.

There is a netlink msg to create a vlan device. But there is no way to add a
vlan filter directly to the hw. Nothing for secondary uc addrs.
This is ok for all cases except for the virtualization case I am trying to
solve.

To summarize,

The requirement is to have a mechanism from userspace to populate hw filters
on a device. And this is required to program guest nic filters into the host
device backing the guest nic. In the direct attach case, it's the macvtap
device and in turn the macvtap lower device.

Today I can't think of any other use case that would require this (except
that there is a small chance that this could be used in the hybrid
acceleration stuff that Ben and Intel have been discussing).

I see the below ways this can be done:
1) TUNSETTXFILTER: My v1 patch, that targets only the above specific macvtap
problem. This works for only uc/mc and flags filter. Possibly requires a new
cmd TUNSETVLANFILTER for vlan filters.


2) rtnetlink ops for setting hw filters: My v2 patch targeting virtual
devices that implement rtnl_link_ops. Example macvtap/macvlan

This netlink interface to set filters follows TUNSETTXFILTER giving the
ability to set filters on these devices. The netlink payload must contain
all the uc, mc, vid's and filter flags that go on the device.


3) netdev_ops for setting hw filters: my v3 and v4 patches. This is same as
2 but moves the ops to netdev, so that it can be used by all devices if
required.


4) v5 (New approach. Not submitted yet):
In 2 and 3 above, the netlink msg could be broken down to have separate msgs
to support add/del of uc/mc/vlan. This should be close to what we have today
for vf vlan and vf mac. (Something similar to what John Fastabend was
suggesting too). Advantage, use existing hw ops. (This slightly varies from
the original goal which was not targeted at getting in to uc,mc lists alone.
The goal was to have macvlan maintain its own filters and use it in fwding
lookups if needed in the future. But I guess if we need this in the future
we could possibly use the macvlan uc, mc lists.)


Netlink msgs to set hw filters (basically for dev_uc_add/del,
dev_mc_add/del, and vlans). The below is not a final cut. Just attempting
something here. Please comment.

[IFLA_FILTER_ADDR] = {
    [IFLA_FILTER_ADD_ADDR] = {
        [IFLA_FILTER_HWADDR]  // Maybe a list here
    }

    [IFLA_FILTER_DEL_ADDR] = {
        [IFLA_FILTER_HWADDR]  // Maybe a list here
    }
}

[IFLA_FILTER_VLAN] = {
    [IFLA_FILTER_ADD_VLAN] = {
        [IFLA_FILTER_VID]  // Maybe a list here too
    }
    [IFLA_FILTER_DEL_VLAN] = {
        [IFLA_FILTER_VID]  // Maybe a list here too
    }
}



No additional ops, these will call the dev_uc_add/del and add/del_vid api's
directly.
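
For readers less familiar with nested netlink attributes, the layout proposed above could be encoded roughly as follows. This is a self-contained userspace sketch, not libnl or kernel code: the attribute numbers in the enum are hypothetical (the IFLA_FILTER_* names above are only a proposal), and `attr_put()`, `nest_start()` and `nest_end()` are illustrative helpers that mimic the usual netlink TLV rules (4-byte header with length and type, payload padded to 4 bytes, nest length covering all children).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ATTR_HDRLEN 4
#define ATTR_ALIGN(len) (((size_t)(len) + 3) & ~(size_t)3)

/* Hypothetical attribute numbers mirroring the proposed nesting. */
enum { FILTER_ADDR = 1, FILTER_ADD_ADDR, FILTER_DEL_ADDR, FILTER_HWADDR };

struct attr_hdr {
    uint16_t len;   /* header + payload, without trailing padding */
    uint16_t type;
};

struct msgbuf {
    uint8_t data[256];
    size_t len;
};

/* Append one attribute; returns the offset of its header, or -1 if full. */
static int attr_put(struct msgbuf *m, uint16_t type,
                    const void *payload, size_t plen)
{
    struct attr_hdr h;
    size_t total = ATTR_HDRLEN + plen;
    size_t off;

    assert(m);
    off = m->len;
    if (off + ATTR_ALIGN(total) > sizeof(m->data))
        return -1;
    h.len = (uint16_t)total;
    h.type = type;
    memcpy(m->data + off, &h, sizeof(h));
    if (plen)
        memcpy(m->data + off + ATTR_HDRLEN, payload, plen);
    memset(m->data + off + total, 0, ATTR_ALIGN(total) - total);
    m->len = off + ATTR_ALIGN(total);
    return (int)off;
}

/* A nest is an attribute whose payload is filled in afterwards. */
static int nest_start(struct msgbuf *m, uint16_t type)
{
    return attr_put(m, type, NULL, 0);
}

/* Patch the nest header so its length covers everything added since. */
static void nest_end(struct msgbuf *m, int start)
{
    struct attr_hdr h;

    memcpy(&h, m->data + start, sizeof(h));
    h.len = (uint16_t)(m->len - (size_t)start);
    memcpy(m->data + start, &h, sizeof(h));
}

/* [FILTER_ADDR] = { [FILTER_ADD_ADDR] = { [FILTER_HWADDR] mac } } */
static int build_add_addr(struct msgbuf *m, const uint8_t mac[6])
{
    int outer, inner;

    outer = nest_start(m, FILTER_ADDR);
    if (outer < 0)
        return -1;
    inner = nest_start(m, FILTER_ADD_ADDR);
    if (inner < 0)
        return -1;
    if (attr_put(m, FILTER_HWADDR, mac, 6) < 0)
        return -1;
    nest_end(m, inner);
    nest_end(m, outer);
    return 0;
}
```

A list of addresses would simply repeat the innermost attribute inside the same ADD or DEL nest, which is what the "Maybe a list here" comments hint at.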

If people think this might be a better way to go, I can work up a patch for
this.

Or if anyone thinks we can use something existing please let me know.

Again, this is needed for passing guest filters in the virtualization case.
I don't see a need to add support for this in iproute2 too (unless anyone
thinks otherwise)

[Note1: Since the addr is a resource and multiple adds/dels are handled by
reference counts, a thread can really call delete multiple times and delete
all references to a particular addr even if it did not own them. But I think
this is possible even with existing code today, except that today the kernel
does not expose an interface to do this to userspace.

Note2: We probably will need one for filter flags as well. I am still
thinking maybe we can

Re: [net-next-2.6 PATCH 0/6 v4] macvlan: MAC Address filtering support for passthru mode

2012-02-03 Thread Roopa Prabhu




On 2/2/12 10:58 AM, John Fastabend john.r.fastab...@intel.com wrote:

 On 2/2/2012 10:07 AM, Roopa Prabhu wrote:

snip..
 
 My patches were trying to do just this (unless I am missing something).
 
 
 Right I was trying enumerate the cases. Your patches 5,6 seem to use
 dev_{uc|mc}_{add|del} like this.
 
Ok

 
 I think this has some real advantages to the above scheme. First
 we get rid of _all_ the drivers having to add a bunch of new
 net_device ops and do it once in the layer above. This is nice
 for driver implementers but also because your feature becomes usable
 immediately and we don't have to wait for driver developers to implement
 it.
 
 Yes my patches were targeting towards this too. I had macvlan implement the
 netlink ops and macvlan internally was using the dev_uc_add and del routines
 to pass the addr lists to lower device.
 
 Yes. But I am missing why the VF ops and netlink extensions are useful. Or
 even the op/netlink extension into the PF for that matter.
 

This was to support something that Intel wanted. I think Ben described that
in another email. This was done by the v3 version of the patch. This is not
needed for the macvlan problem that this patch is trying to solve. We were
trying to make the new netlink interface fit other use cases if possible.

I believe at this point we are convinced that the hybrid acceleration with
PF/VF that Ben described can possibly be solved in other ways.

 
 
 Also it prunes down the number of netlink extensions being added
 here. 
 
 Additionally the existing semantics seem a bit strange to me on the
 netlink message side. Taking a quick look at the macvlan implementation
 it looks like every set has to have a complete list of address. But
 the dev_uc_add and dev_uc_del seem to be using a refcnt scheme so
 if I want to add a second address and then latter a third address
 how does that work?
 
 Every set has a complete list of addresses because, for macvlan non-passthru
 modes, in future we might want to have macvlan driver do the filtering (This
 is for the case when we have a single lower device and multiple macvlans)
 
 
 hmm but lists seem problematic when hooked up to the netdev uc and mc addr
 lists. Consider this case
 
 read uc_list  --- thread1: dumps unicast table
 add vlan  --- thread2: adds a vlan maybe inserting a uc addr
 write uc_list --- thread1: writes the table back + 1 addr
 
 Does the uc addr of the vlan get deleted? And this case

It should.

 
 read uc_list   --- dump table
 write uc_list  --- add a new filter A to the uc list
 read uc_list   --- dump table
 write uc_list  --- add a second filter B to the uc list
 
 Now based on your patch 4,5 it looks like the refcnt on the address A is
 two so to remove it I have to call set filters twice without the A addr.

I think this depends on the implementation of the handler. The macvtap
handler will remove A and add B if A is not part of the second filter.


 
 read  uc_list   --- dump table
 write uc_list   --- list without A
 write uc_list   --- list without A
 
 This seems really easy to get screwed up and it doesn't look like user
 space can learn the refcnt (at least in this series).
 
 


The sequences you are describing above are possible, but I think they depend
on how you implement the filter handler.
For macvlan, this filter op could just populate the internal filter.
Except that for passthru mode it tries to program the lower dev hw filter
using uc/mc lists. And in passthru mode it assumes that it owns the lower
device, and tries to make sure that it adds or deletes only addresses it
knows about. I think if you have another thread also adding/deleting address
to the lowerdev when it is assigned to macvtap in passthru mode, the other
thread might see inconsistent results.
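
One way to read "adds or deletes only addresses it knows about" is a set handler that diffs the new filter against the list it previously programmed. The sketch below is an illustration under that assumption, not the code from the patch series: `struct filter`, `filter_set()` and the `lower_stats` counters (standing in for dev_uc_add()/dev_uc_del() calls on the lower device) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FILTER 8

struct filter {
    uint64_t addrs[MAX_FILTER];  /* MACs packed into the low 48 bits */
    size_t n;
};

/* Counts what would be pushed to the lower device. */
struct lower_stats {
    int adds;
    int dels;
};

static bool filter_has(const struct filter *f, uint64_t addr)
{
    for (size_t i = 0; i < f->n; i++)
        if (f->addrs[i] == addr)
            return true;
    return false;
}

/*
 * Replace the owned filter with `want`. Only the difference reaches the
 * lower device: addresses that disappeared are deleted once, addresses
 * that are new are added once, unchanged addresses are not touched.
 */
static int filter_set(struct filter *owned, const struct filter *want,
                      struct lower_stats *st)
{
    assert(owned && want && st);
    if (want->n > MAX_FILTER)
        return -1;
    for (size_t i = 0; i < owned->n; i++)
        if (!filter_has(want, owned->addrs[i]))
            st->dels++;
    for (size_t i = 0; i < want->n; i++)
        if (!filter_has(owned, want->addrs[i]))
            st->adds++;
    *owned = *want;
    return 0;
}
```

With this shape, setting {A} and then {B} results in one add of A, one delete of A and one add of B, so the reference count on the lower device never goes above one for addresses owned by this handler.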

The netlink filter interface which the patch was trying to add was not a
replacement for existing uc/mc lists. It was really targeted to virtual
devices that want to do filtering on their own.
I believe uc/mc lists are used to set/unset hw filters and they are not used
in fwding lookups in the kernel (pls correct me if I am wrong). The filter
that this netlink msg was trying to set was for devices that want to do
filtering/fwding lookups like the hw.

 
 Is the expected flow from user space 'read uc_list - write uc_list'?
 This seems risky because with two adders in user space you might
 lose addresses unless they are somehow kept in sync. IMHO it is likely
 easier to implement an ADD and DEL attribute rather than a table
 approach.
 
 The ADD and DEL will work for macvlan passthru mode because it maps 1-1 with
 the lowerdev uc and mc list. The table was for non passthru modes when
 macvlan driver might need to do filtering. So my patchset started with
 macvlan filter table for all macvlan modes (hopefully) with passthru mode as
 a specific case of offloading everything to the lowerdevice.
 
 
 Still this doesn't require a table right. Repeated ADD/DEL should work
 correct?
 
Yes that's correct. You

Re: [net-next-2.6 PATCH 0/6 v4] macvlan: MAC Address filtering support for passthru mode

2012-02-02 Thread Roopa Prabhu




On 2/2/12 12:46 AM, John Fastabend john.r.fastab...@intel.com wrote:

 On 2/1/2012 11:24 PM, Michael S. Tsirkin wrote:
 On Sun, Nov 20, 2011 at 08:30:24AM -0800, Roopa Prabhu wrote:
 
 
 
 On 11/17/11 4:15 PM, Ben Hutchings bhutchi...@solarflare.com wrote:
 
 Sorry to come to this rather late.
 
 On Tue, 2011-11-08 at 23:55 -0800, Roopa Prabhu wrote:
 [...]
 v2 - v3
 - Moved set and get filter ops from rtnl_link_ops to netdev_ops
 - Support for SRIOV VFs.
 [Note: The get filters msg (in the way current get rtnetlink handles
 it) might get too big for SRIOV vfs. This patch follows existing sriov
 vf get code and tries to accommodate filters for all VF's in a PF.
 And for the SRIOV case I have only tested the fact that the VF
 arguments are getting delivered to rtnetlink correctly. The code
 follows existing sriov vf handling code so rest of it should work
 fine]
 [...]
 
 This is already broken for large numbers of VFs, and increasing the
 amount of information per VF is going to make the situation worse.  I am
 no netlink expert but I think that the current approach of bundling all
 information about an interface in a single message may not be
 sustainable.
 
 Yes agreed. I have the same concern.
 
 So it seems that we need to extend the existing interface to allow
 tweaking filters per VF. Does it need to block this
 patchset though? After all, we'll need to support the existing
 
 hmm not sure I follow what patchset is this blocking?
 
 interface indefinitely, too.
 
 
 OK finally got to read through this. And its not clear to me why we need
 these per VF/PF filter netdevice ops and netlink extensions if we can
 get the stacking correct. (Adding filters to the macvlan seems reasonable
 to me)
 
 In the cases I saw listed above I see a few enumerations:
 
 PF -- MACVLAN  --- Guest --- [...]
 
 VF -- MACVLAN  --- Guest --- [...]
 
 VF|Guest --- [...]   direct assigned VF
 
 PF|Guest --- [...]   direct assigned PF
 
 
 I used '[...]' to represent whatever additional stacking is done in the
 guest unknown to the host. In the direct assign VF case (Greg Rose
 correct me if I am wrong) the normal uc and mc addr lists should suffice
 along with the netdev op ndo_set_rx_mode(). Here the guest adds MAC
 addresses and/or VLANS as normal and then the VF-PF back channel
 should handle this if needed. This should work for Linux guests and other
 OS's should do something similar.
 
 In the direct assign PF case the hardware is owned by the guest so
 no problems here.
 
 This leaves the two MACVLAN cases which can be handled the same. If
 the MACVLAN driver and netlink interface is extended to add filters
 to the MACVLAN then the addresses can be pushed to the lower device
 using the normal dev_uc_{add|del}() and dev_mc_{add|del}() routines.

My patches were trying to do just this (unless I am missing something).

 
 I think this has some real advantages to the above scheme. First
 we get rid of _all_ the drivers having to add a bunch of new
 net_device ops and do it once in the layer above. This is nice
 for driver implementers but also because your feature becomes usable
 immediately and we don't have to wait for driver developers to implement
 it.

Yes my patches were targeting towards this too. I had macvlan implement the
netlink ops and macvlan internally was using the dev_uc_add and del routines
to pass the addr lists to lower device.

 
 Also it prunes down the number of netlink extensions being added
 here. 
 
 Additionally the existing semantics seem a bit strange to me on the
 netlink message side. Taking a quick look at the macvlan implementation
 it looks like every set has to have a complete list of address. But
 the dev_uc_add and dev_uc_del seem to be using a refcnt scheme so
 if I want to add a second address and then latter a third address
 how does that work?

Every set has a complete list of addresses because, for macvlan non-passthru
modes, in future we might want to have macvlan driver do the filtering (This
is for the case when we have a single lower device and multiple macvlans)

 
 Is the expected flow from user space 'read uc_list - write uc_list'?
 This seems risky because with two adders in user space you might
 lose addresses unless they are somehow kept in sync. IMHO it is likely
 easier to implement an ADD and DEL attribute rather than a table
 approach.

The ADD and DEL will work for macvlan passthru mode because it maps 1-1 with
the lowerdev uc and mc list. The table was for non passthru modes when
macvlan driver might need to do filtering. So my patchset started with
macvlan filter table for all macvlan modes (hopefully) with passthru mode as
a specific case of offloading everything to the lowerdevice.

 Also the table was mimicking existing tap device filter table for macvtap.
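
The concern about two adders in user space can be shown with a small deterministic simulation. This is only a sketch of the interleaving discussed above: `struct table`, `table_add()` and `table_replace()` are hypothetical, and small ints stand in for MAC addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TBL_MAX 8

struct table {
    int addrs[TBL_MAX];
    size_t n;
};

static bool table_has(const struct table *t, int a)
{
    for (size_t i = 0; i < t->n; i++)
        if (t->addrs[i] == a)
            return true;
    return false;
}

/* Incremental ADD: touches only the one address. */
static void table_add(struct table *t, int a)
{
    assert(t);
    if (!table_has(t, a) && t->n < TBL_MAX)
        t->addrs[t->n++] = a;
}

/* Whole-table write: whatever was there before is gone. */
static void table_replace(struct table *t, const struct table *snap)
{
    *t = *snap;
}

/* Both writers read first, then each writes back "snapshot + 1 addr". */
static void race_with_replace(struct table *dev, int a, int b)
{
    struct table s1 = *dev;
    struct table s2 = *dev;

    table_add(&s1, a);
    table_add(&s2, b);
    table_replace(dev, &s1);
    table_replace(dev, &s2);  /* silently drops writer 1's address */
}

/* The same two writers using ADD messages instead. */
static void race_with_add(struct table *dev, int a, int b)
{
    table_add(dev, a);
    table_add(dev, b);
}
```

With whole-table writes the outcome depends on who writes last. With ADD/DEL messages both addresses survive regardless of ordering.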

 Took a quick stab at something like this below but there
 might be a better way to do this and allow direct

Re: [net-next-2.6 PATCH 0/6 v4] macvlan: MAC Address filtering support for passthru mode

2011-11-20 Thread Roopa Prabhu




On 11/17/11 4:15 PM, Ben Hutchings bhutchi...@solarflare.com wrote:

 Sorry to come to this rather late.
 
 On Tue, 2011-11-08 at 23:55 -0800, Roopa Prabhu wrote:
 [...]
 v2 - v3
 - Moved set and get filter ops from rtnl_link_ops to netdev_ops
 - Support for SRIOV VFs.
 [Note: The get filters msg (in the way current get rtnetlink handles
 it) might get too big for SRIOV vfs. This patch follows existing sriov
 vf get code and tries to accommodate filters for all VF's in a PF.
 And for the SRIOV case I have only tested the fact that the VF
 arguments are getting delivered to rtnetlink correctly. The code
 follows existing sriov vf handling code so rest of it should work
 fine]
 [...]
 
 This is already broken for large numbers of VFs, and increasing the
 amount of information per VF is going to make the situation worse.  I am
 no netlink expert but I think that the current approach of bundling all
 information about an interface in a single message may not be
 sustainable.

Yes agreed. I have the same concern.

 
 Also, I'm unclear on why this interface is to be used to set filtering
 for the (PF) net device as well as for related VFs.  Doesn't that
 duplicate the functionality of ndo_set_rx_mode and
 ndo_vlan_rx_{add,kill}_vid?
 
Yes, I have thought about this. But the final version is the way it is
because it's trying to accommodate SRIOV and non-SRIOV cases; I was just
trying to make the netlink interface available to as many use cases as
might need it.

I just wanted to bring up the original intent of this patch, which was to
add support for TUNSETTXFILTER to macvtap so that it can do filtering
instead of putting the lower dev (physical dev) in promiscuous mode (this
part really does not care if the lowerdev is an SRIOV VF or not).
And the focus was on macvlan passthru mode because it is the simplest case
to solve (you just have to pass the filters to the lowerdev device/driver).
Now this may seem like it can be done with the existing
set_rx_mode/add_vlan_id etc. (which are essentially the mechanisms I am
using in the macvlan driver to send the filters to lowerdev for passthru
mode), but that might not be the case for other macvlan modes. The macvlan
device might need to maintain and do a filter lookup like the tap driver
does today. And the only reason SRIOV came up in the original patch was
because PASSTHRU mode of macvlan was added for the SRIOV use case, though
it really does not care if the lowerdev is an SRIOV VF or not.


Instead of implementing TUNSETTXFILTER, Michael had suggested a netlink
interface.
When implementing the netlink interface, I did go back and forth in deciding
whether this should be on every netdev or only on virtual devices that
support rtnl link ops. And the existence of the set_rx_mode and add_vlan_id
netdev ops was the reason for the confusion. So the next version implemented
it as rtnl link ops because all I really want is a mechanism like
TUNSETTXFILTER which can set/get filters for virtual devices that need to do
filtering by themselves. But restricting this interface to only virtual
devices did not make great sense, so when Greg pointed out that he will need
it for VF netdevs, I was happy to move it to netdev ops.

And the only reason this patch works on both PF and VF in the final version
is that it's trying to accommodate both SRIOV and non-SRIOV devices.
So by saying PF and VF, for me it really means SRIOV VF and any other netdev
devices. So I intentionally did not put PF or VF in the name of the op.

Thanks,
Roopa


[net-next-2.6 PATCH 0/6 v4] macvlan: MAC Address filtering support for passthru mode

2011-11-08 Thread Roopa Prabhu

,
int vf);
int (*ndo_get_rx_filter_addr)(
const struct net_device *dev,
int vf, struct sk_buff *skb);
int (*ndo_get_rx_filter_vlan)(
const struct net_device *dev,
int vf, struct sk_buff *skb);

Some answers to questions that were raised during the review:
- Protection against address spoofing:
- This patch adds filtering support only for macvtap PASSTHRU 
Mode. PASSTHRU mode is used mainly with SRIOV VF's. And SRIOV VF's 
come with anti mac/vlan spoofing support in the lowerdev driver. 
(netdev infrastructure to support this was added recently 
with IFLA_VF_SPOOFCHK). For 802.1Qbh devices, the port profile has a 
knob to enable/disable anti spoof check. Lowerdevice drivers also 
enforce limits on the number of address registrations allowed. 
For non-SRIOV VF's it's the responsibility of the lowerdev driver
to implement any such protection. The current netdev hooks for 
SRIOV VF's spoof check could be extended to accommodate any network 
interface in the future.

- Support for multiqueue devices: Enable filtering on individual queues (?):
As I understand from the thread between Michael and Greg, the
VMDq Linux implementation is not in yet and I don't know how it's going to
take shape. But Intel VMDq devices do accept filters on a per-queue
basis. Since the netdev infrastructure for VMDq is not in yet, it's
hard to say how this patch can support it.

This patch makes use of the current netdev infrastructure for setting
address and vlan filters. And if that changes for VMDq tomorrow,
then the work that this patch represents can be modified to accommodate
VMDq devices at that time. 

So I don't see a huge problem with this patch getting in the way of
VMDq devices.

- Support for non-PASSTHRU mode:
I started implementing this. But there are a couple of problems.
- Today, in non-PASSTHRU cases macvlan_handle_frame assumes that 
every macvlan device has a single unique mac.
And the macvlans are hashed on that single mac address. 
To support filtering for non-PASSTHRU mode in addition to this 
patch the following needs to be done:
- non-passthru mode with a single macvlan over a lower dev
can be treated as PASSTHRU case
- For non-PASSTHRU mode with multiple macvlans over a single 
lower dev:  
- Multiple unicast mac's now need to be hashed to the 
same macvlan device. The macvlan hash needs to change 
for lookup based on any one of the multiple unicast 
addresses a macvlan is interested in
- We need to consider vlans during the lookup too
- So the macvlan device hash needs to hash on both mac 
and vlan
- But the support for filtering in non-PASSTHRU mode can be 
built on this patch

This patch series implements the following 
01/6 rtnetlink: Netlink interface for setting MAC and VLAN filters
02/6 netdev: Add netdev_ops to set and get MAC/VLAN rx filters
03/6 rtnetlink: Add support to set MAC/VLAN filters
04/6 rtnetlink: Add support to get MAC/VLAN filters
05/6 macvlan: Add support to set MAC/VLAN filter netdev ops
06/6 macvlan: Add support to get MAC/VLAN filter netdev ops

Please comment. Thanks.

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com

[net-next-2.6 PATCH 1/6 v4] rtnetlink: Netlink interface for setting MAC and VLAN filters

2011-11-08 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch introduces the following netlink interface to set
MAC and VLAN filters on a network interface. It can be used to
set an RX filter on any network interface (if supported by the driver) and
also on an SRIOV VF via its PF.

Interface to set RX filter on a SRIOV VF
[IFLA_VF_RX_FILTERS] = {
    [IFLA_VF_RX_FILTER] = {
        [IFLA_RX_FILTER_VF]
        [IFLA_RX_FILTER_ADDR] = {
            [IFLA_RX_FILTER_ADDR_FLAGS]
            [IFLA_RX_FILTER_ADDR_UC_LIST] = {
                [IFLA_ADDR_LIST_ENTRY]
            }
            [IFLA_RX_FILTER_ADDR_MC_LIST] = {
                [IFLA_ADDR_LIST_ENTRY]
            }
        }
        [IFLA_RX_FILTER_VLAN] = {
            [IFLA_RX_FILTER_VLAN_BITMAP]
        }
    }
    ...
}

Interface to set RX filter on any network interface:
[IFLA_RX_FILTER] = {
    [IFLA_RX_FILTER_VF]
    [IFLA_RX_FILTER_ADDR] = {
        [IFLA_RX_FILTER_ADDR_FLAGS]
        [IFLA_RX_FILTER_ADDR_UC_LIST] = {
            [IFLA_ADDR_LIST_ENTRY]
        }
        [IFLA_RX_FILTER_ADDR_MC_LIST] = {
            [IFLA_ADDR_LIST_ENTRY]
        }
    }
    [IFLA_RX_FILTER_VLAN] = {
        [IFLA_RX_FILTER_VLAN_BITMAP]
    }
}
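
As an illustration of what an attribute like IFLA_RX_FILTER_VLAN_BITMAP might carry, here is a minimal VLAN id bitmap in userspace C. It assumes 4096 possible VLAN ids (the value of VLAN_N_VID in linux/if_vlan.h) and one bit per id; `struct vlan_filter` and the helper names are hypothetical, and how the patch actually splits or sizes the bitmap on the wire is not shown here.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define VID_COUNT        4096                   /* assumed VLAN id space */
#define VID_BITMAP_BYTES (VID_COUNT / CHAR_BIT) /* one bit per VLAN id */

struct vlan_filter {
    uint8_t bits[VID_BITMAP_BYTES];
};

/* Allow frames tagged with this VLAN id. */
static void vlan_set(struct vlan_filter *f, unsigned vid)
{
    assert(f && vid < VID_COUNT);
    f->bits[vid / CHAR_BIT] |= (uint8_t)(1u << (vid % CHAR_BIT));
}

/* Stop allowing frames tagged with this VLAN id. */
static void vlan_clear(struct vlan_filter *f, unsigned vid)
{
    assert(f && vid < VID_COUNT);
    f->bits[vid / CHAR_BIT] &= (uint8_t)~(1u << (vid % CHAR_BIT));
}

static bool vlan_test(const struct vlan_filter *f, unsigned vid)
{
    assert(f && vid < VID_COUNT);
    return (f->bits[vid / CHAR_BIT] >> (vid % CHAR_BIT)) & 1u;
}

/* Number of VLAN ids currently allowed. */
static unsigned vlan_count(const struct vlan_filter *f)
{
    unsigned n = 0;

    for (unsigned vid = 0; vid < VID_COUNT; vid++)
        if (vlan_test(f, vid))
            n++;
    return n;
}
```

A fixed-size bitmap keeps the attribute payload the same length no matter how many VLANs are enabled, at the cost of always sending the full id space.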

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 include/linux/if_link.h |   61 +++
 net/core/rtnetlink.c|   20 +++
 2 files changed, 81 insertions(+), 0 deletions(-)


diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index c52d4b5..74a9f17 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -137,6 +137,8 @@ enum {
IFLA_AF_SPEC,
IFLA_GROUP, /* Group the device belongs to */
IFLA_NET_NS_FD,
+   IFLA_VF_RX_FILTERS,
+   IFLA_RX_FILTER,
__IFLA_MAX
 };
 
@@ -390,4 +392,63 @@ struct ifla_port_vsi {
__u8 pad[3];
 };
 
+/* VF rx filters management section
+ *
+ * Nested layout of set/get msg is:
+ *
+ *   [IFLA_VF_RX_FILTERS]
+ *     [IFLA_VF_RX_FILTER]
+ *       [IFLA_RX_FILTER_*], ...
+ *     [IFLA_VF_RX_FILTER]
+ *       [IFLA_RX_FILTER_*], ...
+ *     ...
+ *   [IFLA_RX_FILTER]
+ *     [IFLA_RX_FILTER_*], ...
+ */
+enum {
+   IFLA_VF_RX_FILTER_UNSPEC,
+   IFLA_VF_RX_FILTER,  /* nest */
+   __IFLA_VF_RX_FILTER_MAX,
+};
+
+#define IFLA_VF_RX_FILTER_MAX (__IFLA_VF_RX_FILTER_MAX - 1)
+
+enum {
+   IFLA_RX_FILTER_UNSPEC,
+   IFLA_RX_FILTER_VF,  /* __u32 */
+   IFLA_RX_FILTER_ADDR,
+   IFLA_RX_FILTER_VLAN,
+   __IFLA_RX_FILTER_MAX,
+};
+#define IFLA_RX_FILTER_MAX (__IFLA_RX_FILTER_MAX - 1)
+
+enum {
+   IFLA_RX_FILTER_ADDR_UNSPEC,
+   IFLA_RX_FILTER_ADDR_FLAGS,
+   IFLA_RX_FILTER_ADDR_UC_LIST,
+   IFLA_RX_FILTER_ADDR_MC_LIST,
+   __IFLA_RX_FILTER_ADDR_MAX,
+};
+#define IFLA_RX_FILTER_ADDR_MAX (__IFLA_RX_FILTER_ADDR_MAX - 1)
+
+#define RX_FILTER_FLAGS (IFF_UP | IFF_BROADCAST | IFF_MULTICAST | \
+   IFF_PROMISC | IFF_ALLMULTI)
+
+enum {
+   IFLA_ADDR_LIST_UNSPEC,
+   IFLA_ADDR_LIST_ENTRY,
+   __IFLA_ADDR_LIST_MAX,
+};
+#define IFLA_ADDR_LIST_MAX (__IFLA_ADDR_LIST_MAX - 1)
+
+enum {
+   IFLA_RX_FILTER_VLAN_UNSPEC,
+   IFLA_RX_FILTER_VLAN_BITMAP,
+   __IFLA_RX_FILTER_VLAN_MAX,
+};
+#define IFLA_RX_FILTER_VLAN_MAX (__IFLA_RX_FILTER_VLAN_MAX - 1)
+
+#define VLAN_BITMAP_SPLIT_MAX 8
+#define VLAN_BITMAP_SIZE   (VLAN_N_VID/VLAN_BITMAP_SPLIT_MAX)
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 9083e82..9eead8e 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -42,6 +42,7 @@
 
 #include <linux/inet.h>
 #include <linux/netdevice.h>
+#include <linux/if_vlan.h>
 #include <net/ip.h>
 #include <net/protocol.h>
 #include <net/arp.h>
@@ -1097,6 +1098,8 @@ const struct nla_policy ifla_policy[IFLA_MAX+1] = {
[IFLA_VF_PORTS] = { .type = NLA_NESTED },
[IFLA_PORT_SELF]= { .type = NLA_NESTED },
[IFLA_AF_SPEC]  = { .type = NLA_NESTED },
+   [IFLA_VF_RX_FILTERS]= { .type = NLA_NESTED },
+   [IFLA_RX_FILTER]= { .type = NLA_NESTED },
 };
 EXPORT_SYMBOL(ifla_policy);
 
@@ -1132,6 +1135,23 @@ static const struct nla_policy ifla_port_policy[IFLA_PORT_MAX+1] = {
[IFLA_PORT_RESPONSE]= { .type = NLA_U16, },
 };
 
+static const struct nla_policy ifla_rx_filter_policy[IFLA_RX_FILTER_MAX+1] = {
+   [IFLA_RX_FILTER_VF] = { .type = NLA_U32 },
+   [IFLA_RX_FILTER_ADDR]   = { .type = NLA_NESTED },
+   [IFLA_RX_FILTER_VLAN]   = { .type = NLA_NESTED

[net-next-2.6 PATCH 2/6 v4] net: Add netdev_ops to set and get MAC/VLAN rx filters

2011-11-08 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds the following netdev_ops to set and get MAC/VLAN
filters on an SR-IOV VF or any netdev interface. Each op takes a vf argument.
A vf value of SELF_VF (-1) applies the operation directly to the
interface itself.

ndo_set_rx_filter_addr - to set the address filter
ndo_get_rx_filter_addr_size - to get the address filter size
ndo_get_rx_filter_addr - to get the address filter

ndo_set_rx_filter_vlan - to set the vlan filter
ndo_get_rx_filter_vlan_size - to get the vlan filter size
ndo_get_rx_filter_vlan - to get the vlan filter
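
As a rough illustration of the vf argument convention, the sketch below models an ops table with one optional set hook and a caller that returns -EOPNOTSUPP when the hook is missing. Everything named toy_* is hypothetical and stands in for net_device, net_device_ops and a driver; the flags value replaces the struct nlattr *tb[] argument of the real ops to keep the example self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_SELF_VF (-1)   /* same meaning as SELF_VF in this series */
#define TOY_MAX_VFS 4

struct toy_dev;

/* Stand-in for the new net_device_ops members; a hook may be NULL. */
struct toy_netdev_ops {
    int (*set_rx_filter_addr)(struct toy_dev *dev, int vf,
                              unsigned int flags);
};

struct toy_dev {
    const struct toy_netdev_ops *ops;
    int num_vfs;
    unsigned int self_flags;
    unsigned int vf_flags[TOY_MAX_VFS];
};

/* Hypothetical driver hook: TOY_SELF_VF targets the device itself,
 * any other value must name an existing VF. */
static int toy_set_rx_filter_addr(struct toy_dev *dev, int vf,
                                  unsigned int flags)
{
    if (vf == TOY_SELF_VF) {
        dev->self_flags = flags;
        return 0;
    }
    if (vf < 0 || vf >= dev->num_vfs)
        return -EINVAL;
    dev->vf_flags[vf] = flags;
    return 0;
}

static const struct toy_netdev_ops toy_ops = {
    .set_rx_filter_addr = toy_set_rx_filter_addr,
};

/* Caller side: refuse cleanly when the driver has no such hook. */
static int toy_do_set(struct toy_dev *dev, int vf, unsigned int flags)
{
    assert(dev != NULL);
    if (!dev->ops || !dev->ops->set_rx_filter_addr)
        return -EOPNOTSUPP;
    return dev->ops->set_rx_filter_addr(dev, vf, flags);
}
```

A device with num_vfs = 2 would accept vf values -1, 0 and 1 here, and reject anything else with -EINVAL.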

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 include/linux/netdevice.h |   32 
 1 files changed, 32 insertions(+), 0 deletions(-)


diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index cbeb586..3cbd700 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -855,6 +855,20 @@ struct netdev_tc_txq {
  * feature set might be less than what was returned by ndo_fix_features()).
  * Must return 0 or -errno if it changed dev->features itself.
  *
+ * Address Filter management functions:
+ * int (*ndo_set_rx_filter_addr)(struct net_device *dev, int vf,
+ *  struct nlattr *tb[]);
+ * size_t (*ndo_get_rx_filter_addr_size)(const struct net_device *dev, int vf);
+ * int (*ndo_get_rx_filter_addr)(const struct net_device *dev, int vf,
+ *  struct sk_buff *skb);
+ *
+ * Vlan Filter management functions:
+ * int (*ndo_set_rx_filter_vlan)(struct net_device *dev, int vf,
+ *  struct nlattr *tb[]);
+ * size_t (*ndo_get_rx_filter_vlan_size)(const struct net_device *dev, int vf);
+ * int (*ndo_get_rx_filter_vlan)(const struct net_device *dev, int vf,
+ *  struct sk_buff *skb);
+ *
  */
 struct net_device_ops {
int (*ndo_init)(struct net_device *dev);
@@ -948,6 +962,24 @@ struct net_device_ops {
u32 features);
int (*ndo_set_features)(struct net_device *dev,
u32 features);
+   int (*ndo_set_rx_filter_addr)(
+   struct net_device *dev, int vf,
+   struct nlattr *tb[]);
+   size_t  (*ndo_get_rx_filter_addr_size)(
+   const struct net_device *dev,
+   int vf);
+   int (*ndo_get_rx_filter_addr)(
+   const struct net_device *dev,
+   int vf, struct sk_buff *skb);
+   int (*ndo_set_rx_filter_vlan)(
+   struct net_device *dev, int vf,
+   struct nlattr *tb[]);
+   size_t  (*ndo_get_rx_filter_vlan_size)(
+   const struct net_device *dev,
+   int vf);
+   int (*ndo_get_rx_filter_vlan)(
+   const struct net_device *dev,
+   int vf, struct sk_buff *skb);
 };
 
 /*


[net-next-2.6 PATCH 3/6 v4] rtnetlink: Add support to set MAC/VLAN filters

2011-11-08 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support in rtnetlink for setting IFLA_RX_FILTER and
IFLA_VF_RX_FILTERS. It calls netdev_ops->ndo_set_rx_filter_addr and
netdev_ops->ndo_set_rx_filter_vlan.
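
Before the address filter is handed to the driver, the set path rejects any flag bits outside RX_FILTER_FLAGS. A minimal standalone sketch of that mask check follows. The TOY_IFF_* constants are local copies of the flag bits from linux/if.h so the example builds without kernel headers, and toy_check_rx_filter_flags is a made-up name, not a function in this series.

```c
#include <assert.h>
#include <errno.h>

/* Local copies of the interface flag bits from linux/if.h. */
#define TOY_IFF_UP        0x1u
#define TOY_IFF_BROADCAST 0x2u
#define TOY_IFF_LOOPBACK  0x8u
#define TOY_IFF_PROMISC   0x100u
#define TOY_IFF_ALLMULTI  0x200u
#define TOY_IFF_MULTICAST 0x1000u

/* Same set of bits as RX_FILTER_FLAGS in patch 1/6. */
#define TOY_RX_FILTER_FLAGS (TOY_IFF_UP | TOY_IFF_BROADCAST | \
                             TOY_IFF_MULTICAST | TOY_IFF_PROMISC | \
                             TOY_IFF_ALLMULTI)

/* Return 0 if only filter-related bits are set, -EINVAL otherwise. */
static int toy_check_rx_filter_flags(unsigned int flags)
{
    return (flags & ~TOY_RX_FILTER_FLAGS) ? -EINVAL : 0;
}

/* A few expected outcomes, written as assertions. */
static void toy_flags_examples(void)
{
    assert(toy_check_rx_filter_flags(0) == 0);
    assert(toy_check_rx_filter_flags(TOY_IFF_PROMISC |
                                     TOY_IFF_ALLMULTI) == 0);
    assert(toy_check_rx_filter_flags(TOY_IFF_PROMISC |
                                     TOY_IFF_LOOPBACK) == -EINVAL);
}
```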

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 include/linux/if_link.h |2 +
 net/core/rtnetlink.c|  101 +++
 2 files changed, 103 insertions(+), 0 deletions(-)


diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 74a9f17..a8c2c14 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -268,6 +268,8 @@ enum macvlan_mode {
 
 /* SR-IOV virtual function management section */
 
+#define SELF_VF -1
+
 enum {
IFLA_VF_INFO_UNSPEC,
IFLA_VF_INFO,
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 9eead8e..a042910 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1294,6 +1294,66 @@ static int do_set_master(struct net_device *dev, int ifindex)
return 0;
 }
 
+static int do_set_rx_filter(struct net_device *dev, int vf,
+   struct nlattr *rx_filter[],
+   int *modified)
+{
+   const struct net_device_ops *ops = dev->netdev_ops;
+   int err = 0;
+
+   if (rx_filter[IFLA_RX_FILTER_ADDR]) {
+   struct nlattr *addr_filter[IFLA_RX_FILTER_ADDR_MAX+1];
+
+   if (!ops->ndo_set_rx_filter_addr) {
+   err = -EOPNOTSUPP;
+   goto errout;
+   }
+
+   err = nla_parse_nested(addr_filter, IFLA_RX_FILTER_ADDR_MAX,
+   rx_filter[IFLA_RX_FILTER_ADDR],
+   ifla_addr_filter_policy);
+   if (err < 0)
+   goto errout;
+
+   if (addr_filter[IFLA_RX_FILTER_ADDR_FLAGS]) {
+   unsigned int flags = nla_get_u32(
+   addr_filter[IFLA_RX_FILTER_ADDR_FLAGS]);
+   if (flags & ~RX_FILTER_FLAGS) {
+   err = -EINVAL;
+   goto errout;
+   }
+   }
+
+   err = ops->ndo_set_rx_filter_addr(dev, vf, addr_filter);
+   if (err < 0)
+   goto errout;
+   *modified = 1;
+   }
+
+   if (rx_filter[IFLA_RX_FILTER_VLAN]) {
+   struct nlattr *vlan_filter[IFLA_RX_FILTER_VLAN_MAX+1];
+
+   if (!ops->ndo_set_rx_filter_vlan) {
+   err = -EOPNOTSUPP;
+   goto errout;
+   }
+
+   err = nla_parse_nested(vlan_filter, IFLA_RX_FILTER_VLAN_MAX,
+   rx_filter[IFLA_RX_FILTER_VLAN],
+   ifla_vlan_filter_policy);
+   if (err < 0)
+   goto errout;
+
+   err = ops->ndo_set_rx_filter_vlan(dev, vf, vlan_filter);
+   if (err < 0)
+   goto errout;
+   *modified = 1;
+   }
+
+errout:
+   return err;
+}
+
 static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
  struct nlattr **tb, char *ifname, int modified)
 {
@@ -1515,6 +1575,47 @@ static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
modified = 1;
}
}
+
+   if (tb[IFLA_VF_RX_FILTERS]) {
+   struct nlattr *vf_rx_filter[IFLA_RX_FILTER_MAX+1];
+   struct nlattr *attr;
+   int vf;
+   int rem;
+
+   nla_for_each_nested(attr, tb[IFLA_VF_RX_FILTERS], rem) {
+   if (nla_type(attr) != IFLA_VF_RX_FILTER)
+   continue;
+   err = nla_parse_nested(vf_rx_filter, IFLA_RX_FILTER_MAX,
+   attr, ifla_rx_filter_policy);
+   if (err < 0)
+   goto errout;
+
+   if (!vf_rx_filter[IFLA_RX_FILTER_VF]) {
+   err = -EOPNOTSUPP;
+   goto errout;
+   }
+   vf = nla_get_u32(vf_rx_filter[IFLA_RX_FILTER_VF]);
+
+   err = do_set_rx_filter(dev, vf, vf_rx_filter,
+   &modified);
+   if (err < 0)
+   goto errout;
+   }
+   }
+
+   if (tb[IFLA_RX_FILTER]) {
+   struct nlattr *rx_filter[IFLA_RX_FILTER_MAX+1];
+
+   err = nla_parse_nested(rx_filter, IFLA_RX_FILTER_MAX,
+   tb[IFLA_RX_FILTER], ifla_rx_filter_policy);
+   if (err < 0)
+   goto errout;
+
+   err = do_set_rx_filter(dev, SELF_VF, rx_filter

[net-next-2.6 PATCH 4/6 v4] rtnetlink: Add support to get MAC/VLAN filters

2011-11-08 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support in rtnetlink for getting IFLA_VF_RX_FILTERS and
IFLA_RX_FILTER. It gets the size of the filters using
netdev_ops->ndo_get_rx_filter_addr_size and
netdev_ops->ndo_get_rx_filter_vlan_size, and fills them in using
netdev_ops->ndo_get_rx_filter_addr and netdev_ops->ndo_get_rx_filter_vlan.
In the case of IFLA_VF_RX_FILTERS it loops through all VFs to get the filter
data.
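
The get path is a two-pass scheme: first add up the space every attribute will need, then fill the message. The sketch below shows only the size pass, under simplifying assumptions: toy_attr_size is a stand-in for nla_total_size, a nest header is counted as a bare 4-byte header (the patch reserves more per nest), and the per-device addr/vlan payload sizes are passed in as plain numbers instead of coming from driver hooks. The point it tries to show is that every contribution is accumulated into the running total.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SELF_VF (-1)
#define TOY_HDRLEN 4
#define TOY_ALIGN(n) (((size_t)(n) + 3) & ~(size_t)3)

/* Space for one attribute: header plus payload, padded to 4 bytes. */
static size_t toy_attr_size(size_t payload)
{
    return TOY_ALIGN(TOY_HDRLEN + payload);
}

/* Size of one [RX_FILTER] or [VF_RX_FILTER] nest. A zero addr_size or
 * vlan_size means that filter is absent and contributes nothing. */
static size_t toy_vf_rx_filter_size(int vf, size_t addr_size,
                                    size_t vlan_size)
{
    size_t size = toy_attr_size(0);           /* the nest itself */

    if (vf != TOY_SELF_VF)
        size += toy_attr_size(4);             /* u32 VF index */
    if (addr_size)
        size += toy_attr_size(0) + addr_size; /* ADDR nest + contents */
    if (vlan_size)
        size += toy_attr_size(0) + vlan_size; /* VLAN nest + contents */
    return size;
}

/* Total: the device's own filter plus, if it has VFs, one outer nest
 * holding a per-VF filter for each of them. */
static size_t toy_rx_filter_size(int num_vfs, size_t addr_size,
                                 size_t vlan_size)
{
    size_t size = toy_vf_rx_filter_size(TOY_SELF_VF, addr_size, vlan_size);
    int vf;

    assert(num_vfs >= 0);
    if (num_vfs > 0) {
        size += toy_attr_size(0);             /* VF_RX_FILTERS nest */
        for (vf = 0; vf < num_vfs; vf++)
            size += toy_vf_rx_filter_size(vf, addr_size, vlan_size);
    }
    return size;
}
```

With an 8-byte address payload and no vlan filter, the device's own nest comes to 16 bytes and each per-VF nest to 24, so two VFs give 16 + 4 + 48 = 68 bytes in this toy accounting.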

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 net/core/rtnetlink.c |  159 ++
 1 files changed, 158 insertions(+), 1 deletions(-)


diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a042910..ea861b4 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -475,6 +475,62 @@ static size_t rtnl_link_get_af_size(const struct net_device *dev)
return size;
 }
 
+static size_t rtnl_vf_rx_filter_size(const struct net_device *dev, int vf)
+{
+   const struct net_device_ops *ops = dev->netdev_ops;
+   size_t size;
+
+   /* IFLA_RX_FILTER  or IFLA_VF_RX_FILTER */
+   size = nla_total_size(sizeof(struct nlattr));
+
+   if (vf != SELF_VF)
+   size += nla_total_size(4); /* IFLA_RX_FILTER_VF */
+
+   if (ops->ndo_get_rx_filter_addr_size) {
+   size_t rx_filter_addr_size =
+   ops->ndo_get_rx_filter_addr_size(dev, vf);
+
+   if (rx_filter_addr_size)
+   /* IFLA_RX_FILTER_ADDR */
+   size += nla_total_size(sizeof(struct nlattr)) +
+   rx_filter_addr_size;
+   }
+
+   if (ops->ndo_get_rx_filter_vlan_size) {
+   size_t rx_filter_vlan_size =
+   ops->ndo_get_rx_filter_vlan_size(dev, vf);
+
+   if (rx_filter_vlan_size)
+   /* IFLA_RX_FILTER_VLAN */
+   size += nla_total_size(sizeof(struct nlattr)) +
+   rx_filter_vlan_size;
+   }
+
+   return size;
+}
+
+static size_t rtnl_rx_filter_size(const struct net_device *dev)
+{
+   const struct net_device_ops *ops = dev->netdev_ops;
+   int vf = SELF_VF;
+   size_t size;
+
+   if (!ops->ndo_get_rx_filter_addr_size &&
+   !ops->ndo_get_rx_filter_vlan_size)
+   return 0;
+
+   size = rtnl_vf_rx_filter_size(dev, vf); /* SELF_VF */
+
+   if (dev->dev.parent && dev_num_vf(dev->dev.parent)) {
+   /* IFLA_VF_RX_FILTERS */
+   size += nla_total_size(sizeof(struct nlattr));
+   for (vf = 0; vf < dev_num_vf(dev->dev.parent); vf++)
+   size += rtnl_vf_rx_filter_size(dev, vf);
+   }
+
+   return size;
+}
+
 static int rtnl_link_fill(struct sk_buff *skb, const struct net_device *dev)
 {
const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
@@ -513,6 +569,102 @@ out:
return err;
 }
 
+static int rtnl_vf_rx_filter_fill(struct sk_buff *skb,
+ const struct net_device *dev, int vf)
+{
+   const struct net_device_ops *ops = dev->netdev_ops;
+   struct nlattr *addr_filter = NULL, *vlan_filter = NULL;
+   struct nlattr *rx_filter;
+   int err = -EMSGSIZE;
+   int filter_attrtype =
+   (vf == SELF_VF ? IFLA_RX_FILTER : IFLA_VF_RX_FILTER);
+
+   rx_filter = nla_nest_start(skb, filter_attrtype);
+   if (rx_filter == NULL)
+   goto nla_put_failure;
+
+   if (vf != SELF_VF)
+   NLA_PUT_U32(skb, IFLA_RX_FILTER_VF, vf);
+
+   if (ops->ndo_get_rx_filter_addr) {
+   addr_filter = nla_nest_start(skb, IFLA_RX_FILTER_ADDR);
+   if (addr_filter == NULL)
+   goto err_cancel_rx_filter;
+   err = ops->ndo_get_rx_filter_addr(dev, vf, skb);
+   if (err == -ENODATA)
+   nla_nest_cancel(skb, addr_filter);
+   else if (err < 0)
+   goto err_cancel_addr_filter;
+   else
+   nla_nest_end(skb, addr_filter);
+   }
+
+   if (ops->ndo_get_rx_filter_vlan) {
+   vlan_filter = nla_nest_start(skb, IFLA_RX_FILTER_VLAN);
+   if (vlan_filter == NULL)
+   goto err_cancel_addr_filter;
+   err = ops->ndo_get_rx_filter_vlan(dev, vf, skb);
+   if (err == -ENODATA)
+   nla_nest_cancel(skb, vlan_filter);
+   else if (err)
+   goto err_cancel_vlan_filter;
+   else
+   nla_nest_end(skb, vlan_filter);
+   }
+   nla_nest_end(skb, rx_filter);
+
+   return 0;
+
+err_cancel_vlan_filter:
+   if (vlan_filter)
+   nla_nest_cancel(skb, vlan_filter);
+err_cancel_addr_filter:
+   if (addr_filter)
+   nla_nest_cancel(skb, addr_filter

[net-next-2.6 PATCH 5/6 v4] macvlan: Add support to for netdev ops to set MAC/VLAN filters

2011-11-08 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support for MAC and VLAN filter netdev ops
on a macvlan interface. It adds support for the set_rx_filter_addr and
set_rx_filter_vlan netdev operations. It currently supports only macvlan
PASSTHRU mode, and it removes the code that puts the lowerdev in promiscuous
mode.

For passthru mode:
For both address and vlan filter sets, the lowerdev's
netdev_ops->set_rx_filter_addr and netdev_ops->set_rx_filter_vlan
are called if the lowerdev supports these ops.

Otherwise the filter data is parsed and the lowerdev filters are updated:
- Address filters: macvlan netdev uc and mc lists and flags are
updated to reflect the addresses and address filter flags that came
in the filter. This in turn results in calls to macvlan_set_rx_mode and
macvlan_change_rx_flags. These functions pass the filter addresses
and flags to the lowerdev netdev, and the lowerdev driver passes them
to the hw.

- VLAN filter: The currently applied vlan bitmap is cached in
struct macvlan_dev->vlan_filter. This vlan bitmap is updated to
reflect the new bitmap that came in the netlink vlan filter msg.
macvlan_vlan_rx_add_vid and macvlan_vlan_rx_kill_vid are called
to update the vlan ids on the macvlan netdev, which in turn results in
passing the vlan ids to the lowerdev using netdev_ops
ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid.
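
The VLAN filter update described above amounts to diffing the cached bitmap against the requested one, adding the ids that are newly set and killing the ones that were cleared. A small standalone sketch of that diff, assuming a 4096-bit map: the toy_* names are invented for this example, and the counters stand in for the add_vid/kill_vid calls that the macvlan code would make.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define TOY_VLAN_N_VID 4096u
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define TOY_VLAN_WORDS (TOY_VLAN_N_VID / TOY_BITS_PER_LONG)

/* Counts what a real implementation would pass to add_vid/kill_vid. */
struct toy_vlan_stats {
    unsigned int added;
    unsigned int killed;
};

static int toy_test_vid(const unsigned long *map, unsigned int vid)
{
    return (map[vid / TOY_BITS_PER_LONG] >> (vid % TOY_BITS_PER_LONG)) & 1ul;
}

static void toy_set_vid(unsigned long *map, unsigned int vid)
{
    map[vid / TOY_BITS_PER_LONG] |= 1ul << (vid % TOY_BITS_PER_LONG);
}

/* Walk every vlan id, act only on the ones whose state changed,
 * then remember the new bitmap as the applied one. */
static void toy_apply_vlan_filter(unsigned long *cached,
                                  const unsigned long *wanted,
                                  struct toy_vlan_stats *st)
{
    unsigned int vid;

    assert(cached && wanted && st);
    for (vid = 0; vid < TOY_VLAN_N_VID; vid++) {
        int was = toy_test_vid(cached, vid);
        int want = toy_test_vid(wanted, vid);

        if (want && !was)
            st->added++;    /* stands in for the add_vid call */
        else if (!want && was)
            st->killed++;   /* stands in for the kill_vid call */
    }
    memcpy(cached, wanted, TOY_VLAN_WORDS * sizeof(unsigned long));
}
```

Ids present in both bitmaps are left alone, so reapplying the same filter twice makes no add or kill calls the second time.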


Note: In future if most lowerdev drivers find use for these ops and start
supporting them, we could remove the local handling of filters for passthru
mode in macvlan

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 drivers/net/macvlan.c  |  331 
 include/linux/if_macvlan.h |2 
 2 files changed, 300 insertions(+), 33 deletions(-)


diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 7413497..c2dea97 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -309,30 +309,37 @@ static int macvlan_open(struct net_device *dev)
struct net_device *lowerdev = vlan->lowerdev;
int err;
 
-   if (vlan->port->passthru) {
-   dev_set_promiscuity(lowerdev, 1);
-   goto hash_add;
-   }
+   if (!vlan->port->passthru) {
+   err = -EBUSY;
+   if (macvlan_addr_busy(vlan->port, dev->dev_addr))
+   goto out;
 
-   err = -EBUSY;
-   if (macvlan_addr_busy(vlan->port, dev->dev_addr))
-   goto out;
+   err = dev_uc_add(lowerdev, dev->dev_addr);
+   if (err < 0)
+   goto out;
+   }
 
-   err = dev_uc_add(lowerdev, dev->dev_addr);
-   if (err < 0)
-   goto out;
if (dev->flags & IFF_ALLMULTI) {
err = dev_set_allmulti(lowerdev, 1);
if (err < 0)
goto del_unicast;
}
 
-hash_add:
+   if (dev->flags & IFF_PROMISC) {
+   err = dev_set_promiscuity(lowerdev, 1);
+   if (err < 0)
+   goto unset_allmulti;
+   }
+
macvlan_hash_add(vlan);
return 0;
 
+unset_allmulti:
+   dev_set_allmulti(lowerdev, -1);
+
 del_unicast:
-   dev_uc_del(lowerdev, dev->dev_addr);
+   if (!vlan->port->passthru)
+   dev_uc_del(lowerdev, dev->dev_addr);
 out:
return err;
 }
@@ -342,18 +349,16 @@ static int macvlan_stop(struct net_device *dev)
struct macvlan_dev *vlan = netdev_priv(dev);
struct net_device *lowerdev = vlan->lowerdev;
 
-   if (vlan->port->passthru) {
-   dev_set_promiscuity(lowerdev, -1);
-   goto hash_del;
-   }
-
+   dev_uc_unsync(lowerdev, dev);
dev_mc_unsync(lowerdev, dev);
if (dev->flags & IFF_ALLMULTI)
dev_set_allmulti(lowerdev, -1);
+   if (dev->flags & IFF_PROMISC)
+   dev_set_promiscuity(lowerdev, -1);
 
-   dev_uc_del(lowerdev, dev->dev_addr);
+   if (!vlan->port->passthru)
+   dev_uc_del(lowerdev, dev->dev_addr);
 
-hash_del:
macvlan_hash_del(vlan, !dev->dismantle);
return 0;
 }
@@ -394,12 +399,16 @@ static void macvlan_change_rx_flags(struct net_device *dev, int change)
 
if (change & IFF_ALLMULTI)
dev_set_allmulti(lowerdev, dev->flags & IFF_ALLMULTI ? 1 : -1);
+   if (change & IFF_PROMISC)
+   dev_set_promiscuity(lowerdev,
+   dev->flags & IFF_PROMISC ? 1 : -1);
 }
 
-static void macvlan_set_multicast_list(struct net_device *dev)
+static void macvlan_set_rx_mode(struct net_device *dev)
 {
struct macvlan_dev *vlan = netdev_priv(dev);
 
+   dev_uc_sync(vlan->lowerdev, dev);
dev_mc_sync(vlan->lowerdev, dev);
 }
 
@@ -542,6 +551,257 @@ static void macvlan_vlan_rx_kill_vid(struct net_device *dev,
ops

[net-next-2.6 PATCH 6/6 v4] macvlan: Add support to get MAC/VLAN filter netdev ops

2011-11-08 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support for the get MAC and VLAN filter netdev ops
on a macvlan interface. It adds support for the get_rx_filter_addr_size,
get_rx_filter_vlan_size, get_rx_filter_addr and get_rx_filter_vlan
netdev ops.

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 drivers/net/macvlan.c |  158 +
 1 files changed, 158 insertions(+), 0 deletions(-)


diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index c2dea97..8a5320b 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -623,6 +623,55 @@ static int macvlan_set_rx_filter_vlan(struct net_device *dev, int vf,
return 0;
 }
 
+static size_t macvlan_get_rx_filter_vlan_size(const struct net_device *dev,
+ int vf)
+{
+   struct macvlan_dev *vlan = netdev_priv(dev);
+   struct net_device *lowerdev = vlan->lowerdev;
+   const struct net_device_ops *ops = lowerdev->netdev_ops;
+
+   if (vf != SELF_VF)
+   return -EINVAL;
+
+   switch (vlan->mode) {
+   case MACVLAN_MODE_PASSTHRU:
+   if (ops->ndo_get_rx_filter_vlan_size)
+   return ops->ndo_get_rx_filter_vlan_size(dev, vf);
+   /* IFLA_RX_FILTER_VLAN_BITMAP */
+   return nla_total_size(VLAN_BITMAP_SIZE);
+   default:
+   return 0;
+   }
+}
+
+static int macvlan_get_rx_filter_vlan(const struct net_device *dev, int vf,
+ struct sk_buff *skb)
+{
+   struct macvlan_dev *vlan = netdev_priv(dev);
+   struct net_device *lowerdev = vlan->lowerdev;
+   const struct net_device_ops *ops = lowerdev->netdev_ops;
+
+   if (vf != SELF_VF)
+   return -EINVAL;
+
+   switch (vlan->mode) {
+   case MACVLAN_MODE_PASSTHRU:
+   if (ops->ndo_get_rx_filter_vlan)
+   return ops->ndo_get_rx_filter_vlan(dev, vf, skb);
+
+   NLA_PUT(skb, IFLA_RX_FILTER_VLAN_BITMAP, VLAN_BITMAP_SIZE,
+   vlan->vlan_filter);
+   break;
+   default:
+   return -ENODATA; /* No data to Fill */
+   }
+
+   return 0;
+
+nla_put_failure:
+   return -EMSGSIZE;
+}
+
 static int macvlan_addr_in_hw_list(struct netdev_hw_addr_list *list,
   u8 *addr, int addrlen)
 {
@@ -802,6 +851,111 @@ static int macvlan_set_rx_filter_addr(struct net_device *dev, int vf,
return 0;
 }
 
+static size_t macvlan_get_rx_filter_addr_passthru_size(
+   const struct net_device *dev, int vf)
+{
+   struct macvlan_dev *vlan = netdev_priv(dev);
+   struct net_device *lowerdev = vlan->lowerdev;
+   const struct net_device_ops *ops = lowerdev->netdev_ops;
+   size_t size;
+
+   if (ops->ndo_get_rx_filter_addr_size)
+   return ops->ndo_get_rx_filter_addr_size(dev, vf);
+
+   /* IFLA_RX_FILTER_ADDR_FLAGS */
+   size = nla_total_size(sizeof(u32));
+
+   if (netdev_uc_count(dev))
+   /* IFLA_RX_FILTER_ADDR_UC_LIST */
+   size += nla_total_size(netdev_uc_count(dev) *
+  ETH_ALEN * sizeof(struct nlattr));
+
+   if (netdev_mc_count(dev))
+   /* IFLA_RX_FILTER_ADDR_MC_LIST */
+   size += nla_total_size(netdev_mc_count(dev) *
+  ETH_ALEN * sizeof(struct nlattr));
+
+   return size;
+}
+
+static size_t macvlan_get_rx_filter_addr_size(const struct net_device *dev,
+ int vf)
+{
+   struct macvlan_dev *vlan = netdev_priv(dev);
+
+   if (vf != SELF_VF)
+   return -EINVAL;
+
+   switch (vlan->mode) {
+   case MACVLAN_MODE_PASSTHRU:
+   return macvlan_get_rx_filter_addr_passthru_size(dev, vf);
+   default:
+   return 0;
+   }
+}
+
+static int macvlan_get_rx_filter_addr_passthru(const struct net_device *dev,
+  int vf, struct sk_buff *skb)
+{
+   struct macvlan_dev *vlan = netdev_priv(dev);
+   struct net_device *lowerdev = vlan->lowerdev;
+   const struct net_device_ops *ops = lowerdev->netdev_ops;
+   struct nlattr *uninitialized_var(uc_list), *mc_list;
+   struct netdev_hw_addr *ha;
+
+   if (ops->ndo_get_rx_filter_addr)
+   return ops->ndo_get_rx_filter_addr(dev, vf, skb);
+
+   NLA_PUT_U32(skb, IFLA_RX_FILTER_ADDR_FLAGS,
+   dev->flags & RX_FILTER_FLAGS);
+
+   if (netdev_uc_count(dev)) {
+   uc_list = nla_nest_start(skb, IFLA_RX_FILTER_ADDR_UC_LIST);
+   if (uc_list == NULL)
+   goto nla_put_failure;
+
+   netdev_for_each_uc_addr(ha, dev) {
+   NLA_PUT(skb, IFLA_ADDR_LIST_ENTRY, ETH_ALEN, ha->addr

Re: [net-next-2.6 PATCH 0/6 RFC v3] macvlan: MAC Address filtering support for passthru mode

2011-11-01 Thread Roopa Prabhu




On 10/31/11 10:39 AM, Rose, Gregory V gregory.v.r...@intel.com wrote:

 -Original Message-
 From: Roopa Prabhu [mailto:ropra...@cisco.com]
 Sent: Monday, October 31, 2011 10:09 AM
 To: Rose, Gregory V; net...@vger.kernel.org
 Cc: s...@us.ibm.com; dragos.tatu...@gmail.com; kvm@vger.kernel.org;
 a...@arndb.de; m...@redhat.com; da...@davemloft.net; mc...@broadcom.com;
 dwa...@cisco.com; shemmin...@vyatta.com; eric.duma...@gmail.com;
 ka...@trash.net; be...@cisco.com
 Subject: Re: [net-next-2.6 PATCH 0/6 RFC v3] macvlan: MAC Address
 filtering support for passthru mode
 
 
 
 
 On 10/31/11 9:38 AM, Rose, Gregory V gregory.v.r...@intel.com wrote:
 
 -Original Message-
 From: netdev-ow...@vger.kernel.org [mailto:netdev-
 ow...@vger.kernel.org]
 On Behalf Of Roopa Prabhu
 Sent: Friday, October 28, 2011 7:34 PM
 To: net...@vger.kernel.org
 Cc: s...@us.ibm.com; dragos.tatu...@gmail.com; kvm@vger.kernel.org;
 a...@arndb.de; m...@redhat.com; da...@davemloft.net; Rose, Gregory V;
 mc...@broadcom.com; dwa...@cisco.com; shemmin...@vyatta.com;
 eric.duma...@gmail.com; ka...@trash.net; be...@cisco.com
 Subject: [net-next-2.6 PATCH 0/6 RFC v3] macvlan: MAC Address filtering
 support for passthru mode
 
 v2 -> v3
 - Moved set and get filter ops from rtnl_link_ops to netdev_ops
 - Support for SRIOV VFs.
 [Note: The get filters msg might get too big for SRIOV vfs.
 But this patch follows existing sriov vf get code and
 accommodates filters for all VFs in a PF.
 And for the SRIOV case I have only tested the fact that the VF
 arguments are getting delivered to rtnetlink correctly. The rest of
 the code follows existing sriov vf handling code so it should work
 just fine]
 - Fixed all op and netlink attribute names to start with IFLA_RX_FILTER
 - Changed macvlan filter ops to call corresponding lowerdev op if
 lowerdev
   supports it for passthru mode. Else it falls back on macvlan handling
 the
   filters locally as in v1 and v2
 
 v1 -> v2
 - Instead of TUNSETTXFILTER introduced rtnetlink interface for the same
 
 
 [snip...]
 
 
 This patch series implements the following
 01/6 rtnetlink: Netlink interface for setting MAC and VLAN filters
 02/6 netdev: Add netdev_ops to set and get MAC/VLAN rx filters
 03/6 rtnetlink: Add support to set MAC/VLAN filters
 04/6 rtnetlink: Add support to get MAC/VLAN filters
 05/6 macvlan: Add support to set MAC/VLAN filter netdev ops
 06/6 macvlan: Add support to get MAC/VLAN filter netdev ops
 
 Please comment. Thanks.
 
 After some preliminary review this looks pretty good to me in so far as
 adding
 the necessary hooks to do what I need to do.  I appreciate your effort
 on
 this.
 
 I'm sort of a hands-on type of person so I need to apply this patch to a
 private git tree and then take it for a test drive (so to speak).  If I have
 further comments I'll get back to you.
 
 Sounds good.
 
 Did you have any plans for modifying any user space tools such as 'ip' to
 use
 this interface?
 
 
 Yes, I have an iproute2 sample patch for setting and displaying the filters
 which I have been using to test this interface. I can send the patch to you
 after some cleanup if you think it will be useful for you to try out this
 interface.
 
 Thanks Greg.
 
 Yes, please do.
 
 Thanks,
 
 - Greg
 
Greg, here is the patch. I rebased it with tip-of-tree iproute2 git. Thanks.

iproute2: support for MAC/VLAN filter

This patch is not complete. It's a bit hackish right now.
I implemented this patch only to test the kernel interface,
without usability in mind.

Limitations:
- Haven't checked corner cases for SR-IOV VFs
- usage msg needs to be fixed. It's ugly right now
- vf = -1 for direct assignment of filters on a vf or any network
interface
- functions could be broken down, var names changed etc
- show part definitely needs to change. It does not
follow the convention right now
- it has some redundant code which can be removed and simplified

I will work on this patch some more and resubmit it
after the kernel patch gets accepted.

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 304c44f..ffd03e1 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -137,6 +137,8 @@ enum {
 IFLA_AF_SPEC,
 IFLA_GROUP,/* Group the device belongs to */
 IFLA_NET_NS_FD,
+IFLA_VF_RX_FILTERS,
+IFLA_RX_FILTER,
 __IFLA_MAX
 };
 
@@ -264,6 +266,8 @@ enum macvlan_mode {
 
 /* SR-IOV virtual function management section */
 
+#define SELF_VF -1
+
 enum {
 IFLA_VF_INFO_UNSPEC,
 IFLA_VF_INFO,
@@ -378,4 +382,63 @@ struct ifla_port_vsi {
 __u8 pad[3];
 };
 
+/* VF rx filters management section
+ *
+ *Nested layout of set/get msg is:
+ *
+ *[IFLA_VF_RX_FILTERS]
+ *[IFLA_VF_RX_FILTER]
+ *[IFLA_RX_FILTER_*], ...
+ *[IFLA_VF_RX_FILTER]
+ *[IFLA_RX_FILTER_*], ...
+ *...
+ *[IFLA_RX_FILTER

Re: [net-next-2.6 PATCH 0/6 RFC v3] macvlan: MAC Address filtering support for passthru mode

2011-10-31 Thread Roopa Prabhu




On 10/31/11 9:38 AM, Rose, Gregory V gregory.v.r...@intel.com wrote:

 -Original Message-
 From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org]
 On Behalf Of Roopa Prabhu
 Sent: Friday, October 28, 2011 7:34 PM
 To: net...@vger.kernel.org
 Cc: s...@us.ibm.com; dragos.tatu...@gmail.com; kvm@vger.kernel.org;
 a...@arndb.de; m...@redhat.com; da...@davemloft.net; Rose, Gregory V;
 mc...@broadcom.com; dwa...@cisco.com; shemmin...@vyatta.com;
 eric.duma...@gmail.com; ka...@trash.net; be...@cisco.com
 Subject: [net-next-2.6 PATCH 0/6 RFC v3] macvlan: MAC Address filtering
 support for passthru mode
 
 v2 -> v3
 - Moved set and get filter ops from rtnl_link_ops to netdev_ops
 - Support for SRIOV VFs.
 [Note: The get filters msg might get too big for SRIOV vfs.
 But this patch follows existing sriov vf get code and
 accommodates filters for all VFs in a PF.
 And for the SRIOV case I have only tested the fact that the VF
 arguments are getting delivered to rtnetlink correctly. The rest of
 the code follows existing sriov vf handling code so it should work
 just fine]
 - Fixed all op and netlink attribute names to start with IFLA_RX_FILTER
 - Changed macvlan filter ops to call corresponding lowerdev op if lowerdev
   supports it for passthru mode. Else it falls back on macvlan handling
 the
   filters locally as in v1 and v2
 
 v1 -> v2
 - Instead of TUNSETTXFILTER introduced rtnetlink interface for the same
 
 
 [snip...]
 
 
 This patch series implements the following
 01/6 rtnetlink: Netlink interface for setting MAC and VLAN filters
 02/6 netdev: Add netdev_ops to set and get MAC/VLAN rx filters
 03/6 rtnetlink: Add support to set MAC/VLAN filters
 04/6 rtnetlink: Add support to get MAC/VLAN filters
 05/6 macvlan: Add support to set MAC/VLAN filter netdev ops
 06/6 macvlan: Add support to get MAC/VLAN filter netdev ops
 
 Please comment. Thanks.
 
 After some preliminary review this looks pretty good to me in so far as adding
 the necessary hooks to do what I need to do.  I appreciate your effort on
 this.
 
 I'm sort of a hands-on type of person so I need to apply this patch to a
 private git tree and then take it for a test drive (so to speak).  If I have
 further comments I'll get back to you.
 
Sounds good. 

 Did you have any plans for modifying any user space tools such as 'ip' to use
 this interface?
 

Yes, I have an iproute2 sample patch for setting and displaying the filters
which I have been using to test this interface. I can send the patch to you
after some cleanup if you think it will be useful for you to try out this
interface.

Thanks Greg.


Re: [net-next-2.6 PATCH 8/8 RFC v2] macvtap: Add support to get MAC/VLAN filter rtnl link operations

2011-10-28 Thread Roopa Prabhu




On 10/23/11 10:56 PM, Michael S. Tsirkin m...@redhat.com wrote:

 On Tue, Oct 18, 2011 at 11:26:36PM -0700, Roopa Prabhu wrote:
 From: Roopa Prabhu ropra...@cisco.com
 
 This patch adds support to get MAC and VLAN filter rtnl_link_ops
 on a macvtap interface. It adds support for get_rx_addr_filter_size,
 get_rx_vlan_filter_size, fill_rx_addr_filter and fill_rx_vlan_filter
 rtnl link operations. Calls equivalent macvlan operations.
 
 Signed-off-by: Roopa Prabhu ropra...@cisco.com
 Signed-off-by: Christian Benvenuti be...@cisco.com
 Signed-off-by: David Wang dwa...@cisco.com
 ---
  drivers/net/macvtap.c |   27 +++
  1 files changed, 27 insertions(+), 0 deletions(-)
 
 
 diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
 index 8a2cb59..9b40de7 100644
 --- a/drivers/net/macvtap.c
 +++ b/drivers/net/macvtap.c
 @@ -285,6 +285,29 @@ static int macvtap_set_rx_vlan_filter(struct net_device
 *dev,
 return macvlan_set_rx_vlan_filter(dev, tb);
  }
  
 +static int macvtap_fill_rx_addr_filter(struct sk_buff *skb,
 + const struct net_device *dev)
 +{
 + return macvlan_fill_rx_addr_filter(skb, dev);
 +}
 +
 +static int macvtap_fill_rx_vlan_filter(struct sk_buff *skb,
 + const struct net_device *dev)
 +{
 + return macvlan_fill_rx_vlan_filter(skb, dev);
 +}
 +
 +static size_t macvtap_get_rx_addr_filter_size(const struct net_device *dev)
 +{
 + return macvlan_get_rx_addr_filter_size(dev);
 +}
 +
 +static size_t macvtap_get_rx_vlan_filter_size(const struct net_device *dev)
 +{
 + return macvlan_get_rx_vlan_filter_size(dev);
 +}
 
 So why do we need the above wrappers? Can't use macvlanXXX directly?
 

I had followed the existing macvtap rtnl_link_ops convention here.
It seems cleaner this way. You can define the macvtap ops static and
call the equivalent macvlan functions from them if required. It also gives you
the flexibility to add any macvtap-specific handling before or after you call
the macvlan equivalent function (like some of the macvtap rtnl link ops
already do today).

In any case this part and the below empty line error goes away in the new
version.

Thanks,
Roopa


 +
 +
 
 don't add double empty lines pls.
 
  static int macvtap_newlink(struct net *src_net,
   struct net_device *dev,
   struct nlattr *tb[],
 @@ -335,6 +358,10 @@ static struct rtnl_link_ops macvtap_link_ops
 __read_mostly = {
 .dellink   = macvtap_dellink,
 .set_rx_addr_filter  = macvtap_set_rx_addr_filter,
 .set_rx_vlan_filter  = macvtap_set_rx_vlan_filter,
 + .get_rx_addr_filter_size = macvtap_get_rx_addr_filter_size,
 + .get_rx_vlan_filter_size = macvtap_get_rx_vlan_filter_size,
 + .fill_rx_addr_filter  = macvtap_fill_rx_addr_filter,
 + .fill_rx_vlan_filter  = macvtap_fill_rx_vlan_filter,
  };
  
  

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[net-next-2.6 PATCH 0/6 RFC v3] macvlan: MAC Address filtering support for passthru mode

2011-10-28 Thread Roopa Prabhu

, struct sk_buff *skb);
int (*ndo_get_rx_filter_vlan)(
const struct net_device *dev,
int vf, struct sk_buff *skb);

Some answers to questions that were raised during the review:
- Protection against address spoofing:
    - This patch adds filtering support only for macvtap PASSTHRU
      mode. PASSTHRU mode is used mainly with SRIOV VFs, and SRIOV VFs
      come with anti MAC/VLAN spoofing support in the lowerdev driver
      (netdev infrastructure to support this was added recently
      with IFLA_VF_SPOOFCHK). For 802.1Qbh devices, the port profile has a
      knob to enable/disable the anti-spoof check. Lowerdev drivers also
      enforce limits on the number of address registrations allowed.
      For devices that are not SRIOV VFs, it is the responsibility of the
      lowerdev driver to implement any such protection. The current netdev
      hooks for the SRIOV VF spoof check could be extended to accommodate
      any network interface in the future.

- Support for multiqueue devices: Enable filtering on individual queues (?):
As I understand it from the thread between Michael and Greg, the Linux
VMDq implementation is not in yet, and I don't know how it is going to
take shape. Intel VMDq devices do accept filters on a per-queue
basis, but since the netdev infrastructure for VMDq is not in yet, it is
hard to say how this patch can support it.

This patch makes use of the current netdev infrastructure for setting
address and VLAN filters. If that changes for VMDq tomorrow,
the work that this patch represents can be modified to accommodate
VMDq devices at that time.

So I don't see this patch getting in the way of VMDq devices.

- Support for non-PASSTHRU mode:
  I started implementing this, but there are a couple of problems.
    - Today, in non-PASSTHRU cases, macvlan_handle_frame assumes that
      every macvlan device has a single unique MAC, and the macvlans
      are hashed on that single MAC address.
      To support filtering for non-PASSTHRU mode, the following needs to
      be done in addition to this patch:
        - Non-PASSTHRU mode with a single macvlan over a lower dev
          can be treated as the PASSTHRU case.
        - For non-PASSTHRU mode with multiple macvlans over a single
          lower dev:
            - Multiple unicast MACs now need to be hashed to the
              same macvlan device. The macvlan hash needs to change
              to allow lookup based on any one of the multiple unicast
              addresses a macvlan is interested in.
            - We need to consider VLANs during the lookup too.
            - So the macvlan device hash needs to hash on both MAC
              and VLAN.
    - But the support for filtering in non-PASSTHRU mode can be
      built on this patch.
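
The lookup change described for the multiple-macvlan case could look roughly like the user-space sketch below. It is not taken from the macvlan driver: the table layout, the hash function and all demo_* names are assumptions chosen only to show several (MAC, VLAN) pairs resolving to the same device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_HASH_SIZE	256
#define DEMO_ETH_ALEN	6

/* One filter entry: a (MAC, VLAN) pair owned by some macvlan-like device. */
struct demo_entry {
	uint8_t mac[DEMO_ETH_ALEN];
	uint16_t vlan;
	int dev_id;			/* hypothetical device identifier */
	struct demo_entry *next;	/* bucket chain */
};

struct demo_table {
	struct demo_entry *bucket[DEMO_HASH_SIZE];
};

/* Hash on both the MAC address and the VLAN id. */
static unsigned int demo_hash(const uint8_t *mac, uint16_t vlan)
{
	unsigned int h = vlan;
	int i;

	for (i = 0; i < DEMO_ETH_ALEN; i++)
		h = h * 31 + mac[i];
	return h % DEMO_HASH_SIZE;
}

/* A device registers one entry per address it is interested in. */
static void demo_add(struct demo_table *t, struct demo_entry *e)
{
	unsigned int idx;

	assert(t != NULL && e != NULL);
	idx = demo_hash(e->mac, e->vlan);
	e->next = t->bucket[idx];
	t->bucket[idx] = e;
}

/* Returns the owning device id, or -1 if no device wants this frame. */
static int demo_lookup(const struct demo_table *t, const uint8_t *mac,
		       uint16_t vlan)
{
	const struct demo_entry *e;

	for (e = t->bucket[demo_hash(mac, vlan)]; e; e = e->next)
		if (e->vlan == vlan &&
		    memcmp(e->mac, mac, DEMO_ETH_ALEN) == 0)
			return e->dev_id;
	return -1;
}
```

Because each address gets its own entry, one device can own any number of unicast MACs, and the same MAC on a different VLAN can belong to a different device.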

This patch series implements the following 
01/6 rtnetlink: Netlink interface for setting MAC and VLAN filters
02/6 netdev: Add netdev_ops to set and get MAC/VLAN rx filters
03/6 rtnetlink: Add support to set MAC/VLAN filters
04/6 rtnetlink: Add support to get MAC/VLAN filters
05/6 macvlan: Add support to set MAC/VLAN filter netdev ops
06/6 macvlan: Add support to get MAC/VLAN filter netdev ops

Please comment. Thanks.

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com

[net-next-2.6 PATCH 1/6 RFC v3] rtnetlink: Netlink interface for setting MAC and VLAN filters

2011-10-28 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch introduces the following netlink interface to set
MAC and VLAN filters on a network interface. It can be used to
set an RX filter on any network interface (if supported by the driver) and
also on an SRIOV VF via its PF.

Interface to set RX filter on an SRIOV VF:
[IFLA_VF_RX_FILTERS] = {
    [IFLA_VF_RX_FILTER] = {
        [IFLA_RX_FILTER_VF]
        [IFLA_RX_FILTER_ADDR] = {
            [IFLA_RX_FILTER_ADDR_FLAGS]
            [IFLA_RX_FILTER_ADDR_UC_LIST] = {
                [IFLA_ADDR_LIST_ENTRY]
            }
            [IFLA_RX_FILTER_ADDR_MC_LIST] = {
                [IFLA_ADDR_LIST_ENTRY]
            }
        }
        [IFLA_RX_FILTER_VLAN] = {
            [IFLA_RX_FILTER_VLAN_BITMAP]
        }
    }
    ...
}

Interface to set RX filter on any network interface:
[IFLA_RX_FILTER] = {
    [IFLA_RX_FILTER_VF]
    [IFLA_RX_FILTER_ADDR] = {
        [IFLA_RX_FILTER_ADDR_FLAGS]
        [IFLA_RX_FILTER_ADDR_UC_LIST] = {
            [IFLA_ADDR_LIST_ENTRY]
        }
        [IFLA_RX_FILTER_ADDR_MC_LIST] = {
            [IFLA_ADDR_LIST_ENTRY]
        }
    }
    [IFLA_RX_FILTER_VLAN] = {
        [IFLA_RX_FILTER_VLAN_BITMAP]
    }
}
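
To make the nesting concrete, here is a small user-space sketch that serialises one address filter into a flat buffer in the same nested type/length/value style. It does not use the kernel or libnl netlink helpers; the 4-byte header, the 4-byte alignment, the demo_* functions and the DEMO_* attribute numbers are all assumptions that only loosely mirror the netlink attribute format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ALIGN(len)	(((size_t)(len) + 3) & ~(size_t)3)

/* Hypothetical attribute header: total length, then type. */
struct demo_attr {
	uint16_t len;
	uint16_t type;
};

struct demo_msg {
	uint8_t buf[256];
	size_t used;
};

/* Hypothetical attribute numbers, not the IFLA_* values. */
enum {
	DEMO_RX_FILTER = 1,
	DEMO_RX_FILTER_ADDR,
	DEMO_ADDR_FLAGS,
	DEMO_ADDR_UC_LIST,
	DEMO_ADDR_LIST_ENTRY,
};

/* Append one attribute; returns its buffer offset, or -1 if it does not fit. */
static long demo_put(struct demo_msg *m, uint16_t type, const void *data,
		     size_t dlen)
{
	struct demo_attr hdr;
	size_t total = DEMO_ALIGN(sizeof(hdr) + dlen);
	size_t off = m->used;

	if (off + total > sizeof(m->buf))
		return -1;
	hdr.len = (uint16_t)(sizeof(hdr) + dlen);
	hdr.type = type;
	memset(m->buf + off, 0, total);		/* zero the padding too */
	memcpy(m->buf + off, &hdr, sizeof(hdr));
	if (dlen)
		memcpy(m->buf + off + sizeof(hdr), data, dlen);
	m->used += total;
	return (long)off;
}

/* A nest starts as an empty attribute whose length is patched later. */
static long demo_nest_start(struct demo_msg *m, uint16_t type)
{
	return demo_put(m, type, NULL, 0);
}

/* Close a nest: its length covers everything appended since it started. */
static void demo_nest_end(struct demo_msg *m, long start)
{
	struct demo_attr hdr;

	assert(start >= 0);
	memcpy(&hdr, m->buf + start, sizeof(hdr));
	hdr.len = (uint16_t)(m->used - (size_t)start);
	memcpy(m->buf + start, &hdr, sizeof(hdr));
}

/* Build RX_FILTER { ADDR { FLAGS, UC_LIST { ENTRY } } } for one MAC. */
static int demo_build(struct demo_msg *m, uint32_t flags,
		      const uint8_t mac[6])
{
	long rx, addr, uc;

	m->used = 0;
	rx = demo_nest_start(m, DEMO_RX_FILTER);
	if (rx < 0)
		return -1;
	addr = demo_nest_start(m, DEMO_RX_FILTER_ADDR);
	if (addr < 0)
		return -1;
	if (demo_put(m, DEMO_ADDR_FLAGS, &flags, sizeof(flags)) < 0)
		return -1;
	uc = demo_nest_start(m, DEMO_ADDR_UC_LIST);
	if (uc < 0)
		return -1;
	if (demo_put(m, DEMO_ADDR_LIST_ENTRY, mac, 6) < 0)
		return -1;
	demo_nest_end(m, uc);
	demo_nest_end(m, addr);
	demo_nest_end(m, rx);
	return 0;
}
```

Closing the nests in reverse order lets each outer length include its inner attributes, which is what allows a receiver to walk the levels one at a time.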

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 include/linux/if_link.h |   61 +++
 net/core/rtnetlink.c|   20 +++
 2 files changed, 81 insertions(+), 0 deletions(-)


diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index c52d4b5..74a9f17 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -137,6 +137,8 @@ enum {
IFLA_AF_SPEC,
IFLA_GROUP, /* Group the device belongs to */
IFLA_NET_NS_FD,
+   IFLA_VF_RX_FILTERS,
+   IFLA_RX_FILTER,
__IFLA_MAX
 };
 
@@ -390,4 +392,63 @@ struct ifla_port_vsi {
__u8 pad[3];
 };
 
+/* VF rx filters management section
+ *
+ * Nested layout of set/get msg is:
+ *
+ * [IFLA_VF_RX_FILTERS]
+ * [IFLA_VF_RX_FILTER]
+ * [IFLA_RX_FILTER_*], ...
+ * [IFLA_VF_RX_FILTER]
+ * [IFLA_RX_FILTER_*], ...
+ * ...
+ * [IFLA_RX_FILTER]
+ * [IFLA_RX_FILTER_*], ...
+ */
+enum {
+   IFLA_VF_RX_FILTER_UNSPEC,
+   IFLA_VF_RX_FILTER,  /* nest */
+   __IFLA_VF_RX_FILTER_MAX,
+};
+
+#define IFLA_VF_RX_FILTER_MAX (__IFLA_VF_RX_FILTER_MAX - 1)
+
+enum {
+   IFLA_RX_FILTER_UNSPEC,
+   IFLA_RX_FILTER_VF,  /* __u32 */
+   IFLA_RX_FILTER_ADDR,
+   IFLA_RX_FILTER_VLAN,
+   __IFLA_RX_FILTER_MAX,
+};
+#define IFLA_RX_FILTER_MAX (__IFLA_RX_FILTER_MAX - 1)
+
+enum {
+   IFLA_RX_FILTER_ADDR_UNSPEC,
+   IFLA_RX_FILTER_ADDR_FLAGS,
+   IFLA_RX_FILTER_ADDR_UC_LIST,
+   IFLA_RX_FILTER_ADDR_MC_LIST,
+   __IFLA_RX_FILTER_ADDR_MAX,
+};
+#define IFLA_RX_FILTER_ADDR_MAX (__IFLA_RX_FILTER_ADDR_MAX - 1)
+
+#define RX_FILTER_FLAGS (IFF_UP | IFF_BROADCAST | IFF_MULTICAST | \
+   IFF_PROMISC | IFF_ALLMULTI)
+
+enum {
+   IFLA_ADDR_LIST_UNSPEC,
+   IFLA_ADDR_LIST_ENTRY,
+   __IFLA_ADDR_LIST_MAX,
+};
+#define IFLA_ADDR_LIST_MAX (__IFLA_ADDR_LIST_MAX - 1)
+
+enum {
+   IFLA_RX_FILTER_VLAN_UNSPEC,
+   IFLA_RX_FILTER_VLAN_BITMAP,
+   __IFLA_RX_FILTER_VLAN_MAX,
+};
+#define IFLA_RX_FILTER_VLAN_MAX (__IFLA_RX_FILTER_VLAN_MAX - 1)
+
+#define VLAN_BITMAP_SPLIT_MAX 8
+#define VLAN_BITMAP_SIZE   (VLAN_N_VID/VLAN_BITMAP_SPLIT_MAX)
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 9083e82..9eead8e 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -42,6 +42,7 @@
 
 #include <linux/inet.h>
 #include <linux/netdevice.h>
+#include <linux/if_vlan.h>
 #include <net/ip.h>
 #include <net/protocol.h>
 #include <net/arp.h>
@@ -1097,6 +1098,8 @@ const struct nla_policy ifla_policy[IFLA_MAX+1] = {
[IFLA_VF_PORTS] = { .type = NLA_NESTED },
[IFLA_PORT_SELF]= { .type = NLA_NESTED },
[IFLA_AF_SPEC]  = { .type = NLA_NESTED },
+   [IFLA_VF_RX_FILTERS]= { .type = NLA_NESTED },
+   [IFLA_RX_FILTER]= { .type = NLA_NESTED },
 };
 EXPORT_SYMBOL(ifla_policy);
 
@@ -1132,6 +1135,23 @@ static const struct nla_policy 
ifla_port_policy[IFLA_PORT_MAX+1] = {
[IFLA_PORT_RESPONSE]= { .type = NLA_U16, },
 };
 
+static const struct nla_policy ifla_rx_filter_policy[IFLA_RX_FILTER_MAX+1] = {
+   [IFLA_RX_FILTER_VF] = { .type = NLA_U32 },
+   [IFLA_RX_FILTER_ADDR]   = { .type = NLA_NESTED },
+   [IFLA_RX_FILTER_VLAN]   = { .type = NLA_NESTED

[net-next-2.6 PATCH 2/6 RFC v3] net: Add netdev_ops to set and get MAC/VLAN rx filters

2011-10-28 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds the following netdev_ops to set and get MAC/VLAN
filters on an SRIOV VF or any netdev interface. Each op takes a vf argument.
A vf value of SELF_VF (-1) applies the operation directly to the
interface.

ndo_set_rx_filter_addr - to set the address filter
ndo_get_rx_filter_addr_size - to get the address filter size
ndo_get_rx_filter_addr - to get the address filter

ndo_set_rx_filter_vlan - to set the VLAN filter
ndo_get_rx_filter_vlan_size - to get the VLAN filter size
ndo_get_rx_filter_vlan - to get the VLAN filter
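
A rough user-space sketch of how a driver might interpret the vf argument, assuming that -1 selects the interface itself and 0..num_vfs-1 select a VF. The demo_* structures and functions are hypothetical and are not part of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SELF_VF	(-1)	/* operate on the interface itself */
#define DEMO_MAX_VFS	4

struct demo_filter {
	uint32_t flags;
};

/* Hypothetical PF state: one filter for itself, one per VF. */
struct demo_dev {
	int num_vfs;
	struct demo_filter self;
	struct demo_filter vf[DEMO_MAX_VFS];
};

/* Map the vf argument to the filter it addresses, or NULL if invalid. */
static struct demo_filter *demo_filter_for(struct demo_dev *dev, int vf)
{
	assert(dev != NULL);
	if (vf == DEMO_SELF_VF)
		return &dev->self;
	if (vf < 0 || vf >= dev->num_vfs)
		return NULL;
	return &dev->vf[vf];
}

/* Shaped like a "set" op: returns 0 on success or a negative errno. */
static int demo_set_rx_filter_flags(struct demo_dev *dev, int vf,
				    uint32_t flags)
{
	struct demo_filter *f = demo_filter_for(dev, vf);

	if (!f)
		return -EINVAL;
	f->flags = flags;
	return 0;
}
```

Keeping the vf-to-filter mapping in one helper means every set and get op can share the same range check.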

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 include/linux/netdevice.h |   32 
 1 files changed, 32 insertions(+), 0 deletions(-)


diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0db1f5f..94f2bc1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -855,6 +855,20 @@ struct netdev_tc_txq {
  * feature set might be less than what was returned by ndo_fix_features()).
  * Must return 0 or -errno if it changed dev->features itself.
  *
+ * Address Filter management functions:
+ * int (*ndo_set_rx_filter_addr)(struct net_device *dev, int vf,
+ *  struct nlattr *tb[]);
+ * size_t (*ndo_get_rx_filter_addr_size)(const struct net_device *dev, int vf);
+ * int (*ndo_get_rx_filter_addr)(const struct net_device *dev, int vf,
+ *  struct sk_buff *skb);
+ *
+ * Vlan Filter management functions:
+ * int (*ndo_set_rx_filter_vlan)(struct net_device *dev, int vf,
+ *  struct nlattr *tb[]);
+ * size_t (*ndo_get_rx_filter_vlan_size)(const struct net_device *dev, int vf);
+ * int (*ndo_get_rx_filter_vlan)(const struct net_device *dev, int vf,
+ *  struct sk_buff *skb);
+ *
  */
 struct net_device_ops {
int (*ndo_init)(struct net_device *dev);
@@ -948,6 +962,24 @@ struct net_device_ops {
u32 features);
int (*ndo_set_features)(struct net_device *dev,
u32 features);
+   int (*ndo_set_rx_filter_addr)(
+   struct net_device *dev, int vf,
+   struct nlattr *tb[]);
+   size_t  (*ndo_get_rx_filter_addr_size)(
+   const struct net_device *dev,
+   int vf);
+   int (*ndo_get_rx_filter_addr)(
+   const struct net_device *dev,
+   int vf, struct sk_buff *skb);
+   int (*ndo_set_rx_filter_vlan)(
+   struct net_device *dev, int vf,
+   struct nlattr *tb[]);
+   size_t  (*ndo_get_rx_filter_vlan_size)(
+   const struct net_device *dev,
+   int vf);
+   int (*ndo_get_rx_filter_vlan)(
+   const struct net_device *dev,
+   int vf, struct sk_buff *skb);
 };
 
 /*


[net-next-2.6 PATCH 3/6 RFC v3] rtnetlink: Add support to set MAC/VLAN filters

2011-10-28 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support in rtnetlink for setting IFLA_RX_FILTER and
IFLA_VF_RX_FILTERS. It calls netdev_ops->ndo_set_rx_filter_addr and
netdev_ops->ndo_set_rx_filter_vlan.

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 include/linux/if_link.h |2 +
 net/core/rtnetlink.c|  101 +++
 2 files changed, 103 insertions(+), 0 deletions(-)


diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 74a9f17..a8c2c14 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -268,6 +268,8 @@ enum macvlan_mode {
 
 /* SR-IOV virtual function management section */
 
+#define SELF_VF	-1
+
 enum {
IFLA_VF_INFO_UNSPEC,
IFLA_VF_INFO,
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 9eead8e..a042910 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1294,6 +1294,66 @@ static int do_set_master(struct net_device *dev, int 
ifindex)
return 0;
 }
 
+static int do_set_rx_filter(struct net_device *dev, int vf,
+			    struct nlattr *rx_filter[],
+			    int *modified)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	int err = 0;
+
+	if (rx_filter[IFLA_RX_FILTER_ADDR]) {
+		struct nlattr *addr_filter[IFLA_RX_FILTER_ADDR_MAX+1];
+
+		if (!ops->ndo_set_rx_filter_addr) {
+			err = -EOPNOTSUPP;
+			goto errout;
+		}
+
+		err = nla_parse_nested(addr_filter, IFLA_RX_FILTER_ADDR_MAX,
+				       rx_filter[IFLA_RX_FILTER_ADDR],
+				       ifla_addr_filter_policy);
+		if (err < 0)
+			goto errout;
+
+		if (addr_filter[IFLA_RX_FILTER_ADDR_FLAGS]) {
+			unsigned int flags = nla_get_u32(
+				addr_filter[IFLA_RX_FILTER_ADDR_FLAGS]);
+			if (flags & ~RX_FILTER_FLAGS) {
+				err = -EINVAL;
+				goto errout;
+			}
+		}
+
+		err = ops->ndo_set_rx_filter_addr(dev, vf, addr_filter);
+		if (err < 0)
+			goto errout;
+		*modified = 1;
+	}
+
+	if (rx_filter[IFLA_RX_FILTER_VLAN]) {
+		struct nlattr *vlan_filter[IFLA_RX_FILTER_VLAN_MAX+1];
+
+		if (!ops->ndo_set_rx_filter_vlan) {
+			err = -EOPNOTSUPP;
+			goto errout;
+		}
+
+		err = nla_parse_nested(vlan_filter, IFLA_RX_FILTER_VLAN_MAX,
+				       rx_filter[IFLA_RX_FILTER_VLAN],
+				       ifla_vlan_filter_policy);
+		if (err < 0)
+			goto errout;
+
+		err = ops->ndo_set_rx_filter_vlan(dev, vf, vlan_filter);
+		if (err < 0)
+			goto errout;
+		*modified = 1;
+	}
+
+errout:
+	return err;
+}
+
 static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
  struct nlattr **tb, char *ifname, int modified)
 {
@@ -1515,6 +1575,47 @@ static int do_setlink(struct net_device *dev, struct 
ifinfomsg *ifm,
modified = 1;
}
}
+
+	if (tb[IFLA_VF_RX_FILTERS]) {
+		struct nlattr *vf_rx_filter[IFLA_RX_FILTER_MAX+1];
+		struct nlattr *attr;
+		int vf;
+		int rem;
+
+		nla_for_each_nested(attr, tb[IFLA_VF_RX_FILTERS], rem) {
+			if (nla_type(attr) != IFLA_VF_RX_FILTER)
+				continue;
+			err = nla_parse_nested(vf_rx_filter, IFLA_RX_FILTER_MAX,
+					       attr, ifla_rx_filter_policy);
+			if (err < 0)
+				goto errout;
+
+			if (!vf_rx_filter[IFLA_RX_FILTER_VF]) {
+				err = -EOPNOTSUPP;
+				goto errout;
+			}
+			vf = nla_get_u32(vf_rx_filter[IFLA_RX_FILTER_VF]);
+
+			err = do_set_rx_filter(dev, vf, vf_rx_filter,
+					       &modified);
+			if (err < 0)
+				goto errout;
+		}
+	}
+
+	if (tb[IFLA_RX_FILTER]) {
+		struct nlattr *rx_filter[IFLA_RX_FILTER_MAX+1];
+
+		err = nla_parse_nested(rx_filter, IFLA_RX_FILTER_MAX,
+				       tb[IFLA_RX_FILTER], ifla_rx_filter_policy);
+		if (err < 0)
+			goto errout;
+
+   err = do_set_rx_filter(dev, SELF_VF, rx_filter

[net-next-2.6 PATCH 4/6 RFC v3] rtnetlink: Add support to get MAC/VLAN filters

2011-10-28 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support in rtnetlink for getting IFLA_VF_RX_FILTERS and
IFLA_RX_FILTER. It gets the size of the filters using
netdev_ops->ndo_get_rx_filter_addr_size and
netdev_ops->ndo_get_rx_filter_vlan_size, and fills them in using
netdev_ops->ndo_get_rx_filter_addr and netdev_ops->ndo_get_rx_filter_vlan.
In the case of IFLA_VF_RX_FILTERS it loops through all VFs to get the filter
data.
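
The size-then-fill flow can be sketched in user space as below. This is only an illustration under stated assumptions: the demo_* buffer and ops are hypothetical stand-ins for the skb and the netdev ops, and the example driver simply reports n one-byte entries for VF n.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical output buffer standing in for the netlink skb. */
struct demo_buf {
	unsigned char *data;
	size_t cap;
	size_t len;
};

/* Hypothetical per-device ops: report a size, then fill the data. */
struct demo_ops {
	size_t (*get_size)(int vf);
	int (*fill)(int vf, struct demo_buf *b);
};

static int demo_buf_put(struct demo_buf *b, const void *p, size_t n)
{
	if (b->len + n > b->cap)
		return -EMSGSIZE;
	memcpy(b->data + b->len, p, n);
	b->len += n;
	return 0;
}

/* Pass 1 sums the sizes of all VFs; pass 2 fills the sized buffer. */
static int demo_dump(const struct demo_ops *ops, int num_vfs,
		     struct demo_buf *out)
{
	size_t need = 0;
	int vf, err;

	assert(ops != NULL && out != NULL);
	for (vf = 0; vf < num_vfs; vf++)
		need += ops->get_size(vf);

	out->data = malloc(need ? need : 1);
	if (!out->data)
		return -ENOMEM;
	out->cap = need;
	out->len = 0;

	for (vf = 0; vf < num_vfs; vf++) {
		err = ops->fill(vf, out);
		if (err == -ENODATA)
			continue;	/* nothing to report for this VF */
		if (err < 0) {
			free(out->data);
			out->data = NULL;
			return err;
		}
	}
	return 0;
}

/* Example driver: VF n has n one-byte entries, each holding n. */
static size_t demo_get_size(int vf)
{
	return (size_t)vf;
}

static int demo_fill(int vf, struct demo_buf *b)
{
	unsigned char v = (unsigned char)vf;
	int i, err;

	if (vf == 0)
		return -ENODATA;
	for (i = 0; i < vf; i++) {
		err = demo_buf_put(b, &v, 1);
		if (err)
			return err;
	}
	return 0;
}
```

The fill pass can only succeed if the size pass reserved at least as much room as the fill ops later write, so the two ops must agree for every VF.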

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 net/core/rtnetlink.c |  159 ++
 1 files changed, 158 insertions(+), 1 deletions(-)


diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a042910..ea861b4 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -475,6 +475,62 @@ static size_t rtnl_link_get_af_size(const struct 
net_device *dev)
return size;
 }
 
+static size_t rtnl_vf_rx_filter_size(const struct net_device *dev, int vf)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	size_t size;
+
+	/* IFLA_RX_FILTER  or IFLA_VF_RX_FILTER */
+	size = nla_total_size(sizeof(struct nlattr));
+
+	if (vf != SELF_VF)
+		size = nla_total_size(4); /* IFLA_RX_FILTER_VF */
+
+	if (ops->ndo_get_rx_filter_addr_size) {
+		size_t rx_filter_addr_size =
+			ops->ndo_get_rx_filter_addr_size(dev, vf);
+
+		if (rx_filter_addr_size)
+			/* IFLA_RX_FILTER_ADDR */
+			size += nla_total_size(sizeof(struct nlattr)) +
+				rx_filter_addr_size;
+	}
+
+	if (ops->ndo_get_rx_filter_vlan_size) {
+		size_t rx_filter_vlan_size =
+			ops->ndo_get_rx_filter_vlan_size(dev, vf);
+
+		if (rx_filter_vlan_size)
+			/* IFLA_RX_FILTER_VLAN */
+			size += nla_total_size(sizeof(struct nlattr)) +
+				rx_filter_vlan_size;
+	}
+
+	return size;
+}
+
+static size_t rtnl_rx_filter_size(const struct net_device *dev)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	int vf = SELF_VF;
+	size_t size;
+
+	if (!ops->ndo_get_rx_filter_addr_size &&
+	    !ops->ndo_get_rx_filter_vlan_size)
+		return 0;
+
+	size = rtnl_vf_rx_filter_size(dev, vf); /* SELF_VF */
+
+	if (dev->dev.parent && dev_num_vf(dev->dev.parent)) {
+		/* IFLA_VF_RX_FILTERS */
+		size = nla_total_size(sizeof(struct nlattr));
+		for (vf = 0; vf < dev_num_vf(dev->dev.parent); vf++)
+			size += rtnl_vf_rx_filter_size(dev, vf);
+	}
+
+	return size;
+}
+
 static int rtnl_link_fill(struct sk_buff *skb, const struct net_device *dev)
 {
 	const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
@@ -513,6 +569,102 @@ out:
return err;
 }
 
+static int rtnl_vf_rx_filter_fill(struct sk_buff *skb,
+				  const struct net_device *dev, int vf)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	struct nlattr *addr_filter = NULL, *vlan_filter = NULL;
+	struct nlattr *rx_filter;
+	int err = -EMSGSIZE;
+	int filter_attrtype =
+		(vf == SELF_VF ? IFLA_RX_FILTER : IFLA_VF_RX_FILTER);
+
+	rx_filter = nla_nest_start(skb, filter_attrtype);
+	if (rx_filter == NULL)
+		goto nla_put_failure;
+
+	if (vf != SELF_VF)
+		NLA_PUT_U32(skb, IFLA_RX_FILTER_VF, vf);
+
+	if (ops->ndo_get_rx_filter_addr) {
+		addr_filter = nla_nest_start(skb, IFLA_RX_FILTER_ADDR);
+		if (addr_filter == NULL)
+			goto err_cancel_rx_filter;
+		err = ops->ndo_get_rx_filter_addr(dev, vf, skb);
+		if (err == -ENODATA)
+			nla_nest_cancel(skb, addr_filter);
+		else if (err < 0)
+			goto err_cancel_addr_filter;
+		else
+			nla_nest_end(skb, addr_filter);
+	}
+
+	if (ops->ndo_get_rx_filter_vlan) {
+		vlan_filter = nla_nest_start(skb, IFLA_RX_FILTER_VLAN);
+		if (vlan_filter == NULL)
+			goto err_cancel_addr_filter;
+		err = ops->ndo_get_rx_filter_vlan(dev, vf, skb);
+		if (err == -ENODATA)
+			nla_nest_cancel(skb, vlan_filter);
+		else if (err)
+			goto err_cancel_vlan_filter;
+		else
+			nla_nest_end(skb, vlan_filter);
+	}
+	nla_nest_end(skb, rx_filter);
+
+	return 0;
+
+err_cancel_vlan_filter:
+	if (vlan_filter)
+		nla_nest_cancel(skb, vlan_filter);
+err_cancel_addr_filter:
+	if (addr_filter)
+   nla_nest_cancel(skb, addr_filter

[net-next-2.6 PATCH 5/6 RFC v3] macvlan: Add support to for netdev ops to set MAC/VLAN filters

2011-10-28 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support for MAC and VLAN filter netdev ops
on a macvlan interface. It adds support for the ndo_set_rx_filter_addr and
ndo_set_rx_filter_vlan netdev operations. It currently supports only macvlan
PASSTHRU mode, and it removes the code that puts the lowerdev in promiscuous
mode.

For PASSTHRU mode, when address and VLAN filters are set, the lowerdev's
netdev_ops->ndo_set_rx_filter_addr and netdev_ops->ndo_set_rx_filter_vlan
are called if the lowerdev supports these ops.

Otherwise the filter data is parsed and the lowerdev filters are updated:
 - Address filters: the macvlan netdev uc and mc lists and flags are
updated to reflect the addresses and address filter flags that came
in the filter. This in turn results in calls to macvlan_set_rx_mode and
macvlan_change_rx_flags. These functions pass the filter addresses
and flags to the lowerdev netdev, and the lowerdev driver passes them
to the hw.

 - VLAN filter: the currently applied VLAN bitmap is cached in
struct macvlan_dev->vlan_filter. This bitmap is updated to
reflect the new bitmap that came in the netlink VLAN filter msg.
macvlan_vlan_rx_add_vid and macvlan_vlan_rx_kill_vid are called
to update the VLAN ids on the macvlan netdev, which in turn results in
passing the VLAN ids to the lowerdev using the netdev_ops
ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid.
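
The bitmap update in the VLAN case can be pictured with the following user-space sketch: the cached bitmap is compared with the requested one, an add callback fires for each newly set id and a kill callback for each cleared id. The demo_* names, the callback signatures and the 4096-entry bitmap are assumptions for illustration and do not reproduce the macvlan code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_VLAN_N_VID		4096u
#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_VLAN_LONGS		(DEMO_VLAN_N_VID / DEMO_BITS_PER_LONG)

/* One bit per VLAN id. */
struct demo_vlan_filter {
	unsigned long bits[DEMO_VLAN_LONGS];
};

/* Hypothetical stand-ins for the add_vid/kill_vid style callbacks. */
struct demo_vlan_ops {
	void (*add_vid)(void *ctx, unsigned int vid);
	void (*kill_vid)(void *ctx, unsigned int vid);
};

static int demo_test_vid(const struct demo_vlan_filter *f, unsigned int vid)
{
	assert(vid < DEMO_VLAN_N_VID);
	return (int)((f->bits[vid / DEMO_BITS_PER_LONG] >>
		      (vid % DEMO_BITS_PER_LONG)) & 1UL);
}

static void demo_set_vid(struct demo_vlan_filter *f, unsigned int vid)
{
	assert(vid < DEMO_VLAN_N_VID);
	f->bits[vid / DEMO_BITS_PER_LONG] |= 1UL << (vid % DEMO_BITS_PER_LONG);
}

/* Apply the requested bitmap: report the differences, then cache it. */
static void demo_apply_vlan_filter(struct demo_vlan_filter *cur,
				   const struct demo_vlan_filter *want,
				   const struct demo_vlan_ops *ops, void *ctx)
{
	unsigned int vid;

	for (vid = 0; vid < DEMO_VLAN_N_VID; vid++) {
		int was = demo_test_vid(cur, vid);
		int now = demo_test_vid(want, vid);

		if (now && !was)
			ops->add_vid(ctx, vid);
		else if (!now && was)
			ops->kill_vid(ctx, vid);
	}
	*cur = *want;
}

/* Example callbacks that only count how often they were called. */
struct demo_counts {
	int added;
	int killed;
};

static void demo_count_add(void *ctx, unsigned int vid)
{
	(void)vid;
	((struct demo_counts *)ctx)->added++;
}

static void demo_count_kill(void *ctx, unsigned int vid)
{
	(void)vid;
	((struct demo_counts *)ctx)->killed++;
}
```

Only the ids that actually changed trigger a callback, so re-sending an identical bitmap results in no add or kill calls.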


Note: If in the future most lowerdev drivers find a use for these ops and
start supporting them, we could remove the local handling of filters for
PASSTHRU mode in macvlan.

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 drivers/net/macvlan.c  |  331 
 include/linux/if_macvlan.h |2 
 2 files changed, 300 insertions(+), 33 deletions(-)


diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index a3ce3d4..9d8cbe3 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -302,30 +302,37 @@ static int macvlan_open(struct net_device *dev)
 	struct net_device *lowerdev = vlan->lowerdev;
 	int err;
 
-	if (vlan->port->passthru) {
-		dev_set_promiscuity(lowerdev, 1);
-		goto hash_add;
-	}
+	if (!vlan->port->passthru) {
+		err = -EBUSY;
+		if (macvlan_addr_busy(vlan->port, dev->dev_addr))
+			goto out;
 
-	err = -EBUSY;
-	if (macvlan_addr_busy(vlan->port, dev->dev_addr))
-		goto out;
+		err = dev_uc_add(lowerdev, dev->dev_addr);
+		if (err < 0)
+			goto out;
+	}
 
-	err = dev_uc_add(lowerdev, dev->dev_addr);
-	if (err < 0)
-		goto out;
 	if (dev->flags & IFF_ALLMULTI) {
 		err = dev_set_allmulti(lowerdev, 1);
 		if (err < 0)
 			goto del_unicast;
 	}
 
-hash_add:
+	if (dev->flags & IFF_PROMISC) {
+		err = dev_set_promiscuity(lowerdev, 1);
+		if (err < 0)
+			goto unset_allmulti;
+	}
+
 	macvlan_hash_add(vlan);
 	return 0;
 
+unset_allmulti:
+	dev_set_allmulti(lowerdev, -1);
+
 del_unicast:
-	dev_uc_del(lowerdev, dev->dev_addr);
+	if (!vlan->port->passthru)
+		dev_uc_del(lowerdev, dev->dev_addr);
 out:
 	return err;
 }
@@ -335,18 +342,16 @@ static int macvlan_stop(struct net_device *dev)
 	struct macvlan_dev *vlan = netdev_priv(dev);
 	struct net_device *lowerdev = vlan->lowerdev;
 
-	if (vlan->port->passthru) {
-		dev_set_promiscuity(lowerdev, -1);
-		goto hash_del;
-	}
-
+	dev_uc_unsync(lowerdev, dev);
 	dev_mc_unsync(lowerdev, dev);
 	if (dev->flags & IFF_ALLMULTI)
 		dev_set_allmulti(lowerdev, -1);
+	if (dev->flags & IFF_PROMISC)
+		dev_set_promiscuity(lowerdev, -1);
 
-	dev_uc_del(lowerdev, dev->dev_addr);
+	if (!vlan->port->passthru)
+		dev_uc_del(lowerdev, dev->dev_addr);
 
-hash_del:
 	macvlan_hash_del(vlan, !dev->dismantle);
 	return 0;
 }
@@ -387,12 +392,16 @@ static void macvlan_change_rx_flags(struct net_device 
*dev, int change)
 
 	if (change & IFF_ALLMULTI)
 		dev_set_allmulti(lowerdev, dev->flags & IFF_ALLMULTI ? 1 : -1);
+	if (change & IFF_PROMISC)
+		dev_set_promiscuity(lowerdev,
+				    dev->flags & IFF_PROMISC ? 1 : -1);
 }
 
-static void macvlan_set_multicast_list(struct net_device *dev)
+static void macvlan_set_rx_mode(struct net_device *dev)
 {
 	struct macvlan_dev *vlan = netdev_priv(dev);
 
+	dev_uc_sync(vlan->lowerdev, dev);
 	dev_mc_sync(vlan->lowerdev, dev);
 }
 
@@ -535,6 +544,257 @@ static void macvlan_vlan_rx_kill_vid(struct net_device 
*dev,
ops

[net-next-2.6 PATCH 6/6 RFC v3] macvlan: Add support to get MAC/VLAN filter netdev ops

2011-10-28 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support for the netdev ops that get MAC and VLAN filters
on a macvlan interface. It adds support for the ndo_get_rx_filter_addr_size,
ndo_get_rx_filter_vlan_size, ndo_get_rx_filter_addr and ndo_get_rx_filter_vlan
netdev ops.

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 drivers/net/macvlan.c |  158 +
 1 files changed, 158 insertions(+), 0 deletions(-)


diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 9d8cbe3..15dd7de 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -616,6 +616,55 @@ static int macvlan_set_rx_filter_vlan(struct net_device 
*dev, int vf,
return 0;
 }
 
+static size_t macvlan_get_rx_filter_vlan_size(const struct net_device *dev,
+					      int vf)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct net_device *lowerdev = vlan->lowerdev;
+	const struct net_device_ops *ops = lowerdev->netdev_ops;
+
+	if (vf != SELF_VF)
+		return -EINVAL;
+
+	switch (vlan->mode) {
+	case MACVLAN_MODE_PASSTHRU:
+		if (ops->ndo_get_rx_filter_vlan_size)
+			return ops->ndo_get_rx_filter_vlan_size(dev, vf);
+		/* IFLA_RX_FILTER_VLAN_BITMAP */
+		return nla_total_size(VLAN_BITMAP_SIZE);
+	default:
+		return 0;
+	}
+}
+
+static int macvlan_get_rx_filter_vlan(const struct net_device *dev, int vf,
+				      struct sk_buff *skb)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct net_device *lowerdev = vlan->lowerdev;
+	const struct net_device_ops *ops = lowerdev->netdev_ops;
+
+	if (vf != SELF_VF)
+		return -EINVAL;
+
+	switch (vlan->mode) {
+	case MACVLAN_MODE_PASSTHRU:
+		if (ops->ndo_get_rx_filter_vlan)
+			return ops->ndo_get_rx_filter_vlan(dev, vf, skb);
+
+		NLA_PUT(skb, IFLA_RX_FILTER_VLAN_BITMAP, VLAN_BITMAP_SIZE,
+			vlan->vlan_filter);
+		break;
+	default:
+		return -ENODATA; /* No data to Fill */
+	}
+
+	return 0;
+
+nla_put_failure:
+	return -EMSGSIZE;
+}
+
 static int macvlan_addr_in_hw_list(struct netdev_hw_addr_list *list,
   u8 *addr, int addrlen)
 {
@@ -795,6 +844,111 @@ static int macvlan_set_rx_filter_addr(struct net_device 
*dev, int vf,
return 0;
 }
 
+static size_t macvlan_get_rx_filter_addr_passthru_size(
+	const struct net_device *dev, int vf)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct net_device *lowerdev = vlan->lowerdev;
+	const struct net_device_ops *ops = lowerdev->netdev_ops;
+	size_t size;
+
+	if (ops->ndo_get_rx_filter_addr_size)
+		return ops->ndo_get_rx_filter_addr_size(dev, vf);
+
+	/* IFLA_RX_FILTER_ADDR_FLAGS */
+	size = nla_total_size(sizeof(u32));
+
+	if (netdev_uc_count(dev))
+		/* IFLA_RX_FILTER_ADDR_UC_LIST */
+		size += nla_total_size(netdev_uc_count(dev) *
+				       ETH_ALEN * sizeof(struct nlattr));
+
+	if (netdev_mc_count(dev))
+		/* IFLA_RX_FILTER_ADDR_MC_LIST */
+		size += nla_total_size(netdev_mc_count(dev) *
+				       ETH_ALEN * sizeof(struct nlattr));
+
+	return size;
+}
+
+static size_t macvlan_get_rx_filter_addr_size(const struct net_device *dev,
+					      int vf)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+
+	if (vf != SELF_VF)
+		return -EINVAL;
+
+	switch (vlan->mode) {
+	case MACVLAN_MODE_PASSTHRU:
+		return macvlan_get_rx_filter_addr_passthru_size(dev, vf);
+	default:
+		return 0;
+	}
+}
+
+static int macvlan_get_rx_filter_addr_passthru(const struct net_device *dev,
+					       int vf, struct sk_buff *skb)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct net_device *lowerdev = vlan->lowerdev;
+	const struct net_device_ops *ops = lowerdev->netdev_ops;
+	struct nlattr *uninitialized_var(uc_list), *mc_list;
+	struct netdev_hw_addr *ha;
+
+	if (ops->ndo_get_rx_filter_addr)
+		return ops->ndo_get_rx_filter_addr(dev, vf, skb);
+
+	NLA_PUT_U32(skb, IFLA_RX_FILTER_ADDR_FLAGS,
+		    dev->flags & RX_FILTER_FLAGS);
+
+	if (netdev_uc_count(dev)) {
+		uc_list = nla_nest_start(skb, IFLA_RX_FILTER_ADDR_UC_LIST);
+		if (uc_list == NULL)
+			goto nla_put_failure;
+
+		netdev_for_each_uc_addr(ha, dev) {
+			NLA_PUT(skb, IFLA_ADDR_LIST_ENTRY, ETH_ALEN, ha->addr

Re: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering support for passthru mode

2011-10-24 Thread Roopa Prabhu

On 10/23/11 10:47 PM, Michael S. Tsirkin m...@redhat.com wrote:

 On Tue, Oct 18, 2011 at 11:25:54PM -0700, Roopa Prabhu wrote:
 v1 version of this RFC patch was posted at
 http://www.spinics.net/lists/netdev/msg174245.html
 
 Today macvtap used in virtualized environment does not have support to
 propagate MAC, VLAN and interface flags from guest to lowerdev.
 Which means to be able to register additional VLANs, unicast and multicast
 addresses or change pkt filter flags in the guest, the lowerdev has to be
 put in promiscuous mode. Today the only macvlan mode that supports this is
 the PASSTHRU mode and it puts the lower dev in promiscuous mode.
 
 PASSTHRU mode was added primarily for the SRIOV usecase. In PASSTHRU mode
 there is a 1-1 mapping between macvtap and physical NIC or VF.
 
 There are two problems with putting the lowerdev in promiscuous mode (ie SRIOV
 VF's):
 - Some SRIOV cards don't support promiscuous mode today (Thread on Intel
 driver indicates that http://lists.openwall.net/netdev/2011/09/27/6)
 - For the SRIOV NICs that support it, putting the lowerdev in
 promiscuous mode leads to additional traffic being sent up to the
 guest virtio-net to filter, resulting in extra overhead.
 
 Both the above problems can be solved by offloading filtering to the
 lowerdev hw, i.e. the lowerdev does not need to be in promiscuous mode as
 long as the guest filters are passed down to the lowerdev.
 
 This patch basically adds the infrastructure to set and get MAC and VLAN
 filters on an interface via rtnetlink. And adds support in macvlan and
 macvtap
 to allow set and get filter operations.
 
 Looks sane to me. Some minor comments below.
 
 Earlier version of this patch provided the TUNSETTXFILTER macvtap interface
 for setting address filtering. In response to feedback, This version
 introduces a netlink interface for the same.
 
 Response to some of the questions raised during v1:
 
 - Netlink interface:
 This patch provides the following netlink interface to set mac and vlan
 filters :
 [IFLA_RX_FILTER] = {
 [IFLA_ADDR_FILTER] = {
 [IFLA_ADDR_FILTER_FLAGS]
 [IFLA_ADDR_FILTER_UC_LIST] = {
 [IFLA_ADDR_LIST_ENTRY]
 }
 [IFLA_ADDR_FILTER_MC_LIST] = {
 [IFLA_ADDR_LIST_ENTRY]
 }
 }
 [IFLA_VLAN_FILTER] = {
 [IFLA_VLAN_BITMAP]
 }
 }
 
 Note: The IFLA_VLAN_FILTER is a nested attribute and contains only
 IFLA_VLAN_BITMAP today. The idea is that the IFLA_VLAN_FILTER can
 be extended tomorrow to use a vlan list option if some implementations
 prefer a list instead.
 
 And it provides the following rtnl_link_ops to set/get MAC/VLAN filters:
 
int (*set_rx_addr_filter)(struct net_device *dev,
struct nlattr *tb[]);
int (*set_rx_vlan_filter)(struct net_device *dev,
 struct nlattr *tb[]);
size_t  (*get_rx_addr_filter_size)(const struct
 net_device *dev);
size_t  (*get_rx_vlan_filter_size)(const struct
 net_device *dev);
int (*fill_rx_addr_filter)(struct sk_buff *skb,
 const struct net_device
 *dev);
int (*fill_rx_vlan_filter)(struct sk_buff *skb,
 const struct net_device
 *dev);
 
 
 Note: The choice of rtnl_link_ops was because I saw the use case for
 this in virtual devices that need  to do filtering in sw like macvlan
  and tun. Hw devices usually have filtering in hw with netdev->uc and
 mc lists to indicate active filters. But I can move from rtnl_link_ops
 to netdev_ops if that is the preferred way to go and if there is a
 need to support this interface on all kinds of interfaces.
 Please suggest.
 
 - Protection against address spoofing:
 - This patch adds filtering support only for macvtap PASSTHRU
 Mode. PASSTHRU mode is used mainly with SRIOV VF's. And SRIOV VF's
 come with anti mac/vlan spoofing support. (Recently added
 IFLA_VF_SPOOFCHK). In 802.1Qbh case the port profile has a knob to
 enable/disable anti spoof check. Lowerdevice drivers also enforce limits
 on the number of address registrations allowed.
 
 - Support for multiqueue devices: Enable filtering on individual queues (?):
 AFAIK, there is no netdev interface to install per queue hw
 filters for a multi queue interface. And also I dont know of any hw
 that provides an interface to set hw filters on a per queue basis.
 
 VMDq hardware would support this, no?
 
I am not really sure. This patch uses netdev to pass filters to hw, and I
don't see any netdev infrastructure that would support per-queue filters.
Maybe Greg (CC'ed) or anyone else from Intel can answer this.
Greg, Michael had brought up this question during the first version of these
patches as well. It would be nice to get the VMDq requirements for propagating
guest filters to hw clarified. Do you see any special VMDq NIC requirement
we can cover in this patch

Re: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering support for passthru mode

2011-10-20 Thread Roopa Prabhu

On 10/20/11 1:43 PM, Rose, Gregory V gregory.v.r...@intel.com wrote:

 -Original Message-
 From: Roopa Prabhu [mailto:ropra...@cisco.com]
 Sent: Wednesday, October 19, 2011 3:30 PM
 To: Rose, Gregory V; net...@vger.kernel.org
 Cc: s...@us.ibm.com; dragos.tatu...@gmail.com; a...@arndb.de;
 kvm@vger.kernel.org; m...@redhat.com; da...@davemloft.net;
 mc...@broadcom.com; dwa...@cisco.com; shemmin...@vyatta.com;
 eric.duma...@gmail.com; ka...@trash.net; be...@cisco.com
 Subject: Re: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address
 filtering support for passthru mode

 On 10/19/11 2:06 PM, Rose, Gregory V gregory.v.r...@intel.com wrote:

 -Original Message-
 From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org]
 On Behalf Of Roopa Prabhu
 Sent: Tuesday, October 18, 2011 11:26 PM
 To: net...@vger.kernel.org
 Cc: s...@us.ibm.com; dragos.tatu...@gmail.com; a...@arndb.de;
 kvm@vger.kernel.org; m...@redhat.com; da...@davemloft.net;
 mc...@broadcom.com; dwa...@cisco.com; shemmin...@vyatta.com;
 eric.duma...@gmail.com; ka...@trash.net; be...@cisco.com
 Subject: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering
 support for passthru mode

 [snip...]

 Note: The choice of rtnl_link_ops was because I saw the use case for
 this in virtual devices that need to do filtering in sw like macvlan
 and tun. Hw devices usually have filtering in hw with netdev->uc and
 mc lists to indicate active filters. But I can move from rtnl_link_ops
 to netdev_ops if that is the preferred way to go and if there is a
 need to support this interface on all kinds of interfaces.
 Please suggest.

 I'm still digesting the rest of the RFC patches but I did want to quickly
 jump in and push for adding this support in netdev_ops.  I would like to see
 these features available in more devices than just macvtap and macvlan.  I
 can conceive of use cases for multiple HW MAC and VLAN filters for a VF
 device that isn't owned by a macvlan/macvtap interface and only has
 netdev_ops support.  In this case it would be necessary to program the
 filters directly to the VF device interface or PF interface (or lowerdev as
 you refer to it) instead of going through macvlan/macvtap.

 This work dovetails nicely with some work I've been doing and I'd be very
 interested in helping move this forward if we could work out the details
 that would allow support of the features we (and the community) require.

 Great. Thanks. I will definitely be interested to get this patch working
 for any other use case you have.

 Moving the ops to netdev should be trivial. You probably want the ops to
 work on the VF via the PF, like the existing ndo_set_vf_mac etc.

 That is correct, so we would need to add some way to pass the VF number to
 the op. In addition, there are use cases for multiple MAC address filters
 for the Physical Function (PF) so we would like to be able to identify to
 the netdev op that it is supposed to perform the action on the PF filters
 instead of a VF.

 An example of this would be when an administrator has created some number
 of VFs for a given PF but is also running the PF in bridged (i.e.
 promiscuous) mode so that it can support purely SW emulated network
 connections in some VMs that have low network latency and bandwidth
 requirements while reserving the VFs for VMs that require the low latency,
 high throughput that directly assigned VFs can provide.  In this case an
 emulated SW interface in a VM is unable to properly communicate with VFs on
 the same PF because the emulated SW interface's MAC address isn't programmed
 into the HW filters on the PF.  If we could use this op to program the MAC
 address and VLAN filters of the emulated SW interfaces into the PF HW a VF
 could then properly communicate across the NIC's internal VEB to the
 emulated SW interfaces.

 Yes, let's work out the details and I can move this to netdev-ops. Let me
 know.

 I think essentially if you could add some parameter to the ops to specify
 whether it is addressing a VF or the PF and then if it is a VF further
 specify the VF number we would be very close to addressing the requirements
 of many valuable use cases in addition to the ones you have identified in
 your RFC.

 Does that sound reasonable?

Thanks for the details Greg. Sounds good. I will change it to provide netdev
ops with a vf argument and respin.
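
To make the VF-vs-PF addressing concrete, here is a minimal user-space
sketch of how such an op might select its target. Everything in it is
hypothetical and only for illustration: the RX_FILTER_PF sentinel, the
sketch_dev and filter_table types, the table sizes and the function names
are assumptions, not the interface that will be in the respin.

```c
#include <errno.h>
#include <string.h>

#define ETH_ALEN     6
#define MAX_VFS      4     /* hypothetical limit for this sketch */
#define MAX_FILTERS  8     /* hypothetical per-function filter table size */
#define RX_FILTER_PF (-1)  /* hypothetical sentinel: act on the PF itself */

struct filter_table {
	unsigned char mac[MAX_FILTERS][ETH_ALEN];
	int count;
};

struct sketch_dev {
	int num_vfs;                    /* VFs currently enabled (<= MAX_VFS) */
	struct filter_table pf;         /* filters owned by the PF */
	struct filter_table vf[MAX_VFS]; /* filters owned by each VF */
};

/* Pick the table the op should act on, or NULL if vf is out of range. */
static struct filter_table *rx_filter_target(struct sketch_dev *dev, int vf)
{
	if (vf == RX_FILTER_PF)
		return &dev->pf;
	if (vf < 0 || vf >= dev->num_vfs)
		return NULL;
	return &dev->vf[vf];
}

/* Hypothetical op: add one unicast MAC filter to the PF or to a given VF. */
static int sketch_set_rx_addr_filter(struct sketch_dev *dev, int vf,
				     const unsigned char *mac)
{
	struct filter_table *t = rx_filter_target(dev, vf);
	int i;

	if (!t)
		return -EINVAL;
	for (i = 0; i < t->count; i++)
		if (!memcmp(t->mac[i], mac, ETH_ALEN))
			return -EEXIST;
	if (t->count == MAX_FILTERS)
		return -ENOSPC;
	memcpy(t->mac[t->count++], mac, ETH_ALEN);
	return 0;
}
```

A single integer argument with a reserved value keeps the op signature
small; a separate flag plus VF index would work equally well.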

Thanks,
Roopa

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering support for passthru mode

2011-10-19 Thread Roopa Prabhu

 and multicast hw filters.
So I don't see a huge problem with this patch getting in the way of
multi-queue devices.

- Support for non-PASSTHRU mode:
  I started implementing this, but there are a couple of problems.
  - The lowerdev may not be an SR-IOV VF and may not have
    anti-spoof capability.
  - Today, in non-PASSTHRU cases macvlan_handle_frame assumes that
    every macvlan device on top of the lowerdev has a single unique MAC,
    and the macvlans are hashed on that single MAC address.
  To support filtering for non-PASSTHRU mode, the following needs to be
  done in addition to this patch:
  - Non-PASSTHRU mode with a single macvlan over a lowerdev
    can be treated as the PASSTHRU case.
  - For non-PASSTHRU mode with multiple macvlans over a single
    lowerdev:
    - Multiple unicast MACs now need to be hashed to the
      same macvlan device. The macvlan hash needs to change
      to support lookup based on any one of the multiple unicast
      addresses a macvlan is interested in.
    - We need to consider VLANs during the lookup too,
      so the macvlan device hash needs to hash on both MAC
      and VLAN.
  - But the support for filtering in non-PASSTHRU mode can be
    built on this patch.
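
As a rough illustration of the hash change described for the multi-macvlan
non-PASSTHRU case, here is a small user-space sketch of a table keyed on
(MAC, VLAN) where several keys can map to the same macvlan. It is not
taken from the macvlan driver: the filter_hash type, the bucket count, the
fixed entry pool and the integer macvlan_id standing in for a device
pointer are all assumptions made for the example.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN    6
#define HASH_SIZE   256  /* hypothetical bucket count */
#define MAX_ENTRIES 64   /* hypothetical fixed pool for this sketch */

struct filter_entry {
	unsigned char mac[ETH_ALEN];
	uint16_t vid;              /* VLAN id, 0 for untagged */
	int macvlan_id;            /* stands in for a macvlan device pointer */
	struct filter_entry *next; /* bucket chain */
};

/* Must start zeroed and must not be copied (buckets point into pool). */
struct filter_hash {
	struct filter_entry *bucket[HASH_SIZE];
	struct filter_entry pool[MAX_ENTRIES];
	int used;
};

/* Hash over both the VLAN id and the MAC bytes. */
static unsigned int filter_hashfn(const unsigned char *mac, uint16_t vid)
{
	unsigned int h = vid;
	int i;

	for (i = 0; i < ETH_ALEN; i++)
		h = h * 31 + mac[i];
	return h % HASH_SIZE;
}

/* Register one (MAC, VLAN) key for a macvlan; many keys may share an id. */
static int filter_hash_add(struct filter_hash *fh, const unsigned char *mac,
			   uint16_t vid, int macvlan_id)
{
	struct filter_entry *e;
	unsigned int idx;

	if (fh->used == MAX_ENTRIES)
		return -1;
	e = &fh->pool[fh->used++];
	memcpy(e->mac, mac, ETH_ALEN);
	e->vid = vid;
	e->macvlan_id = macvlan_id;
	idx = filter_hashfn(mac, vid);
	e->next = fh->bucket[idx];
	fh->bucket[idx] = e;
	return 0;
}

/* Return the macvlan id owning (MAC, VLAN), or -1 if nobody wants it. */
static int filter_hash_lookup(const struct filter_hash *fh,
			      const unsigned char *mac, uint16_t vid)
{
	const struct filter_entry *e;

	for (e = fh->bucket[filter_hashfn(mac, vid)]; e; e = e->next)
		if (e->vid == vid && !memcmp(e->mac, mac, ETH_ALEN))
			return e->macvlan_id;
	return -1;
}
```

With this shape, a receive path would look up the destination MAC and the
frame's VLAN id together instead of the MAC alone.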

This patch series implements the following:
01/8 rtnetlink: Netlink interface for setting MAC and VLAN filters
02/8 rtnetlink: Add rtnl link operations for MAC address and VLAN filtering
03/8 rtnetlink: Add support to set MAC/VLAN filters
04/8 rtnetlink: Add support to get MAC/VLAN filters
05/8 macvlan: Add support to set MAC/VLAN filter rtnl link operations
06/8 macvlan: Add support to get MAC/VLAN filter rtnl link operations
07/8 macvtap: Add support to set MAC/VLAN filter rtnl link operations
08/8 macvtap: Add support to get MAC/VLAN filter rtnl link operations

Please comment. Thanks.

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com

[net-next-2.6 PATCH 1/8 RFC v2] rtnetlink: Netlink interface for setting MAC and VLAN filters

2011-10-19 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch introduces the following netlink interface to set
MAC and VLAN filters on a network interface:

[IFLA_RX_FILTER] = {
    [IFLA_RX_ADDR_FILTER] = {
        [IFLA_ADDR_FILTER_FLAGS]
        [IFLA_ADDR_FILTER_UC_LIST] = {
            [IFLA_ADDR_LIST_ENTRY]
        }
        [IFLA_ADDR_FILTER_MC_LIST] = {
            [IFLA_ADDR_LIST_ENTRY]
        }
    }
    [IFLA_RX_VLAN_FILTER] = {
        [IFLA_VLAN_BITMAP]
    }
}
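
For readers less familiar with nested netlink attributes, the following
user-space sketch shows how such a nested message is laid out in memory:
each attribute is a 4-byte header (length, type) followed by the payload,
padded to 4 bytes, and a nest is simply an attribute whose payload is more
attributes. This is a hand-rolled illustration, not libnl or kernel code.
The sk_* names and the numeric value used for the outer SK_RX_FILTER type
are assumptions; the inner type numbers follow the enums added to
if_link.h in this patch.

```c
#include <stdint.h>
#include <string.h>

#define SK_NLA_HDRLEN     4
#define SK_NLA_ALIGN(len) (((len) + 3) & ~3)

/* Hypothetical type numbers; only the nesting matters for the sketch. */
enum {
	SK_RX_FILTER           = 1, /* placeholder for IFLA_RX_FILTER */
	SK_RX_ADDR_FILTER      = 1,
	SK_ADDR_FILTER_FLAGS   = 1,
	SK_ADDR_FILTER_UC_LIST = 2,
	SK_ADDR_LIST_ENTRY     = 1,
};

struct sk_buf {
	unsigned char data[256];
	size_t len;
};

/* Append one attribute: header (len, type), payload, zero padding. */
static int sk_put(struct sk_buf *b, uint16_t type, const void *payload,
		  uint16_t plen)
{
	uint16_t alen = (uint16_t)(SK_NLA_HDRLEN + plen); /* excludes padding */
	size_t total = (size_t)SK_NLA_ALIGN(alen);

	if (b->len + total > sizeof(b->data))
		return -1;
	memset(b->data + b->len, 0, total);
	memcpy(b->data + b->len, &alen, 2);
	memcpy(b->data + b->len + 2, &type, 2);
	if (plen)
		memcpy(b->data + b->len + SK_NLA_HDRLEN, payload, plen);
	b->len += total;
	return 0;
}

/* Open a nest: emit an empty header now, fix its length in sk_nest_end(). */
static long sk_nest_start(struct sk_buf *b, uint16_t type)
{
	size_t start = b->len;

	if (sk_put(b, type, NULL, 0) < 0)
		return -1;
	return (long)start;
}

/* Close a nest: its length covers everything appended since the start. */
static void sk_nest_end(struct sk_buf *b, long start)
{
	uint16_t alen = (uint16_t)(b->len - (size_t)start);

	memcpy(b->data + start, &alen, 2);
}

/* Build RX_FILTER { ADDR_FILTER { FLAGS, UC_LIST { ENTRY } } }. */
static int sk_build_rx_filter(struct sk_buf *b, uint32_t flags,
			      const unsigned char *mac)
{
	long rx, af, uc;

	if ((rx = sk_nest_start(b, SK_RX_FILTER)) < 0)
		return -1;
	if ((af = sk_nest_start(b, SK_RX_ADDR_FILTER)) < 0)
		return -1;
	if (sk_put(b, SK_ADDR_FILTER_FLAGS, &flags, sizeof(flags)) < 0)
		return -1;
	if ((uc = sk_nest_start(b, SK_ADDR_FILTER_UC_LIST)) < 0)
		return -1;
	if (sk_put(b, SK_ADDR_LIST_ENTRY, mac, 6) < 0)
		return -1;
	sk_nest_end(b, uc);
	sk_nest_end(b, af);
	sk_nest_end(b, rx);
	return 0;
}
```

With one unicast address, the message is 32 bytes: three 4-byte nest
headers, 8 bytes for the flags attribute and 12 bytes for the padded
6-byte address entry.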

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 include/linux/if_link.h |   39 +++
 net/core/rtnetlink.c|   18 ++
 2 files changed, 57 insertions(+), 0 deletions(-)


diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index c52d4b5..41dbcbe 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -137,6 +137,7 @@ enum {
IFLA_AF_SPEC,
IFLA_GROUP, /* Group the device belongs to */
IFLA_NET_NS_FD,
+   IFLA_RX_FILTER,
__IFLA_MAX
 };
 
@@ -390,4 +391,42 @@ struct ifla_port_vsi {
__u8 pad[3];
 };
 
+/* Addr filters */
+enum {
+   IFLA_RX_FILTER_UNSPEC,
+   IFLA_RX_ADDR_FILTER,
+   IFLA_RX_VLAN_FILTER,
+   __IFLA_RX_FILTER_MAX,
+};
+#define IFLA_RX_FILTER_MAX (__IFLA_RX_FILTER_MAX - 1)
+
+enum {
+   IFLA_ADDR_FILTER_UNSPEC,
+   IFLA_ADDR_FILTER_FLAGS,
+   IFLA_ADDR_FILTER_UC_LIST,
+   IFLA_ADDR_FILTER_MC_LIST,
+   __IFLA_ADDR_FILTER_MAX,
+};
+#define IFLA_ADDR_FILTER_MAX (__IFLA_ADDR_FILTER_MAX - 1)
+
+#define RX_FILTER_FLAGS (IFF_UP | IFF_BROADCAST | IFF_MULTICAST | \
+   IFF_PROMISC | IFF_ALLMULTI)
+
+enum {
+   IFLA_ADDR_LIST_UNSPEC,
+   IFLA_ADDR_LIST_ENTRY,
+   __IFLA_ADDR_LIST_MAX,
+};
+#define IFLA_ADDR_LIST_MAX (__IFLA_ADDR_LIST_MAX - 1)
+
+enum {
+   IFLA_VLAN_FILTER_UNSPEC,
+   IFLA_VLAN_BITMAP,
+   __IFLA_VLAN_FILTER_MAX,
+};
+#define IFLA_VLAN_FILTER_MAX (__IFLA_VLAN_FILTER_MAX - 1)
+
+#define VLAN_BITMAP_SPLIT_MAX 8
+#define VLAN_BITMAP_SIZE   (VLAN_N_VID/VLAN_BITMAP_SPLIT_MAX)
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 9083e82..a3b213f 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -42,6 +42,7 @@
 
 #include <linux/inet.h>
 #include <linux/netdevice.h>
+#include <linux/if_vlan.h>
 #include <net/ip.h>
 #include <net/protocol.h>
 #include <net/arp.h>
@@ -1097,9 +1098,26 @@ const struct nla_policy ifla_policy[IFLA_MAX+1] = {
[IFLA_VF_PORTS] = { .type = NLA_NESTED },
[IFLA_PORT_SELF]= { .type = NLA_NESTED },
[IFLA_AF_SPEC]  = { .type = NLA_NESTED },
+   [IFLA_RX_FILTER]= { .type = NLA_NESTED },
 };
 EXPORT_SYMBOL(ifla_policy);
 
+static const struct nla_policy ifla_rx_filter_policy[IFLA_RX_FILTER_MAX+1] = {
+   [IFLA_RX_ADDR_FILTER]   = { .type = NLA_NESTED },
+   [IFLA_RX_VLAN_FILTER]= { .type = NLA_NESTED },
+};
+
+static const struct nla_policy ifla_addr_filter_policy[IFLA_ADDR_FILTER_MAX+1] = {
+   [IFLA_ADDR_FILTER_FLAGS] = { .type = NLA_U32 },
+   [IFLA_ADDR_FILTER_UC_LIST] = { .type = NLA_NESTED },
+   [IFLA_ADDR_FILTER_MC_LIST] = { .type = NLA_NESTED },
+};
+
+static const struct nla_policy ifla_vlan_filter_policy[IFLA_VLAN_FILTER_MAX+1] = {
+   [IFLA_VLAN_BITMAP]   = { .type = NLA_BINARY,
+.len = VLAN_BITMAP_SIZE },
+};
+
 static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
[IFLA_INFO_KIND]= { .type = NLA_STRING },
[IFLA_INFO_DATA]= { .type = NLA_NESTED },


[net-next-2.6 PATCH 2/8 RFC v2] rtnetlink: Add rtnl link operations for MAC address and VLAN filtering

2011-10-19 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds the following rtnl_link_ops to set and get MAC and VLAN
filters:

set_rx_addr_filter - To set address filter
set_rx_vlan_filter - To set vlan filter
get_rx_addr_filter_size - To get address filter size
get_rx_vlan_filter_size - To get vlan filter size
fill_rx_addr_filter - To fill addr filter
fill_rx_vlan_filter - To fill vlan filter

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 include/net/rtnetlink.h |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)


diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index 678f1ff..dcb26bd 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -78,6 +78,19 @@ struct rtnl_link_ops {
int (*get_tx_queues)(struct net *net, struct nlattr *tb[],
 unsigned int *tx_queues,
 unsigned int *real_tx_queues);
+
+   int (*set_rx_addr_filter)(struct net_device *dev,
+ struct nlattr *tb[]);
+   int (*set_rx_vlan_filter)(struct net_device *dev,
+ struct nlattr *tb[]);
+   size_t  (*get_rx_addr_filter_size)(
+   const struct net_device *dev);
+   size_t  (*get_rx_vlan_filter_size)(
+   const struct net_device *dev);
+   int (*fill_rx_addr_filter)(struct sk_buff *skb,
+   const struct net_device *dev);
+   int (*fill_rx_vlan_filter)(struct sk_buff *skb,
+   const struct net_device *dev);
 };
 
 extern int __rtnl_link_register(struct rtnl_link_ops *ops);


[net-next-2.6 PATCH 3/8 RFC v2] rtnetlink: Add support to set MAC/VLAN filters

2011-10-19 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support in rtnetlink for IFLA_RX_FILTER set.
It adds code in do_setlink to parse IFLA_RX_FILTER and call
the rtnl_link_ops->set_rx_addr_filter and
rtnl_link_ops->set_rx_vlan_filter to set MAC and VLAN filters.
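
As a small illustration of the flag check that the do_setlink hunk below
applies to IFLA_ADDR_FILTER_FLAGS, here is a stand-alone sketch of
mask-based validation. The SK_IFF_* constants mirror the IFF_* values from
linux/if.h but are redefined locally so the example compiles on its own;
the sk_* names are made up for the sketch.

```c
#include <errno.h>

/* Local copies of the IFF_* bit values, so the sketch stands alone. */
#define SK_IFF_UP        0x1
#define SK_IFF_BROADCAST 0x2
#define SK_IFF_PROMISC   0x100
#define SK_IFF_ALLMULTI  0x200
#define SK_IFF_MULTICAST 0x1000

/* Same set of bits as RX_FILTER_FLAGS in the if_link.h patch. */
#define SK_RX_FILTER_FLAGS (SK_IFF_UP | SK_IFF_BROADCAST | \
			    SK_IFF_MULTICAST | SK_IFF_PROMISC | \
			    SK_IFF_ALLMULTI)

/* Reject any request that carries a bit outside the allowed mask. */
static int sk_check_rx_filter_flags(unsigned int flags)
{
	return (flags & ~SK_RX_FILTER_FLAGS) ? -EINVAL : 0;
}
```

For example, SK_IFF_PROMISC | SK_IFF_ALLMULTI passes, while a value with
bit 0x4 set is rejected with -EINVAL.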

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 net/core/rtnetlink.c |   67 ++
 1 files changed, 67 insertions(+), 0 deletions(-)


diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a3b213f..bc1074d 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1296,6 +1296,7 @@ static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
  struct nlattr **tb, char *ifname, int modified)
 {
const struct net_device_ops *ops = dev->netdev_ops;
+   const struct rtnl_link_ops *rtnl_ops;
int send_addr_notify = 0;
int err;
 
@@ -1513,6 +1514,72 @@ static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
modified = 1;
}
}
+
+   if (tb[IFLA_RX_FILTER]) {
+   struct nlattr *filters[IFLA_RX_FILTER_MAX+1];
+
+   err = -EOPNOTSUPP;
+   rtnl_ops = dev->rtnl_link_ops;
+   if (!rtnl_ops)
+   goto errout;
+
+   err = nla_parse_nested(filters,
+   IFLA_RX_FILTER_MAX, tb[IFLA_RX_FILTER],
+   ifla_rx_filter_policy);
+   if (err < 0)
+   goto errout;
+
+   if (filters[IFLA_RX_ADDR_FILTER]) {
+   struct nlattr *addr_filters[IFLA_ADDR_FILTER_MAX+1];
+
+   if (!rtnl_ops->set_rx_addr_filter) {
+   err = -EOPNOTSUPP;
+   goto errout;
+   }
+
+   err = nla_parse_nested(addr_filters,
+   IFLA_ADDR_FILTER_MAX,
+   filters[IFLA_RX_ADDR_FILTER],
+   ifla_addr_filter_policy);
+   if (err < 0)
+   goto errout;
+
+   if (addr_filters[IFLA_ADDR_FILTER_FLAGS]) {
+   unsigned int flags = nla_get_u32(
+   addr_filters[IFLA_ADDR_FILTER_FLAGS]);
+   if (flags & ~RX_FILTER_FLAGS) {
+   err = -EINVAL;
+   goto errout;
+   }
+   }
+
+   err = rtnl_ops->set_rx_addr_filter(dev, addr_filters);
+   if (err < 0)
+   goto errout;
+   modified = 1;
+   }
+
+   if (filters[IFLA_RX_VLAN_FILTER]) {
+   struct nlattr *vlan_filters[IFLA_VLAN_FILTER_MAX+1];
+
+   if (!rtnl_ops->set_rx_vlan_filter) {
+   err = -EOPNOTSUPP;
+   goto errout;
+   }
+
+   err = nla_parse_nested(vlan_filters,
+   IFLA_VLAN_FILTER_MAX,
+   filters[IFLA_RX_VLAN_FILTER],
+   ifla_vlan_filter_policy);
+   if (err < 0)
+   goto errout;
+
+   err = rtnl_ops->set_rx_vlan_filter(dev, vlan_filters);
+   if (err < 0)
+   goto errout;
+   modified = 1;
+   }
+   }
err = 0;
 
 errout:


[net-next-2.6 PATCH 4/8 RFC v2] rtnetlink: Add support to get MAC/VLAN filters

2011-10-19 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support in rtnetlink for IFLA_RX_FILTER get.
It adds a new function, rtnl_rx_filter_get_size, to get the size
of rx filters by calling rtnl_link_ops->get_rx_addr_filter_size
and rtnl_link_ops->get_rx_vlan_filter_size.
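
To show the kind of arithmetic a size callback has to do, here is a
user-space sketch of netlink attribute sizing. sk_nla_total_size() mirrors
what the kernel's nla_total_size() computes (4-byte header plus payload,
rounded up to 4 bytes); the two helpers built on it, and their names, are
illustrative assumptions and not the code from this series.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_NLA_HDRLEN     4
#define SK_NLA_ALIGN(len) (((len) + 3) & ~3)

/* Bytes one attribute occupies: header + payload, padded to 4 bytes. */
static size_t sk_nla_total_size(size_t payload)
{
	return SK_NLA_ALIGN(SK_NLA_HDRLEN + payload);
}

/* A nested list of n address entries; nothing is emitted for n == 0. */
static size_t sk_addr_list_size(size_t n_addrs, size_t addr_len)
{
	if (!n_addrs)
		return 0;
	return sk_nla_total_size(n_addrs * sk_nla_total_size(addr_len));
}

/* Flags attribute plus optional unicast and multicast lists. */
static size_t sk_rx_addr_filter_size(size_t n_uc, size_t n_mc,
				     size_t addr_len)
{
	return sk_nla_total_size(sizeof(uint32_t)) +
	       sk_addr_list_size(n_uc, addr_len) +
	       sk_addr_list_size(n_mc, addr_len);
}
```

For 6-byte Ethernet addresses each entry takes 12 bytes, so two unicast
and one multicast address need 8 + 28 + 16 = 52 bytes under these
assumptions.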

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 net/core/rtnetlink.c |   90 +-
 1 files changed, 89 insertions(+), 1 deletions(-)


diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index bc1074d..6a709db 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -475,6 +475,37 @@ static size_t rtnl_link_get_af_size(const struct net_device *dev)
return size;
 }
 
+static size_t rtnl_rx_filter_get_size(const struct net_device *dev)
+{
+   const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
+   size_t size;
+
+   if (!ops)
+   return 0;
+
+   size = nla_total_size(sizeof(struct nlattr)); /* IFLA_RX_FILTER */
+
+   if (ops->get_rx_addr_filter_size) {
+   size_t rx_addr_filter_size = ops->get_rx_addr_filter_size(dev);
+
+   if (rx_addr_filter_size)
+   /* IFLA_RX_ADDR_FILTER */
+   size += nla_total_size(sizeof(struct nlattr)) +
+   rx_addr_filter_size;
+   }
+
+   if (ops->get_rx_vlan_filter_size) {
+   size_t rx_vlan_filter_size = ops->get_rx_vlan_filter_size(dev);
+
+   if (rx_vlan_filter_size)
+   /* IFLA_RX_VLAN_FILTER */
+   size += nla_total_size(sizeof(struct nlattr)) +
+   rx_vlan_filter_size;
+   }
+
+   return size;
+}
+
 static int rtnl_link_fill(struct sk_buff *skb, const struct net_device *dev)
 {
const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
@@ -513,6 +544,59 @@ out:
return err;
 }
 
+static int rtnl_rx_filter_fill(struct sk_buff *skb,
+  const struct net_device *dev)
+{
+   const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
+   struct nlattr *rx_filter, *uninitialized_var(addr_filter);
+   struct nlattr *uninitialized_var(vlan_filter);
+   int err = -EMSGSIZE;
+
+   rx_filter = nla_nest_start(skb, IFLA_RX_FILTER);
+   if (rx_filter == NULL)
+   goto out;
+
+   if (ops->fill_rx_addr_filter) {
+   addr_filter = nla_nest_start(skb, IFLA_RX_ADDR_FILTER);
+   if (addr_filter == NULL)
+   goto err_cancel_rx_filter;
+   err = ops->fill_rx_addr_filter(skb, dev);
+   if (err == -ENODATA)
+   nla_nest_cancel(skb, addr_filter);
+   else if (err < 0)
+   goto err_cancel_addr_filter;
+   else
+   nla_nest_end(skb, addr_filter);
+   }
+
+   if (ops->fill_rx_vlan_filter) {
+   vlan_filter = nla_nest_start(skb, IFLA_RX_VLAN_FILTER);
+   if (vlan_filter == NULL)
+   goto err_cancel_addr_filter;
+   err = ops->fill_rx_vlan_filter(skb, dev);
+   if (err == -ENODATA)
+   nla_nest_cancel(skb, vlan_filter);
+   else if (err)
+   goto err_cancel_vlan_filter;
+   else
+   nla_nest_end(skb, vlan_filter);
+   }
+   nla_nest_end(skb, rx_filter);
+
+   return 0;
+
+err_cancel_vlan_filter:
+   if (ops->fill_rx_vlan_filter)
+   nla_nest_cancel(skb, vlan_filter);
+err_cancel_addr_filter:
+   if (ops->fill_rx_addr_filter)
+   nla_nest_cancel(skb, addr_filter);
+err_cancel_rx_filter:
+   nla_nest_cancel(skb, rx_filter);
+out:
+   return err;
+}
+
 static const int rtm_min[RTM_NR_FAMILIES] =
 {
[RTM_FAM(RTM_NEWLINK)]  = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
@@ -786,7 +870,8 @@ static noinline size_t if_nlmsg_size(const struct net_device *dev)
   + rtnl_vfinfo_size(dev) /* IFLA_VFINFO_LIST */
   + rtnl_port_size(dev) /* IFLA_VF_PORTS + IFLA_PORT_SELF */
   + rtnl_link_get_size(dev) /* IFLA_LINKINFO */
-  + rtnl_link_get_af_size(dev); /* IFLA_AF_SPEC */
+  + rtnl_link_get_af_size(dev) /* IFLA_AF_SPEC */
+  + rtnl_rx_filter_get_size(dev); /* IFLA_RX_FILTER */
 }
 
 static int rtnl_vf_ports_fill(struct sk_buff *skb, struct net_device *dev)
@@ -997,6 +1082,9 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
goto nla_put_failure;
 
if (dev->rtnl_link_ops) {
+   if (rtnl_rx_filter_fill(skb, dev) < 0)
+   goto nla_put_failure;
+
if (rtnl_link_fill(skb, dev) < 0)
goto nla_put_failure;
}


[net-next-2.6 PATCH 5/8 RFC v2] macvlan: Add support to set MAC/VLAN filter rtnl link operations

2011-10-19 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support to set MAC and VLAN filter rtnl_link_ops
on a macvlan interface. It adds support for set_rx_addr_filter and
set_rx_vlan_filter rtnl link operations. It currently supports
only macvlan PASSTHRU mode.

For passthru mode,
- Address filters: macvlan netdev uc and mc lists are
  updated to reflect the addresses in the filter.

- VLAN filter: The currently applied vlan bitmap is maintained in
  struct macvlan_dev->vlan_filter. This vlan bitmap is updated to
  reflect the new bitmap that came in the netlink msg.
  The lowerdev hw vlan filter is updated using macvlan netdev operations
  ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid (which in turn call
  lowerdev vlan add/kill netdev ops).
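
One way to picture the VLAN filter update is as a diff between the
currently applied bitmap and the new one: VIDs that appear only in the new
bitmap get an add call, VIDs that appear only in the old one get a kill
call. The user-space sketch below illustrates that idea under stated
assumptions. The sk_* names, the callback type and the byte-wise bit order
are invented for the example; the 4096-VID, 512-byte bitmap size matches
VLAN_N_VID/8 from this series. It is not the macvlan implementation.

```c
#include <stdint.h>
#include <string.h>

#define SK_VLAN_N_VID       4096
#define SK_VLAN_BITMAP_SIZE (SK_VLAN_N_VID / 8)

/* Stand-in for the add/kill vid operations on the lower device. */
typedef void (*sk_vid_cb)(uint16_t vid, void *ctx);

/* Assumed bit order: VID n lives in byte n / 8, bit n % 8. */
static int sk_vid_test(const uint8_t *bm, uint16_t vid)
{
	return (bm[vid / 8] >> (vid % 8)) & 1;
}

static void sk_vid_set(uint8_t *bm, uint16_t vid)
{
	bm[vid / 8] |= (uint8_t)(1u << (vid % 8));
}

/* Call add/kill for every VID that changed, then adopt the new bitmap. */
static void sk_vlan_filter_apply(uint8_t *cur, const uint8_t *new_bm,
				 sk_vid_cb add, sk_vid_cb kill, void *ctx)
{
	unsigned int vid;

	for (vid = 0; vid < SK_VLAN_N_VID; vid++) {
		int was = sk_vid_test(cur, (uint16_t)vid);
		int now = sk_vid_test(new_bm, (uint16_t)vid);

		if (now && !was)
			add((uint16_t)vid, ctx);
		else if (was && !now)
			kill((uint16_t)vid, ctx);
	}
	memcpy(cur, new_bm, SK_VLAN_BITMAP_SIZE);
}

/* Example callbacks that only count how often they were invoked. */
struct sk_counts {
	int added;
	int killed;
};

static void sk_count_add(uint16_t vid, void *ctx)
{
	(void)vid;
	((struct sk_counts *)ctx)->added++;
}

static void sk_count_kill(uint16_t vid, void *ctx)
{
	(void)vid;
	((struct sk_counts *)ctx)->killed++;
}
```

Only the changed VIDs reach the lower device this way, so re-sending an
unchanged bitmap results in no add or kill calls.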

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 drivers/net/macvlan.c  |  296 
 include/linux/if_macvlan.h |8 +
 2 files changed, 279 insertions(+), 25 deletions(-)


diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 24cf942..dbb2e30 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -299,30 +299,36 @@ static int macvlan_open(struct net_device *dev)
struct net_device *lowerdev = vlan->lowerdev;
int err;
 
-   if (vlan->port->passthru) {
-   dev_set_promiscuity(lowerdev, 1);
-   goto hash_add;
-   }
+   if (!vlan->port->passthru) {
+   err = -EBUSY;
+   if (macvlan_addr_busy(vlan->port, dev->dev_addr))
+   goto out;
 
-   err = -EBUSY;
-   if (macvlan_addr_busy(vlan->port, dev->dev_addr))
-   goto out;
+   err = dev_uc_add(lowerdev, dev->dev_addr);
+   if (err < 0)
+   goto out;
+   }
 
-   err = dev_uc_add(lowerdev, dev->dev_addr);
-   if (err < 0)
-   goto out;
if (dev->flags & IFF_ALLMULTI) {
err = dev_set_allmulti(lowerdev, 1);
if (err < 0)
goto del_unicast;
}
+   if (dev->flags & IFF_PROMISC) {
+   err = dev_set_promiscuity(lowerdev, 1);
+   if (err < 0)
+   goto unset_allmulti;
+   }
 
-hash_add:
macvlan_hash_add(vlan);
return 0;
 
+unset_allmulti:
+   dev_set_allmulti(lowerdev, -1);
+
 del_unicast:
-   dev_uc_del(lowerdev, dev-dev_addr);
+   if (!vlan->port->passthru)
+   dev_uc_del(lowerdev, dev->dev_addr);
 out:
return err;
 }
@@ -332,18 +338,16 @@ static int macvlan_stop(struct net_device *dev)
struct macvlan_dev *vlan = netdev_priv(dev);
struct net_device *lowerdev = vlan->lowerdev;
 
-   if (vlan->port->passthru) {
-   dev_set_promiscuity(lowerdev, -1);
-   goto hash_del;
-   }
-
+   dev_uc_unsync(lowerdev, dev);
dev_mc_unsync(lowerdev, dev);
if (dev->flags & IFF_ALLMULTI)
dev_set_allmulti(lowerdev, -1);
+   if (dev->flags & IFF_PROMISC)
+   dev_set_promiscuity(lowerdev, -1);
 
-   dev_uc_del(lowerdev, dev->dev_addr);
+   if (!vlan->port->passthru)
+   dev_uc_del(lowerdev, dev->dev_addr);
 
-hash_del:
macvlan_hash_del(vlan, !dev->dismantle);
return 0;
 }
@@ -384,12 +388,16 @@ static void macvlan_change_rx_flags(struct net_device *dev, int change)
 
if (change & IFF_ALLMULTI)
dev_set_allmulti(lowerdev, dev->flags & IFF_ALLMULTI ? 1 : -1);
+   if (change & IFF_PROMISC)
+   dev_set_promiscuity(lowerdev,
+   dev->flags & IFF_PROMISC ? 1 : -1);
 }
 
-static void macvlan_set_multicast_list(struct net_device *dev)
+static void macvlan_set_rx_mode(struct net_device *dev)
 {
struct macvlan_dev *vlan = netdev_priv(dev);
 
+   dev_uc_sync(vlan->lowerdev, dev);
dev_mc_sync(vlan->lowerdev, dev);
 }
 
@@ -562,7 +570,7 @@ static const struct net_device_ops macvlan_netdev_ops = {
.ndo_change_mtu = macvlan_change_mtu,
.ndo_change_rx_flags= macvlan_change_rx_flags,
.ndo_set_mac_address= macvlan_set_mac_address,
-   .ndo_set_rx_mode= macvlan_set_multicast_list,
+   .ndo_set_rx_mode= macvlan_set_rx_mode,
.ndo_get_stats64= macvlan_dev_get_stats64,
.ndo_validate_addr  = eth_validate_addr,
.ndo_vlan_rx_add_vid= macvlan_vlan_rx_add_vid,
@@ -574,6 +582,7 @@ void macvlan_common_setup(struct net_device *dev)
ether_setup(dev);
 
dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING);
+   dev->priv_flags |= IFF_UNICAST_FLT;
dev->netdev_ops = &macvlan_netdev_ops;
dev->destructor = free_netdev;
dev->header_ops = &macvlan_hard_header_ops,
@@ -701,6 +710,8 @@ int

[net-next-2.6 PATCH 6/8 RFC v2] macvlan: Add support to get MAC/VLAN filter rtnl link operations

2011-10-19 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support to get MAC and VLAN filter rtnl_link_ops
on a macvlan interface. It adds support for get_rx_addr_filter_size,
get_rx_vlan_filter_size, fill_rx_addr_filter and fill_rx_vlan_filter
rtnl link operations.

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 drivers/net/macvlan.c  |  126 
 include/linux/if_macvlan.h |   10 +++
 2 files changed, 136 insertions(+), 0 deletions(-)


diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index dbb2e30..23636e6 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -1014,6 +1014,128 @@ int macvlan_set_rx_addr_filter(struct net_device *dev,
 }
 EXPORT_SYMBOL(macvlan_set_rx_addr_filter);
 
+static size_t macvlan_get_rx_addr_filter_passthru_size(
+   const struct net_device *dev)
+{
+   size_t size;
+
+   /* IFLA_ADDR_FILTER_FLAGS */
+   size = nla_total_size(sizeof(u32));
+
+   if (netdev_uc_count(dev))
+   /* IFLA_ADDR_FILTER_UC_LIST */
+   size += nla_total_size(netdev_uc_count(dev) *
+  ETH_ALEN * sizeof(struct nlattr));
+
+   if (netdev_mc_count(dev))
+   /* IFLA_ADDR_FILTER_MC_LIST */
+   size += nla_total_size(netdev_mc_count(dev) *
+  ETH_ALEN * sizeof(struct nlattr));
+
+   return size;
+}
+
+size_t macvlan_get_rx_addr_filter_size(const struct net_device *dev)
+{
+   struct macvlan_dev *vlan = netdev_priv(dev);
+
+   switch (vlan->mode) {
+   case MACVLAN_MODE_PASSTHRU:
+   return macvlan_get_rx_addr_filter_passthru_size(dev);
+   default:
+   return 0;
+   }
+}
+EXPORT_SYMBOL(macvlan_get_rx_addr_filter_size);
+
+size_t macvlan_get_rx_vlan_filter_size(const struct net_device *dev)
+{
+   struct macvlan_dev *vlan = netdev_priv(dev);
+
+   switch (vlan->mode) {
+   case MACVLAN_MODE_PASSTHRU:
+   /* IFLA_VLAN_BITMAP */
+   return nla_total_size(VLAN_BITMAP_SIZE);
+   default:
+   return 0;
+   }
+}
+EXPORT_SYMBOL(macvlan_get_rx_vlan_filter_size);
+
+static int macvlan_fill_rx_addr_filter_passthru(struct sk_buff *skb,
+   const struct net_device *dev)
+{
+   struct nlattr *uninitialized_var(uc_list), *mc_list;
+   struct netdev_hw_addr *ha;
+
+   NLA_PUT_U32(skb, IFLA_ADDR_FILTER_FLAGS, dev->flags & RX_FILTER_FLAGS);
+
+   if (netdev_uc_count(dev)) {
+   uc_list = nla_nest_start(skb, IFLA_ADDR_FILTER_UC_LIST);
+   if (uc_list == NULL)
+   goto nla_put_failure;
+
+   netdev_for_each_uc_addr(ha, dev) {
+   NLA_PUT(skb, IFLA_ADDR_LIST_ENTRY, ETH_ALEN, ha->addr);
+   }
+   nla_nest_end(skb, uc_list);
+   }
+
+   if (netdev_mc_count(dev)) {
+   mc_list = nla_nest_start(skb, IFLA_ADDR_FILTER_MC_LIST);
+   if (mc_list == NULL)
+   goto nla_uc_list_cancel;
+
+   netdev_for_each_mc_addr(ha, dev) {
+   NLA_PUT(skb, IFLA_ADDR_LIST_ENTRY, ETH_ALEN, ha->addr);
+   }
+   nla_nest_end(skb, mc_list);
+   }
+
+   return 0;
+
+nla_uc_list_cancel:
+   if (netdev_uc_count(dev))
+   nla_nest_cancel(skb, uc_list);
+nla_put_failure:
+   return -EMSGSIZE;
+}
+
+int macvlan_fill_rx_addr_filter(struct sk_buff *skb,
+   const struct net_device *dev)
+{
+   struct macvlan_dev *vlan = netdev_priv(dev);
+
+   switch (vlan->mode) {
+   case MACVLAN_MODE_PASSTHRU:
+   return macvlan_fill_rx_addr_filter_passthru(skb, dev);
+   default:
+   return -ENODATA; /* No data to Fill */
+   }
+}
+EXPORT_SYMBOL(macvlan_fill_rx_addr_filter);
+
+int macvlan_fill_rx_vlan_filter(struct sk_buff *skb,
+   const struct net_device *dev)
+{
+   struct macvlan_dev *vlan = netdev_priv(dev);
+
+   switch (vlan->mode) {
+   case MACVLAN_MODE_PASSTHRU:
+   NLA_PUT(skb, IFLA_VLAN_BITMAP, VLAN_BITMAP_SIZE,
+   vlan->vlan_filter);
+   break;
+   default:
+   return -ENODATA; /* No data to Fill */
+   }
+
+   return 0;
+
+nla_put_failure:
+   return -EMSGSIZE;
+}
+EXPORT_SYMBOL(macvlan_fill_rx_vlan_filter);
+
 static const struct nla_policy macvlan_policy[IFLA_MACVLAN_MAX + 1] = {
[IFLA_MACVLAN_MODE] = { .type = NLA_U32 },
 };
@@ -1040,6 +1162,10 @@ static struct rtnl_link_ops macvlan_link_ops = {
.dellink= macvlan_dellink,
.set_rx_addr_filter = macvlan_set_rx_addr_filter,
.set_rx_vlan_filter

[net-next-2.6 PATCH 7/8 RFC v2] macvtap: Add support to set MAC/VLAN filter rtnl link operations

2011-10-19 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support to set MAC and VLAN filter rtnl_link_ops
on a macvtap interface. It adds support for set_rx_addr_filter and
set_rx_vlan_filter rtnl link operations. These operations in turn call the
equivalent operations defined in macvlan.

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 drivers/net/macvtap.c |   22 ++
 1 files changed, 18 insertions(+), 4 deletions(-)


diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 3da5578..8a2cb59 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -273,6 +273,18 @@ static int macvtap_receive(struct sk_buff *skb)
return macvtap_forward(skb->dev, skb);
 }
 
+static int macvtap_set_rx_addr_filter(struct net_device *dev,
+   struct nlattr *tb[])
+{
+   return macvlan_set_rx_addr_filter(dev, tb);
+}
+
+static int macvtap_set_rx_vlan_filter(struct net_device *dev,
+   struct nlattr *tb[])
+{
+   return macvlan_set_rx_vlan_filter(dev, tb);
+}
+
 static int macvtap_newlink(struct net *src_net,
   struct net_device *dev,
   struct nlattr *tb[],
@@ -317,10 +329,12 @@ static void macvtap_setup(struct net_device *dev)
 }
 
 static struct rtnl_link_ops macvtap_link_ops __read_mostly = {
-   .kind   = "macvtap",
-   .setup  = macvtap_setup,
-   .newlink= macvtap_newlink,
-   .dellink= macvtap_dellink,
+   .kind   = "macvtap",
+   .setup  = macvtap_setup,
+   .newlink= macvtap_newlink,
+   .dellink= macvtap_dellink,
+   .set_rx_addr_filter = macvtap_set_rx_addr_filter,
+   .set_rx_vlan_filter = macvtap_set_rx_vlan_filter,
 };
 
 


[net-next-2.6 PATCH 8/8 RFC v2] macvtap: Add support to get MAC/VLAN filter rtnl link operations

2011-10-19 Thread Roopa Prabhu

From: Roopa Prabhu ropra...@cisco.com

This patch adds support to get MAC and VLAN filter rtnl_link_ops
on a macvtap interface. It adds support for get_rx_addr_filter_size,
get_rx_vlan_filter_size, fill_rx_addr_filter and fill_rx_vlan_filter
rtnl link operations. Calls equivalent macvlan operations.

Signed-off-by: Roopa Prabhu ropra...@cisco.com
Signed-off-by: Christian Benvenuti be...@cisco.com
Signed-off-by: David Wang dwa...@cisco.com
---
 drivers/net/macvtap.c |   27 +++
 1 files changed, 27 insertions(+), 0 deletions(-)


diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 8a2cb59..9b40de7 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -285,6 +285,29 @@ static int macvtap_set_rx_vlan_filter(struct net_device *dev,
return macvlan_set_rx_vlan_filter(dev, tb);
 }
 
+static int macvtap_fill_rx_addr_filter(struct sk_buff *skb,
+   const struct net_device *dev)
+{
+   return macvlan_fill_rx_addr_filter(skb, dev);
+}
+
+static int macvtap_fill_rx_vlan_filter(struct sk_buff *skb,
+   const struct net_device *dev)
+{
+   return macvlan_fill_rx_vlan_filter(skb, dev);
+}
+
+static size_t macvtap_get_rx_addr_filter_size(const struct net_device *dev)
+{
+   return macvlan_get_rx_addr_filter_size(dev);
+}
+
+static size_t macvtap_get_rx_vlan_filter_size(const struct net_device *dev)
+{
+   return macvlan_get_rx_vlan_filter_size(dev);
+}
+
+
 static int macvtap_newlink(struct net *src_net,
   struct net_device *dev,
   struct nlattr *tb[],
@@ -335,6 +358,10 @@ static struct rtnl_link_ops macvtap_link_ops __read_mostly = {
.dellink= macvtap_dellink,
.set_rx_addr_filter = macvtap_set_rx_addr_filter,
.set_rx_vlan_filter = macvtap_set_rx_vlan_filter,
+   .get_rx_addr_filter_size= macvtap_get_rx_addr_filter_size,
+   .get_rx_vlan_filter_size= macvtap_get_rx_vlan_filter_size,
+   .fill_rx_addr_filter= macvtap_fill_rx_addr_filter,
+   .fill_rx_vlan_filter= macvtap_fill_rx_vlan_filter,
 };
 
 


Re: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering support for passthru mode

2011-10-19 Thread Roopa Prabhu

On 10/19/11 2:06 PM, Rose, Gregory V gregory.v.r...@intel.com wrote:

 -Original Message-
 From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org]
 On Behalf Of Roopa Prabhu
 Sent: Tuesday, October 18, 2011 11:26 PM
 To: net...@vger.kernel.org
 Cc: s...@us.ibm.com; dragos.tatu...@gmail.com; a...@arndb.de;
 kvm@vger.kernel.org; m...@redhat.com; da...@davemloft.net;
 mc...@broadcom.com; dwa...@cisco.com; shemmin...@vyatta.com;
 eric.duma...@gmail.com; ka...@trash.net; be...@cisco.com
 Subject: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering
 support for passthru mode

 [snip...]

 Note: The choice of rtnl_link_ops was because I saw the use case for
 this in virtual devices that need  to do filtering in sw like macvlan
 and tun. Hw devices usually have filtering in hw with netdev->uc and
 mc lists to indicate active filters. But I can move from rtnl_link_ops
 to netdev_ops if that is the preferred way to go and if there is a
 need to support this interface on all kinds of interfaces.
 Please suggest.

 I'm still digesting the rest of the RFC patches but I did want to quickly jump
 in and push for adding this support in netdev_ops.  I would like to see these
 features available in more devices than just macvtap and macvlan.  I can
 conceive of use cases for multiple HW MAC and VLAN filters for a VF device
 that isn't owned by a macvlan/macvtap interface and only has netdev_ops
 support.  In this case it would be necessary to program the filters directly
 to the VF device interface or PF interface (or lowerdev as you refer to it)
 instead of going through macvlan/macvtap.

 This work dovetails nicely with some work I've been doing and I'd be very
 interested in helping move this forward if we could work out the details that
 would allow support of the features we (and the community) require.

Great, thanks. I am definitely interested in getting this patch working for
any other use cases you have.

Moving the ops to netdev should be trivial. You probably want the ops to
work on the VF via the PF, like the existing ndo_set_vf_mac etc.
Yes, let's work out the details and I can move this to netdev_ops. Let me
know.

Thanks,
Roopa


Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-15 Thread Roopa Prabhu



The netlink patch is still in the works. I will post the patches after I
clean them up a bit and address, or find answers to, most of the questions
discussed for the non-passthru case. I thought I would post the netlink
interface here to see if anyone has early comments. I have defined an
rtnl_link_ops->set_rx_filter op.

[IFLA_RX_FILTER] = {
    [IFLA_ADDRESS_FILTER] = {
        [IFLA_ADDRESS_FILTER_FLAGS]
        [IFLA_ADDRESS_LIST] = {
            [IFLA_ADDRESS_LIST_ENTRY]
        }
    }
    [IFLA_VLAN_FILTER] = {
        [IFLA_VLAN_LIST] = {
            [IFLA_VLAN]
        }
    }
}
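
To make the nesting above concrete, here is a rough userspace sketch of how
such a message could be laid out as nested type-length-value attributes. It
is not the kernel patch. Only the attribute names come from the layout above;
the numeric ids, struct msg_buf, attr_put(), nest_start(), nest_end() and
build_rx_filter() are made up for illustration, and values are stored in host
byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Names are from the layout above; the numeric ids are invented for this
 * sketch and are NOT real uapi values. */
enum {
	IFLA_RX_FILTER = 1,
	IFLA_ADDRESS_FILTER,
	IFLA_ADDRESS_FILTER_FLAGS,
	IFLA_ADDRESS_LIST,
	IFLA_ADDRESS_LIST_ENTRY,
	IFLA_VLAN_FILTER,
	IFLA_VLAN_LIST,
	IFLA_VLAN,
};

/* Same shape as a netlink attribute header: length, then type. */
struct attr_hdr {
	uint16_t len;	/* header + payload, excluding trailing padding */
	uint16_t type;
};

#define ATTR_ALIGN(n) (((n) + (size_t)3) & ~(size_t)3)

struct msg_buf {
	uint8_t data[512];
	size_t used;
	int overflow;	/* set once any attribute fails to fit */
};

/* Append one attribute, padded to 4 bytes; returns its offset. */
static size_t attr_put(struct msg_buf *m, uint16_t type,
		       const void *payload, size_t plen)
{
	struct attr_hdr h;
	size_t off = m->used;
	size_t total = sizeof(h) + plen;

	if (m->overflow || ATTR_ALIGN(total) > sizeof(m->data) - off) {
		m->overflow = 1;
		return 0;
	}
	h.len = (uint16_t)total;
	h.type = type;
	memcpy(m->data + off, &h, sizeof(h));
	if (plen)
		memcpy(m->data + off + sizeof(h), payload, plen);
	memset(m->data + off + total, 0, ATTR_ALIGN(total) - total);
	m->used = off + ATTR_ALIGN(total);
	return off;
}

/* Open a nest: an empty header whose length is patched by nest_end(). */
static size_t nest_start(struct msg_buf *m, uint16_t type)
{
	return attr_put(m, type, NULL, 0);
}

/* Close a nest: its length covers everything appended since nest_start(). */
static void nest_end(struct msg_buf *m, size_t start)
{
	struct attr_hdr h;

	if (m->overflow)
		return;
	assert(start + sizeof(h) <= m->used);
	memcpy(&h, m->data + start, sizeof(h));
	h.len = (uint16_t)(m->used - start);
	memcpy(m->data + start, &h, sizeof(h));
}

/* Encode the first layout: flags, a list of MACs, a list of VLAN ids. */
static int build_rx_filter(struct msg_buf *m, uint32_t flags,
			   const uint8_t (*macs)[6], size_t nmacs,
			   const uint16_t *vlans, size_t nvlans)
{
	size_t rx, af, al, vf, vl, i;

	rx = nest_start(m, IFLA_RX_FILTER);
	af = nest_start(m, IFLA_ADDRESS_FILTER);
	attr_put(m, IFLA_ADDRESS_FILTER_FLAGS, &flags, sizeof(flags));
	al = nest_start(m, IFLA_ADDRESS_LIST);
	for (i = 0; i < nmacs; i++)
		attr_put(m, IFLA_ADDRESS_LIST_ENTRY, macs[i], 6);
	nest_end(m, al);
	nest_end(m, af);
	vf = nest_start(m, IFLA_VLAN_FILTER);
	vl = nest_start(m, IFLA_VLAN_LIST);
	for (i = 0; i < nvlans; i++)
		attr_put(m, IFLA_VLAN, &vlans[i], sizeof(vlans[i]));
	nest_end(m, vl);
	nest_end(m, vf);
	nest_end(m, rx);
	return m->overflow ? -1 : 0;
}
```

With two MAC addresses and one VLAN id, this encoding takes 60 bytes in the
sketch, most of it attribute headers and padding rather than payload.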

Some open questions:
- The VLAN filter above shows a VLAN list. It could also be a bitmap, or
the interface could provide both a bitmap and a VLAN list for more
flexibility, like the below:

[IFLA_RX_FILTER] = {
    [IFLA_ADDRESS_FILTER] = {
        [IFLA_ADDRESS_FILTER_FLAGS]
        [IFLA_ADDRESS_LIST] = {
            [IFLA_ADDRESS_LIST_ENTRY]
        }
    }
    [IFLA_VLAN_FILTER] = {
        [IFLA_VLAN_BITMAP]
        [IFLA_VLAN_LIST] = {
            [IFLA_VLAN]
        }
    }
}

- Do you see any advantage in keeping the unicast and multicast address
lists separate? Something like the below:
[IFLA_RX_FILTER] = {
    [IFLA_ADDRESS_FILTER_FLAGS]
    [IFLA_UC_ADDRESS_FILTER] = {
        [IFLA_ADDRESS_LIST] = {
            [IFLA_ADDRESS_LIST_ENTRY]
        }
    }
    [IFLA_MC_ADDRESS_FILTER] = {
        [IFLA_ADDRESS_LIST] = {
            [IFLA_ADDRESS_LIST_ENTRY]
        }
    }
    [IFLA_VLAN_FILTER] = {
        [IFLA_VLAN_LIST] = {
            [IFLA_VLAN]
        }
    }
}

- Is there any need to keep the address and VLAN filters separate, with
two rtnl_link_ops (set_rx_address_filter and set_rx_vlan_filter)? I don't
see one.

[IFLA_RX_ADDRESS_FILTER] = {
    [IFLA_ADDRESS_FILTER_FLAGS]
    [IFLA_ADDRESS_LIST] = {
        [IFLA_ADDRESS_LIST_ENTRY]
    }
}
[IFLA_RX_VLAN_FILTER] = {
    [IFLA_VLAN_LIST] = {
        [IFLA_VLAN]
    }
}
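
On the bitmap-versus-list question above, the trade-off can be sketched in a
few lines of userspace C. This is only an illustration under my own
assumptions: struct vlan_bitmap, vlan_set(), vlan_test() and
vlan_bitmap_to_list() are hypothetical helpers, not part of any patch. The
bitmap always costs 512 bytes (one bit per 12-bit VLAN id), while the list
costs 2 bytes of payload per configured VLAN.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define VLAN_N_VID	4096	/* VLAN ids are 12 bits wide */
#define WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define VLAN_WORDS	((VLAN_N_VID + WORD_BITS - 1) / WORD_BITS)

/* Fixed-size form: one bit per possible VLAN id. */
struct vlan_bitmap {
	unsigned long w[VLAN_WORDS];
};

/* Mark a VLAN id as allowed. */
static void vlan_set(struct vlan_bitmap *b, uint16_t vid)
{
	assert(vid < VLAN_N_VID);
	b->w[vid / WORD_BITS] |= 1UL << (vid % WORD_BITS);
}

/* Return 1 if the VLAN id is allowed, 0 otherwise. */
static int vlan_test(const struct vlan_bitmap *b, uint16_t vid)
{
	if (vid >= VLAN_N_VID)
		return 0;
	return ((b->w[vid / WORD_BITS] >> (vid % WORD_BITS)) & 1UL) != 0;
}

/* Variable-size form: the configured ids in ascending order.
 * Writes at most max ids and returns how many were written. */
static size_t vlan_bitmap_to_list(const struct vlan_bitmap *b,
				  uint16_t *out, size_t max)
{
	size_t n = 0;
	unsigned int vid;

	for (vid = 0; vid < VLAN_N_VID && n < max; vid++)
		if (vlan_test(b, (uint16_t)vid))
			out[n++] = (uint16_t)vid;
	return n;
}
```

Counting payload only, the list is smaller as long as fewer than 256 VLANs
are configured; per-attribute headers in a real netlink message would move
that break-even point lower.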


Thanks,
Roopa



On 9/12/11 10:02 AM, Roopa Prabhu ropra...@cisco.com wrote:

 
 
 
 On 9/11/11 12:03 PM, Michael S. Tsirkin m...@redhat.com wrote:
 
 On Sun, Sep 11, 2011 at 06:18:01AM -0700, Roopa Prabhu wrote:
 
 
 
 On 9/11/11 2:44 AM, Michael S. Tsirkin m...@redhat.com wrote:
 
 
 Yes, but what I mean is, if the size of the single filter table
 is limited, we need to decide how many addresses is
 each guest allowed. If we let one guest ask for
 as many as it wants, it can lock others out.
 
 Yes true. In these cases ie when the number of unicast addresses being
 registered is more than it can handle, The VF driver will put the VF  in
 promiscuous mode (Or at least its supposed to do. I think all drivers do
 that).
 
 
 Thanks,
 Roopa
 
 Right, so that works at least but likely performs worse
 than a hardware filter. So we better allocate it in
 some fair way, as a minimum. Maybe a way for
 the admin to control that allocation is useful.
 
 Yes I think we will have to do something like that. There is a maximum that hw
 can support. Might need to consider that too. But there is no interface to get
 that today. I think the virtualization case gets a little trickier. Virtio-net
 allows up to 64 unicast addresses. But the lowerdev may allow only up to say 10
 unicast addresses (I think intel supports 10 unicast addresses on the VF). Am
 not sure if there is a good way to notify the guest of blocked addresses.
 Maybe putting the lower dev in promiscuous mode could be a policy decision too
 in this case. 
 
 One other thing, I had indicated that I will look up details on opening my
 patch for non-passthru to enable hw filtering (without adding filtering
 support in macvlan right away. Ie phase1). Turns out in current code in
 macvlan_handle_frame, for non-passthru case, it does not fwd unicast pkts
 destined to macs other than the ones in macvlan hash. So a filter or hash
 lookup there for additional unicast addresses needs to be definitely added for
 non-passthru.
 
 Thanks,
 Roopa
 
 
  


Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-12 Thread Roopa Prabhu




On 9/11/11 11:52 AM, Michael S. Tsirkin m...@redhat.com wrote:

 On Sun, Sep 11, 2011 at 06:18:02AM -0700, Roopa Prabhu wrote:
 
 
 
 On 9/11/11 2:38 AM, Michael S. Tsirkin m...@redhat.com wrote:
 
 On Fri, Sep 09, 2011 at 09:33:33AM -0700, Roopa Prabhu wrote:

 
 It's probably more interesting for a card without SRIOV support.
 
 If its an SRIOV card I am assuming people likely using PASSTHRU mode.
 Non-SRIOV cards will use any of the non-PASSTHRU mode.
 
 
 we will have to add filter lookup in macvlan
 to filter pkts for each guest.
 
 Any chance to enable hardware filters for that?
 
 NAFAIK. Am not sure how you would do it too. Its still a single device from
 where the host receives traffic from.
 
 Thanks,
 Roopa
 
 VMDQ cards might let you program mac addresses for individual rings.
 
I tried to look up more information on this but didn't find anything
concrete. I am not sure whether individual rings show up as separate netdevs.
Is there any more info on how a VMDQ NIC is used with macvlan?

I came across this:
http://www.linux-kvm.org/wiki/images/6/6a/KvmForum2008$kdf2008_7.pdf

Thanks,
Roopa
 
 


Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-12 Thread Roopa Prabhu




On 9/11/11 12:03 PM, Michael S. Tsirkin m...@redhat.com wrote:

 On Sun, Sep 11, 2011 at 06:18:01AM -0700, Roopa Prabhu wrote:
 
 
 
 On 9/11/11 2:44 AM, Michael S. Tsirkin m...@redhat.com wrote:
 
 
 Yes, but what I mean is, if the size of the single filter table
 is limited, we need to decide how many addresses is
 each guest allowed. If we let one guest ask for
 as many as it wants, it can lock others out.
 
 Yes true. In these cases ie when the number of unicast addresses being
 registered is more than it can handle, The VF driver will put the VF  in
 promiscuous mode (Or at least its supposed to do. I think all drivers do
 that).
 
 
 Thanks,
 Roopa
 
 Right, so that works at least but likely performs worse
 than a hardware filter. So we better allocate it in
 some fair way, as a minimum. Maybe a way for
 the admin to control that allocation is useful.

Yes, I think we will have to do something like that. There is a maximum that
hw can support, which we might need to consider too, but there is no interface
to get that today. I think the virtualization case gets a little trickier.
Virtio-net allows up to 64 unicast addresses, but the lowerdev may allow only
up to, say, 10 unicast addresses (I think Intel supports 10 unicast addresses
on the VF). I am not sure if there is a good way to notify the guest of
blocked addresses. Maybe putting the lowerdev in promiscuous mode could be
a policy decision too in this case.
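
As a rough illustration of the admin-controlled allocation discussed above,
here is a toy userspace model of one hardware filter table shared by several
guests, with a per-guest quota so that one guest cannot lock the others out.
Everything in it (struct filter_table, filter_slot_get(), filter_slot_put(),
MAX_GUESTS) is hypothetical; no such interface exists today.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GUESTS 8

/* Toy model of one hardware filter table shared by several guests. */
struct filter_table {
	size_t capacity;		/* total hardware filter slots */
	size_t used;			/* slots handed out so far */
	size_t quota[MAX_GUESTS];	/* admin-set limit per guest */
	size_t in_use[MAX_GUESTS];	/* slots held by each guest */
};

/* Try to reserve one filter slot for a guest.
 * Returns 0 on success, -1 if the guest is over quota or the table is
 * full; the caller then chooses between an error and promiscuous mode. */
static int filter_slot_get(struct filter_table *t, unsigned int guest)
{
	assert(guest < MAX_GUESTS);
	if (t->in_use[guest] >= t->quota[guest] || t->used >= t->capacity)
		return -1;
	t->in_use[guest]++;
	t->used++;
	return 0;
}

/* Give one slot back when the guest drops an address. */
static void filter_slot_put(struct filter_table *t, unsigned int guest)
{
	assert(guest < MAX_GUESTS && t->in_use[guest] > 0);
	t->in_use[guest]--;
	t->used--;
}
```

What to do when filter_slot_get() fails is left to the caller on purpose,
since whether to reject the address or fall back to promiscuous mode is the
policy decision mentioned above.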

One other thing: I had indicated that I would look up details on opening my
patch up for non-passthru to enable hw filtering (without adding filtering
support in macvlan right away, i.e. phase 1). It turns out that in the current
code, macvlan_handle_frame in the non-passthru case does not forward unicast
pkts destined to MACs other than the ones in the macvlan hash. So a filter or
hash lookup for additional unicast addresses definitely needs to be added
there for non-passthru.

Thanks,
Roopa


 


Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-12 Thread Roopa Prabhu




On 9/11/11 9:30 PM, Sridhar Samudrala s...@us.ibm.com wrote:

 On 9/11/2011 6:18 AM, Roopa Prabhu wrote:
 
 
 On 9/11/11 2:44 AM, Michael S. Tsirkinm...@redhat.com  wrote:
 
 AFAIK, though it might maintain a single filter table space in hw, hw does
 know which filter belongs to which VF. And the OS driver does not need to
 do
 anything special. The VF driver exposes a VF netdev. And any uc/mc
 addresses
 registered with a VF netdev are registered with the hw by the driver. And
 hw
 will filter and send only pkts that the VF has expressed interest in.
 
 No special filter partitioning in hw is required.
 
 Thanks,
 Roopa
 Yes, but what I mean is, if the size of the single filter table
 is limited, we need to decide how many addresses is
 each guest allowed. If we let one guest ask for
 as many as it wants, it can lock others out.
 Yes true. In these cases ie when the number of unicast addresses being
 registered is more than it can handle, The VF driver will put the VF  in
 promiscuous mode (Or at least its supposed to do. I think all drivers do
 that).
 
 What does putting VF in promiscuous mode mean?  How can the NIC decide
 which set
 of mac addresses are passed to the VF? Does it mean VF sees all the
 packets received
 by the NIC including packets destined for other VFs/PF?
 
Yes, I think so. After your question I looked at two other VF drivers, and it
looks like they return an error if the number of unicast addresses exceeds the
number supported by hw; they don't put the VF in promiscuous mode. But I think
one could put the VF in promiscuous mode by changing IFF_FLAGS.

The original in-kernel passthru mode code puts the VF in promiscuous mode by
default. I am assuming that works well with the other SR-IOV cards you got a
chance to try out.

Thanks,
Roopa


Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-11 Thread Roopa Prabhu




On 9/11/11 2:44 AM, Michael S. Tsirkin m...@redhat.com wrote:

 
 AFAIK, though it might maintain a single filter table space in hw, hw does
 know which filter belongs to which VF. And the OS driver does not need to do
 anything special. The VF driver exposes a VF netdev. And any uc/mc addresses
 registered with a VF netdev are registered with the hw by the driver. And hw
 will filter and send only pkts that the VF has expressed interest in.
 
 No special filter partitioning in hw is required.
 
 Thanks,
 Roopa
 
 Yes, but what I mean is, if the size of the single filter table
 is limited, we need to decide how many addresses is
 each guest allowed. If we let one guest ask for
 as many as it wants, it can lock others out.

Yes, true. In these cases, i.e. when the number of unicast addresses being
registered is more than it can handle, the VF driver will put the VF in
promiscuous mode (or at least it's supposed to; I think all drivers do
that).


Thanks,
Roopa



Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-11 Thread Roopa Prabhu




On 9/11/11 2:38 AM, Michael S. Tsirkin m...@redhat.com wrote:

 On Fri, Sep 09, 2011 at 09:33:33AM -0700, Roopa Prabhu wrote:
 
 
 
 On 9/8/11 10:55 PM, Michael S. Tsirkin m...@redhat.com wrote:
 
 On Thu, Sep 08, 2011 at 07:53:11PM -0700, Roopa Prabhu wrote:
 Phase 1: Goal: Enable hardware filtering for all macvlan modes
 - In macvlan passthru mode the single guest virtio-nic connected will
   receive traffic that he requested for
 - In macvlan non-passthru mode all guest virtio-nics sharing the
   physical nic will see all other guest traffic
   but the filtering at guest virtio-nic
 
 I don't think guests currently filter anything.
 
 I was referring to Qemu-kvm virtio-net in
 virtio_net_receive->receive_filter. I think It only passes pkts that the
 guest OS is interested. It uses the filter table that I am passing to
 macvtap in this patch.
 
 This happens after userspace thread gets woken up and data
 is copied there. So relying on filtering at that level is
 going to be very inefficient on a system with
 multiple active guests. Further, and for that reason, vhost-net
 doesn't do filtering at all, relying on the backends
 to pass it correct packets.
 
 Ok thanks for the info. So in which case, phase 1 is best for PASSTHRU mode
 and for non-PASSTHRU when there is a single guest connected to a VF.
 For non-PASSTHRU multi guest sharing the same VF, Phase 1 is definitely
 better than putting the VF in promiscuous mode.
 But to address the concern you mention above, in phase 2 when we have more
 than one guest sharing the VF,
 
 It's probably more interesting for a card without SRIOV support.
 
If it's an SR-IOV card, I am assuming people are likely using PASSTHRU mode.
Non-SR-IOV cards will use any of the non-PASSTHRU modes.


 we will have to add filter lookup in macvlan
 to filter pkts for each guest.
 
 Any chance to enable hardware filters for that?
 
Not AFAIK. I am not sure how you would do it either. It's still a single
device that the host receives traffic from.

Thanks,
Roopa
 


Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-09 Thread Roopa Prabhu




On 9/8/11 9:25 PM, Sridhar Samudrala s...@us.ibm.com wrote:

 On 9/8/2011 8:00 PM, Roopa Prabhu wrote:
 
 
 On 9/8/11 12:33 PM, Michael S. Tsirkinm...@redhat.com  wrote:
 
 On Thu, Sep 08, 2011 at 12:23:56PM -0700, Roopa Prabhu wrote:
 I think the main usecase for passthru mode is to assign a SR-IOV VF to
 a single guest.
 
 Yes and for the passthru usecase this patch should be enough to enable
 filtering in hw (eventually like I indicated before I need to fix vlan
 filtering too).
 So with filtering in hw, and in sriov VF case, VFs
 actually share a filtering table. How will that
 be partitioned?
 AFAIK, though it might maintain a single filter table space in hw, hw does
 know which filter belongs to which VF. And the OS driver does not need to do
 anything special. The VF driver exposes a VF netdev. And any uc/mc addresses
 registered with a VF netdev are registered with the hw by the driver. And hw
 will filter and send only pkts that the VF has expressed interest in.
 Does your NIC  driver support adding multiple mac addresses to a VF?
 I have tried a few other SR-IOV NICs sometime back and they didn't
 support this feature.

Yes, our NIC does. I thought Intel's does too (see ixgbevf_set_rx_mode),
though I have not really tried using it on an Intel card. I think most cards
should at least support multicast filters.

If the lowerdev does not support unicast filtering, dev_uc_add(lowerdev,..)
puts the lowerdev in promiscuous mode. Though I think I can check this
beforehand in macvlan_open and put the lowerdev in promiscuous mode if it
does not support filtering.
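
The fallback described above can be modelled in a few lines of userspace C.
This is a toy, not kernel code: struct lowerdev_model and the model_*()
helpers are hypothetical, and treating a full filter table the same way as
missing filter support is my own assumption about policy.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a lower device; field names are illustrative. */
struct lowerdev_model {
	int supports_uc_filter;	/* can the hardware filter on unicast MACs? */
	size_t uc_max;		/* filter slots available when it can */
	size_t uc_count;	/* secondary unicast addresses requested */
	int promisc;		/* current receive mode */
};

/* Recompute the receive mode after the address list changed. */
static void model_set_rx_mode(struct lowerdev_model *d)
{
	if (d->uc_count == 0)
		d->promisc = 0;		/* nothing extra to receive */
	else if (!d->supports_uc_filter || d->uc_count > d->uc_max)
		d->promisc = 1;		/* cannot filter: accept everything */
	else
		d->promisc = 0;		/* hardware filters cover the list */
}

/* Register one more secondary unicast address. */
static void model_uc_add(struct lowerdev_model *d)
{
	d->uc_count++;
	model_set_rx_mode(d);
}

/* Drop one secondary unicast address. */
static void model_uc_del(struct lowerdev_model *d)
{
	assert(d->uc_count > 0);
	d->uc_count--;
	model_set_rx_mode(d);
}
```

Checking supports_uc_filter once at open time, as suggested above, would
give the same end state without going through the add path first.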

 
 Currently, we don't have an interface to add multiple mac addresses to a
 netdev other than an
 indirect way of creating a macvlan /if on top of it.

Yes I think so. I have been using only macvlan to test.

Thanks,
Roopa


Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-09 Thread Roopa Prabhu




On 9/8/11 10:55 PM, Michael S. Tsirkin m...@redhat.com wrote:

 On Thu, Sep 08, 2011 at 07:53:11PM -0700, Roopa Prabhu wrote:
 Phase 1: Goal: Enable hardware filtering for all macvlan modes
 - In macvlan passthru mode the single guest virtio-nic connected will
   receive traffic that he requested for
 - In macvlan non-passthru mode all guest virtio-nics sharing the
   physical nic will see all other guest traffic
   but the filtering at guest virtio-nic
 
 I don't think guests currently filter anything.
 
 I was referring to Qemu-kvm virtio-net in
 virtio_net_receive->receive_filter. I think It only passes pkts that the
 guest OS is interested. It uses the filter table that I am passing to
 macvtap in this patch.
 
 This happens after userspace thread gets woken up and data
 is copied there. So relying on filtering at that level is
 going to be very inefficient on a system with
 multiple active guests. Further, and for that reason, vhost-net
 doesn't do filtering at all, relying on the backends
 to pass it correct packets.

OK, thanks for the info. In that case, phase 1 is best for PASSTHRU mode
and for non-PASSTHRU when there is a single guest connected to a VF.
For non-PASSTHRU with multiple guests sharing the same VF, phase 1 is
definitely better than putting the VF in promiscuous mode.
But to address the concern you mention above, in phase 2, when we have more
than one guest sharing the VF, we will have to add a filter lookup in macvlan
to filter pkts for each guest. This will need some performance tests too.

I will start by investigating the netlink interface comments for phase 1.

Thanks!
-Roopa


Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-08 Thread Roopa Prabhu




On 9/8/11 10:42 AM, Sridhar Samudrala s...@us.ibm.com wrote:

 On Thu, 2011-09-08 at 09:19 -0700, Roopa Prabhu wrote:
 
 
 On 9/8/11 4:08 AM, Michael S. Tsirkin m...@redhat.com wrote:
 
 On Wed, Sep 07, 2011 at 10:20:28PM -0700, Roopa Prabhu wrote:
 On 9/7/11 5:34 AM, Michael S. Tsirkin m...@redhat.com wrote:
 
 On Tue, Sep 06, 2011 at 03:35:40PM -0700, Roopa Prabhu wrote:
 This patch is an attempt at providing address filtering support for
 macvtap devices in PASSTHRU mode. Its still a work in progress.
 Briefly tested for basic functionality. Wanted to get some feedback on
 the direction before proceeding.
 
 
 Good work, thanks.
 
 
 Thanks.
 
 I have hopefully CC'ed all concerned people.
 
 kvm crowd might also be interested.
 Try using ./scripts/get_maintainer.pl as well.
 
 Thanks for the tip. Expanded CC list a bit more.
 
 PASSTHRU mode today sets the lowerdev in promiscuous mode. In PASSTHRU
 mode there is a 1-1 mapping between macvtap device and physical nic or VF.
 And all filtering is done in lowerdev hw. The lowerdev does not need to be
 in promiscuous mode as long as the guest filters are passed down to the
 lowerdev.
 This patch tries to remove the need for putting the lowerdev in
 promiscuous mode.
 I have also referred to the thread below where TUNSETTXFILTER was
 mentioned in this context:
  http://patchwork.ozlabs.org/patch/69297/
 
 This patch basically passes the addresses got by TUNSETTXFILTER to
 macvlan lowerdev.
 
 I have looked at previous work and discussions on this for qemu-kvm
 by Michael Tsirkin, Alex Williamson and Dragos Tatulea
 http://patchwork.ozlabs.org/patch/78595/
 http://patchwork.ozlabs.org/patch/47160/
 https://patchwork.kernel.org/patch/474481/
 
 Redhat bugzilla by Michael Tsirkin:
 https://bugzilla.redhat.com/show_bug.cgi?id=655013
 
 I used Michael's qemu-kvm patch for testing the changes with KVM
 
 I would like to cover both MAC and vlan filtering in this work.
 
 Open Questions/Issues:
 - There is a need for vlan filtering to complete the patch. It will
 require
   a new tap ioctl cmd for vlans.
   Some ideas on this are:
 
   a) TUNSETVLANFILTER: This will entail we send the whole vlan bitmap
 filter
 (similar to tun_filter for addresses). Passing the vlan id's to lower
 device will mean going thru the whole list of vlans every time.
 
   OR
 
   b) TUNSETVLAN with vlan id and flag to set/unset
 
   Does option 'b' sound ok ?
 
 - In this implementation we make the macvlan address list same as the
 address
   list that came in the filter with TUNSETTXFILTER. This will not cover
 cases
   where the macvlan device needs to have other addresses that are not
   necessarily in the filter. Is this a problem ?
 
 What cases do you have in mind?
 
 This patch targets only macvlan PASSTHRU mode and for PASSTHRU mode I don't
 see a problem with uc/mc address list being the same in all the stacked
 netdevs in the path. I called that out above to make sure I was not missing
 any case in PASSTHRU mode where this might be invalid. Otherwise I don't
 see
 a problem in the simple PASSTHRU use case this patch supports.
 
 - The patch currently only supports passing of IFF_PROMISC and
 IFF_MULTICAST
 filter flags to lowerdev
 
 This patch series implements the following
 01/3 - macvlan: Add support for unicast filtering in macvlan
 02/3 - macvlan: Add function to set addr filter on lower device in
 passthru
 mode
 03/3 - macvtap: Add support for TUNSETTXFILTER
 
 Please comment. Thanks.
 
 Signed-off-by: Roopa Prabhu ropra...@cisco.com
 Signed-off-by: Christian Benvenuti be...@cisco.com
 Signed-off-by: David Wang dwa...@cisco.com
 
 The security isn't lower than with promisc, so I don't see
 a problem with this as such.
 
 There are more features we'll want down the road though,
 so let's see whether the interface will be able to
 satisfy them in a backwards compatible way before we
 set it in stone. Here's what I came up with:
 
 How will the filtering table be partitioned within guests?
 
 Since this patch supports macvlan PASSTHRU mode only, in which the lower
 device has 1-1 mapping to the guest nic, it does not require any
 partitioning of filtering table within guests. Unless I missed
 understanding
 something. 
 If the lower device were being shared by multiple guest network interfaces
 (non PASSTHRU mode), only then we will need to maintain separate filter
 tables for each guest network interface in macvlan and forward the pkt to
 respective guest interface after a filter lookup. This could affect
 performance too I think.
 
 Not with hardware filtering support. Which is where we'd need to
 partition the host nic mac table between guests.
 
 I need to understand this more. In non passthru case when a VF or physical
 nic is shared between guests, the nic does not really know about the guests,
 so I was thinking we do the same thing as we do for the passthru case (ie
 send all the address filters from macvlan to the physical nic). So at the
 hardware

Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-08 Thread Roopa Prabhu




On 9/8/11 4:08 AM, Michael S. Tsirkin m...@redhat.com wrote:

 On Wed, Sep 07, 2011 at 10:20:28PM -0700, Roopa Prabhu wrote:
 On 9/7/11 5:34 AM, Michael S. Tsirkin m...@redhat.com wrote:
 
 On Tue, Sep 06, 2011 at 03:35:40PM -0700, Roopa Prabhu wrote:
 This patch is an attempt at providing address filtering support for macvtap
 devices in PASSTHRU mode. Its still a work in progress.
 Briefly tested for basic functionality. Wanted to get some feedback on the
 direction before proceeding.
 
 
 Good work, thanks.
 
 
 Thanks.
 
 I have hopefully CC'ed all concerned people.
 
 kvm crowd might also be interested.
 Try using ./scripts/get_maintainer.pl as well.
 
 Thanks for the tip. Expanded CC list a bit more.
 
 PASSTHRU mode today sets the lowerdev in promiscuous mode. In PASSTHRU mode
 there is a 1-1 mapping between macvtap device and physical nic or VF. And
 all filtering is done in lowerdev hw. The lowerdev does not need to be in
 promiscuous mode as long as the guest filters are passed down to the
 lowerdev.
 This patch tries to remove the need for putting the lowerdev in promiscuous
 mode.
 I have also referred to the thread below where TUNSETTXFILTER was mentioned
 in this context:
  http://patchwork.ozlabs.org/patch/69297/
 
 This patch basically passes the addresses got by TUNSETTXFILTER to macvlan
 lowerdev.
 
 I have looked at previous work and discussions on this for qemu-kvm
 by Michael Tsirkin, Alex Williamson and Dragos Tatulea
 http://patchwork.ozlabs.org/patch/78595/
 http://patchwork.ozlabs.org/patch/47160/
 https://patchwork.kernel.org/patch/474481/
 
 Redhat bugzilla by Michael Tsirkin:
 https://bugzilla.redhat.com/show_bug.cgi?id=655013
 
 I used Michael's qemu-kvm patch for testing the changes with KVM
 
 I would like to cover both MAC and vlan filtering in this work.
 
 Open Questions/Issues:
 - There is a need for vlan filtering to complete the patch. It will require
   a new tap ioctl cmd for vlans.
   Some ideas on this are:
 
   a) TUNSETVLANFILTER: This will entail we send the whole vlan bitmap
 filter
 (similar to tun_filter for addresses). Passing the vlan id's to lower
 device will mean going thru the whole list of vlans every time.
 
   OR
 
   b) TUNSETVLAN with vlan id and flag to set/unset
 
   Does option 'b' sound ok ?
 
 - In this implementation we make the macvlan address list same as the
 address
   list that came in the filter with TUNSETTXFILTER. This will not cover
 cases
   where the macvlan device needs to have other addresses that are not
   necessarily in the filter. Is this a problem ?
 
 What cases do you have in mind?
 
 This patch targets only macvlan PASSTHRU mode and for PASSTHRU mode I don't
 see a problem with uc/mc address list being the same in all the stacked
 netdevs in the path. I called that out above to make sure I was not missing
 any case in PASSTHRU mode where this might be invalid. Otherwise I don't see
 a problem in the simple PASSTHRU use case this patch supports.
 
 - The patch currently only supports passing of IFF_PROMISC and
 IFF_MULTICAST
 filter flags to lowerdev
 
 This patch series implements the following
 01/3 - macvlan: Add support for unicast filtering in macvlan
 02/3 - macvlan: Add function to set addr filter on lower device in passthru
 mode
 03/3 - macvtap: Add support for TUNSETTXFILTER
 
 Please comment. Thanks.
 
 Signed-off-by: Roopa Prabhu ropra...@cisco.com
 Signed-off-by: Christian Benvenuti be...@cisco.com
 Signed-off-by: David Wang dwa...@cisco.com
 
 The security isn't lower than with promisc, so I don't see
 a problem with this as such.
 
 There are more features we'll want down the road though,
 so let's see whether the interface will be able to
 satisfy them in a backwards compatible way before we
 set it in stone. Here's what I came up with:
 
 How will the filtering table be partitioned within guests?
 
 Since this patch supports macvlan PASSTHRU mode only, in which the lower
 device has 1-1 mapping to the guest nic, it does not require any
 partitioning of filtering table within guests. Unless I missed understanding
 something. 
 If the lower device were being shared by multiple guest network interfaces
 (non PASSTHRU mode), only then we will need to maintain separate filter
 tables for each guest network interface in macvlan and forward the pkt to
 respective guest interface after a filter lookup. This could affect
 performance too I think.
 
 Not with hardware filtering support. Which is where we'd need to
 partition the host nic mac table between guests.
 
I need to understand this more. In non passthru case when a VF or physical
nic is shared between guests, the nic does not really know about the guests,
so I was thinking we do the same thing as we do for the passthru case (ie
send all the address filters from macvlan to the physical nic). So at the
hardware, filtering is done for all guests sharing the nic. But if we want
each virtio-net nic or guest to get exactly what it asked for
macvlan/macvtap

Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-08 Thread Roopa Prabhu




On 9/8/11 12:11 PM, Michael S. Tsirkin m...@redhat.com wrote:

 On Thu, Sep 08, 2011 at 09:19:32AM -0700, Roopa Prabhu wrote:
 There are more features we'll want down the road though,
 so let's see whether the interface will be able to
 satisfy them in a backwards compatible way before we
 set it in stone. Here's what I came up with:
 
 How will the filtering table be partitioned within guests?
 
 Since this patch supports macvlan PASSTHRU mode only, in which the lower
 device has 1-1 mapping to the guest nic, it does not require any
 partitioning of filtering table within guests. Unless I missed
 understanding
 something. 
 If the lower device were being shared by multiple guest network interfaces
 (non PASSTHRU mode), only then we will need to maintain separate filter
 tables for each guest network interface in macvlan and forward the pkt to
 respective guest interface after a filter lookup. This could affect
 performance too I think.
 
 Not with hardware filtering support. Which is where we'd need to
 partition the host nic mac table between guests.
 
 I need to understand this more. In non passthru case when a VF or physical
 nic is shared between guests,
 
 For example, consider a VF given to each guest. Hardware supports a fixed
 total number of filters, which can be partitioned between VFs.
 
Oh, OK. But as far as I know, hw maintains VF filters separately for every VF.
Filters received on a VF are programmed for that VF only. I am assuming all
hardware does this. At least our hardware does.
What I was referring to was a single VF shared between guests using macvtap
(could be bridge mode, for example). All guests sharing the VF will register
filters with the VF via macvlan. Hw makes sure whatever the VF asked for
is received at the VF. The VF in hw does not know that it is shared by guests.
Only at macvlan might we need to re-filter the pkts received on the VF and
steer pkts to the individual guests based on what they asked for.
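
A minimal user-space sketch of that re-filtering step, assuming each guest's
filter is just a small array of MAC addresses. The names guest_filter,
guest_filter_add, guest_filter_match and steer_to_guest, and the limit of 8
addresses, are hypothetical and are not part of macvlan. The linear scan is
only there to show the lookup whose cost is in question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6
#define MAX_FILTER_ADDRS 8 /* hypothetical per-guest limit */

/* Hypothetical per-guest copy of the filter the guest registered. */
struct guest_filter {
    size_t count;
    uint8_t addrs[MAX_FILTER_ADDRS][MAC_LEN];
};

/* Record one address; returns false when the table is full. */
static bool guest_filter_add(struct guest_filter *f, const uint8_t *mac)
{
    assert(f != NULL && mac != NULL);
    if (f->count >= MAX_FILTER_ADDRS)
        return false;
    memcpy(f->addrs[f->count++], mac, MAC_LEN);
    return true;
}

/* True if the destination address is one this guest asked for. */
static bool guest_filter_match(const struct guest_filter *f, const uint8_t *dst)
{
    for (size_t i = 0; i < f->count; i++)
        if (memcmp(f->addrs[i], dst, MAC_LEN) == 0)
            return true;
    return false;
}

/* Index of the first guest whose filter accepts dst, or -1 if none does. */
static int steer_to_guest(const struct guest_filter *guests, size_t nguests,
                          const uint8_t *dst)
{
    for (size_t g = 0; g < nguests; g++)
        if (guest_filter_match(&guests[g], dst))
            return (int)g;
    return -1;
}
```

With N guests and M addresses per guest this is an O(N*M) scan per packet,
which is the data-path overhead that would need to be measured.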


 the nic does not really know about the guests,
 so I was thinking we do the same thing as we do for the passthru case (ie
 send all the address filters from macvlan to the physical nic). So at the
 hardware, filtering is done for all guests sharing the nic. But if we want
 each virtio-net nic or guest to get exactly what it asked for
 macvlan/macvtap needs to maintain a copy of each guest filter and do a
 lookup and send only the requested traffic to the guest. Here is the
 performance hit that I was seeing. Please see my next comment for further
 details. 
 
 It won't be any slower than attaching a non-passthrough macvlan
 to a device, will it?
 
I am not sure. The filter lookup in macvlan is the one I am concerned about.
I will need to try it out.

 
 I chose to support PASSTHRU mode only at first because it's simpler and all
 code additions are in control path only.
 
 I agree. It would be a bit silly to have a dedicated interface
 for passthough and a completely separate one for
 non passthrough.
 
 Agree. The reason I did not focus on non-passthru case in the initial
 version was because I was thinking things to do in the non-passthru case
 will be just add-ons to the passthru case. But true, better to flesh out the
 non-passthru case details.
 
 After dwelling on this a bit more how about the below:
 
 Phase 1: Goal: Enable hardware filtering for all macvlan modes
 - In macvlan passthru mode the single guest virtio-nic connected will
   receive traffic that he requested for
 - In macvlan non-passthru mode all guest virtio-nics sharing the
   physical nic will see all other guest traffic
   but the filtering at guest virtio-nic
 
 I don't think guests currently filter anything.
 
I was referring to qemu-kvm virtio-net, in
virtio_net_receive -> receive_filter. I think it only passes pkts that the
guest OS is interested in. It uses the filter table that I am passing to
macvtap in this patch.


   will make sure each guest
   eventually sees traffic he asked for. This is still better than
   putting the physical nic in promiscuous mode.
 
 (This is mainly what my patch does...but will need to remove the passthru
 check and see if there are any thing else needed for non-passthru case)
 
 I'm fine with sticking with passthrough, make non passthrough
 a separate phase.
 
Ok.

 
 Phase 2: Goal: Enable filtering at macvlan so that each guest virtio-nic
 receives only what he requested for.
 - In this case, in addition to pushing the filters down to the physical
   nic we will have to maintain the same filter in macvlan and do a filter
   lookup before forwarding the traffic to a virtio-nic.
 
 But I am thinking phase 2 might be redundant given virtio-nic already does
 filtering for the guest.
 
 It does? Do you mean the filter that qemu does in userspace?
 
Yes, I meant the filter that qemu does in userspace, in
qemu-kvm/hw/virtio-net.c receive_filter().

 In which case we might not need phase 2 at all. I
 might have been over complicating things

Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode

2011-09-07 Thread Roopa Prabhu

On 9/7/11 5:34 AM, Michael S. Tsirkin m...@redhat.com wrote:

 On Tue, Sep 06, 2011 at 03:35:40PM -0700, Roopa Prabhu wrote:
 This patch is an attempt at providing address filtering support for macvtap
 devices in PASSTHRU mode. It's still a work in progress.
 Briefly tested for basic functionality. Wanted to get some feedback on the
 direction before proceeding.
 
 
 Good work, thanks.
 

Thanks.

 I have hopefully CC'ed all concerned people.
 
 kvm crowd might also be interested.
 Try using ./scripts/get_maintainer.pl as well.
 
Thanks for the tip. Expanded CC list a bit more.

 PASSTHRU mode today sets the lowerdev in promiscuous mode. In PASSTHRU mode
 there is a 1-1 mapping between macvtap device and physical nic or VF. And all
 filtering is done in lowerdev hw. The lowerdev does not need to be in
 promiscuous mode as long as the guest filters are passed down to the lowerdev.
 This patch tries to remove the need for putting the lowerdev in promiscuous
 mode.
 I have also referred to the thread below where TUNSETTXFILTER was mentioned
 in this context:
  http://patchwork.ozlabs.org/patch/69297/
 
 This patch basically passes the addresses got by TUNSETTXFILTER to macvlan
 lowerdev.
 
 I have looked at previous work and discussions on this for qemu-kvm
 by Michael Tsirkin, Alex Williamson and Dragos Tatulea
 http://patchwork.ozlabs.org/patch/78595/
 http://patchwork.ozlabs.org/patch/47160/
 https://patchwork.kernel.org/patch/474481/
 
 Redhat bugzilla by Michael Tsirkin:
 https://bugzilla.redhat.com/show_bug.cgi?id=655013
 
 I used Michael's qemu-kvm patch for testing the changes with KVM
 
 I would like to cover both MAC and vlan filtering in this work.
 
 Open Questions/Issues:
 - There is a need for vlan filtering to complete the patch. It will require
   a new tap ioctl cmd for vlans.
   Some ideas on this are:
 
   a) TUNSETVLANFILTER: This will entail we send the whole vlan bitmap filter
 (similar to tun_filter for addresses). Passing the vlan id's to lower
 device will mean going thru the whole list of vlans every time.
 
   OR
 
   b) TUNSETVLAN with vlan id and flag to set/unset
 
   Does option 'b' sound ok ?
 
 - In this implementation we make the macvlan address list same as the address
   list that came in the filter with TUNSETTXFILTER. This will not cover cases
   where the macvlan device needs to have other addresses that are not
   necessarily in the filter. Is this a problem ?
 
 What cases do you have in mind?
 
This patch targets only macvlan PASSTHRU mode and for PASSTHRU mode I don't
see a problem with uc/mc address list being the same in all the stacked
netdevs in the path. I called that out above to make sure I was not missing
any case in PASSTHRU mode where this might be invalid. Otherwise I don't see
a problem in the simple PASSTHRU use case this patch supports.

 - The patch currently only supports passing of IFF_PROMISC and IFF_MULTICAST
 filter flags to lowerdev
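
To make the trade-off between vlan options 'a' and 'b' above concrete, here is
a rough user-space sketch. It assumes a 4096-bit bitmap as the filter for
option 'a'. The struct vlan_filter and the helpers vlan_filter_set,
vlan_filter_test and vlan_filter_replace are hypothetical illustrations, not
an existing tap or macvlan interface, and no ioctl layout is implied.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define VLAN_N_VID 4096
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define VLAN_WORDS ((VLAN_N_VID + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical filter: one bit per vlan id. */
struct vlan_filter {
    unsigned long bits[VLAN_WORDS];
};

/* Option (b): set or unset a single vlan id, a constant-time update. */
static void vlan_filter_set(struct vlan_filter *f, unsigned int vid, bool on)
{
    assert(vid < VLAN_N_VID);
    unsigned long mask = 1UL << (vid % BITS_PER_WORD);
    if (on)
        f->bits[vid / BITS_PER_WORD] |= mask;
    else
        f->bits[vid / BITS_PER_WORD] &= ~mask;
}

static bool vlan_filter_test(const struct vlan_filter *f, unsigned int vid)
{
    assert(vid < VLAN_N_VID);
    return (f->bits[vid / BITS_PER_WORD] >> (vid % BITS_PER_WORD)) & 1UL;
}

/*
 * Option (a): the caller hands over a whole new bitmap. Every vlan id has
 * to be visited to find out which ones changed. Returns the number of ids
 * whose state changed; a driver would add or remove the vid at that point.
 */
static size_t vlan_filter_replace(struct vlan_filter *cur,
                                  const struct vlan_filter *next)
{
    size_t changed = 0;
    for (unsigned int vid = 0; vid < VLAN_N_VID; vid++) {
        bool want = vlan_filter_test(next, vid);
        if (vlan_filter_test(cur, vid) != want) {
            vlan_filter_set(cur, vid, want);
            changed++;
        }
    }
    return changed;
}
```

The full walk in vlan_filter_replace is the cost referred to under option 'a',
while option 'b' only ever touches the one id that was passed in.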
 
 This patch series implements the following
 01/3 - macvlan: Add support for unicast filtering in macvlan
 02/3 - macvlan: Add function to set addr filter on lower device in passthru
 mode
 03/3 - macvtap: Add support for TUNSETTXFILTER
 
 Please comment. Thanks.
 
 Signed-off-by: Roopa Prabhu ropra...@cisco.com
 Signed-off-by: Christian Benvenuti be...@cisco.com
 Signed-off-by: David Wang dwa...@cisco.com
 
 The security isn't lower than with promisc, so I don't see
 a problem with this as such.
 
 There are more features we'll want down the road though,
 so let's see whether the interface will be able to
 satisfy them in a backwards compatible way before we
 set it in stone. Here's what I came up with:
 
 How will the filtering table be partitioned within guests?

Since this patch supports macvlan PASSTHRU mode only, in which the lower
device has 1-1 mapping to the guest nic, it does not require any
partitioning of filtering table within guests. Unless I missed understanding
something. 

If the lower device were being shared by multiple guest network interfaces
(non PASSTHRU mode), only then we will need to maintain separate filter
tables for each guest network interface in macvlan and forward the pkt to
respective guest interface after a filter lookup. This could affect
performance too I think.

I chose to support PASSTHRU mode only at first because it's simpler and all
code additions are in control path only.

 
 A way to limit what the guest can do would also be useful.
 How can this be done? selinux?

I vaguely remember a thread in the same context that had a suggestion to
maintain pre-approved address lists and allow guest filter registration of
only those addresses for security. This seemed reasonable. Plus, the ability
to support additional address registration from the guest could be made
configurable (one of your ideas again from prior work).

I am not an selinux expert, but I am thinking we can use it only to allow or
disallow access or operations on the macvtap device (?). I will check more
on this.

 
 Any

Re: RFT: virtio_net: limit xmit polling

2011-07-14 Thread Roopa Prabhu




On 6/29/11 1:42 AM, Michael S. Tsirkin m...@redhat.com wrote:

 On Tue, Jun 28, 2011 at 11:08:07AM -0500, Tom Lendacky wrote:
 On Sunday, June 19, 2011 05:27:00 AM Michael S. Tsirkin wrote:
 OK, different people seem to test different trees.  In the hope to get
 everyone on the same page, I created several variants of this patch so
 they can be compared. Whoever's interested, please check out the
 following, and tell me how these compare:
 
 kernel:
 
 git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
 
 virtio-net-limit-xmit-polling/base - this is net-next baseline to test
 against virtio-net-limit-xmit-polling/v0 - fixes checks on out of capacity
 virtio-net-limit-xmit-polling/v1 - previous revision of the patch
 this does xmit,free,xmit,2*free,free
 virtio-net-limit-xmit-polling/v2 - new revision of the patch
 this does free,xmit,2*free,free
 
 
 Here's a summary of the results.  I've also attached an ODS format
 spreadsheet (30 KB in size) that might be easier to analyze and also has
 some pinned VM results data.  I broke the tests down into a local
 guest-to-guest scenario and a remote host-to-guest scenario.
 
 Within the local guest-to-guest scenario I ran:
   - TCP_RR tests using two different message sizes and four different
 instance counts among 1 pair of VMs and 2 pairs of VMs.
   - TCP_STREAM tests using four different message sizes and two different
 instance counts among 1 pair of VMs and 2 pairs of VMs.
 
 Within the remote host-to-guest scenario I ran:
   - TCP_RR tests using two different message sizes and four different
 instance counts to 1 VM and 4 VMs.
   - TCP_STREAM and TCP_MAERTS tests using four different message sizes and
 two different instance counts to 1 VM and 4 VMs.
 over a 10GbE link.
 
 roprabhu, Tom,
 
 Thanks very much for the testing. So on the first glance
 one seems to see a significant performance gain in V0 here,
 and a slightly less significant in V2, with V1
 being worse than base. But I'm afraid that's not the
 whole story, and we'll need to work some more to
 know what really goes on, please see below.
 
 
 Some comments on the results: I found out that V0 because of mistake
 on my part was actually almost identical to base.
 I pushed out virtio-net-limit-xmit-polling/v1a instead that
 actually does what I intended to check. However,
 the fact we get such a huge distribution in the results by Tom
 most likely means that the noise factor is very large.
 
 
 From my experience one way to get stable results is to
 divide the throughput by the host CPU utilization
 (measured by something like mpstat).
 Sometimes throughput doesn't increase (e.g. guest-host)
 but CPU utilization does decrease. So it's interesting.
 
 
 Another issue is that we are trying to improve the latency
 of a busy queue here. However STREAM/MAERTS tests ignore the latency
 (more or less) while TCP_RR by default runs a single packet per queue.
 Without arguing about whether these are practically interesting
 workloads, these results are thus unlikely to be significantly affected
 by the optimization in question.
 
 What we are interested in, thus, is either TCP_RR with a -b flag
 (configure with  --enable-burst) or multiple concurrent
 TCP_RRs.
 
 
 
Michael, below are some numbers I got from one round of runs.
Thanks,
Roopa

256-byte request/response.
Vcpus and irqs were pinned to 4 cores, and the cpu utilization is the
average across the 4 cores.

base:
Num of concurrent TCP_RRs   Num of transactions/sec   host cpu-util (%)
1                           7982.93                   15.72
25                          67873                     28.84
50                          112534                    52.25
100                         192057                    86.54


v1:
Num of concurrent TCP_RRs   Num of transactions/sec   host cpu-util (%)
1                           7970.94                   10.8
25                          65496.8                   28
50                          109858                    53.22
100                         190155                    87.5


v1a:
Num of concurrent TCP_RRs   Num of transactions/sec   host cpu-util (%)
1                           7979.81                   9.5
25                          66786.1                   28
50                          109552                    51
100                         190876                    88


v2:
Num of concurrent TCP_RRs   Num of transactions/sec   host cpu-util (%)
1                           7969.87                   16.5
25                          67780.1                   28.44
50                          114966                    54.29
100                         177982                    79.9
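
Following the suggestion above to divide throughput by host CPU utilization,
the numbers can be normalized with something like the sketch below. The
struct rr_sample and the helpers tps_per_cpu_pct and normalized_gain_pct are
hypothetical names made up for this illustration. The cpu-util column here is
the average across the 4 pinned cores and was not taken with mpstat
specifically, so treat the result as a rough comparison only.

```c
#include <assert.h>

/* One row of a results table (hypothetical helper type). */
struct rr_sample {
    double tps;          /* TCP_RR transactions/sec */
    double cpu_util_pct; /* average host CPU utilization, in percent */
};

/* Transactions/sec delivered per percentage point of host CPU. */
static double tps_per_cpu_pct(struct rr_sample s)
{
    assert(s.cpu_util_pct > 0.0);
    return s.tps / s.cpu_util_pct;
}

/* Relative change of the normalized figure versus a baseline, in percent. */
static double normalized_gain_pct(struct rr_sample base, struct rr_sample test)
{
    double b = tps_per_cpu_pct(base);
    return (tps_per_cpu_pct(test) - b) / b * 100.0;
}
```

For example, feeding in the 100-stream rows for base (192057, 86.54) and v2
(177982, 79.9) gives normalized figures within one percent of each other,
even though the raw transaction rates differ by more than that.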

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: TODO item: guest programmable mac/vlan filtering with macvtap

2011-06-23 Thread Roopa Prabhu

Dragos Tatulea dragos.tatulea at gmail.com writes:

 
 I have created a wiki page for this [1], also added to the networking
 todo list [2]. No meaty information yet. But it's enough to start
 working on it.
 
 [1] - http://www.linux-kvm.org/page/GuestProgrammableMacVlanFiltering
 [2] - http://www.linux-kvm.org/page/NetworkingTodo
 

Hi Dragos, I wanted to know if there were any updates to this work.
I am interested in trying it out and am also willing to help wherever possible.

Please let me know,

Thanks,
Roopa


