Re: Multicast and receive filtering in TUN/TAP

2008-07-11 Thread Max Krasnyansky
Rusty Russell wrote:
 On Friday 11 July 2008 12:20:07 Max Krasnyansky wrote:
 I haven't looked at the virtio stuff much, I was assuming that the host
 side of it is still the TUN driver. Is it not ?
 Yes, the host side is still tun/tap. The problem is that qemu doesn't know
 which multicast addresses are used inside the guest.
 Ah, now I see what you meant by virtio_net does not do multicast. I guess
 it should be trivial to add. Rusty will clarify it I guess.
 
 Yes, it could certainly be added; that's what feature bits are for :)

Sounds good.
I'll send the patch that lets you guys set up tx filters on the TAP devices.
Hypervisors will then need to translate rx filters set by the guest OS into
TAP tx filters. I'm thinking of doing it the way E1000 does, for example:
14 exact filters, with the rest hashed.
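
As a rough illustration of the "14 exact filters, rest hashed" idea, a filter
of that shape might look like the sketch below. It is not the patch discussed
in this thread: the struct and function names, the 64-bucket hash table and
the CRC-32 bucket function are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN      6
#define FLT_EXACT_MAX 14   /* number of exact-match slots, as proposed above */
#define FLT_HASH_BITS 64   /* hash table size: an arbitrary choice for this sketch */

/* Hypothetical filter state; none of these names come from the TUN driver. */
struct tap_filter {
    unsigned int count;                          /* exact slots in use */
    uint8_t      exact[FLT_EXACT_MAX][ETH_ALEN]; /* exact-match addresses */
    uint64_t     hash;                           /* one bit per hash bucket */
};

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), without final inversion. */
static uint32_t addr_crc(const uint8_t *addr)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (int i = 0; i < ETH_ALEN; i++) {
        crc ^= addr[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Map an address to one of FLT_HASH_BITS buckets using the top 6 CRC bits. */
static unsigned int addr_bucket(const uint8_t *addr)
{
    return addr_crc(addr) >> 26;
}

static void filter_init(struct tap_filter *f)
{
    memset(f, 0, sizeof *f);
}

/* The first 14 addresses get exact slots; later ones only set a hash bit. */
static void filter_add(struct tap_filter *f, const uint8_t *addr)
{
    assert(f != NULL && addr != NULL);
    if (f->count < FLT_EXACT_MAX)
        memcpy(f->exact[f->count++], addr, ETH_ALEN);
    else
        f->hash |= UINT64_C(1) << addr_bucket(addr);
}

/* Return true if a frame with destination address dst should be delivered. */
static bool filter_pass(const struct tap_filter *f, const uint8_t *dst)
{
    for (unsigned int i = 0; i < f->count; i++)
        if (memcmp(f->exact[i], dst, ETH_ALEN) == 0)
            return true;
    return (f->hash >> addr_bucket(dst)) & 1u;
}
```

Exact slots never produce false positives; hashed entries can, because
several addresses may share a bucket, so the guest still has to do its own
final check on whatever gets through.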

Max
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: Multicast and receive filtering in TUN/TAP

2008-07-11 Thread Brian Braunstein
Sorry that I was confused here and it seems I am still confused.

I was thinking that for any one instance of a TAP interface, there should be
only 1 MAC address, since there is only 1 network interface, since the
character device is not a network interface but rather the interface for the
application to send and receive on that virtual network interface.

For the MC stuff, I have to admit I haven't looked into it much, but it
seems like the basic operation of setting the MAC address of the network
interface should be supported, and it seems like an ioctl called
SIOCSIFHWADDR should Set the InterFace HardWare ADDRess.  Sorry if I was
wrong about this.  It might be good to add a comment to SIOCSIFHWADDR that
says "This does not actually set the network interface hardware address,
this is for multicast filtering", or whatever it actually is supposed to do.
Or perhaps create a new ioctl that has something about multicast filtering
in the name, and leave SIOCSIFHWADDR doing what it is doing now.

brian



Re: Multicast and receive filtering in TUN/TAP

2008-07-10 Thread Christian Borntraeger
On Thursday, 10 July 2008, Max Krasnyansky wrote:
[...]
 The second question is: do you guys think that QEMU/KVM/LGUEST/etc. would
 benefit if receive filtering was done by the host OS? Here is a specific
 example of what I'm talking about.
 We can do what qemu/hw/e1000.c:receive_filter() does in the _host_
 context (that function currently runs in the guest context). Looking
 at libvirt, the typical QEMU-based setup is that you have a single bridge
 and all the TAPs from different VMs are hooked up to that bridge. What
 that means is that if one VM is getting MC traffic, or when the bridge
 sees a MACADDR that is not in its tables, the packets get delivered to all
 the VMs, i.e. we have to wake all of them up only so that they can
 drop that packet. Instead, we could set up filters in the host's side of
 the TAP device.
 Does that sound like something useful for QEMU/KVM?
 If yes, we can talk about the API. If not, then I'll just nuke it.

Max,

I know that on s390 the shared OSA network cards have multicast filter
capabilities. So I guess it is worthwhile for virtualization environments
with lots of guests. I also think that this kind of filtering should be
straightforward to implement with the qemu e1000 code. Qemu already knows the
multicast addresses.

Thing is, we are heading towards virtio. Unfortunately, virtio_net currently
does not offer a method to register multicast addresses.

Rusty, do you think it's worthwhile to notify the host about registered
multicast addresses?

Christian


Re: Multicast and receive filtering in TUN/TAP

2008-07-10 Thread Max Krasnyansky
Christian Borntraeger wrote:
 On Thursday, 10 July 2008, Max Krasnyansky wrote:
 [...]
 The second question is: do you guys think that QEMU/KVM/LGUEST/etc. would
 benefit if receive filtering was done by the host OS? Here is a specific
 example of what I'm talking about.
 We can do what qemu/hw/e1000.c:receive_filter() does in the _host_
 context (that function currently runs in the guest context). Looking
 at libvirt, the typical QEMU-based setup is that you have a single bridge
 and all the TAPs from different VMs are hooked up to that bridge. What
 that means is that if one VM is getting MC traffic, or when the bridge
 sees a MACADDR that is not in its tables, the packets get delivered to all
 the VMs, i.e. we have to wake all of them up only so that they can
 drop that packet. Instead, we could set up filters in the host's side of
 the TAP device.
 Does that sound like something useful for QEMU/KVM?
 If yes, we can talk about the API. If not, then I'll just nuke it.
 
 Max,
 
 I know that on s390 the shared OSA network cards have multicast filter
 capabilities. So I guess it is worthwhile for virtualization environments
 with lots of guests. I also think that this kind of filtering should be
 straightforward to implement with the qemu e1000 code. Qemu already knows the
 multicast addresses.
Sure. It's straightforward to do inside QEMU, and it's already doing it.
The question is whether we should do it in the host context instead and avoid
some wakeups.

 Thing is, we are heading towards virtio. 
Even for Windows ?

 Unfortunately, virtio_net currently  does not offer a method to register 
 multicast addresses.
I haven't looked at the virtio stuff much, I was assuming that the host side
of it is still the TUN driver. Is it not ?

Max


Re: Multicast and receive filtering in TUN/TAP

2008-07-10 Thread Christian Borntraeger
On Thursday, 10 July 2008, Max Krasnyansky wrote:
  Thing is, we are heading towards virtio.
 Even for Windows ?

It's possible:
http://marc.info/?l=kvm&m=121075389300722&w=2

 
  Unfortunately, virtio_net currently does not offer a method to register
  multicast addresses.
 I haven't looked at the virtio stuff much, I was assuming that the host side
 of it is still the TUN driver. Is it not ?

Yes, the host side is still tun/tap. The problem is that qemu doesn't know
which multicast addresses are used inside the guest.



Re: Multicast and receive filtering in TUN/TAP

2008-07-10 Thread Shaun Jackman
Hi Max,

The original patch implemented receive multicast filtering by
emulating the implementation used by many physical Ethernet
interfaces: hashing the multicast address. TUN emulates two network
cards (and communication via the virtual link between them), the guest
and the host, or the character device and the network device, so there
are two receive filters: chr_filter and net_filter. I implemented the
filtering at the character device using chr_filter in tun_chr_readv,
and left filtering at the network device for someone else to
implement.

I'm not sure what you mean by TX filtering. Multicast filtering is
implemented only at the receiver. There are, however, two
receivers: the character device and the network device.

I believe Brian's patch was mistaken. Two entirely distinct Ethernet
addresses are required: one for the character device and one for the
network device, or put another way, one for the virtual Ethernet
interface at the guest and one for the virtual Ethernet interface at
the host. For the same reason, there are two distinct multicast
filters.

Looking over the original patch, I believe I see a bug in tun_net_mclist:
memset(tun->chr_filter, 0, sizeof tun->chr_filter);
should be
memset(tun->net_filter, 0, sizeof tun->net_filter);
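
To make the difference concrete, here is a minimal sketch of the corrected
reset. Only the names chr_filter and net_filter come from the patch; the
struct layout, the array size and the function name are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the driver state; the real struct differs. */
struct tun_state {
    uint32_t chr_filter[2];  /* multicast hash for the character-device side */
    uint32_t net_filter[2];  /* multicast hash for the network-device side */
};

/*
 * Rebuilding the network-side multicast list should start by clearing
 * net_filter only. Clearing chr_filter here instead would wipe the other
 * side's filter and leave stale bits set in net_filter.
 */
static void net_filter_reset(struct tun_state *tun)
{
    assert(tun != NULL);
    memset(tun->net_filter, 0, sizeof tun->net_filter);
}
```

Because both arrays have the same size in this sketch, the wrong variant would
compile and run without complaint, which is what makes this kind of slip easy
to miss.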

Cheers,
Shaun




Re: Multicast and receive filtering in TUN/TAP

2008-07-10 Thread Max Krasnyansky


Brian Braunstein wrote:
 Sorry that I was confused here and it seems I am still confused.
 
 I was thinking that for any one instance of a TAP interface, there
 should be only 1 MAC address, since there is only 1 network interface,
 since the character device is not a network interface but rather the
 interface for the application to send and receive on that virtual
 network interface.
 
Exactly. Your understanding is perfectly correct.
See my previous reply. It should clear up all the confusion.


 For the MC stuff, I have to admit I haven't looked into it much, but it
 seems like the basic operation of setting the MAC address of the network
 interface should be supported, and it seems like an ioctl called
 SIOCSIFHWADDR should Set the InterFace HardWare ADDRess.  Sorry if I was
 wrong about this.  It might be good to add a comment to SIOCSIFHWADDR
 that says "This does not actually set the network interface hardware
 address, this is for multicast filtering", or whatever it actually is
 supposed to do.  Or perhaps create a new ioctl that has something about
 multicast filtering in the name, and leave SIOCSIFHWADDR doing what it
 is doing now.
Yep. That's what I'm going to do (i.e. a different ioctl). Again, see my
previous email. We're totally on the same page :).

Max







 

Multicast and receive filtering in TUN/TAP

2008-07-09 Thread Max Krasnyansky
Yesterday, while fixing an xoff stuckiness issue in the TUN/TAP driver, I got
a chance to look into the multicast filtering code in there, and
immediately realized how terribly broken & confusing it is. The patch
was originally done by Shaun (CC'ed) and went in without any proper ACK
from me, Dave or Jeff.
Here is the original ref
http://marc.info/?l=linux-netdev&m=110490502102308&w=2

I'm not going to dive into too much detail on what's wrong with the
current code. The main issues are that it mixes RX and TX filtering,
which are orthogonal, and it reuses ioctl names and stuff for
manipulating TX filter state as if it were normal RX multicast state.
Later on Brian's patch added insult to the injury
http://git.kernel.org/?p=linux/kernel/git/\
torvalds/linux-2.6.git;\
a=commit;h=36226a8ded46b89a94f9de5976f554bb5e02d84c
Brian missed the point of the original patch (not his fault, as I said 
the original patch was not the best) that the separate address 
introduced by the MC patch was used for filtering _TX_ packets. It had 
nothing to do with the HW addr of the local network interface.

The problem is that the MC stuff is now even more broken, and the ioctls that
were used originally now mean something different. So my first thought
was to just rip the MC stuff out because it's broken and probably nobody
uses it (given that we got no complaints after Brian's patch broke it
completely). But then I realized that, if done properly, it might be very
useful for virtualization.

---

So the first question is: are there any users out there who ever used
the original patch? Shaun, any insight? How did you intend to use it?

---

The second question is: do you guys think that QEMU/KVM/LGUEST/etc. would
benefit if receive filtering was done by the host OS? Here is a specific
example of what I'm talking about.
We can do what qemu/hw/e1000.c:receive_filter() does in the _host_
context (that function currently runs in the guest context). Looking
at libvirt, the typical QEMU-based setup is that you have a single bridge
and all the TAPs from different VMs are hooked up to that bridge. What
that means is that if one VM is getting MC traffic, or when the bridge
sees a MACADDR that is not in its tables, the packets get delivered to all
the VMs, i.e. we have to wake all of them up only so that they can
drop that packet. Instead, we could set up filters in the host's side of
the TAP device.
Does that sound like something useful for QEMU/KVM?
If yes, we can talk about the API. If not, then I'll just nuke it.

Thanx
Max