Re: Patch for non-netgraph bridge code worthy of attention for people experimenting with bridging setups (including ng_bridge)

2001-02-03 Thread Julian Elischer

"Rogier R. Mulhuijzen" wrote:
 
 I found this while experimenting with both "legacy" bridge and ng_bridge.
 The bridging code doesn't check its activation everywhere so when I started
 using an ng_bridge node I started getting weird errors.
 
 Patch is rather simple, can someone submit this?

I'm a little confused when I look at this patch.

I think this is the wrong fix.

I see that you are accounting for packets coming in on two interfaces, 
but the aim of the netgraph bridging is to make it look as if the
packets are all coming in off one interface. Theoretically the
bridging code should be attached to only one 'upper' part of a driver
and all packets should arrive at higher levels, looking as if they have
all come in through that one interface. The other interfaces in the bridge 
will never receive anything because their input has been diverted. To the 
system it should look as if the entire bridged network is on that one 
interface.

If this is not the case then we need to fix the bridging code so that
it is true, rather than clutter up higher level code trying to
account for a bug in the lower code.

So how can an incoming packet look like it is not coming from that single
interface?
1/ ifnet pointer. The function ng_ether_rcv_upper() adjusts this, so that's
not the problem.
2/ rcv interface MAC address. This is stripped off before arp gets it
(also in ng_ether_rcv_upper()).
3/ the tha[] or sha[] fields may contain a MAC address for
some other interface (depending on how the remote machine fills out 
those fields), but our outgoing packets should have the MAC address
of the interface we have selected as our main interface, independent of
which interface it actually goes out of (unless the hardware
overwrites it). So even that should point to the single interface.

The other interfaces should (maybe) be ifconfigged 'UP' but they should
not have IP addresses assigned to them, as they are being slaved from 
the main interface by the ng_bridging code so everything comes and 
goes via that one.

So I'm slightly confused as to what problem this solves.
(I'm not saying there isn't one, just that I can't figure out what it is.)
Everything should act as if there is just one interface when netgraph 
bridging is turned on.

 
  DocWilco
 
 Date: Mon, 29 Jan 2001 08:20:01 -0800 (PST)
 To: [EMAIL PROTECTED]
 From: [EMAIL PROTECTED]
 Subject: Re: kern/24720: Bridging code does not always check activation
 (w/patch)
 Reply-To: [EMAIL PROTECTED], [EMAIL PROTECTED]
 Sender: [EMAIL PROTECTED]
 
 Thank you very much for your problem report.
 It has the internal identification `kern/24720'.
 The individual assigned to look at your
 report is: freebsd-bugs.
 
 You can access the state of your problem report at any time
 via this link:
 
 http://www.freebsd.org/cgi/query-pr.cgi?pr=24720
 
  Category:   kern
  Responsible:freebsd-bugs
  Synopsis:   Bridging code does not always check activation (w/patch)
  Arrival-Date:   Mon Jan 29 08:20:01 PST 2001
 
 To Unsubscribe: send mail to [EMAIL PROTECTED]
 with "unsubscribe freebsd-net" in the body of the message

-- 
  __--_|\  Julian Elischer
 /   \ [EMAIL PROTECTED]
(   OZ) World tour 2000-2001
--- X_.---._/  
v





Re: Patch for non-netgraph bridge code worthy of attention for people experimenting with bridging setups (including ng_bridge)

2001-02-03 Thread Rogier R. Mulhuijzen

At 00:48 3-2-01 -0800, Julian Elischer wrote:
"Rogier R. Mulhuijzen" wrote:
 
  I found this while experimenting with both "legacy" bridge and ng_bridge.
  The bridging code doesn't check its activation everywhere so when I started
  using an ng_bridge node I started getting weird errors.
 
  Patch is rather simple, can someone submit this?

I'm a little confused when I look at this patch.

I think this is the wrong fix.

I see that you are accounting for packets coming in on two interfaces,
but the aim of the netgraph bridging is to make it look as if the
packets are all coming in off one interface. Theoretically the
bridging code should be attached to only one 'upper' part of a driver
and all packets should arrive at higher levels, looking as if they have
all come in through that one interface. The other interfaces in the bridge
will never receive anything because their input has been diverted. To the
system it should look as if the entire bridged network is on that one
interface.

If this is not the case then we need to fix the bridging code so that
it is true, rather than clutter up higher level code trying to
account for a bug in the lower code.

I found out this bug while using ng_bridge with BRIDGE in the kernel but 
turned off with the sysctl.

Like I say in the problem report, this could easily be true if you take 2 
NICs and wire them both up to the same switch, each using a different IP. 
(Something we do at QuakeCon for instance. We run a lot of servers per box, 
so we spread them out over 3 IPs, each with its own NIC. Worked very 
well). Netgraph has nothing to do with this. I'm just warning people who 
are experimenting with the netgraph bridge because they will probably still 
have BRIDGE in their kernel.

Let's run through this example of 2 NICs. Let's say I have BRIDGE in the 
kernel. I would not want to enable bridging in this case, because I would 
get a packet storm because of the loop between the machine and the switch. 
So I turn it off. But the checks for incoming interface are still switched 
off in the ethernet code. Checks that ARE executed when compiling without 
BRIDGE. NIC1 has 10.1.1.2 and NIC2 10.1.1.3. An ARP request is sent for 
10.1.1.2, and since it's a broadcast it will arrive on both NIC1 and NIC2. 
Without the checks for what interface a packet came in on, the kernel will 
send 2 ARP replies, one saying 10.1.1.2 is on NIC1, the other that 10.1.1.2 
is on NIC2. The last ARP reply to be sent will be used by the other 
machines on the network. Now with switches, they will probably handle 
broadcast packets the same way each time, so the same NIC on our machine 
will always get the ARP after the other got it. So the same NIC will always 
win the battle over an IP, resulting in ALL traffic for the machine, no 
matter what IP it was sent to, going over a single NIC. That's a BAD(tm) thing.
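
A minimal sketch of the two-NIC scenario above, not the kernel's actual ARP code. The struct nic type, the arp_replies() function and the check_rcvif flag are hypothetical names made up for illustration; the sketch only counts how many replies one broadcast request would trigger with and without the receive-interface check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified NIC: a name and the one IPv4 address bound to it. */
struct nic {
    const char *name;
    uint32_t    addr;   /* host byte order, for illustration only */
};

/*
 * Count the ARP replies sent for one broadcast request that arrives on
 * every NIC in nics[]. With check_rcvif set, a request is only answered
 * if the target address is bound to the NIC it arrived on; without it,
 * every copy of the request is answered for any local address.
 */
int arp_replies(const struct nic *nics, size_t n, uint32_t target,
                int check_rcvif)
{
    int replies = 0;

    assert(nics != NULL);
    for (size_t rcv = 0; rcv < n; rcv++) {        /* request seen on each NIC */
        for (size_t own = 0; own < n; own++) {    /* search local addresses */
            if (nics[own].addr != target)
                continue;
            if (check_rcvif && own != rcv)
                continue;                         /* address lives on another NIC */
            replies++;
        }
    }
    return replies;
}
```

With NIC1 at 10.1.1.2 and NIC2 at 10.1.1.3, a request for 10.1.1.2 yields two replies when check_rcvif is 0 and a single reply when it is 1.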

Now if you compare the code that my patch affects from before and after the 
patch you will see the following changes:

original code:

 blabla();
#ifndef BRIDGE
 doCheck();
#endif
 dumDeDum();

my code:

 blabla();
#ifdef BRIDGE
 /* do_bridge signifies whether bridging is switched on or off by the sysctl */
 if (!do_bridge) {
#else
 {
#endif
  doCheck();
 }
 dumDeDum();

This will make the ethernet code behave EXACTLY the same when bridging is 
switched off with the sysctl and when BRIDGE is NOT compiled into the kernel.
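
The same idea as a compilable sketch, assuming the intent is that the check runs whenever bridging is not active. The BRIDGE_ACTIVE macro and the should_reply() function are hypothetical names for illustration, and the do_bridge variable here is only a stand-in for the sysctl-controlled flag, not the kernel's definition.

```c
#include <assert.h>

/* Build with -DBRIDGE to model a kernel compiled with the BRIDGE option. */
#ifdef BRIDGE
int do_bridge = 0;                  /* stand-in for the sysctl-controlled flag */
#define BRIDGE_ACTIVE (do_bridge)
#else
#define BRIDGE_ACTIVE (0)           /* constant, so the compiler can drop the test */
#endif

/*
 * Return 1 if a request that arrived on interface rcv_if should be
 * answered for an address bound to interface addr_if, 0 otherwise.
 * The interface check is skipped only while bridging is active.
 */
int should_reply(int addr_if, int rcv_if)
{
    assert(addr_if >= 0 && rcv_if >= 0);
    if (!BRIDGE_ACTIVE && addr_if != rcv_if)
        return 0;                   /* address belongs to a different interface */
    return 1;
}
```

Built without -DBRIDGE, or with -DBRIDGE and do_bridge left at 0, should_reply(0, 1) returns 0 in both cases, which is the equivalence described above.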

Now, on the netgraph stuff.

So how can an incoming packet look like it is not coming from that single
interface?
1/ ifnet pointer. The function ng_ether_rcv_upper() adjusts this, so that's
not the problem.
2/ rcv interface MAC address. This is stripped off before arp gets it
(also in ng_ether_rcv_upper()).
3/ the tha[] or sha[] fields may contain a MAC address for
some other interface (depending on how the remote machine fills out
those fields), but our outgoing packets should have the MAC address
of the interface we have selected as our main interface, independent of
which interface it actually goes out of (unless the hardware
overwrites it). So even that should point to the single interface.

The other interfaces should (maybe) be ifconfigged 'UP' but they should
not have IP addresses assigned to them, as they are being slaved from
the main interface by the ng_bridging code so everything comes and
goes via that one.

So I'm slightly confused as to what problem this solves.
(I'm not saying there isn't one, just that I can't figure out what it is.)
Everything should act as if there is just one interface when netgraph
bridging is turned on.

Exactly as if there's just one interface when netgraph bridging is on? Why? 
Why just one interface? Now that my kernel is patched to behave as if BRIDGE 
weren't compiled in when I switch it off, I can include the uppers of 
multiple interfaces in a single netgraph bridge.

If you think about it, this should not even be a problem.

Look at this diagram