Hey Glen,

First off, thanks for all of the debugging output - it really helped in 
figuring out your setup.

I took a look at the dump files and the NOX log. I didn't see more than 
a few ARP packets, but I did see a lot of Cisco traffic, presumably from 
the Cat6K.  I'm not sure there's necessarily a problem here, but I 
can tell you what the situation seems to be, and you can let me know 
if NOX should be behaving differently. 

Almost all of the flows seen by NOX are from the MAC 00:01:63:d4:67:ca 
(which I'm assuming is the 6k).  Because the 6k is connected to two of 
the OpenFlow switch's ports, NOX has a record of the MAC at these two 
locations (eth1 and eth4, which are port numbers 0 and 3 respectively in 
the NOX log file).  Right now, we have a notion of a "primary" location 
when a sender is connected at two different points in the network.  At 
any given time, a sender's primary location is the location it most 
recently sent a packet from.  When that location switches, the old one 
is "poisoned" to force ongoing traffic to be routed to the new 
location.  Thus the poison dbg message you were seeing results from the 
6k sending a packet with that source MAC address through a different 
port on the OpenFlow switch than the one it last sent through.  What's 
interesting is that, to begin with, the 6k uses a different MAC address 
(00:01:63:d4:67:cb) when sending traffic to the OpenFlow switch's eth4 
interface, but once the destination address changes to 
01:00:0c:00:00:07, the source address is always 00:01:63:d4:67:ca 
regardless of the OpenFlow port it is received on. 
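
To make the "primary location" idea concrete, here is a rough Python 
sketch of the bookkeeping described above.  It is not the actual 
authenticator code: the class name, method name, and the "poisoned" 
list are all made up for illustration, and it only models the rule 
"the most recent sending port becomes primary, and the old one is 
poisoned".

```python
# Hypothetical sketch of primary-location tracking; NOT NOX's real
# authenticator implementation. All names here are illustrative.
from typing import Dict, List, Tuple


class PrimaryLocationTracker:
    def __init__(self) -> None:
        # Source MAC -> port it most recently sent a packet from.
        self.primary: Dict[str, int] = {}
        # (MAC, old port) pairs that were poisoned, in order.
        self.poisoned: List[Tuple[str, int]] = []

    def packet_in(self, src_mac: str, in_port: int) -> None:
        old_port = self.primary.get(src_mac)
        if old_port is not None and old_port != in_port:
            # The sender moved: poison the old location so ongoing
            # traffic is re-routed to the new one.
            self.poisoned.append((src_mac, old_port))
        self.primary[src_mac] = in_port


# The same source MAC alternating between port 0 (eth1) and
# port 3 (eth4) poisons the old primary on every switch.
tracker = PrimaryLocationTracker()
for port in (0, 3, 0):
    tracker.packet_in("00:01:63:d4:67:ca", port)
```

With the 6k sending from the same source MAC on both ports, every 
alternation between the two ports triggers another poisoning, which 
would match the repeated messages in the log.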

The second point worth noting is that all of the 6k's traffic is sent to 
multicast addresses, and NOX currently treats multicast and broadcast 
traffic the same, flooding a packet out every port except for the one it 
came in on.  If there's a more appropriate way of dealing with this 
traffic, please let me know!
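
For reference, the flooding rule above amounts to something like the 
following Python sketch.  The function names are made up for 
illustration and this is not code from NOX; it relies only on the 
standard Ethernet convention that the least significant bit of the 
first octet marks a group (multicast or broadcast) address.

```python
# Hypothetical sketch of the flood decision; NOT taken from NOX.
from typing import Iterable, List


def is_group_address(mac: str) -> bool:
    """True if the I/G bit (lowest bit of the first octet) is set."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)


def flood_ports(all_ports: Iterable[int], in_port: int) -> List[int]:
    """Every port except the one the packet came in on."""
    return [p for p in all_ports if p != in_port]


# The destination seen in the dumps is a group address, so a packet
# arriving on port 0 would go out on ports 1, 2 and 3.
dst = "01:00:0c:00:00:07"
out_ports = flood_ports([0, 1, 2, 3], 0) if is_group_address(dst) else []
```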

So that's what seems to be the situation.  Again, let me know if you 
think any of the behavior described above is incorrect, or if there's 
still a problem that I just couldn't deduce from the log/dump files I 
looked at.

Natasha

Glen Gibb wrote:
> Hi all,
>
> We're working on our test setup for our demo at Stanford. Unfortunately 
> we're seeing some strange behavior that results in an ARP storm with a 
> fairly simple setup.
>
> The setup is as follows:
> NOX controller on one host (mvm-nox)
> OpenFlow switch (kernel module -- built from git) on another host (mvm-root)
>
> mvm-root has 5 ethernet ports:
>  eth0 -- controller connection to mvm-nox
>  eth{1,2,3,4} -- ports controlled by OF
>  Note: eth{1,4} are connected to a Cisco Cat6K -- the ports they are 
> connected to are on separate VLANs:
>
> Here are the MAC addresses of the ports (relevant later):
> eth0      Link encap:Ethernet  HWaddr 00:15:f2:a6:6c:2a 
> eth1      Link encap:Ethernet  HWaddr 00:0c:42:03:b8:bd 
> eth2      Link encap:Ethernet  HWaddr 00:0c:42:03:b8:be 
> eth3      Link encap:Ethernet  HWaddr 00:0c:42:03:b8:bf 
> eth4      Link encap:Ethernet  HWaddr 00:0c:42:03:b8:c0 
>
>
> Various other hosts are connected to eth1 via the Cat6k, particularly 
> 171.67.74.17 and 171.67.74.33 (mvm-17 and mvm-33 respectively).
>
> Without NOX+OF running I see Dynamic Trunking Protocol (DTP) packets and 
> Cisco Discovery Protocol (CDP) packets on eth1 and eth4 of mvm-ofroot. 
> In addition I sometimes see ARP requests from some of the hosts connected 
> to eth1.
>
> When I run NOX and OF (with pyrouting and pyauthenticator as the active 
> modules for NOX), I initially see correct behavior. Packets received on 
> any of the ports are flooded to the other ports. I also start seeing 
> Link Layer Discovery Protocol (LLDP) packets on all 4 ports with a 
> source address of 00:15:f2:a6:6c:2a (corresponding to the MAC address of 
> the port used to connect to the controller). So far, so good...
>
> Now, for the problems.
>
> Eventually I see an ARP packet on eth1 from mvm-33 (for mvm-44: 
> 171.67.74.44). This packet triggers a flood of ARP packets -- it seems 
> like it continually gets flooded out all ports.
>
> Additionally, we also start to see entries about "Poisoning old primary" 
> in the NOX output (which probably happens before we see the arp storm):
> 00264|authenticator|ERR:Poisoning old primary ap:15f2a66c2a:0, 
> dl:163d467ca, nw:0 owns:0
>
>
> To help with diagnosis I performed packet capture on all 5 ports and 
> recorded the output from NOX and OF. You can grab them from 
> http://yuba.stanford.edu/~grg/nox_prob_b.tgz
> The dump files were actually started about 80 seconds before OF was 
> started. If you look at b.eth1.dump and b.eth4.dump, the first packets 
> sent out by NOX are the ones at 86 and 87 s respectively. Note that 
> no packets were received on eth2 and eth3 before NOX was started, so all 
> packets in those dumps are from when NOX/OF was running.
>
>
> Please let me know if I can be of further assistance in debugging (or if 
> there's something plainly wrong that I'm doing).
> Glen
>
> _______________________________________________
> nox-dev mailing list
> [email protected]
> http://noxrepo.org/mailman/listinfo/nox-dev_noxrepo.org
>   

