Christopher L Merrill wrote:

Wow!  What a great bunch of responses, especially those from Greg,
Ryan and Aaron.  Most of it was over my head...but I'm hoping to absorb
just enough to make an intelligent decision for our test lab.

I've tried to identify which specs of a switch are actually important
for our use-case.  To recap, we are moving towards having (at most)
20 computers in our test lab with GigE NICs (some with multiples).
When we care about the performance, the scenario will be that most
of those computers will be hammering one or more web servers, also on
the same switch, with as much traffic as it/they can handle.  In some
cases, each "load engine" will be aliasing multiple IP addresses on
each NIC.  All of the machines will be on the same subnet.  When we
run tests, we would like the network to be invisible...meaning that
it is never the bottleneck.

So I've seen a few specs mentioned in switch literature and mentioned
in the discussions -- I am trying to assess how those relate to our
situation.
1. MTU - larger is better to improve bandwidth efficiency

Jumbo frames are a cool technology, but you can't mix and match if your network contains any computers that can't handle jumbo frames (e.g., any 10/100 clients). Generally, you can get dramatically improved throughput by tuning the TCP connection parameters, without needing jumbo frames at all. Check out /proc/sys/net/ipv4/tcp_*mem
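
If it helps, here's a quick sketch (mine, not anything from the thread) of how you might eyeball those values on a Linux box. Note that tcp_rmem and tcp_wmem are "min default max" in bytes, while tcp_mem is "low pressure high" in memory pages:

#!/usr/bin/env python3
# Print the current TCP memory tuning knobs.  Purely illustrative;
# adjust and persist them via sysctl / /etc/sysctl.conf as usual.
import glob

for path in sorted(glob.glob("/proc/sys/net/ipv4/tcp_*mem")):
    name = path.split("/")[-1]          # tcp_mem, tcp_rmem, tcp_wmem
    with open(path) as f:
        print(name, "=", f.read().split())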

2. # of MAC addresses - since we have a small number of computers
on a small network, I would guess this is unimportant to us.

You're not likely to be affected by this unless you have more than 8,000 computers on the same Ethernet segment. If you do, fix that problem. :)

3. Switching Capacity - pretty important to us, I would think, but
also seems to be the same for all models within a given line from a
given manufacturer - is the published number meaningful?

Generally, the published number is not meaningful. We could get into a lot of specifics about the internals and when it does matter, but unless you're going to do a *lot* of empirical testing, take apart the switch and examine its internals, or work closely with the manufacturer to learn what chips are used and how they're wired to which ports, you're not likely to come to any meaningful conclusions. Also, unless you're stressing the daylights out of the switch, it's not likely to matter to you much.

The potential exception to this that I see in your use case is that at larger port densities, the switch design gets less efficient internally. The designs are often built around single chips that can do non-blocking Gig-E between 8 and 16 ports at a time. If you have a 24-port switch, for example, you might have two Gig-E chipsets handling 12 ports each. Any two ports on the same chipset will get 1-gig non-blocking throughput between them, and ideally the interconnection between the two chipsets will be fully non-blocking as well. In practice that's often not the case, since it requires a 12-gig channel between them. The better switch gear generally provides it, but in a lot of cases you either don't have full non-blocking throughput between the two chipsets, or, even if you do, the switch may not be able to push the required traffic out the required port fast enough. Then you run into the switch dropping packets internally, congestion problems, and so on, and this often isn't handled gracefully on the inter-chipset links.
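
To put rough numbers on that, here's a back-of-the-envelope sketch of my own for the 24-port, two-chipset example; the interconnect figure is an assumption for illustration, not a spec for any particular switch:

# Back-of-the-envelope math for the 24-port, two-chipset example above.
# The 8 Gbps interconnect figure is an assumed value, not a real spec.
ports_per_chipset = 12       # two chipsets of 12 ports in a 24-port switch
line_rate_gbps = 1.0         # GigE, one direction

# Worst case: every port on one chipset sends line-rate traffic to ports
# on the other chipset, so the inter-chipset link must carry all of it.
offered_gbps = ports_per_chipset * line_rate_gbps    # 12 Gbps needed
interconnect_gbps = 8.0                              # what you might actually get

print("Needed across chipsets : %.0f Gbps" % offered_gbps)
print("Assumed interconnect   : %.0f Gbps" % interconnect_gbps)
print("Oversubscribed by      : %.1fx" % (offered_gbps / interconnect_gbps))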

So, consider your scenario, where you have 1 server and 30 clients flooding that server with data. It's not quite your scenario, as you're more likely to be pushing orders of magnitude more data *to* the clients than from them, and this doesn't apply there, but bear with me. :) If those 30 clients are all flooding 1 gigabit of traffic into the switch, and 15 of them are on the same chipset as the server and 15 are not, you may begin to find that the 15 computers on the second chipset exhibit subtly, or maybe markedly, different behavior than those on the first. They may see higher packet loss, and thus higher connection failure rates, lower throughput, etc.

4. Forwarding Rate - I have no idea what this is...important?

This is essentially the same thing as switching capacity, just usually quoted in packets per second rather than bits per second. The same caveats apply.


One other point that I wanted to verify is that one of the jobs of
the switch is to keep traffic away from parts of the network that
are not involved with the sender or receiver.  For example - the
switch in our test lab is hooked to the switch for the rest of the
office, to which the rest of our desktops are connected.  So when
we are running tests in the lab, none of that traffic bleeds into
the rest of the network affecting performance there.  My
understanding (and anecdotal evidence) is that this is true...is it?

So, how does a switch work? :) A switch learns which MAC addresses are connected to which port by looking at the source addresses of incoming traffic and associating them with the port they arrived on. Then, when another packet is received, it can look in its table of previously learned entries and send the traffic only to the port where it has learned that MAC address lives. If it gets a packet addressed to a MAC it doesn't know about, or to the Ethernet broadcast address, it forwards that packet to all ports except the one it was received on. This is great for established, simple flows.

This does not totally isolate one segment of the network from another, though. Broadcast packets, such as ARP requests, DHCP lease requests, and some service discovery protocols (NetBIOS/NetBEUI's NMP, SMP, browser service...; mDNS; etc.), are all addressed to the Ethernet broadcast address (FF:FF:FF:FF:FF:FF), and these packets will be delivered to every endpoint on the network. These broadcasts do not cross broadcast domains (god, I really sound like a network guy these days), which end at the borders of the routed subnet; to get isolation beyond that, you need a router.
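
If it helps to see that learn-and-forward logic spelled out, here's a toy sketch (purely illustrative; real switches do this in silicon, not Python):

# Toy sketch of the learning/forwarding behavior described above --
# illustrative only, not how any real switch ASIC is implemented.
BROADCAST = "ff:ff:ff:ff:ff:ff"

class ToySwitch:
    def __init__(self, num_ports):
        self.ports = range(num_ports)
        self.mac_table = {}              # MAC address -> port it was learned on

    def receive(self, in_port, src_mac, dst_mac):
        # Learn: remember which port this source MAC lives on.
        self.mac_table[src_mac] = in_port

        # Forward: known unicast goes out one port; unknown unicast and
        # broadcast get flooded to every port except the one it came in on.
        if dst_mac != BROADCAST and dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        return [p for p in self.ports if p != in_port]

# Example: host on port 1 ARPs (broadcast), host on port 5 replies (unicast).
sw = ToySwitch(8)
print(sw.receive(1, "aa:aa:aa:aa:aa:01", BROADCAST))            # floods to 0, 2..7
print(sw.receive(5, "bb:bb:bb:bb:bb:05", "aa:aa:aa:aa:aa:01"))  # -> [1], already learned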

So, the single take-away I would recommend is that you need to break your test network out into a separate subnet. There are numerous ways to do this, but I would suggest getting a layer-3 switch, setting up two VLANs, assigning the appropriate ports to the test lab VLAN and the appropriate ports to the "office" VLAN, and having the switch route freely between those two subnets. This will still let you easily talk to machines in the test lab, but provide some insulation against traffic inadvertently bleeding from one network to the other. If you wanted to be even more cautious about it, you could implement ACLs on the switch to, say, only allow port 22 or port 80 TCP traffic between the two subnets.

You could also implement this on the cheap with an old Linux box with two network cards and a simple layer-2 Gig-E switch for the lab. That's probably the simpler solution and better leverages your existing knowledge, but I think you'll find that once you start playing with a nicer switch, you'll come up with more interesting ways to segment the lab and get more use out of it in the long run. Then again, perhaps not. :) That's why you get to make the decision, not me.
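
One footnote on the subnet split itself, whichever way you build it: the point is simply that the lab and the office end up in different broadcast domains. With made-up addresses (pick whatever ranges you like), it would look something like this:

# Hypothetical addressing plan for the lab/office split; the addresses are
# invented for illustration.  Broadcasts (ARP, DHCP, NetBIOS, mDNS, ...) stop
# at the subnet boundary, so lab broadcast traffic never hits office desktops.
import ipaddress

office = ipaddress.ip_network("192.168.1.0/24")   # office VLAN / subnet
lab    = ipaddress.ip_network("192.168.2.0/24")   # test lab VLAN / subnet

desktop     = ipaddress.ip_address("192.168.1.50")
load_engine = ipaddress.ip_address("192.168.2.10")

print(desktop in office, desktop in lab)           # True False
print(load_engine in lab, load_engine in office)   # True False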

Aaron S. Joyner

--
TriLUG mailing list        : http://www.trilug.org/mailman/listinfo/trilug
TriLUG Organizational FAQ  : http://trilug.org/faq/
TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
