Christopher L Merrill wrote:

Wow!  What a great bunch of responses, especially those from Greg,
Ryan and Aaron.  Most of it was over my head...but I'm hoping to absorb
just enough to make an intelligent decision for our test lab.

I've tried to identify which specs of a switch are actually important
for our use-case.  To recap, we are moving towards having (at most)
20 computers in our test lab with GigE NICs (some with multiples).
When we care about the performance, the scenario will be that most
of those computers will be hammering one or more web servers, also on
the same switch, with as much traffic as it/they can handle.  In some
cases, each "load engine" will be aliasing multiple IP addresses on
each NIC.  All of the machines will be on the same subnet.  When we
run tests, we would like the network to be invisible...meaning that
it is never the bottleneck.

So I've seen a few specs mentioned in switch literature and mentioned
in the discussions -- I am trying to assess how those relate to our
situation.
1. MTU - larger is better to improve bandwidth efficiency

Jumbo frames are a cool technology, but you can't mix and match if your network contains any computers that can't handle jumbo frames (e.g., any 10/100 clients). Generally, you can get dramatically improved throughput by tuning the TCP connection parameters, without needing jumbo frames at all. Check out /proc/sys/net/ipv4/tcp_*mem
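
If it helps, here's a quick sketch (mine, not anything from the thread) of how you might eyeball those values on a Linux box. Note that tcp_rmem and tcp_wmem are "min default max" in bytes, while tcp_mem is "low pressure high" in memory pages:

#!/usr/bin/env python3
# Print the current TCP memory tuning knobs.  Purely illustrative;
# adjust and persist them via sysctl / /etc/sysctl.conf as usual.
import glob

for path in sorted(glob.glob("/proc/sys/net/ipv4/tcp_*mem")):
    name = path.split("/")[-1]          # tcp_mem, tcp_rmem, tcp_wmem
    with open(path) as f:
        print(name, "=", f.read().split())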

2. # of MAC addresses - since we have a small number of computers
on a small network, I would guess this is unimportant to us.

You're not likely to be affected by this unless you have more than 8,000 computers on the same Ethernet segment. If you do, fix that problem. :)

3. Switching Capacity - pretty important to us, I would think, but
also seems to be the same for all models within a given line from a
given manufacturer - is the published number meaningful?

Generally, the published number is not meaningful. We could get into a lot of specifics about the internals and when it does matter, but unless you're going to do a *lot* of empirical testing, take apart the switch and examine its internals, or work closely with the manufacturer to learn what chips are used and how they're wired to which ports, you're not likely to come to any meaningful conclusions. Also, unless you're stressing the daylights out of the switch, it's not likely to matter to you much.

The potential exception to this that I see in your use case is that at larger port densities, the switch design gets less efficient internally. The designs are often built around single chips that can do non-blocking Gig-E between 8 and 16 ports at a time. If you have a 24-port switch, for example, you might have two Gig-E chipsets handling 12 ports each. Any two ports on the same chipset will get 1-gig non-blocking throughput between them, and ideally the interconnection between the two chipsets will be fully non-blocking as well. In practice that's often not the case, since it requires a 12-gig channel between them. The better switch gear generally provides it, but in a lot of cases you either don't have full non-blocking throughput between the two chipsets, or, even if you do, the switch may not be able to push the required traffic out the required port fast enough. Then you run into the switch dropping packets internally, congestion problems, and so on, and this often isn't handled gracefully on the inter-chipset links.
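
To put rough numbers on that, here's a back-of-the-envelope sketch of my own for the 24-port, two-chipset example; the interconnect figure is an assumption for illustration, not a spec for any particular switch:

# Back-of-the-envelope math for the 24-port, two-chipset example above.
# The 8 Gbps interconnect figure is an assumed value, not a real spec.
ports_per_chipset = 12       # two chipsets of 12 ports in a 24-port switch
line_rate_gbps = 1.0         # GigE, one direction

# Worst case: every port on one chipset sends line-rate traffic to ports
# on the other chipset, so the inter-chipset link must carry all of it.
offered_gbps = ports_per_chipset * line_rate_gbps    # 12 Gbps needed
interconnect_gbps = 8.0                              # what you might actually get

print("Needed across chipsets : %.0f Gbps" % offered_gbps)
print("Assumed interconnect   : %.0f Gbps" % interconnect_gbps)
print("Oversubscribed by      : %.1fx" % (offered_gbps / interconnect_gbps))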

So, consider your scenario, where you have 1 server and 30 clients flooding that server with data. It's not quite your scenario, as you're more likely to be pushing orders of magnitude more data *to* the clients than from them, and this doesn't apply there, but bear with me. :) If those 30 clients are all flooding 1 gigabit of traffic into the switch, and 15 of them are on the same chipset as the server and 15 are not, you may begin to find that the 15 computers on the second chipset exhibit subtly, or maybe markedly, different behavior than those on the first. They may see higher packet loss, and thus higher connection failure rates, lower throughput, etc.

4. Forwarding Rate - I have no idea what this is...important?

This is essentially the same thing as switching capacity, just usually quoted in packets per second rather than bits per second. The same caveats apply.


One other point that I wanted to verify is that one of the jobs of
the switch is to keep traffic away from parts of the network that
are not involved with the sender or receiver.  For example - the
switch in our test lab is hooked to the switch for the rest of the
office, to which the rest of our desktops are connected.  So when
we are running tests in the lab, none of that traffic bleeds into
the rest of the network affecting performance there.  My
understanding (and anecdotal evidence) is that this is true...is it?

So, how does a switch work? :) A switch learns which MAC addresses are connected to which port by looking at the source addresses of incoming traffic and associating them with the port they arrived on. Then, when another packet is received, it can look in its table of previously learned entries and send the traffic only to the port where it has learned that MAC address lives. If it gets a packet addressed to a MAC it doesn't know about, or to the Ethernet broadcast address, it forwards that packet to all ports except the one it was received on. This is great for established, simple flows.

This does not totally isolate one segment of the network from another, though. Broadcast packets, such as ARP requests, DHCP lease requests, and some service discovery protocols (NetBIOS/NetBEUI's NMP, SMP, browser service...; mDNS; etc.), are all addressed to the Ethernet broadcast address (FF:FF:FF:FF:FF:FF), and these packets will be delivered to every endpoint on the network. These broadcasts do not cross broadcast domains (god, I really sound like a network guy these days), which end at the borders of the routed subnet; to get isolation beyond that, you need a router.
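
If it helps to see that learn-and-forward logic spelled out, here's a toy sketch (purely illustrative; real switches do this in silicon, not Python):

# Toy sketch of the learning/forwarding behavior described above --
# illustrative only, not how any real switch ASIC is implemented.
BROADCAST = "ff:ff:ff:ff:ff:ff"

class ToySwitch:
    def __init__(self, num_ports):
        self.ports = range(num_ports)
        self.mac_table = {}              # MAC address -> port it was learned on

    def receive(self, in_port, src_mac, dst_mac):
        # Learn: remember which port this source MAC lives on.
        self.mac_table[src_mac] = in_port

        # Forward: known unicast goes out one port; unknown unicast and
        # broadcast get flooded to every port except the one it came in on.
        if dst_mac != BROADCAST and dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        return [p for p in self.ports if p != in_port]

# Example: host on port 1 ARPs (broadcast), host on port 5 replies (unicast).
sw = ToySwitch(8)
print(sw.receive(1, "aa:aa:aa:aa:aa:01", BROADCAST))            # floods to 0, 2..7
print(sw.receive(5, "bb:bb:bb:bb:bb:05", "aa:aa:aa:aa:aa:01"))  # -> [1], already learned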

So, the single take-away I would recommend is that you need to break your test network out into a separate subnet. There are numerous ways to do this, but I would suggest getting a layer-3 switch, setting up two VLANs, assigning the appropriate ports to the test lab VLAN and the appropriate ports to the "office" VLAN, and having the switch route freely between those two subnets. This will still let you easily talk to machines in the test lab, but provide some insulation against traffic inadvertently bleeding from one network to the other. If you wanted to be even more cautious about it, you could implement ACLs on the switch to, say, only allow port 22 or port 80 TCP traffic between the two subnets.

You could also implement this on the cheap with an old Linux box with two network cards and a simple layer-2 Gig-E switch for the lab. That's probably the simpler solution and better leverages your existing knowledge, but I think you'll find that once you start playing with a nicer switch, you'll come up with more interesting ways to segment the lab and get more use out of it in the long run. Then again, perhaps not. :) That's why you get to make the decision, not me.
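
One footnote on the subnet split itself, whichever way you build it: the point is simply that the lab and the office end up in different broadcast domains. With made-up addresses (pick whatever ranges you like), it would look something like this:

# Hypothetical addressing plan for the lab/office split; the addresses are
# invented for illustration.  Broadcasts (ARP, DHCP, NetBIOS, mDNS, ...) stop
# at the subnet boundary, so lab broadcast traffic never hits office desktops.
import ipaddress

office = ipaddress.ip_network("192.168.1.0/24")   # office VLAN / subnet
lab    = ipaddress.ip_network("192.168.2.0/24")   # test lab VLAN / subnet

desktop     = ipaddress.ip_address("192.168.1.50")
load_engine = ipaddress.ip_address("192.168.2.10")

print(desktop in office, desktop in lab)           # True False
print(load_engine in lab, load_engine in office)   # True False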

Aaron S. Joyner

--
TriLUG mailing list        : http://www.trilug.org/mailman/listinfo/trilug
TriLUG Organizational FAQ  : http://trilug.org/faq/
TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
