Ryan Leathers wrote:

<...snip...>
In short, if a way can be
devised to use L3 data in order to populate the table used by the
switching process, then it is possible to achieve the performance
benefits inherent in the process while retaining the value of decisions
based on L3 hierarchical addressing.  To go deeper into how this is done
we really need to talk about specific implementations since there is
more than one way to peel this onion, but this is the gist of it.
<...snip...>
From a practical standpoint, to me, unless we're in a seriously deep
network design discussion, making the distinction between routing and L3
switching is splitting hairs.
Kudos to Ryan for going into great depth, yet resisting the urge to just discuss the Cisco specifics. :) Still, I'm practically amazed you got through that description without using the terms CAM or TCAM. Allow me to take a moment to elucidate this idea, because the terms CAM and 'cam table' get so horribly misused and thrown around that I have taken to clarifying the origins at every opportunity. No one has yet provided the opportunity, so I'll create one. :)

First, we must explain how a switch works its magic internally. It needs to take MAC addresses, associate them with ports, and do so very quickly. This is done by employing a large bank of a rather unusual type of memory, called "Content Addressable Memory", or CAM for short. It's usually referenced as CAM memory, or the number of "CAMs" that you have, referring to the quantity of memory (more on this in a second). So what does this CAM do for a switch? It allows the switch to tell the memory, "At address 00:00:00:C1:B2:A3, store the number 4". Then, when the time comes, it can look up the memory address 00:00:00:C1:B2:A3 and get back a 4. This is, in essence, the MAC-address-to-port lookup table that a switch uses. In fact, the old Cisco switches used to call the MAC address table the "CAM table", because literally, that's how it was implemented. It was of interest to know how much CAM memory a switch had, in order to know how many MAC addresses a switch could know about at any given time. This is still quoted on the side of most switch boxes, though most people don't pay attention to it. It's also a reasonable differentiator between cheap switches and expensive switches, and directly relates to how much power they consume (for reasons we'll see in a moment).
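To make that concrete, here's a minimal toy model of the MAC-to-port table in Python. The names and capacity figure are purely illustrative (not any vendor's API); a dict stands in for the CAM, which in real hardware answers the lookup in a single cycle:

```python
# Toy model of a switch's CAM table: exact-match MAC -> port lookups.
class CamTable:
    def __init__(self, capacity):
        self.capacity = capacity  # how many MACs the CAM memory can hold
        self.table = {}

    def learn(self, mac, port):
        # "At address 00:00:00:C1:B2:A3, store the number 4"
        if mac not in self.table and len(self.table) >= self.capacity:
            raise MemoryError("CAM full: can't learn any more MACs")
        self.table[mac] = port

    def lookup(self, mac):
        # Returns the port, or None (a real switch would flood the frame)
        return self.table.get(mac)

cam = CamTable(capacity=8192)
cam.learn("00:00:00:C1:B2:A3", 4)
print(cam.lookup("00:00:00:C1:B2:A3"))  # -> 4
```

When the table fills up, a real switch starts flooding frames for unknown destinations out every port, which is why the number quoted on the side of the box matters.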

So, how does this fancy-pants memory work? (If you're really not interested, you can skip this paragraph.) In essence, this is at the heart of why switches are fast. The normal way to implement this in software would be to take the MAC address, create a hash of it, and store the entry at that memory location, so you could quickly access it later. You then have to deal with potential hash collisions as a corner case, but that's roughly it. The CAM simply implements this at the hardware layer, with transistors. As you might imagine, this isn't terribly difficult, but it does consume quite a few more transistors than your average memory circuit (which is basically just a set of tiny capacitors). As a result, CAMs chew up many more transistors than simple RAM, and this is why big fancy switches often draw more than just modest amounts of power, and come with big noisy fans to match. Even if you're not using them, powering up all those transistors can generate a healthy amount of heat.
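The software approach described above looks roughly like this sketch (bucket count and helper names are my own, for illustration): hash the MAC into a small bucket array, and chain entries when two MACs land in the same bucket. A CAM does the equivalent in transistors, matching all entries at once:

```python
# Software stand-in for a CAM: hash table with collision chaining.
NUM_BUCKETS = 1024
buckets = [[] for _ in range(NUM_BUCKETS)]

def mac_to_int(mac):
    return int(mac.replace(":", ""), 16)

def store(mac, port):
    bucket = buckets[mac_to_int(mac) % NUM_BUCKETS]
    for entry in bucket:
        if entry[0] == mac:       # MAC already known: update the port
            entry[1] = port
            return
    bucket.append([mac, port])    # hash collision? just chain it

def lookup(mac):
    for m, port in buckets[mac_to_int(mac) % NUM_BUCKETS]:
        if m == mac:
            return port
    return None

store("00:00:00:c1:b2:a3", 4)
store("de:ad:be:ef:00:01", 7)
print(lookup("00:00:00:c1:b2:a3"))  # -> 4
```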

So, as Ryan suggested, there is a magical way of moving this routing decision from a software decision made by a CPU* to a hardware decision made in a lightning-fast, switch-like fashion. A layer-3 routing decision, at its heart, is a similar kind of beast. You want to take a certain input (this time the destination IP address, as opposed to the MAC address as before) and look up the result. The problem here is that the mappings are no longer a simple 1-to-1 map; there is subnetting involved. Enter the TCAM: Ternary Content Addressable Memory. Now this is cool stuff. TCAMs are effectively a hardware implementation of a routing table. You can program in a certain number of values (think, routes) and masks on those values (think, subnet masks of those routes), and when you look up an IP address in the TCAM, you get back the corresponding route, based on what you programmed into it in the beginning. In fact, TCAMs are so cool, and so versatile, that all manner of subsystems in a typical router can make use of them. You can use them for route matching, for ACLs, or even as simple MAC table lookups (a poor use, but if you're just doing switching, why not -- it does happen :) ), etc, etc. As a result, you can change how these are associated with various subsystems of fancier routers, but it usually requires a full restart of the system (unfortunate but true). I should also mention that TCAMs have substantially more transistors than their simple RAM brethren, and far more than CAMs. Thus, they consume proportionately more power, and because of the cost of making CAMs and TCAMs, they are also not cheap. The capabilities of a given router or L3 switch have to be balanced against how much heat and power it can reasonably be expected to dissipate and consume, respectively, and how many TCAMs it can include to meet a given price point. More CAMs and TCAMs result in a more capable device that is more expensive to produce, and more expensive to operate.

So, now that I've babbled on about CAMs and TCAMs, perhaps some of you are thinking, "Gee, my Linux box doesn't have any of that fancy stuff." You would be unfortunately correct. Generally speaking, a Linux box can make up for *some* of these shortcomings in raw horsepower. Consider that two cores running at 3+GHz is really a lot of raw speed. The counter-argument is that speed only helps so much. A modern router doing the equivalent of CEF (Cisco Express Forwarding, aka making the routing decisions in hardware, via TCAMs, as described above) gets the packet in one interface and out the other in about the time it takes a Linux kernel to process the interrupt signaling that the packet has arrived and read it into memory. The kernel has still got to make the routing decision, then write the packet back out to the Ethernet card. This is assuming you don't have an iptables module loaded or anything of the sort, which will inject still more delay, compared to the lookup that happens in a single clock cycle via a TCAM on the router.

So it might sound like I'm trashing Linux for routing. And in some ways, I am. I suppose the lesson to take away is that when latency and throughput are of utmost importance, Linux is not necessarily the fastest way to go. On the flip side, it's often much, much more flexible, and usually orders of magnitude cheaper. It's usually a decision you only have to make at the edge, because 24 or more ports of GigE in a Linux box to be used as an L3 switch isn't exactly easy to come by yet. At the edge, you're usually not as concerned with latency, and will be glad to have the improved flexibility and convenience of a Linux box.

Okay, I'm done for the night.  :)
Aaron S. Joyner


* - A "route processor" in the networking world; a generic Intel or AMD CPU running the Linux kernel in most PC environments.
--
TriLUG mailing list        : http://www.trilug.org/mailman/listinfo/trilug
TriLUG Organizational FAQ  : http://trilug.org/faq/
TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
