Ryan Leathers wrote:

<...snip...>
In short, if a way can be
devised to use L3 data in order to populate the table used by the
switching process, then it is possible to achieve the performance
benefits inherent in the process while retaining the value of decisions
based on L3 hierarchical addressing.  To go deeper into how this is done
we really need to talk about specific implementations since there is
more than one way to peel this onion, but this is the gist of it.
<...snip...>
From a practical standpoint, to me, unless we're in a seriously deep
network design discussion, making the distinction between routing and L3
switching is splitting hairs.
Kudos to Ryan for going into great depth, yet resisting the urge to just discuss the Cisco specifics. :) Still, I'm practically amazed you got through that description without using the terms CAM or TCAM. Allow me to take a moment to elucidate this idea, because the terms CAM and 'cam table' get so horribly misused and thrown around that I have taken to clarifying the origins at every opportunity. No one has yet provided the opportunity, so I'll create one. :)

First, we must explain how a switch works its magic internally. It needs to take MAC addresses, associate them with ports, and do so very quickly. This is done by employing a large bank of a rather unusual type of memory, called "Content Addressable Memory", or CAM for short. It's usually referenced as CAM memory, or the number of "CAMs" that you have, referring to the quantity of memory (more on this in a second). So what does this CAM do for a switch? It allows the switch to tell the memory, "At address 00:00:00:C1:B2:A3, store the number 4". Then, when the time comes, it can look up the memory address 00:00:00:C1:B2:A3 and get back a 4. This is, in essence, the MAC-address-to-port lookup table that a switch uses. In fact, the old Cisco switches used to call the MAC address table the "CAM table", because literally, that's how it was implemented. It was of interest to know how much CAM memory a switch had, in order to know how many MAC addresses a switch could know about at any given time. This is still quoted on the side of most switch boxes, though most people don't pay attention to it. It's also a reasonable differentiator between cheap switches and expensive switches, and directly relates to how much power they consume (for reasons we'll see in a moment).
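To make that concrete, here's a minimal toy model of the MAC-to-port table in Python. The names and capacity figure are purely illustrative (not any vendor's API); a dict stands in for the CAM, which in real hardware answers the lookup in a single cycle:

```python
# Toy model of a switch's CAM table: exact-match MAC -> port lookups.
class CamTable:
    def __init__(self, capacity):
        self.capacity = capacity  # how many MACs the CAM memory can hold
        self.table = {}

    def learn(self, mac, port):
        # "At address 00:00:00:C1:B2:A3, store the number 4"
        if mac not in self.table and len(self.table) >= self.capacity:
            raise MemoryError("CAM full: can't learn any more MACs")
        self.table[mac] = port

    def lookup(self, mac):
        # Returns the port, or None (a real switch would flood the frame)
        return self.table.get(mac)

cam = CamTable(capacity=8192)
cam.learn("00:00:00:C1:B2:A3", 4)
print(cam.lookup("00:00:00:C1:B2:A3"))  # -> 4
```

When the table fills up, a real switch starts flooding frames for unknown destinations out every port, which is why the number quoted on the side of the box matters.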

So, how does this fancy-pants memory work? (If you're really not interested, you can skip this paragraph.) In essence, this is at the heart of why switches are fast. The normal way to implement this in software would be to take the MAC address, create a hash of it, and store the entry at that memory location, so you could quickly access it later. You then have to deal with potential hash collisions as a corner case, but that's roughly it. The CAM simply implements this at the hardware layer, with transistors. As you might imagine, this isn't terribly difficult, but it does consume quite a few more transistors than your average memory circuit (which is basically just a set of tiny capacitors). As a result, CAMs chew up many more transistors than simple RAM, and this is why big fancy switches often draw more than just modest amounts of power, and come with big noisy fans to match. Even if you're not using them, powering up all those transistors can generate a healthy amount of heat.
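The software approach described above looks roughly like this sketch (bucket count and helper names are my own, for illustration): hash the MAC into a small bucket array, and chain entries when two MACs land in the same bucket. A CAM does the equivalent in transistors, matching all entries at once:

```python
# Software stand-in for a CAM: hash table with collision chaining.
NUM_BUCKETS = 1024
buckets = [[] for _ in range(NUM_BUCKETS)]

def mac_to_int(mac):
    return int(mac.replace(":", ""), 16)

def store(mac, port):
    bucket = buckets[mac_to_int(mac) % NUM_BUCKETS]
    for entry in bucket:
        if entry[0] == mac:       # MAC already known: update the port
            entry[1] = port
            return
    bucket.append([mac, port])    # hash collision? just chain it

def lookup(mac):
    for m, port in buckets[mac_to_int(mac) % NUM_BUCKETS]:
        if m == mac:
            return port
    return None

store("00:00:00:c1:b2:a3", 4)
store("de:ad:be:ef:00:01", 7)
print(lookup("00:00:00:c1:b2:a3"))  # -> 4
```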

So, as Ryan suggested, there is a magical way of moving this routing decision from a software decision made by a CPU* to a hardware decision made in a lightning-fast, switch-like fashion. A layer-3 routing decision, at its heart, is a similar kind of beast. You want to take a certain input (this time the destination IP address, as opposed to the MAC address as before) and look up the result. The problem here is that the mappings are no longer a simple 1-to-1 map; there is subnetting involved. Enter the TCAM: Ternary Content Addressable Memory. Now this is cool stuff. TCAMs are effectively a hardware implementation of a routing table. You can program in a certain number of values (think, routes) and masks on those values (think, subnet masks of those routes), and when you look up an IP address in the TCAM, you get back the corresponding route, based on what you programmed into it in the beginning. In fact, TCAMs are so cool, and so versatile, that all manner of subsystems in a typical router can make use of them. You can use them for route matching, for ACLs, or even as simple MAC table lookups (a poor use, but if you're just doing switching, why not -- it does happen :) ), etc, etc. As a result, you can change how these are associated with various subsystems of fancier routers, but it usually requires a full restart of the system (unfortunate but true). I should also mention that TCAMs have substantially more transistors than their simple RAM brethren, and far more than CAMs. Thus, they consume proportionately more power, and because of the cost of making CAMs and TCAMs, they are also not cheap. The capabilities of a given router or L3 switch have to be balanced against how much heat and power it can reasonably be expected to dissipate and consume, respectively, and how many TCAMs it can include to meet a given price point. More CAMs and TCAMs result in a more capable device that is more expensive to produce, and more expensive to operate.

So, now that I've babbled on about CAMs and TCAMs, perhaps some of you are thinking, "Gee, my Linux box doesn't have any of that fancy stuff." You would be unfortunately correct. Generally speaking, a Linux box can make up for *some* of these shortcomings in raw horsepower. Consider that two cores running at 3+GHz is really a lot of raw speed. The counter-argument is that speed only helps so much. A modern router doing the equivalent of CEF (Cisco Express Forwarding, aka making the routing decisions in hardware, via TCAMs, as described above) gets the packet in one interface and out the other in about the time it takes a Linux kernel to process the interrupt signaling that the packet has arrived and read it into memory. The kernel has still got to make the routing decision, then write the packet back out to the Ethernet card. This is assuming you don't have an iptables module loaded or anything of the sort, which will inject still more delay, compared to the lookup that happens in a single clock cycle via a TCAM on the router.

So it might sound like I'm trashing Linux for routing. And in some ways, I am. I suppose the lesson to take away is that when latency and throughput are of utmost importance, Linux is not necessarily the fastest way to go. On the flip side, it's often much, much more flexible, and usually orders of magnitude cheaper. It's usually a decision you only have to make at the edge, because 24 or more ports of GigE in a Linux box to be used as an L3 switch isn't exactly easy to come by yet. At the edge, you're usually not as concerned with latency, and will be glad to have the improved flexibility and convenience of a Linux box.

Okay, I'm done for the night.  :)
Aaron S. Joyner


* - A "route processor" in the networking world; a generic Intel or AMD CPU running the Linux kernel in most PC environments.
--
TriLUG mailing list        : http://www.trilug.org/mailman/listinfo/trilug
TriLUG Organizational FAQ  : http://trilug.org/faq/
TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
