Re: [zfs-discuss] NFS performance?

Mike Gerdts Mon, 26 Jul 2010 14:10:40 -0700

On Mon, Jul 26, 2010 at 2:56 PM, Miles Nordin <car...@ivy.net> wrote:
>>>>>> "mg" == Mike Gerdts <mger...@gmail.com> writes:
>    mg> it is rather common to have multiple 1 Gb links to
>    mg> servers going to disparate switches so as to provide
>    mg> resilience in the face of switch failures.  This is not unlike
>    mg> (at a block diagram level) the architecture that you see in
>    mg> pretty much every SAN.  In such a configuration, it is
>    mg> reasonable for people to expect that load balancing will
>    mg> occur.
>
> nope.  spanning tree removes all loops, which means between any two
> points there will be only one enabled path.  An L2-switched network
> will look into L4 headers for splitting traffic across an aggregated
> link (as long as it's been deliberately configured to do that---by
> default probably only looks to L2), but it won't do any multipath
> within the mesh.


I was speaking more of IPMP, which is at layer 3.

> Even with an L3 routing protocol it usually won't do multipath unless
> the costs of the paths match exactly, so you'd want to build the
> topology to achieve this and then do all switching at layer 3 by
> making sure no VLAN is larger than a switch.

By default, IPMP does outbound load spreading.  Inbound load spreading
is not practical with a single (non-test) IP address.  If you have
multiple virtual IPs, you can spread them across all of the NICs in
the IPMP group and get some degree of inbound spreading as well.  This
is the default behavior of the OpenSolaris IPMP implementation, last I
looked.  I've not seen any examples (although I can't say I've looked
very hard either) of a Solaris 10 IPMP configuration set up with
multiple IPs to encourage inbound load spreading as well.
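
To make the inbound-spreading idea concrete, here is a toy model in
Python.  It is only a sketch: the round-robin policy, the function name
spread_addresses, the NIC names, and the 192.0.2.x documentation
addresses are all assumptions for illustration, not how in.mpathd
actually places addresses.

```python
from collections import defaultdict


def spread_addresses(addresses, nics):
    """Place each data address on a NIC round-robin.

    Toy model only: the real IPMP placement policy may differ.
    """
    if not nics:
        raise ValueError("need at least one active NIC")
    placement = defaultdict(list)
    for i, addr in enumerate(addresses):
        placement[nics[i % len(nics)]].append(addr)
    return dict(placement)


# Hypothetical data addresses hosted by one IPMP group.
addresses = ["192.0.2.10", "192.0.2.11", "192.0.2.12", "192.0.2.13"]

# Two healthy NICs: inbound traffic for the four addresses is split.
both = spread_addresses(addresses, ["nic0", "nic1"])

# One NIC (or its switch) fails: every address lands on the survivor.
after_failure = spread_addresses(addresses, ["nic1"])

# A single data address can only ever live on one NIC at a time,
# which is why inbound spreading needs more than one address.
single = spread_addresses(addresses[:1], ["nic0", "nic1"])

print(both)
print(after_failure)
print(single)
```

The single-address case is the point: with one data address there is
nothing to spread, whatever the number of NICs in the group.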

>
> There's actually a cisco feature to make no VLAN larger than a *port*,
> which I use a little bit.  It's meant for CATV networks I think, or
> DSL networks aggregated by IP instead of ATM like maybe some European
> ones?  but the idea is not to put edge ports into vlans any more but
> instead say 'ip unnumbered loopbackN', and then some black magic they
> have built into their DHCP forwarder adds /32 routes by watching the
> DHCP replies.  If you don't use DHCP you can add static /32 routes
> yourself, and it will work.  It does not help with IPv6, and also you
> can only use it on vlan-tagged edge ports (whaaaaat? arbitrary!) but
> neat that it's there at all.
>
>  http://www.cisco.com/en/US/docs/ios/12_3t/12_3t4/feature/guide/gtunvlan.html

Interesting... however, this seems to limit you to fewer than 4096
edge ports per VTP domain, as the VID field in the 802.1Q header is
only 12 bits.  It is also unclear how this works when you have one
physical host with many guests.  And I don't really see how this helps
with resilience in the face of a switch failure.  Cool technology, but
I'm not certain that it addresses what I was talking about.
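
The arithmetic behind that limit, as a quick sketch.  The variable
names are mine; the two reserved values (0x000 and 0xFFF) come from
802.1Q itself, and the one-VLAN-per-edge-port mapping is my reading of
the feature, not something I have verified.

```python
VID_BITS = 12                      # width of the VID field in an 802.1Q tag

total_vids = 2 ** VID_BITS         # every value the field can encode
reserved = {0x000, 0xFFF}          # priority-tagged frames and the reserved value
usable_vids = total_vids - len(reserved)

# If each edge port consumes its own VLAN ID, this is the ceiling on
# edge ports within one VLAN ID space.
print(total_vids, usable_vids)
```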

>
> The best thing IMHO would be to use this feature on the edge ports,
> just as I said, but you will have to teach the servers to VLAN-tag
> their packets.  not such a bad idea, but weird.
>
> You could also use it one hop up from the edge switches, but I think
> it might have problems in general removing the routes when you unplug
> a server, and using it one hop up could make them worse.  I only use
> it with static routes so far, so no mobility for me: I have to keep
> each server plugged into its assigned port, and reconfigure switches
> if I move it.  Once you have ``no vlan larger than 1 switch,'' if you
> actually need a vlan-like thing that spans multiple switches, the new
> word for it is 'vrf'.

There was some other Cisco dark magic that our network guys were
touting a while ago that would make each edge switch look like a blade
in a 6500 series.  This would then allow them to do link aggregation
across edge switches.  At least two of "organizational changes",
"personnel changes", and "roadmap changes" happened so I've not seen
this in action.

>
> so, yeah, it means the server people will have to take over the job of
> the networking people.  The good news is that networking people don't
> like spanning tree very much because it's always going wrong, so
> AFAICT most of them who are paying attention are already moving in
> this direction.
>
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
>



-- 
Mike Gerdts
http://mgerdts.blogspot.com/
