Andrew,

I agree that the choice of hash function is important for LACP. My thinking has always been to stay down in layers 2 and 3.  With enough hosts it seems likely that traffic would be split close to evenly.  Heads or tails - 50% of the time you're right.  TCP ports should also be nearly equally split, but listening ports could introduce some asymmetry.
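A quick way to sanity-check how evenly a given hash policy is actually spreading flows is to compare the per-slave counters on the bond. Something along these lines (interface and bond names here are just placeholders for whatever the slaves are actually called):

  # per-slave byte/packet counters - compare them to see how even the split is
  ip -s link show eno1
  ip -s link show eno2

  # bond status: hash policy in effect, LACP state, per-slave details
  cat /proc/net/bonding/bond0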

What I'm concerned about is the next level up: with the client network and the cluster network (Marc's terms are more descriptive) on the same NICs/switch ports, with or without LACP and LAGs, it seems possible that at times the bandwidth consumed by cluster traffic could overwhelm and starve the client traffic. Or the other way around, which would be worse, since the cluster nodes then can't communicate on their 'private' network to keep the cluster consistent.  These overloads could happen in the packet queues in the NIC drivers, or maybe in the switch fabric.

Maybe these starvation scenarios aren't that likely in clusters with 10Gb networking.  Maybe it's hard to fill up a 10Gb pipe, much less two.  But it could happen with 1Gb NICs, even in LAGs of 4 or 6 ports, and with faster NVMe drives it will eventually be easy to fill a 10Gb pipe.

So, what could we do with some of the 'exotic' queuing mechanisms available in Linux to keep the balance - to ensure that the less busy category can still transmit its proportional share?  (And is 'proportional' the right answer, or should one side get a slight advantage?)
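One rough sketch of what I have in mind, using HTB on the bond.  This assumes a 2x10Gb LAG called bond0 and a cluster network on 192.168.100.0/24 (both made up here), and that the IP headers are still visible to the filter at the bond's root qdisc - which depends on how the VLAN tagging/offload is handled; matching on vlan_id with a flower filter would be the fallback:

  # HTB root on the bond; unclassified traffic falls into the client class (1:10)
  tc qdisc replace dev bond0 root handle 1: htb default 10

  # parent class capped at the aggregate LAG rate
  tc class add dev bond0 parent 1: classid 1:1 htb rate 20gbit ceil 20gbit

  # client traffic: guaranteed half the LAG, allowed to borrow up to the full 20Gb
  tc class add dev bond0 parent 1:1 classid 1:10 htb rate 10gbit ceil 20gbit

  # cluster traffic: same guarantee and ceiling
  tc class add dev bond0 parent 1:1 classid 1:20 htb rate 10gbit ceil 20gbit

  # steer anything destined for the cluster subnet into the cluster class
  tc filter add dev bond0 parent 1: protocol ip u32 match ip dst 192.168.100.0/24 flowid 1:20

The rate/ceil split is exactly where the 'proportional vs. slight advantage' question shows up - e.g. give the cluster class rate 12gbit and the client class 8gbit if the private side should win during contention.  At 10Gb+ line rates the burst/cburst and quantum values would probably need tuning as well.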

-Dave

Dave Hall
Binghamton University
kdh...@binghamton.edu
On 3/15/2021 12:48 PM, Andrew Walker-Brown wrote:

Dave

That’s the way our cluster is set up. It’s relatively small: 5 hosts, 12 OSDs.

Each host has 2x10G with LACP to the switches.  We’ve VLAN’d the public/private networks.

Making best use of the LACP LAG largely comes down to choosing the best hashing policy.  At the moment we’re using layer3+4 in both the Linux config and the switch configs.  We’re monitoring link utilisation to make sure the balancing is as close to equal as possible.
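For reference, the relevant bits of that setup look roughly like this with iproute2 (interface names, VLAN IDs and so on are examples, not our actual values):

  # bond with LACP (802.3ad) and a layer3+4 transmit hash
  ip link add bond0 type bond mode 802.3ad miimon 100 lacp_rate fast xmit_hash_policy layer3+4
  ip link set eno1 down; ip link set eno1 master bond0
  ip link set eno2 down; ip link set eno2 master bond0
  ip link set bond0 up

  # tagged VLANs for the public and cluster networks
  ip link add link bond0 name bond0.100 type vlan id 100   # public / client
  ip link add link bond0 name bond0.200 type vlan id 200   # cluster / private
  ip link set bond0.100 up
  ip link set bond0.200 up

The switch side needs a matching LAG/port-channel with the same hash policy and both VLANs trunked.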

Hope this helps

A

Sent from my iPhone

On 15 Mar 2021, at 16:39, Marc <m...@f1-outsourcing.eu> wrote:

I have client and cluster network on one 10gbit port (with different vlans).
I think many smaller clusters do this ;)

I've been thinking about ways to squeeze as much performance as possible
from the NICs on a Ceph OSD node.  The nodes in our cluster (6 x OSD, 3
x MGR/MON/MDS/RGW) have 2 x 10Gb ports.  Currently, one port is assigned
to the front-side network and one to the back-side network.  However,
there are times when the traffic on one side or the other is more
intense and might benefit from a bit more bandwidth.

The idea I had was to bond the two ports together and run the back-side
network in a tagged VLAN on the combined 20Gb LACP port.  In order to
keep the balance and prevent starvation of either side, it would be
necessary to apply some sort of weighted fair queuing mechanism via the
'tc' command.  The idea is that if the client side isn't using the full
10Gb/node and there is a burst of re-balancing activity, the bandwidth
consumed by the back-side traffic could swell to 15Gb or more.  Or vice
versa.
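Whether that actually works should be visible straight from the qdisc counters.  Assuming an HTB (or similar classful) setup on the bond, something like this would show each class's throughput and how much it has borrowed beyond its guarantee during a backfill:

  # per-class stats: bytes sent and, for HTB, how often each class borrowed from its parent
  tc -s class show dev bond0

  # watch it live during a burst of re-balancing
  watch -n 1 tc -s class show dev bond0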

From what I have read and studied, these algorithms are fairly
responsive to changes in load and would thus adjust rapidly if the
demand from either side suddenly changed.

Maybe this is a crazy idea, or maybe it's really cool.  Your thoughts?

Thanks.

-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu