We do the same thing: OSPF between the ToR switches, and BGP down to all of
the hosts, with each one advertising its own /32 (each host has 2 NICs).
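
In case a concrete picture helps, here's a rough sketch (Python, purely
illustrative -- the prefixes and host names are invented, not our real plan)
of what "one /32 per host, announced from both NICs" looks like:

    # Illustrative only: carve one /32 per OSD host out of a block; the same
    # /32 is announced on the BGP session of each of the host's two NICs.
    import ipaddress

    host_block = ipaddress.ip_network("192.0.2.0/26")
    loopbacks = host_block.hosts()

    for host in [f"osd{i:02d}" for i in range(1, 5)]:
        lo = next(loopbacks)
        # Either uplink (or both, with BGP multipath) can carry traffic
        # towards this address.
        print(f"{host}: announces {lo}/32 to tor-a and tor-b")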

On Mon, Jun 6, 2016 at 6:29 AM, Luis Periquito <[email protected]> wrote:

> Nick,
>
> TL;DR: works brilliantly :)
>
> Where I work we have all of the Ceph nodes (and a lot of other stuff)
> using OSPF and BGP server attachment. With that we're able to implement
> things like anycast addresses for the radosgw service, removing the need
> to add load balancers.
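>
> As a rough illustration of why that removes the load balancer (Python,
> with made-up addresses -- not our actual config): several radosgw hosts
> announce the same /32, so the upstream routers just end up with several
> equal-cost next-hops for it and spread flows across them:
>
>     import hashlib
>
>     ANYCAST_VIP = "203.0.113.80/32"          # invented anycast address
>     rib = {ANYCAST_VIP: ["10.0.0.11", "10.0.0.12", "10.0.0.13"]}  # 3 radosgw
>
>     def next_hop(client_ip: str) -> str:
>         # per-flow ECMP choice made by the router -- no LB box involved
>         paths = rib[ANYCAST_VIP]
>         h = int(hashlib.sha256(client_ip.encode()).hexdigest(), 16)
>         return paths[h % len(paths)]
>
>     for client in ("198.51.100.7", "198.51.100.8", "198.51.100.9"):
>         print(client, "->", next_hop(client))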
>
> The biggest issues we've had were around per-flow vs. per-packet traffic
> load balancing, but as long as you keep it simple you shouldn't have any
> problems.
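>
> A toy example of the difference (Python; the hash and addresses are
> invented, real switches use their own ECMP hash): per-flow keeps every
> packet of a connection on one path, so nothing reorders but a single flow
> can't exceed one 10G link; per-packet sprays packets over both links,
> which fills them but tends to hurt TCP through reordering:
>
>     import itertools, zlib
>
>     PATHS = ["tor-a", "tor-b"]
>
>     def per_flow(src, dst, sport, dport, proto="tcp"):
>         # same 5-tuple -> same path every time
>         key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
>         return PATHS[zlib.crc32(key) % len(PATHS)]
>
>     rr = itertools.cycle(PATHS)
>     def per_packet():
>         # packets alternate across links -> reordering risk
>         return next(rr)
>
>     print("flow A always ->", per_flow("10.0.0.11", "10.0.0.21", 6800, 50000))
>     print("per-packet    ->", [per_packet() for _ in range(4)])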
>
> Currently we have a point-to-point network between the servers and the ToR
> switches, each link on a /31 subnet, and then create an address on a virtual
> loopback interface, which is what we use for all communications. Running
> tests like iperf we're able to reach 19 Gbps (on a 2x10 Gbps network). On
> the other hand, we no longer have the ability to separate traffic between
> the public and OSD networks, but we've never really felt the need for it.
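>
> For reference, a small sketch of the numbering (Python; the prefixes are
> examples, not the ones we use). Each /31 carries exactly two addresses,
> one per end of the link, and the /32 on the loopback is the only address
> anything else ever talks to:
>
>     import ipaddress
>
>     link_a = ipaddress.ip_network("198.51.100.0/31")   # host <-> ToR A
>     link_b = ipaddress.ip_network("198.51.100.2/31")   # host <-> ToR B
>     lo     = ipaddress.ip_interface("192.0.2.10/32")   # the server address
>
>     for name, link in (("link_a", link_a), ("link_b", link_b)):
>         tor_end, host_end = link[0], link[1]   # RFC 3021 point-to-point /31
>         print(f"{name}: ToR side {tor_end}, host side {host_end}")
>
>     print("all traffic is addressed to", lo.ip, "reachable via either link")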
>
> Also, spend a bit of time planning what the network will look like and its
> topology. If done properly (think of details like route summarization) it's
> really worth the extra effort.
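>
> As a small example of the summarization point (Python, invented prefixes):
> if a rack's host /32s are carved from one contiguous block, the ToR can
> announce a single aggregate upstream instead of one route per host:
>
>     import ipaddress
>
>     host_routes = [ipaddress.ip_network(f"192.0.2.{i}/32") for i in range(64)]
>     aggregates = list(ipaddress.collapse_addresses(host_routes))
>     print(len(host_routes), "host routes ->", [str(a) for a in aggregates])
>     # 64 host routes -> ['192.0.2.0/26']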
>
>
>
> On Mon, Jun 6, 2016 at 11:57 AM, Nick Fisk <[email protected]> wrote:
>
>> Hi All,
>>
>>
>>
>> Has anybody had any experience with running the network routed down all
>> the way to the host?
>>
>>
>>
>> I know the standard way most people configure their OSD nodes is to bond
>> the two NICs, which then talk via a VRRP gateway, and from then on the
>> networking is all Layer 3. The main disadvantage I see here is that you
>> need a beefy inter-switch link to cope with the amount of traffic flowing
>> between the switches to reach the VRRP address. I've been trying to design
>> around this by splitting hosts into groups with different VRRP gateways on
>> either switch, but this relies on using active/passive bonding on the OSD
>> hosts to make sure traffic leaves the correct NIC towards the directly
>> connected switch.
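>>
>> Just to put a rough number on it, a back-of-the-envelope sketch (Python;
>> the figures are invented, not from a real deployment): any routed packet
>> that lands on the switch which isn't the VRRP master has to cross the
>> inter-switch link to reach the gateway, so with flow-hashed bonds roughly
>> half of the hosts' routed traffic can end up on the ISL in the worst case:
>>
>>     hosts = 16              # OSD hosts behind the switch pair (invented)
>>     nics_per_host = 2
>>     per_nic_gbps = 10
>>
>>     total_gbps = hosts * nics_per_host * per_nic_gbps
>>     # with LACP-style hashing roughly half the traffic arrives at the
>>     # switch that does not hold the VRRP master
>>     worst_case_isl_gbps = total_gbps / 2
>>     print(f"worst case crossing the ISL: ~{worst_case_isl_gbps:.0f} Gbps")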
>>
>>
>>
>> What I was thinking is: instead of terminating the Layer 3 part of the
>> network at the access switches, terminate it at the hosts. If each NIC of
>> the OSD host had a different subnet, and the actual “OSD server” address
>> were bound to a loopback adapter, OSPF should advertise that loopback
>> address as reachable via the two L3 links on the physically attached NICs.
>> This should give you a redundant topology which also respects your physical
>> layout, and potentially higher performance due to ECMP.
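>>
>> A quick sketch of the behaviour I'm after (Python, invented addresses):
>> the /32 on the loopback is reachable via both L3 links as ECMP next-hops,
>> and losing one link should just withdraw one next-hop rather than taking
>> the host off the network:
>>
>>     osd_server = "192.0.2.10/32"    # address bound to the loopback adapter
>>     # next-hops learnt via OSPF over the two directly attached NICs
>>     routes = {osd_server: {"198.51.100.0", "198.51.100.2"}}
>>
>>     def link_failed(next_hop: str) -> None:
>>         # OSPF withdraws the path over the dead link; the other remains
>>         routes[osd_server].discard(next_hop)
>>
>>     print("before failure:", sorted(routes[osd_server]))   # two ECMP paths
>>     link_failed("198.51.100.0")
>>     print("after failure: ", sorted(routes[osd_server]))   # still reachable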
>>
>>
>>
>> Any thoughts, any pitfalls?
>>
>>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
