What about specifying all non-local instances as "backup" in haproxy.cfg?
This way haproxy would only direct traffic to the local instance as long as
the local instance is alive.
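
As a rough sketch of what I mean, the section generated on one node might look like the fragment below. The listen name, bind address, IPs, ports, and server names are all made up for illustration; only the "backup" keyword on the non-local servers matters here.

```
listen elasticsearch
  bind 127.0.0.1:9200
  mode tcp
  # instance running on this node: receives all traffic while it is up
  server es-local 10.0.0.1:31000 check
  # instances on other nodes: used only when the local one is down
  server es-node2 10.0.0.2:31000 check backup
  server es-node3 10.0.0.3:31000 check backup
```

haproxy only sends traffic to backup servers once all non-backup servers are down, and by default it uses only the first available backup; adding "option allbackups" would balance across all of them instead.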

For example, if you plan to use the haproxy-marathon-bridge script, you can
modify this line to achieve that:
https://github.com/mesosphere/marathon/blob/8b3ce8844dcc53055345914ef11019789dd843cf/bin/haproxy-marathon-bridge#L162
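
To illustrate the kind of change I have in mind, here is a minimal Python sketch of the logic. This is not the bridge script's actual code (the bridge is a shell script); the function name server_lines, its parameters, and the "host:port" endpoint format are assumptions for illustration. It also assumes that the host names Marathon reports match the local host name.

```python
import socket


def server_lines(app_name, endpoints, local_host):
    """Build haproxy 'server' lines for one app.

    Endpoints whose host differs from local_host are marked 'backup',
    so haproxy prefers the instance running on this node.
    """
    lines = []
    for i, endpoint in enumerate(endpoints):
        # Hypothetical endpoint format: "host:port"
        host = endpoint.rsplit(":", 1)[0]
        suffix = "" if host == local_host else " backup"
        lines.append(f"  server {app_name}-{i} {endpoint} check{suffix}")
    return lines


if __name__ == "__main__":
    # Example with made-up endpoints; the local host comes from the OS.
    for line in server_lines("es", ["node1:31000", "node2:31001"],
                             socket.gethostname()):
        print(line)
```

The same test (is this endpoint's host the local host?) could be applied in the shell script at the point where it emits each server line.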


On Thu, Dec 31, 2015 at 1:56 AM, vincent gromakowski <
[email protected]> wrote:

> I am currently using Mesos as a big data backend for Spark, Cassandra,
> Kafka and Elasticsearch, but I cannot find a good overall design for
> service discovery. Let me explain:
> Generally, service discovery is managed by an HAProxy instance on each
> node, which redirects traffic from service ports to the actual assigned
> network ports. Currently I am not using it because the cluster is quite
> small and I don't need to deploy many services, but I am thinking about a
> future design that will allow me to scale.
> The problem with HAProxy handling all network traffic is that I am
> afraid it will break data locality, which is so important for performance
> in the big data world.
> For example, when Spark tries to connect to Elasticsearch, it will discover
> the Elasticsearch topology and try to launch tasks next to the Elasticsearch
> shards. If HAProxy intercepts network flows, what would be the result?
> Will HAProxy masquerade the Elasticsearch IPs/ports? Same thing for Kafka
> and Cassandra?
>
> I assume it depends on each connector, but it's very hard to find any
> information. Thanks for your help if you have any experience with this.
> Regards
>
>
>
