I know this was recently bashed on the user list, but I still like the idea
of using haproxy (or other load balancer).

I think it would be very useful to have a simple haproxy configuration that
starts from a list of IPs, then dynamically updates itself using
describe_ring() to keep a relatively accurate, up-to-the-minute list of which
nodes are available.
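
A minimal sketch of the config-generation half of that idea, in Python. The
function name render_haproxy_backend, the backend name cassandra_thrift and
the example addresses are made up for illustration. 9160 is Cassandra's
default Thrift port. Fetching the live node list with describe_ring() is left
out here.

```python
def render_haproxy_backend(node_ips, port=9160, name="cassandra_thrift"):
    """Return an haproxy backend section with one server line per node IP.

    node_ips is assumed to come from elsewhere (a static list at first,
    later whatever describe_ring() reports).
    """
    lines = [
        "backend %s" % name,
        "    mode tcp",             # Thrift is plain TCP, not HTTP
        "    balance roundrobin",
    ]
    # De-duplicate and sort so the output is stable between refreshes.
    for i, ip in enumerate(sorted(set(node_ips)), 1):
        # "check" asks haproxy to health-check the node's TCP port.
        lines.append("    server cass%d %s:%d check" % (i, ip, port))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_haproxy_backend(["10.0.0.2", "10.0.0.1"]))
```

The output would still need to be merged into a full haproxy config and
haproxy reloaded whenever the list changes. How often to do that is a
separate decision.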

I assume this is roughly how the connection pooling classes would attempt
it.
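
For the client side, something like the following is what I have in mind.
This is only a sketch under assumptions: NodeList and fetch_ring are
hypothetical names, and fetch_ring stands in for whatever wrapper calls
describe_ring() on a given host and returns a list of node IPs. It is not
taken from any existing client.

```python
import random
import time


class NodeList(object):
    """Round-robins over known nodes, refreshing the list periodically."""

    def __init__(self, seeds, fetch_ring, refresh_secs=60, clock=time.time):
        if not seeds:
            raise ValueError("need at least one seed address")
        self._seeds = list(seeds)
        self._fetch_ring = fetch_ring    # hypothetical: callable(host) -> list of IPs
        self._refresh_secs = refresh_secs
        self._clock = clock
        self._nodes = list(seeds)        # start from the static list
        self._pos = 0
        self._next_refresh = 0           # forces a refresh on first use

    def _refresh(self):
        # Ask the nodes we already know about first, then fall back to seeds.
        for host in self._nodes + self._seeds:
            try:
                nodes = list(self._fetch_ring(host))
            except Exception:
                continue                 # host unreachable, try the next one
            if nodes:
                random.shuffle(nodes)    # avoid every client using the same order
                self._nodes = nodes
                self._pos = 0
                break
        # If nobody answered, keep the old list and retry after the interval.
        self._next_refresh = self._clock() + self._refresh_secs

    def next_node(self):
        """Return the next node address in round-robin order."""
        if self._clock() >= self._next_refresh:
            self._refresh()
        node = self._nodes[self._pos % len(self._nodes)]
        self._pos += 1
        return node
```

A pool would call next_node() each time it opens a connection. Dropping and
reopening connections every so often would then spread them over the current
list.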

Dave Viner


On Tue, Aug 31, 2010 at 7:22 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

> Oops, should be
>
> Round robin DNS is just the *wrong* approach, when doing things like
> draining a node or turning a box off.
>
>
> Aaron
>
> On 01 Sep, 2010, at 02:18 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
>
> I agree.
>
> Round robin DNS is just the right approach, when doing things like draining
> a node or turning a box off.
>
> I am moving to describe_ring plus either a seed list or DNS. My concern
> with using a seed list in the client is the chance that the seeds are down:
> I have 2 seeds in a 4 node cluster, so with both seeds down the cluster
> should still have quorum.
>
> So I may go with regular round robin DNS when the client needs to refresh
> its list of nodes, e.g. when starting up.
>
> Have the client hold a list of the up nodes returned from describe_ring(),
> that it shuffles and then round robins. The list would be refreshed
> periodically.
>
> I also have the client periodically obtain a new connection to avoid the
> connections getting clumped in one area of the ring.
>
> (I'm working on an in house Python client that I hope to make public).
>
> Aaron
>
> On 01 Sep, 2010, at 02:04 PM, Dan Washusen <d...@reactive.org> wrote:
>
> Pelops provides a connection pooling impl that's using (or attempting to
> use) the second approach, but to be honest it needs a significant amount of
> testing before I'd be willing to go into production with it...
>
> IMO, the connection pooling/node failure/etc logic is by far the most
> complex part of a client library. It would be excellent if we could avoid
> re-inventing the wheel when attempting to create a solution to solve it.
>
> Cheers,
> Dan
>
> On Wed, Sep 1, 2010 at 11:35 AM, Aaron Morton <aa...@thelastpickle.com> wrote:
>
> > When I first started writing code against the thrift API the FAQ
> > recommended using a round robin DNS to select nodes
> > http://wiki.apache.org/cassandra/FAQ#node_clients_connect_to
> >
> > The other day Ben said something like "well behaved clients use
> > describe_ring to keep track of running nodes".
> > http://www.mail-archive.com/u...@cassandra.apache.org/msg05588.html
> >
> > So am wondering what approach people are taking to detecting cluster
> > membership.
> >
> > 1. Round Robin
> >
> > 2. List seeds in app config, connect to a seed, use describe_ring.
> >
> > 3. Round robin and describe_ring
> >
> > One issue I've found with round robin is that if the machine is powered
> > off it can take a while for the network to work out there is no ARP for
> > the IP. This may just be a result of the network here; I have not looked
> > into it too far.
> >
> > cheers
> > Aaron
> >
>
>
