I know this was recently bashed on the user list, but I still like the idea of using haproxy (or another load balancer).
I think it would be very useful to have a simple haproxy configuration which
used a list of IPs to start, then dynamically updated itself using
describe_ring() to keep a relatively accurate, up-to-the-minute list of which
nodes are available. I assume this is roughly how the connection pooling
classes would attempt it.

Dave Viner

On Tue, Aug 31, 2010 at 7:22 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

> Oops, should be:
>
> Round robin DNS is just the *wrong* approach when doing things like
> draining a node or turning a box off.
>
> Aaron
>
> On 01 Sep, 2010, at 02:18 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
>
> I agree.
>
> Round robin DNS is just the right approach when doing things like draining
> a node or turning a box off.
>
> I'm moving to use describe_ring and either a seed list or DNS. My concern
> with using a seed list in the client is the chance that the seeds are down;
> I have 2 seeds in a 4 node cluster, so with both seeds down the cluster
> should still have quorum.
>
> So I may go with regular round robin DNS when the client needs to refresh
> its list of nodes, e.g. when starting up.
>
> Have the client hold a list of the up nodes returned from describe_ring(),
> which it shuffles and then round-robins through. The list would be
> refreshed periodically.
>
> I also have the client periodically obtain a new connection, to avoid the
> connections getting clumped in one area of the ring.
>
> (I'm working on an in-house Python client that I hope to make public.)
>
> Aaron
>
> On 01 Sep, 2010, at 02:04 PM, Dan Washusen <d...@reactive.org> wrote:
>
> Pelops provides a connection pooling impl that's using (or attempting to
> use) the second approach, but to be honest it needs a significant amount
> of testing before I'd be willing to go into production with it...
>
> IMO, the connection pooling/node failure/etc. logic is by far the most
> complex part of a client library. It would be excellent if we could avoid
> re-inventing the wheel when building a solution for it.
>
> Cheers,
> Dan
>
> On Wed, Sep 1, 2010 at 11:35 AM, Aaron Morton <aa...@thelastpickle.com> wrote:
>
> > When I first started writing code against the Thrift API, the FAQ
> > recommended using round robin DNS to select nodes:
> > http://wiki.apache.org/cassandra/FAQ#node_clients_connect_to
> >
> > The other day Ben said something like "well-behaved clients use
> > describe_ring to keep track of running nodes":
> > http://www.mail-archive.com/u...@cassandra.apache.org/msg05588.html
> >
> > So I am wondering what approach people are taking to detecting cluster
> > membership:
> >
> > 1. Round robin DNS
> >
> > 2. List seeds in app config, connect to a seed, use describe_ring.
> >
> > 3. Round robin DNS and describe_ring.
> >
> > One issue I've found with round robin is that if a machine is powered
> > off, it can take a while for the network to work out there is no ARP for
> > the IP. This may just be a result of the network here; I have not looked
> > into it too far.
> >
> > cheers
> > Aaron
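
For reference, here is a minimal sketch of the describe_ring()-based approach
discussed above: seed from a static host list (or round robin DNS), ask any
reachable node for the ring, then shuffle the resulting node list and
round-robin over it, refreshing periodically. It assumes the Thrift-generated
Python bindings for Cassandra 0.6.x (a cassandra.Cassandra client exposing
describe_ring(keyspace), which returns TokenRange objects with an endpoints
list) and a framed transport; the SEED_HOSTS, KEYSPACE and NodeList names are
illustrative only and not part of any existing client.

    # Sketch only: shuffled, periodically refreshed node list built from
    # describe_ring(). Assumes the generated Thrift bindings for Cassandra
    # 0.6.x; swap TFramedTransport for TBufferedTransport if your server is
    # configured for unframed Thrift.
    import random
    import time

    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from cassandra import Cassandra  # generated Thrift bindings (assumed)

    SEED_HOSTS = ['10.0.0.1', '10.0.0.2']   # seeds from config or DNS (example values)
    KEYSPACE = 'Keyspace1'                  # any non-system keyspace (example value)
    THRIFT_PORT = 9160
    REFRESH_SECS = 60

    def _open_client(host):
        """Open a Thrift connection to a single node."""
        transport = TTransport.TFramedTransport(
            TSocket.TSocket(host, THRIFT_PORT))
        transport.open()
        protocol = TBinaryProtocol.TBinaryProtocol(transport)
        return Cassandra.Client(protocol), transport

    class NodeList(object):
        """Shuffled, periodically refreshed list of live nodes."""

        def __init__(self, seeds):
            self._seeds = list(seeds)
            self._nodes = list(seeds)   # fall back to the seeds until refreshed
            self._index = 0
            self._last_refresh = 0

        def refresh(self):
            """Ask any reachable seed for the ring and rebuild the node list."""
            for seed in self._seeds:
                try:
                    client, transport = _open_client(seed)
                    try:
                        token_ranges = client.describe_ring(KEYSPACE)
                    finally:
                        transport.close()
                    nodes = set()
                    for token_range in token_ranges:
                        nodes.update(token_range.endpoints)
                    if nodes:
                        self._nodes = list(nodes)
                        random.shuffle(self._nodes)
                        self._index = 0
                        self._last_refresh = time.time()
                        return
                except Exception:
                    continue  # seed unreachable; try the next one
            # All seeds failed: keep the previous list rather than going empty.

        def next_host(self):
            """Round-robin over the current node list, refreshing when stale."""
            if time.time() - self._last_refresh > REFRESH_SECS:
                self.refresh()
            host = self._nodes[self._index % len(self._nodes)]
            self._index += 1
            return host

The same refreshed node list could also be written out as an haproxy backend
section and the load balancer reloaded, which is essentially the dynamically
updated haproxy configuration described at the top of the thread.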