Looks like I've got the wrong idea here or something is wrong; I'm guessing the former.
describe_ring() does not seem to return the same level of info as nodetool ring: it includes nodes that are down. E.g. I ran this on a node...
nodetool -h localhost -p 8080 ring
Address Status State Load Token
170141183460469231731687303715884105728
192.168.34.26 Up Normal 35.07 GB 42535295865117307932921825928971026432
192.168.34.27 Up Normal 34.63 GB 85070591730234615865843651857942052864
192.168.34.28 Up Normal 34.89 GB 127605887595351923798765477786913079296
192.168.34.29 Down Normal 36.55 GB 170141183460469231731687303715884105728
I also ran describe_ring() against the same node:
[TokenRangeValue(start_token='170141183460469231731687303715884105728', end_token='42535295865117307932921825928971026432', endpoints=['192.168.34.28', '192.168.34.26', '192.168.34.27']),
TokenRangeValue(start_token='85070591730234615865843651857942052864', end_token='127605887595351923798765477786913079296', endpoints=['192.168.34.28', '192.168.34.29', '192.168.34.26']),
TokenRangeValue(start_token='127605887595351923798765477786913079296', end_token='170141183460469231731687303715884105728', endpoints=['192.168.34.29', '192.168.34.26', '192.168.34.27']),
TokenRangeValue(start_token='42535295865117307932921825928971026432', end_token='85070591730234615865843651857942052864', endpoints=['192.168.34.28', '192.168.34.29', '192.168.34.27'])]
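Given that describe_ring() returns down endpoints too, a client has to overlay its own liveness view. A minimal sketch of that idea, where the TokenRange namedtuple mimics the shape of the Thrift result and the known-down set is assumed to come from failed connection attempts (both are illustrative, not part of the API):

```python
from collections import namedtuple

# Stand-in for the TokenRange structs returned by describe_ring().
TokenRange = namedtuple('TokenRange', ['start_token', 'end_token', 'endpoints'])


def live_nodes(token_ranges, known_down):
    """Flatten describe_ring() output into a unique, ordered node list,
    then drop nodes we believe are down. describe_ring() itself does not
    filter them, so liveness must be tracked separately."""
    nodes = []
    for tr in token_ranges:
        for ip in tr.endpoints:
            if ip not in nodes:
                nodes.append(ip)
    return [ip for ip in nodes if ip not in known_down]
```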
Need to do some more investigation.
Aaron
On 01 Sep, 2010, at 02:37 PM, Dave Viner <davevi...@pobox.com> wrote:
I know this was recently bashed on the user list, but I still like the idea
of using haproxy (or other load balancer).
I think it would be very useful to have a simple haproxy configuration which
used a list of ip's to start, then dynamically updated itself using
describe_ring() to keep a relatively accurate up-to-the-minute list of which
nodes are available.
I assume this is roughly how the connection pooling classes would attempt
it.
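A rough sketch of that haproxy idea: regenerate the backend section from whatever node list describe_ring() gave you, then reload haproxy out of band. The directives below are standard haproxy 1.x TCP-mode syntax, but the template, server naming, and port are illustrative rather than a tested production config:

```python
# Hypothetical template; 9160 is the default Thrift port.
HAPROXY_TEMPLATE = """\
listen cassandra 0.0.0.0:9160
    mode tcp
    balance roundrobin
{servers}"""


def haproxy_config(nodes, thrift_port=9160):
    """Render an haproxy listen section from a list of node IPs,
    one 'server' line per node with health checking enabled."""
    lines = [
        "    server cass{0} {1}:{2} check".format(i, ip, thrift_port)
        for i, ip in enumerate(nodes)
    ]
    return HAPROXY_TEMPLATE.format(servers="\n".join(lines))
```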
Dave Viner
On Tue, Aug 31, 2010 at 7:22 PM, Aaron Morton <aa...@thelastpickle.com>wrote:
> Oops, should be
>
> Round robin DNS is just the *wrong* approach, when doing things like
> draining a node or turning a box off.
>
>
> Aaron
>
> On 01 Sep, 2010, at 02:18 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
>
> I agree.
>
> Round robin DNS is just the right approach, when doing things like draining
> a node or turning a box off.
>
> Am moving to use describe_ring and either a seed list or DNS. My concern
> with using a seed list in the client is the chance that the seeds are down;
> I have 2 seeds in a 4 node cluster, so with both seeds down the cluster
> should still have quorum.
>
> So I may go with regular round robin DNS when the client needs to refresh its
> list of nodes, e.g. when starting up.
>
> Have the client hold a list of the up nodes returned from describe_ring(),
> that it shuffles and then round robins. The list would be refreshed
> periodically.
>
> I also have the client periodically obtain a new connection to avoid the
> connections getting clumped in one area of the ring.
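The shuffle-then-round-robin scheme Aaron describes can be sketched as below. `fetch_nodes` stands in for a real describe_ring() call, and the refresh interval is an assumed parameter; this is a minimal single-threaded illustration, not the in-house client itself:

```python
import itertools
import random
import time


class NodePool(object):
    """Hold the up-node list, shuffled so different clients start at
    different points on the ring, and round-robin over it. The list is
    refreshed once it is older than refresh_secs."""

    def __init__(self, fetch_nodes, refresh_secs=60):
        self._fetch = fetch_nodes      # e.g. wraps describe_ring()
        self._refresh_secs = refresh_secs
        self._refresh()

    def _refresh(self):
        nodes = list(self._fetch())
        random.shuffle(nodes)          # avoid clumping on one node
        self._cycle = itertools.cycle(nodes)
        self._stamp = time.time()

    def next_node(self):
        if time.time() - self._stamp > self._refresh_secs:
            self._refresh()
        return next(self._cycle)
```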
>
> (I'm working on an in house Python client that I hope to make public).
>
> Aaron
>
> On 01 Sep, 2010, at 02:04 PM, Dan Washusen <d...@reactive.org> wrote:
>
> Pelops provides a connection pooling impl that's using (or attempting to
> use) the second approach, but to be honest it needs a significant amount of
> testing before I'd be willing to go into production with it...
>
> IMO, the connection pooling/node failure/etc logic is by far the most
> complex part of a client library. It would be excellent if we could avoid
> re-inventing the wheel when attempting to create a solution to solve it.
>
> Cheers,
> Dan
>
> On Wed, Sep 1, 2010 at 11:35 AM, Aaron Morton <aa...@thelastpickle.com
> >wrote:
>
> > When I first started writing code against the thrift API the FAQ
> > recommended using a round robin DNS to select nodes
> > http://wiki.apache.org/cassandra/FAQ#node_clients_connect_to
> >
> > The other
> > day Ben said something like "well behaved clients use describe_ring to
> keep
> > track of running nodes".
> > http://www.mail-archive.com/u...@cassandra.apache.org/msg05588.html
> >
> > So I am wondering what approach people are taking to detect cluster
> > membership.
> >
> > 1. Round Robin
> >
> > 2. List seeds in app config, connect to a seed, use describe_ring.
> >
> > 3. Round robin and describe_ring
> >
> > One issue I've found with round robin is that if the machine is powered
> > off, it can take a while for the network to work out there is no ARP entry
> for
> > the IP. This may just be a result of the network here; I have not looked
> into it
> > too far.
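One client-side way to bound that ARP delay is an explicit connect timeout, so a powered-off node fails fast instead of blocking on the OS default. A minimal sketch, with the Thrift port and timeout value as assumptions:

```python
import socket


def try_connect(host, port=9160, timeout=0.5):
    """Probe a node's Thrift port with a short timeout. Returns True if
    the TCP connect succeeds, False on refusal or timeout, so a dead or
    unreachable node costs at most `timeout` seconds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
        sock.close()
        return True
    except (socket.error, socket.timeout):
        return False
```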
> >
> > cheers
> > Aaron
> >
>
>