Re: finding Cassandra servers

2010-03-03 Thread Gary Dusbabek
2010/3/3 Ted Zlatanov t...@lifelogs.com:
 On Mon, 01 Mar 2010 12:15:11 -0600 Ted Zlatanov t...@lifelogs.com wrote:

 TZ I need to find Cassandra servers on my network from several types of
 TZ clients and platforms.  The goal is to make adding and removing servers
 TZ painless, assuming a leading window of at least 1 hour.  The discovery
 TZ should be automatic and distributed.  I want to minimize management.

 TZ Round-robin DNS with a 1-hour TTL would work all right, but I was
 TZ wondering if Bonjour/Zeroconf is a better idea and what else should I
 TZ consider.

 So... is this a dumb question or is there no good answer currently to
 discovering Cassandra servers?

 Ted



Nothing in the current codebase meets these needs.  But then again,
Cassandra doesn't need the described functionality.  Zeroconf confines
itself to a single subnet (it would require router configuration to
forward multicast across subnets).  RRDNS would work, but something
would need to keep it updated when servers go away (it wouldn't be
automatic).

If you can count on one of your seed nodes to be up, RRDNS could be
used to connect to one of them and fetch the token range list.  To do
this, create a Thrift client and call describe_ring.  In older
versions you can get a jsonified endpoint map by calling
get_string_property('token map').
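In rough Python, the older token-map route might look like the sketch below.  The flat `{token: endpoint}` JSON shape is an assumption (the wire format varies by version), and the sample input is hypothetical:

```python
import json

def endpoints_from_token_map(token_map_json):
    """Flatten a jsonified token -> endpoint map, as returned by
    get_string_property('token map') in older releases, into a unique
    host list.  A flat {token: endpoint} JSON object is assumed."""
    token_map = json.loads(token_map_json)
    hosts = []
    for endpoint in token_map.values():
        if endpoint not in hosts:  # keep first occurrence only
            hosts.append(endpoint)
    return hosts

# Hypothetical sample in the assumed format:
sample = '{"0": "10.0.0.1", "42": "10.0.0.2", "85": "10.0.0.1"}'
print(endpoints_from_token_map(sample))  # → ['10.0.0.1', '10.0.0.2']
```

From there the client can open a connection to any host in the returned list instead of going back through DNS.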

Hope that helps.

Gary.


Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 3 Mar 2010 08:41:18 -0600 Gary Dusbabek gdusba...@gmail.com wrote: 

GD It wouldn't be a lot of work for you to write an mDNS service that would
GD query the seeds for endpoints and publish them to interested clients.
GD It could go in contrib.

This requires knowledge of the seeds so I need to at least look in
storage-conf.xml to find them.  Are you saying there's no chance of
Cassandra nodes (or just seeds) announcing themselves, even if it's
optional behavior that's off by default?  If so I'll do the contrib mDNS
service but it really seems like a backward way to do things.

Ted



Re: finding Cassandra servers

2010-03-03 Thread Gary Dusbabek
2010/3/3 Ted Zlatanov t...@lifelogs.com:
 On Wed, 3 Mar 2010 08:41:18 -0600 Gary Dusbabek gdusba...@gmail.com wrote:

 GD It wouldn't be a lot of work for you to write an mDNS service that would
 GD query the seeds for endpoints and publish them to interested clients.
 GD It could go in contrib.

 This requires knowledge of the seeds so I need to at least look in
 storage-conf.xml to find them.  Are you saying there's no chance of
 Cassandra nodes (or just seeds) announcing themselves, even if it's
 optional behavior that's off by default?  If so I'll do the contrib mDNS
 service but it really seems like a backward way to do things.

 Ted



Nodes already announce themselves, but only to the cluster.  That's
what gossip is for.  I don't see the point of making the announcement
to the subnet at large.

The decision rests with the community.  Obviously, if there is enough
merit to this work, it will find its way into the codebase.  I just
think it falls into the realm of shiny-and-neat (mDNS and automatic
discovery are cool) and not in the realm of pragmatic (not reliable
across subnets).

Gary.


Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 3 Mar 2010 09:32:33 -0600 Gary Dusbabek gdusba...@gmail.com wrote: 

GD 2010/3/3 Ted Zlatanov t...@lifelogs.com:
 This requires knowledge of the seeds so I need to at least look in
 storage-conf.xml to find them.  Are you saying there's no chance of
 Cassandra nodes (or just seeds) announcing themselves, even if it's
 optional behavior that's off by default?  If so I'll do the contrib mDNS
 service but it really seems like a backward way to do things.

GD Nodes already announce themselves, but only to the cluster.  That's
GD what gossip is for.  I don't see the point of making the announcement
GD to the subnet at large.

GD The decision rests with the community.  Obviously, if there is enough
GD merit to this work, it will find its way into the codebase.  I just
GD think it falls into the realm of shiny-and-neat (mDNS and automatic
GD discovery are cool) and not in the realm of pragmatic (not reliable
GD across subnets).

It's currently not possible to find a usable node without running
centralized services such as RRDNS or the special mDNS broadcaster you
suggested.  I don't think this is just shiny-and-neat; it's a matter of
running in a truly decentralized environment (which Cassandra is supposed
to fit into).

The subnet limitation is not an issue in my environment (we forward
much, much larger multicast volumes routinely) but I understand routing
multicasts is not everyone's cup of tea.  IMHO it's better than the
current situation and, mDNS being a well-known standard, can at least be
handled at the switch level without code changes.

I can do a patch+ticket for this in the core, making it optional and off
by default, or do the same for a contrib/ service as you suggested.  So
I'd appreciate a +1/-1 quick vote on whether this can go in the core to
save me from rewriting the patch later.

Ted



Re: finding Cassandra servers

2010-03-03 Thread Eric Evans
On Wed, 2010-03-03 at 10:05 -0600, Ted Zlatanov wrote:
 I can do a patch+ticket for this in the core, making it optional and
 off by default, or do the same for a contrib/ service as you
 suggested.  So I'd appreciate a +1/-1 quick vote on whether this can
 go in the core to save me from rewriting the patch later.

I don't think voting is going to help. Voting doesn't do anything to
develop consensus and it seems pretty clear that no consensus exists
here.

It's entirely possible that you've identified a problem that others
can't see, or haven't yet encountered. I don't see it, but then maybe
I'm just thick.

Either way, if you think this is important, the onus is on you to
demonstrate the merit of your idea and contrib/ or a github project is
one way to do that (the latter has the advantage of not needing to rely
on anyone else).


-- 
Eric Evans
eev...@rackspace.com



Re: finding Cassandra servers

2010-03-03 Thread Christopher Brind
So is the current general practice to connect to a known node, e.g. by IP
address?

If so, what happens if that node is down?  Is the entire cluster effectively
broken at that point?

Or do clients simply maintain a list of nodes and just connect to the first
available in the list?

Thanks in advance.

Cheers
Chris

On 3 Mar 2010 16:43, Eric Evans eev...@rackspace.com wrote:

On Wed, 2010-03-03 at 10:05 -0600, Ted Zlatanov wrote:
 I can do a patch+ticket for this in the cor...
I don't think voting is going to help. Voting doesn't do anything to
develop consensus and it seems pretty clear that no consensus exists
here.

It's entirely possible that you've identified a problem that others
can't see, or haven't yet encountered. I don't see it, but then maybe
I'm just thick.

Either way, if you think this is important, the onus is on you to
demonstrate the merit of your idea and contrib/ or a github project is
one way to do that (the latter has the advantage of not needing to rely
on anyone else).


--
Eric Evans
eev...@rackspace.com


Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 03 Mar 2010 10:43:19 -0600 Eric Evans eev...@rackspace.com wrote: 

EE It's entirely possible that you've identified a problem that others
EE can't see, or haven't yet encountered. I don't see it, but then maybe
EE I'm just thick.

Getting back to my original question, how do you (and others) find
usable Cassandra nodes from your clients?  It's supposed to be a
decentralized database and yet I only know of centralized ways (RRDNS)
to locate nodes.  Contacting the seeds is not a decentralized solution
and sidesteps the issue.  It also complicates the client unnecessarily.

EE Either way, if you think this is important, the onus is on you to
EE demonstrate the merit of your idea and contrib/ or a github project is
EE one way to do that (the latter has the advantage of not needing to rely
EE on anyone else).

I'll submit a core patch in a jira ticket.  It's much easier than
writing a full application and IMHO much more useful because it just
works.  If it gets rejected I'll move to contrib/ as you and Gary
suggested.

Ted



Re: finding Cassandra servers

2010-03-03 Thread Ian Holsman
+1 on Eric's comments.
We could create a branch or git fork where you guys could develop it,
and if it reaches a usable state and others find it interesting it
could then be integrated.


On 3/3/10, Eric Evans eev...@rackspace.com wrote:
 On Wed, 2010-03-03 at 10:05 -0600, Ted Zlatanov wrote:
 I can do a patch+ticket for this in the core, making it optional and
 off by default, or do the same for a contrib/ service as you
 suggested.  So I'd appreciate a +1/-1 quick vote on whether this can
 go in the core to save me from rewriting the patch later.

 I don't think voting is going to help. Voting doesn't do anything to
 develop consensus and it seems pretty clear that no consensus exists
 here.

 It's entirely possible that you've identified a problem that others
 can't see, or haven't yet encountered. I don't see it, but then maybe
 I'm just thick.

 Either way, if you think this is important, the onus is on you to
 demonstrate the merit of your idea and contrib/ or a github project is
 one way to do that (the latter has the advantage of not needing to rely
 on anyone else).


 --
 Eric Evans
 eev...@rackspace.com



-- 
Sent from my mobile device


Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 3 Mar 2010 09:04:37 -0800 Ryan King r...@twitter.com wrote: 

RK Something like RRDNS is no more complex than managing a list of seed nodes.

How do your clients at Twitter find server nodes?  Do you just run them
local to each node?

My concern is that both RRDNS and seed node lists are vulnerable to
individual node failure.  Updating DNS when a node dies means you have
to wait until the TTL expires, and if you lower the TTL too much your
server will get killed.

With seed node lists, if I get unlucky I'd be trying to hit a downed
node in which case I may as well just use RRDNS and deal with connection
failure from the start.

Ted



Re: finding Cassandra servers

2010-03-03 Thread Chris Goffinet
At Digg we have automated infrastructure. We use Puppet + our own in-house
system that allows us to query pools of nodes for 'seeds'. Config files like
storage-conf.xml are auto-generated on the fly, and we randomly pick a set of
seeds.

Seeds can be per datacenter as well. As soon as a machine is decommissioned, it
no longer gets picked as a seed.
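The seed-picking step can be sketched in a few lines of Python.  Everything here is illustrative (host names, pool size, and the `pick_seeds` helper are made up), not Digg's actual tooling:

```python
import random

def pick_seeds(pool, decommissioned, k=3):
    """Choose k random seed candidates from a host pool, skipping
    decommissioned machines -- the same idea as auto-generating the
    seeds section of storage-conf.xml at deploy time."""
    live = [h for h in pool if h not in decommissioned]
    return sorted(random.sample(live, min(k, len(live))))

pool = ["cass01", "cass02", "cass03", "cass04"]
seeds = pick_seeds(pool, decommissioned={"cass03"}, k=2)
print(seeds)  # two live hosts, never cass03
```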

-Chris

On Mar 3, 2010, at 9:12 AM, Ted Zlatanov wrote:

 On Wed, 3 Mar 2010 09:04:37 -0800 Ryan King r...@twitter.com wrote: 
 
 RK Something like RRDNS is no more complex than managing a list of seed 
 nodes.
 
 How do your clients at Twitter find server nodes?  Do you just run them
 local to each node?
 
 My concern is that both RRDNS and seed node lists are vulnerable to
 individual node failure.  Updating DNS when a node dies means you have
 to wait until the TTL expires, and if you lower the TTL too much your
 server will get killed.
 
 With seed node lists, if I get unlucky I'd be trying to hit a downed
 node in which case I may as well just use RRDNS and deal with connection
 failure from the start.
 
 Ted
 



Re: finding Cassandra servers

2010-03-03 Thread Brandon Williams
2010/3/3 Ted Zlatanov t...@lifelogs.com

 On Wed, 3 Mar 2010 09:04:37 -0800 Ryan King r...@twitter.com wrote:

 RK Something like RRDNS is no more complex than managing a list of seed
 nodes.

 My concern is that both RRDNS and seed node lists are vulnerable to
 individual node failure.


They're not.  That's why they're lists.  If one doesn't work out, move along
to the next.


  Updating DNS when a node dies means you have
 to wait until the TTL expires, and if you lower the TTL too much your
 server will get killed.


Don't do that.  Make your clients keep trying.  Any failure is likely to be
transient anyway, so running around messing with DNS every time a machine is
offline doesn't make much sense.
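The keep-trying idea is just a loop over the host list.  A minimal sketch, where `connect` is a stand-in for whatever opens your Thrift transport (the stub below is fake, for illustration only):

```python
import random

def connect_any(hosts, connect, retries=2):
    """Walk the host list (in random order) until one connect() call
    succeeds, treating individual failures as transient; give up only
    after every host has failed `retries` times."""
    last_error = None
    for _ in range(retries):
        for host in random.sample(hosts, len(hosts)):
            try:
                return connect(host)
            except OSError as err:
                last_error = err
    raise ConnectionError("no Cassandra node reachable") from last_error

def fake_connect(host):  # stub: pretend only 10.0.0.2 is up
    if host != "10.0.0.2":
        raise OSError("connection refused")
    return host

print(connect_any(["10.0.0.1", "10.0.0.2"], fake_connect))  # → 10.0.0.2
```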

-Brandon


Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 3 Mar 2010 12:08:06 -0500 Ian Holsman i...@holsman.net wrote: 

IH We could create a branch or git fork where you guys could develop it,
IH and if it reaches a usable state and others find it interesting it
IH could then be integrated.

Thanks, Ian.  Would it be OK to do it as a patch in
http://issues.apache.org/jira/browse/CASSANDRA-846?  Or is there a
reason for using a branch/fork instead?

Ted



Re: finding Cassandra servers

2010-03-03 Thread Jonathan Ellis
We appear to be reaching consensus that this is solving a non-problem,
so I have closed that ticket.

2010/3/3 Ted Zlatanov t...@lifelogs.com:
 On Wed, 3 Mar 2010 12:08:06 -0500 Ian Holsman i...@holsman.net wrote:

 IH We could create a branch or git fork where you guys could develop it,
 IH and if it reaches a usable state and others find it interesting it
 IH could then be integrated.

 Thanks, Ian.  Would it be OK to do it as a patch in
 http://issues.apache.org/jira/browse/CASSANDRA-846?  Or is there a
 reason for using a branch/fork instead?

 Ted




Re: finding Cassandra servers

2010-03-03 Thread Eric Evans
On Wed, 2010-03-03 at 16:49 +, Christopher Brind wrote:
 So is the current general practice to connect to a known node, e.g. by
 ip address?

There are so many ways you could tackle this but...

If you're talking about provisioning/startup of new nodes, just use the
IPs of 2-4 nodes in the seeds section of configs.

If you're talking about clients, then round-robin DNS is one option.
Load-balancers are another. Either could be used with a subset of
higher-capacity/higher-availability nodes, or for the entire cluster.

 If so, what happens if that node is down?  Is the entire cluster
 effectively broken at that point?

You don't use just one node, see above.

 Or do clients simply maintain a list of nodes and just connect to the
 first available in the list?

It's possible to obtain a list of nodes over Thrift. So, yet another
option would be to use a short-list of well-known nodes (discovered via
round-robin DNS for example), to obtain a current node list and
distribute among them.
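That bootstrap-then-distribute pattern might look like this in Python.  `fetch_nodes` stands in for a describe_ring-style Thrift call; the host names are made up:

```python
import itertools
import random

def node_rotation(well_known, fetch_nodes):
    """Bootstrap from a short list of well-known hosts (e.g. the names
    behind a round-robin DNS record), ask the first reachable one for
    the full node list, and spread subsequent requests across it."""
    for host in well_known:
        try:
            nodes = list(fetch_nodes(host))
            break
        except OSError:
            continue  # this bootstrap host is down; try the next
    else:
        nodes = list(well_known)  # all failed: fall back to the short list
    random.shuffle(nodes)
    return itertools.cycle(nodes)

rotation = node_rotation(["seed1"], lambda h: ["n1", "n2", "n3"])
print([next(rotation) for _ in range(4)])  # visits all three, then wraps
```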

-- 
Eric Evans
eev...@rackspace.com



Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 3 Mar 2010 09:19:28 -0800 Chris Goffinet goffi...@digg.com wrote: 

CG At Digg we have automated infrastructure. We use Puppet + our own
CG in-house system that allows us to query pools of nodes for
CG 'seeds'. Config files like storage-conf.xml are auto generated on
CG the fly, and we randomly pick a set of seeds.

CG Seeds can be per datacenter as well. As soon as a machine is
CG decommissioned, it no longer gets picked as seed.

On Wed, 3 Mar 2010 11:20:07 -0600 Brandon Williams dri...@gmail.com wrote: 

BW 2010/3/3 Ted Zlatanov t...@lifelogs.com
 My concern is that both RRDNS and seed node lists are vulnerable to
 individual node failure.

BW They're not.  That's why they're lists.  If one doesn't work out, move along
BW to the next.

 Updating DNS when a node dies means you have
 to wait until the TTL expires, and if you lower the TTL too much your
 server will get killed.

BW Don't do that.  Make your clients keep trying.  Any failure is likely to be
BW transient anyway, so running around messing with DNS every time a machine is
BW offline doesn't make much sense.

Thanks for the advice.  I am probably being paranoid about the
connection timeout; we're using Puppet as well so I'll just use it to
generate the seeds portion of the config file *and* a plain list of seed
nodes that each client can retrieve (so they don't have to parse the
XML).

On Wed, 3 Mar 2010 11:22:45 -0600 Jonathan Ellis jbel...@gmail.com wrote: 

JE We appear to be reaching consensus that this is solving a non-problem,
JE so I have closed that ticket.

Sure.  Thanks for everyone's opinion, I really appreciate it.

Ted



Re: finding Cassandra servers

2010-03-03 Thread Ryan King
2010/3/3 Ted Zlatanov t...@lifelogs.com:
 On Wed, 3 Mar 2010 09:04:37 -0800 Ryan King r...@twitter.com wrote:

 RK Something like RRDNS is no more complex than managing a list of seed 
 nodes.

 How do your clients at Twitter find server nodes?  Do you just run them
 local to each node?

RRDNS + loading the token map to discover more servers. Our
implementation is open source:
http://github.com/fauna/cassandra/blob/master/lib/cassandra/cassandra.rb
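The RRDNS half of that is just expanding one name into all its A records.  A small sketch (9160 was Cassandra's default Thrift port at the time; adjust as needed):

```python
import socket

def resolve_all(name, port=9160):
    """Expand a round-robin DNS name into all of its IPv4 addresses,
    the way a client would before picking an initial node."""
    infos = socket.getaddrinfo(name, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr for IPv4 is (address, port).
    return sorted({sockaddr[0] for _, _, _, _, sockaddr in infos})

print(resolve_all("localhost"))
```

A client would then pick one of these addresses, fetch the token map from it, and discover the rest of the ring.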

 My concern is that both RRDNS and seed node lists are vulnerable to
 individual node failure.  Updating DNS when a node dies means you have
 to wait until the TTL expires, and if you lower the TTL too much your
 server will get killed.

If you combine it with a fault-tolerant Thrift client and loading the
token map, it works fine.

 With seed node lists, if I get unlucky I'd be trying to hit a downed
 node in which case I may as well just use RRDNS and deal with connection
 failure from the start.

Why would you not deal with connection failure?

-ryan


Re: finding Cassandra servers

2010-03-03 Thread Ryan King
On Wed, Mar 3, 2010 at 9:27 AM, Eric Evans eev...@rackspace.com wrote:
 On Wed, 2010-03-03 at 16:49 +, Christopher Brind wrote:
 So is the current general practice to connect to a known node, e.g. by
 ip address?

 There are so many ways you could tackle this but...

 If you're talking about provisioning/startup of new nodes, just use the
 IPs of 2-4 nodes in the seeds section of configs.

 If you're talking about clients, then round-robin DNS is one option.
 Load-balancers are another. Either could be used with a subset of
 higher-capacity/higher-availability nodes, or for the entire cluster.

 If so, what happens if that node is down?  Is the entire cluster
 effectively broken at that point?

 You don't use just one node, see above.

 Or do clients simply maintain a list of nodes and just connect to the
 first available in the list?

 It's possible to obtain a list of nodes over Thrift. So, yet another
 option would be to use a short-list of well-known nodes (discovered via
 round-robin DNS for example), to obtain a current node list and
 distribute among them.

This is exactly what we do.

-ryan


Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 3 Mar 2010 09:35:31 -0800 Ryan King r...@twitter.com wrote: 

 With seed node lists, if I get unlucky I'd be trying to hit a downed
 node in which case I may as well just use RRDNS and deal with connection
 failure from the start.

RK Why would you not deal with connection failure?

I mean it's simpler to deal with one type of connection failure (to any
node in RRDNS) than with several (to a seed node to get the node list,
then to a random active node from that list).  Sorry if my phrasing was
confusing.

Ted