[jira] [Comment Edited] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-15 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16400082#comment-16400082
 ] 

Oleksandr Shulgin edited comment on CASSANDRA-5836 at 3/15/18 8:35 AM:
---

{quote}system.available_ranges works off keyspaces, so rebuild will still work 
fine as long as you didn't add RF before provisioning the DC (e.g you didn't 
bootstrap the NTS keyspaces){quote}

You are correct, I had a false assumption here.  But then I don't see at all 
where the recommendation to set {{auto_bootstrap=false}} for a new DC comes 
from.  I believed the reason was that {{nodetool rebuild}} won't work 
otherwise, but apparently that's not the case.

If we can simply drop this recommendation from the docs that would be a great 
thing, IMO.  By following the doc in its current form it is not unlikely that 
one can accidentally add some nodes with {{auto_bootstrap=false}} to an 
*existing* DC, simply by messing up the DC suffix parameter.  With the default 
setting of {{auto_bootstrap}} such a configuration error is mostly harmless and 
is easy to roll back.
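
The accident described above could in principle be caught by a pre-start check 
in the automation that launches nodes.  A minimal sketch, assuming the 
automation already knows the node's resolved DC name and the set of DC names 
present in the cluster; the function name and its parameters are hypothetical 
and not part of Cassandra:

```python
def start_config_looks_safe(node_dc, existing_dcs, auto_bootstrap):
    """Return True if starting a node with these settings looks safe.

    Hypothetical pre-start check, not part of Cassandra: it flags the
    case where auto_bootstrap is false but the node would join a
    datacenter that already has members.
    """
    if auto_bootstrap:
        # Default setting: the node streams its data while joining.
        return True
    # auto_bootstrap=false is only expected for a brand-new DC here.
    return node_dc not in existing_dcs
```

With a mistyped DC suffix the node resolves to an already populated DC, so the 
check returns False and the automation can refuse to start it.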

{quote}> At the same time, new cluster startup process can be arbitrarily 
complex
No, it can't. {quote}

Unfortunately, it already is.  For example, look at our home-grown automation 
code to create new Cassandra clusters on AWS: 
https://github.com/zalando-stups/planb-cassandra/blob/master/planb/create_cluster.py
  That's already close to 1,000 lines of Python.

{quote}Cassandra is hard enough to use as it is, and we really shouldn't be 
making operations more complex.{quote}

Creating a new cluster is the operation with the least potential impact of 
all, and you do it only once in the lifetime of a cluster.  I would go as far 
as saying it doesn't even belong to "ops".  Restarts, upgrades, bootstrapping 
new nodes and DCs: these are the operations, and we shouldn't make 
"introduction to Cassandra" easier at the cost of making *these* more complex 
or risky.

{quote}Far more logical to be able to say that "All nodes will respect the 
auto_bootstrap setting regardless of their configuration". The only caveat is 
that the first node won't bootstrap,..{quote}

That's already a contradiction, don't you think?  And more precisely it should 
be spelled as "if a node *believes* it is the very first one".  A big question 
to me still: can this be done in the code reliably?

{quote}... but to users this is irrelevant and they don't need to know about 
it.{quote}

This attitude is exactly what makes Cassandra hard to use in my experience. :(  
I cannot even count the number of times when I had to dive deeply into the 
source code trying to figure out some detail which was not properly documented, 
because the devs thought the same: users don't need to know about it...



was (Author: oshulgin):
{quote}system.available_ranges works off keyspaces, so rebuild will still work 
fine as long as you didn't add RF before provisioning the DC (e.g you didn't 
bootstrap the NTS keyspaces){quote}

You are correct, I had a false assumption here.  But then I don't see at all 
where the recommendation to set {{auto_bootstrap=false}} for a new DC comes 
from.  I believed the reason was that {{nodetool rebuild}} won't work 
otherwise, but apparently that's not the case.

If we can simply drop this recommendation from the docs that would be a great 
thing, IMO.  By following the doc in its current form it is not unlikely that 
one can accidentally add some nodes with {{auto_bootstrap=false}} to an 
*existing* DC, simply by messing up the DC suffix parameter.  With the default 
setting of {{auto_bootstrap}} such a configuration error is mostly harmless and 
is easy to roll back.

{quote}> At the same time, new cluster startup process can be arbitrarily 
complex
No, it can't. {quote}

Unfortunately, it already is.  For example, look at our home-grown automation 
code to create new Cassandra clusters on AWS: 
https://github.com/zalando-stups/planb-cassandra/blob/master/planb/create_cluster.py
  That's already close to 1,000 lines of Python.

{quote}Cassandra is hard enough to use as it is, and we really shouldn't be 
making operations more complex.{quote}

Creating a new cluster is the operation with the least potential impact of 
all, and you do it only once in the lifetime of a cluster.  I would go as far 
as saying it doesn't even belong to "ops".  Restarts, upgrades, bootstrapping 
new nodes and DCs: these are the operations, and we shouldn't make 
"introduction to Cassandra" easier at the cost of making *these* more complex 
or risky.

{quote}Far more logical to be able to say that "All nodes will respect the 
auto_bootstrap setting regardless of their configuration". The only caveat is 
that the first node won't bootstrap,..{quote}

That's already a contradiction, don't you think?  And more precisely it should 
be spelled as "if a node *believes

[jira] [Comment Edited] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-04 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385559#comment-16385559
 ] 

Kurt Greaves edited comment on CASSANDRA-5836 at 3/5/18 4:42 AM:
-

Glad some more discussion is happening again here. This has always been a pain 
point for operators. I doubt many people can actually list every single 
different startup case in Cassandra, there are a hell of a lot.

On the seed issue; in the past I've gone so far as to write a patch for making 
only the first seed a special case, such as already mentioned by [~oshulgin] to 
at least fix the issue of maintenance on seed nodes. There's a patch 
[here|https://github.com/apache/cassandra/compare/trunk...kgreav:13851-extension-3.11]
 if anyone cares, think it's more or less working and only missing tests at the 
moment (but I'm probably wrong). Note it's built on my patch for 
CASSANDRA-13851, so not all code there is relevant.
{quote}My understanding is that the 'seed node' role has a significant 
initial-topology-discovery responsibility which I have not seen mentioned in 
recent discussions.
{quote}
Not my understanding. A seed simply defines a node to connect to in order to 
join the cluster. No topology information is transmitted between the seed and 
the node contacting it. If it were, this would bring its own complexities and 
likely make Gossip really expensive (especially on large clusters).
{quote}Also, as of my last knowledge of this code, a given node will gossip 
with a Seed node more frequently than its other peers, which I believe is "just 
an optimization of gossip" but seems notable.
{quote}
Yep, just an optimisation, but an important one. For the most part, however, it 
shouldn't have any effect on the bootstrap case.
{quote}I also recall past dev discussion (with driftx?) suggesting that the 
"correct" solution in their view is an external seed provider.
{quote}
Yeah, the correct solution is an external seed provider or not breaking your 
config management, but we can still do better here. Especially in the replaces 
case, and probably the new DC case.
{quote}Another case when you add seed nodes is when adding a new DC. In this 
case they are not the first ones to start so they could bootstrap, but most of 
the time this is not what you want, so you set auto_bootstrap=false for every 
node in the new DC, including the new seeds.
{quote}
It's worth noting here that there is the case of {{SimpleStrategy}} in which 
you wouldn't want auto_bootstrap=false (this affects auth, traces, 
system_distributed). This is specifically why you would want every node to 
bootstrap in a new DC (including seeds). The alternative is to get rid of 
{{SimpleStrategy}} (or at least stop using it as a default).
{quote}In the case where seed nodes cannot be contacted, how do you determine 
if this is the first node in a cluster (so we should special case and skip 
bootstrap) vs a mis-configuration or other seeds are down issues and therefore 
the bootstrap should fail?
{quote}
If the listed seed isn't itself then you fail. This is how it currently works 
as well. That is, the first node in the cluster has itself as a seed and also 
can't contact any other seeds in its seed list. I'm pretty sure my patch above 
works this way: if there are other seeds, they should be present in the 
{{endpointShadowStateMap}} after the shadow round (SR). There may be some edge 
cases to think of here though, like starting multiple seeds at the same time.
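
The rule described above can be written down as a small decision function. 
This is only a sketch of the rule as stated in this comment, not the code from 
the patch: the function name, its parameters and the three return values are 
made up for illustration, and {{reachable_seeds}} stands in for whatever the 
shadow round discovered.

```python
def startup_decision(own_address, seeds, reachable_seeds):
    """Decide how a starting node should proceed.

    Returns "first-node" (skip bootstrap), "bootstrap", or "fail".
    Illustrative only: names and return values are hypothetical.
    """
    other_seeds = [s for s in seeds if s != own_address]
    if any(s in reachable_seeds for s in other_seeds):
        # At least one other seed answered: join the existing cluster.
        return "bootstrap"
    if own_address in seeds:
        # Lists itself as a seed and no other seed answered.
        return "first-node"
    # Not a seed itself and no seed reachable: refuse to start.
    return "fail"
```

The sketch does nothing about the edge case of several seeds starting at the 
same time: each of them would see no other seed and conclude it is the first 
node.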

Also related is CASSANDRA-14073, which will fix the case where you replace a 
seed node and it doesn't bootstrap. This one is more important IMO as it's more 
likely for config management not to handle this case.


was (Author: kurtg):
Glad some more discussion is happening again here. This has always been a pain 
point for operators. I doubt many people can actually list every single 
different startup case in Cassandra, there are a hell of a lot.

On the seed issue; in the past I've gone so far as to write a patch for making 
only the first seed a special case, such as already mentioned by [~oshulgin] to 
at least fix the issue of replacing seed nodes. There's a patch 
[here|https://github.com/apache/cassandra/compare/trunk...kgreav:13851-extension-3.11]
 if anyone cares, think it's more or less working and only missing tests at the 
moment (but I'm probably wrong). Note it's built on my patch for 
CASSANDRA-13851, so not all code there is relevant.
{quote}My understanding is that the 'seed node' role has a significant 
initial-topology-discovery responsibility which I have not seen mentioned in 
recent discussions.
{quote}
Not my understanding. A seed simply defines a node to connect to in order to 
join the cluster. No topology information is transmitted between the seed and 
the node contacting it. If it were, this would bring its own complexities and 
likely make Gossip really expensive (especially on large clusters).

[jira] [Comment Edited] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-04 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385559#comment-16385559
 ] 

Kurt Greaves edited comment on CASSANDRA-5836 at 3/5/18 4:42 AM:
-

Glad some more discussion is happening again here. This has always been a pain 
point for operators. I doubt many people can actually list every single 
different startup case in Cassandra, there are a hell of a lot.

On the seed issue; in the past I've gone so far as to write a patch for making 
only the first seed a special case, such as already mentioned by [~oshulgin] to 
at least fix the issue of maintenance on seed nodes/adding new nodes as seeds. 
There's a patch 
[here|https://github.com/apache/cassandra/compare/trunk...kgreav:13851-extension-3.11]
 if anyone cares, think it's more or less working and only missing tests at the 
moment (but I'm probably wrong). Note it's built on my patch for 
CASSANDRA-13851, so not all code there is relevant.
{quote}My understanding is that the 'seed node' role has a significant 
initial-topology-discovery responsibility which I have not seen mentioned in 
recent discussions.
{quote}
Not my understanding. A seed simply defines a node to connect to in order to 
join the cluster. No topology information is transmitted between the seed and 
the node contacting it. If it were, this would bring its own complexities and 
likely make Gossip really expensive (especially on large clusters).
{quote}Also, as of my last knowledge of this code, a given node will gossip 
with a Seed node more frequently than its other peers, which I believe is "just 
an optimization of gossip" but seems notable.
{quote}
Yep, just an optimisation, but an important one. For the most part, however, it 
shouldn't have any effect on the bootstrap case.
{quote}I also recall past dev discussion (with driftx?) suggesting that the 
"correct" solution in their view is an external seed provider.
{quote}
Yeah, the correct solution is an external seed provider or not breaking your 
config management, but we can still do better here. Especially in the replaces 
case, and probably the new DC case.
{quote}Another case when you add seed nodes is when adding a new DC. In this 
case they are not the first ones to start so they could bootstrap, but most of 
the time this is not what you want, so you set auto_bootstrap=false for every 
node in the new DC, including the new seeds.
{quote}
It's worth noting here that there is the case of {{SimpleStrategy}} in which 
you wouldn't want auto_bootstrap=false (this affects auth, traces, 
system_distributed). This is specifically why you would want every node to 
bootstrap in a new DC (including seeds). The alternative is to get rid of 
{{SimpleStrategy}} (or at least stop using it as a default).
{quote}In the case where seed nodes cannot be contacted, how do you determine 
if this is the first node in a cluster (so we should special case and skip 
bootstrap) vs a mis-configuration or other seeds are down issues and therefore 
the bootstrap should fail?
{quote}
If the listed seed isn't itself then you fail. This is how it currently works 
as well. That is, the first node in the cluster has itself as a seed and also 
can't contact any other seeds in its seed list. I'm pretty sure my patch above 
works this way: if there are other seeds, they should be present in the 
{{endpointShadowStateMap}} after the shadow round (SR). There may be some edge 
cases to think of here though, like starting multiple seeds at the same time.

Also related is CASSANDRA-14073, which will fix the case where you replace a 
seed node and it doesn't bootstrap. This one is more important IMO as it's more 
likely for config management not to handle this case.


was (Author: kurtg):
Glad some more discussion is happening again here. This has always been a pain 
point for operators. I doubt many people can actually list every single 
different startup case in Cassandra, there are a hell of a lot.

On the seed issue; in the past I've gone so far as to write a patch for making 
only the first seed a special case, such as already mentioned by [~oshulgin] to 
at least fix the issue of maintenance on seed nodes. There's a patch 
[here|https://github.com/apache/cassandra/compare/trunk...kgreav:13851-extension-3.11]
 if anyone cares, think it's more or less working and only missing tests at the 
moment (but I'm probably wrong). Note it's built on my patch for 
CASSANDRA-13851, so not all code there is relevant.
{quote}My understanding is that the 'seed node' role has a significant 
initial-topology-discovery responsibility which I have not seen mentioned in 
recent discussions.
{quote}
Not my understanding. A seed simply defines a node to connect to in order to 
join the cluster. No topology information is transmitted between the seed and 
the node contacting it. If it were, this would bring its own complexities and 
likely make Gossip really expensive

[jira] [Comment Edited] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-02 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383984#comment-16383984
 ] 

Robert Coli edited comment on CASSANDRA-5836 at 3/2/18 7:02 PM:


I should probably join #cassandra-dev IRC and chat about this there, but I'd 
like to refer people to this comment up-ticket:

https://issues.apache.org/jira/browse/CASSANDRA-5836?focusedCommentId=13727032&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13727032

Nobody really seems to understand why it's not safe for a seed node to 
bootstrap, because the workaround is to temporarily pretend the node isn't a 
seed and to bootstrap it. Usually one doesn't even inform the other nodes that 
it temporarily isn't a seed, and nothing unsafe seems to happen.

I feel like clarity here starts with explaining in what cases a Seed node is 
actually "Seeding" and what "Seeding" means and does not mean. My understanding 
is that the 'seed node' role has a significant initial-topology-discovery 
responsibility which I have not seen mentioned in recent discussions. This 
dovetails with needing to understand seed node behavior in the "restoring from 
snapshot" case, where the topology is known from existing cluster information 
and therefore may or may not "need" to be discovered from a seed. Also, as of 
my last knowledge of this code, a given node will gossip with a Seed node more 
frequently than its other peers, which I believe is "just an optimization of 
gossip" but seems notable.

I also recall past dev discussion (with driftx?) suggesting that the "correct" 
solution in their view is an external seed provider.

So, in summary, my understanding of the complete responsibilities of a seed, 
independent of whether it's serving as a bootstrap source or bootstrapping 
itself:
1) provide other nodes which consider it a seed with initial topology

2) provide "faster" topology updates to nodes which have it listed in their 
seed provider

The minimum requirement for a new node joining the cluster seems to be a single 
seed node that can inform it of topology in a timely manner. If that's correct 
and we imagine that all nodes use a seed provider that always returns at least 
one available node that can fulfill that role, the problem (?) of not being 
able to bootstrap seed nodes seems to disappear?
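
To make the last paragraph concrete, here is a minimal sketch of such a seed 
provider. It is not Cassandra's seed provider interface: the function name, 
the {{own_address}} parameter and the {{is_alive}} callback are assumptions 
made for illustration. It leaves the asking node out of its own seed list and 
returns only candidates that pass the liveness check.

```python
def live_seeds(candidates, own_address, is_alive):
    """Return the seeds a starting node should contact.

    candidates:  iterable of configured seed addresses
    own_address: address of the node asking; never returned as a seed
    is_alive:    callable(address) -> bool, a hypothetical liveness check
    """
    seeds = [addr for addr in candidates
             if addr != own_address and is_alive(addr)]
    if not seeds:
        # A joining node needs at least one seed to learn topology from.
        raise RuntimeError("no live seed available")
    return seeds
```

Under this assumption a node never sees itself in the list it gets back, so 
from its own point of view it is an ordinary joining node and can bootstrap.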


was (Author: rcoli):
I should probably join #cassandra-dev IRC and chat about this there, but I'd 
like to refer people to this comment up-ticket:

https://issues.apache.org/jira/browse/CASSANDRA-5836?focusedCommentId=13727032&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13727032


Nobody really seems to understand why it's not safe for a seed node to 
bootstrap, because the workaround is to temporarily pretend the node isn't a 
seed and to bootstrap it. Usually one doesn't even inform the other nodes that 
it temporarily isn't a seed, and nothing unsafe seems to happen.

I feel like clarity here starts with explaining in what cases a Seed node is 
actually "Seeding" and what "Seeding" means and does not mean. My understanding 
is that the 'seed node' role has a significant initial-topology-discovery 
responsibility which I have not seen mentioned in recent discussions. This 
dovetails with needing to understand seed node behavior in the "restoring from 
snapshot" case, where the topology is known from existing cluster information 
and therefore may or may not "need" to be discovered from a seed. Also, as of 
my last knowledge of this code, a given node will gossip with a Seed node more 
frequently than its other peers, which I believe is "just an optimization of 
gossip" but seems notable.

I also recall past dev discussion (with driftx?) suggesting that the "correct" 
solution in their view is an external seed provider.

So, in summary, my understanding of the complete responsibilities of a seed, 
independent of whether it's serving as a bootstrap source or bootstrapping 
itself:
#) provide other nodes which consider it a seed with initial topology

#) provide "faster" topology updates to nodes which have it listed in their 
seed provider

The minimum requirement for a new node joining the cluster seems to be a single 
seed node that can inform it of topology in a timely manner. If that's correct 
and we imagine that all nodes use a seed provider that always returns at least 
one available node that can fulfill that role, the problem (?) of not being 
able to bootstrap seed nodes seems to disappear?

> Seed nodes should be able to bootstrap without manual intervention
> --
>
> Key: CASSANDRA-5836
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5836
> Project: Cassandra
> Issue Type: Bug
> Reporter: Bill Hathaway
> 

[jira] [Comment Edited] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-02-28 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16381561#comment-16381561
 ] 

Robert Coli edited comment on CASSANDRA-5836 at 3/1/18 6:07 AM:


[~oshulgin]: as I understand it, a bootstrapping node also receives "extra" 
copies of writes via the storage protocol, which is not technically "streaming 
in" the data. These "extra" copies do not count towards CL.

While I'm commenting on this ticket, it seems appropriate to share my 
enthusiasm for resolving the question of the bootstrapping seed nodes. This has 
been a longstanding point of pain and confusion for operators and those who 
support them.
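
The point about "extra" copies not counting towards CL can be illustrated with 
a toy model. This only models the claim as stated in this comment and is not 
Cassandra's actual write path; the function and parameter names are invented 
for illustration.

```python
def write_succeeds(acks, natural_replicas, required):
    """Return True if enough natural replicas acknowledged the write.

    acks:             set of addresses that acknowledged the write
    natural_replicas: set of nodes that currently own the data
    required:         acknowledgements the consistency level asks for

    Acknowledgements from nodes outside natural_replicas (for example a
    bootstrapping node) are ignored when counting.
    """
    counted = acks & natural_replicas
    return len(counted) >= required
```

With natural replicas a, b, c and a bootstrapping node d, acknowledgements 
from a and d alone do not satisfy a requirement of two in this model, while a, 
b and d do.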


was (Author: rcoli):
[~oshulgin]: as I understand it, a bootstrapping node also receives "extra" 
copies of writes via the storage protocol, which is not technically "streaming 
in" the data. These "extra" copies do not count towards CL.

While I'm commenting on this ticket, it seems appropriate to share my 
enthusiasm for resolving the question of the bootstrapping seed nodes. This has 
been a longstanding point of pain and  confusion for operators and those who 
support them.

> Seed nodes should be able to bootstrap without manual intervention
> --
>
> Key: CASSANDRA-5836
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5836
> Project: Cassandra
> Issue Type: Bug
> Reporter: Bill Hathaway
> Priority: Minor
>
> The current logic doesn't allow a seed node to be bootstrapped.  If a user 
> wants to bootstrap a node configured as a seed (for example to replace a seed 
> node via replace_token), they first need to remove the node's own IP from the 
> seed list, and then start the bootstrap process.  This seems like an 
> unnecessary step since a node never uses itself as a seed.
> I think it would be a better experience if the logic was changed to allow a 
> seed node to bootstrap without manual intervention when there are other seed 
> nodes up in a ring.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
