[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-04-25 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453524#comment-16453524
 ] 

Kurt Greaves commented on CASSANDRA-5836:
-

SR is the shadow round, sorry: the first round of gossip, introduced to protect 
against collisions in the ring as per CASSANDRA-10134.

> Seed nodes should be able to bootstrap without manual intervention
> --
>
> Key: CASSANDRA-5836
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5836
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Bill Hathaway
>Priority: Minor
>
> The current logic doesn't allow a seed node to be bootstrapped.  If a user 
> wants to bootstrap a node configured as a seed (for example to replace a seed 
> node via replace_token), they first need to remove the node's own IP from the 
> seed list, and then start the bootstrap process.  This seems like an 
> unnecessary step since a node never uses itself as a seed.
> I think it would be a better experience if the logic was changed to allow a 
> seed node to bootstrap without manual intervention when there are other seed 
> nodes up in a ring.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-04-23 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447709#comment-16447709
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

[~KurtG] What is SR?




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-04-23 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447629#comment-16447629
 ] 

Kurt Greaves commented on CASSANDRA-5836:
-

{quote}OK, but this implies that you have to start the very first node 
differently from the rest of the cluster. If you want to have 3 seed nodes, 
what you do currently is just list all of them in configuration and deploy 
nodes one by one, starting with the seeds, with identical config and you're 
done.

With your proposed approach, there are two extra steps:
1. Deploy the very first seed node with a different config, i.e. only itself in 
the seeds list.
2. After the other seed nodes are there (or all nodes are there), restart the 
first node with the complete seeds list.
{quote}
Getting back to this after being distracted for a while: I'm not sure what I 
was thinking there. It doesn't matter how many seeds the first node has in its 
seed list, so there is no special case there.

If SR ends and no seed can be contacted and the node is currently 
uninitialised, but has itself as a seed, the node creates a cluster. This is 
the existing behaviour and should work perfectly fine with the other changes 
I've mentioned where any seed bootstraps.
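
The condition described above could be sketched roughly as follows. This is 
only an illustration in Python, not Cassandra's actual startup code: the 
function name, its parameters and the addresses are hypothetical, and 
"uninitialised" is assumed to mean that the node has no previously saved state.

```python
def should_create_cluster(seeds_contacted, initialised, own_address, seed_list):
    """Return True if the node should start a brand-new cluster.

    Hypothetical helper mirroring the rule from the comment, evaluated once
    the shadow round (SR) has ended.
    """
    no_seed_reachable = seeds_contacted == 0
    lists_itself_as_seed = own_address in seed_list
    return no_seed_reachable and not initialised and lists_itself_as_seed


# Every node is deployed with the same, complete seed list (made-up addresses).
seeds = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# Very first node: nobody answers during the shadow round, so it creates a cluster.
print(should_create_cluster(0, False, "10.0.0.1", seeds))

# A later seed that reached an existing seed during the shadow round does not.
print(should_create_cluster(1, False, "10.0.0.2", seeds))
```

Note that the length of the seed list plays no role in the check, which is why 
the first node would not need a special configuration.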




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-16 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402060#comment-16402060
 ] 

Jeremiah Jordan commented on CASSANDRA-5836:


CASSANDRA-12681 for the NTS change




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-16 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401552#comment-16401552
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

{quote}That was always a DataStax recommendation so I don't know where it came 
from. As I'm sure you're aware, the Cassandra docs are quite sparse in the 
operations area, but all of this should be documented properly.{quote}

Given the above comment from [~jjordan], the only reason I can still see to use 
{{auto_bootstrap=false}} is to make the new token allocation algorithm work on 
the new DC.  I would then also strongly argue for following the DSE example and 
deprecating the {{allocate_tokens_for_keyspace}} option, exactly because this 
one requires you to add your new DC to your NTS data keyspace before starting 
the nodes there.  The allocator depends solely on the local DC replication 
factor; it doesn't need the keyspace to be replicated initially to the new 
nodes.

{quote}It's literally that it's irrelevant what the first node does. If 
auto_bootstrap is true for the first node, it's a no-op, if it's false, it's a 
defined no-op. The first node still respects auto_bootstrap, but the result is 
the same for either true or false. This is always going to be the case.{quote}

I'm fully aware of that.  The problem is how to make sure that a node starting 
up *correctly* assumes that it is the very first one.

{quote}The first node would be defined as a node that only has itself as a 
seed, and no existing knowledge of any other node in the cluster.{quote}

OK, but this implies that you have to start the very first node differently 
from the rest of the cluster.  If you want to have 3 seed nodes, what you do 
currently is just list all of them in configuration and deploy nodes one by 
one, starting with the seeds, with identical config and you're done.

With your proposed approach, there are two extra steps:
1. Deploy the very first seed node with a different config, i.e. only itself in 
the seeds list.
2. After the other seed nodes are there (or all nodes are there), restart the 
first node with the complete seeds list.

So that already makes startup more complicated than it is currently.  And don't 
forget the pluggable seeds providers: how (reliably) is this going to work 
together?

{quote}it's the fact that things get implemented without documentation{quote}

But this is exactly what I mean.  Whether it's because of attitude or not is 
just my judgement, so let's set that aside.

My point is: by spending time on writing decent documentation (preferably, 
before starting on the code!) it could be possible to avoid certain 
implementation pitfalls.  In some extreme cases, like the aforementioned token 
allocation option, it would become obvious that the implementation and the very 
name of the option are wrong: it should be about replication factor and not at 
all about keyspace name.





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-16 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401531#comment-16401531
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

{quote}Since we have changed NTS such that you can’t set the new DC name until 
after there are nodes in that DC, this is no longer something someone could 
easily do by going in the “wrong” order and altering keyspaces first.{quote}

Whoa, but in which version?  Trunk?  DSE?  We are using Apache Cassandra 3.0 
and definitely that one doesn't check DC names at all.





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-15 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401409#comment-16401409
 ] 

Jeremiah Jordan commented on CASSANDRA-5836:


People used to recommend setting auto_bootstrap to false to make sure that you 
didn’t end up with a node owning 100% of the ring. Since we have changed NTS 
such that you can’t set the new DC name until after there are nodes in that DC, 
this is no longer something someone could easily do by going in the “wrong” 
order and altering keyspaces first.




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-15 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401328#comment-16401328
 ] 

Kurt Greaves commented on CASSANDRA-5836:
-

bq. But then I don't see at all where the recommendation to set 
auto_bootstrap=false for a new DC comes from?
That was always a DataStax recommendation so I don't know where it came from. 
As I'm sure you're aware, the Cassandra docs are quite sparse in the operations 
area, but all of this should be documented properly.

bq. Unfortunately, it already is
I'm well aware of that. I'm saying it shouldn't be. And we certainly shouldn't 
be making a bad situation worse.

bq. That's already a contradiction, don't you think? And more precisely it 
should be spelled as "if a node believes it is the very first one". A big 
question to me still: can this be done in the code reliably?

No, I don't think so. Caveat was probably the wrong word here, because it's not 
a caveat. It's literally that it's *irrelevant* what the first node does. If 
auto_bootstrap is true for the first node, it's a no-op, if it's false, it's a 
defined no-op. The first node still respects auto_bootstrap, but the result is 
the same for either true or false. This is always going to be the case.
If it can be done by hand it can be done by code. The first node would be 
defined as a node that only has itself as a seed, and no existing knowledge of 
any other node in the cluster. This is quite straightforward (but not part of 
my patch yet).
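
As a rough sketch of that definition (Python used purely for illustration; the 
function, its parameters and the addresses are hypothetical and are not taken 
from the patch):

```python
def is_first_node(own_address, seed_list, known_peers):
    """Hypothetical check for the "first node" definition given above.

    A node counts as the first node if its seed list contains only itself
    and it has no existing knowledge of any other node in the cluster.
    """
    only_self_as_seed = set(seed_list) == {own_address}
    knows_nobody_else = len(known_peers) == 0
    return only_self_as_seed and knows_nobody_else


# Fresh node that lists only itself as a seed and has no saved peers.
print(is_first_node("10.0.0.1", ["10.0.0.1"], []))

# Same configuration, but the node already knows another node.
print(is_first_node("10.0.0.1", ["10.0.0.1"], ["10.0.0.2"]))
```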

bq. This attitude is exactly what makes Cassandra hard to use in my experience. 
I cannot even count the number of times when I had to dive deeply into the 
source code trying to figure out some detail which was not properly documented, 
because the devs thought the same: users don't need to know about it...
I think you misunderstand. It's not that attitude that makes Cassandra hard to 
use, it's the fact that things get implemented without documentation and 
problems get missed when considering the effects on all aspects of Cassandra, 
because it's so large and complex. The only reason you need to dive into the 
code/jira tickets is because there's no accurate documentation for what you're 
looking for, or reasoning for why something behaves as it does. 

My point is precisely that users don't need to know about it, therefore they 
shouldn't have to dig into the code. If they have to dig into the code the 
implementation is likely wrong. We should be aiming to make it so that no user 
ever has to even consider looking in the code to figure out how seeds work, let 
alone even the documentation. It literally should not even cross their mind 
short of the fact that they have to set some seeds.







[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-15 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16400082#comment-16400082
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

{quote}system.available_ranges works off keyspaces, so rebuild will still work 
fine as long as you didn't add RF before provisioning the DC (e.g. you didn't 
bootstrap the NTS keyspaces){quote}

You are correct, I had a false assumption here.  But then I don't see at all 
where the recommendation to set {{auto_bootstrap=false}} for a new DC comes 
from?  I believed the reason was that {{nodetool rebuild}} won't work 
otherwise, but apparently that's not the case.

If we can simply drop this recommendation from the docs that would be a great 
thing, IMO.  By following the doc in its current form it is not unlikely that 
one can accidentally add some nodes with {{auto_bootstrap=false}} to an 
*existing* DC, simply by messing up the DC suffix parameter.  With the default 
setting of {{auto_bootstrap}} such a configuration error is mostly harmless and 
is easy to roll back.

{quote}> At the same time, new cluster startup process can be arbitrarily 
complex
No, it can't. {quote}

Unfortunately, it already is.  For example, look at our home-grown automation 
code to create new Cassandra clusters on AWS: 
https://github.com/zalando-stups/planb-cassandra/blob/master/planb/create_cluster.py
That's already close to 1,000 lines of Python.

{quote}Cassandra is hard enough to use as it is, and we really shouldn't be 
making operations more complex.{quote}

Creating a new cluster is the operation with the least potential impact of 
all, and you do it only once in the lifetime of a cluster.  I would go as far 
as saying it doesn't even belong to "ops".  Restarts, upgrades, bootstrapping 
new nodes and DCs: these are the operations, and we shouldn't make 
"introduction to Cassandra" easier at the cost of making *these* more complex 
or risky.

{quote}Far more logical to be able to say that "All nodes will respect the 
auto_bootstrap setting regardless of their configuration". The only caveat is 
that the first node won't bootstrap,..{quote}

That's already a contradiction, don't you think?  And more precisely it should 
be spelled as "if a node *believes* it is the very first one".  A big question 
to me still: can this be done reliably?

{quote}... but to users this is irrelevant and they don't need to know about 
it.{quote}

This attitude is exactly what makes Cassandra hard to use in my experience. :(  
I cannot even count the number of times when I had to dive deeply into the 
source code trying to figure out some detail which was not properly documented, 
because the devs thought the same: users don't need to know about it...





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-14 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399485#comment-16399485
 ] 

Kurt Greaves commented on CASSANDRA-5836:
-

bq. I believe it would still be required to set it to false, unless you want to 
replace this recommendation with running TRUNCATE system.available_ranges 
post-bootstrap on every node of the new DC. Otherwise, nodetool rebuild is not 
going to do anything because the new nodes would think that they already have 
all the data due to bootstrap. Or am I missing something?

{{system.available_ranges}} works off keyspaces, so rebuild will still work 
fine as long as you didn't add RF before provisioning the DC (e.g. you didn't 
bootstrap the NTS keyspaces). It's not really a big deal either way.

bq. At the same time, new cluster startup process can be arbitrarily complex
No, it can't. That's not a good introduction to Cassandra. Cassandra is hard 
enough to use as it is, and we really shouldn't be making operations more 
complex. Software exists to make things easier for humans; we shouldn't shy 
away from complexity just because it makes the code harder to understand. That 
defeats the purpose of writing code in the first place. I still think this is a 
really bad idea and having this all handled in the code is a much better 
solution. Far more logical to be able to say that "All nodes will respect the 
auto_bootstrap setting regardless of their configuration". The only caveat is 
that the first node won't bootstrap, but to users this is irrelevant and they 
don't need to know about it. To users it's all the same, as bootstrapping when 
there are no other nodes is a no-op.






[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-13 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16396935#comment-16396935
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

{quote}That will only occur if you've added the datacenter to replication prior 
to adding the nodes in the new DC.{quote}

Ah, you're right.  Looks like I've spent too much time recently trying to make 
token allocation work in 3.0...

{quote}It's perfectly acceptable to still use auto_bootstrap: false for a new 
DC, even with my code changes.{quote}

I believe it would still be required to set it to false, unless you want to 
replace this recommendation with running {{TRUNCATE system.available_ranges}} 
post-bootstrap on every node of the new DC.  Otherwise, {{nodetool rebuild}} is 
not going to do anything because the new nodes would think that they already 
have all the data due to bootstrap.  Or am I missing something?

{quote}Otherwise how would you tell if you are the first node? Currently a seed 
won't fail if you set auto_bootstrap: true and it's the first node in the 
cluster, which is what you're proposing.{quote}

I would expect that it will try to contact other seeds for some time and will 
fail ultimately, because they are not there.  This is a pretty good indication 
of being the first node IMO.

{quote}"the first node in the cluster doesn't bootstrap, all other nodes 
respect the auto_bootstrap setting"{quote}

I expect that adding yet another corner case like this one is not going to make 
the already complicated code easier to reason about.  At the same time, new 
cluster startup process can be arbitrarily complex: it doesn't matter much IMO.  
You don't have data or clients and you are allowed to fail and start over 
again.  Much more important is predictable behavior during operations on a 
cluster with data and clients.





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-13 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16396587#comment-16396587
 ] 

Kurt Greaves commented on CASSANDRA-5836:
-

{quote}Hold on, how is this going to work at all? If the first node in new DC 
is going to bootstrap (let's assume seeds are allowed to bootstrap) it will own 
the whole token ring at first, so it will have to stream in all the data that 
exists in the source DC, times the RF(s) of new DC. Even if the new node 
doesn't die a horrible death in the process, you won't be able to add another 
node to the cluster until this is finished. And even after that, adding the 
next node to new DC will take ~50% of ownership from the first one, so you will 
need to run cleanup on the first one in the end, etc. for the rest of the new 
nodes. 

It is totally impractical to add a new DC this way, so I firmly believe that 
auto_bootstrap=false is here to stay for new DCs.
{quote}

That will only occur if you've added the datacenter to replication prior to 
adding the nodes in the new DC. I was under the impression for a while that NTS 
keyspaces won't bootstrap across DCs, but on further testing they do. This is 
irrelevant, however, as it's not really standard practice to update your RF 
before you set up your new DC. You can, but it's generally not a good idea 
for the reason you listed, unless you're doing it on a really small 
dataset. Either way it's for an expert to decide what they want to do here.

Regardless, nothing says we need to change the [un]documented procedure 
for adding a new DC because of this. It's perfectly acceptable to still use 
{{auto_bootstrap: false}} for a new DC, even with my code changes. It would be 
worth documenting the current behaviour that you'll stream *a lot* of data if 
you update RF prior to bootstrapping a new DC though.

bq. ... I have doubts that any of these checks can be made really bullet-proof.
I was saying you'd need that check if you changed all seeds to bootstrap. 
Otherwise how would you tell if you are the first node? Currently a seed won't 
fail if you set {{auto_bootstrap: true}} and it's the first node in the 
cluster, which is what you're proposing.

bq. Well, we have this for adding new DCs, so not really that silly. It also 
doesn't have to be "we said so", for me the explanation is simple: the first 
seed node will fail the bootstrap otherwise, because there are no other nodes 
to bootstrap from yet.
I really don't think it's a good idea to change/further complicate the new 
cluster startup process. It's not the same as adding a DC (and as I've said, 
that procedure isn't necessarily correct). Many people will be relying on the 
existing behaviour pretty heavily. Complicating it by saying "now your first 
node will need auto_bootstrap: false" is not going to end well. How is it not 
simpler that the first node just doesn't bootstrap, and all others do?

bq. Again, I have serious doubts about all this automatic corner case 
detection. As I've said before I'm totally fine with making initial cluster set 
up a little bit more involved, if that makes operations on the clusters in 
production more reliable.
Cassandra is a complex beast. I've been in the startup code pretty heavily and 
I know first hand from working with hundreds of clusters that the 
startup/bootstrapping code is a nightmare. All I'm proposing is we change the 
"seeds don't bootstrap regardless of configuration" logic to "the first node in 
the cluster doesn't bootstrap, all other nodes respect the auto_bootstrap 
setting". IMO this reduces the # of corner cases because you no longer have to 
think about replacing nodes, new DCs, replacing nodes as seeds, new nodes as 
seeds, new nodes, or having conflicting configs like {{auto_bootstrap: true}} 
but being a seed.
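
To make the difference between the two rules concrete, here is a small sketch 
in Python. It is a simplification for illustration only, not the real startup 
code: both function names are hypothetical, and other conditions (such as a 
node that has already completed bootstrap) are deliberately left out.

```python
def bootstraps_current(is_seed, auto_bootstrap):
    # Existing rule: seeds don't bootstrap, regardless of configuration.
    return auto_bootstrap and not is_seed


def bootstraps_proposed(is_first_node, auto_bootstrap):
    # Proposed rule: only the first node in the cluster skips bootstrap;
    # every other node respects the auto_bootstrap setting.
    return auto_bootstrap and not is_first_node


# A replacement node that is listed as a seed, with auto_bootstrap: true,
# joining a cluster that already exists (so it is not the first node).
print(bootstraps_current(is_seed=True, auto_bootstrap=True))
print(bootstraps_proposed(is_first_node=False, auto_bootstrap=True))
```

Under the existing rule the conflicting configuration silently results in no 
bootstrap, whereas under the proposed rule the seed status no longer enters 
the decision at all.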







[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-09 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392587#comment-16392587
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

{quote}If N>RF it becomes less likely that you'll have one replica in each DC 
for every range.{quote}

Without defining {{N}} it's hard for me to say what you mean here.  Maybe we 
should move this part of discussion off the ticket? :)

{quote}nodetool rebuild should probably avoid rebuilding SimpleStrategy 
keyspaces and you shouldn't get an error for them.{quote}

That would be nice.

{quote}Bootstrapping SimpleStrategy across DCs is still relevant as long as 
SimpleStrategy exists.{quote}

To clarify, we are not talking about a significant amount of data (i.e. 
user-defined keyspaces) here?  I would assume that if we teach nodetool rebuild 
to ignore SimpleStrategy keyspaces, they could be cheaply spread to the new DC 
by running a repair targeted at these small system keyspaces only.

{quote}with my patch we could forget about the instructions telling people to 
set auto_bootstrap=false when adding a new DC.{quote}

Hold on, how is this going to work at all?  If the first node in the new DC is 
going to bootstrap (let's assume seeds are allowed to bootstrap) it will own 
the whole token ring at first, so it will have to stream in all the data that 
exists in the source DC, times the RF(s) of the new DC.  Even if the new node 
doesn't die a horrible death in the process, you won't be able to add another 
node to the cluster until this is finished.  And even after that, adding the 
next node to the new DC will take ~50% of ownership from the first one, so you 
will need to run cleanup on the first one in the end, and so on for the rest of 
the new nodes.

It is totally impractical to add a new DC this way, so I firmly believe that 
{{auto_bootstrap=false}} is here to stay for new DCs.
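
The ownership arithmetic above can be made concrete with a small Python sketch. 
It is a simplification under stated assumptions: nodes join one at a time, 
tokens split perfectly evenly, and the new DC uses RF=1. Both function names 
are made up for illustration.

```python
from fractions import Fraction
from typing import List


def ownership_after_each_join(total_nodes: int) -> List[Fraction]:
    """Share of the ring owned by the FIRST node of the new DC after the
    1st, 2nd, ... node has joined (assumes a perfectly even split)."""
    return [Fraction(1, n) for n in range(1, total_nodes + 1)]


def stale_fraction_on_first_node(total_nodes: int) -> Fraction:
    """Share of the data streamed by the first node that it no longer owns
    once all nodes have joined, i.e. what cleanup has to remove."""
    return 1 - Fraction(1, total_nodes)
```

With these assumptions the first node owns the whole ring, then 1/2, then 1/3 
and so on; in a four-node DC three quarters of what it streamed initially has 
to be cleaned up afterwards.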

{quote}1. You still need code to handle the case where a seed starts with 
auto_bootstrap=true but it's a new cluster.{quote}

I would prefer this just to fail with some helpful error message.  Because:

{quote}You could potentially know when to fail by checking your seeds list and 
seeing if you are the only seed (then create a cluster, else fail). But I still 
don't see this as terribly necessary.{quote}

... I have doubts that any of these checks can be made really bullet-proof.

{quote}2. Seems a bit silly to have a new cluster procedure where the first 
step is to "set this to false in the yaml... because we said so". Especially 
when we can avoid that situation.{quote}

Well, we have this for adding new DCs, so not really that silly.  It also 
doesn't have to be "we said so", for me the explanation is simple: the first 
seed node will fail the bootstrap otherwise, because there are no other nodes to 
bootstrap from yet.

{quote}Note that when I say special case I mean a special case in the code, not 
for the user. My patch (maybe with some tweaks) should be able to decide 
automatically every case where a seed should bootstrap versus when it 
shouldn't. If we can do that in the code, there's no reason to worry about 
changing any procedures or behaviours, and we don't need to worry about 
explaining the intricacies of why a seed can't bootstrap.{quote}

Again, I have serious doubts about all this automatic corner case detection.  
As I've said before I'm totally fine with making initial cluster set up a 
little bit more involved, if that makes operations on the clusters in 
production more reliable.




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-08 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392115#comment-16392115
 ] 

Kurt Greaves commented on CASSANDRA-5836:
-

bq. Hm, I've tested the procedure quite a number of times and every time I 
forgot to change the replication to NTS or to extend the replication to the new 
DC I was getting a complaint from nodetool rebuild.
If N>RF it becomes less likely that you'll have one replica in each DC for 
every range. But it's irrelevant anyway, {{nodetool rebuild}} should probably 
avoid rebuilding SimpleStrategy keyspaces and you shouldn't get an error for 
them. Bootstrapping SimpleStrategy across DCs is still relevant as long as 
SimpleStrategy exists. The only solutions here are either to (1) remove 
SimpleStrategy, (2) bootstrap new nodes within a new DC, or (3) disallow 
addition of a DC when SimpleStrategy is being used. 1 is hard + unreasonable 
and 3 is just unreasonable. 2 makes the most sense, and with my patch we could 
forget about the instructions telling people to set auto_bootstrap=false when 
adding a new DC.

bq. Do you mean it is more common to see the error with a small cluster, or the 
other way round: more common that it will work with a small cluster?
It's more common that people do this with small clusters and it works/they 
don't realise that they didn't change to NTS.

Well, that makes sense. But there are a few issues:
# You still need code to handle the case where a seed starts with 
auto_bootstrap=true but it's a new cluster. You could potentially know when to 
fail by checking your seeds list and seeing if you are the only seed (then 
create a cluster, else fail). But I still don't see this as terribly necessary.
# Seems a bit silly to have a new cluster procedure where the first step is to 
"set this to false in the yaml... because we said so". Especially when we can 
avoid that situation.

Note that when I say special case I mean a special case in the code, not for 
the user. My patch (maybe with some tweaks) should be able to decide 
automatically every case where a seed should bootstrap versus when it 
shouldn't. If we can do that in the code, there's no reason to worry about 
changing any procedures or behaviours, and we don't need to worry about 
explaining the intricacies of why a seed can't bootstrap.




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-08 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390903#comment-16390903
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

{quote}As long as there is at least 1 replica (for every range) in the source 
DC it will work.{quote}

Hm, I've tested the procedure quite a number of times and every time I forgot 
to change the replication to NTS or to extend the replication to the new DC I 
was getting a complaint from nodetool rebuild.

{quote}This is quite common for the case where you're adding a DC to a small 
cluster. Personally I'd prefer we get rid of SimpleStrategy altogether... but 
that's likely problematic/controversial.{quote}

Do you mean it is more common to see the error with a small cluster, or the 
other way round: more common that it will work with a small cluster?

{quote}If we make auto_bootstrap false by default...{quote}

This is not what I was suggesting.  Maybe I didn't express myself clearly 
enough.  What I suggest is to:
1) Allow seed nodes to bootstrap in the presence of {{auto_bootstrap=true}}, 
with this setting still being the default.
2) Update the documented procedure for setting up a new cluster to manually set 
{{auto_bootstrap=false}} before starting the nodes for the first time, then 
remove the setting or change it to {{true}}.

This removes all special cases:
a) Setting up the first DC is then not different from setting up an additional 
one w.r.t. {{auto_bootstrap}} setting.
b) Seed nodes are not different from non-seeds w.r.t. bootstrap behavior.
c) The very first seed node is not different from the rest.





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-07 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390427#comment-16390427
 ] 

Kurt Greaves commented on CASSANDRA-5836:
-

bq. Won't work, as shown above.
As long as there is at least 1 replica (for every range) in the source DC it 
will work. This is quite common for the case where you're adding a DC to a 
small cluster. Personally I'd prefer we get rid of SimpleStrategy altogether... 
but that's likely problematic/controversial.

bq. I would pretty much like to avoid any special cases, if possible.
1 special seed node doesn't seem to be a big deal to me. The first seed is 
always going to be special to some degree. And in my case, it's not that 
special. You could replace it and nothing should go wrong.

If we make auto_bootstrap false by default to me that seems much more likely to 
cause the exact same configuration problem where people forget to change it to 
true for new nodes they add. It doesn't reduce complexity, just shifts it. A 
lot of people out there will already be depending on a default of true and 
changing it to false would be a pretty big surprise, even for a major version.




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-07 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389586#comment-16389586
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

[~KurtG],

{quote}My patch above should only make the very first node in a cluster a 
special case{quote}

I would pretty much like to avoid any special cases, if possible.  This is 
exactly because you have to make the corner case detection bullet-proof, which 
is arguably hard to achieve in a distributed, eventually consistent setup.

{quote}I don't think it's acceptable to say you have to change these to NTS 
prior to adding a new DC.{quote}

But it is a matter of fact currently: {{nodetool rebuild}} refuses to start the 
rebuild unless all non-local keyspaces are using the NTS, with a message 
similar to the one posted here: 
https://issues.apache.org/jira/browse/CASSANDRA-11098?focusedCommentId=15422197&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15422197

Do you see an alternative way to bootstrap a new DC w/o using NTS?  A cross-DC 
repair could still work, but it's too resource-intensive for any significant 
amount of data, I think.

{quote}Using nodetool rebuild when you've used SimpleStrategy will end up with 
data loss/won't work.{quote}

Won't work, as shown above.





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-06 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388935#comment-16388935
 ] 

Kurt Greaves commented on CASSANDRA-5836:
-

bq. But the above only holds in the context of the default auto_bootstrap=true 
setting. If we require that it is set to false when deploying new clusters/DCs, 
the problem goes away and we don't need a special case for the very first node.
I don't think this is necessary. My patch above should only make the very first 
node in a cluster a special case; all new seeds, regardless of DC, would not 
have the bootstrap problem. We wouldn't want to be overriding the behaviour of 
{{auto_bootstrap}} for these cases anyway. 

bq. The only case I see when SimpleStrategy can actually work with multiple DCs 
is when you start multi-DC from scratch. The auth keyspace you will want to 
change to use NTS and replicate to all DCs, but you might not care about the 
other two non-local system keyspaces.

SimpleStrategy does not care about DCs, it only cares about token order in the 
ring. Whether or not it makes sense for adding DCs, we currently create 3 
keyspaces using SimpleStrategy. I don't think it's acceptable to say you have 
to change these to NTS prior to adding a new DC. It's perfectly acceptable to 
use SimpleStrategy for {{system_traces}} and {{system_distributed}}. 
{{system_auth}} also works but it's a bad idea.
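
To illustrate "only cares about token order in the ring", here is a minimal 
Python sketch of that placement rule. It is not Cassandra's implementation; the 
function name, the ring representation and the node names are assumptions made 
for the example.

```python
from bisect import bisect_left
from typing import List, Tuple


def simple_strategy_replicas(
    ring: List[Tuple[int, str]], key_token: int, rf: int
) -> List[str]:
    """Pick replicas by walking the ring clockwise from the key's token.

    `ring` is a list of (token, node) pairs sorted by token. The first
    replica is the node with the first token >= key_token (wrapping around),
    followed by the next distinct nodes. Datacenters are never consulted.
    """
    if not ring:
        return []
    tokens = [token for token, _ in ring]
    start = bisect_left(tokens, key_token) % len(ring)
    replicas: List[str] = []
    for offset in range(len(ring)):
        node = ring[(start + offset) % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == rf:
            break
    return replicas
```

With a ring of three nodes named after DC1 and one named after DC2, a key can 
easily end up with all of its replicas on DC1 nodes, which is why rebuilding 
such a keyspace from another DC cannot be relied upon.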

bq. But are you referring here to a case where you would add a new DC to a 
cluster with data already in the original DC and still using SimpleStrategy, 
Kurt Greaves? To me that doesn't seem to be practical. 
bq. Any reason why you would want to go this way instead of the proper nodetool 
rebuild?
Yes. Not that the data in those keyspaces is terribly important to me, but it 
might be to some people. Using nodetool rebuild when you've used SimpleStrategy 
will end up with data loss/won't work.

bq.  how do you make new seeds bootstrap?
If only the first seed is special this is not a problem. Other seeds can 
bootstrap if they so desire (auto_bootstrap: true).




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-06 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387545#comment-16387545
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

{quote}Nobody really seems to understand why it's not safe for a seed node to 
bootstrap{quote}

I think the reasoning here is as follows: if seed nodes were allowed to 
bootstrap, the very first seed node would need some mechanism to opt out (or it 
will never be able to start).   Such a mechanism is dangerous to have in the 
first place, because it can fail, and then you will allow a node to skip 
bootstrap when it must not skip it.

But the above only holds in the context of the default {{auto_bootstrap=true}} 
setting.  If we require that it is set to {{false}} when deploying new 
clusters/DCs, the problem goes away and we don't need a special case for the 
very first node.

This way we also don't need the special case of seed nodes at all: the 
bootstrap behavior can be controlled entirely by the {{auto_bootstrap}} 
parameter value, regardless of the node's seed status.  It then becomes *safer* 
to bootstrap a seed node, than not, as with any other node, because it performs 
more checks which can detect configuration problems and doesn't accept client 
reads before it has fully joined the ring.

{quote}It's worth noting here that there is the case of SimpleStrategy in which 
you wouldn't want auto_bootstrap=false (this affects auth, traces, 
system_distributed). This is specifically why you would want every node to 
bootstrap in a new DC (including seeds). The alternative is to get rid of 
SimpleStrategy (or at least stop using it as a default).{quote}

The only case I see when {{SimpleStrategy}} can actually work with multiple DCs 
is when you start multi-DC from scratch.  The auth keyspace you will want to 
change to use NTS and replicate to all DCs, but you might not care about the 
other two non-local system keyspaces.

But are you referring here to a case where you would add a new DC to a cluster 
with data already in the original DC and still using {{SimpleStrategy}}, 
[~KurtG]?  To me that doesn't seem to be practical.  Even if your data set is 
so small that you can bootstrap the first node of the new DC w/o running out of 
disk space, how do you make new seeds bootstrap?  Or do you suggest adding all 
nodes as non-seeds first and then doing a rolling restart to indicate the 
seeds?  Any reason why you would want to go this way instead of the proper 
{{nodetool rebuild}}?





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-05 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386312#comment-16386312
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

{quote}Don't assume that a new cluster is empty or that it's safe to assume 
that the first ever seed node has no data. If authentication is enabled, a 
default user is created. It would be very surprising for users ("can't login 
with cassandra anymore!") and actually a real security issue, if that user is 
recreated (if it was dropped) or the password changed back to the 
default.{quote}

I do remember seeing the default superuser created again due to a node restart 
on a cluster where it had been dropped before that, but I'm not sure I 
fully understand the mechanism.  I might have forgotten some important detail, 
but I think it was related to setting up a secondary DC.  Unfortunately, I 
don't think it was trivially reproducible.  [~snazy], do you have more details 
about this issue?





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-04 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385559#comment-16385559
 ] 

Kurt Greaves commented on CASSANDRA-5836:
-

Glad some more discussion is happening again here. This has always been a pain 
point for operators. I doubt many people can actually list every single 
different startup case in Cassandra, there are a hell of a lot.

On the seed issue; in the past I've gone so far as to write a patch for making 
only the first seed a special case, such as already mentioned by [~oshulgin] to 
at least fix the issue of replacing seed nodes. There's a patch 
[here|https://github.com/apache/cassandra/compare/trunk...kgreav:13851-extension-3.11]
 if anyone cares, think it's more or less working and only missing tests at the 
moment (but I'm probably wrong). Note it's built on my patch for 
CASSANDRA-13851, so not all code there is relevant.
{quote}My understanding is that the 'seed node' role has a significant 
initial-topology-discovery responsibility which I have not seen mentioned in 
recent discussions.
{quote}
Not my understanding. A seed simply defines a node to connect to in order to 
join the cluster. No topology information is transmitted between the seed and 
the node contacting it. If it did, this would bring its own complexities and 
likely make Gossip really expensive (especially on large clusters).
{quote}Also, as of my last knowledge of this code, a given node will gossip 
with a Seed node more frequently than its other peers, which I believe is "just 
an optimization of gossip" but seems notable.
{quote}
Yep, just an optimisation but is important. For the most part however it 
shouldn't have any effect on the bootstrap case.
{quote}I also recall past dev discussion (with driftx?) suggesting that the 
"correct" solution in their view is an external seed provider.
{quote}
Yeah, the correct solution is external seed provider or not breaking your 
config management, but we can still do better here. Especially in the replaces 
case, and probably the new DC case.
{quote}Another case when you add seed nodes is when adding a new DC. In this 
case they are not the first ones to start so they could bootstrap, but most of 
the time this is not what you want, so you set auto_bootstrap=false for every 
node in the new DC, including the new seeds.
{quote}
It's worth noting here that there is the case of {{SimpleStrategy}} in which 
you wouldn't want auto_bootstrap=false (this affects auth, traces, 
system_distributed). This is specifically why you would want every node to 
bootstrap in a new DC (including seeds). The alternative is to get rid of 
{{SimpleStrategy}} (or at least stop using it as a default).
{quote}In the case where seed nodes cannot be contacted, how do you determine 
if this is the first node in a cluster (so we should special case and skip 
bootstrap) vs a misconfiguration or other seeds being down, in which case the 
bootstrap should fail?
{quote}
If the listed seed isn't itself then you fail. This is how it currently works 
as well. That is, the first node in the cluster has itself as a seed and also 
can't contact any other seeds in its seed list. I'm pretty sure my patch above 
works this way as if there are seeds they should be present in the 
{{endpointShadowStateMap}} after the SR. There may be some edge cases to think 
of here though like starting multiple seeds at the same time.
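
The rule described above could be sketched roughly as follows. This is a 
simplified Python illustration, not the code from the patch: the function, its 
parameters and the returned labels are hypothetical, and it does not address 
the edge case of several seeds starting at the same time.

```python
from typing import Set


def startup_decision(
    own_address: str, seeds: Set[str], seeds_seen_in_shadow_round: Set[str]
) -> str:
    """Decide what a starting node should do once the shadow round is over.

    - another configured seed answered: join the existing cluster and let
      auto_bootstrap decide whether to bootstrap
    - no other seed answered, but this node lists itself as a seed: treat it
      as the first node of a new cluster and skip bootstrap
    - no other seed answered and this node is not a seed: fail
    """
    other_seeds = seeds - {own_address}
    if seeds_seen_in_shadow_round & other_seeds:
        return "join existing cluster"
    if own_address in seeds:
        return "start new cluster"
    return "fail"
```

For example, a node at 10.0.0.1 whose seed list is {10.0.0.1, 10.0.0.2} would 
start a new cluster only if 10.0.0.2 did not show up in the shadow round.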




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-02 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383984#comment-16383984
 ] 

Robert Coli commented on CASSANDRA-5836:


I should probably join #cassandra-dev IRC and chat about this there, but I'd 
like to refer people to this comment up-ticket :

https://issues.apache.org/jira/browse/CASSANDRA-5836?focusedCommentId=13727032&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13727032


Nobody really seems to understand why it's not safe for a seed node to 
bootstrap, because the workaround is to temporarily pretend the node isn't a 
seed and to bootstrap it. Usually one doesn't even inform the other nodes that 
it temporarily isn't a seed, and nothing unsafe seems to happen.

I feel like clarity here starts with explaining in what cases a Seed node is 
actually "Seeding" and what "Seeding" means and does not mean. My understanding 
is that the 'seed node' role has a significant initial-topology-discovery 
responsibility which I have not seen mentioned in recent discussions. This 
dovetails with needing to understand seed node behavior in the "restoring from 
snapshot" case, where the topology is known from existing cluster information 
and therefore may or may not "need" to be discovered from a seed. Also, as of 
my last knowledge of this code, a given node will gossip with a Seed node more 
frequently than its other peers, which I believe is "just an optimization of 
gossip" but seems notable.

I also recall past dev discussion (with driftx?) suggesting that the "correct" 
solution in their view is an external seed provider.

So, in summary, my understanding of the complete responsibilities of a seed, 
independent of whether it is serving as a bootstrap source or bootstrapping 
itself:
1) provide initial topology to other nodes which consider it a seed
2) provide "faster" topology updates to nodes which have it listed in their 
seed provider

The minimum requirement for a new node joining the cluster seems to be a single 
seed node that can inform it of topology in a timely manner. If that's correct 
and we imagine that all nodes use a seed provider that always returns at least 
one available node that can fulfill that role, the problem (?) of not being 
able to bootstrap seed nodes seems to disappear?
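
To make that last point concrete, here is a minimal Python sketch of a seed 
source that only hands a joining node peers which can currently answer. The 
names pick_seeds and is_reachable are hypothetical and exist only for this 
illustration; nothing here is Cassandra's actual seed provider code.

```python
from typing import Callable, Iterable, List


def pick_seeds(candidates: Iterable[str],
               local_address: str,
               is_reachable: Callable[[str], bool]) -> List[str]:
    """Return the candidate seeds that are not this node and respond right now."""
    usable = [addr for addr in candidates
              if addr != local_address and is_reachable(addr)]
    if not usable:
        # No peer can supply topology, so the caller has to decide whether
        # this is a brand-new cluster or a misconfiguration.
        raise RuntimeError("no reachable seed other than the local node")
    return usable


if __name__ == "__main__":
    up = {"10.0.0.2", "10.0.0.3"}  # assumed set of live peers
    print(pick_seeds(["10.0.0.1", "10.0.0.2", "10.0.0.3"],
                     "10.0.0.1", lambda addr: addr in up))
```

Under this assumption a node never receives itself as a seed, so whether it 
also appears in other nodes' seed lists stops mattering for its own bootstrap.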

> Seed nodes should be able to bootstrap without manual intervention
> --
>
> Key: CASSANDRA-5836
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5836
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Bill Hathaway
>Priority: Minor
>
> The current logic doesn't allow a seed node to be bootstrapped.  If a user 
> wants to bootstrap a node configured as a seed (for example to replace a seed 
> node via replace_token), they first need to remove the node's own IP from the 
> seed list, and then start the bootstrap process.  This seems like an 
> unnecessary step since a node never uses itself as a seed.
> I think it would be a better experience if the logic was changed to allow a 
> seed node to bootstrap without manual intervention when there are other seed 
> nodes up in a ring.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-02 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383885#comment-16383885
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

[~snazy] I'm not sure what you have in mind: restoring from snapshot?




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-02 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383877#comment-16383877
 ] 

Robert Stupp commented on CASSANDRA-5836:
-

Don't assume that a new cluster is empty, or that the first-ever seed node has 
no data. If authentication is enabled, a default user is created. It would be 
very surprising for users ("can't log in with cassandra anymore!"), and 
actually a real security issue, if that user were recreated (after it was 
dropped) or its password changed back to the default.




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-02 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383821#comment-16383821
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

{quote}In the case where seed nodes cannot be contacted, how do you determine 
if this is the first node in a cluster (so we should special-case and skip 
bootstrap) vs. a misconfiguration or other seeds being down, in which case 
the bootstrap should fail?{quote}

Good point.  In this case I think it is worthwhile to require 
{{auto_bootstrap=false}} when deploying a new cluster or DC, which also makes 
it consistent (new cluster vs. new DC) and doesn't require special handling of 
the very first seed.  {{auto_bootstrap=true}} should then signal that the node 
expects some other nodes to be there, so bootstrap would fail in the above 
case, but you won't end up with a blank node serving requests.

The implication is that after deploying a new cluster one would need to go and 
update the configuration to remove the non-default {{auto_bootstrap}} setting, but this is 
already the case when deploying a new DC, so I think it is acceptable.
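
The rule proposed in this comment can be sketched as a small decision 
function. This is a simplified Python model, not Cassandra code: the names 
Startup and startup_action are hypothetical, and the sketch ignores restarts 
of nodes that have already joined the ring.

```python
from enum import Enum


class Startup(Enum):
    BOOTSTRAP = "stream data from existing nodes, then join"
    JOIN_WITHOUT_STREAMING = "join immediately with no data"
    FAIL = "refuse to start"


def startup_action(auto_bootstrap: bool, peers_reachable: bool) -> Startup:
    """Decide what a starting node does under the proposal in this comment."""
    if not auto_bootstrap:
        # The operator explicitly declared a new cluster or a new DC.
        return Startup.JOIN_WITHOUT_STREAMING
    if peers_reachable:
        return Startup.BOOTSTRAP
    # auto_bootstrap=true means other nodes are expected to exist, so an
    # unreachable cluster is treated as an error, not as "first node".
    return Startup.FAIL


if __name__ == "__main__":
    for flag in (False, True):
        for reachable in (False, True):
            print(flag, reachable, startup_action(flag, reachable).name)
```

Note that the seed flag does not appear as an input at all, which is the point 
of the proposal: the first-node case is expressed by configuration rather than 
inferred from the seed list.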




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-02 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383757#comment-16383757
 ] 

Jeremiah Jordan commented on CASSANDRA-5836:


bq. I think "seed nodes can't bootstrap" is only true for the very first node 
deployed in a cluster (which must be a seed node).  Any further nodes deployed 
into a new cluster *can* bootstrap, but they *don't need* to, since there is no 
data. 

In the case where seed nodes cannot be contacted, how do you determine if 
this is the first node in a cluster (so we should special-case and skip 
bootstrap) vs. a misconfiguration or other seeds being down, in which case 
the bootstrap should fail?




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-02 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383742#comment-16383742
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

I think "seed nodes can't bootstrap" is only true for the very first node 
deployed in a cluster (which must be a seed node).  Any further nodes deployed 
into a new cluster *can* bootstrap, but they *don't need* to, since there is no 
data.  For all practical purposes, when deploying a new cluster there is no 
difference if you specify {{auto_bootstrap=false}} or leave the default.  We 
could also say that bootstrapping the very first node in the cluster is a 
no-op.

Another case when you add seed nodes is when adding a new DC.  In this case 
they are *not* the first ones to start so they could bootstrap, but most of the 
time this is not what you want, so you set {{auto_bootstrap=false}} for every 
node in the new DC, including the new seeds.

Finally, if a seed node is restarted because of maintenance it would help if it 
behaved the same way as normal nodes.  It shouldn't be OK that it runs w/o data 
(due to misconfiguration, for example) and starts to accept read requests again.

I would argue that maintenance safety is much more important than ease of 
initial deployment.  So if we need to make a trade-off, we should favor 
increased safety in the long run.

I think the safest option is to just allow seed nodes to bootstrap when there 
are nodes to bootstrap from.  So the *only* special case is the very first 
seed node.

At the same time there is no trade-off to be made really: such a change is 
backwards-compatible.  Users will not notice the difference (though it should 
be documented of course), because as explained above the specific setting of 
{{auto_bootstrap}} is:
1) irrelevant when deploying a new cluster;
and 
2) supposed to be explicitly set to {{false}} when deploying a new DC;
or
3) has to be set to {{true}} when adding a new node to an existing DC.

As a side effect, this will allow new nodes to be bootstrapped directly as 
seeds.

Did I miss something?
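
For comparison, the rule the ticket describes and the rule suggested in this 
comment can be written side by side. This is only a sketch in Python under 
simplifying assumptions: both function names are hypothetical, and details 
such as nodes that have already completed bootstrap are left out.

```python
def should_bootstrap_current(auto_bootstrap: bool, is_seed: bool) -> bool:
    """Behaviour the ticket describes: a node listed as a seed never bootstraps."""
    return auto_bootstrap and not is_seed


def should_bootstrap_proposed(auto_bootstrap: bool, other_nodes_exist: bool) -> bool:
    """Suggested rule: the seed flag is irrelevant; only the first node skips."""
    return auto_bootstrap and other_nodes_exist


if __name__ == "__main__":
    # Replacing a seed in an existing cluster with the default auto_bootstrap.
    print(should_bootstrap_current(auto_bootstrap=True, is_seed=True))
    print(should_bootstrap_proposed(auto_bootstrap=True, other_nodes_exist=True))
```

The two rules differ only for a node that is configured as a seed while other 
nodes already exist, which is exactly the replacement case the ticket is about.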





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-01 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16382084#comment-16382084
 ] 

Jeremiah Jordan commented on CASSANDRA-5836:


So why don’t seed nodes bootstrap?  The idea is that they are the first nodes 
added to a cluster, so they can’t bootstrap: there is nothing to bootstrap 
from.  If we let seed nodes bootstrap, what is the logic for the new cluster 
startup case?  We have to be careful we don’t just start failing everyone’s 
new cluster creations because bootstrap isn’t possible.
Some thoughts on what could be done:
Do we only allow seed nodes with only themselves in the list to start without 
bootstrapping?  So if you want to keep the list the same on all nodes you need 
to modify the first node’s yaml?
Do we require setting auto_bootstrap to false for the first nodes in a cluster 
and get rid of the special casing around seed nodes altogether?  Changing the 
burden from modifying a node’s seed list to re-add it, to making sure 
auto_bootstrap is true?
If we want to mess with this, let’s make sure to think through exactly how a 
change here will affect cluster ops for initial creation and also for 
replacing/adding nodes later.  

I’m not sure what the right answer is, but let’s make sure any change here is a 
clear win for all cases rather than just moving the burden around.  
Let’s also make sure to keep clear logging around what is happening, as we do 
at least currently have a clear log message that a node did not bootstrap 
because it was a seed; that was a nice addition to help in tracking down why 
your new node had no data.




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-02-28 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381639#comment-16381639
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

[~rcoli] Good point.  But this doesn't make a lot of difference, IMO.  The 
nodes started with {{auto_bootstrap=false}} also receive all writes, but since 
they join more or less immediately, they are also responsible for reads and 
these copies count towards the client request CL.

The difference comes from the fact that a bootstrapping node doesn't fully join 
the ring before bootstrap is complete, so it receives all writes and doesn't 
serve any reads yet.
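
The distinction drawn in these two comments can be summarised in a small 
model. This is a Python sketch of the behaviour as described in this thread, 
not of Cassandra's internals; NodeRole and role_while_starting are 
hypothetical names, and the claim that a bootstrapping node's copies do not 
count towards CL is taken from the comment being replied to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRole:
    receives_writes: bool
    serves_reads: bool
    counts_toward_cl: bool


def role_while_starting(bootstrapping: bool) -> NodeRole:
    """Describe what a node does while it is coming up, per this thread."""
    if bootstrapping:
        # Still streaming: gets "extra" copies of writes, serves no reads.
        return NodeRole(receives_writes=True, serves_reads=False,
                        counts_toward_cl=False)
    # auto_bootstrap=false (or a seed): joins more or less immediately.
    return NodeRole(receives_writes=True, serves_reads=True,
                    counts_toward_cl=True)


if __name__ == "__main__":
    print(role_while_starting(bootstrapping=True))
    print(role_while_starting(bootstrapping=False))
```

Seen this way, the risk with a non-bootstrapping node is the second row: it 
answers reads and counts towards CL before it holds the data it owns.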





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-02-28 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381561#comment-16381561
 ] 

Robert Coli commented on CASSANDRA-5836:


[~oshulgin] : as I understand it, a bootstrapping node also receives "extra" 
copies of writes via the storage protocol, which is not technically "streaming 
in" the data. These "extra" copies do not count towards CL.

While I'm commenting on this ticket, it seems appropriate to share my 
enthusiasm for resolving the question of the bootstrapping seed nodes. This has 
been a longstanding point of pain and confusion for operators and those who 
support them.




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-02-27 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16378207#comment-16378207
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

[~jjirsa] thanks for reopening this.  Before suggesting a fix I'd like to have 
a better understanding of what the bootstrap process really is.
[~jbellis] could you please elaborate on the "special cases" you've mentioned?

In the literature I can find definitions akin to "Bootstrapping is the process 
of claiming token ranges and streaming in the data from other nodes".  This 
cannot be accurate, because the nodes which don't bootstrap (seeds, or the ones 
having {{auto_bootstrap}} set to {{false}} explicitly) also claim token 
ranges; they just don't stream the data in and are immediately responsible for 
handling read requests.

If I understand it correctly, the above definition is what really "joining the 
ring" is, i.e. "claiming token ranges and (optionally) streaming in the data".  
By this reasoning bootstrapping is only about "streaming in the data".  Is 
there anything else to the bootstrap process that I'm not aware of?  Please 
clarify.





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-02-26 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377289#comment-16377289
 ] 

Jeff Jirsa commented on CASSANDRA-5836:
---

Still causing pain, re-opening to discuss valid options. MAY END UP CLOSING 
AGAIN if no good option is found.





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2017-06-29 Thread Mikael Valot (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16068104#comment-16068104
 ] 

Mikael Valot commented on CASSANDRA-5836:
-

Same here: we had 3 DSE 5.1 nodes and created 3 new nodes as seeds, with a 
replication factor of 3.
Everything was looking good until our users noticed that some data was missing, 
a few hours before an important client demo.
It was fortunate that it was not a production environment and that we had 
another environment available for the demo.

We observed during the data loss that some partitions were allocated to the 3 
new nodes, which explains why the data was not accessible anymore.
We managed to recover the data by stopping one of the new nodes, and by running 
nodetool removenode followed by nodetool repair.
Cassandra subsequently managed to copy the data from the old nodes to the 2 new 
ones. 

Cassandra should either prevent the user from starting the new nodes when they 
are set up as seeds, or have some mechanism to prevent any loss of data.
IMHO this ticket should be reopened.




[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2015-12-08 Thread Maciek (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15047609#comment-15047609
 ] 

Maciek commented on CASSANDRA-5836:
---

I've also been pretty confused by this. Simplifying setup would make it less 
likely for new operators to make mistakes, and allow them to have a better 
first impression of the system. I hope the decision to mark this as wontfix is 
revisited.



[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2014-08-14 Thread Jon Travis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097647#comment-14097647
 ] 

Jon Travis commented on CASSANDRA-5836:
---

I was just bitten by this as well.  Our ops uses ZooKeeper to store a list of 
all our infrastructure, so I wrote a SeedProvider that peeked into ZK for the 
list of Cassandra nodes and returned that as the seed list... big mistake.  
Our push-button deployment launched the node (it thought it was a seed), so it 
essentially stopped doing anything, reported errors of missing keyspaces and 
column families, then simply sat there.  All the while, it claimed it had a 
portion of the ring, yet no data. 

There is no good documentation about this and no warnings in the logs -- this 
is certainly something that will bite more people.  It would be nice if the 
process could warn about the error or refuse to start under this scenario.  

> Seed nodes should be able to bootstrap without manual intervention
> --
>
> Key: CASSANDRA-5836
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5836
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Bill Hathaway
>Priority: Minor
>
> The current logic doesn't allow a seed node to be bootstrapped.  If a user 
> wants to bootstrap a node configured as a seed (for example to replace a seed 
> node via replace_token), they first need to remove the node's own IP from the 
> seed list, and then start the bootstrap process.  This seems like an 
> unnecessary step since a node never uses itself as a seed.
> I think it would be a better experience if the logic was changed to allow a 
> seed node to bootstrap without manual intervention when there are other seed 
> nodes up in a ring.





[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2014-08-14 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097683#comment-14097683
 ] 

Robert Stupp commented on CASSANDRA-5836:
-

[~jtravis] It's basically documented here: 
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_seed_node.html
But you are right that the docs could be better on this point. I could not 
find a place in the instructions for initializing a cluster that says: do not 
use all nodes as seed nodes. It just says at least 1 per DC.
Could you drop an email to docs at datastax dot com?



[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2013-12-10 Thread Anne Sullivan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13844643#comment-13844643
 ] 

Anne Sullivan commented on CASSANDRA-5836:
--

For ease of maintenance and because we'll likely have many deployments where 
the cluster size is very small (2 - 5 nodes), I'm wondering if I can set my 
seed_provider list to contain all nodes except the local node's IP, i.e. for 
nodes A-C:
A -> B, C
B -> A, C
C -> A, B

I think my question is more or less in line with Robert's comment: I'm 
wondering if satisfying ONLY 2) is safe.

Datastax docs suggest that every node should have the same list of seeds, and 
also say "To prevent partitions in gossip communications, use the same list of 
seed nodes in all nodes in a cluster."  In my case, I wouldn't end up with 
gossip partitions in the example above, so if that's the only reason for the 
recommendation of keeping the list consistent across all nodes then it should 
be ok.  

I would like to have all nodes auto-bootstrap, so I can automate the deployment 
process, push the config once and forget about it.  When adding a new node, I 
don't want to do 2 edits to the config file (first start without node as seed, 
then add node as seed). 
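
The per-node lists described above are easy to generate from one shared 
inventory. The following Python sketch assumes a hypothetical helper named 
seed_lists; it only computes the lists and does not touch any Cassandra 
configuration, and whether such asymmetric lists are safe is the open 
question of this comment.

```python
from typing import Dict, List


def seed_lists(nodes: List[str]) -> Dict[str, List[str]]:
    """Map each node to every other node in the cluster, excluding itself."""
    return {node: [peer for peer in nodes if peer != node] for node in nodes}


if __name__ == "__main__":
    for node, seeds in seed_lists(["A", "B", "C"]).items():
        print(node, "->", ", ".join(seeds))
```

With a generator like this, the deployment tooling can push one inventory to 
every host and derive the local seed list at install time, so adding a node 
needs a single configuration change per host.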



[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2013-11-27 Thread Christopher J. Bottaro (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13834006#comment-13834006
 ] 

Christopher J. Bottaro commented on CASSANDRA-5836:
---

I agree with Robert.  We didn't come across this information until it bit us 
pretty badly:

http://www.mail-archive.com/user@cassandra.apache.org/msg33382.html

Took us 36 hours of work over a weekend to recover... :(


[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2013-08-01 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727032#comment-13727032
 ] 

Robert Coli commented on CASSANDRA-5836:


Replacing a seed node is a very common operation, and this best practice is 
confusing and poorly documented. People regularly ask on #cassandra and 
cassandra-user@ how to replace a seed node, and are confused by the answer. 
The workaround also means that if you do not restart your node after 
bootstrapping it (having changed the conf file back so that it lists itself 
as a seed), the node runs until its next restart without any understanding 
that it is a seed node.

Being a seed node appears to mean two things:

1) I have myself as an entry in my own seed list, so I know that I am a seed.
2) Other nodes have me in their seed list, so they consider me a seed.

The current code checks for 1) and refuses to bootstrap. The workaround is to 
remove the 1) state temporarily. But if it is unsafe to bootstrap a seed node 
because of *either* 1) or 2), the workaround is unsafe.
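
To make the distinction concrete, here is a small sketch of the two conditions and of the check as described above. All names are hypothetical and this is not Cassandra's actual code; it only models the behaviour described in this comment, where condition 1) alone blocks bootstrap.

```python
def is_self_seed(node, seed_lists):
    """Condition 1: the node appears in its own seed list."""
    return node in seed_lists.get(node, [])


def is_seed_for_others(node, seed_lists):
    """Condition 2: at least one other node lists this node as a seed."""
    return any(
        node in seeds for other, seeds in seed_lists.items() if other != node
    )


def may_bootstrap(node, seed_lists):
    """Model of the described check: only condition 1 prevents bootstrap."""
    return not is_self_seed(node, seed_lists)


# Workaround state for node A: removed from its own list, still listed by B and C.
config = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
print(may_bootstrap("A", config))       # condition 1 is false, so bootstrap is allowed
print(is_seed_for_others("A", config))  # condition 2 still holds
```

With the workaround applied, the node is allowed to bootstrap even though every other node still treats it as a seed, which is exactly the state the question above is about.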

Can you explicate the special cases here? I sincerely would like to understand 
why the code tries to prevent a seed from bootstrapping when one can clearly, 
and apparently safely, bootstrap a seed. :)
