[ 
https://issues.apache.org/jira/browse/CASSANDRA-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16393438#comment-16393438
 ] 

Joseph Lynch edited comment on CASSANDRA-14303 at 3/9/18 7:46 PM:
------------------------------------------------------------------

bq. An issue with having a default replication would be that you *must* set 
autobootstrap:false when adding a new DC, otherwise the first nodes added in 
the DC would get all the data. Given proper DC creation, it is not required to 
do this right now.

Yes, that edge case, along with others (mostly gossip inconsistency), is why I 
propose evaluating the DCs only when a CREATE or ALTER statement executes. The 
operator would still have to run:
{noformat}
ALTER KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
'replication_factor': 3}{noformat}
manually after adding a new datacenter to trigger regeneration of the dcs. 
It's also worth noting that, as proposed, describing the keyspace afterwards 
would show all the dcs that it is actually replicated to:
{noformat}
cqlsh> DESCRIBE KEYSPACE test

CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
'us-west-1': '3', 'us-east-1': '3'} AND durable_writes = true;{noformat}
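
The statement-time evaluation described above can be sketched roughly as follows. This is only an illustration of the idea, not Cassandra code: the function name {{expand_at_statement_time}} and the plain set standing in for the cluster's known datacenters are assumptions made for the example.

```python
from typing import Dict, Set


def expand_at_statement_time(default_rf: int, known_dcs: Set[str]) -> Dict[str, int]:
    """Hypothetical helper: snapshot the DCs known right now into an explicit map."""
    # The result is a plain copy, so later topology changes cannot affect it.
    return {dc: default_rf for dc in sorted(known_dcs)}


# Stand-in for the cluster topology at the time CREATE KEYSPACE runs.
known_dcs = {"us-west-1"}
before = expand_at_statement_time(3, known_dcs)

# A new datacenter joins later; the stored replication map does not change,
# so the first nodes in the new DC do not suddenly own data.
known_dcs.add("us-east-1")
unchanged = dict(before)

# Only when the operator re-runs ALTER KEYSPACE is the map regenerated.
after = expand_at_statement_time(3, known_dcs)

print(before)
print(after)
```

Because the expansion produces an explicit per-DC map, a later DESCRIBE would show concrete datacenter names rather than the {{replication_factor}} shorthand.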



> NetworkTopologyStrategy could have a "default replication" option
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-14303
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14303
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Configuration
>            Reporter: Joseph Lynch
>            Priority: Minor
>
> Right now when creating a keyspace with {{NetworkTopologyStrategy}} the user 
> has to manually specify the datacenters they want their data replicated to 
> with parameters, e.g.:
> {noformat}
>  CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'dc1': 3, 'dc2': 3}{noformat}
> This is a poor user interface because it requires the creator of the keyspace 
> (typically a developer) to know the layout of the Cassandra cluster (which 
> may or may not be controlled by them). Also, at least in my experience, folks 
> typo the datacenters _all_ the time. To work around this I see a number of 
> users creating automation around this where the automation describes the 
> Cassandra cluster and automatically expands out to all the dcs that Cassandra 
> knows about. Why can't Cassandra just do this for us, re-using the previously 
> forbidden {{replication_factor}} option (for backwards compatibility):
> {noformat}
>  CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'replication_factor': 3}{noformat}
> This would automatically replicate this Keyspace to all datacenters that are 
> present in the cluster. If you need to _override_ the default you could 
> supply a datacenter name, e.g.:
> {noformat}
> > CREATE KEYSPACE test WITH replication = {'class': 
> > 'NetworkTopologyStrategy', 'replication_factor': 3, 'dc1': 2}
> > DESCRIBE KEYSPACE test
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'dc1': '2', 'dc2': '3'} AND durable_writes = true;
> {noformat}
> On the implementation side, I think it may be reasonably straightforward to 
> do an auto-expansion at the time of keyspace creation (or alter), where the 
> above would automatically expand to list out the datacenters. We could allow 
> this to be recomputed whenever an AlterKeyspaceStatement runs, so that to add 
> datacenters you would just run:
> {noformat}
> ALTER KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'replication_factor': 3}{noformat}
> and this would compare the dcs in the current schema against those in the 
> cluster and add in any new ones (for safety reasons we'd probably never remove 
> non-zero rf dcs when auto-generating dcs). Removing a datacenter becomes an 
> alter that includes an override for the dc you want to remove (or of course 
> you can always skip the auto-expansion and just use the old way):
> {noformat}
> // Tell it explicitly not to replicate to dc3
> > ALTER KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> > 'replication_factor': 3, 'dc3': 0}
> > DESCRIBE KEYSPACE test
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'dc1': '3', 'dc2': '3'} AND durable_writes = true;{noformat}
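
The expansion rules in the description above could be modelled along these lines. This is a minimal sketch under stated assumptions, not the actual implementation: {{expand_replication}} is a hypothetical name, replication options are plain dicts with integer values, and it assumes that on ALTER the datacenters already in the schema keep their current rf while only new ones receive the default.

```python
from typing import Dict, Iterable, Optional


def expand_replication(options: Dict[str, int],
                       known_dcs: Iterable[str],
                       current: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Hypothetical model of the proposed auto-expansion.

    options   -- replication options from the statement, minus 'class'
    known_dcs -- datacenters the cluster knows about at statement time
    current   -- existing per-DC settings (ALTER only), else None
    """
    explicit = dict(options)
    default_rf = explicit.pop("replication_factor", None)

    if default_rf is None:
        # Old way: no auto-expansion, the explicit map is used as given.
        result: Dict[str, int] = {}
    else:
        # Start from the current schema so auto-expansion never drops a DC
        # (assumption: existing DCs keep their current rf).
        result = dict(current or {})
        for dc in known_dcs:
            result.setdefault(dc, default_rf)

    # Explicit per-DC overrides always win over the default.
    result.update(explicit)

    # An rf of 0 is the explicit way to stop replicating to a DC.
    return {dc: rf for dc, rf in result.items() if rf > 0}


# CREATE with an override for dc1
print(expand_replication({"replication_factor": 3, "dc1": 2}, ["dc1", "dc2"]))

# ALTER that removes dc3 through an explicit zero override
print(expand_replication({"replication_factor": 3, "dc3": 0},
                         ["dc1", "dc2", "dc3"],
                         current={"dc1": 3, "dc2": 3, "dc3": 3}))
```

Keeping the function pure, with the known datacenters passed in as an argument, mirrors the point that the expansion is computed once per statement rather than tracked continuously.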


