Hello, We are in the process of upgrading out cassandra installation from a single instance to a 6 node cluster with a replication factor of 3. We are using Cassandra 1.1.2. This is something I've done before in other environments, but now I've hit an interesting issue. The cluster has been setup and all the nodes have joined. I was about to update the replication factor to 3 via cassandra-cli:
[open@unknown] use open; Authenticated to keyspace: open [default@open] update keyspace open with placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options={us-east:3}; 4698e471-5a1d-30f2-aa11-761d204581ff Waiting for schema agreement... ... schemas agree across the cluster The above looks normal, but when I look at the schema, the replication factor is unchanged: [default@open] describe open; Keyspace: open: Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy Durable Writes: true Options: [us-east:1] Column Families: ... I couldn't figure out why this was, but then I saw this thread: http://www.datastax.com/support-forums/topic/cassandra-111-update-keyspace-not-working So I tried creating a new keyspace "ks" and looked at the results: [default@open] use system; Authenticated to keyspace: system [default@system] list schema_keyspace; schema_keyspace not found in current keyspace. [default@system] list schema_keyspaces; Using default limit of 100 Using default column limit of 100 ------------------- RowKey: open => (column=durable_writes, value=true, timestamp=530617107329814) => (column=name, value=open, timestamp=530617107329814) => (column=strategy_class, value=org.apache.cassandra.locator.NetworkTopologyStrategy, timestamp=530617107329814) => (column=strategy_options, value={"us-east":"1"}, timestamp=530617107329814) ------------------- RowKey: ks => (column=durable_writes, value=true, timestamp=42396175198913) => (column=name, value=ky, timestamp=42396175198913) => (column=strategy_class, value=org.apache.cassandra.locator.NetworkTopologyStrategy, timestamp=42396175198913) => (column=strategy_options, value={"datacenter1":"1"}, timestamp=42396175198913) Notice the "timestamp" on the new keyspace is MUCH younger than "open" (by more than a factor of 10). I didn't understand how this could be, as time has always been in sync. I decided to look at the code to see if I could spot anything. When cassandra-cli attempts to create a new keyspace, it uses thrift, and ends up here (in CassandraServer.java): public String system_update_keyspace(KsDef ks_def) throws InvalidRequestException, SchemaDisagreementException, TException { logger.debug("update_keyspace"); ThriftValidation.validateKeyspaceNotSystem(ks_def.name); ... MigrationManager.announceKeyspaceUpdate(KSMetaData.fromThrift(ks_def)); return Schema.instance.getVersion().toString(); ... } Which then calls: public static void announceKeyspaceUpdate(KSMetaData ksm) throws ConfigurationException { ksm.validate(); KSMetaData oldKsm = Schema.instance.getKSMetaData(ksm.name); if (oldKsm == null) throw new ConfigurationException(String.format("Cannot update non existing keyspace '%s'.", ksm.name)); announce(oldKsm.toSchemaUpdate(ksm, System.nanoTime())); } It then uses the results of System.nanoTime in the timestamp. I wrote a simple java program to output System.nanoTime on the system in which I attempted to add the new keyspace, and the output was: 46627528340034 Which is in the realm of the keyspace I added above. System.nanoTime() is java instance dependent (nanoTime). You will get different values depending on what machine you run on and is not necessarily associated with you system clock. I ran this on several different machines, all verified to be in sync with NTP, and got massively different results. In fact, when I stopped and started my instance, my nanoTime became: 97234377869 I then created another keyspace "kw": [default@system] list schema_keyspaces; Using default limit of 100 Using default column limit of 100 ------------------- RowKey: open => (column=durable_writes, value=true, timestamp=530617107329814) => (column=name, value=open, timestamp=530617107329814) => (column=strategy_class, value=org.apache.cassandra.locator.NetworkTopologyStrategy, timestamp=530617107329814) => (column=strategy_options, value={"us-east":"1"}, timestamp=530617107329814) ------------------- RowKey: ks => (column=durable_writes, value=true, timestamp=42396175198913) => (column=name, value=ky, timestamp=42396175198913) => (column=strategy_class, value=org.apache.cassandra.locator.NetworkTopologyStrategy, timestamp=42396175198913) => (column=strategy_options, value={"datacenter1":"1"}, timestamp=42396175198913) ------------------- RowKey: kw => (column=durable_writes, value=true, timestamp=236211433609) => (column=name, value=kw, timestamp=236211433609) => (column=strategy_class, value=org.apache.cassandra.locator.NetworkTopologyStrategy, timestamp=236211433609) => (column=strategy_options, value={"datacenter1":"1"}, timestamp=236211433609) What I believe is happening is updates are not working because, as the thread I linked above indicated, Cassandra is seeing my update as older than the current entries, and is not honoring it. However, this is because it is using System.nanoTime in thrift, which has no relation to the system clock time. I tried to find something in JIRA, but I couldn't really find any issue that matched (and wasn't fixed for other reasons in earlier releases). Is there something simpler going on? Thanks, -Mike