Hello,

We are in the process of upgrading out cassandra installation from a single 
instance to a 6 node cluster with a replication factor of 3.  We are using 
Cassandra 1.1.2.  This is something I've done before in other environments, but 
now I've hit an interesting issue.  The cluster has been setup and all the 
nodes have joined.  I was about to update the replication factor to 3 via 
cassandra-cli:

[open@unknown] use open;                                
Authenticated to keyspace: open

[default@open]  update keyspace open with 
placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy' and 
strategy_options={us-east:3};
4698e471-5a1d-30f2-aa11-761d204581ff
Waiting for schema agreement...
... schemas agree across the cluster

The above looks normal, but when I look at the schema, the replication factor 
is unchanged:

[default@open] describe open;                                                 
Keyspace: open:                                                       
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
    Options: [us-east:1]
  Column Families:
...

I couldn't figure out why this was, but then I saw this thread:

http://www.datastax.com/support-forums/topic/cassandra-111-update-keyspace-not-working

So I tried creating a new keyspace "ks" and looked at the results:

[default@open] use system;
Authenticated to keyspace: system
[default@system] list schema_keyspace;
schema_keyspace not found in current keyspace.
[default@system] list schema_keyspaces;
Using default limit of 100
Using default column limit of 100
-------------------
RowKey: open
=> (column=durable_writes, value=true, timestamp=530617107329814)
=> (column=name, value=open, timestamp=530617107329814)
=> (column=strategy_class, 
value=org.apache.cassandra.locator.NetworkTopologyStrategy, 
timestamp=530617107329814)
=> (column=strategy_options, value={"us-east":"1"}, timestamp=530617107329814)
-------------------
RowKey: ks
=> (column=durable_writes, value=true, timestamp=42396175198913)
=> (column=name, value=ky, timestamp=42396175198913)
=> (column=strategy_class, 
value=org.apache.cassandra.locator.NetworkTopologyStrategy, 
timestamp=42396175198913)
=> (column=strategy_options, value={"datacenter1":"1"}, 
timestamp=42396175198913)


Notice the "timestamp" on the new keyspace is MUCH younger than "open" (by more 
than a factor of 10).  

I didn't understand how this could be, as time has always been in sync. 

I decided to look at the code to see if I could spot anything.  When 
cassandra-cli attempts to create a new keyspace, it uses thrift, and ends up 
here (in CassandraServer.java):

 public String system_update_keyspace(KsDef ks_def)
    throws InvalidRequestException, SchemaDisagreementException, TException
    {
        logger.debug("update_keyspace");
        ThriftValidation.validateKeyspaceNotSystem(ks_def.name);
 ...
            
MigrationManager.announceKeyspaceUpdate(KSMetaData.fromThrift(ks_def));
            return Schema.instance.getVersion().toString();
 ...
    }


Which then calls:

   public static void announceKeyspaceUpdate(KSMetaData ksm) throws 
ConfigurationException
    {
        ksm.validate();

        KSMetaData oldKsm = Schema.instance.getKSMetaData(ksm.name);
        if (oldKsm == null)
            throw new ConfigurationException(String.format("Cannot update non 
existing keyspace '%s'.", ksm.name));

        announce(oldKsm.toSchemaUpdate(ksm, System.nanoTime()));
    }

It then uses the results of System.nanoTime in the timestamp.

I wrote a simple java program to output System.nanoTime on the system in which 
I attempted to add the new keyspace, and the output was:

46627528340034

Which is in the realm of the keyspace I added above.  System.nanoTime() is java 
instance dependent (nanoTime).  You will get different values depending on what 
machine you run on and is not necessarily associated with you system clock.  I 
ran this on several different machines, all verified to be in sync with NTP, 
and got massively different results.  In fact, when I stopped and started my 
instance, my nanoTime became:

97234377869

I then created another keyspace "kw":

[default@system] list schema_keyspaces;
Using default limit of 100
Using default column limit of 100
-------------------
RowKey: open
=> (column=durable_writes, value=true, timestamp=530617107329814)
=> (column=name, value=open, timestamp=530617107329814)
=> (column=strategy_class, 
value=org.apache.cassandra.locator.NetworkTopologyStrategy, 
timestamp=530617107329814)
=> (column=strategy_options, value={"us-east":"1"}, timestamp=530617107329814)
-------------------
RowKey: ks
=> (column=durable_writes, value=true, timestamp=42396175198913)
=> (column=name, value=ky, timestamp=42396175198913)
=> (column=strategy_class, 
value=org.apache.cassandra.locator.NetworkTopologyStrategy, 
timestamp=42396175198913)
=> (column=strategy_options, value={"datacenter1":"1"}, 
timestamp=42396175198913)
-------------------
RowKey: kw
=> (column=durable_writes, value=true, timestamp=236211433609)
=> (column=name, value=kw, timestamp=236211433609)
=> (column=strategy_class, 
value=org.apache.cassandra.locator.NetworkTopologyStrategy, 
timestamp=236211433609)
=> (column=strategy_options, value={"datacenter1":"1"}, timestamp=236211433609)


What I believe is happening is updates are not working because, as the thread I 
linked above indicated, Cassandra is seeing my update as older than the current 
entries, and is not honoring it.  However, this is because it is using 
System.nanoTime in thrift, which has no relation to the system clock time.

I tried to find something in JIRA, but I couldn't really find any issue that 
matched (and wasn't fixed for other reasons in earlier releases).  Is there 
something simpler going on?

Thanks,
-Mike

Reply via email to