Re: Schema disagreement
Hi, On Tue, May 1, 2018 at 10:27 PM Gábor Auth wrote: > One or two years ago I've tried the CDC feature but switched it off... maybe > it is a side effect of the switched-off CDC? How can I fix it? :) > Okay, I've worked it out. Updated the schema of the affected keyspaces on the new nodes with 'cdc=false' and everything is okay now. I think it is a strange bug around CDC... Bye, Gábor Auth
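For anyone hitting the same CDC-flavoured disagreement, the fix described above can be expressed as a table-level ALTER followed by a convergence check. A minimal sketch, assuming cqlsh access and made-up keyspace/table names (my_ks.my_table); the cdc table property exists from Cassandra 3.8 on:

$ cqlsh <node> -e "ALTER TABLE my_ks.my_table WITH cdc = false;"
$ nodetool describecluster    # all nodes should now report a single schema version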
Re: Schema disagreement
Hi, On Tue, May 1, 2018 at 7:40 PM Gábor Auth wrote:
> What can I do? Any suggestion? :(
Okay, I've diffed the good and the bad system_schema tables. The only difference is the `cdc` field in three keyspaces (in `tables` and `views`):
- the value of the `cdc` field on the good node is `False`
- the value of the `cdc` field on the bad node is `null`
The value of the `cdc` field on the other keyspaces is `null`. One or two years ago I've tried the CDC feature but switched it off... maybe it is a side effect of the switched-off CDC? How can I fix it? :) Bye, Gábor Auth
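The cdc flag being compared above lives in the schema tables themselves, so the diff can be reproduced directly; a sketch assuming Cassandra 3.x, where system_schema.tables (and, per this thread, system_schema.views) carry a cdc column:

$ cqlsh <good-node> -e "SELECT keyspace_name, table_name, cdc FROM system_schema.tables;"
$ cqlsh <bad-node>  -e "SELECT keyspace_name, table_name, cdc FROM system_schema.tables;"

Rows that differ between False and null are exactly the tables whose serialized schema hashes differently on the two sides.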
Re: Schema disagreement
Hi, On Mon, Apr 30, 2018 at 11:11 PM Gábor Auth wrote:
> On Mon, Apr 30, 2018 at 11:03 PM Ali Hubail wrote:
>> What steps have you performed to add the new DC? Have you tried to follow certain procedures like this?
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html
> Yes, exactly. :/
Okay, removed all new nodes (with `removenode`). Cleared all new nodes (removed data and logs). I did all the steps described in the link (again). Same result:
Cluster Information:
Name: cluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
5de14758-887d-38c1-9105-fc60649b0edf: [new, new, ...]
f4ed784a-174a-38dd-a7e5-55ff6f3002b2: [old, old, ...]
The old nodes gossip their own schema:
DEBUG [InternalResponseStage:1] 2018-05-01 17:36:36,266 MigrationManager.java:572 - Gossiping my schema version f4ed784a-174a-38dd-a7e5-55ff6f3002b2
DEBUG [InternalResponseStage:1] 2018-05-01 17:36:36,863 MigrationManager.java:572 - Gossiping my schema version f4ed784a-174a-38dd-a7e5-55ff6f3002b2
The new nodes gossip their own schema:
DEBUG [InternalResponseStage:4] 2018-05-01 17:36:26,329 MigrationManager.java:572 - Gossiping my schema version 5de14758-887d-38c1-9105-fc60649b0edf
DEBUG [InternalResponseStage:4] 2018-05-01 17:36:27,595 MigrationManager.java:572 - Gossiping my schema version 5de14758-887d-38c1-9105-fc60649b0edf
What can I do? Any suggestion? :( Bye, Gábor Auth
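For readers following along: the Cluster Information block above comes from nodetool, and the two-version split can be watched live while the new DC joins; a sketch using standard commands, with the debug log path assuming a default package install:

$ nodetool describecluster                                           # lists every schema version and which nodes hold it
$ grep 'Gossiping my schema version' /var/log/cassandra/debug.log | tail -5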
Re: Schema disagreement
Hi, On Mon, Apr 30, 2018 at 11:03 PM Ali Hubail wrote: > What steps have you performed to add the new DC? Have you tried to follow > certain procedures like this? > > https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html > Yes, exactly. :/ Bye, Gábor Auth
Re: Schema disagreement
Hi, What steps have you performed to add the new DC? Have you tried to follow certain procedures like this? https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html A node can appear offline to other nodes for various reasons. It would help greatly to know what steps you have taken, in order to work out why you're facing this. Ali Hubail
On Mon, Apr 30, 2018 at 3:40 PM Gábor Auth <auth.ga...@gmail.com> wrote:
> On Mon, Apr 30, 2018 at 11:39 AM Gábor Auth <auth.ga...@gmail.com> wrote: I've just tried to add a new DC and new node to my cluster (3 DCs and 10 nodes) and the new node has a different schema version:
> Is it normal? Node is marked down but doing a repair successfully?
> WARN [MigrationStage:1] 2018-04-30 20:36:56,579 MigrationTask.java:67 - Can't send schema pull request: node /x.x.216.121 is down.
> INFO [AntiEntropyStage:1] 2018-04-30 20:36:56,611 Validator.java:281 - [repair #323bf873-4cb6-11e8-bdd5-5feb84046dc9] Sending completed merkle tree to /x.x.216.121 for keyspace.table
> The `nodetool status` is looking good: UN x.x.216.121 959.29 MiB 32 ? 322e4e9b-4d9e-43e3-94a3-bbe012058516 RACK01
> Bye, Gábor Auth
Re: Schema disagreement
Hi, On Mon, Apr 30, 2018 at 11:39 AM Gábor Auth wrote: > I've just tried to add a new DC and new node to my cluster (3 DCs and 10 > nodes) and the new node has a different schema version: > Is it normal? Node is marked down but doing a repair successfully?
WARN [MigrationStage:1] 2018-04-30 20:36:56,579 MigrationTask.java:67 - Can't send schema pull request: node /x.x.216.121 is down.
INFO [AntiEntropyStage:1] 2018-04-30 20:36:56,611 Validator.java:281 - [repair #323bf873-4cb6-11e8-bdd5-5feb84046dc9] Sending completed merkle tree to /x.x.216.121 for keyspace.table
The `nodetool status` is looking good:
UN x.x.216.121 959.29 MiB 32 ? 322e4e9b-4d9e-43e3-94a3-bbe012058516 RACK01
Bye, Gábor Auth
Schema disagreement
Hi, I've just tried to add a new DC and new node to my cluster (3 DCs and 10 nodes) and the new node has a different schema version:
Cluster Information:
Name: cluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
7e12a13e-dcca-301b-a5ce-b1ad29fbbacb: [x.x.x.x, ..., ...]
bb186922-82b5-3a61-9c12-bf4eb87b9155: [new.new.new.new]
I've tried:
- node decommission and node re-addition
- resetlocalschema
- rebuild
- replace node
- repair
- cluster restart (node-by-node)
The MigrationManager is constantly running on the new node, trying to migrate the schema:
DEBUG [NonPeriodicTasks:1] 2018-04-30 09:33:22,405 MigrationManager.java:125 - submitting migration task for /x.x.x.x
What else can I do? :( Bye, Gábor Auth
Re: Schema Disagreement vs Nodetool resetlocalschema
Hi Michael, Did you ever get an answer on this? I'm curious to hear for future reference. Thanks, Jens On Monday, June 20, 2016, Michael Fong <michael.f...@ruckuswireless.com> wrote:
> Hi,
> We have recently encountered several schema disagreement issues while upgrading Cassandra. In one of the cases, the 2-node cluster idled for over 30 minutes and their schemas remained unsynced. Due to other logic flows, Cassandra cannot be restarted, and hence we need to come up with an alternative on the fly. We are thinking of doing a nodetool resetlocalschema to force the schema synchronization. How safe is this method? Do we need to disable thrift/gossip protocol before performing this function, and enable them back after resync completes?
> Thanks in advance!
> Sincerely,
> Michael Fong
-- Jens Rantil, Backend engineer, Tink AB. Email: jens.ran...@tink.se Phone: +46 708 84 18 32 Web: www.tink.se
Schema Disagreement vs Nodetool resetlocalschema
Hi, We have recently encountered several schema disagreement issues while upgrading Cassandra. In one of the cases, the 2-node cluster idled for over 30 minutes and their schemas remained unsynced. Due to other logic flows, Cassandra cannot be restarted, and hence we need to come up with an alternative on the fly. We are thinking of doing a nodetool resetlocalschema to force the schema synchronization. How safe is this method? Do we need to disable thrift/gossip protocol before performing this function, and enable them back after resync completes? Thanks in advance! Sincerely, Michael Fong
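For reference, the sequence being asked about would look roughly like this; a sketch with standard nodetool commands, not an assertion that it is safe on a wedged cluster. Note that resetlocalschema re-pulls the schema from a live peer over internode messaging, so gossip should stay enabled while it runs; disabling thrift just keeps clients away during the rebuild:

$ nodetool disablethrift        # stop serving clients while the schema is rebuilt
$ nodetool resetlocalschema     # drop the local schema and request it again from a live node
$ nodetool describecluster      # confirm a single schema version before letting clients back in
$ nodetool enablethrift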
Re: What are problems with schema disagreement
On Mon, Jul 6, 2015 at 1:30 PM, John Wong gokoproj...@gmail.com wrote: But is there a problem with letting schema disagreement run for a long time? It depends on what the nature of the desynch is, but generally speaking there may be. If you added a column or a columnfamily, and one node didn't get that update, it will throw exceptions when your clients attempt to read that column/columnfamily. And so on... =Rob
Re: What are problems with schema disagreement
Thanks. Yeah, we typically restart the nodes on the minority schema version to force a resync. But is there a problem with letting schema disagreement run for a long time? Thanks. John On Mon, Jul 6, 2015 at 2:29 PM, Robert Coli rc...@eventbrite.com wrote: On Thu, Jul 2, 2015 at 9:31 PM, John Wong gokoproj...@gmail.com wrote: Hi Graham. Thanks. We are still running on 1.2.16, but we do plan to upgrade in the near future. The load on the cluster at the time was very very low. All nodes were responsive, except nothing showed up in the logs after a certain time, which led me to believe something happened internally, although that was a poor wild guess. But is it safe to be okay with schema disagreement? I worry about data consistency if I let it sit too long. In general one shouldn't run with schema disagreement persistently. I've seen schema desynch issues on 1.2.x; in general restarting some unclear subset of the affected daemons made them synch. =Rob
Re: What are problems with schema disagreement
On Thu, Jul 2, 2015 at 9:31 PM, John Wong gokoproj...@gmail.com wrote: Hi Graham. Thanks. We are still running on 1.2.16, but we do plan to upgrade in the near future. The load on the cluster at the time was very very low. All nodes were responsive, except nothing showed up in the logs after a certain time, which led me to believe something happened internally, although that was a poor wild guess. But is it safe to be okay with schema disagreement? I worry about data consistency if I let it sit too long. In general one shouldn't run with schema disagreement persistently. I've seen schema desynch issues on 1.2.x; in general restarting some unclear subset of the affected daemons made them synch. =Rob
What are problems with schema disagreement
Hi. Here is a schema disagreement we encountered.
Schema versions:
b6467059-5897-3cc1-9ee2-73f31841b0b0: [10.0.1.100, 10.0.1.109]
c8971b2d-0949-3584-aa87-0050a4149bbd: [10.0.1.55, 10.0.1.16, 10.0.1.77]
c733920b-2a31-30f0-bca1-45a8c9130a2c: [10.0.1.221]
We deployed an application which would send a schema update (DDL=auto). We found this prod cluster had 3 schema versions. Other existing applications were fine, so some people were curious what would happen if we left this problem alone until off hours. Are there any concerns with not resolving schema disagreement right away? FWIW we went ahead and restarted 221 first, and continued with the rest of the minority nodes. Thanks. John
Re: What are problems with schema disagreement
What version of C* are you running? Some versions of 2.0.x might occasionally fail to propagate schema changes in a timely fashion (though they would fix themselves eventually - in the order of a few minutes). On Jul 2, 2015, at 9:37 PM, John Wong gokoproj...@gmail.com wrote: Hi. Here is a schema disagreement we encountered. Schema versions: b6467059-5897-3cc1-9ee2-73f31841b0b0: [10.0.1.100, 10.0.1.109] c8971b2d-0949-3584-aa87-0050a4149bbd: [10.0.1.55, 10.0.1.16, 10.0.1.77] c733920b-2a31-30f0-bca1-45a8c9130a2c: [10.0.1.221] We deployed an application which would send a schema update (DDL=auto). We found this prod cluster had 3 schema versions. Other existing applications were fine, so some people were curious what would happen if we left this problem alone until off hours. Are there any concerns with not resolving schema disagreement right away? FWIW we went ahead and restarted 221 first, and continued with the rest of the minority nodes. Thanks. John
Re: What are problems with schema disagreement
On Thu, Jul 2, 2015 at 11:01 PM, graham sanderson gra...@vast.com wrote: What version of C* are you running? Some versions of 2.0.x might occasionally fail to propagate schema changes in a timely fashion (though they would fix themselves eventually - in the order of a few minutes) Hi Graham. Thanks. We are still running on 1.2.16, but we do plan to upgrade in the near future. The load on the cluster at the time was very very low. All nodes were responsive, except nothing showed up in the logs after a certain time, which led me to believe something happened internally, although that was a poor wild guess. But is it safe to be okay with schema disagreement? I worry about data consistency if I let it sit too long. Thanks. John On Jul 2, 2015, at 9:37 PM, John Wong gokoproj...@gmail.com wrote: Hi. Here is a schema disagreement we encountered. Schema versions: b6467059-5897-3cc1-9ee2-73f31841b0b0: [10.0.1.100, 10.0.1.109] c8971b2d-0949-3584-aa87-0050a4149bbd: [10.0.1.55, 10.0.1.16, 10.0.1.77] c733920b-2a31-30f0-bca1-45a8c9130a2c: [10.0.1.221] We deployed an application which would send a schema update (DDL=auto). We found this prod cluster had 3 schema versions. Other existing applications were fine, so some people were curious what would happen if we left this problem alone until off hours. Are there any concerns with not resolving schema disagreement right away? FWIW we went ahead and restarted 221 first, and continued with the rest of the minority nodes. Thanks. John
Cassandra schema disagreement
Hello, I have a cluster running and I'm trying to change the schema on it. Although it succeeds on one cluster (a test one), on another it keeps creating two separate schema versions (both are 2-DC configurations; the cluster where it goes wrong ends up with one schema version on each DC). I use apache-cassandra11-1.1.12 on CentOS 6.4. I'm trying to start from a fresh cassandra config (doing rm -rf /var/lib/cassandra/{commitlog,data}/* while cassandra is stopped). Each DC is on a separate IP segment but there is no firewall between them. Here is the output of the command when the desynchronisation occurs:
---
[root@cassandranode00 CDN]# cassandra-cli -f reCreateCassandraStruct.sh
Connected to: TTF Cluster v2013_1257 on 127.0.0.1/9160
7ef8c681-189a-3088-8598-560437f705d9
Waiting for schema agreement...
... schemas agree across the cluster
Authenticated to keyspace: ks1
f179fd8e-f8ca-36cf-bf53-d8341fd6006e
Waiting for schema agreement...
The schema has not settled in 10 seconds; further migrations are ill-advised until it does. Versions are f179fd8e-f8ca-36cf-bf53-d8341fd6006e:[10.69.221.20, 10.69.221.21, 10.69.221.22], e9656b30-b671-3fce-9fb4-bdd3e6da36d1:[10.69.10.14, 10.69.10.13, 10.69.10.11]
---
I also tried creating a keyspace with a column family using the opscenter (with no good result). I'm out of hints as to where to look. Do you have any suggestions? Are there improvements on this side with cassandra 1.1.12? Thanks, Jonathan DEMEYER
Here is the start of reCreateCassandraStruct.sh:
CREATE KEYSPACE ks1 WITH placement_strategy = 'NetworkTopologyStrategy' AND strategy_options={DC1:3,DC2:3};
use ks1;
create column family id with comparator = 'UTF8Type' and key_validation_class = 'UTF8Type' and column_metadata = [ { column_name : 'user', validation_class : UTF8Type } ];
CREATE KEYSPACE ks2 WITH placement_strategy = 'NetworkTopologyStrategy' AND strategy_options={DC1:3,DC2:3};
use ks2;
create column family id;
RE: Cassandra schema disagreement
After a lot of investigation, it seems that the clocks were desynchronized across the cluster (although we did not check that resyncing them alone resolved the problem; we modified the schema with one node up and restarted all other nodes afterwards). From: Demeyer Jonathan [mailto:jonathan.deme...@macq.eu] Sent: mardi 12 août 2014 11:03 To: user@cassandra.apache.org Subject: Cassandra schema disagreement Hello, I have a cluster running and I'm trying to change the schema on it. Although it succeeds on one cluster (a test one), on another it keeps creating two separate schema versions (both are 2-DC configurations; the cluster where it goes wrong ends up with one schema version on each DC). I use apache-cassandra11-1.1.12 on CentOS 6.4. I'm trying to start from a fresh cassandra config (doing rm -rf /var/lib/cassandra/{commitlog,data}/* while cassandra is stopped). Each DC is on a separate IP segment but there is no firewall between them. Here is the output of the command when the desynchronisation occurs: --- [root@cassandranode00 CDN]# cassandra-cli -f reCreateCassandraStruct.sh Connected to: TTF Cluster v2013_1257 on 127.0.0.1/9160 7ef8c681-189a-3088-8598-560437f705d9 Waiting for schema agreement... ... schemas agree across the cluster Authenticated to keyspace: ks1 f179fd8e-f8ca-36cf-bf53-d8341fd6006e Waiting for schema agreement... The schema has not settled in 10 seconds; further migrations are ill-advised until it does. Versions are f179fd8e-f8ca-36cf-bf53-d8341fd6006e:[10.69.221.20, 10.69.221.21, 10.69.221.22], e9656b30-b671-3fce-9fb4-bdd3e6da36d1:[10.69.10.14, 10.69.10.13, 10.69.10.11] --- I also tried creating a keyspace with a column family using the opscenter (with no good result). I'm out of hints as to where to look. Do you have any suggestions? Are there improvements on this side with cassandra 1.1.12? Thanks, Jonathan DEMEYER Here is the start of reCreateCassandraStruct.sh: CREATE KEYSPACE ks1 WITH placement_strategy = 'NetworkTopologyStrategy' AND strategy_options={DC1:3,DC2:3}; use ks1; create column family id with comparator = 'UTF8Type' and key_validation_class = 'UTF8Type' and column_metadata = [ { column_name : 'user', validation_class : UTF8Type } ]; CREATE KEYSPACE ks2 WITH placement_strategy = 'NetworkTopologyStrategy' AND strategy_options={DC1:3,DC2:3}; use ks2; create column family id;
Re: Schema disagreement errors
Hi Gaurav, a schema versioning bug was fixed in 2.0.7. Best wishes, Duncan. On 12/05/14 21:31, Gaurav Sehgal wrote: We have recently started seeing a lot of Schema Disagreement errors. We are using Cassandra 2.0.6 with Oracle Java 1.7. I went through the Cassandra FAQ and followed the steps below: * nodetool disablethrift * nodetool disablegossip * nodetool drain * 'kill pid'. As per the documentation, the commit logs should have been flushed, but that did not happen in our case. The commit logs were still there. So I removed them manually to make sure there were no commit logs when cassandra started up (which was fine in our case as this data can always be replayed). I also deleted the schema* directory from the /data/system folder. Though when we started cassandra back up, the issue started happening again. Any help would be appreciated. Cheers! Gaurav
Re: Schema disagreement errors
Hey Gaurav, You should consider moving to 2.0.7, which fixes a bunch of these schema disagreement problems. You could also play around with nodetool resetlocalschema on the nodes that are behind, but be careful with that one. I'd go with 2.0.7 first for sure. Thanks, Vince. On Mon, May 12, 2014 at 12:31 PM, Gaurav Sehgal gsehg...@gmail.com wrote: We have recently started seeing a lot of Schema Disagreement errors. We are using Cassandra 2.0.6 with Oracle Java 1.7. I went through the Cassandra FAQ and followed the steps below: - nodetool disablethrift - nodetool disablegossip - nodetool drain - 'kill pid'. As per the documentation, the commit logs should have been flushed, but that did not happen in our case. The commit logs were still there. So I removed them manually to make sure there were no commit logs when cassandra started up (which was fine in our case as this data can always be replayed). I also deleted the schema* directory from the /data/system folder. Though when we started cassandra back up, the issue started happening again. Any help would be appreciated. Cheers! Gaurav
Re: Schema disagreement errors
On Tue, May 13, 2014 at 5:11 PM, Donald Smith donald.sm...@audiencescience.com wrote: I too have noticed that after doing “nodetool flush” (or “nodetool drain”), the commit logs are still there. I think they’re NEW (empty) commit logs, but I may be wrong. Anyone know? Assuming they are being correctly marked clean after drain (which historically has been a nontrivial assumption) they are new, empty commit log segments which have been recycled. =Rob
Schema disagreement errors
We have recently started seeing a lot of Schema Disagreement errors. We are using Cassandra 2.0.6 with Oracle Java 1.7. I went through the Cassandra FAQ and followed the steps below:
- nodetool disablethrift
- nodetool disablegossip
- nodetool drain
- 'kill pid'
As per the documentation, the commit logs should have been flushed, but that did not happen in our case. The commit logs were still there. So I removed them manually to make sure there were no commit logs when cassandra started up (which was fine in our case as this data can always be replayed). I also deleted the schema* directory from the /data/system folder. Though when we started cassandra back up, the issue started happening again. Any help would be appreciated. Cheers! Gaurav
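As shell commands, the shutdown sequence above reads roughly as follows; a sketch assuming a default install, with $PID standing in for the Cassandra process id. Per Rob's note earlier in the thread, finding commit log segments on disk afterwards is expected: drain recycles them as new, empty segments rather than deleting the files:

$ nodetool disablethrift
$ nodetool disablegossip
$ nodetool drain                      # flush memtables and mark commit log segments clean
$ kill $PID
$ ls /var/lib/cassandra/commitlog/    # recycled (empty) segments may legitimately remain here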
Re: Schema disagreement errors
Upgrade to 2.0.7 fixed this for me. You can also try 'nodetool resetlocalschema' on disagreeing nodes; this worked temporarily for me in 2.0.6. ml On Mon, May 12, 2014 at 3:31 PM, Gaurav Sehgal gsehg...@gmail.com wrote: We have recently started seeing a lot of Schema Disagreement errors. We are using Cassandra 2.0.6 with Oracle Java 1.7. I went through the Cassandra FAQ and followed the steps below: - nodetool disablethrift - nodetool disablegossip - nodetool drain - 'kill pid'. As per the documentation, the commit logs should have been flushed, but that did not happen in our case. The commit logs were still there. So I removed them manually to make sure there were no commit logs when cassandra started up (which was fine in our case as this data can always be replayed). I also deleted the schema* directory from the /data/system folder. Though when we started cassandra back up, the issue started happening again. Any help would be appreciated. Cheers! Gaurav
Re: Schema disagreement under normal conditions, ALTER TABLE hangs
Thanks Rob. Let me add one thing in case someone else finds this thread - restarting the nodes did not in and of itself get the schema disagreement resolved. We had to run the ALTER TABLE command individually on each of the disagreeing nodes once they came back up. On Tuesday, November 26, 2013 at 11:24 AM, Robert Coli wrote: On Mon, Nov 25, 2013 at 6:42 PM, Josh Dzielak j...@keen.io wrote: Recently we had a strange thing happen. Altering schema (gc_grace_seconds) for a column family resulted in a schema disagreement. 3/4 of nodes got it, 1/4 didn't. There was no partition at the time, nor were multiple schema updates issued. Going to the nodes with stale schema and trying to do the ALTER TABLE there resulted in hanging. We were eventually able to get schema agreement by restarting nodes, but both the initial disagreement under normal conditions and the hanging ALTER TABLE seem pretty weird. Any ideas here? Sound like a bug? Yes, that sounds like a bug. This behavior is less common in 1.2.x than it was previously, but still happens sometimes. It's interesting that restarting the affected node helped; in previous versions of the hung schema issue, it would survive restart. We're on 1.2.8. Unfortunately, unless you have a repro path, it is probably not worth reporting a JIRA. =Rob
Re: Schema disagreement under normal conditions, ALTER TABLE hangs
On Mon, Nov 25, 2013 at 6:42 PM, Josh Dzielak j...@keen.io wrote: Recently we had a strange thing happen. Altering schema (gc_grace_seconds) for a column family resulted in a schema disagreement. 3/4 of nodes got it, 1/4 didn't. There was no partition at the time, nor were multiple schema updates issued. Going to the nodes with stale schema and trying to do the ALTER TABLE there resulted in hanging. We were eventually able to get schema agreement by restarting nodes, but both the initial disagreement under normal conditions and the hanging ALTER TABLE seem pretty weird. Any ideas here? Sound like a bug? Yes, that sounds like a bug. This behavior is less common in 1.2.x than it was previously, but still happens sometimes. It's interesting that restarting the affected node helped; in previous versions of the hung schema issue, it would survive restart. We're on 1.2.8. Unfortunately, unless you have a repro path, it is probably not worth reporting a JIRA. =Rob
Schema disagreement under normal conditions, ALTER TABLE hangs
Recently we had a strange thing happen. Altering schema (gc_grace_seconds) for a column family resulted in a schema disagreement. 3/4 of nodes got it, 1/4 didn't. There was no partition at the time, nor were multiple schema updates issued. Going to the nodes with stale schema and trying to do the ALTER TABLE there resulted in hanging. We were eventually able to get schema agreement by restarting nodes, but both the initial disagreement under normal conditions and the hanging ALTER TABLE seem pretty weird. Any ideas here? Sound like a bug? We're on 1.2.8. Thanks, Josh -- Josh Dzielak • Keen IO • @dzello (https://twitter.com/dzello)
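For concreteness, the statement in question has this shape; a sketch in 1.2-era CQL3 with made-up names (my_ks.my_cf), piped through cqlsh. Per Josh's follow-up earlier in this thread, the eventual workaround was to run the same statement directly against each node holding the stale schema:

$ echo "ALTER TABLE my_ks.my_cf WITH gc_grace_seconds = 864000;" | cqlsh <node>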
Re: Cannot resolve schema disagreement
On Wed, May 8, 2013 at 5:40 PM, srmore comom...@gmail.com wrote: After running the commands, I get back to the same issue. Cannot afford to lose the data so I guess this is the only option for me. And unfortunately I am using 1.0.12 (cannot upgrade as of now). Any ideas on what might be happening or any pointers will be greatly appreciated. If you can afford downtime on the cluster, the solution to this problem with the highest chance of success is:
1) dump the existing schema from a good node
2) nodetool drain on all nodes
3) stop cluster
4) move schema and migration CF tables out of the way on all nodes
5) start cluster
6) re-load schema, being careful to explicitly check for schema agreement on all nodes between schema-modifying statements
In many/most cases of schema disagreement, people try the FAQ approach and it doesn't work and they end up being forced to do the above anyway. In general, if you can tolerate the downtime, you should save yourself the effort and just do the above process. =Rob
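As commands on a 1.0-era node, those six steps might look like the following; an untested outline assuming default data paths, the cassandra-cli schema dump of that era, and a service-managed install:

$ echo "show schema;" | cassandra-cli -h good-node > schema.cli    # 1) dump schema from a good node
$ nodetool -h <each-node> drain                                    # 2) drain every node
$ sudo service cassandra stop                                      # 3) on every node
$ mkdir /tmp/schema-backup && mv /var/lib/cassandra/data/system/Schema* /var/lib/cassandra/data/system/Migrations* /tmp/schema-backup/   # 4)
$ sudo service cassandra start                                     # 5) on every node
$ cassandra-cli -h <any-node> -f schema.cli                        # 6) re-load, checking 'describe cluster;' between statements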
Re: Cannot resolve schema disagreement
Thanks Rob! Tried the steps; that did not work. However, I was able to resolve the problem by syncing the clocks. The thing that confuses me is that the FAQ says "Before 0.7.6, this can also be caused by cluster system clocks being substantially out of sync with each other." The version I am using was 1.0.12. This raises an important question: where does Cassandra get the time information from? And is it required (I know it is highly, highly advisable) to keep clocks in sync? Any suggestions/best practices on how to keep the clocks in sync? /srm On Thu, May 9, 2013 at 1:58 PM, Robert Coli rc...@eventbrite.com wrote: On Wed, May 8, 2013 at 5:40 PM, srmore comom...@gmail.com wrote: After running the commands, I get back to the same issue. Cannot afford to lose the data so I guess this is the only option for me. And unfortunately I am using 1.0.12 (cannot upgrade as of now). Any ideas on what might be happening or any pointers will be greatly appreciated. If you can afford downtime on the cluster, the solution to this problem with the highest chance of success is: 1) dump the existing schema from a good node 2) nodetool drain on all nodes 3) stop cluster 4) move schema and migration CF tables out of the way on all nodes 5) start cluster 6) re-load schema, being careful to explicitly check for schema agreement on all nodes between schema-modifying statements In many/most cases of schema disagreement, people try the FAQ approach and it doesn't work and they end up being forced to do the above anyway. In general, if you can tolerate the downtime, you should save yourself the effort and just do the above process. =Rob
Re: Cannot resolve schema disagreement
> This raises an important question: where does Cassandra get the time information from?
http://docs.oracle.com/javase/6/docs/api/java/lang/System.html normally milliseconds; not sure if 1.0.12 may use nanoTime(), which is less reliable on some VMs.
> and is it required (I know it is highly highly advisable) to keep clocks in sync, any suggestions/best practices on how to keep the clocks in sync?
http://en.wikipedia.org/wiki/Network_Time_Protocol Hope that helps. - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 10/05/2013, at 9:16 AM, srmore comom...@gmail.com wrote: Thanks Rob! Tried the steps; that did not work. However, I was able to resolve the problem by syncing the clocks. The thing that confuses me is that the FAQ says "Before 0.7.6, this can also be caused by cluster system clocks being substantially out of sync with each other." The version I am using was 1.0.12. This raises an important question: where does Cassandra get the time information from? And is it required (I know it is highly, highly advisable) to keep clocks in sync? Any suggestions/best practices on how to keep the clocks in sync? /srm On Thu, May 9, 2013 at 1:58 PM, Robert Coli rc...@eventbrite.com wrote: On Wed, May 8, 2013 at 5:40 PM, srmore comom...@gmail.com wrote: After running the commands, I get back to the same issue. Cannot afford to lose the data so I guess this is the only option for me. And unfortunately I am using 1.0.12 (cannot upgrade as of now). Any ideas on what might be happening or any pointers will be greatly appreciated. If you can afford downtime on the cluster, the solution to this problem with the highest chance of success is: 1) dump the existing schema from a good node 2) nodetool drain on all nodes 3) stop cluster 4) move schema and migration CF tables out of the way on all nodes 5) start cluster 6) re-load schema, being careful to explicitly check for schema agreement on all nodes between schema-modifying statements In many/most cases of schema disagreement, people try the FAQ approach and it doesn't work and they end up being forced to do the above anyway. In general, if you can tolerate the downtime, you should save yourself the effort and just do the above process. =Rob
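In practice, "keep the clocks in sync" means running NTP on every node; a sketch for a Linux host using classic ntp tooling (the commands are standard, though the service name varies by distro):

$ ntpq -p                        # list peers and the current offset/jitter per peer
$ sudo ntpdate -u pool.ntp.org   # one-shot correction; -u avoids clashing with a daemon already bound to port 123
$ sudo service ntpd start        # then keep the clock converged continuously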
Re: Cannot resolve schema disagreement
Thought so. Thanks Aaron! On Thu, May 9, 2013 at 6:09 PM, aaron morton aa...@thelastpickle.com wrote: This raises an important question: where does Cassandra get the time information from? http://docs.oracle.com/javase/6/docs/api/java/lang/System.html normally milliseconds; not sure if 1.0.12 may use nanoTime(), which is less reliable on some VMs. and is it required (I know it is highly highly advisable) to keep clocks in sync, any suggestions/best practices on how to keep the clocks in sync? http://en.wikipedia.org/wiki/Network_Time_Protocol Hope that helps. - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 10/05/2013, at 9:16 AM, srmore comom...@gmail.com wrote: Thanks Rob! Tried the steps; that did not work. However, I was able to resolve the problem by syncing the clocks. The thing that confuses me is that the FAQ says "Before 0.7.6, this can also be caused by cluster system clocks being substantially out of sync with each other." The version I am using was 1.0.12. This raises an important question: where does Cassandra get the time information from? And is it required (I know it is highly, highly advisable) to keep clocks in sync? Any suggestions/best practices on how to keep the clocks in sync? /srm On Thu, May 9, 2013 at 1:58 PM, Robert Coli rc...@eventbrite.com wrote: On Wed, May 8, 2013 at 5:40 PM, srmore comom...@gmail.com wrote: After running the commands, I get back to the same issue. Cannot afford to lose the data so I guess this is the only option for me. And unfortunately I am using 1.0.12 (cannot upgrade as of now). Any ideas on what might be happening or any pointers will be greatly appreciated. If you can afford downtime on the cluster, the solution to this problem with the highest chance of success is: 1) dump the existing schema from a good node 2) nodetool drain on all nodes 3) stop cluster 4) move schema and migration CF tables out of the way on all nodes 5) start cluster 6) re-load schema, being careful to explicitly check for schema agreement on all nodes between schema-modifying statements In many/most cases of schema disagreement, people try the FAQ approach and it doesn't work and they end up being forced to do the above anyway. In general, if you can tolerate the downtime, you should save yourself the effort and just do the above process. =Rob
Cannot resolve schema disagreement
Hello, I have a cluster of 4 nodes and two of them are on a different schema. I tried to run the commands described in the FAQ section but no luck (http://wiki.apache.org/cassandra/FAQ#schema_disagreement). After running the commands, I get back to the same issue. Cannot afford to lose the data, so I guess this is the only option for me. And unfortunately I am using 1.0.12 (cannot upgrade as of now). Any ideas on what might be happening or any pointers will be greatly appreciated. /srm
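For context, the FAQ procedure referenced here comes down to rebuilding the disagreeing node's schema system tables from the rest of the ring; a sketch for a 1.0.x node assuming the default /var/lib/cassandra/data layout:

$ nodetool -h bad-node drain
$ sudo service cassandra stop
$ rm /var/lib/cassandra/data/system/Schema* /var/lib/cassandra/data/system/Migrations*
$ sudo service cassandra start    # on startup the node replays schema definitions pulled from a live peer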
Schema Disagreement after migration from 1.0.6 to 1.1.4
Hi list, We have a 5-node Cassandra cluster with a single 1.0.9 installation and four 1.0.6 installations. We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the instructions on http://www.datastax.com/docs/1.1/install/upgrading). After bringing up 1.1.4 there are no errors in the log, but the cluster now suffers from schema disagreement:
[default@unknown] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] - the new 1.1.4 node
943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45, 10.10.145.90, 10.38.127.80] - nodes in the old cluster
The recipe for recovering from schema disagreement (http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't cover the new directory layout. The system/Schema directory is empty save for a snapshots subdirectory; system/schema_columnfamilies and system/schema_keyspaces contain some files. As described in datastax's documentation, we tried running nodetool upgradesstables. When this was done, describe schema in the cli showed a schema definition which seemed correct, but was indeed different from the schema on the other nodes in the cluster. Any clues on how we should proceed? Thanks, /Martin Koch
Re: Schema Disagreement after migration from 1.0.6 to 1.1.4
I would try nodetool resetlocalschema. On 12-09-05 07:08 AM, Martin Koch wrote: Hi list, We have a 5-node Cassandra cluster with a single 1.0.9 installation and four 1.0.6 installations. We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the instructions on http://www.datastax.com/docs/1.1/install/upgrading). After bringing up 1.1.4 there are no errors in the log, but the cluster now suffers from schema disagreement: [default@unknown] describe cluster; Cluster Information: Snitch: org.apache.cassandra.locator.SimpleSnitch Partitioner: org.apache.cassandra.dht.RandomPartitioner Schema versions: 59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] - the new 1.1.4 node 943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45, 10.10.145.90, 10.38.127.80] - nodes in the old cluster The recipe for recovering from schema disagreement (http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't cover the new directory layout. The system/Schema directory is empty save for a snapshots subdirectory; system/schema_columnfamilies and system/schema_keyspaces contain some files. As described in datastax's documentation, we tried running nodetool upgradesstables. When this was done, describe schema in the cli showed a schema definition which seemed correct, but was indeed different from the schema on the other nodes in the cluster. Any clues on how we should proceed? Thanks, /Martin Koch -- Edward Sargisson, senior java developer, Global Relay, edward.sargis...@globalrelay.net
Re: Schema Disagreement after migration from 1.0.6 to 1.1.4
Do you see exceptions like java.lang.UnsupportedOperationException: Not a time-based UUID in the log files of nodes running 1.0.6 and 1.0.9? Then it's probably due to [1], explained here [2] -- in this case you either have to upgrade all nodes to 1.1.4, or, if you prefer keeping a mixed-version cluster, the 1.0.6 and 1.0.9 nodes won't be able to join the cluster again unless you temporarily upgrade them to 1.0.11. Cheers, Omid [1] https://issues.apache.org/jira/browse/CASSANDRA-1391 [2] https://issues.apache.org/jira/browse/CASSANDRA-4195 On Wed, Sep 5, 2012 at 4:08 PM, Martin Koch m...@issuu.com wrote: Hi list, We have a 5-node Cassandra cluster with a single 1.0.9 installation and four 1.0.6 installations. We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the instructions on http://www.datastax.com/docs/1.1/install/upgrading). After bringing up 1.1.4 there are no errors in the log, but the cluster now suffers from schema disagreement: [default@unknown] describe cluster; Cluster Information: Snitch: org.apache.cassandra.locator.SimpleSnitch Partitioner: org.apache.cassandra.dht.RandomPartitioner Schema versions: 59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] - the new 1.1.4 node 943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45, 10.10.145.90, 10.38.127.80] - nodes in the old cluster The recipe for recovering from schema disagreement (http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't cover the new directory layout. The system/Schema directory is empty save for a snapshots subdirectory; system/schema_columnfamilies and system/schema_keyspaces contain some files. As described in datastax's documentation, we tried running nodetool upgradesstables. When this was done, describe schema in the cli showed a schema definition which seemed correct, but was indeed different from the schema on the other nodes in the cluster. Any clues on how we should proceed? Thanks, /Martin Koch
Re: Schema Disagreement after migration from 1.0.6 to 1.1.4
Thanks, this is exactly it. We'd like to do a rolling upgrade - this is a production cluster - so I guess we'll upgrade 1.0.6 -> 1.0.11 -> 1.1.4, then. /Martin On Thu, Sep 6, 2012 at 2:35 AM, Omid Aladini omidalad...@gmail.com wrote: Do you see exceptions like java.lang.UnsupportedOperationException: Not a time-based UUID in the log files of nodes running 1.0.6 and 1.0.9? Then it's probably due to [1], explained here [2] -- in this case you either have to upgrade all nodes to 1.1.4, or, if you prefer keeping a mixed-version cluster, the 1.0.6 and 1.0.9 nodes won't be able to join the cluster again unless you temporarily upgrade them to 1.0.11. Cheers, Omid [1] https://issues.apache.org/jira/browse/CASSANDRA-1391 [2] https://issues.apache.org/jira/browse/CASSANDRA-4195 On Wed, Sep 5, 2012 at 4:08 PM, Martin Koch m...@issuu.com wrote: Hi list, We have a 5-node Cassandra cluster with a single 1.0.9 installation and four 1.0.6 installations. We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the instructions on http://www.datastax.com/docs/1.1/install/upgrading). After bringing up 1.1.4 there are no errors in the log, but the cluster now suffers from schema disagreement: [default@unknown] describe cluster; Cluster Information: Snitch: org.apache.cassandra.locator.SimpleSnitch Partitioner: org.apache.cassandra.dht.RandomPartitioner Schema versions: 59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] - the new 1.1.4 node 943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45, 10.10.145.90, 10.38.127.80] - nodes in the old cluster The recipe for recovering from schema disagreement (http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't cover the new directory layout. The system/Schema directory is empty save for a snapshots subdirectory; system/schema_columnfamilies and system/schema_keyspaces contain some files. As described in datastax's documentation, we tried running nodetool upgradesstables. When this was done, describe schema in the cli showed a schema definition which seemed correct, but was indeed different from the schema on the other nodes in the cluster. Any clues on how we should proceed? Thanks, /Martin Koch
How schema disagreement can be fixed faster on 1.0.10 cluster?
Hi! We got into a schema disagreement situation on 1.0.10, with 250GB of compressed data per node. Following http://wiki.apache.org/cassandra/FAQ#schema_disagreement, after the node restart it looks like it is replaying all schema changes one by one, right? As we did a lot of them during the cluster's lifetime, the node is now busy creating secondary indexes that were dropped long ago, which looks like it is going to take hours. Can it be done faster?
1. Can we move all data SSTables out of the data/*/ directories,
2. follow FAQ#schema_disagreement (it should be faster on a no-data node) until we reach schema agreement,
3. then stop cassandra,
4. copy the files back,
5. start cassandra.
Will it work? An extra option is to disable thrift during the above process (can it be done in config? In cassandra.yaml, rpc_port: 0?) Thanks in advance for any hints, regards, -- Mateusz Korniak
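On the side question of disabling Thrift: setting rpc_port to 0 is not the usual switch; the 1.0-era cassandra.yaml exposes a boolean for it (quoted from memory of that config generation, so verify against your own yaml), and nodetool can toggle it at runtime:

# cassandra.yaml
start_rpc: false            # boot without the Thrift server; clients stay locked out until it is re-enabled

$ nodetool disablethrift    # runtime equivalents, no config edit or restart needed
$ nodetool enablethrift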
Re: How schema disagreement can be fixed faster on 1.0.10 cluster?
I know you specified 1.0.10, but C* 1.1 solves this problem: http://www.datastax.com/dev/blog/the-schema-management-renaissance On Thu, Jul 26, 2012 at 7:29 AM, Mateusz Korniak mateusz-li...@ant.gliwice.pl wrote: Hi! We got into a schema disagreement situation on 1.0.10, with 250GB of compressed data per node. Following http://wiki.apache.org/cassandra/FAQ#schema_disagreement, after the node restart it looks like it is replaying all schema changes one by one, right? As we did a lot of them during the cluster's lifetime, the node is now busy creating secondary indexes that were dropped long ago, which looks like it is going to take hours. Can it be done faster? 1. Can we move all data SSTables out of the data/*/ directories, 2. follow FAQ#schema_disagreement (it should be faster on a no-data node) until we reach schema agreement, 3. then stop cassandra, 4. copy the files back, 5. start cassandra. Will it work? An extra option is to disable thrift during the above process (can it be done in config? In cassandra.yaml, rpc_port: 0?) Thanks in advance for any hints, regards, -- Mateusz Korniak -- Tyler Hobbs DataStax http://datastax.com/
Re: Couldn't detect any schema definitions in local storage - after handling schema disagreement according to FAQ
> 1) What did I do wrong? - why was cassandra throwing exceptions on first startup?
In 1.0.X the history of schema changes is replayed to the node when it rejoins the cluster. If the node is receiving traffic while this is going on, it will log those errors until the schema mutation that created 1012 is replayed.
> 2) Why was the keyspace data invalidated? Is it expected?
The data will have remained on the disk. The load is calculated based on the CFs in the schema; this can mean that the load will not return to full until the schema is fully replayed. Did you lose data?
> 3) If the answer to #2 is yes, it's expected, then what's the point in doing http://wiki.apache.org/cassandra/FAQ#schema_disagreement if all keyspace data is lost anyway? It makes more sense to just do http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
Answer as no. Checking: did you delete just the Schema-* and Migration-* files, or all of the files in data/system? Also, in the first log there are a lot of commit log mutations being skipped because the schema is not there. Drain should have removed these, but it can take a little time (I think).
> 4) afaiu I could also stop cassandra again, move the old sstables from the snapshot back to the keyspace data dir and run repair for all keyspace CFs? So that it finishes faster and makes less load than running a repair which has no previous keyspace data at all?
The approach you followed was the correct one. I've updated the wiki to say the errors are expected. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com
On 19/05/2012, at 6:34 AM, Piavlo wrote: Hi, I had a schema disagreement problem in a cassandra 1.0.9 cluster, where one node had a different schema version. So I followed the FAQ at http://wiki.apache.org/cassandra/FAQ#schema_disagreement: disabled gossip, disabled thrift, drained and finally stopped the cassandra process. On startup I noticed INFO [main] 2012-05-18 16:23:11,879 DatabaseDescriptor.java (line 467) Couldn't detect any schema definitions in local storage. in the log, and after INFO [main] 2012-05-18 16:23:15,463 StorageService.java (line 619) Bootstrap/Replace/Move completed! Now serving reads. it started throwing fatal exceptions for all read/write operations endlessly. I had to stop the cassandra process again (no draining was done). On second start it did come up ok, immediately loading the correct cluster schema version INFO [main] 2012-05-18 16:54:44,303 DatabaseDescriptor.java (line 499) Loading schema version 9db34ef0-a0be-11e1--f9687e034cf7 But now this node appears to have started with no data from the keyspace which had the schema disagreement. The original keyspace sstables now appear under the snapshots dir.
# nodetool -h localhost ring
Address        DC       Rack  Status  State   Load       Owns    Token
                                                                 141784319550391026443072753096570088106
10.49.127.4    eu-west  1a    Up      Normal  8.19 GB    16.67%  0
10.241.29.65   eu-west  1b    Up      Normal  8.18 GB    16.67%  28356863910078205288614550619314017621
10.59.46.236   eu-west  1c    Up      Normal  8.22 GB    16.67%  56713727820156410577229101238628035242
10.50.33.232   eu-west  1a    Up      Normal  8.2 GB     16.67%  85070591730234615865843651857942052864
10.234.71.33   eu-west  1b    Up      Normal  8.15 GB    16.67%  113427455640312821154458202477256070485
10.58.249.118  eu-west  1c    Up      Normal  660.98 MB  16.67%  141784319550391026443072753096570088106
The node is the one with 660.98 MB of data (the opscenter keyspace data, which was not invalidated). So I have some questions: 1) What did I do wrong? - why was cassandra throwing exceptions on first startup? 2) Why was the keyspace data invalidated? Is it expected? 3) If the answer to #2 is yes, it's expected, then what's the point in doing http://wiki.apache.org/cassandra/FAQ#schema_disagreement if all keyspace data is lost anyway? It makes more sense to just do http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node 4) afaiu I could also stop cassandra again, move the old sstables from the snapshot back to the keyspace data dir and run repair for all keyspace CFs? So that it finishes faster and makes less load than running a repair which has no previous keyspace data at all? The first startup log is below: INFO [main] 2012-05-18 16:23:07,367 AbstractCassandraDaemon.java (line 105) Logging initialized INFO [main] 2012-05-18 16:23:07,382 AbstractCassandraDaemon.java (line 126) JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24 INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 127) Heap size: 2600468480/2600468480 INFO [main] 2012-05-18 16:23:07,383
Couldn't detect any schema definitions in local storage - after handling schema disagreement according to FAQ
Hi, I had a schema disagreement problem in a cassandra 1.0.9 cluster, where one node had a different schema version. So I followed the FAQ at http://wiki.apache.org/cassandra/FAQ#schema_disagreement: disabled gossip, disabled thrift, drained and finally stopped the cassandra process. On startup I noticed
INFO [main] 2012-05-18 16:23:11,879 DatabaseDescriptor.java (line 467) Couldn't detect any schema definitions in local storage.
in the log, and after
INFO [main] 2012-05-18 16:23:15,463 StorageService.java (line 619) Bootstrap/Replace/Move completed! Now serving reads.
it started throwing fatal exceptions for all read/write operations endlessly. I had to stop the cassandra process again (no draining was done). On second start it did come up ok, immediately loading the correct cluster schema version:
INFO [main] 2012-05-18 16:54:44,303 DatabaseDescriptor.java (line 499) Loading schema version 9db34ef0-a0be-11e1--f9687e034cf7
But now this node appears to have started with no data from the keyspace which had the schema disagreement. The original keyspace sstables now appear under the snapshots dir.
# nodetool -h localhost ring
Address        DC       Rack  Status  State   Load       Owns    Token
                                                                 141784319550391026443072753096570088106
10.49.127.4    eu-west  1a    Up      Normal  8.19 GB    16.67%  0
10.241.29.65   eu-west  1b    Up      Normal  8.18 GB    16.67%  28356863910078205288614550619314017621
10.59.46.236   eu-west  1c    Up      Normal  8.22 GB    16.67%  56713727820156410577229101238628035242
10.50.33.232   eu-west  1a    Up      Normal  8.2 GB     16.67%  85070591730234615865843651857942052864
10.234.71.33   eu-west  1b    Up      Normal  8.15 GB    16.67%  113427455640312821154458202477256070485
10.58.249.118  eu-west  1c    Up      Normal  660.98 MB  16.67%  141784319550391026443072753096570088106
The node is the one with 660.98 MB of data (the opscenter keyspace data, which was not invalidated). So I have some questions:
1) What did I do wrong? - why was cassandra throwing exceptions on first startup?
2) Why was the keyspace data invalidated? Is it expected?
3) If the answer to #2 is yes, it's expected, then what's the point in doing http://wiki.apache.org/cassandra/FAQ#schema_disagreement if all keyspace data is lost anyway? It makes more sense to just do http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
4) afaiu I could also stop cassandra again, move the old sstables from the snapshot back to the keyspace data dir and run repair for all keyspace CFs? So that it finishes faster and makes less load than running a repair which has no previous keyspace data at all?
The first startup log is below:
INFO [main] 2012-05-18 16:23:07,367 AbstractCassandraDaemon.java (line 105) Logging initialized
INFO [main] 2012-05-18 16:23:07,382 AbstractCassandraDaemon.java (line 126) JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24
INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 127) Heap size: 2600468480/2600468480
INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 128) Classpath: /etc/cassandra/conf:/usr/share/java/jna.jar:/usr/share/java/mx4j-tools.jar:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/apache-cassandra-1.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-clientutil-1.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-thrift-1.0.9.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra//lib/jamm-0.2.5.jar
INFO [main] 2012-05-18 16:23:10,661 CLibrary.java (line 109) JNA mlockall successful
INFO [main] 2012-05-18 16:23:10,692 DatabaseDescriptor.java (line 114) Loading settings from file:/etc/cassandra/ssa/cassandra.yaml
INFO [main] 2012-05-18 16:23:10,868 DatabaseDescriptor.java (line 168) DiskAccessMode 'auto' determined
Re: Schema disagreement in 1.0.2
So I was able to get the schema agreeing on the two bad nodes, but I don't particularly like the way that I did it. One at a time, I shut them down, removed Schema* and Migration*, then copied over Schema* from another working node. They then started up with the correct schema. Did I do something totally incorrect in doing that? Also, some of my nodes are reporting that others are unreachable via the CLI when executing describe cluster;. Not all of the nodes do this; about 7/10 are perfectly fine. I tried restarting each of the nodes that say others are unreachable; when they came back up, their unreachable list had changed. Nodetool gossipinfo describes everything perfectly fine, as does nodetool ring. The topology of the cluster is 2 datacenters, 5 servers each, with an RF of 3. Only one datacenter seems to have this issue.
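For anyone weighing the same move, the recovery described above corresponds to roughly these steps per bad node; a sketch assuming 1.0-era default paths. Note the wiki-recommended variant is simply to delete the Schema*/Migrations* files and let the restarted node re-request the schema from a live peer, rather than copying another node's files:

$ sudo service cassandra stop
$ rm /var/lib/cassandra/data/system/Schema* /var/lib/cassandra/data/system/Migrations*
$ scp good-node:/var/lib/cassandra/data/system/Schema\* /var/lib/cassandra/data/system/   # the copy step the poster used
$ sudo service cassandra start
$ echo "describe cluster;" | cassandra-cli -h localhost    # verify a single schema version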
Schema disagreement issue in 1.0.0
I'm facing the following issue with a Cassandra 1.0 setup. The same works on 0.8.7.
# cassandra-cli -h x.x.x.x -f RTSCFs.sch
Connected to: Real Time Stats on x.x.x.x/9160
Authenticated to keyspace: Stats
39c3e120-fa24-11e0--61d449114eff
Waiting for schema agreement...
The schema has not settled in 10 seconds; further migrations are ill-advised until it does. Versions are 39c3e120-fa24-11e0--61d449114eff:[x.x.x.x], 317eb8f0-fa24-11e0--61d449114eff:[x.x.x.y]
I tried this: http://wiki.apache.org/cassandra/FAQ#schema_disagreement But now when I restart the cluster I'm getting `org.apache.cassandra.config.ConfigurationException: Invalid definition for comparator org.apache.cassandra.db.marshal.CompositeType`. This is my keyspace defn:
create keyspace Stats with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options={replication_factor:1};
This is my CF defn:
create column family Sample_Stats with default_validation_class=CounterColumnType and key_validation_class='CompositeType(UTF8Type,UTF8Type)' and comparator='CompositeType(UTF8Type, UTF8Type)' and replicate_on_write=true;
What am I missing?
Re: Schema disagreement issue in 1.0.0
Looks like a bug; the patch is here: https://issues.apache.org/jira/browse/CASSANDRA-3391 Until it is fixed, avoid using CompositeType in the key_validation_class, and blow away the Schema and Migrations SSTables. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 20/10/2011, at 7:59 PM, Tamil selvan R.S wrote: I'm facing the following issue with a Cassandra 1.0 setup. The same works on 0.8.7. # cassandra-cli -h x.x.x.x -f RTSCFs.sch Connected to: Real Time Stats on x.x.x.x/9160 Authenticated to keyspace: Stats 39c3e120-fa24-11e0--61d449114eff Waiting for schema agreement... The schema has not settled in 10 seconds; further migrations are ill-advised until it does. Versions are 39c3e120-fa24-11e0--61d449114eff:[x.x.x.x], 317eb8f0-fa24-11e0--61d449114eff:[x.x.x.y] I tried this: http://wiki.apache.org/cassandra/FAQ#schema_disagreement But now when I restart the cluster I'm getting `org.apache.cassandra.config.ConfigurationException: Invalid definition for comparator org.apache.cassandra.db.marshal.CompositeType`. This is my keyspace defn: create keyspace Stats with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options={replication_factor:1}; This is my CF defn: create column family Sample_Stats with default_validation_class=CounterColumnType and key_validation_class='CompositeType(UTF8Type,UTF8Type)' and comparator='CompositeType(UTF8Type, UTF8Type)' and replicate_on_write=true; What am I missing?
Re: How to solve this kind of schema disagreement...
I don't have time to look into the reasons for that error, but it does not sound good. It sounds like there are multiple migration chains out there in the cluster; this can come from applying changes to different nodes at the same time. Is this a prod system? If not, I would shut it down, wipe all the Schema and Migration SSTables, and then apply the schema again one CF at a time (it will take time to read the data). If it's a prod system, it may need some delicate surgery on the Migrations and Schema CFs. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com
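A sketch of the "one CF at a time" reapply Aaron describes, assuming the schema has been split into one cassandra-cli script per column family (the file names are hypothetical):

# After wiping Schema*/Migrations* on every node and restarting them,
# re-create the schema incrementally and let each migration settle.
for f in keyspace.cli cf_one.cli cf_two.cli; do
  cassandra-cli -h 127.0.0.1 -f "$f"
  sleep 30   # crude settling pause; check `describe cluster;` between steps
done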
Re: How to solve this kind of schema disagreement...
Um, there has got to be something stopping the migration from completing. Turn the logging up to DEBUG before starting and look for messages from MigrationManager.java. Provide all the log messages from Migration.java on the 1.27 node. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com
Re: How to solve this kind of schema disagreement...
Hi Aaron, I set the log level to DEBUG and found a lot of forceFlush debug info in the log:

DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) forceFlush requested but everything is clean
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) forceFlush requested but everything is clean
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) forceFlush requested but everything is clean
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) forceFlush requested but everything is clean

What does this mean? Thanks. -- Dikang Gu 0086 - 18611140205
Re: How to solve this kind of schema disagreement...
And a lot of "not applied" logs:

DEBUG [MigrationStage:1] 2011-08-10 11:36:29,376 DefinitionsUpdateVerbHandler.java (line 70) Applying AddColumnFamily from /192.168.1.9
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,376 DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous version mismatch. cannot apply.
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,379 DefinitionsUpdateVerbHandler.java (line 70) Applying AddColumnFamily from /192.168.1.9
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,379 DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous version mismatch. cannot apply.
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,382 DefinitionsUpdateVerbHandler.java (line 70) Applying AddColumnFamily from /192.168.1.9
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,382 DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous version mismatch. cannot apply.

-- Dikang Gu 0086 - 18611140205
Re: How to solve this kind of schema disagreement...
Did you check the logs on 1.27 for errors? Could you be seeing this? https://issues.apache.org/jira/browse/CASSANDRA-2867 Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com
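A throwaway sketch of the log check Aaron is asking for, assuming the default log location /var/log/cassandra/system.log (an assumption; installs vary):

# every migration-related message
grep -E 'Migration(Manager)?\.java' /var/log/cassandra/system.log
# anything that could be stopping a migration from applying
grep -E 'ERROR|Exception' /var/log/cassandra/system.log | tail -n 50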
Re: How to solve this kind of schema disagreement...
Hi Aaron, I repeated the whole procedure:

1. kill the cassandra instance on 1.27.
2. rm the data/system/Migrations-g-*
3. rm the data/system/Schema-g-*
4. bin/cassandra to start cassandra.

Now the migration seems to have stopped, and I do not find any error in the system.log yet. The ring looks good:

[root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 -p8090 ring
Address       DC           Rack   Status  State   Load     Owns    Token
                                                                   127605887595351923798765477786913079296
192.168.1.28  datacenter1  rack1  Up      Normal  8.38 GB  25.00%  1
192.168.1.25  datacenter1  rack1  Up      Normal  8.54 GB  34.01%  57856537434773737201679995572503935972
192.168.1.27  datacenter1  rack1  Up      Normal  1.78 GB  24.28%  99165710459060760249270263771474737125
192.168.1.9   datacenter1  rack1  Up      Normal  8.75 GB  16.72%  127605887595351923798765477786913079296

But the schema is still not correct:

Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 192.168.1.25]
        5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]

The 5a54ebd0-bd90-11e0--9510c23fceff is the same as last time… And in the log, the last Migration.java entry is:

INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) Applying migration 5a54ebd0-bd90-11e0--9510c23fceff Add keyspace: SimpleDB_4E38DAA64894A9146105, rep strategy: SimpleStrategy{}, durable_writes: true

Could you explain this? If I change the token given to 1.27 to another one, will it help? Thanks. -- Dikang Gu 0086 - 18611140205
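A sketch of a watch loop to see whether 1.27 ever converges, reusing the cassandra-cli -f invocation style from this thread (host and interval are arbitrary):

echo 'describe cluster;' > /tmp/describe.cli
while true; do
  cassandra-cli -h 192.168.1.27 -f /tmp/describe.cli | grep -A 4 'Schema versions'
  sleep 10
done
# a single version listed under 'Schema versions:' means the cluster agrees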
Re: How to solve this kind of schema disagreement...
I have tried this, but the schema still does not agree in the cluster:

[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        UNREACHABLE: [192.168.1.28]
        75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
        5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]

Any other suggestions to solve this? I have some production data saved in the cassandra cluster, so I cannot afford data loss... Thanks. -- Dikang Gu 0086 - 18611140205
Re: How to solve this kind of schema disagreement...
After the restart, what was in the logs on the 1.27 machine from the Migration.java logger? Some of the messages will start with "Applying migration". You should have shut down both of the nodes, then deleted the schema* and migration* system sstables, then restarted one of them and watched to see if it got to schema agreement. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com
Re: How to solve this kind of schema disagreement...
I restarted both nodes: deleted the schema* and migration* sstables and started them again. The current cluster looks like this:

[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 192.168.1.25]
        5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]

1.28 looks good, but 1.27 still cannot get to schema agreement... I have tried several times, even deleting all the data on 1.27 and rejoining it as a new node, but it is still unhappy. And the ring looks like this:

Address       DC           Rack   Status  State    Load     Owns    Token
                                                                    127605887595351923798765477786913079296
192.168.1.28  datacenter1  rack1  Up      Normal   8.38 GB  25.00%  1
192.168.1.25  datacenter1  rack1  Up      Normal   8.55 GB  34.01%  57856537434773737201679995572503935972
192.168.1.27  datacenter1  rack1  Up      Joining  1.81 GB  24.28%  99165710459060760249270263771474737125
192.168.1.9   datacenter1  rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296

1.27 seems unable to join the cluster; it just hangs there... Any suggestions? Thanks. -- Dikang Gu 0086 - 18611140205
How to solve this kind of schema disagreement...
[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        743fe590-bf48-11e0--4d205df954a7: [192.168.1.28]
        75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
        06da9aa0-bda8-11e0--9510c23fceff: [192.168.1.27]

Three different schema versions in the cluster... -- Dikang Gu 0086 - 18611140205
Re: How to solve this kind of schema disagreement...
Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement, 75eece10-bf48-11e0--4d205df954a7 owns the majority, so shut down and remove the schema* and migration* sstables from both 192.168.1.28 and 192.168.1.27.
Re: Schema Disagreement
Thanks Aaron.
Re: Schema Disagreement
It means the node you ran the command against could not contact node 192.168.1.25; it's probably down. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com
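Before treating UNREACHABLE as a schema problem, it is worth checking whether the node is simply down, as Aaron says. A quick liveness sketch (addresses from the thread; the default JMX port and the default storage/Thrift ports 7000 and 9160 are assumptions):

# ask another node what it thinks of .25
nodetool -h 192.168.1.9 ring
# basic reachability and port checks for the suspect node
ping -c 3 192.168.1.25
nc -z -w 3 192.168.1.25 7000 && echo 'storage (gossip) port open'
nc -z -w 3 192.168.1.25 9160 && echo 'thrift port open'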
Re: Schema Disagreement
What do you see when you run describe cluster; in the cassandra-cli? What's the exact error you get, and is there anything in the server-side logs? Have you added other CFs before adding this one? Did the schema agree before running this statement? I ran your statement on the current trunk and it worked. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com
Re: Schema Disagreement
Hang on, using my brain now. That is triggering a small bug in the code; see https://issues.apache.org/jira/browse/CASSANDRA-2984 For now, just remove the column metadata. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com
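Aaron's interim workaround, sketched as a cli script: the same super column family minus the column_metadata block that triggers CASSANDRA-2984 (the host is a placeholder). Validation of the sub-column values then has to happen client-side until the fix ships:

cat > /tmp/sd.cli <<'EOF'
create column family sd
  with column_type = 'Super'
  and key_validation_class = 'UUIDType'
  and comparator = 'LongType'
  and subcomparator = 'UTF8Type';
EOF
cassandra-cli -h 127.0.0.1 -f /tmp/sd.cli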
Re: Schema Disagreement
I also encountered schema disagreement in my 0.8.1 cluster today… The disagreement occurs when I create a column family using the Hector API, and I found the following errors in my cassandra/system.log:

ERROR [pool-2-thread-99] 2011-08-03 11:21:18,051 Cassandra.java (line 3378) Internal error processing remove
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
    at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
    at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
    at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
    at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
    at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
    at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
    at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
    at org.apache.cassandra.thrift.CassandraServer.internal_remove(CassandraServer.java:539)
    at org.apache.cassandra.thrift.CassandraServer.remove(CassandraServer.java:547)
    at org.apache.cassandra.thrift.Cassandra$Processor$remove.process(Cassandra.java:3370)
    at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)

And when I try to decommission, I get this:

ERROR [pool-2-thread-90] 2011-08-03 11:24:35,611 Cassandra.java (line 3462) Internal error processing batch_mutate
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
    at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
    at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
    at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
    at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
    at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
    at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
    at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
    at org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
    at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
    at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
    at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)

What does this mean? Thanks. -- Dikang Gu 0086 - 18611140205
Re: Schema Disagreement
Have you seen http://wiki.apache.org/cassandra/FAQ#schema_disagreement ?
Re: Schema Disagreement
I followed the instructions in the FAQ, but got the following from describe cluster;:

Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
        dd73c740-bd84-11e0--98dab94442fb: [192.168.1.28, 192.168.1.9, 192.168.1.27]
        UNREACHABLE: [192.168.1.25]

What's the UNREACHABLE? Thanks. -- Dikang Gu 0086 - 18611140205
    at org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
    at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
    at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
    at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)

What does this mean? Thanks.

--
Dikang Gu
0086 - 18611140205

On Tuesday, August 2, 2011 at 6:04 PM, aaron morton wrote:

Hang on, using brain now. That is triggering a small bug in the code, see https://issues.apache.org/jira/browse/CASSANDRA-2984

For now, just remove the column metadata.

Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 2 Aug 2011, at 21:19, aaron morton wrote:

What do you see when you run describe cluster; in the cassandra-cli? What's the exact error you get?
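For reference, the FAQ fix Jonathan points to amounts to restarting the node that disagrees without its locally persisted schema, so that it re-learns the schema from the rest of the cluster. A minimal sketch for a 0.8-era node; the /var/lib/cassandra/data location and the service commands are assumptions (a default Debian-style install), so adjust them to your data_file_directories and init setup:

    # Run on the node whose schema version disagrees, not on a healthy node.

    # 1. Stop the Cassandra process.
    sudo service cassandra stop

    # 2. Remove only the persisted schema definitions from the system
    #    keyspace; user data is left untouched.
    sudo rm /var/lib/cassandra/data/system/Schema*
    sudo rm /var/lib/cassandra/data/system/Migrations*

    # 3. Restart; on startup the node pulls the current schema from its peers.
    sudo service cassandra start

    # 4. Verify from cassandra-cli that "describe cluster;" now lists a
    #    single schema version with every node under it.

As for the UNREACHABLE entry: it typically means the node answering describe cluster; could not contact that endpoint to ask for its schema version at all, so the first thing to check is whether 192.168.1.25 is up and gossiping before treating it as a schema problem.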
Schema Disagreement
Dear all,

I keep running into schema disagreement problems while trying to create a column family like this, using cassandra-cli:

create column family sd
  with column_type = 'Super'
  and key_validation_class = 'UUIDType'
  and comparator = 'LongType'
  and subcomparator = 'UTF8Type'
  and column_metadata = [
    { column_name: 'time', validation_class: 'LongType' },
    { column_name: 'open', validation_class: 'FloatType' },
    { column_name: 'high', validation_class: 'FloatType' },
    { column_name: 'low', validation_class: 'FloatType' },
    { column_name: 'close', validation_class: 'FloatType' },
    { column_name: 'volumn', validation_class: 'LongType' },
    { column_name: 'splitopen', validation_class: 'FloatType' },
    { column_name: 'splithigh', validation_class: 'FloatType' },
    { column_name: 'splitlow', validation_class: 'FloatType' },
    { column_name: 'splitclose', validation_class: 'FloatType' },
    { column_name: 'splitvolume', validation_class: 'LongType' },
    { column_name: 'splitclose', validation_class: 'FloatType' }
  ];

I've tried to erase everything and restart Cassandra, but this still happens. When I drop the column_metadata section, there is no more disagreement error. Do you have any idea why this happens?

Environment: 2 VMs sharing the same hard drive, Cassandra 0.8.1, Ubuntu 10.04. This is for testing only; we'll move to dedicated servers later.

Best regards,
Yi
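Worth noting: the column_metadata list above declares 'splitclose' twice, which is plausibly what trips the CASSANDRA-2984 bug aaron references earlier in the thread. A sketch of the same cassandra-cli statement with the duplicate entry dropped; the column names (including 'volumn') are kept exactly as the poster wrote them, and whether deduplication alone resolves the disagreement is an assumption, not something verified against 0.8.1:

    create column family sd
      with column_type = 'Super'
      and key_validation_class = 'UUIDType'
      and comparator = 'LongType'
      and subcomparator = 'UTF8Type'
      and column_metadata = [
        { column_name: 'time', validation_class: 'LongType' },
        { column_name: 'open', validation_class: 'FloatType' },
        { column_name: 'high', validation_class: 'FloatType' },
        { column_name: 'low', validation_class: 'FloatType' },
        { column_name: 'close', validation_class: 'FloatType' },
        { column_name: 'volumn', validation_class: 'LongType' },
        { column_name: 'splitopen', validation_class: 'FloatType' },
        { column_name: 'splithigh', validation_class: 'FloatType' },
        { column_name: 'splitlow', validation_class: 'FloatType' },
        { column_name: 'splitclose', validation_class: 'FloatType' },
        { column_name: 'splitvolume', validation_class: 'LongType' }
      ];

If disagreement still occurs with unique column names, aaron's advice stands: create the column family without column_metadata first, then add the metadata in a second step once the schema has converged.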
Re: Schema Disagreement
I thought the schema disagreement problem was already solved in 0.8.1... One possible solution is to decommission the disagreeing node and rejoin it.

On Tue, Aug 2, 2011 at 8:01 AM, Yi Yang yy...@me.com wrote:

> Dear all,
>
> I keep running into schema disagreement problems while trying to create a column family like this, using cassandra-cli:
>
> create column family sd
>   with column_type = 'Super'
>   and key_validation_class = 'UUIDType'
>   and comparator = 'LongType'
>   and subcomparator = 'UTF8Type'
>   and column_metadata = [
>     { column_name: 'time', validation_class: 'LongType' },
>     { column_name: 'open', validation_class: 'FloatType' },
>     { column_name: 'high', validation_class: 'FloatType' },
>     { column_name: 'low', validation_class: 'FloatType' },
>     { column_name: 'close', validation_class: 'FloatType' },
>     { column_name: 'volumn', validation_class: 'LongType' },
>     { column_name: 'splitopen', validation_class: 'FloatType' },
>     { column_name: 'splithigh', validation_class: 'FloatType' },
>     { column_name: 'splitlow', validation_class: 'FloatType' },
>     { column_name: 'splitclose', validation_class: 'FloatType' },
>     { column_name: 'splitvolume', validation_class: 'LongType' },
>     { column_name: 'splitclose', validation_class: 'FloatType' }
>   ];
>
> I've tried to erase everything and restart Cassandra, but this still happens. When I drop the column_metadata section, there is no more disagreement error. Do you have any idea why this happens?
>
> Environment: 2 VMs sharing the same hard drive, Cassandra 0.8.1, Ubuntu 10.04. This is for testing only; we'll move to dedicated servers later.
>
> Best regards,
> Yi

--
Dikang Gu
0086 - 18611140205
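For completeness, the decommission-and-rejoin route suggested above would look roughly like this on a 0.8-era install; the placeholder IP, paths, and service commands are assumptions (a default Debian-style layout), not anything stated in the thread:

    # 1. From any machine with nodetool, push the disagreeing node out of
    #    the ring; it streams its ranges to the remaining replicas first.
    nodetool -h <ip-of-disagreeing-node> decommission

    # 2. On that node, stop Cassandra and wipe its local state so nothing
    #    stale (including the bad schema) survives the rejoin.
    sudo service cassandra stop
    sudo rm -rf /var/lib/cassandra/data /var/lib/cassandra/commitlog /var/lib/cassandra/saved_caches

    # 3. Start it again; with auto_bootstrap enabled the node re-enters the
    #    ring and adopts the schema version the rest of the cluster agrees on.
    sudo service cassandra start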