Re: Schema disagreement

2018-05-01 Thread Gábor Auth
Hi,

On Tue, May 1, 2018 at 10:27 PM Gábor Auth  wrote:

> One or two years ago I tried the CDC feature but switched it off... maybe
> is this a side effect of the switched-off CDC? How can I fix it? :)
>

Okay, I've worked it out. I updated the schema of the affected keyspaces on the
new nodes with 'cdc=false' and everything is okay now.
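The change itself was along these lines for each affected table (keyspace and
table names here are placeholders):

ALTER TABLE my_keyspace.my_table WITH cdc = false;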

I think it is a strange bug around CDC...

Bye,
Gábor Auth


Re: Schema disagreement

2018-05-01 Thread Gábor Auth
Hi,

On Tue, May 1, 2018 at 7:40 PM Gábor Auth  wrote:

> What can I do? Any suggestion? :(
>

Okay, I've diffed the good and the bad system_schema tables. The only
difference is the `cdc` field in three keyspaces (in `tables` and `views`):
- the value of the `cdc` field on the good node is `False`
- the value of the `cdc` field on the bad node is `null`

The value of the `cdc` field in the other keyspaces is `null`.
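For reference, the field can be checked with something like the following
(the `cdc` column lives in `system_schema.tables` and `system_schema.views`
on 3.x, as far as I can tell):

SELECT keyspace_name, table_name, cdc FROM system_schema.tables;
SELECT keyspace_name, view_name, cdc FROM system_schema.views;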

One or two years ago I tried the CDC feature but switched it off... maybe
is this a side effect of the switched-off CDC? How can I fix it? :)

Bye,
Gábor Auth


Re: Schema disagreement

2018-05-01 Thread Gábor Auth
Hi,

On Mon, Apr 30, 2018 at 11:11 PM Gábor Auth  wrote:

> On Mon, Apr 30, 2018 at 11:03 PM Ali Hubail 
> wrote:
>
>> What steps have you performed to add the new DC? Have you tried to follow
>> certain procedures like this?
>>
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html
>>
>
> Yes, exactly. :/
>

Okay, I removed all the new nodes (with `removenode`) and cleared each new
node (removed its data and logs).

I did all the steps described in the link (again).

Same result:

Cluster Information:
   Name: cluster
   Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
   Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
   Schema versions:
   5de14758-887d-38c1-9105-fc60649b0edf: [new, new, ...]

   f4ed784a-174a-38dd-a7e5-55ff6f3002b2: [old, old, ...]

The old nodes try to gossip their own schema:
DEBUG [InternalResponseStage:1] 2018-05-01 17:36:36,266
MigrationManager.java:572 - Gossiping my schema version
f4ed784a-174a-38dd-a7e5-55ff6f3002b2
DEBUG [InternalResponseStage:1] 2018-05-01 17:36:36,863
MigrationManager.java:572 - Gossiping my schema version
f4ed784a-174a-38dd-a7e5-55ff6f3002b2

The new nodes try to gossip their own schema:
DEBUG [InternalResponseStage:4] 2018-05-01 17:36:26,329
MigrationManager.java:572 - Gossiping my schema version
5de14758-887d-38c1-9105-fc60649b0edf
DEBUG [InternalResponseStage:4] 2018-05-01 17:36:27,595
MigrationManager.java:572 - Gossiping my schema version
5de14758-887d-38c1-9105-fc60649b0edf
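The listing above is the `nodetool describecluster` output; the version a
single node believes in can also be checked locally with, for example:

SELECT schema_version FROM system.local;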

What can I do? Any suggestion? :(

Bye,
Gábor Auth


Re: Schema disagreement

2018-04-30 Thread Gábor Auth
Hi,

On Mon, Apr 30, 2018 at 11:03 PM Ali Hubail 
wrote:

> What steps have you performed to add the new DC? Have you tried to follow
> certain procedures like this?
>
> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html
>

Yes, exactly. :/

Bye,
Gábor Auth


Re: Schema disagreement

2018-04-30 Thread Ali Hubail
Hi,

What steps have you performed to add the new DC? Have you tried to follow 
certain procedures like this?
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html

A node can appear offline to other nodes for various reasons. It would help 
greatly to know what steps you have taken, in order to understand why you're 
facing this.

Ali Hubail






Re: Schema disagreement

2018-04-30 Thread Gábor Auth
Hi,

On Mon, Apr 30, 2018 at 11:39 AM Gábor Auth  wrote:

> 've just tried to add a new DC and new node to my cluster (3 DCs and 10
> nodes) and the new node has a different schema version:
>

Is this normal? The node is marked down, yet a repair completes successfully?

WARN  [MigrationStage:1] 2018-04-30 20:36:56,579 MigrationTask.java:67 -
Can't send schema pull request: node /x.x.216.121 is down.
INFO  [AntiEntropyStage:1] 2018-04-30 20:36:56,611 Validator.java:281 -
[repair #323bf873-4cb6-11e8-bdd5-5feb84046dc9] Sending completed merkle
tree to /x.x.216.121 for keyspace.table

The `nodetool status` output looks good:
UN  x.x.216.121  959.29 MiB  32  ?  322e4e9b-4d9e-43e3-94a3-bbe012058516  RACK01

Bye,
Gábor Auth


Schema disagreement

2018-04-30 Thread Gábor Auth
Hi,

I've just tried to add a new DC and new node to my cluster (3 DCs and 10
nodes) and the new node has a different schema version:

Cluster Information:
Name: cluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
7e12a13e-dcca-301b-a5ce-b1ad29fbbacb: [x.x.x.x, ..., ...]
bb186922-82b5-3a61-9c12-bf4eb87b9155: [new.new.new.new]

I've tried:
- node decommission and node re-addition
- resetlocalschema
- rebuild
- replace node
- repair
- cluster restart (node-by-node)

The MigrationManager is constantly running on the new node, trying to migrate
the schema:
DEBUG [NonPeriodicTasks:1] 2018-04-30 09:33:22,405
MigrationManager.java:125 - submitting migration task for /x.x.x.x

What else can I do? :(

Bye,
Gábor Auth


Re: Schema Disagreement vs Nodetool resetlocalschema

2016-09-11 Thread Jens Rantil
Hi Michael,

Did you ever get an answer on this? I'm curious to hear for future
reference.

Thanks,
Jens

On Monday, June 20, 2016, Michael Fong <michael.f...@ruckuswireless.com>
wrote:

> Hi,
>
>
>
> We have recently encountered several schema disagreement issues while
> upgrading Cassandra. In one of the cases, the 2-node cluster idled for over
> 30 minutes and their schemas remained unsynced. Due to other logic flows,
> Cassandra cannot be restarted, and hence we need to come up with an
> alternative on the fly. We are thinking of doing a nodetool resetlocalschema
> to force the schema synchronization. How safe is this method? Do we need to
> disable the thrift/gossip protocols before performing this operation, and
> enable them back after the resync completes?
>
>
>
> Thanks in advance!
>
>
>
> Sincerely,
>
>
>
> Michael Fong
>


-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se



Schema Disagreement vs Nodetool resetlocalschema

2016-06-19 Thread Michael Fong
Hi,

We have recently encountered several schema disagreement issues while upgrading 
Cassandra. In one of the cases, the 2-node cluster idled for over 30 minutes 
and their schemas remained unsynced. Due to other logic flows, Cassandra cannot be 
restarted, and hence we need to come up with an alternative on the fly. We are 
thinking of doing a nodetool resetlocalschema to force the schema synchronization. 
How safe is this method? Do we need to disable the thrift/gossip protocols before 
performing this operation, and enable them back after the resync completes?
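Spelled out, the sequence we have in mind is roughly the following (whether
the disable/enable steps are actually required is exactly our question):

nodetool disablethrift
nodetool disablegossip
nodetool resetlocalschema
nodetool enablegossip
nodetool enablethrift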

Thanks in advance!

Sincerely,

Michael Fong


Re: What are problems with schema disagreement

2015-07-06 Thread Robert Coli
On Mon, Jul 6, 2015 at 1:30 PM, John Wong gokoproj...@gmail.com wrote:

 But is there a problem with letting a schema disagreement run for a long
 time?


It depends on what the nature of the desynch is, but generally speaking
there may be.

If you added a column or a columnfamily, and one node didn't get that
update, it will throw exceptions when your clients attempt to read that
column/columnfamily. And so on...

=Rob


Re: What are problems with schema disagreement

2015-07-06 Thread John Wong
Thanks. Yeah, we typically restart the nodes in the minority schema version to
force a resync.

But is there a problem with letting a schema disagreement run for a long
time?

Thanks.

John

On Mon, Jul 6, 2015 at 2:29 PM, Robert Coli rc...@eventbrite.com wrote:



 On Thu, Jul 2, 2015 at 9:31 PM, John Wong gokoproj...@gmail.com wrote:

 Hi Graham. Thanks. We are still running on 1.2.16, but we do plan to
 upgrade in the near future. The load on the cluster at the time was very
 very low. All nodes were responsive, except nothing was show up in the logs
 after certain time, which led me to believe something happened internal,
 although that was a poor wild guess.

 But is it safe to be okay with schema disagreement? I worry about data
 consistency if I let it sit too long.


 In general one shouldn't run with schema disagreement persistently.

 I've seen schema desynch issues on 1.2.x, in general restarting some
 unclear subset of the affected daemons made them synch.

 =Rob




Re: What are problems with schema disagreement

2015-07-06 Thread Robert Coli
On Thu, Jul 2, 2015 at 9:31 PM, John Wong gokoproj...@gmail.com wrote:

 Hi Graham. Thanks. We are still running on 1.2.16, but we do plan to
 upgrade in the near future. The load on the cluster at the time was very,
 very low. All nodes were responsive, except nothing showed up in the logs
 after a certain time, which led me to believe something happened internally,
 although that was a wild guess.

 But is it safe to live with a schema disagreement? I worry about data
 consistency if I let it sit too long.


In general one shouldn't run with schema disagreement persistently.

I've seen schema desynch issues on 1.2.x, in general restarting some
unclear subset of the affected daemons made them synch.

=Rob


What are problems with schema disagreement

2015-07-02 Thread John Wong
Hi.

Here is a schema disagreement we encountered.
Schema versions:
b6467059-5897-3cc1-9ee2-73f31841b0b0: [10.0.1.100, 10.0.1.109]
c8971b2d-0949-3584-aa87-0050a4149bbd: [10.0.1.55, 10.0.1.16,
10.0.1.77]
c733920b-2a31-30f0-bca1-45a8c9130a2c: [10.0.1.221]

We deployed an application which would send a schema update (DDL=auto). We
found this prod cluster had 3 different schema versions. Other existing
applications were fine, so some people were curious what would happen if we
left this problem alone until off hours.

Are there any concerns with not resolving the schema disagreement right away? FWIW
we went ahead and restarted 221 first, and continued with the rest of the
minority nodes.

Thanks.

John


Re: What are problems with schema disagreement

2015-07-02 Thread graham sanderson
What version of C* are you running? Some versions of 2.0.x might occasionally 
fail to propagate schema changes in a timely fashion (though they would fix 
themselves eventually, on the order of a few minutes)

 On Jul 2, 2015, at 9:37 PM, John Wong gokoproj...@gmail.com wrote:
 
 Hi.
 
 Here is a schema disagreement we encountered.
 Schema versions:
 b6467059-5897-3cc1-9ee2-73f31841b0b0: [10.0.1.100, 10.0.1.109]
 c8971b2d-0949-3584-aa87-0050a4149bbd: [10.0.1.55, 10.0.1.16, 
 10.0.1.77]
 c733920b-2a31-30f0-bca1-45a8c9130a2c: [10.0.1.221]
 
 We deployed an application which would send a schema update (DDL=auto). We 
 found this prod cluster had 3 different schema versions. Other existing 
 applications were fine, so some people were curious what would happen if we 
 left this problem alone until off hours.
 
 Are there any concerns with not resolving the schema disagreement right away? FWIW 
 we went ahead and restarted 221 first, and continued with the rest of the 
 minority nodes.
 
 Thanks.
 
 John
 





Re: What are problems with schema disagreement

2015-07-02 Thread John Wong
On Thu, Jul 2, 2015 at 11:01 PM, graham sanderson gra...@vast.com wrote:

 What version of C* are you running? Some versions of 2.0.x might
 occasionally fail to propagate schema changes in a timely fashion (though
 they would fix themselves eventually, on the order of a few minutes)


Hi Graham. Thanks. We are still running on 1.2.16, but we do plan to
upgrade in the near future. The load on the cluster at the time was very,
very low. All nodes were responsive, except nothing showed up in the logs
after a certain time, which led me to believe something happened internally,
although that was a wild guess.

But is it safe to live with a schema disagreement? I worry about data
consistency if I let it sit too long.

Thanks.

John

  On Jul 2, 2015, at 9:37 PM, John Wong gokoproj...@gmail.com wrote:
 
  Hi.
 
  Here is a schema disagreement we encountered.
  Schema versions:
  b6467059-5897-3cc1-9ee2-73f31841b0b0: [10.0.1.100, 10.0.1.109]
  c8971b2d-0949-3584-aa87-0050a4149bbd: [10.0.1.55, 10.0.1.16,
 10.0.1.77]
  c733920b-2a31-30f0-bca1-45a8c9130a2c: [10.0.1.221]
 
  We deployed an application which would send a schema update (DDL=auto).
 We found this prod cluster had 3 schema difference. Other existing
 applications were fine, so some people were curious what if we left this
 problem alone until off hours.
 
  Is there any concerns with not resolve schema disagreement right away?
 FWIW we went ahead and restarted 221 first, and continue with the rest of
 the minors.
 
  Thanks.
 
  John
 




Cassandra schema disagreement

2014-08-12 Thread Demeyer Jonathan
Hello,

I have a cluster running and I'm trying to change the schema on it. Although it 
succeeds on one cluster (a test one), on another it keeps creating two separate 
schema versions (both are 2-DC configurations; the cluster where it goes wrong 
ends up with a different schema version in each DC).

I use apache-cassandra11-1.1.12 on CentOS 6.4

I'm trying to start from a fresh cassandra config (doing "rm -rf 
/var/lib/cassandra/{commitlog,data}/*" while cassandra is stopped).

Each DC is on a separate IP segment, but there is no firewall between them.
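A quick way to double-check connectivity on the inter-node port (7000 is the
default storage_port; the address is one of the nodes from the output below):

nc -vz 10.69.10.14 7000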

Here is the output of the command when the desynchronisation occurs:
---
[root@cassandranode00 CDN]# cassandra-cli -f reCreateCassandraStruct.sh
Connected to: TTF Cluster v2013_1257 on 127.0.0.1/9160
7ef8c681-189a-3088-8598-560437f705d9
Waiting for schema agreement...
... schemas agree across the cluster
Authenticated to keyspace: ks1
f179fd8e-f8ca-36cf-bf53-d8341fd6006e
Waiting for schema agreement...
The schema has not settled in 10 seconds; further migrations are ill-advised 
until it does.
Versions are f179fd8e-f8ca-36cf-bf53-d8341fd6006e:[10.69.221.20, 10.69.221.21, 
10.69.221.22], e9656b30-b671-3fce-9fb4-bdd3e6da36d1:[10.69.10.14, 10.69.10.13, 
10.69.10.11]
---

I also tried creating a keyspace with a column family using OpsCenter (with 
no good result).

I'm out of hints as to where to look. Do you have any suggestions?

Are there improvements on this front in cassandra > 1.1.12?

Thanks,
Jonathan DEMEYER
Here is the start of reCreateCassandraStruct.sh :
CREATE KEYSPACE ks1 WITH placement_strategy = 'NetworkTopologyStrategy' AND 
strategy_options={DC1:3,DC2:3};
use ks1;
create column family id
with comparator = 'UTF8Type'
and key_validation_class = 'UTF8Type'
and column_metadata = [
{
column_name : 'user',
validation_class : UTF8Type
}
];
CREATE KEYSPACE ks2 WITH placement_strategy = 'NetworkTopologyStrategy' AND 
strategy_options={DC1:3,DC2:3};
use ks2;
create column family id;


RE: Cassandra schema disagreement

2014-08-12 Thread Demeyer Jonathan
After a lot of investigation, it seems that the clocks were desynchronized 
across the cluster (although we did not verify that resyncing them resolves the 
problem; we modified the schema with only one node up and restarted all the other 
nodes afterwards).




Re: Schema disagreement errors

2014-05-13 Thread Duncan Sands

Hi Gaurav, a schema versioning bug was fixed in 2.0.7.

Best wishes, Duncan.

On 12/05/14 21:31, Gaurav Sehgal wrote:

We have recently started seeing a lot of Schema Disagreement errors. We are
using Cassandra 2.0.6 with Oracle Java 1.7. I went through the Cassandra FAQ and
followed the below steps:


  * nodetool disablethrift
  * nodetool disablegossip
  * nodetool drain
  * 'kill pid'


As per the documentation, the commit logs should have been flushed; but that did
not happen in our case. The commit logs were still there. So, I removed them
manually to make sure there are no commit logs when cassandra starts up (which
was fine in our case as this data can always be replayed). I also deleted the
schema* directory from the /data/system folder.

Though when we started cassandra back up the issue started happening again.


Any help would be appreciated

Cheers!
Gaurav






Re: Schema disagreement errors

2014-05-13 Thread Vincent Mallet
Hey Gaurav,

You should consider moving to 2.0.7 which fixes a bunch of these schema
disagreement problems. You could also play around with nodetool
resetlocalschema on the nodes that are behind, but be careful with that
one. I'd go with 2.0.7 first for sure.

Thanks,

   Vince.


On Mon, May 12, 2014 at 12:31 PM, Gaurav Sehgal gsehg...@gmail.com wrote:

 We have recently started seeing a lot of Schema Disagreement errors. We
 are using Cassandra 2.0.6 with Oracle Java 1.7. I went through the
 Cassandra FAQ and followed the below steps:



- nodetool disablethrift
- nodetool disablegossip
- nodetool drain
- 'kill pid'


 As per the documentation, the commit logs should have been flushed; but that
 did not happen in our case. The commit logs were still there. So, I removed
 them manually to make sure there are no commit logs when cassandra starts
 up (which was fine in our case as this data can always be replayed). I
 also deleted the schema* directory from the /data/system folder.

 Though when we started cassandra back up the issue started happening again.


 Any help would be appreciated

 Cheers!
 Gaurav





Re: Schema disagreement errors

2014-05-13 Thread Robert Coli
On Tue, May 13, 2014 at 5:11 PM, Donald Smith 
donald.sm...@audiencescience.com wrote:

  I too have noticed that after doing “nodetool flush” (or “nodetool
 drain”), the commit logs are still there. I think they’re NEW (empty)
 commit logs, but I may be wrong. Anyone know?


Assuming they are being correctly marked clean after drain (which
historically has been a nontrivial assumption) they are new, empty
commit log segments which have been recycled.

=Rob


Schema disagreement errors

2014-05-12 Thread Gaurav Sehgal
We have recently started seeing a lot of Schema Disagreement errors. We are
using Cassandra 2.0.6 with Oracle Java 1.7. I went through the Cassandra
FAQ and followed the below steps:



   - nodetool disablethrift
   - nodetool disablegossip
   - nodetool drain
   - 'kill pid'


As per the documentation, the commit logs should have been flushed; but that
did not happen in our case. The commit logs were still there. So, I removed
them manually to make sure there are no commit logs when cassandra starts
up (which was fine in our case as this data can always be replayed). I
also deleted the schema* directory from the /data/system folder.

Though when we started cassandra back up the issue started happening again.


Any help would be appreciated

Cheers!
Gaurav


Re: Schema disagreement errors

2014-05-12 Thread Laing, Michael
Upgrade to 2.0.7 fixed this for me.

You can also try 'nodetool resetlocalschema' on disagreeing nodes. This
worked temporarily for me in 2.0.6.

ml


On Mon, May 12, 2014 at 3:31 PM, Gaurav Sehgal gsehg...@gmail.com wrote:

 We have recently started seeing a lot of Schema Disagreement errors. We
 are using Cassandra 2.0.6 with Oracle Java 1.7. I went through the
 Cassandra FAQ and followed the below steps:



- nodetool disablethrift
- nodetool disablegossip
- nodetool drain
- 'kill pid'


 As per the documentation, the commit logs should have been flushed; but that
 did not happen in our case. The commit logs were still there. So, I removed
 them manually to make sure there are no commit logs when cassandra starts
 up (which was fine in our case as this data can always be replayed). I
 also deleted the schema* directory from the /data/system folder.

 Though when we started cassandra back up the issue started happening again.


 Any help would be appreciated

 Cheers!
 Gaurav





Re: Schema disagreement under normal conditions, ALTER TABLE hangs

2013-11-28 Thread Josh Dzielak
Thanks Rob. Let me add one thing in case someone else finds this thread - 

Restarting the nodes did not in and of itself get the schema disagreement 
resolved. We had to run the ALTER TABLE command individually on each of the 
disagreeing nodes once they came back up. 

On Tuesday, November 26, 2013 at 11:24 AM, Robert Coli wrote:

 On Mon, Nov 25, 2013 at 6:42 PM, Josh Dzielak j...@keen.io wrote:
  Recently we had a strange thing happen. Altering schema (gc_grace_seconds) 
  for a column family resulted in a schema disagreement. 3/4 of nodes got it, 
  1/4 didn't. There was no partition at the time, nor was there multiple 
  schema updates issued. Going to the nodes with stale schema and trying to 
  do the ALTER TABLE there resulted in hanging. We were eventually able to 
  get schema agreement by restarting nodes, but both the initial disagreement 
  under normal conditions and the hanging ALTER TABLE seem pretty weird. Any 
  ideas here? Sound like a bug? 
 
 Yes, that sounds like a bug. This behavior is less common in 1.2.x than it 
 was previously, but still happens sometimes. It's interesting that restarting 
 the affected node helped, in previous versions of hung schema issue, it 
 would survive restart. 
  
  We're on 1.2.8.
  
 
 
 Unfortunately, unless you have a repro path, it is probably not worth 
 reporting a JIRA. 
 
 =Rob
  
 
 
 
 
 




Re: Schema disagreement under normal conditions, ALTER TABLE hangs

2013-11-26 Thread Robert Coli
On Mon, Nov 25, 2013 at 6:42 PM, Josh Dzielak j...@keen.io wrote:

 Recently we had a strange thing happen. Altering the schema (gc_grace_seconds)
 for a column family resulted in a schema disagreement. 3/4 of the nodes got it,
 1/4 didn't. There was no partition at the time, nor were there multiple
 schema updates issued. Going to the nodes with the stale schema and trying to
 do the ALTER TABLE there resulted in hanging. We were eventually able to
 get schema agreement by restarting nodes, but both the initial disagreement
 under normal conditions and the hanging ALTER TABLE seem pretty weird. Any
 ideas here? Sound like a bug?


Yes, that sounds like a bug. This behavior is less common in 1.2.x than it
was previously, but still happens sometimes. It's interesting that
restarting the affected node helped; in previous versions of the hung schema
issue, it would survive a restart.


 We're on 1.2.8.


Unfortunately, unless you have a repro path, it is probably not worth
reporting a JIRA.

=Rob


Schema disagreement under normal conditions, ALTER TABLE hangs

2013-11-25 Thread Josh Dzielak
Recently we had a strange thing happen. Altering the schema (gc_grace_seconds) for 
a column family resulted in a schema disagreement. 3/4 of the nodes got it, 1/4 
didn't. There was no partition at the time, nor were there multiple schema 
updates issued. Going to the nodes with the stale schema and trying to do the ALTER 
TABLE there resulted in hanging. We were eventually able to get schema 
agreement by restarting nodes, but both the initial disagreement under normal 
conditions and the hanging ALTER TABLE seem pretty weird. Any ideas here? Sound 
like a bug?

We're on 1.2.8.

Thanks,
Josh

--
Josh Dzielak • Keen IO • @dzello (https://twitter.com/dzello)



Re: Cannot resolve schema disagreement

2013-05-09 Thread Robert Coli
On Wed, May 8, 2013 at 5:40 PM, srmore comom...@gmail.com wrote:
 After running the commands, I get back to the same issue. Cannot afford to
 lose the data so I guess this is the only option for me. And unfortunately I
 am using 1.0.12 ( cannot upgrade as of now ). Any, ideas on what might be
 happening or any pointers will be greatly appreciated.

If you can afford downtime on the cluster, the solution to this
problem with the highest chance of success is :

1) dump the existing schema from a good node
2) nodetool drain on all nodes
3) stop cluster
4) move schema and migration CF tables out of the way on all nodes
5) start cluster
6) re-load schema, being careful to explicitly check for schema
agreement on all nodes between schema modifying statements

In many/most cases of schema disagreement, people try the FAQ approach
and it doesn't work and they end up being forced to do the above
anyway. In general if you can tolerate the downtime, you should save
yourself the effort and just do the above process.
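As a rough shell sketch of the above (tools and paths are version-dependent;
this assumes a pre-1.1 layout where the schema and migrations tables live
under data/system, and a backup location of your choosing):

echo 'show schema;' | cassandra-cli -h good-node -p 9160 > schema.txt  # step 1
nodetool -h each-node drain                                            # step 2, every node
# step 3: stop cassandra on every node
mv /var/lib/cassandra/data/system/Schema* \
   /var/lib/cassandra/data/system/Migrations* /path/to/backup/         # step 4, every node
# step 5: start cassandra on every node
cassandra-cli -h good-node -p 9160 -f schema.txt                       # step 6, re-load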

=Rob


Re: Cannot resolve schema disagreement

2013-05-09 Thread srmore
Thanks Rob !

I tried the steps; that did not work. However, I was able to resolve the
problem by syncing the clocks. The thing that confuses me is that the FAQ
says "Before 0.7.6, this can also be caused by cluster system clocks being
substantially out of sync with each other." The version I am using was
1.0.12.

This raises an important question: where does Cassandra get the time
information from? And is it required (I know it is highly, highly advisable)
to keep clocks in sync? Any suggestions/best practices on how to keep
the clocks in sync?



/srm


On Thu, May 9, 2013 at 1:58 PM, Robert Coli rc...@eventbrite.com wrote:

 On Wed, May 8, 2013 at 5:40 PM, srmore comom...@gmail.com wrote:
  After running the commands, I get back to the same issue. Cannot afford
 to
  lose the data so I guess this is the only option for me. And
 unfortunately I
  am using 1.0.12 ( cannot upgrade as of now ). Any, ideas on what might be
  happening or any pointers will be greatly appreciated.

 If you can afford downtime on the cluster, the solution to this
 problem with the highest chance of success is :

 1) dump the existing schema from a good node
 2) nodetool drain on all nodes
 3) stop cluster
 4) move schema and migration CF tables out of the way on all nodes
 5) start cluster
 6) re-load schema, being careful to explicitly check for schema
 agreement on all nodes between schema modifying statements

 In many/most cases of schema disagreement, people try the FAQ approach
 and it doesn't work and they end up being forced to do the above
 anyway. In general if you can tolerate the downtime, you should save
 yourself the effort and just do the above process.

 =Rob



Re: Cannot resolve schema disagreement

2013-05-09 Thread aaron morton
 This raises an important question, where does Cassandra get the time 
 information from ? 
http://docs.oracle.com/javase/6/docs/api/java/lang/System.html
normally milliseconds; not sure if 1.0.12 may use nanoTime(), which is less 
reliable on some VMs. 

 and is it required (I know it is highly highly advisable to) to keep clocks 
 in sync, any suggestions/best practices on how to keep the clocks in sync  ? 
http://en.wikipedia.org/wiki/Network_Time_Protocol
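In practice that usually means running ntpd (or a similar daemon) on every
node and sanity-checking the peer offsets now and then, e.g. with:

ntpq -p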

Hope that helps. 

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 10/05/2013, at 9:16 AM, srmore comom...@gmail.com wrote:

 Thanks Rob !
 
 Tried the steps, that did not work, however I was able to resolve the problem 
 by syncing the clocks. The thing that confuses me is that, the FAQ says 
 Before 0.7.6, this can also be caused by cluster system clocks being 
 substantially out of sync with each other. The version I am using was 1.0.12.
 
 This raises an important question, where does Cassandra get the time 
 information from ? and is it required (I know it is highly highly advisable 
 to) to keep clocks in sync, any suggestions/best practices on how to keep the 
 clocks in sync  ? 
 
 
 
 /srm
 
 
 On Thu, May 9, 2013 at 1:58 PM, Robert Coli rc...@eventbrite.com wrote:
 On Wed, May 8, 2013 at 5:40 PM, srmore comom...@gmail.com wrote:
  After running the commands, I get back to the same issue. Cannot afford to
  lose the data so I guess this is the only option for me. And unfortunately I
  am using 1.0.12 ( cannot upgrade as of now ). Any, ideas on what might be
  happening or any pointers will be greatly appreciated.
 
 If you can afford downtime on the cluster, the solution to this
 problem with the highest chance of success is :
 
 1) dump the existing schema from a good node
 2) nodetool drain on all nodes
 3) stop cluster
 4) move schema and migration CF tables out of the way on all nodes
 5) start cluster
 6) re-load schema, being careful to explicitly check for schema
 agreement on all nodes between schema modifying statements
 
 In many/most cases of schema disagreement, people try the FAQ approach
 and it doesn't work and they end up being forced to do the above
 anyway. In general if you can tolerate the downtime, you should save
 yourself the effort and just do the above process.
 
 =Rob
 



Re: Cannot resolve schema disagreement

2013-05-09 Thread srmore
Thought so.

Thanks Aaron !



On Thu, May 9, 2013 at 6:09 PM, aaron morton aa...@thelastpickle.com wrote:

 This raises an important question, where does Cassandra get the time
 information from ?

 http://docs.oracle.com/javase/6/docs/api/java/lang/System.html
 normally milliSeconds, not sure if 1.0.12 may use nanoTime() which is less
 reliable on some VM's.

 and is it required (I know it is highly highly advisable to) to keep
 clocks in sync, any suggestions/best practices on how to keep the clocks in
 sync  ?

 http://en.wikipedia.org/wiki/Network_Time_Protocol

 Hope that helps.

 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 10/05/2013, at 9:16 AM, srmore comom...@gmail.com wrote:

 Thanks Rob !

 Tried the steps, that did not work, however I was able to resolve the
 problem by syncing the clocks. The thing that confuses me is that, the FAQ
 says Before 0.7.6, this can also be caused by cluster system clocks being
 substantially out of sync with each other. The version I am using was
 1.0.12.

 This raises an important question, where does Cassandra get the time
 information from ? and is it required (I know it is highly highly advisable
 to) to keep clocks in sync, any suggestions/best practices on how to keep
 the clocks in sync  ?



 /srm


 On Thu, May 9, 2013 at 1:58 PM, Robert Coli rc...@eventbrite.com wrote:

 On Wed, May 8, 2013 at 5:40 PM, srmore comom...@gmail.com wrote:
  After running the commands, I get back to the same issue. Cannot afford
 to
  lose the data so I guess this is the only option for me. And
 unfortunately I
  am using 1.0.12 ( cannot upgrade as of now ). Any, ideas on what might
 be
  happening or any pointers will be greatly appreciated.

 If you can afford downtime on the cluster, the solution to this
 problem with the highest chance of success is :

 1) dump the existing schema from a good node
 2) nodetool drain on all nodes
 3) stop cluster
 4) move schema and migration CF tables out of the way on all nodes
 5) start cluster
 6) re-load schema, being careful to explicitly check for schema
 agreement on all nodes between schema modifying statements

 In many/most cases of schema disagreement, people try the FAQ approach
 and it doesn't work and they end up being forced to do the above
 anyway. In general if you can tolerate the downtime, you should save
 yourself the effort and just do the above process.

 =Rob






Cannot resolve schema disagreement

2013-05-08 Thread srmore
Hello,
I have a cluster of 4 nodes and two of them are on a different schema
version. I tried to run the commands described in the FAQ section, but no luck
(http://wiki.apache.org/cassandra/FAQ#schema_disagreement).

After running the commands, I get back to the same issue. I cannot afford to
lose the data, so I guess this is the only option for me. And unfortunately
I am using 1.0.12 (cannot upgrade as of now). Any ideas on what might be
happening, or any pointers, will be greatly appreciated.

/srm


Schema Disagreement after migration from 1.0.6 to 1.1.4

2012-09-05 Thread Martin Koch
Hi list

We have a 5-node Cassandra cluster with a single 1.0.9 installation and
four 1.0.6 installations.

We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the
instructions on http://www.datastax.com/docs/1.1/install/upgrading).

After bringing up 1.1.4 there are no errors in the log, but the cluster now
suffers from schema disagreement

[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] - The new 1.1.4 node

943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45,
10.10.145.90, 10.38.127.80] - nodes in the old cluster

The recipe for recovering from schema disagreement (
http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't cover the
new directory layout. The system/Schema directory is empty save for a
snapshots subdirectory. system/schema_columnfamilies and
system/schema_keyspaces contain some files. As described in datastax's
description, we tried running nodetool upgradesstables. When this had done,
describe schema in the cli showed a schema definition which seemed correct,
but was indeed different from the schema on the other nodes in the cluster.

Any clues on how we should proceed?

Thanks,
/Martin Koch


Re: Schema Disagreement after migration from 1.0.6 to 1.1.4

2012-09-05 Thread Edward Sargisson

I would try nodetool resetlocalschema.


On 12-09-05 07:08 AM, Martin Koch wrote:

Hi list

We have a 5-node Cassandra cluster with a single 1.0.9 installation 
and four 1.0.6 installations.


We have tried installing 1.1.4 on one of the 1.0.6 nodes (following 
the instructions on http://www.datastax.com/docs/1.1/install/upgrading).


After bringing up 1.1.4 there are no errors in the log, but the 
cluster now suffers from schema disagreement


[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] - The new 1.1.4 node

943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45, 
10.10.145.90, 10.38.127.80] - nodes in the old cluster


The recipe for recovering from schema disagreement 
(http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't 
cover the new directory layout. The system/Schema directory is empty 
save for a snapshots subdirectory. system/schema_columnfamilies and 
system/schema_keyspaces contain some files. As described in datastax's 
description, we tried running nodetool upgradesstables. When this had 
done, describe schema in the cli showed a schema definition which 
seemed correct, but was indeed different from the schema on the other 
nodes in the cluster.


Any clues on how we should proceed?

Thanks,
/Martin Koch


--

Edward Sargisson

senior java developer
Global Relay

edward.sargis...@globalrelay.net






Re: Schema Disagreement after migration from 1.0.6 to 1.1.4

2012-09-05 Thread Omid Aladini
Do you see exceptions like "java.lang.UnsupportedOperationException:
Not a time-based UUID" in the log files of nodes running 1.0.6 and 1.0.9?
Then it's probably due to [1], explained here [2] -- in this case you
either have to upgrade all nodes to 1.1.4 or, if you prefer keeping a
mixed-version cluster, the 1.0.6 and 1.0.9 nodes won't be able to join
the cluster again unless you temporarily upgrade them to 1.0.11.

Cheers,
Omid

[1] https://issues.apache.org/jira/browse/CASSANDRA-1391
[2] https://issues.apache.org/jira/browse/CASSANDRA-4195

On Wed, Sep 5, 2012 at 4:08 PM, Martin Koch m...@issuu.com wrote:

 Hi list

 We have a 5-node Cassandra cluster with a single 1.0.9 installation and four 
 1.0.6 installations.

 We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the 
 instructions on http://www.datastax.com/docs/1.1/install/upgrading).

 After bringing up 1.1.4 there are no errors in the log, but the cluster now 
 suffers from schema disagreement

 [default@unknown] describe cluster;
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
 59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] - The new 1.1.4 node

 943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45, 
 10.10.145.90, 10.38.127.80] - nodes in the old cluster

 The recipe for recovering from schema disagreement 
 (http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't cover the 
 new directory layout. The system/Schema directory is empty save for a 
 snapshots subdirectory. system/schema_columnfamilies and 
 system/schema_keyspaces contain some files. As described in datastax's 
 description, we tried running nodetool upgradesstables. When this had done, 
 describe schema in the cli showed a schema definition which seemed correct, 
 but was indeed different from the schema on the other nodes in the cluster.

 Any clues on how we should proceed?

 Thanks,
 /Martin Koch


Re: Schema Disagreement after migration from 1.0.6 to 1.1.4

2012-09-05 Thread Martin Koch
Thanks, this is exactly it. We'd like to do a rolling upgrade - this is a
production cluster - so I guess we'll upgrade 1.0.6 -> 1.0.11 -> 1.1.4,
then.

/Martin

On Thu, Sep 6, 2012 at 2:35 AM, Omid Aladini omidalad...@gmail.com wrote:

 Do you see exceptions like java.lang.UnsupportedOperationException:
 Not a time-based UUID in log files of nodes running 1.0.6 and 1.0.9?
 Then it's probably due to [1] explained here [2] -- In this case you
 either have to upgrade all nodes to 1.1.4 or if you prefer keeping a
 mixed-version cluster, the 1.0.6 and 1.0.9 nodes won't be able to join
 the cluster again, unless you temporarily upgrade them to 1.0.11.

 Cheers,
 Omid

 [1] https://issues.apache.org/jira/browse/CASSANDRA-1391
 [2] https://issues.apache.org/jira/browse/CASSANDRA-4195

 On Wed, Sep 5, 2012 at 4:08 PM, Martin Koch m...@issuu.com wrote:
 
  Hi list
 
  We have a 5-node Cassandra cluster with a single 1.0.9 installation and
 four 1.0.6 installations.
 
  We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the
 instructions on http://www.datastax.com/docs/1.1/install/upgrading).
 
  After bringing up 1.1.4 there are no errors in the log, but the cluster
 now suffers from schema disagreement
 
  [default@unknown] describe cluster;
  Cluster Information:
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions:
  59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] - The new 1.1.4 node
 
  943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45,
 10.10.145.90, 10.38.127.80] - nodes in the old cluster
 
  The recipe for recovering from schema disagreement (
 http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't cover
 the new directory layout. The system/Schema directory is empty save for a
 snapshots subdirectory. system/schema_columnfamilies and
 system/schema_keyspaces contain some files. As described in datastax's
 description, we tried running nodetool upgradesstables. When this had done,
 describe schema in the cli showed a schema definition which seemed correct,
 but was indeed different from the schema on the other nodes in the cluster.
 
  Any clues on how we should proceed?
 
  Thanks,
  /Martin Koch



How schema disagreement can be fixed faster on 1.0.10 cluster ?

2012-07-26 Thread Mateusz Korniak
Hi !
We got into a schema disagreement situation on 1.0.10, with 250GB of compressed 
data per node.

Following
http://wiki.apache.org/cassandra/FAQ#schema_disagreement
after a node restart it looks like it is replaying all schema changes one by one, 
right? 
As we did a lot of them during the cluster's lifetime, the node is now busy creating 
secondary indexes that were dropped long ago, which looks like it is going to take hours.
Can it be done faster?

1. Move all data SSTables out of the data/*/ directories.
2. Follow FAQ#schema_disagreement (it should be faster on a node with no data) until 
we reach schema agreement.
3. Then stop cassandra.
4. Copy the files back.
5. Start cassandra.


Will it work?

An extra option is to disable thrift during the above process (can it be done in 
config? In cassandra.yaml, rpc_port: 0?)
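At runtime it can also be toggled per node without touching the config, e.g.:

nodetool disablethrift
nodetool enablethrift    # when finished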



Thanks in advance for any hints, regards,

-- 
Mateusz Korniak


Re: How schema disagreement can be fixed faster on 1.0.10 cluster ?

2012-07-26 Thread Tyler Hobbs
I know you specified 1.0.10, but C* 1.1 solves this problem:
http://www.datastax.com/dev/blog/the-schema-management-renaissance

On Thu, Jul 26, 2012 at 7:29 AM, Mateusz Korniak 
mateusz-li...@ant.gliwice.pl wrote:

 Hi !
 We got into schema disagreement situation on 1.0.10 having 250GB of
 compressed
 data per node.

 Following
 http://wiki.apache.org/cassandra/FAQ#schema_disagreement
 after node restart looks like it is replaying all schema changes one be
 one ,
 right ?
 As we did a lot of them during cluster lifetime, now node is busy creating
 long time ago dropped secondary indexes which looks like gonna take hours.
 Can it be done faster ?

 1. Can we move all data SStables out of data/*/ directories,
 2. follow FAQ#schema_disagreement (it should be faster on no data node)
 until
 we reach schema agreement.
 3. Than stop cassandra,
 4. Copy files back.
 5. Start cassandra.


 Will it work ?

 Extra option is to disable thrift during above process (can it be done in
 config ? In cassandra.yaml rpc_port: 0 ? )



 Thanks in advance for any hints, regards,

 --
 Mateusz Korniak




-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: Couldn't detect any schema definitions in local storage - after handling schema disagreement according to FAQ

2012-05-21 Thread aaron morton
 1) What did I wrong? - why cassandra was throwing exceptions on first startup?
In 1.0.X the history of schema changes was replayed to the node when it 
rejoined the cluster. If the node is receiving traffic while this is going on 
it will log those errors until the schema mutation that created 1012 is 
replayed. 

 2) Why the keyspace data was invalidated ? Is it expected?
The data will have remained on the disk. The load is calculated based on the 
CF's in the schema; this can mean that the load will not return to full until 
the schema is fully replayed. 

Did you lose data ?

 3) If the answer to #2 is "yes, it's expected", then what's the point in doing 
 http://wiki.apache.org/cassandra/FAQ#schema_disagreement
 if all keyspace data is lost anyway? It makes more sense to just do 
 http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node

The answer is no. 

Checking, did you delete just the Schema-* and Migration-* files or all of the 
files in data/system?

Also, in the first log there are a lot of commit log mutations being skipped 
because the schema is not there. Drain should have removed these, but it can 
take a little time (I think).  

 4) afaiu i could also stop cassandra again move old sstables from snapshot 
 back to keyspace data dir and run repair for all keyspace CFs? So that it 
 finishes faster
 and makes less load than running a repair which has no previous keyspace data 
 at all?

The approach you followed was the correct one. 

I've updated the wiki to say the errors are expected. 

Cheers
 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 19/05/2012, at 6:34 AM, Piavlo wrote:

 Hi,
 
 I had a schema disagreement problem in cassandra 1.0.9 cluster, where one 
 node had different schema version.
 So I followed the faq at 
 http://wiki.apache.org/cassandra/FAQ#schema_disagreement
 disabled gossip, disabled thrift, drained  and finally stopped the cassandra 
 process, on startup
 noticed
 INFO [main] 2012-05-18 16:23:11,879 DatabaseDescriptor.java (line 467) 
 Couldn't detect any schema definitions in local storage.
 in the log, and after
 INFO [main] 2012-05-18 16:23:15,463 StorageService.java (line 619) 
 Bootstrap/Replace/Move completed! Now serving reads.
 it started throwing Fatal exceptions for all read/write operations endlessly.
 
 I had to stop cassandra process again(no draining was done)
 
 On second start it did came up ok immediately loading the correct cluster 
 schema version
 INFO [main] 2012-05-18 16:54:44,303 DatabaseDescriptor.java (line 499) 
 Loading schema version 9db34ef0-a0be-11e1--f9687e034cf7
 
 But now this node appears to have started with no data from keyspace which 
 had schema disagreement.
 The original keyspace sstables now appear under snapshots dir.
 
 # nodetool -h localhost ring
 Address         DC       Rack  Status  State   Load       Owns    Token
                                                                   141784319550391026443072753096570088106
 10.49.127.4     eu-west  1a    Up      Normal  8.19 GB    16.67%  0
 10.241.29.65    eu-west  1b    Up      Normal  8.18 GB    16.67%  28356863910078205288614550619314017621
 10.59.46.236    eu-west  1c    Up      Normal  8.22 GB    16.67%  56713727820156410577229101238628035242
 10.50.33.232    eu-west  1a    Up      Normal  8.2 GB     16.67%  85070591730234615865843651857942052864
 10.234.71.33    eu-west  1b    Up      Normal  8.15 GB    16.67%  113427455640312821154458202477256070485
 10.58.249.118   eu-west  1c    Up      Normal  660.98 MB  16.67%  141784319550391026443072753096570088106
 #
 
 The node is the one with 660.98 MB data( which is opscenter keyspace data 
 which was not invalidated)
 
 So i have some questions:
 
 1) What did I wrong? - why cassandra was throwing exceptions on first startup?
 2) Why the keyspace data was invalidated ? Is it expected?
 3) If the answer to #2 is "yes, it's expected", then what's the point in doing 
 http://wiki.apache.org/cassandra/FAQ#schema_disagreement
 if all keyspace data is lost anyway? It makes more sense to just do 
 http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
 4) afaiu i could also stop cassandra again move old sstables from snapshot 
 back to keyspace data dir and run repair for all keyspace CFs? So that it 
 finishes faster
 and makes less load than running a repair which has no previous keyspace data 
 at all?
 
 The first startup log is below:
 
 INFO [main] 2012-05-18 16:23:07,367 AbstractCassandraDaemon.java (line 105) 
 Logging initialized
 INFO [main] 2012-05-18 16:23:07,382 AbstractCassandraDaemon.java (line 126) 
 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24
 INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 127) 
 Heap size: 2600468480/2600468480
 INFO [main] 2012-05-18 16:23:07,383

Couldn't detect any schema definitions in local storage - after handling schema disagreement according to FAQ

2012-05-18 Thread Piavlo

 Hi,

I had a schema disagreement problem in a cassandra 1.0.9 cluster, where 
one node had a different schema version.
So I followed the faq at 
http://wiki.apache.org/cassandra/FAQ#schema_disagreement
(disabled gossip, disabled thrift, drained, and finally stopped the 
cassandra process). On startup I noticed
INFO [main] 2012-05-18 16:23:11,879 DatabaseDescriptor.java (line 467) 
Couldn't detect any schema definitions in local storage.

in the log, and after
INFO [main] 2012-05-18 16:23:15,463 StorageService.java (line 619) 
Bootstrap/Replace/Move completed! Now serving reads.
it started throwing Fatal exceptions for all read/write operations 
endlessly.


I had to stop the cassandra process again (no draining was done).

On the second start it came up OK, immediately loading the correct 
cluster schema version:
INFO [main] 2012-05-18 16:54:44,303 DatabaseDescriptor.java (line 499) 
Loading schema version 9db34ef0-a0be-11e1--f9687e034cf7


But now this node appears to have started with no data from the keyspace 
which had the schema disagreement.

The original keyspace sstables now appear under the snapshots dir.

# nodetool -h localhost ring
Address         DC       Rack  Status  State   Load       Owns    Token
                                                                  141784319550391026443072753096570088106
10.49.127.4     eu-west  1a    Up      Normal  8.19 GB    16.67%  0
10.241.29.65    eu-west  1b    Up      Normal  8.18 GB    16.67%  28356863910078205288614550619314017621
10.59.46.236    eu-west  1c    Up      Normal  8.22 GB    16.67%  56713727820156410577229101238628035242
10.50.33.232    eu-west  1a    Up      Normal  8.2 GB     16.67%  85070591730234615865843651857942052864
10.234.71.33    eu-west  1b    Up      Normal  8.15 GB    16.67%  113427455640312821154458202477256070485
10.58.249.118   eu-west  1c    Up      Normal  660.98 MB  16.67%  141784319550391026443072753096570088106
#

The node is the one with 660.98 MB of data (which is the opscenter keyspace 
data, which was not invalidated).


So I have some questions:

1) What did I do wrong? Why was cassandra throwing exceptions on the first 
startup?

2) Why was the keyspace data invalidated? Is it expected?
3) If the answer to #2 is "yes, it's expected", then what's the point in 
doing http://wiki.apache.org/cassandra/FAQ#schema_disagreement
if all keyspace data is lost anyway? It makes more sense to just do 
http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
4) AFAIU I could also stop cassandra again, move the old sstables from 
the snapshot back to the keyspace data dir, and run repair for all keyspace 
CFs, so that it finishes faster
and creates less load than running a repair which has no previous keyspace 
data at all?


The first startup log is below:

 INFO [main] 2012-05-18 16:23:07,367 AbstractCassandraDaemon.java (line 
105) Logging initialized
 INFO [main] 2012-05-18 16:23:07,382 AbstractCassandraDaemon.java (line 
126) JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24
 INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 
127) Heap size: 2600468480/2600468480
 INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 
128) Classpath: 
/etc/cassandra/conf:/usr/share/java/jna.jar:/usr/share/java/mx4j-tools.jar:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/apache-cassandra-1.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-clientutil-1.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-thrift-1.0.9.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra//lib/jamm-0.2.5.jar
 INFO [main] 2012-05-18 16:23:10,661 CLibrary.java (line 109) JNA 
mlockall successful
 INFO [main] 2012-05-18 16:23:10,692 DatabaseDescriptor.java (line 114) 
Loading settings from file:/etc/cassandra/ssa/cassandra.yaml
 INFO [main] 2012-05-18 16:23:10,868 DatabaseDescriptor.java (line 168) 
DiskAccessMode 'auto' determined

Re: Schema disagreement in 1.0.2

2011-12-15 Thread blafrisch
So I was able to get the schema agreeing on the two bad nodes, but I don't
particularly like the way that I did it.  One at a time, I shut them down,
removed Schema* and Migration*, then copied over Schema* from another
working node.  They then started up with the correct schema.  Did I do
something totally incorrect in doing that?

Also, some of my nodes are reporting that others are unreachable via the CLI
when executing "describe cluster;". Not all of the nodes do this; about
7/10 are perfectly fine. I tried restarting each of the nodes that say
others are unreachable; when they came back up, their unreachable list
had changed. Nodetool gossipinfo describes everything perfectly fine, as does
nodetool ring.

The topology of the cluster is 2 datacenters, 5 servers each, with an RF of
3. Only one datacenter seems to have this issue.



Schema disagreement issue in 1.0.0

2011-10-20 Thread Tamil selvan R.S
I'm facing the following issue with a Cassandra 1.0 setup. The same works on
0.8.7

# cassandra-cli  -h x.x.x.x -f RTSCFs.sch
Connected to: Real Time Stats on x.x.x.x/9160
Authenticated to keyspace: Stats
39c3e120-fa24-11e0--61d449114eff
Waiting for schema agreement...
The schema has not settled in 10 seconds; further migrations are
ill-advised until it does.
Versions are 39c3e120-fa24-11e0--61d449114eff:[x.x.x.x],
317eb8f0-fa24-11e0--61d449114eff:[x.x.x.y]

I tried this http://wiki.apache.org/cassandra/FAQ#schema_disagreement

But now when I restart the cluster I'm getting:

`org.apache.cassandra.config.ConfigurationException: Invalid
definition for comparator`
org.apache.cassandra.db.marshal.CompositeType

This is my keyspace defn

create keyspace Stats with placement_strategy =
'org.apache.cassandra.locator.SimpleStrategy' and
strategy_options={replication_factor:1};

This is my CF defn

create column family Sample_Stats with
default_validation_class=CounterColumnType
and key_validation_class='CompositeType(UTF8Type,UTF8Type)'
and comparator='CompositeType(UTF8Type, UTF8Type)'
and replicate_on_write=true;

What am I missing?


Re: Schema disagreement issue in 1.0.0

2011-10-20 Thread aaron morton
Looks like a bug, patch is here 
https://issues.apache.org/jira/browse/CASSANDRA-3391

Until it is fixed, avoid using CompositeType in key_validation_class, and blow 
away the Schema and Migrations SSTables. 
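For illustration, a sketch of that workaround applied to the definition above 
(plain key validator, composite comparator kept; untested, adapt as needed):

  create column family Sample_Stats
      with default_validation_class = CounterColumnType
      and key_validation_class = 'UTF8Type'
      and comparator = 'CompositeType(UTF8Type, UTF8Type)'
      and replicate_on_write = true;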

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 20/10/2011, at 7:59 PM, Tamil selvan R.S wrote:

 I'm facing the following issue with a Cassandra 1.0 setup. The same works on 
 0.8.7
 
 # cassandra-cli  -h x.x.x.x -f RTSCFs.sch 
 Connected to: Real Time Stats on x.x.x.x/9160
 Authenticated to keyspace: Stats
 39c3e120-fa24-11e0--61d449114eff
 Waiting for schema agreement...
 The schema has not settled in 10 seconds; further migrations are ill-advised 
 until it does.
 Versions are 39c3e120-fa24-11e0--61d449114eff:[x.x.x.x], 
 317eb8f0-fa24-11e0--61d449114eff:[x.x.x.y]
 I tried this http://wiki.apache.org/cassandra/FAQ#schema_disagreement
 
 But Now when I restart the cluster I'm getting
 
 `org.apache.cassandra.config.ConfigurationException: Invalid definition for 
 comparator` org.apache.cassandra.db.marshal.CompositeType
 This is my keyspace defn
 
 create keyspace Stats with placement_strategy = 
 'org.apache.cassandra.locator.SimpleStrategy' and 
 strategy_options={replication_factor:1};
 This is my CF defn
 
 create column family Sample_Stats with 
 default_validation_class=CounterColumnType
 and key_validation_class='CompositeType(UTF8Type,UTF8Type)'
 and comparator='CompositeType(UTF8Type, UTF8Type)'
 and replicate_on_write=true;
 What am I missing?
 



Re: How to solve this kind of schema disagreement...

2011-08-10 Thread aaron morton
I don't have time to look into the reasons for that error, but that does not 
sound good. It kind of sounds like there are multiple migration chains out 
there in the cluster. This could come from applying changes to different 
nodes at the same time. 

Is this a prod system? If not, I would shut it down, wipe all the Schema and 
Migration SSTables and then apply the schema again one CF at a time (it will 
take time to read the data). 
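(As a rough sketch of that non-prod recovery -- data path relative to the 
install dir; keyspace and CF names are placeholders:

  # on every node, with cassandra stopped
  rm data/system/Schema-*  data/system/Migrations-*
  # restart all nodes, then from one node re-apply the schema one CF at a
  # time, checking describe cluster; for agreement between statements
  bin/cassandra-cli -h <node>
  create keyspace MyKeyspace with placement_strategy =
      'org.apache.cassandra.locator.SimpleStrategy'
      and strategy_options={replication_factor:3};
  use MyKeyspace;
  create column family CF1;
)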

If it's a prod system it may need some delicate surgery on the Migrations and 
Schema CF's. 

Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 10 Aug 2011, at 15:41, Dikang Gu wrote:

 And a lot of 'Migration not applied' logs.
 
 DEBUG [MigrationStage:1] 2011-08-10 11:36:29,376 
 DefinitionsUpdateVerbHandler.java (line 70) Applying AddColumnFamily from 
 /192.168.1.9
 DEBUG [MigrationStage:1] 2011-08-10 11:36:29,376 
 DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous 
 version mismatch. cannot apply.
 DEBUG [MigrationStage:1] 2011-08-10 11:36:29,379 
 DefinitionsUpdateVerbHandler.java (line 70) Applying AddColumnFamily from 
 /192.168.1.9
 DEBUG [MigrationStage:1] 2011-08-10 11:36:29,379 
 DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous 
 version mismatch. cannot apply.
 DEBUG [MigrationStage:1] 2011-08-10 11:36:29,382 
 DefinitionsUpdateVerbHandler.java (line 70) Applying AddColumnFamily from 
 /192.168.1.9
 DEBUG [MigrationStage:1] 2011-08-10 11:36:29,382 
 DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous 
 version mismatch. cannot apply.
 
 -- 
 Dikang Gu
 0086 - 18611140205
 On Wednesday, August 10, 2011 at 11:35 AM, Dikang Gu wrote:
 
 Hi Aaron,
 
 I set the log level to be DEBUG, and find a lot of forceFlush debug info in 
 the log:
 
 DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 
 725) forceFlush requested but everything is clean
 DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 
 725) forceFlush requested but everything is clean
 DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 
 725) forceFlush requested but everything is clean
 DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 
 725) forceFlush requested but everything is clean
 
 What does this mean?
 
 Thanks.
  
 
 -- 
 Dikang Gu
 0086 - 18611140205
 On Wednesday, August 10, 2011 at 6:42 AM, aaron morton wrote:
 
 um. There has got to be something stopping the migration from completing. 
 
 Turn the logging up to DEBUG before starting and look for messages from 
 MigrationManager.java
 
 Provide all the log messages from Migration.java on the 1.27 node
 
 Cheers
 
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 8 Aug 2011, at 15:52, Dikang Gu wrote:
 
 Hi Aaron, 
 
 I repeat the whole procedure:
 
 1. kill the cassandra instance on 1.27.
 2. rm the data/system/Migrations-g-*
 3. rm the data/system/Schema-g-*
 4. bin/cassandra to start the cassandra.
 
 Now, the migration seems to stop and I do not find any errors in the 
 system.log yet.
 
 The ring looks good:
 [root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 -p8090 ring
 Address       DC          Rack   Status  State    Load     Owns    Token
                                                                    127605887595351923798765477786913079296
 192.168.1.28  datacenter1 rack1  Up      Normal   8.38 GB  25.00%  1
 192.168.1.25  datacenter1 rack1  Up      Normal   8.54 GB  34.01%  57856537434773737201679995572503935972
 192.168.1.27  datacenter1 rack1  Up      Normal   1.78 GB  24.28%  99165710459060760249270263771474737125
 192.168.1.9   datacenter1 rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296
 
 But the schema is still not correct:
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
 192.168.1.25]
5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
 
 The 5a54ebd0-bd90-11e0--9510c23fceff is the same as last time…
 
 And in the log, the last Migration.java log is:
  INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) 
 Applying migration 5a54ebd0-bd90-11e0--9510c23fceff Add keyspace: 
 SimpleDB_4E38DAA64894A9146105rep 
 strategy:SimpleStrategy{}durable_writes: true
 
 Could you explain this? 
 
 If I change the token given to 1.27 to another one, will it help?
 
 Thanks.
 
 -- 
 Dikang Gu
 0086 - 18611140205
 On Sunday, August 7, 2011 at 4:14 PM, aaron morton wrote:
 
 did you check the logs in 1.27 for errors?

Re: How to solve this kind of schema disagreement...

2011-08-09 Thread aaron morton
um. There has got to be something stopping the migration from completing. 

Turn the logging up to DEBUG before starting and look for messages from 
MigrationManager.java

Provide all the log messages from Migration.java on the 1.27 node

Cheers


-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 8 Aug 2011, at 15:52, Dikang Gu wrote:

 Hi Aaron, 
 
 I repeat the whole procedure:
 
 1. kill the cassandra instance on 1.27.
 2. rm the data/system/Migrations-g-*
 3. rm the data/system/Schema-g-*
 4. bin/cassandra to start the cassandra.
 
 Now, the migration seems to stop and I do not find any errors in the system.log 
 yet.
 
 The ring looks good:
 [root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 -p8090 ring
 Address       DC          Rack   Status  State    Load     Owns    Token
                                                                    127605887595351923798765477786913079296
 192.168.1.28  datacenter1 rack1  Up      Normal   8.38 GB  25.00%  1
 192.168.1.25  datacenter1 rack1  Up      Normal   8.54 GB  34.01%  57856537434773737201679995572503935972
 192.168.1.27  datacenter1 rack1  Up      Normal   1.78 GB  24.28%  99165710459060760249270263771474737125
 192.168.1.9   datacenter1 rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296
 
 But the schema is still not correct:
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
   75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
 192.168.1.25]
   5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
 
 The 5a54ebd0-bd90-11e0--9510c23fceff is the same as last time…
 
 And in the log, the last Migration.java log is:
  INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) 
 Applying migration 5a54ebd0-bd90-11e0--9510c23fceff Add keyspace: 
 SimpleDB_4E38DAA64894A9146105rep 
 strategy:SimpleStrategy{}durable_writes: true
 
 Could you explain this? 
 
 If I change the token given to 1.27 to another one, will it help?
 
 Thanks.
 
 -- 
 Dikang Gu
 0086 - 18611140205
 On Sunday, August 7, 2011 at 4:14 PM, aaron morton wrote:
 
 did you check the logs in 1.27 for errors ? 
 
 Could you be seeing this ? 
 https://issues.apache.org/jira/browse/CASSANDRA-2867
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 7 Aug 2011, at 16:24, Dikang Gu wrote:
 
  I restarted both nodes, deleted the schema* and migration*, and restarted 
  them.
 
 The current cluster looks like this:
 [default@unknown] describe cluster; 
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
 75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
 192.168.1.25]
 5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
 
 the 1.28 looks good, and the 1.27 still cannot get schema agreement...
 
 I have tried several times, even deleted all the data on 1.27 and rejoined it 
 as a new node, but it is still unhappy.
 
 And the ring looks like this: 
 
 Address       DC          Rack   Status  State    Load     Owns    Token
                                                                    127605887595351923798765477786913079296
 192.168.1.28  datacenter1 rack1  Up      Normal   8.38 GB  25.00%  1
 192.168.1.25  datacenter1 rack1  Up      Normal   8.55 GB  34.01%  57856537434773737201679995572503935972
 192.168.1.27  datacenter1 rack1  Up      Joining  1.81 GB  24.28%  99165710459060760249270263771474737125
 192.168.1.9   datacenter1 rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296
 
 The 1.27 node seems unable to join the cluster; it just hangs there...
 
 Any suggestions?
 
 Thanks.
 
 
 On Sun, Aug 7, 2011 at 10:01 AM, aaron morton aa...@thelastpickle.com 
 wrote:
 After the restart, what was in the logs for the 1.27 machine from 
 the Migration.java logger? Some of the messages will start with Applying 
 migration
 
 You should have shut down both of the nodes, then deleted the schema* and 
 migration* system sstables, then restarted one of them and watched to see 
 if it got to schema agreement. 
 
 Cheers
   
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 6 Aug 2011, at 22:56, Dikang Gu wrote:
 
 I have tried this, but the schema still does not agree in the cluster.

Re: How to solve this kind of schema disagreement...

2011-08-09 Thread Dikang Gu
Hi Aaron,

I set the log level to be DEBUG, and find a lot of forceFlush debug info in the 
log:

DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) 
forceFlush requested but everything is clean
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) 
forceFlush requested but everything is clean
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) 
forceFlush requested but everything is clean
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) 
forceFlush requested but everything is clean

What does this mean?

Thanks.


-- 
Dikang Gu
0086 - 18611140205
On Wednesday, August 10, 2011 at 6:42 AM, aaron morton wrote: 
 um. There has got to be something stopping the migration from completing. 
 
 Turn the logging up to DEBUG before starting and look for messages from 
 MigrationManager.java
 
 Provide all the log messages from Migration.java on the 1.27 node
 
 Cheers
 
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 
 
 
 
 On 8 Aug 2011, at 15:52, Dikang Gu wrote:
  Hi Aaron, 
  
  I repeat the whole procedure:
  
  1. kill the cassandra instance on 1.27.
  2. rm the data/system/Migrations-g-*
  3. rm the data/system/Schema-g-*
  4. bin/cassandra to start the cassandra.
  
  Now, the migration seems to stop and I do not find any errors in the system.log 
  yet.
  
  The ring looks good:
  [root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 -p8090 ring
  Address       DC          Rack   Status  State    Load     Owns    Token
                                                                     127605887595351923798765477786913079296
  192.168.1.28  datacenter1 rack1  Up      Normal   8.38 GB  25.00%  1
  192.168.1.25  datacenter1 rack1  Up      Normal   8.54 GB  34.01%  57856537434773737201679995572503935972
  192.168.1.27  datacenter1 rack1  Up      Normal   1.78 GB  24.28%  99165710459060760249270263771474737125
  192.168.1.9   datacenter1 rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296
  
  
  But the schema is still not correct:
  Cluster Information:
  Snitch: org.apache.cassandra.locator.SimpleSnitch
  Partitioner: org.apache.cassandra.dht.RandomPartitioner
  Schema versions: 
  75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
  192.168.1.25]
  5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
  
  
  The 5a54ebd0-bd90-11e0--9510c23fceff is the same as last time…
  
  And in the log, the last Migration.java log is:
  INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) 
  Applying migration 5a54ebd0-bd90-11e0--9510c23fceff Add keyspace: 
  SimpleDB_4E38DAA64894A9146105rep 
  strategy:SimpleStrategy{}durable_writes: true
  
  Could you explain this? 
  
  If I change the token given to 1.27 to another one, will it help?
  
  Thanks.
  -- 
  Dikang Gu
  0086 - 18611140205
  On Sunday, August 7, 2011 at 4:14 PM, aaron morton wrote:
   did you check the logs in 1.27 for errors ? 
   
   Could you be seeing this ? 
   https://issues.apache.org/jira/browse/CASSANDRA-2867
   
   Cheers
   
   -
   Aaron Morton
   Freelance Cassandra Developer
   @aaronmorton
   http://www.thelastpickle.com
   
   
   
   
   
   On 7 Aug 2011, at 16:24, Dikang Gu wrote:
 I restarted both nodes, deleted the schema* and migration*, and 
 restarted them.

The current cluster looks like this:
[default@unknown] describe cluster; 
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
192.168.1.25]
5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]


 the 1.28 looks good, and the 1.27 still cannot get schema 
 agreement...

 I have tried several times, even deleted all the data on 1.27 and 
 rejoined it as a new node, but it is still unhappy. 

And the ring looks like this: 

 Address       DC          Rack   Status  State    Load     Owns    Token
                                                                    127605887595351923798765477786913079296
 192.168.1.28  datacenter1 rack1  Up      Normal   8.38 GB  25.00%  1
 192.168.1.25  datacenter1 rack1  Up      Normal   8.55 GB  34.01%  57856537434773737201679995572503935972
 192.168.1.27  datacenter1 rack1  Up      Joining  1.81 GB  24.28%  99165710459060760249270263771474737125
 192.168.1.9   datacenter1 rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296


 The 1.27 node seems unable to join the cluster; it just hangs there... 

Any suggestions?

Thanks.


On Sun, Aug 7, 2011 at 10:01 AM, aaron morton aa...@thelastpickle.com 
wrote:
 After the restart, what was in the logs for the 1.27 machine 
 from the Migration.java logger? Some of the messages will start with 
 Applying migration 
 
 You should have shut down both of the nodes, then deleted the schema* and 
 migration* system sstables, then restarted one of them and watched to see 
 if it got to schema agreement.

Re: How to solve this kind of schema disagreement...

2011-08-09 Thread Dikang Gu
And a lot of 'Migration not applied' logs.

DEBUG [MigrationStage:1] 2011-08-10 11:36:29,376 
DefinitionsUpdateVerbHandler.java (line 70) Applying AddColumnFamily from 
/192.168.1.9
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,376 
DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous 
version mismatch. cannot apply.
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,379 
DefinitionsUpdateVerbHandler.java (line 70) Applying AddColumnFamily from 
/192.168.1.9
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,379 
DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous 
version mismatch. cannot apply.
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,382 
DefinitionsUpdateVerbHandler.java (line 70) Applying AddColumnFamily from 
/192.168.1.9
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,382 
DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous 
version mismatch. cannot apply.


-- 
Dikang Gu
0086 - 18611140205
On Wednesday, August 10, 2011 at 11:35 AM, Dikang Gu wrote: 
 Hi Aaron,
 
 I set the log level to be DEBUG, and find a lot of forceFlush debug info in 
 the log:
 
 DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 
 725) forceFlush requested but everything is clean
 DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 
 725) forceFlush requested but everything is clean
 DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 
 725) forceFlush requested but everything is clean
 DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 
 725) forceFlush requested but everything is clean
 
 What does this mean?
 
 Thanks.
 
 
 -- 
 Dikang Gu
 0086 - 18611140205
 On Wednesday, August 10, 2011 at 6:42 AM, aaron morton wrote:
  um. There has got to be something stopping the migration from completing. 
  
  Turn the logging up to DEBUG before starting and look for messages from 
  MigrationManager.java
  
  Provide all the log messages from Migration.java on the 1.27 node
  
  Cheers
  
  
  -
  Aaron Morton
  Freelance Cassandra Developer
  @aaronmorton
  http://www.thelastpickle.com
  
  
  
  
  
  On 8 Aug 2011, at 15:52, Dikang Gu wrote:
   Hi Aaron, 
   
   I repeat the whole procedure:
   
   1. kill the cassandra instance on 1.27.
   2. rm the data/system/Migrations-g-*
   3. rm the data/system/Schema-g-*
   4. bin/cassandra to start the cassandra.
   
   Now, the migration seems to stop and I do not find any errors in the 
   system.log yet.
   
   The ring looks good:
   [root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 -p8090 ring
   Address       DC          Rack   Status  State    Load     Owns    Token
                                                                      127605887595351923798765477786913079296
   192.168.1.28  datacenter1 rack1  Up      Normal   8.38 GB  25.00%  1
   192.168.1.25  datacenter1 rack1  Up      Normal   8.54 GB  34.01%  57856537434773737201679995572503935972
   192.168.1.27  datacenter1 rack1  Up      Normal   1.78 GB  24.28%  99165710459060760249270263771474737125
   192.168.1.9   datacenter1 rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296
   
   
   But the schema is still not correct:
   Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions: 
   75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
   192.168.1.25]
   5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
   
   
   The 5a54ebd0-bd90-11e0--9510c23fceff is the same as last time…
   
   And in the log, the last Migration.java log is:
   INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) 
   Applying migration 5a54ebd0-bd90-11e0--9510c23fceff Add keyspace: 
   SimpleDB_4E38DAA64894A9146105rep 
   strategy:SimpleStrategy{}durable_writes: true
   
   Could you explain this? 
   
   If I change the token given to 1.27 to another one, will it help?
   
   Thanks.
   -- 
   Dikang Gu
   0086 - 18611140205
   On Sunday, August 7, 2011 at 4:14 PM, aaron morton wrote:
did you check the logs in 1.27 for errors ? 

Could you be seeing this ? 
https://issues.apache.org/jira/browse/CASSANDRA-2867

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com





On 7 Aug 2011, at 16:24, Dikang Gu wrote:
 I restarted both nodes, deleted the schema* and migration*, and 
 restarted them.
 
 The current cluster looks like this:
 [default@unknown] describe cluster; 
 Cluster Information:
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions: 
 75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
 192.168.1.25]
 5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
 
 
 the 1.28 looks good, and the 1.27 still cannot get schema 
 agreement...

Re: How to solve this kind of schema disagreement...

2011-08-07 Thread aaron morton
did you check the logs in 1.27 for errors ? 

Could you be seeing this ? https://issues.apache.org/jira/browse/CASSANDRA-2867

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 7 Aug 2011, at 16:24, Dikang Gu wrote:

 I restarted both nodes, deleted the schema* and migration*, and restarted 
 them.
 
 The current cluster looks like this:
 [default@unknown] describe cluster; 
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
   75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
 192.168.1.25]
   5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
 
 the 1.28 looks good, and the 1.27 still cannot get schema agreement...
 
 I have tried several times, even deleted all the data on 1.27 and rejoined it 
 as a new node, but it is still unhappy.
 
 And the ring looks like this: 
 
 Address       DC          Rack   Status  State    Load     Owns    Token
                                                                    127605887595351923798765477786913079296
 192.168.1.28  datacenter1 rack1  Up      Normal   8.38 GB  25.00%  1
 192.168.1.25  datacenter1 rack1  Up      Normal   8.55 GB  34.01%  57856537434773737201679995572503935972
 192.168.1.27  datacenter1 rack1  Up      Joining  1.81 GB  24.28%  99165710459060760249270263771474737125
 192.168.1.9   datacenter1 rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296
 
 The 1.27 node seems unable to join the cluster; it just hangs there...
 
 Any suggestions?
 
 Thanks.
 
 
 On Sun, Aug 7, 2011 at 10:01 AM, aaron morton aa...@thelastpickle.com wrote:
 After the restart, what was in the logs for the 1.27 machine from the 
 Migration.java logger? Some of the messages will start with Applying 
 migration
 
 You should have shut down both of the nodes, then deleted the schema* and 
 migration* system sstables, then restarted one of them and watched to see if 
 it got to schema agreement. 
 
 Cheers
   
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 6 Aug 2011, at 22:56, Dikang Gu wrote:
 
 I have tried this, but the schema still does not agree in the cluster:
 
 [default@unknown] describe cluster;
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
  UNREACHABLE: [192.168.1.28]
  75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
  5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
 
 Any other suggestions to solve this?
 
 Because I have some production data saved in the cassandra cluster, I 
 cannot afford data loss...
 
 Thanks.
 
 On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud ben...@noisette.ch wrote:
 Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement,
 75eece10-bf48-11e0--4d205df954a7 owns the majority, so shut down and
 remove the schema* and migration* sstables from both 192.168.1.28 and
 192.168.1.27
 
 
 2011/8/5 Dikang Gu dikan...@gmail.com:
  [default@unknown] describe cluster;
  Cluster Information:
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions:
  743fe590-bf48-11e0--4d205df954a7: [192.168.1.28]
  75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
  06da9aa0-bda8-11e0--9510c23fceff: [192.168.1.27]
 
   three different schema versions in the cluster...
  --
  Dikang Gu
  0086 - 18611140205
 
 
 
 
 -- 
 Dikang Gu
 
 0086 - 18611140205
 
 
 
 
 
 -- 
 Dikang Gu
 
 0086 - 18611140205
 



Re: How to solve this kind of schema disagreement...

2011-08-07 Thread Dikang Gu
Hi Aaron, 

I repeat the whole procedure:

1. kill the cassandra instance on 1.27.
2. rm the data/system/Migrations-g-*
3. rm the data/system/Schema-g-*
4. bin/cassandra to start the cassandra.
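
(The same four steps as commands, sketched -- data path relative to the
install dir, as in the original:

  # on 192.168.1.27
  kill <cassandra-pid>
  rm data/system/Migrations-g-*
  rm data/system/Schema-g-*
  bin/cassandra
)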

Now, the migration seems to stop and I do not find any errors in the system.log yet.

The ring looks good:
[root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 -p8090 ring
Address       DC          Rack   Status  State    Load     Owns    Token
                                                                   127605887595351923798765477786913079296
192.168.1.28  datacenter1 rack1  Up      Normal   8.38 GB  25.00%  1
192.168.1.25  datacenter1 rack1  Up      Normal   8.54 GB  34.01%  57856537434773737201679995572503935972
192.168.1.27  datacenter1 rack1  Up      Normal   1.78 GB  24.28%  99165710459060760249270263771474737125
192.168.1.9   datacenter1 rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296


But the schema is still not correct:
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 192.168.1.25]
5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]


The 5a54ebd0-bd90-11e0--9510c23fceff is the same as last time…

And in the log, the last Migration.java log is:
INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) 
Applying migration 5a54ebd0-bd90-11e0--9510c23fceff Add keyspace: 
SimpleDB_4E38DAA64894A9146105rep 
strategy:SimpleStrategy{}durable_writes: true

Could you explain this? 

If I change the token given to 1.27 to another one, will it help?

Thanks.
-- 
Dikang Gu
0086 - 18611140205
On Sunday, August 7, 2011 at 4:14 PM, aaron morton wrote: 
 did you check the logs in 1.27 for errors ? 
 
 Could you be seeing this ? 
 https://issues.apache.org/jira/browse/CASSANDRA-2867
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 
 
 
 
 On 7 Aug 2011, at 16:24, Dikang Gu wrote:
   I restarted both nodes, deleted the schema* and migration*, and restarted 
   them.
  
  The current cluster looks like this:
  [default@unknown] describe cluster; 
  Cluster Information:
  Snitch: org.apache.cassandra.locator.SimpleSnitch
  Partitioner: org.apache.cassandra.dht.RandomPartitioner
  Schema versions: 
  75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
  192.168.1.25]
  5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
  
  
   the 1.28 looks good, and the 1.27 still cannot get schema agreement...
  
   I have tried several times, even deleted all the data on 1.27 and rejoined it 
   as a new node, but it is still unhappy. 
  
  And the ring looks like this: 
  
   Address       DC          Rack   Status  State    Load     Owns    Token
                                                                      127605887595351923798765477786913079296
   192.168.1.28  datacenter1 rack1  Up      Normal   8.38 GB  25.00%  1
   192.168.1.25  datacenter1 rack1  Up      Normal   8.55 GB  34.01%  57856537434773737201679995572503935972
   192.168.1.27  datacenter1 rack1  Up      Joining  1.81 GB  24.28%  99165710459060760249270263771474737125
   192.168.1.9   datacenter1 rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296
  
  
   The 1.27 node seems unable to join the cluster; it just hangs there... 
  
  Any suggestions?
  
  Thanks.
  
  
  On Sun, Aug 7, 2011 at 10:01 AM, aaron morton aa...@thelastpickle.com 
  wrote:
    After the restart, what was in the logs for the 1.27 machine from 
    the Migration.java logger? Some of the messages will start with 
    Applying migration 
   
   You should have shut down both of the nodes, then deleted the schema* and 
   migration* system sstables, then restarted one of them and watched to see 
   if it got to schema agreement. 
   
Cheers
   
   -
   Aaron Morton
   Freelance Cassandra Developer
   @aaronmorton
   http://www.thelastpickle.com
   
   
   
   
   
   On 6 Aug 2011, at 22:56, Dikang Gu wrote:
I have tried this, but the schema still does not agree in the cluster:

[default@unknown] describe cluster; 
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
UNREACHABLE: [192.168.1.28]
75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]

Any other suggestions to solve this?

Because I have some production data saved in the cassandra cluster, 
I cannot afford data loss... 

Thanks.
On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud ben...@noisette.ch 
wrote:
  Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement,
  75eece10-bf48-11e0--4d205df954a7 owns the majority, so shut down 
 and
  remove the schema* and migration* sstables from both 192.168.1.28 and
  192.168.1.27
 
 
  2011/8/5 Dikang Gu dikan...@gmail.com:
  [default@unknown] describe cluster;
  

Re: How to solve this kind of schema disagreement...

2011-08-06 Thread Dikang Gu
I have tried this, but the schema still does not agree in the cluster:

[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
UNREACHABLE: [192.168.1.28]
75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]

Any other suggestions to solve this?

Because I have some production data saved in the cassandra cluster, I
cannot afford data loss...

Thanks.

On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud ben...@noisette.ch wrote:

 Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement,
 75eece10-bf48-11e0--4d205df954a7 owns the majority, so shut down and
 remove the schema* and migration* sstables from both 192.168.1.28 and
 192.168.1.27


 2011/8/5 Dikang Gu dikan...@gmail.com:
  [default@unknown] describe cluster;
  Cluster Information:
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions:
  743fe590-bf48-11e0--4d205df954a7: [192.168.1.28]
  75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
  06da9aa0-bda8-11e0--9510c23fceff: [192.168.1.27]
 
   three different schema versions in the cluster...
  --
  Dikang Gu
  0086 - 18611140205
 




-- 
Dikang Gu

0086 - 18611140205


Re: How to solve this kind of schema disagreement...

2011-08-06 Thread aaron morton
After the restart, what was in the logs for the 1.27 machine from the 
Migration.java logger? Some of the messages will start with Applying 
migration

You should have shut down both of the nodes, then deleted the schema* and 
migration* system sstables, then restarted one of them and watched to see if it 
got to schema agreement. 
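
(One hedged way to watch for agreement from the shell -- gossipinfo prints 
each endpoint's application state, which should converge on a single SCHEMA 
uuid:

  bin/nodetool -h 192.168.1.25 gossipinfo | grep SCHEMA
  # or interactively: bin/cassandra-cli -h 192.168.1.25, then: describe cluster;
)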

Cheers
  
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 6 Aug 2011, at 22:56, Dikang Gu wrote:

 I have tried this, but the schema still does not agree in the cluster:
 
 [default@unknown] describe cluster;
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
   UNREACHABLE: [192.168.1.28]
   75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
   5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
 
 Any other suggestions to solve this?
 
 Because I have some production data saved in the cassandra cluster, I 
 cannot afford data loss...
 
 Thanks.
 
 On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud ben...@noisette.ch wrote:
 Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement,
 75eece10-bf48-11e0--4d205df954a7 owns the majority, so shut down and
 remove the schema* and migration* sstables from both 192.168.1.28 and
 192.168.1.27
 
 
 2011/8/5 Dikang Gu dikan...@gmail.com:
  [default@unknown] describe cluster;
  Cluster Information:
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions:
  743fe590-bf48-11e0--4d205df954a7: [192.168.1.28]
  75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
  06da9aa0-bda8-11e0--9510c23fceff: [192.168.1.27]
 
   three different schema versions in the cluster...
  --
  Dikang Gu
  0086 - 18611140205
 
 
 
 
 -- 
 Dikang Gu
 
 0086 - 18611140205
 



Re: How to solve this kind of schema disagreement...

2011-08-06 Thread Dikang Gu
I restarted both nodes, deleted the schema* and migration*, and restarted
them.

The current cluster looks like this:
[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9,
192.168.1.25]
5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]

the 1.28 looks good, and the 1.27 still cannot get schema agreement...

I have tried several times, even deleted all the data on 1.27 and rejoined it
as a new node, but it is still unhappy.

And the ring looks like this:

Address       DC          Rack   Status  State    Load     Owns    Token
                                                                   127605887595351923798765477786913079296
192.168.1.28  datacenter1 rack1  Up      Normal   8.38 GB  25.00%  1
192.168.1.25  datacenter1 rack1  Up      Normal   8.55 GB  34.01%  57856537434773737201679995572503935972
192.168.1.27  datacenter1 rack1  Up      Joining  1.81 GB  24.28%  99165710459060760249270263771474737125
192.168.1.9   datacenter1 rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296

The 1.27 node seems unable to join the cluster; it just hangs there...

Any suggestions?

Thanks.


On Sun, Aug 7, 2011 at 10:01 AM, aaron morton aa...@thelastpickle.comwrote:

 After the restart, what was in the logs for the 1.27 machine from 
 the Migration.java logger? Some of the messages will start with Applying 
 migration

 You should have shut down both of the nodes, then deleted the schema* and
 migration* system sstables, then restarted one of them and watched to see if
 it got to schema agreement.

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 6 Aug 2011, at 22:56, Dikang Gu wrote:

 I have tried this, but the schema still does not agree in the cluster:

 [default@unknown] describe cluster;
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
 UNREACHABLE: [192.168.1.28]
 75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
  5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]

 Any other suggestions to solve this?

 Because I have some production data saved in the cassandra cluster, I 
 cannot afford data loss...

 Thanks.

 On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud ben...@noisette.ch wrote:

 Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement,
 75eece10-bf48-11e0--4d205df954a7 owns the majority, so shut down and
 remove the schema* and migration* sstables from both 192.168.1.28 and
 192.168.1.27


 2011/8/5 Dikang Gu dikan...@gmail.com:
  [default@unknown] describe cluster;
  Cluster Information:
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions:
  743fe590-bf48-11e0--4d205df954a7: [192.168.1.28]
  75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
  06da9aa0-bda8-11e0--9510c23fceff: [192.168.1.27]
 
   three different schema versions in the cluster...
  --
  Dikang Gu
  0086 - 18611140205
 




 --
 Dikang Gu

 0086 - 18611140205





-- 
Dikang Gu

0086 - 18611140205


How to solve this kind of schema disagreement...

2011-08-05 Thread Dikang Gu
[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
743fe590-bf48-11e0--4d205df954a7: [192.168.1.28]
75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
06da9aa0-bda8-11e0--9510c23fceff: [192.168.1.27]


 three different schema versions in the cluster...

-- 
Dikang Gu

0086 - 18611140205


Re: How to solve this kind of schema disagreement...

2011-08-05 Thread Benoit Perroud
Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement,
75eece10-bf48-11e0--4d205df954a7 owns the majority, so shut down and
remove the schema* and migration* sstables from both 192.168.1.28 and
192.168.1.27
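
(Sketched as commands -- data path assumed; run only on the two 
minority-version nodes, with cassandra stopped:

  rm data/system/Schema-*  data/system/Migrations-*
  bin/cassandra
  # then re-check describe cluster; until every node reports 75eece10-...
)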


2011/8/5 Dikang Gu dikan...@gmail.com:
 [default@unknown] describe cluster;
 Cluster Information:
    Snitch: org.apache.cassandra.locator.SimpleSnitch
    Partitioner: org.apache.cassandra.dht.RandomPartitioner
    Schema versions:
 743fe590-bf48-11e0--4d205df954a7: [192.168.1.28]
 75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
 06da9aa0-bda8-11e0--9510c23fceff: [192.168.1.27]

  three different schema versions in the cluster...
 --
 Dikang Gu
 0086 - 18611140205



Re: Schema Disagreement

2011-08-05 Thread Yi Yang
Thanks Aaron.
On Aug 2, 2011, at 3:04 AM, aaron morton wrote:

 Hang on, using brain now. 
 
  That is triggering a small bug in the code, see 
  https://issues.apache.org/jira/browse/CASSANDRA-2984
  
  For now, just remove the column meta data. 
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 2 Aug 2011, at 21:19, aaron morton wrote:
 
 What do you see when you run describe cluster; in the cassandra-cli? What's 
 the exact error you get, and is there anything in the server-side logs?
 
 Have you added other CFs before adding this one? Did the schema agree 
 before starting this statement?
 
 I ran the statement below on the current trunk and it worked. 
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 2 Aug 2011, at 12:08, Dikang Gu wrote:
 
 I thought the schema disagreement problem was already solved in 0.8.1...
 
 One possible solution is to decommission the disagreeing node and rejoin it.
 
 
 On Tue, Aug 2, 2011 at 8:01 AM, Yi Yang yy...@me.com wrote:
 Dear all,
 
 I'm always running into schema disagreement problems while trying to create 
 a column family like this, using cassandra-cli:
 
 create column family sd
with column_type = 'Super'
and key_validation_class = 'UUIDType'
and comparator = 'LongType'
and subcomparator = 'UTF8Type'
and column_metadata = [
{
column_name: 'time',
validation_class : 'LongType'
},{
column_name: 'open',
validation_class : 'FloatType'
},{
column_name: 'high',
validation_class : 'FloatType'
},{
column_name: 'low',
validation_class : 'FloatType'
},{
column_name: 'close',
validation_class : 'FloatType'
},{
column_name: 'volumn',
validation_class : 'LongType'
},{
column_name: 'splitopen',
validation_class : 'FloatType'
},{
column_name: 'splithigh',
validation_class : 'FloatType'
},{
column_name: 'splitlow',
validation_class : 'FloatType'
},{
column_name: 'splitclose',
validation_class : 'FloatType'
},{
column_name: 'splitvolume',
validation_class : 'LongType'
},{
column_name: 'splitclose',
validation_class : 'FloatType'
}
]
 ;
 
 I've tried to erase everything and restart Cassandra but this still 
 happens.   But when I clear the column_metadata section there is no more 
 disagreement error.   Do you have any idea why this happens?
 
 Environment: 2 VMs, using the same hard drive, Cassandra 0.8.1, Ubuntu 10.04
 This is for testing only.   We'll move to dedicated servers later.
 
 Best regards,
 Yi
 
 
 
 -- 
 Dikang Gu
 
 0086 - 18611140205
 
 
 



Re: Schema Disagreement

2011-08-03 Thread aaron morton
It means the node you ran the command against could not contact node 
192.168.1.25; it's probably down. 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 3 Aug 2011, at 14:03, Dikang Gu wrote:

 I followed the instructions in the FAQ, but got the following from describe 
 cluster;
 
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
   dd73c740-bd84-11e0--98dab94442fb: [192.168.1.28, 192.168.1.9, 
 192.168.1.27]
   UNREACHABLE: [192.168.1.25]
 
 What's the UNREACHABLE?
 
 Thanks.
 
 -- 
 Dikang Gu
 0086 - 18611140205
 On Wednesday, August 3, 2011 at 11:28 AM, Jonathan Ellis wrote:
 
 Have you seen http://wiki.apache.org/cassandra/FAQ#schema_disagreement ?
 
 On Tue, Aug 2, 2011 at 10:25 PM, Dikang Gu dikan...@gmail.com wrote:
 I also encountered the schema disagreement in my 0.8.1 cluster today…
 
 The disagreement occurs when I create a column family using the hector api,
 and I found the following errors in my cassandra/system.log
 ERROR [pool-2-thread-99] 2011-08-03 11:21:18,051 Cassandra.java (line 3378)
 Internal error processing remove
 java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut
 down
 at
 org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
 at
 java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
 at
 java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
 at
 org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
 at
 org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
 at
 org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
 at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
 at
 org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
 at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
 at
 org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
 at
 org.apache.cassandra.thrift.CassandraServer.internal_remove(CassandraServer.java:539)
 at
 org.apache.cassandra.thrift.CassandraServer.remove(CassandraServer.java:547)
 at
 org.apache.cassandra.thrift.Cassandra$Processor$remove.process(Cassandra.java:3370)
 at
 org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
 at
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:636)
 And when I tried to decommission, I got this:
 ERROR [pool-2-thread-90] 2011-08-03 11:24:35,611 Cassandra.java (line 3462)
 Internal error processing batch_mutate
 java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut
 down
 at
 org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
 at
 java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
 at
 java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
 at
 org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
 at
 org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
 at
 org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
 at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
 at
 org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
 at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
 at
 org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
 at
 org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
 at
 org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
 at
 org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
 at
 org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
 at
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:636)
 What does this mean?
 Thanks.
 --
 Dikang Gu
 0086 - 18611140205
 
 On Tuesday, August 2, 2011 at 6:04 PM, aaron morton wrote:
 
 Hang on, using brain now.
  That is triggering a small bug in the code,
  see https://issues.apache.org/jira/browse/CASSANDRA-2984
  For now, just remove the column meta data.
 Cheers
 -
 Aaron Morton
 Freelance Cassandra Developer

Re: Schema Disagreement

2011-08-02 Thread aaron morton
What do you see when you run describe cluster; in the cassandra-cli? What's the 
exact error you get, and is there anything in the server-side logs?

Have you added other CFs before adding this one? Did the schema agree before 
starting this statement?

I ran the statement below on the current trunk and it worked. 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 2 Aug 2011, at 12:08, Dikang Gu wrote:

 I thought the schema disagreement problem was already solved in 0.8.1...
 
 One possible solution is to decommission the disagreeing node and rejoin it.
 
 
 On Tue, Aug 2, 2011 at 8:01 AM, Yi Yang yy...@me.com wrote:
 Dear all,
 
 I'm always running into schema disagreement problems while trying to create a 
 column family like this, using cassandra-cli:
 
 create column family sd
with column_type = 'Super'
and key_validation_class = 'UUIDType'
and comparator = 'LongType'
and subcomparator = 'UTF8Type'
and column_metadata = [
{
column_name: 'time',
validation_class : 'LongType'
},{
column_name: 'open',
validation_class : 'FloatType'
},{
column_name: 'high',
validation_class : 'FloatType'
},{
column_name: 'low',
validation_class : 'FloatType'
},{
column_name: 'close',
validation_class : 'FloatType'
},{
column_name: 'volumn',
validation_class : 'LongType'
},{
column_name: 'splitopen',
validation_class : 'FloatType'
},{
column_name: 'splithigh',
validation_class : 'FloatType'
},{
column_name: 'splitlow',
validation_class : 'FloatType'
},{
column_name: 'splitclose',
validation_class : 'FloatType'
},{
column_name: 'splitvolume',
validation_class : 'LongType'
},{
column_name: 'splitclose',
validation_class : 'FloatType'
}
]
 ;
 
 I've tried to erase everything and restart Cassandra but this still happens.  
  But when I clear the column_metadata section there is no more disagreement 
 error.   Do you have any idea why this happens?
 
 Environment: 2 VMs, using the same hard drive, Cassandra 0.8.1, Ubuntu 10.04
 This is for testing only.   We'll move to dedicated servers later.
 
 Best regards,
 Yi
 
 
 
 -- 
 Dikang Gu
 
 0086 - 18611140205
 



Re: Schema Disagreement

2011-08-02 Thread aaron morton
Hang on, using brain now. 

That is triggering a small bug in the code, see 
https://issues.apache.org/jira/browse/CASSANDRA-2984

For now, just remove the column meta data. 
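For illustration only, the statement from the original report with the 
metadata stripped (a sketch of the workaround, not a tested fix):

  create column family sd
      with column_type = 'Super'
      and key_validation_class = 'UUIDType'
      and comparator = 'LongType'
      and subcomparator = 'UTF8Type';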

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 2 Aug 2011, at 21:19, aaron morton wrote:

 What do you see when you run describe cluster; in the cassandra-cli? What's 
 the exact error you get, and is there anything in the server-side logs?
 
 Have you added other CFs before adding this one? Did the schema agree 
 before starting this statement?
 
 I ran the statement below on the current trunk and it worked. 
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 2 Aug 2011, at 12:08, Dikang Gu wrote:
 
 I thought the schema disagreement problem was already solved in 0.8.1...
 
 One possible solution is to decommission the disagreeing node and rejoin it.
 
 
 On Tue, Aug 2, 2011 at 8:01 AM, Yi Yang yy...@me.com wrote:
 Dear all,
 
 I'm always running into schema disagreement problems while trying to create a 
 column family like this, using cassandra-cli:
 
 create column family sd
with column_type = 'Super'
and key_validation_class = 'UUIDType'
and comparator = 'LongType'
and subcomparator = 'UTF8Type'
and column_metadata = [
{
column_name: 'time',
validation_class : 'LongType'
},{
column_name: 'open',
validation_class : 'FloatType'
},{
column_name: 'high',
validation_class : 'FloatType'
},{
column_name: 'low',
validation_class : 'FloatType'
},{
column_name: 'close',
validation_class : 'FloatType'
},{
column_name: 'volumn',
validation_class : 'LongType'
},{
column_name: 'splitopen',
validation_class : 'FloatType'
},{
column_name: 'splithigh',
validation_class : 'FloatType'
},{
column_name: 'splitlow',
validation_class : 'FloatType'
},{
column_name: 'splitclose',
validation_class : 'FloatType'
},{
column_name: 'splitvolume',
validation_class : 'LongType'
},{
column_name: 'splitclose',
validation_class : 'FloatType'
}
]
 ;
 
 I've tried to erase everything and restart Cassandra but this still happens. 
   But when I clear the column_metadata section there is no more disagreement 
 error.   Do you have any idea why this happens?
 
 Environment: 2 VMs, using the same hard drive, Cassandra 0.8.1, Ubuntu 10.04
 This is for testing only.   We'll move to dedicated servers later.
 
 Best regards,
 Yi
 
 
 
 -- 
 Dikang Gu
 
 0086 - 18611140205
 
 



Re: Schema Disagreement

2011-08-02 Thread Dikang Gu
I also encountered the schema disagreement in my 0.8.1 cluster today…

The disagreement occurs when I create a column family using the hector api, and 
I found the following errors in my cassandra/system.log

ERROR [pool-2-thread-99] 2011-08-03 11:21:18,051 Cassandra.java (line 3378) 
Internal error processing remove
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut 
down
at 
org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
at 
org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
at 
org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
at 
org.apache.cassandra.thrift.CassandraServer.internal_remove(CassandraServer.java:539)
at org.apache.cassandra.thrift.CassandraServer.remove(CassandraServer.java:547)
at 
org.apache.cassandra.thrift.Cassandra$Processor$remove.process(Cassandra.java:3370)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)

And when I tried to decommission, I got this:

ERROR [pool-2-thread-90] 2011-08-03 11:24:35,611 Cassandra.java (line 3462) 
Internal error processing batch_mutate
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut 
down
at 
org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
at 
org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
at 
org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
at 
org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
at 
org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
at 
org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)

What does this mean? 

Thanks.

-- 
Dikang Gu
0086 - 18611140205
On Tuesday, August 2, 2011 at 6:04 PM, aaron morton wrote: 
 Hang on, using brain now. 
 
  That is triggering a small bug in the code, see 
  https://issues.apache.org/jira/browse/CASSANDRA-2984
  
  For now, just remove the column meta data. 
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 
 
 
 
 On 2 Aug 2011, at 21:19, aaron morton wrote:
   What do you see when you run describe cluster; in the cassandra-cli? What's 
   the exact error you get, and is there anything in the server-side logs?
   
   Have you added other CFs before adding this one? Did the schema agree 
   before starting this statement?
  
  I ran the statement below on the current trunk and it worked. 
  
  Cheers
  
  -
  Aaron Morton
  Freelance Cassandra Developer
  @aaronmorton
  http://www.thelastpickle.com
  
  
  
  
  
  On 2 Aug 2011, at 12:08, Dikang Gu wrote:
    I thought the schema disagreement problem was already solved in 0.8.1...
    
    One possible solution is to decommission the disagreeing node and rejoin it.
   
   
   On Tue, Aug 2, 2011 at 8:01 AM, Yi Yang yy...@me.com wrote:
Dear all,

 I'm always running into schema disagreement problems while trying to 
create

Re: Schema Disagreement

2011-08-02 Thread Jonathan Ellis
Have you seen http://wiki.apache.org/cassandra/FAQ#schema_disagreement ?

On Tue, Aug 2, 2011 at 10:25 PM, Dikang Gu dikan...@gmail.com wrote:
 I also encountered the schema disagreement in my 0.8.1 cluster today…

 The disagreement occurs when I create a column family using the hector api,
 and I found the following errors in my cassandra/system.log
 ERROR [pool-2-thread-99] 2011-08-03 11:21:18,051 Cassandra.java (line 3378)
 Internal error processing remove
 java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
 at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
 at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
 at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
 at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
 at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
 at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
 at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
 at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
 at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
 at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
 at org.apache.cassandra.thrift.CassandraServer.internal_remove(CassandraServer.java:539)
 at org.apache.cassandra.thrift.CassandraServer.remove(CassandraServer.java:547)
 at org.apache.cassandra.thrift.Cassandra$Processor$remove.process(Cassandra.java:3370)
 at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
 at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:636)
 And when I try to decommission, I got this:
 ERROR [pool-2-thread-90] 2011-08-03 11:24:35,611 Cassandra.java (line 3462)
 Internal error processing batch_mutate
 java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
 at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
 at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
 at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
 at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
 at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
 at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
 at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
 at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
 at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
 at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
 at org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
 at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
 at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
 at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
 at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:636)
 What does this mean?
 Thanks.
 --
 Dikang Gu
 0086 - 18611140205

 On Tuesday, August 2, 2011 at 6:04 PM, aaron morton wrote:

 Hang on, using brain now.
 That is triggering a small bug in the code
 see https://issues.apache.org/jira/browse/CASSANDRA-2984
 For now, just remove the column meta data.
 Cheers
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 On 2 Aug 2011, at 21:19, aaron morton wrote:

 What do you see when you run describe cluster; in the cassandra-cli? What's
 the exact error you get, and is there anything in the server-side logs?
 Have you added other CFs before adding this one? Did the schema agree
 before starting this statement?
 I ran the statement below on the current trunk and it worked.
 Cheers
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 On 2 Aug 2011, at 12:08, Dikang Gu wrote:

 I thought the schema disagreement problem was already solved in 0.8.1...
 One possible solution is to decommission the disagreeing node and rejoin it.

 On Tue, Aug 2, 2011 at 8:01 AM, Yi Yang yy

Re: Schema Disagreement

2011-08-02 Thread Dikang Gu
I followed the instructions in the FAQ, but got the following when running 
describe cluster;

Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
dd73c740-bd84-11e0--98dab94442fb: [192.168.1.28, 192.168.1.9, 192.168.1.27]
UNREACHABLE: [192.168.1.25]


What's the UNREACHABLE?

Thanks.

-- 
Dikang Gu
0086 - 18611140205
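
UNREACHABLE in describe cluster; output means that node never answered the
schema version request at all (it is down, still starting up, or partitioned
from the node you queried), rather than answering with a different schema. A
quick check from any live node:

nodetool -h 192.168.1.25 info   # does the unreachable node respond at all?
nodetool -h 192.168.1.28 ring   # is 192.168.1.25 listed as Up or Down?

If it is down, the FAQ procedure cannot be applied to it until it is brought
back up or removed from the ring.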
On Wednesday, August 3, 2011 at 11:28 AM, Jonathan Ellis wrote: 
 Have you seen http://wiki.apache.org/cassandra/FAQ#schema_disagreement ?
 
 On Tue, Aug 2, 2011 at 10:25 PM, Dikang Gu dikan...@gmail.com wrote:
  I also encounter the schema disagreement in my 0.8.1 cluster today…
  
  The disagreement occurs when I create a column family using the hector api,
  and I found the following errors in my cassandra/system.log
  ERROR [pool-2-thread-99] 2011-08-03 11:21:18,051 Cassandra.java (line 3378)
  Internal error processing remove
  java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
  at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
  at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
  at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
  at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
  at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
  at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
  at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
  at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
  at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
  at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
  at org.apache.cassandra.thrift.CassandraServer.internal_remove(CassandraServer.java:539)
  at org.apache.cassandra.thrift.CassandraServer.remove(CassandraServer.java:547)
  at org.apache.cassandra.thrift.Cassandra$Processor$remove.process(Cassandra.java:3370)
  at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
  at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:636)
  And when I try to decommission, I got this:
  ERROR [pool-2-thread-90] 2011-08-03 11:24:35,611 Cassandra.java (line 3462)
  Internal error processing batch_mutate
  java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
  at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
  at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
  at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
  at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
  at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
  at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
  at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
  at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
  at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
  at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
  at org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
  at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
  at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
  at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
  at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:636)
  What does this mean?
  Thanks.
  --
  Dikang Gu
  0086 - 18611140205
  
  On Tuesday, August 2, 2011 at 6:04 PM, aaron morton wrote:
  
  Hang on, using brain now.
  That is triggering a small bug in the code
  see https://issues.apache.org/jira/browse/CASSANDRA-2984
  For now, just remove the column meta data.
  Cheers
  -
  Aaron Morton
  Freelance Cassandra Developer
  @aaronmorton
  http://www.thelastpickle.com
  On 2 Aug 2011, at 21:19, aaron morton wrote:
  
  What do you see when you run describe cluster; in the cassandra-cli? What's
  the exact error you get

Schema Disagreement

2011-08-01 Thread Yi Yang
Dear all,

I keep running into schema disagreement problems while trying to create a 
column family like this, using cassandra-cli:

create column family sd
  with column_type = 'Super'
  and key_validation_class = 'UUIDType'
  and comparator = 'LongType'
  and subcomparator = 'UTF8Type'
  and column_metadata = [
    {
      column_name: 'time',
      validation_class : 'LongType'
    },{
      column_name: 'open',
      validation_class : 'FloatType'
    },{
      column_name: 'high',
      validation_class : 'FloatType'
    },{
      column_name: 'low',
      validation_class : 'FloatType'
    },{
      column_name: 'close',
      validation_class : 'FloatType'
    },{
      column_name: 'volumn',
      validation_class : 'LongType'
    },{
      column_name: 'splitopen',
      validation_class : 'FloatType'
    },{
      column_name: 'splithigh',
      validation_class : 'FloatType'
    },{
      column_name: 'splitlow',
      validation_class : 'FloatType'
    },{
      column_name: 'splitclose',
      validation_class : 'FloatType'
    },{
      column_name: 'splitvolume',
      validation_class : 'LongType'
    },{
      column_name: 'splitclose',
      validation_class : 'FloatType'
    }
  ]
;

I've tried to erase everything and restart Cassandra, but this still happens. 
But when I clear the column_metadata section, there is no more disagreement 
error. Do you have any idea why this happens?

Environment: 2 VMs, using the same harddrive, Cassandra 0.8.1, Ubuntu 10.04
This is for testing only.   We'll move to dedicated servers later.

Best regards,
Yi
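
Independent of the bug: the statement above lists 'splitclose' twice in
column_metadata, and 'volumn' looks like a typo for 'volume'. A sketch of the
corrected tail of the list, dropping the duplicated entry so it ends after
'splitvolume':

    },{
      column_name: 'splitclose',
      validation_class : 'FloatType'
    },{
      column_name: 'splitvolume',
      validation_class : 'LongType'
    }
  ]
;

Each subcolumn should appear at most once in column_metadata; a duplicated
name makes the intended validation_class ambiguous.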


Re: Schema Disagreement

2011-08-01 Thread Dikang Gu
I thought the schema disagreement problem was already solved in 0.8.1...

One possible solution is to decommission the disagreeing node and rejoin it.
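
A sketch of that approach (host name hypothetical); decommission streams the
node's ranges to the remaining replicas before the node leaves the ring:

nodetool -h disagreeing-node decommission
# wait for streaming to finish, stop cassandra on that node,
# wipe its data directories, then start it again so it
# bootstraps back in and picks up the cluster's current schema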


On Tue, Aug 2, 2011 at 8:01 AM, Yi Yang yy...@me.com wrote:

 Dear all,

 I keep running into schema disagreement problems while trying to create
 a column family like this, using cassandra-cli:

 create column family sd
with column_type = 'Super'
and key_validation_class = 'UUIDType'
and comparator = 'LongType'
and subcomparator = 'UTF8Type'
and column_metadata = [
{
column_name: 'time',
validation_class : 'LongType'
},{
column_name: 'open',
validation_class : 'FloatType'
},{
column_name: 'high',
validation_class : 'FloatType'
},{
column_name: 'low',
validation_class : 'FloatType'
},{
column_name: 'close',
validation_class : 'FloatType'
},{
column_name: 'volumn',
validation_class : 'LongType'
},{
column_name: 'splitopen',
validation_class : 'FloatType'
},{
column_name: 'splithigh',
validation_class : 'FloatType'
},{
column_name: 'splitlow',
validation_class : 'FloatType'
},{
column_name: 'splitclose',
validation_class : 'FloatType'
},{
column_name: 'splitvolume',
validation_class : 'LongType'
},{
column_name: 'splitclose',
validation_class : 'FloatType'
}
]
 ;

 I've tried to erase everything and restart Cassandra, but this still
 happens. But when I clear the column_metadata section, there is no more
 disagreement error. Do you have any idea why this happens?

 Environment: 2 VMs, using the same harddrive, Cassandra 0.8.1, Ubuntu 10.04
 This is for testing only.   We'll move to dedicated servers later.

 Best regards,
 Yi




-- 
Dikang Gu

0086 - 18611140205