Re: UnknownColumnFamilyException after removing all Cassandra data

2017-02-13 Thread Jacob Shadix
The node will not bootstrap if it is listed as a seed node.
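
Whether a node counts as a seed is decided by the seeds entry in its own cassandra.yaml. The following is a minimal sketch for checking that, assuming the common SimpleSeedProvider layout with a single `- seeds: "ip1,ip2"` line and plain IPv4 addresses without ports. The function name `is_seed` is hypothetical.

```shell
# Hypothetical helper: exit 0 if the given IP appears in the "seeds" list of
# a cassandra.yaml-style file, non-zero otherwise.
# Assumes one uncommented '- seeds: "ip1,ip2"' line, no ports, no IPv6.
is_seed() {
    ip="$1"
    yaml="$2"
    # Take the first "- seeds:" line, drop everything up to the first colon,
    # strip quotes and spaces, then compare one address per line, exactly.
    grep -E '^[[:space:]]*-[[:space:]]*seeds:' "$yaml" \
        | head -n 1 \
        | sed 's/^[^:]*://' \
        | tr -d '" ' \
        | tr ',' '\n' \
        | grep -Fxq "$ip"
}
```

For example, `is_seed 10.128.6.196 conf/cassandra.yaml && echo "listed as seed"`, where both the address and the path are placeholders.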

-- Jacob Shadix



Re: UnknownColumnFamilyException after removing all Cassandra data

2017-02-07 Thread Simone Franzini
To further add to my previous answer, the node in question is a seed node,
so it did not bootstrap.
Should I remove it from the list of seed nodes and then try to restart it?

Simone Franzini, PhD

http://www.linkedin.com/in/simonefranzini



Re: UnknownColumnFamilyException after removing all Cassandra data

2017-02-07 Thread Simone Franzini
This is exactly what I did on the second node. If this is not the correct /
best procedure to adopt in these cases, please advise:

1. Removed all the data, including the system tables (rm -rf data/
commitlog/ saved_caches).
2. Configured the node to replace itself, by adding the following line to
cassandra-env.sh: JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<own IP address>"
3. Started the node.
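
Step 2 can be sketched as a small shell helper. This is only an illustration: the function name `add_replace_address` is hypothetical, and the file path and IP address are supplied by the caller. It refuses to add a second replace_address line if one is already present.

```shell
# Hypothetical helper: append the replace_address option from step 2 to an
# env file such as cassandra-env.sh.
# Returns 1 without changing the file if any line (even a commented one)
# already mentions cassandra.replace_address=.
add_replace_address() {
    env_file="$1"
    node_ip="$2"
    if grep -q 'cassandra\.replace_address=' "$env_file"; then
        echo "replace_address already present in $env_file" >&2
        return 1
    fi
    # Single quotes keep $JVM_OPTS literal; only %s is substituted.
    printf 'JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=%s"\n' "$node_ip" >> "$env_file"
}
```

Usage would look like `add_replace_address conf/cassandra-env.sh 10.0.0.2`, with both arguments being placeholders.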

Notably, I did not run nodetool decommission or removenode. Is that the
recommended approach?

Given what I did, I am mystified as to what the problem is. If I query the
system.schema_columnfamilies on the affected node, all CF IDs are there.
Same goes for the only other node that is currently up. Also, the other
node that is currently up has data for all those CF IDs in the data folder.
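
One way to check the data folder for a given CF ID is sketched below. It assumes the on-disk layout used by Cassandra 2.1 and later, where each table directory is named `<table>-<cf id without hyphens>` under `<data dir>/<keyspace>/`. That layout is an assumption about this cluster's version, and `find_cf_dir` is a hypothetical name.

```shell
# Hypothetical helper: print the table directory whose name ends in the
# given CF ID (hyphens removed). Exit 0 if at least one directory matches.
# Assumes the <data_dir>/<keyspace>/<table>-<32 hex chars> layout.
find_cf_dir() {
    cf_id="$1"
    data_dir="$2"
    hex="$(printf '%s' "$cf_id" | tr -d '-')"
    found=1
    for dir in "$data_dir"/*/*-"$hex"; do
        # An unmatched glob stays literal, so test that it is a directory.
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            found=0
        fi
    done
    return "$found"
}
```

For example, `find_cf_dir e36415b6-95a7-368c-9ac0-ae0ac774863d data/` would print the matching directory on a node that has one.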


Simone Franzini, PhD

http://www.linkedin.com/in/simonefranzini



Re: UnknownColumnFamilyException after removing all Cassandra data

2017-02-07 Thread kurt greaves
The node is trying to communicate with another node, potentially streaming
data, and is receiving files/data for an "unknown column family". That is,
it doesn't know about the CF with the id
e36415b6-95a7-368c-9ac0-ae0ac774863d.
If you deleted some columnfamilies but not all the system keyspace and
restarted the node I'd expect this error to occur. Or I suppose if you
didn't decommission the node properly before blowing the data away and
restarting.

You'll have to give us more information on what your exact steps were on
this 2nd node:

When you say deleted all Cassandra data, did this include the system
tables? Were your steps to delete all the data and then just restart the
node? Did you remove the node from the cluster prior to deleting the data
and restarting it (nodetool decommission/removenode)? Did the node rejoin
the cluster or did it have to bootstrap?


UnknownColumnFamilyException after removing all Cassandra data

2017-02-06 Thread Simone Franzini
I am trying to restore functionality of a cluster that got into a really
bad state of schema disagreements. Right now, I am at a point where I have
a single node up and I am trying to replicate data from there.
I am then trying to bring up a second node, where I deleted all Cassandra
data. The node comes up, then I get a bunch of:

2017-02-06 15:09:16,220 WARN  [MessagingService-Incoming-/10.128.6.196] IncomingTcpConnection.java:97 - UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=e36415b6-95a7-368c-9ac0-ae0ac774863d
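
When there are many of these warnings, it can help to list the distinct CF IDs they mention. A minimal sketch, assuming the log contains `cfId=<uuid>` exactly as in the lines above (`unknown_cf_ids` is a hypothetical name, and the log path is up to the caller):

```shell
# Hypothetical helper: print each distinct cfId found in the given log
# files (or on stdin when no file is given), one per line.
unknown_cf_ids() {
    # Keep only the lowercase hex and hyphen run that follows "cfId=".
    sed -n 's/.*cfId=\([0-9a-f-]*\).*/\1/p' "$@" | sort -u
}
```

For example, `unknown_cf_ids system.log` or `grep UnknownColumnFamilyException system.log | unknown_cf_ids`.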

What does this mean exactly? Is it a symptom of a problem with the schema?
Or is it just a warning that Cassandra cannot find the data on disk (since
it was deleted)?
In other words: is this something to worry about in this situation or is it
expected behavior?

I am currently running a repair and waiting to see if the data comes back.
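
For the schema disagreement itself, one rough check is to count how many distinct schema versions the cluster reports. The sketch below reads `nodetool describecluster` output on stdin and assumes each schema version appears on its own line as `<uuid>: [addresses]`. That output format is an assumption and may differ between versions, and `schema_version_count` is a hypothetical name.

```shell
# Hypothetical helper: count distinct schema version UUIDs in
# `nodetool describecluster` output read from stdin.
# Assumes lines of the form "<uuid>: [ip, ip, ...]"; other lines
# (for example an UNREACHABLE entry) are ignored.
schema_version_count() {
    grep -E '^[[:space:]]*[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}:[[:space:]]*\[' \
        | sed 's/^[[:space:]]*//; s/:.*//' \
        | sort -u \
        | wc -l \
        | tr -d ' '
}
```

Usage would be `nodetool describecluster | schema_version_count`; a result greater than 1 would suggest the nodes still disagree on the schema.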

Thanks,
Simone Franzini, PhD

http://www.linkedin.com/in/simonefranzini