Re: UnknownColumnFamilyException after removing all Cassandra data
The node will not bootstrap if it is listed as a seed node.

--
Jacob Shadix

On Tue, Feb 7, 2017 at 12:16 PM, Simone Franzini wrote:
> To further add to my previous answer, the node in question is a seed node,
> so it did not bootstrap.
> Should I remove it from the list of seed nodes and then try to restart it?
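Before restarting, it may help to confirm whether the node's own address really is in the seeds list, since a node listed there skips bootstrap. The following is only a minimal sketch: the helper name is_listed_as_seed and the sample addresses are made up for illustration, and the seeds string is passed in by hand rather than parsed out of cassandra.yaml.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the address in $1 appears as a whole
# entry in $2, a comma-separated seeds string such as the one configured
# under seed_provider in cassandra.yaml.
is_listed_as_seed() {
    node_addr=$1
    seeds=$2
    # Strip spaces and wrap in commas so that only whole entries match
    # (10.0.0.2 must not match 10.0.0.20).
    normalised=",$(printf '%s' "$seeds" | tr -d ' '),"
    case "$normalised" in
        *",$node_addr,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with made-up addresses.
if is_listed_as_seed "10.0.0.2" "10.0.0.1, 10.0.0.2"; then
    echo "10.0.0.2 is listed as a seed, so it will not bootstrap"
else
    echo "10.0.0.2 is not listed as a seed"
fi
```

The check only compares strings; it does not resolve hostnames, so a seed listed by name would need to be compared by name.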
Re: UnknownColumnFamilyException after removing all Cassandra data
To further add to my previous answer, the node in question is a seed node, so it did not bootstrap. Should I remove it from the list of seed nodes and then try to restart it?

Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini

On Tue, Feb 7, 2017 at 9:43 AM, Simone Franzini wrote:
> This is exactly what I did on the second node. [...]
Re: UnknownColumnFamilyException after removing all Cassandra data
This is exactly what I did on the second node. If this is not the correct / best procedure to adopt in these cases, please advise:

1. Removed all the data, including the system tables (rm -rf data/ commitlog/ saved_caches).
2. Configured the node to replace itself, by adding the following line to cassandra-env.sh: JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<own IP address>"
3. Started the node.

Notably, I did not do nodetool decommission or removenode. Is that the recommended approach?

Given what I did, I am mystified as to what the problem is. If I query system.schema_columnfamilies on the affected node, all CF IDs are there. The same goes for the only other node that is currently up. Also, that other node has data for all those CF IDs in its data folder.

Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini

On Tue, Feb 7, 2017 at 5:39 AM, kurt greaves wrote:
> [...] You'll have to give us more information on what your exact steps were on
> this 2nd node: [...]
Re: UnknownColumnFamilyException after removing all Cassandra data
The node is trying to communicate with another node, potentially streaming data, and is receiving files/data for an "unknown column family". That is, it doesn't know about the CF with the id e36415b6-95a7-368c-9ac0-ae0ac774863d.

If you deleted some column families but not the whole system keyspace and then restarted the node, I'd expect this error to occur. Or I suppose it could also happen if you didn't decommission the node properly before blowing the data away and restarting.

You'll have to give us more information on what your exact steps were on this 2nd node:

When you say you deleted all Cassandra data, did this include the system tables? Were your steps to delete all the data and then just restart the node? Did you remove the node from the cluster prior to deleting the data and restarting it (nodetool decommission/removenode)? Did the node rejoin the cluster or did it have to bootstrap?
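To check whether a given node knows the CF named in the warning, one could pull the id out of the log message and compare it with the ids that node has in its schema. This is only a sketch: the helper name extract_cf_id is made up, and known_ids is a placeholder list that in practice would have to come from the node's own schema tables (system.schema_columnfamilies is the one mentioned in this thread).

```shell
#!/bin/sh
# Hypothetical helper: print the cfId found in an
# UnknownColumnFamilyException message, or nothing if there is none.
extract_cf_id() {
    printf '%s\n' "$1" | sed -n 's/.*cfId=\([0-9a-fA-F-]\{36\}\).*/\1/p'
}

# Exception message as reported in the thread.
log_line="org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=e36415b6-95a7-368c-9ac0-ae0ac774863d"

# Placeholder list of ids known to the local node, one per line.
# The value below is invented for illustration only.
known_ids="11111111-2222-3333-4444-555555555555"

cf_id=$(extract_cf_id "$log_line")
# -x matches whole lines only, -F treats the id as a fixed string.
if printf '%s\n' "$known_ids" | grep -qxF "$cf_id"; then
    echo "$cf_id is known locally"
else
    echo "$cf_id is missing from the local list"
fi
```

If the id turns out to be present on every node, as reported elsewhere in this thread, the warning is less likely to be a plain case of a missing table and the node's join/bootstrap state becomes the more interesting thing to look at.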
UnknownColumnFamilyException after removing all Cassandra data
I am trying to restore functionality of a cluster that got into a really bad state of schema disagreements. Right now, I am at a point where I have a single node up and I am trying to replicate data from there. I am then trying to bring up a second node, where I deleted all Cassandra data. The node comes up, then I get a bunch of:

2017-02-06 15:09:16,220 WARN [MessagingService-Incoming-/10.128.6.196] IncomingTcpConnection.java:97 - UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=e36415b6-95a7-368c-9ac0-ae0ac774863d

What does this mean exactly? Is it a symptom of a problem with the schema? Or is it just a warning that Cassandra cannot find the data on disk (since it was deleted)? In other words: is this something to worry about in this situation or is it expected behavior?

I am currently running a repair and waiting to see if the data comes back.

Thanks,

Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini