Re: Storage: upsert vs. delete + insert
IMHO, delete then insert will take twice as much disk space as a single insert, but after compaction the difference will disappear. This was true in versions prior to 2.0, and it should still work this way. But maybe someone will correct me if I'm wrong.
Cheers,
Olek

2014-09-10 18:30 GMT+02:00 Michal Budzyn :
> One insert would be much better e.g. for performance and network latency.
> I wanted to know if there is a significant difference (apart from additional
> commit log entry) in the used storage between these 2 use cases.
>
Re: Storage: upsert vs. delete + insert
I think so. This is how I see it. At the very beginning you have a line like this in the data file:
{key: [col_name, col_value, date_of_last_change]} // something similar, I don't remember the exact layout now
After a delete you add a line:
{key: [col_name, last_col_value, date_of_delete, 'd']} // the 'd' indicates that the field is deleted
After the insert the following line is added:
{key: [col_name, col_value, date_of_insert]}
So delete and then insert generates two lines in the data file. After a pure insert (an upsert, in fact) you will have only one line:
{key: [col_name, col_value, date_of_insert]}
Summarizing: in the second scenario you have only one line, in the first: two. I hope my post is correct ;)
regards,
Olek

2014-09-10 18:56 GMT+02:00 Michal Budzyn :
> Would the factor before compaction be always 2 ?
>
> On Wed, Sep 10, 2014 at 6:38 PM, olek.stas...@gmail.com
> wrote:
>>
>> IMHO, delete then insert will take two times more disk space then
>> single insert. But after compaction the difference will disappear.
>> This was true in version prior to 2.0, but it should still work this
>> way. But maybe someone will correct me, if i'm wrong.
>> Cheers,
>> Olek
>>
>> 2014-09-10 18:30 GMT+02:00 Michal Budzyn :
>> > One insert would be much better e.g. for performance and network
>> > latency.
>> > I wanted to know if there is a significant difference (apart from
>> > additional
>> > commit log entry) in the used storage between these 2 use cases.
>> >
>
>
Re: Storage: upsert vs. delete + insert
You're right, there is no data in a tombstone, only the column name, so there is only a small disk-space overhead after a delete. But I must agree with the post above: deleting prior to inserting is pointless. Moreover, it takes one extra operation to compute the resulting row.
cheers,
Olek

2014-09-10 22:18 GMT+02:00 graham sanderson :
> delete inserts a tombstone which is likely smaller than the original record
> (though still (currently) has overhead of cost for full key/column name
> the data for the insert after a delete would be identical to the data if you
> just inserted/updated
>
> no real benefit I can think of for doing the delete first.
>
> On Sep 10, 2014, at 2:25 PM, olek.stas...@gmail.com wrote:
>
>> I think so.
>> this is how i see it:
>> on the very beginning you have such line in datafile:
>> {key: [col_name, col_value, date_of_last_change]} //something similar,
>> i don't remember now
>>
>> after delete you're adding line:
>> {key:[col_name, last_col_value, date_of_delete, 'd']} //this d
>> indicates that field is deleted
>> after insert the following line is added:
>> {key: [col_name, col_value, date_of_insert]}
>> so delete and then insert generates 2 lines in datafile.
>>
>> after pure insert (upsert in fact) you will have only one line
>> {key: [col_name, col_value, date_of_insert]}
>> So, summarizing, in second scenario you have only one line, in first: two.
>> I hope my post is correct ;)
>> regards,
>> Olek
>>
>> 2014-09-10 18:56 GMT+02:00 Michal Budzyn :
>>> Would the factor before compaction be always 2 ?
>>>
>>> On Wed, Sep 10, 2014 at 6:38 PM, olek.stas...@gmail.com
>>> wrote:
>>>>
>>>> IMHO, delete then insert will take two times more disk space then
>>>> single insert. But after compaction the difference will disappear.
>>>> This was true in version prior to 2.0, but it should still work this
>>>> way. But maybe someone will correct me, if i'm wrong.
>>>> Cheers,
>>>> Olek
>>>>
>>>> 2014-09-10 18:30 GMT+02:00 Michal Budzyn :
>>>>> One insert would be much better e.g. for performance and network
>>>>> latency.
>>>>> I wanted to know if there is a significant difference (apart from
>>>>> additional
>>>>> commit log entry) in the used storage between these 2 use cases.
>>>>>
>>>
>>>
>
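For anyone who wants to see the difference on disk, a minimal sketch along the lines of the explanation above (keyspace/table names are made up, and the exact data file name will differ per node and version):

echo "CREATE KEYSPACE IF NOT EXISTS demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
CREATE TABLE IF NOT EXISTS demo.kv (k text PRIMARY KEY, v text);
DELETE FROM demo.kv WHERE k = 'a';
INSERT INTO demo.kv (k, v) VALUES ('a', 'x');   -- delete + insert
INSERT INTO demo.kv (k, v) VALUES ('b', 'x');   -- plain upsert" | cqlsh
nodetool flush demo kv
# dump the freshly flushed SSTable (actual file name will differ);
# partition 'a' should show a deletion marker plus the live cell, partition 'b' only the live cell
sstable2json /var/lib/cassandra/data/demo/kv/demo-kv-jb-1-Data.db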
Re: [RELEASE] Apache Cassandra 2.1.0
When I upgraded my system from 1.2.x to 2.0.x there were simple hint: never upgrade before target release does not have at least 5 on third place. versions before x.x.5 are unstable and aren't ready for production use. I don't know if it's still true, but be careful ;) Regards Olek 2014-09-17 20:14 GMT+02:00 abhinav chowdary : > > it depends on how you installed it, package vs tar ball etc , Datastax has > good documentation i suggest reading it > http://www.datastax.com/documentation/upgrade/doc/upgrade/cassandra/upgradeCassandraDetails.html > > in either case at a high level > > Step 1: Stop node > Step 2: Backup your config files > Step 3: will be to replace new binaries > Step 4: will be to run nodetool upgradesstables > > Again there are some gotchas so i recommend reading above documentation > thoroughly before proceeding > > > > On Wed, Sep 17, 2014 at 8:04 AM, Yatong Zhang wrote: >> >> Well, how to upgrade from 2.0.x to 2.1? Just replace cassandra bin files? >> >> On Wed, Sep 17, 2014 at 3:52 PM, Alex Popescu wrote: >>> >>> Apologies for the late reply: the 2.1.x version of the C#, Java and >>> Python DataStax drivers support the new Cassandra 2.1 version. >>> >>> Here's the quick list of links: >>> >>> C#: >>> >>> Latest version: 2.1.2 >>> Nuget: https://www.nuget.org/packages/CassandraCSharpDriver/ >>> >>> Java: >>> >>> Latest version: 2.1.1 >>> Maven: http://maven-repository.com/artifact/com.datastax.cassandra >>> Binary distro: http://www.datastax.com/download#dl-datastax-drivers >>> >>> Python: >>> >>> Latest version: 2.1.1 >>> PyPi: https://pypi.python.org/pypi/cassandra-driver >>> >>> >>> On Thu, Sep 11, 2014 at 10:58 AM, Tony Anecito >>> wrote: Ok is it part of the release or needs to be downloaded from Datastax somewhere. I am wondering about the java driver. Thanks! -Tony On Thursday, September 11, 2014 9:47 AM, abhinav chowdary wrote: Yes its was released java driver 2.1 On Sep 11, 2014 8:33 AM, "Tony Anecito" wrote: Congrads team I know you worked hard on it!! One question. Where can users get a java Datastax driver to support this version? If so is it released? Best Regards, -Tony Anecito Founder/President MyUniPortal LLC http://www.myuniportal.com On Thursday, September 11, 2014 9:05 AM, Sylvain Lebresne wrote: The Cassandra team is pleased to announce the release of the final version of Apache Cassandra 2.1.0. Cassandra 2.1.0 brings a number of new features and improvements including (but not limited to): - Improved support of Windows. - A new incremental repair option[4, 5] - A better row cache that can cache only the head of partitions[6] - Off-heap memtables[7] - Numerous performance improvements[8, 9] - CQL improvements and additions: User-defined types, tuple types, 2ndary indexing of collections, ...[10] - An improved stress tool[11] Please refer to the release notes[1] and changelog[2] for details. Both source and binary distributions of Cassandra 2.1.0 can be downloaded at: http://cassandra.apache.org/download/ As usual, a debian package is available from the project APT repository[3] (you will need to use the 21x series). The Cassandra team [1]: http://goo.gl/k4eM39 (CHANGES.txt) [2]: http://goo.gl/npCsro (NEWS.txt) [3]: http://wiki.apache.org/cassandra/DebianPackaging [4]: http://goo.gl/MjohJp [5]: http://goo.gl/f8jSme [6]: http://goo.gl/6TJPH6 [7]: http://goo.gl/YT7znJ [8]: http://goo.gl/Rg3tdA [9]: http://goo.gl/JfDBGW [10]: http://goo.gl/kQl7GW [11]: http://goo.gl/OTNqiQ >>> >>> >>> >>> -- >>> >>> :- a) >>> >>> >>> Alex Popescu >>> Sen. 
Product Manager @ DataStax >>> @al3xandru >> >> > > > > -- > Warm Regards > Abhinav Chowdary
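For reference, the four steps above translate to something like the following on each node in turn. This is only a sketch assuming a package-based install (the package name and paths are illustrative); the DataStax guide linked above remains the authoritative list of gotchas:

nodetool drain                                        # flush memtables, stop accepting writes
sudo service cassandra stop
sudo cp -a /etc/cassandra /etc/cassandra.bak-$(date +%F)   # back up config files
sudo yum install cassandra20                          # or unpack the new tarball, depending on install method
# merge your old settings into the new cassandra.yaml before starting
sudo service cassandra start
nodetool upgradesstables                              # rewrite SSTables into the new on-disk format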
OOM while reading key cache
Hello, I'm facing an OOM while reading the key cache. The cluster configuration is as follows:
- 6 machines with 8 GB RAM each and three 150 GB disks each
- default heap configuration
- default key cache configuration
- the biggest keyspace is about 500 GB (RF: 2, so in fact there is 250 GB of raw data)
After upgrading the first of the machines from 1.2.11 to 2.0.2 I received this error:
INFO [main] 2013-11-08 10:53:16,716 AutoSavingCache.java (line 114) reading saved cache /home/synat/nosql_filesystem/cassandra/data/saved_caches/production_storage-METADATA-KeyCache-b.db
ERROR [main] 2013-11-08 10:53:16,895 CassandraDaemon.java (line 478) Exception encountered during startup
java.lang.OutOfMemoryError: Java heap space
at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:394)
at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
at org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:352)
at org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:119)
at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:264)
at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:409)
at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:381)
at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:314)
at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:268)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:110)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:88)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:274)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:461)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:504)
The error appears on every start, so I decided to disable the key cache (this was not helpful) and temporarily moved the key cache out of the saved caches folder (the file was about 13 MB). That lets the node start, but it's only a workaround and not the desired configuration. Does anyone have any idea what the real cause of the OOM is?
best regards
Aleksander
ps. I still have 5 nodes to upgrade; I'll let you know if the problem appears on the rest.
Re: OOM while reading key cache
Yes, as I wrote in the first e-mail: when I removed the key cache file, Cassandra started without further problems.
regards
Olek

2013/11/13 Robert Coli :
> On Wed, Nov 13, 2013 at 12:35 AM, Tom van den Berge
> wrote:
>>
>> I'm having the same problem, after upgrading from 1.2.3 to 1.2.10.
>>
>> I can remember this was a bug that was solved in the 1.0 or 1.1 version
>> some time ago, but apparently it got back.
>> A workaround is to delete the contents of the saved_caches directory
>> before starting up.
>
>
> Yours is not the first report of this I've heard resulting from a 1.2.x to
> 1.2.x upgrade. Reports are of the form "I had to nuke my saved_caches or <couldn't start my node, it OOMED, etc.>".
>
> https://issues.apache.org/jira/browse/CASSANDRA-6325
>
> Exists, but doesn't seem to be the same issue.
>
> https://issues.apache.org/jira/browse/CASSANDRA-5986
>
> Similar, doesn't seem to be an issue triggered by upgrade..
>
> If I were one of the posters on this thread, I would strongly consider
> filing a JIRA on point.
>
> @OP (olek) : did removing the saved_caches also fix your problem?
>
> =Rob
>
>
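For anyone hitting this later, the workaround described above amounts to something like this (paths assume the default saved_caches_directory from cassandra.yaml):

sudo service cassandra stop
mkdir -p /tmp/saved_caches_backup
# saved_caches_directory usually defaults to /var/lib/cassandra/saved_caches
mv /var/lib/cassandra/saved_caches/*KeyCache* /tmp/saved_caches_backup/
sudo service cassandra start    # the key cache is rebuilt as the node runs and re-saved periodically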
Risk of not doing repair
Hello, I'm facing bug https://issues.apache.org/jira/browse/CASSANDRA-6277. After migrating to 2.0.2 I can't perform repair on my cluster (six nodes); repair on the biggest CF breaks with the error described in the Jira issue. I know that a fix is probably in the repository, but it's not included in any release, and I estimate that 2.0.3 with this fix will be released in December. If it's not really necessary, I would avoid building an unstable version of Cassandra from sources and installing it in the production environment; I would rather use the rpm-based distribution to keep the system in a consistent state. So this is my question: what is the risk of not doing repair for a month, assuming that gc_grace is 10 days? Should I really worry? Or maybe I should use the repository version of Cassandra?
best regards
Olek
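For what it's worth, the main risk of skipping repair past gc_grace is that a delete which never reached some replica can come back after its tombstone is purged. The usual way to buy time until 2.0.3 is to raise gc_grace_seconds on the affected tables; a sketch, with made-up keyspace/table names:

# raise gc_grace_seconds from 10 days to e.g. 30 days so tombstones outlive the repair gap
echo "ALTER TABLE mykeyspace.mytable WITH gc_grace_seconds = 2592000;" | cqlsh
# once 2.0.3 lands and repair works again:
nodetool repair -pr mykeyspace mytable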
Re: Data tombstoned during bulk loading 1.2.10 -> 2.0.3
Hi All, we've seen a very similar effect after upgrading from 1.1.7 to 2.0 (via 1.2.10). Probably after upgradesstables (but that's only a guess, because we noticed the problem a few weeks later) some rows became tombstoned: they just disappear from query results. After investigation I noticed that they are still reachable via sstable2json. Example output for a "non-existent" row:
{"key": "6e6e37716c6d665f6f61695f6463","metadata": {"deletionInfo": {"markedForDeleteAt":2201170739199,"localDeletionTime":0}},"columns": [["DATA","3c6f61695f64633a64(...)",1357677928108]]}
]
If I understand correctly, the row is marked as deleted with a timestamp far in the future, but it's still on disk. Also, localDeletionTime is set to 0, which may mean that it's some kind of internal bug rather than the effect of a client error. So my questions are: is it true that upgradesstables may do something like that? How can we find the reason for such strange Cassandra behaviour? Is there any way to recover such strangely marked rows? This problem touches about 500K rows out of 14M in our database, so the percentage is quite big.
best regards
Aleksander

2013-12-12 Robert Coli :
> On Wed, Dec 11, 2013 at 6:27 AM, Mathijs Vogelzang
> wrote:
>>
>> When I use sstable2json on the sstable on the destination cluster, it has
>> "metadata": {"deletionInfo":
>> {"markedForDeleteAt":1796952039620607,"localDeletionTime":0}}, whereas
>> it doesn't have that in the source sstable.
>> (Yes, this is a timestamp far into the future. All our hosts are
>> properly synced through ntp).
>
>
> This seems like a bug in sstableloader, I would report it on JIRA.
>
>>
>> Naturally, copying the data again doesn't work to fix it, as the
>> tombstone is far in the future. Apart from not having this happen at
>> all, how can it be fixed?
>
>
> Briefly, you'll want to purge that tombstone and then reload the data with a
> reasonable timestamp.
>
> Dealing with rows with data (and tombstones) in the far future is described
> in detail here :
>
> http://thelastpickle.com/blog/2011/12/15/Anatomy-of-a-Cassandra-Partition.html
>
> =Rob
>
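A rough way to measure how widespread the far-future tombstones are is to dump an SSTable and flag rows whose markedForDeleteAt lies in the future. Just a sketch, with made-up paths, assuming write timestamps are microseconds since the epoch:

sstable2json /var/lib/cassandra/data/mykeyspace/mycf/mykeyspace-mycf-jb-1-Data.db > /tmp/dump.json
python - /tmp/dump.json <<'PYEOF'
import json, sys, time
now_us = time.time() * 1e6   # cutoff: anything "deleted" after now is suspicious
for row in json.load(open(sys.argv[1])):
    info = row.get("metadata", {}).get("deletionInfo", {})
    if info.get("markedForDeleteAt", 0) > now_us:
        print("%s markedForDeleteAt=%s" % (row["key"], info["markedForDeleteAt"]))
PYEOF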
Re: Data tombstoned during bulk loading 1.2.10 -> 2.0.3
Ok, but will upgrade "resurrect" my data? Or maybe I should perform additional action to bring my system to correct state? best regards Aleksander 3 lut 2014 17:08 "Yuki Morishita" napisał(a): > if you are using < 2.0.4, then you are hitting > https://issues.apache.org/jira/browse/CASSANDRA-6527 > > > On Mon, Feb 3, 2014 at 2:51 AM, olek.stas...@gmail.com > wrote: > > Hi All, > > We've faced very similar effect after upgrade from 1.1.7 to 2.0 (via > > 1.2.10). Probably after upgradesstable (but it's only a guess, > > because we noticed problem few weeks later), some rows became > > tombstoned. They just disappear from results of queries. After > > inverstigation I've noticed, that they are reachable via sstable2json. > > Example output for "non-existent" row: > > > > {"key": "6e6e37716c6d665f6f61695f6463","metadata": {"deletionInfo": > > {"markedForDeleteAt":2201170739199,"localDeletionTime":0}},"columns": > > [["DATA","3c6f61695f64633a64(...)",1357677928108]]} > > ] > > > > If I understand correctly row is marked as deleted with timestamp in > > the far future, but it's still on the disk. Also localDeletionTime is > > set to 0, which may means, that it's kind of internal bug, not effect > > of client error. So my question is: is it true, that upgradesstable > > may do soemthing like that? How to find reasons for such strange > > cassandra behaviour? Is there any option of recovering such strange > > marked nodes? > > This problem touches about 500K rows of all 14M in our database, so > > the percentage is quite big. > > best regards > > Aleksander > > > > 2013-12-12 Robert Coli : > >> On Wed, Dec 11, 2013 at 6:27 AM, Mathijs Vogelzang < > math...@apptornado.com> > >> wrote: > >>> > >>> When I use sstable2json on the sstable on the destination cluster, it > has > >>> "metadata": {"deletionInfo": > >>> {"markedForDeleteAt":1796952039620607,"localDeletionTime":0}}, whereas > >>> it doesn't have that in the source sstable. > >>> (Yes, this is a timestamp far into the future. All our hosts are > >>> properly synced through ntp). > >> > >> > >> This seems like a bug in sstableloader, I would report it on JIRA. > >> > >>> > >>> Naturally, copying the data again doesn't work to fix it, as the > >>> tombstone is far in the future. Apart from not having this happen at > >>> all, how can it be fixed? > >> > >> > >> Briefly, you'll want to purge that tombstone and then reload the data > with a > >> reasonable timestamp. > >> > >> Dealing with rows with data (and tombstones) in the far future is > described > >> in detail here : > >> > >> > http://thelastpickle.com/blog/2011/12/15/Anatomy-of-a-Cassandra-Partition.html > >> > >> =Rob > >> > > > > -- > Yuki Morishita > t:yukim (http://twitter.com/yukim) >
Re: Data tombstoned during bulk loading 1.2.10 -> 2.0.3
Yes, I haven't run sstableloader. The data loss appeared somewhere along this timeline:
1.1.7 -> 1.2.10 -> upgradesstables -> 2.0.2 -> normal operations -> 2.0.3 -> normal operations -> now
Today I noticed that the oldest files with broken values appeared during repair (we run repair once a week on each node). Maybe it's the repair operation that caused the data loss? I've no idea. Currently our cluster is running version 2.0.3. We can run some tests on the data to give you all the information needed to track the bug. But our most crucial question is: can we recover the lost rows, or should we start thinking about how to re-gather them?
best regards
Aleksander
ps. I like your link Rob, I'll pin it over my desk ;) In Oracle there was a rule: never deploy an RDBMS before release 2 ;)

2014-02-03 Robert Coli :
> On Mon, Feb 3, 2014 at 12:51 AM, olek.stas...@gmail.com
> wrote:
>>
>> We've faced very similar effect after upgrade from 1.1.7 to 2.0 (via
>> 1.2.10). Probably after upgradesstable (but it's only a guess,
>> because we noticed problem few weeks later), some rows became
>> tombstoned.
>
>
> To be clear, you didn't run SSTableloader at all? If so, this is the
> hypothetical case where normal streaming operations (replacing a node? what
> streaming did you do?) results in data loss...
>
> Also, CASSANDRA-6527 is a good reminder regarding the following :
>
> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>
> =Rob
Re: Data tombstoned during bulk loading 1.2.10 -> 2.0.3
2014-02-03 Robert Coli : > On Mon, Feb 3, 2014 at 1:02 PM, olek.stas...@gmail.com > wrote: >> >> Today I've noticed that oldest files with broken values appear during >> repair (we do repair once a week on each node). Maybe it's the repair >> operation, which caused data loss? > > > Yes, unless you added or removed or replaced nodes, it would have to be the > repair operation, which streams SSTables. Did you run the repair during the > upgradesstables? No, i've done repair after upgrade sstables. In fact it was about 4 weeks after, because of bug: https://issues.apache.org/jira/browse/CASSANDRA-6277. We upgrded cass to 2.0.2 and then after ca 1 month to 2.0.3 because of 6277. Then we were able to do repair, so I set up cron to do it weekly on each node. (it was about 10 dec 2013) the loss was discovered about new year's eve. > >> >> I've no idea. Currently our cluster >> is runing 2.0.3 version. > > > 2.0.3 has serious bugs, upgrade to 2.0.4 ASAP. OK > >> >> But our most crucial question is: can we recover loss, or should we >> start to think how to re-gather them? > > > If I were you, I would do the latter. You can to some extent recover them > via manual processes dumping with sstable2json and so forth, but it will be > quite painful. > > http://thelastpickle.com/2011/12/15/Anatomy-of-a-Cassandra-Partition/ > > Contains an explanation of how one could deal with it. Sorry, but I have to admit, that i can't transfer this solution to my problem. Could you briefly describe steps I should perform to recover? best regards Aleksander > > =Rob > > > >> >> best regards >> Aleksander >> ps. I like your link Rob, i'll pin it over my desk ;) In Oracle there >> were a rule: never deploy RDBMS before release 2 ;) >> >> 2014-02-03 Robert Coli : >> > On Mon, Feb 3, 2014 at 12:51 AM, olek.stas...@gmail.com >> > wrote: >> >> >> >> We've faced very similar effect after upgrade from 1.1.7 to 2.0 (via >> >> 1.2.10). Probably after upgradesstable (but it's only a guess, >> >> because we noticed problem few weeks later), some rows became >> >> tombstoned. >> > >> > >> > To be clear, you didn't run SSTableloader at all? If so, this is the >> > hypothetical case where normal streaming operations (replacing a node? >> > what >> > streaming did you do?) results in data loss... >> > >> > Also, CASSANDRA-6527 is a good reminder regarding the following : >> > >> > >> > https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ >> > >> > =Rob > >
Re: Data tombstoned during bulk loading 1.2.10 -> 2.0.3
I don't know what is the real cause of my problem. We are still guessing. All operations I have done one cluster are described on timeline: 1.1.7-> 1.2.10 -> upgradesstable -> 2.0.2 -> normal operations ->2.0.3 -> normal operations -> now normal operations means reads/writes/repairs. Could you please, describe briefly how to recover data? I have a problem with scenario described under link: http://thelastpickle.com/blog/2011/12/15/Anatomy-of-a-Cassandra-Partition.html , I can't apply this solution to my case. regards Olek 2014-02-03 Robert Coli : > On Mon, Feb 3, 2014 at 2:17 PM, olek.stas...@gmail.com > wrote: >> >> No, i've done repair after upgrade sstables. In fact it was about 4 >> weeks after, because of bug: > > > If you only did a repair after you upgraded SSTables, when did you have an > opportunity to hit : > > https://issues.apache.org/jira/browse/CASSANDRA-6527 > > ... which relies on you having multiple versions of SStables while > streaming? > > Did you do any operation which involves streaming? (Add/Remove/Replace a > node?) > > =Rob >
Re: Data tombstoned during bulk loading 1.2.10 -> 2.0.3
Seems good. I'll discus it with data owners and we choose the best method. Best regards, Aleksander 4 lut 2014 19:40 "Robert Coli" napisał(a): > On Tue, Feb 4, 2014 at 12:21 AM, olek.stas...@gmail.com < > olek.stas...@gmail.com> wrote: > >> I don't know what is the real cause of my problem. We are still guessing. >> All operations I have done one cluster are described on timeline: >> 1.1.7-> 1.2.10 -> upgradesstable -> 2.0.2 -> normal operations ->2.0.3 >> -> normal operations -> now >> normal operations means reads/writes/repairs. >> Could you please, describe briefly how to recover data? I have a >> problem with scenario described under link: >> >> http://thelastpickle.com/blog/2011/12/15/Anatomy-of-a-Cassandra-Partition.html, >> I can't apply this solution to my case. >> > > I think your only option is the following : > > 1) determine which SSTables contain rows have doomstones (tombstones from > the far future) > 2) determine whether these tombstones mask a live or dead version of the > row, by looking at other row fragments > 3) dump/filter/re-write all your data via some method, probably > sstable2json/json2sstable > 4) load the corrected sstables by starting a node with the sstables in the > data directory > > I understand you have a lot of data, but I am pretty sure there is no way > for you to fix it within Cassandra. Perhaps ask for advice on the JIRA > ticket mentioned upthread if this answer is not sufficient? > > =Rob > >
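For the record, the dump/filter/reload route from the steps quoted above could look roughly like this per affected SSTable. Keyspace/CF names and file names are examples, the filter simply drops the bogus deletion marker (i.e. it assumes the masked rows should be live, which is step 2 of the procedure and needs checking case by case), and everything should be tried on a copy first:

# 1) dump the affected SSTable
sstable2json mykeyspace-mycf-jb-123-Data.db > dump.json
# 2) strip the far-future deletionInfo from each row
python - dump.json > fixed.json <<'PYEOF'
import json, sys
rows = json.load(open(sys.argv[1]))
for row in rows:
    row.get("metadata", {}).pop("deletionInfo", None)   # drop the far-future deletion marker
print(json.dumps(rows))
PYEOF
# 3) rebuild an SSTable from the corrected JSON
json2sstable -K mykeyspace -c mycf fixed.json mykeyspace-mycf-jb-1-Data.db
# 4) with the node stopped, place the rebuilt file in the data directory in place of the bad one, start, repair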
Problems with adding datacenter and schema version disagreement
Hi All, I've hit an issue with Cassandra 2.0.5. I have a 6-node cluster with the random partitioner, still using single tokens instead of vnodes. Because we're changing hardware, we decided to migrate the cluster to 6 new machines and switch from token-based partitioning to vnodes. I followed the instructions at
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html
and started Cassandra on the 6 new nodes in the new DC. Everything seemed to work correctly; the nodes were seen by all the others as up and normal. Then I ran nodetool repair -pr on the first of the new nodes, but the process fell into an infinite loop, sending/receiving merkle trees over and over. It hung on one very small KS and there was no hope it would ever stop (the process ran the whole night). So I decided to stop the repair and restart Cassandra on that particular new node. After the restart I tried the repair once more with another small KS, but it also fell into an infinite loop. So I decided to abort the procedure of adding the datacenter, remove the nodes from the new DC and start again from scratch. After running removenode for all the new nodes, I wiped the data dir and started Cassandra on the new node once again. During startup, messages like
"org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=98bb99a2-42f2-3fcd-af67-208a4faae5fa"
appear in the logs. Google said they may indicate problems with schema version consistency, so I ran describe cluster in cassandra-cli and got:
Cluster Information:
   Name: Metadata Cluster
   Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        76198f8b-663f-3434-8860-251ebc6f50c4: [150.254.164.4]
        f48d3512-e299-3508-a29d-0844a0293f3a: [150.254.164.3]
        16ad2e35-1eef-32f0-995c-e2cbd4c18abf: [150.254.164.6]
        72352017-9b0d-3b29-8c55-ed86f30363c5: [150.254.164.1]
        7f1faa84-0821-3311-9232-9407500591cc: [150.254.164.5]
        85cd0ebc-5d33-3bec-a682-8c5880ee2fa1: [150.254.164.2]
So now I have 6 different schema versions in the cluster. How could that happen? How can I bring my cluster back to a consistent state? What did I do wrong while extending the cluster, so that nodetool fell into an infinite loop? At first sight the data looks OK: I can read from the cluster and I'm getting the expected output.
best regards
Aleksander
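As a side note, the same schema-version check can be done with nodetool instead of cassandra-cli:

# a healthy cluster reports exactly one schema version with all endpoints listed under it
nodetool describecluster
# per-endpoint gossip state (including the schema UUID each node advertises)
nodetool gossipinfo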
Re: Problems with adding datacenter and schema version disagreement
I plan to install 2.0.6 as soon as it will be available in datastax rpm repo. But how to deal with schema inconsistency on such scale? best regards Aleksander 2014-03-11 13:40 GMT+01:00 Duncan Sands : > Hi Aleksander, this may be related to CASSANDRA-6799 and CASSANDRA-6700 (if > it is caused by CASSANDRA-6700 then you are in luck: it is fixed in 2.0.6). > > Best wishes, Duncan. > > > On 11/03/14 13:30, olek.stas...@gmail.com wrote: >> >> Hi All, >> I've faced an issue with cassandra 2.0.5. >> I've 6 node cluster with random partitioner, still using tokens >> instead of vnodes. >> Cause we're changing hardware we decide to migrate cluster to 6 new >> machines and change partitioning options to vnode rather then >> token-based. >> I've followed instruction on site: >> >> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html >> and started cassandra on 6 new nodes in new DC. Everything seems to >> work correctly, nodes were seen from all others as up and normal. >> Then i performed nodetool repair -pr on the first of new nodes. >> But process falls into infinite loop, sending/receiving merkle trees >> over and over. It hangs on one very small KS it there were no hope it >> will stop sometime (process was running whole night). >> So I decided to stop the repair and restart cass on this particular >> new node. after restart 'Ive tried repair one more time with another >> small KS, but it also falls into infinite loop. >> So i decided to break the procedure of adding datacenter, remove nodes >> from new DC and start all from scratch. >> After running removenode on all new nodes I've wiped data dir and >> start cassandra on new node once again. During the start messages >> "org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find >> cfId=98bb99a2-42f2-3fcd-af67-208a4faae5fa" >> appears in logs. Google said, that they may mean problems with schema >> versions consistency, so I performed describe cluster in cassandra-cli >> and i get: >> Cluster Information: >> Name: Metadata Cluster >> Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch >> Partitioner: org.apache.cassandra.dht.RandomPartitioner >> Schema versions: >> 76198f8b-663f-3434-8860-251ebc6f50c4: [150.254.164.4] >> >> f48d3512-e299-3508-a29d-0844a0293f3a: [150.254.164.3] >> >> 16ad2e35-1eef-32f0-995c-e2cbd4c18abf: [150.254.164.6] >> >> 72352017-9b0d-3b29-8c55-ed86f30363c5: [150.254.164.1] >> >> 7f1faa84-0821-3311-9232-9407500591cc: [150.254.164.5] >> >> 85cd0ebc-5d33-3bec-a682-8c5880ee2fa1: [150.254.164.2] >> >> So now I have 6 diff schema version for cluster. But how it can >> happened? How can I take my cluster to consistent state? >> What did I wrong during extending cluster, so nodetool falls into infinite >> loop? >> At the first sight data looks ok, I can read from cluster and I'm >> getting expected output. >> best regards >> Aleksander >> >
Re: Problems with adding datacenter and schema version disagreement
Didn't help :) thanks and regards Aleksander 2014-03-11 14:14 GMT+01:00 Duncan Sands : > On 11/03/14 14:00, olek.stas...@gmail.com wrote: >> >> I plan to install 2.0.6 as soon as it will be available in datastax rpm >> repo. >> But how to deal with schema inconsistency on such scale? > > > Does it get better if you restart all the nodes? In my case restarting just > some of the nodes didn't help, but restarting all nodes did. > > Ciao, Duncan.
Re: Problems with adding datacenter and schema version disagreement
Bump, are there any solutions to bring my cluster back to schema consistency? I have a 6-node cluster with exactly six versions of the schema; how do I deal with it?
regards
Aleksander

2014-03-11 14:36 GMT+01:00 olek.stas...@gmail.com :
> Didn't help :)
> thanks and regards
> Aleksander
>
> 2014-03-11 14:14 GMT+01:00 Duncan Sands :
>> On 11/03/14 14:00, olek.stas...@gmail.com wrote:
>>>
>>> I plan to install 2.0.6 as soon as it will be available in datastax rpm
>>> repo.
>>> But how to deal with schema inconsistency on such scale?
>>
>>
>> Does it get better if you restart all the nodes? In my case restarting just
>> some of the nodes didn't help, but restarting all nodes did.
>>
>> Ciao, Duncan.
Re: Problems with adding datacenter and schema version disagreement
Huh, you mean json dump? Regards Aleksander 2014-03-13 18:59 GMT+01:00 Robert Coli : > On Thu, Mar 13, 2014 at 2:05 AM, olek.stas...@gmail.com > wrote: >> >> Bump, are there any solutions to bring my cluster back to schema >> consistency? >> I've 6 node cluster with exactly six versions of schema, how to deal with >> it? > > > The simplest way, which is most likely to actually work, is to down all > nodes, nuke schema, and reload it from a dump. > > =Rob >
Re: Problems with adding datacenter and schema version disagreement
OK, I see, so the data files stay in place; I just have to stop Cassandra on the whole cluster, remove the system schema, then start the cluster and recreate all keyspaces with all column families? The data will then be loaded automatically from the existing sstables, right? One more question: what about the system_traces KS? Should it be removed and recreated? What data does it hold?
best regards
Aleksander

2014-03-14 0:14 GMT+01:00 Robert Coli :
> On Thu, Mar 13, 2014 at 1:20 PM, olek.stas...@gmail.com
> wrote:
>>
>> Huh,
>> you mean json dump?
>
>
> If you're using cassandra-cli, I mean the output of "show schema;"
>
> If you're using CQLsh, there is an analogous way to show all schema.
>
> 1) dump schema to a file via one of the above tools
> 2) stop cassandra and nuke system keyspaces everywhere
> 3) start cassandra, coalesce cluster
> 4) load schema
>
> =Rob
>
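A sketch of that procedure as I understand it (paths assume the default data directory; a common variant is to remove only the schema_* column families under the system keyspace rather than the whole thing, which keeps the node's token metadata intact):

# 1) dump the full schema to a file (on any one node)
echo "show schema;" | cassandra-cli -h localhost > schema-backup.cli
# 2) on every node: stop cassandra and remove the schema tables of the system keyspace
sudo service cassandra stop
rm -rf /var/lib/cassandra/data/system/schema_keyspaces \
       /var/lib/cassandra/data/system/schema_columnfamilies \
       /var/lib/cassandra/data/system/schema_columns
# 3) start all nodes again and wait until the ring settles
sudo service cassandra start
# 4) replay the schema dump on a single node; data is picked up from the existing SSTables
cassandra-cli -h localhost -f schema-backup.cli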
Re: Problems with adding datacenter and schema version disagreement
Ok, I'll do this during the weekend, I'll give you a feedback on Monday. Regards Aleksander 14 mar 2014 18:15 "Robert Coli" napisał(a): > On Fri, Mar 14, 2014 at 12:40 AM, olek.stas...@gmail.com < > olek.stas...@gmail.com> wrote: > >> OK, I see, so the data files stay in place, i have to just stop >> cassandra on whole cluster, remove system schema and then start >> cluster and recreate all keyspaces with all column families? Data will >> be than loaded automatically from existing ssstables, right? >> > > Right. If you have clients reading while loading the schema, they may get > exceptions. > > >> So one more question: what about KS system_traces? should it be >> removed and recreted? What data it's holding? >> > > It's holding data about tracing, a profiling feature. It's safe to nuke. > > =Rob > >
Re: Problems with adding datacenter and schema version disagreement
Ok, I've dropped all the system keyspaces, rebuilt the cluster and recovered the schema; now everything looks OK. But the main goal of the operation was to add a new datacenter to the cluster. After starting the node in the new DC, two schema versions appear: one version is held by the 6 nodes of the first datacenter, the second one by the newly added node in the new datacenter. Something like this:
nodetool status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load      Tokens  Owns  Host ID                               Rack
UN  192.168.1.1  50.19 GB  1       0,5%  c9323f38-d9c4-4a69-96e3-76cd4e1a204e  rack1
UN  192.168.1.2  54.83 GB  1       0,3%  ad1de2a9-2149-4f4a-aec6-5087d9d3acbb  rack1
UN  192.168.1.3  51.14 GB  1       0,6%  0ceef523-93fe-4684-ba4b-4383106fe3d1  rack1
UN  192.168.1.4  54.31 GB  1       0,7%  39d15471-456d-44da-bdc8-221f3c212c78  rack1
UN  192.168.1.5  53.36 GB  1       0,3%  7fed25a5-e018-43df-b234-47c2f118879b  rack1
UN  192.168.1.6  39.89 GB  1       0,1%  9f54fad6-949a-4fa9-80da-87efd62f3260  rack1
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens  Owns   Host ID                               Rack
UN  192.168.1.7  100.77 KB  256     97,4%  ddb1f913-d075-4840-9665-3ba64eda0558  RAC1

describe cluster;
Cluster Information:
   Name: Metadata Cluster
   Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        8fe34841-4f2a-3c05-97f2-15dd413d71dc: [192.168.1.7]
        4ad381b6-df5a-3cbc-ba5a-0234b74d2383: [192.168.1.1, 192.168.1.2, 192.168.1.3, 192.168.1.4, 192.168.1.5, 192.168.1.6]

All keyspaces are currently configured to keep data in datacenter1 only. I assume this is not correct behaviour, is it? Could you help me figure out how to safely add the new DC to the cluster?

Regards
Aleksander

2014-03-14 18:28 GMT+01:00 olek.stas...@gmail.com :
> Ok, I'll do this during the weekend, I'll give you a feedback on Monday.
> Regards
> Aleksander
>
> 14 Mar 2014 18:15 "Robert Coli" wrote:
>
>> On Fri, Mar 14, 2014 at 12:40 AM, olek.stas...@gmail.com
>> wrote:
>>>
>>> OK, I see, so the data files stay in place, i have to just stop
>>> cassandra on whole cluster, remove system schema and then start
>>> cluster and recreate all keyspaces with all column families? Data will
>>> be than loaded automatically from existing ssstables, right?
>>
>>
>> Right. If you have clients reading while loading the schema, they may get
>> exceptions.
>>
>>>
>>> So one more question: what about KS system_traces? should it be
>>> removed and recreted? What data it's holding?
>>
>>
>> It's holding data about tracing, a profiling feature. It's safe to nuke.
>>
>> =Rob
>>
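In case it helps others reading the archive: the add-datacenter procedure in the DataStax document linked earlier also requires switching each keyspace to NetworkTopologyStrategy with a replica count for the new DC, and then populating the new node with nodetool rebuild; otherwise the keyspaces keep replicating only to datacenter1. A sketch (keyspace name and replica counts are examples):

echo "ALTER KEYSPACE mykeyspace WITH replication =
  {'class': 'NetworkTopologyStrategy', 'datacenter1': 2, 'DC1': 2};" | cqlsh
# repeat for each user keyspace (and system_auth, if used), then on the new node in DC1:
nodetool rebuild -- datacenter1
# re-check that all nodes converge on a single schema version:
nodetool describecluster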
Re: Problems with adding datacenter and schema version disagreement
Oh, one more question: what should be configuration for storing system_traces keyspace? Should it be replicated or stored locally? Regards Olek 2014-03-18 16:47 GMT+01:00 olek.stas...@gmail.com : > Ok, i've dropped all system keyspaces, rebuild cluster and recover > schema, now everything looks ok. > But main goal of operations was to add new datacenter to cluster. > After starting node in new cluster two schema versions appear, one > version is held by 6 nodes of first datacenter, second one is in newly > added node in new datacenter. Sth like this: > nodetool status > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- AddressLoad Tokens Owns Host ID > Rack > UN 192.168.1.1 50.19 GB 1 0,5% > c9323f38-d9c4-4a69-96e3-76cd4e1a204e rack1 > UN 192.168.1.2 54.83 GB 1 0,3% > ad1de2a9-2149-4f4a-aec6-5087d9d3acbb rack1 > UN 192.168.1.3 51.14 GB 1 0,6% > 0ceef523-93fe-4684-ba4b-4383106fe3d1 rack1 > UN 192.168.1.4 54.31 GB 1 0,7% > 39d15471-456d-44da-bdc8-221f3c212c78 rack1 > UN 192.168.1.5 53.36 GB 1 0,3% > 7fed25a5-e018-43df-b234-47c2f118879b rack1 > UN 192.168.1.6 39.89 GB 1 0,1% > 9f54fad6-949a-4fa9-80da-87efd62f3260 rack1 > Datacenter: DC1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- AddressLoad Tokens Owns Host ID > Rack > UN 192.168.1.7 100.77 KB 256 97,4% > ddb1f913-d075-4840-9665-3ba64eda0558 RAC1 > > describe cluster; > Cluster Information: >Name: Metadata Cluster >Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch >Partitioner: org.apache.cassandra.dht.RandomPartitioner >Schema versions: > 8fe34841-4f2a-3c05-97f2-15dd413d71dc: [192.168.1.7] > > 4ad381b6-df5a-3cbc-ba5a-0234b74d2383: [192.168.1.1, 192.168.1.2, > 192.168.1.3, 192.168.1.4, 192.168.1.5, 192.168.1.6] > > All keyspaces are now configured to keep data in datacenter1. > I assume, that It's not correct behaviour, is it true? > Could you help me, how can I safely add new DC to the cluster? > > Regards > Aleksander > > > 2014-03-14 18:28 GMT+01:00 olek.stas...@gmail.com : >> Ok, I'll do this during the weekend, I'll give you a feedback on Monday. >> Regards >> Aleksander >> >> 14 mar 2014 18:15 "Robert Coli" napisał(a): >> >>> On Fri, Mar 14, 2014 at 12:40 AM, olek.stas...@gmail.com >>> wrote: >>>> >>>> OK, I see, so the data files stay in place, i have to just stop >>>> cassandra on whole cluster, remove system schema and then start >>>> cluster and recreate all keyspaces with all column families? Data will >>>> be than loaded automatically from existing ssstables, right? >>> >>> >>> Right. If you have clients reading while loading the schema, they may get >>> exceptions. >>> >>>> >>>> So one more question: what about KS system_traces? should it be >>>> removed and recreted? What data it's holding? >>> >>> >>> It's holding data about tracing, a profiling feature. It's safe to nuke. >>> >>> =Rob >>>
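To see what system_traces currently uses, one can simply describe it (by default it is a small keyspace holding only query-tracing data, so it is generally safe to leave it as created):

echo "DESCRIBE KEYSPACE system_traces;" | cqlsh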
Re: Problems with adding datacenter and schema version disagreement
Bump, could anyone comment this behaviour, is it correct, or should I create Jira task for this problems? regards Olek 2014-03-18 16:49 GMT+01:00 olek.stas...@gmail.com : > Oh, one more question: what should be configuration for storing > system_traces keyspace? Should it be replicated or stored locally? > Regards > Olek > > 2014-03-18 16:47 GMT+01:00 olek.stas...@gmail.com : >> Ok, i've dropped all system keyspaces, rebuild cluster and recover >> schema, now everything looks ok. >> But main goal of operations was to add new datacenter to cluster. >> After starting node in new cluster two schema versions appear, one >> version is held by 6 nodes of first datacenter, second one is in newly >> added node in new datacenter. Sth like this: >> nodetool status >> Datacenter: datacenter1 >> === >> Status=Up/Down >> |/ State=Normal/Leaving/Joining/Moving >> -- AddressLoad Tokens Owns Host ID >> Rack >> UN 192.168.1.1 50.19 GB 1 0,5% >> c9323f38-d9c4-4a69-96e3-76cd4e1a204e rack1 >> UN 192.168.1.2 54.83 GB 1 0,3% >> ad1de2a9-2149-4f4a-aec6-5087d9d3acbb rack1 >> UN 192.168.1.3 51.14 GB 1 0,6% >> 0ceef523-93fe-4684-ba4b-4383106fe3d1 rack1 >> UN 192.168.1.4 54.31 GB 1 0,7% >> 39d15471-456d-44da-bdc8-221f3c212c78 rack1 >> UN 192.168.1.5 53.36 GB 1 0,3% >> 7fed25a5-e018-43df-b234-47c2f118879b rack1 >> UN 192.168.1.6 39.89 GB 1 0,1% >> 9f54fad6-949a-4fa9-80da-87efd62f3260 rack1 >> Datacenter: DC1 >> === >> Status=Up/Down >> |/ State=Normal/Leaving/Joining/Moving >> -- AddressLoad Tokens Owns Host ID >> Rack >> UN 192.168.1.7 100.77 KB 256 97,4% >> ddb1f913-d075-4840-9665-3ba64eda0558 RAC1 >> >> describe cluster; >> Cluster Information: >>Name: Metadata Cluster >>Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch >>Partitioner: org.apache.cassandra.dht.RandomPartitioner >>Schema versions: >> 8fe34841-4f2a-3c05-97f2-15dd413d71dc: [192.168.1.7] >> >> 4ad381b6-df5a-3cbc-ba5a-0234b74d2383: [192.168.1.1, 192.168.1.2, >> 192.168.1.3, 192.168.1.4, 192.168.1.5, 192.168.1.6] >> >> All keyspaces are now configured to keep data in datacenter1. >> I assume, that It's not correct behaviour, is it true? >> Could you help me, how can I safely add new DC to the cluster? >> >> Regards >> Aleksander >> >> >> 2014-03-14 18:28 GMT+01:00 olek.stas...@gmail.com : >>> Ok, I'll do this during the weekend, I'll give you a feedback on Monday. >>> Regards >>> Aleksander >>> >>> 14 mar 2014 18:15 "Robert Coli" napisał(a): >>> >>>> On Fri, Mar 14, 2014 at 12:40 AM, olek.stas...@gmail.com >>>> wrote: >>>>> >>>>> OK, I see, so the data files stay in place, i have to just stop >>>>> cassandra on whole cluster, remove system schema and then start >>>>> cluster and recreate all keyspaces with all column families? Data will >>>>> be than loaded automatically from existing ssstables, right? >>>> >>>> >>>> Right. If you have clients reading while loading the schema, they may get >>>> exceptions. >>>> >>>>> >>>>> So one more question: what about KS system_traces? should it be >>>>> removed and recreted? What data it's holding? >>>> >>>> >>>> It's holding data about tracing, a profiling feature. It's safe to nuke. >>>> >>>> =Rob >>>>
Re: Problems with adding datacenter and schema version disagreement
Bump one more time, could anybody help me? regards Olek 2014-03-19 16:44 GMT+01:00 olek.stas...@gmail.com : > Bump, could anyone comment this behaviour, is it correct, or should I > create Jira task for this problems? > regards > Olek > > 2014-03-18 16:49 GMT+01:00 olek.stas...@gmail.com : >> Oh, one more question: what should be configuration for storing >> system_traces keyspace? Should it be replicated or stored locally? >> Regards >> Olek >> >> 2014-03-18 16:47 GMT+01:00 olek.stas...@gmail.com : >>> Ok, i've dropped all system keyspaces, rebuild cluster and recover >>> schema, now everything looks ok. >>> But main goal of operations was to add new datacenter to cluster. >>> After starting node in new cluster two schema versions appear, one >>> version is held by 6 nodes of first datacenter, second one is in newly >>> added node in new datacenter. Sth like this: >>> nodetool status >>> Datacenter: datacenter1 >>> === >>> Status=Up/Down >>> |/ State=Normal/Leaving/Joining/Moving >>> -- AddressLoad Tokens Owns Host ID >>> Rack >>> UN 192.168.1.1 50.19 GB 1 0,5% >>> c9323f38-d9c4-4a69-96e3-76cd4e1a204e rack1 >>> UN 192.168.1.2 54.83 GB 1 0,3% >>> ad1de2a9-2149-4f4a-aec6-5087d9d3acbb rack1 >>> UN 192.168.1.3 51.14 GB 1 0,6% >>> 0ceef523-93fe-4684-ba4b-4383106fe3d1 rack1 >>> UN 192.168.1.4 54.31 GB 1 0,7% >>> 39d15471-456d-44da-bdc8-221f3c212c78 rack1 >>> UN 192.168.1.5 53.36 GB 1 0,3% >>> 7fed25a5-e018-43df-b234-47c2f118879b rack1 >>> UN 192.168.1.6 39.89 GB 1 0,1% >>> 9f54fad6-949a-4fa9-80da-87efd62f3260 rack1 >>> Datacenter: DC1 >>> === >>> Status=Up/Down >>> |/ State=Normal/Leaving/Joining/Moving >>> -- AddressLoad Tokens Owns Host ID >>> Rack >>> UN 192.168.1.7 100.77 KB 256 97,4% >>> ddb1f913-d075-4840-9665-3ba64eda0558 RAC1 >>> >>> describe cluster; >>> Cluster Information: >>>Name: Metadata Cluster >>>Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch >>>Partitioner: org.apache.cassandra.dht.RandomPartitioner >>>Schema versions: >>> 8fe34841-4f2a-3c05-97f2-15dd413d71dc: [192.168.1.7] >>> >>> 4ad381b6-df5a-3cbc-ba5a-0234b74d2383: [192.168.1.1, 192.168.1.2, >>> 192.168.1.3, 192.168.1.4, 192.168.1.5, 192.168.1.6] >>> >>> All keyspaces are now configured to keep data in datacenter1. >>> I assume, that It's not correct behaviour, is it true? >>> Could you help me, how can I safely add new DC to the cluster? >>> >>> Regards >>> Aleksander >>> >>> >>> 2014-03-14 18:28 GMT+01:00 olek.stas...@gmail.com : >>>> Ok, I'll do this during the weekend, I'll give you a feedback on Monday. >>>> Regards >>>> Aleksander >>>> >>>> 14 mar 2014 18:15 "Robert Coli" napisał(a): >>>> >>>>> On Fri, Mar 14, 2014 at 12:40 AM, olek.stas...@gmail.com >>>>> wrote: >>>>>> >>>>>> OK, I see, so the data files stay in place, i have to just stop >>>>>> cassandra on whole cluster, remove system schema and then start >>>>>> cluster and recreate all keyspaces with all column families? Data will >>>>>> be than loaded automatically from existing ssstables, right? >>>>> >>>>> >>>>> Right. If you have clients reading while loading the schema, they may get >>>>> exceptions. >>>>> >>>>>> >>>>>> So one more question: what about KS system_traces? should it be >>>>>> removed and recreted? What data it's holding? >>>>> >>>>> >>>>> It's holding data about tracing, a profiling feature. It's safe to nuke. >>>>> >>>>> =Rob >>>>>