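So, sstableloader streams only the data stored in the /var/lib/cassandra/data/keyspace/table directory of the node it reads from. If we have 3 nodes and RF=1, then only about 1/3 of the data would be streamed to the other cluster from a single node; with RF=3 each node already holds a full copy. Problem is solved.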
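
Since sstableloader only sees the SSTables in the directory it is given, the load has to be repeated on every cluster1 node that holds data you need. A rough dry-run sketch of that loop follows; it only prints the commands. The snapshot tag, the staging directory and the node names other than cluster1.node01 are assumptions made up for illustration, and the copy step assumes sstableloader wants a path ending in keyspace/table, so check that against your version before running anything.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands for each source node, executes nothing.
set -eu

KEYSPACE="mykeyspace"
TABLE="source_table"
TAG="migration"                 # hypothetical snapshot tag
TARGET="cluster2.nodeXXX.com"   # placeholder host taken from the thread
STAGING="/tmp/sstableload"      # hypothetical staging directory

# Print the per-node plan: snapshot, copy to a keyspace/table path, stream.
plan_load() {
  local node="$1"
  echo "# on ${node}"
  echo "nodetool snapshot -t ${TAG} ${KEYSPACE}"
  echo "mkdir -p ${STAGING}/${KEYSPACE}/${TABLE}"
  echo "cp /var/lib/cassandra/data/${KEYSPACE}/${TABLE}-*/snapshots/${TAG}/* ${STAGING}/${KEYSPACE}/${TABLE}/"
  echo "sstableloader -d ${TARGET} ${STAGING}/${KEYSPACE}/${TABLE}"
}

# node02 and node03 are assumed names for the other two cluster1 nodes.
for node in cluster1.node01 cluster1.node02 cluster1.node03; do
  plan_load "$node"
done
```

Loading from a snapshot instead of the live directory follows the advice in the quoted thread: compactions on the active cluster can remove files while they are being streamed.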

2015-04-01 12:05 GMT+02:00 Alain RODRIGUEZ <[email protected]>:

> From Michael Laing - posted on the wrong thread :
>
> "We use Alain's solution as well to make major operational revisions.
>
> We have a "red team" and a "blue team in each AWS region, so we just add
> and drop datacenters to get where we want to be.
>
> Pretty simple."
>
> 2015-03-31 15:50 GMT+02:00 Alain RODRIGUEZ <[email protected]>:
>
>> IMHO, the most straight forward solution is to add cluster2 as a new DC
>> for mykeyspace and then drop the old DC.
>>
>> That's how we migrated to VPC (AWS) and we love this approach since you
>> don't have to mess with your existing cluster, plus sync is made
>> automatically and you can then drop your old DC safely, when you are sure.
>>
>> I put steps on this ML long time ago:
>> https://mail-archives.apache.org/mod_mbox/incubator-cassandra-user/201406.mbox/%3cca+vsrlopop7th8nx20aoz3as75g2jrjm3ryx119deklynhq...@mail.gmail.com%3E
>> Also Datastax docs:
>> https://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html
>>
>> "get data from cluster1,
>> put it to cluster2
>> wipe cluster1"
>>
>> I would definitely use this method to do this (I actually did already,
>> multiple times).
>>
>> Up to you, I heard once that there is almost as much way of doing
>> operational on Cassandra as the number of operators :). You should go with
>> method you can be confident with. I can assure the one I propose is quite
>> secure.
>>
>> C*heers,
>>
>> Alain
>>
>> 2015-03-31 15:32 GMT+02:00 Serega Sheypak <[email protected]>:
>>
>>> >I have to ask you if you considered doing an Alter keyspace, change RF
>>> The idea is dead simple:
>>> get data from cluster1,
>>> put it to cluster2
>>> vipe cluster1
>>>
>>> I understand drawbacks of streaming sstableloader approach, I need right
>>> now something easy. Later we consider switch to Priam since it does
>>> backup/restore in a right way.
>>>
>>> 2015-03-31 14:45 GMT+02:00 Alain RODRIGUEZ <[email protected]>:
>>>
>>>> Hi,
>>>>
>>>> Despite of "I understand that it's not the best solution, I need it
>>>> for testing purposes", I have to ask you if you considered doing an Alter
>>>> keyspace, change RF > 1 for mykeyspace on cluster2 and "nodetool rebuild"
>>>> to add a new DC (your cluster2) ?
>>>>
>>>> In the case you go your way (sstableloader) also advice you to make a
>>>> snapshot (instead of just flushing) to avoid fails due to compactions on
>>>> your active cluster1.
>>>>
>>>> To answer your question, sstableloader is supposed to distribute
>>>> correctly data on the new cluster depending on your RF and topology.
>>>> Basically if you run sstable loader just on sstable c1.node1 my guess
>>>> is that you will have all the data present on c1.node1 stored on the new c2
>>>> (each data to corresponding node). So if you have an RF=3 on c1, you should
>>>> have all the data on c2 just by running sstableloader from c1.node1, if you
>>>> are using RF=1 on c1, then you need to load data from c1.each_node. I
>>>> suppose that cluster2.nodeXXX doesn't matter and act as a coordinator.
>>>>
>>>> I never used the tool, but that's what would be "logical" imho. Wait
>>>> for a confirmation as I wouldn't to lead you to a failure of any kind.
>>>> Also, I don't know if data is also replicated directly with sstableloader
>>>> or if you need to repair c2 after loading data.
>>>>
>>>> C*heers,
>>>>
>>>> Alain
>>>>
>>>> 2015-03-31 13:21 GMT+02:00 Serega Sheypak <[email protected]>:
>>>>
>>>>> Hi, I have a simple question and can't find related info in docs.
>>>>>
>>>>> I have cluster1 with 3 nodes and cluster2 with 5 nodes. I want to
>>>>> transfer whole keyspace named 'mykeyspace' data from cluster1 to cluster2
>>>>> using sstableloader. I understand that it's not the best solution, I need
>>>>> it for testing purposes.
>>>>>
>>>>> What I'm going to do:
>>>>>
>>>>> 1. Recreate keyspace schema on cluster2 using schema from cluster1
>>>>> 2. nodetool flush for mykeyspace.source_table being exported from
>>>>> cluster1 to cluster2
>>>>> 3. Run sstableloader for each table on cluster1.node01
>>>>>
>>>>> sstableloader -d cluster2.nodeXXX.com /var/lib/cassandra/data/mykeyspace/source_table-83f369e0d6e511e4b3a6010e8d2b68af/
>>>>>
>>>>> What should I get as a result on cluster2?
>>>>>
>>>>> *ALL* data from source_table?
>>>>>
>>>>> or
>>>>>
>>>>> Just data stored in *partition of source_table*
>>>>>
>>>>> I'm confused. Doc says I just run this command to export table from
>>>>> cluster1 to cluster2, but I specify path to a part of source_table data,
>>>>> since other parts of table should be on other nodes.
>>>>>
>>>>
>>>>
>>>
>>
>
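
For comparison, the alternative Alain describes in the quoted thread (add cluster2 as a new DC for mykeyspace, run "nodetool rebuild", drop the old DC later) might look roughly like the dry-run sketch below. The datacenter names DC1 and DC2 and the replication factor are assumptions for illustration only. The sketch also assumes the cluster2 nodes have already joined the same cluster with a DC-aware snitch, which the thread does not spell out.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the add-a-datacenter approach: prints commands only.
set -eu

KEYSPACE="mykeyspace"
OLD_DC="DC1"   # hypothetical name of the existing datacenter (cluster1)
NEW_DC="DC2"   # hypothetical name of the new datacenter (cluster2)
RF=3           # assumed replication factor for both datacenters

# Build the CQL statement that makes the keyspace replicate to both DCs.
alter_stmt() {
  echo "ALTER KEYSPACE ${KEYSPACE} WITH replication = {'class': 'NetworkTopologyStrategy', '${OLD_DC}': ${RF}, '${NEW_DC}': ${RF}};"
}

echo "cqlsh -e \"$(alter_stmt)\""
echo "# then, on every node of ${NEW_DC}, stream the existing data from ${OLD_DC}:"
echo "nodetool rebuild -- ${OLD_DC}"
```

Once clients point at the new DC and you are sure it is in sync, the old DC would be removed from the replication map with another ALTER KEYSPACE and its nodes decommissioned.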
