Re: Replacing nodes disks
Thanks guys.

I think I'll start with the dead-node replacement procedure, at least for the first node, and monitor the cluster overhead and timing. If I see that the overhead and elapsed time are substantially higher, I'll try to find some network storage to hold the backup.

Using the replacement procedure will also be great practice on production-size data, and it will be less dangerous since it's on DR nodes.

-- Or Sher
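For a rough sense of the elapsed time mentioned above, here is a back-of-the-envelope sketch for the 250G node asked about in this thread. It only divides data size by a sustained transfer rate, so it is a lower bound at best. The rates in the loop are hypothetical; the real rate depends on any streaming throttle configured in cassandra.yaml, the network, the disks, and compaction on the receiving node.

```python
def transfer_seconds(data_gb: float, throughput_mbit_s: float) -> float:
    """Lower-bound transfer time: data size divided by sustained throughput."""
    if throughput_mbit_s <= 0:
        raise ValueError("throughput must be positive")
    bits = data_gb * 8 * 1000**3          # decimal gigabytes -> bits
    return bits / (throughput_mbit_s * 1000**2)


if __name__ == "__main__":
    # Hypothetical sustained rates; measure your own cluster before trusting any of them.
    for mbit in (100, 200, 400):
        hours = transfer_seconds(250, mbit) / 3600
        print(f"{mbit} Mbit/s -> {hours:.1f} h")
```

At an assumed 200 Mbit/s this works out to 10,000 seconds, a bit under three hours, before counting any overhead on the other nodes.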
Re: Replacing nodes disks
Great, replace_address works well. For some reason I thought it wouldn't work with the same IP.

-- Or Sher
Re: Replacing nodes disks
You should be able to use Cassandra's built-in tooling for sure. Just be aware that restoring from a backup of the data will be a lot faster and won't put any stress on the existing cluster. Repair and replace operations aren't free for the other nodes, so an offline backup and restore is the better option when it's available.
Re: Replacing nodes disks
Hi,

even if recovering the node like a dead node works, backup and restore (like my approach with a USB docking station) will be much faster and produce less I/O and CPU impact on your cluster. Keep that in mind :-)

Cheers,
Jan
Re: Replacing nodes disks
What I want to do is essentially replace a dead node - http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_node_t.html - but replace it with a clean node that has the same IP and hostname.

-- Or Sher
Re: Replacing nodes disks
If I use the replace_address parameter with the same IP address, would that do the job?

-- Or Sher
Re: Replacing nodes disks
Cassandra is designed to rebuild a node from other nodes. Whether a node is dead by your hand because you killed it, or by fate, is irrelevant; the process is the same. The new node can have the same hostname and IP, or totally different ones.

-- Ryan Svihla, Solution Architect, DataStax
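The replace_address parameter discussed in this thread is passed to the replacement node as a JVM system property at startup, commonly by appending it to JVM_OPTS in cassandra-env.sh. A minimal sketch that builds that option follows; the helper name and the example address are made up for illustration, and the procedure in the linked DataStax page remains the reference for the full steps.

```python
import ipaddress


def replace_address_opt(dead_node_ip: str) -> str:
    """Return the JVM option asking Cassandra to replace the node at this IP."""
    # ip_address() raises ValueError on a malformed address.
    ip = ipaddress.ip_address(dead_node_ip)
    return f"-Dcassandra.replace_address={ip}"


if __name__ == "__main__":
    # 10.0.0.12 is a hypothetical address; use the IP of the node being replaced,
    # which per this thread may be the same IP the rebuilt node will have.
    opt = replace_address_opt("10.0.0.12")
    print(f'JVM_OPTS="$JVM_OPTS {opt}"')
```

The printed line is what would be added to cassandra-env.sh on the rebuilt node before its first start; validating the address first is just a guard against typos.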
Re: Replacing nodes disks
Thanks guys.

I have to replace all data disks, so I don't have another large enough local disk to move the data to. If I have no choice, I will back up the data to some other node first, but I'd like to avoid that. I would really love to let Cassandra do its thing and rebuild itself.

Has anybody handled such cases that way (letting Cassandra rebuild its data)? Although there is no documented procedure for it, it should be possible, right?

-- Or Sher
Re: Replacing nodes disks
Hi Or,

Don't you have another machine on the network that could temporarily host your /var/lib/cassandra content? That way you would simply scp the files to another machine and copy them back when done. You obviously want to run a repair afterwards just in case, but this could save you some time.

Just an idea,
Jens

-- Jens Rantil, Backend engineer, Tink AB
Re: Replacing nodes disks
Do you have to replace those disks? Can you simply add new disks to those nodes and configure C* to use JBOD?

On Dec 18, 2014 10:18 AM, Or Sher or.sh...@gmail.com wrote:

Hi all,

We have a situation where some of our nodes have smaller disks, and we would like to align all nodes by replacing the smaller disks with bigger ones without replacing the nodes. We don't have enough space to put the data on the / disk and copy it back to the bigger disks, so we would like to rebuild the node's data from other replicas.

What do you think the procedure should be? I'm guessing it should be something like this, but I'm pretty sure it's not enough:

1. Shut down the C* node and the server.
2. Replace the disks and create the same VG, LV, etc.
3. Start C* (normally?).
4. nodetool repair/rebuild?

*I think I might get some consistency issues for use cases relying on quorum reads and writes for strong consistency. What do you say?

Another question (I know it depends on many factors, but I'd like to hear an experienced estimate): how long would it take to rebuild a 250G data node?

Thanks in advance,
Or.

-- Or Sher
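The consistency concern in the original post can be made concrete with a toy model. This is only a sketch under stated assumptions: replication factor 3, hypothetical replica names A, B and C, and no read repair or hinted handoff. It shows why a replica that restarts with empty disks and serves reads right away can break the usual quorum overlap.

```python
def quorum(rf: int) -> int:
    """Number of replicas a QUORUM read or write must reach."""
    return rf // 2 + 1


def read_sees_write(write_acks: set, read_set: set, wiped: set) -> bool:
    """True if at least one replica in the read set still holds the written value."""
    holders = write_acks - wiped          # replicas that still have the data
    return bool(holders & read_set)


if __name__ == "__main__":
    rf = 3
    print("quorum size:", quorum(rf))
    acks = {"A", "B"}                     # a QUORUM write acknowledged by A and B
    # Healthy cluster: the read set {A, C} overlaps the write set through A.
    print(read_sees_write(acks, {"A", "C"}, wiped=set()))
    # A came back with empty disks but answers reads: {A, C} no longer holds the write.
    print(read_sees_write(acks, {"A", "C"}, wiped={"A"}))
```

In the second case the only replica in the read set that acknowledged the write has lost it, which is the situation step 3 of the guessed procedure (starting C* normally on empty disks) could create.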
Re: Replacing nodes disks
Hi Or,

I did something like this a while ago. If your machines have a free disk slot, just put another disk there and use it as an additional data_file_directories entry.

If not, as in my case:

- grab a USB dock for disks
- put the new disk in there, plug it in, format it, mount it at /mnt, etc.
- do an online rsync from /var/lib/cassandra/data to /mnt
- after that, bring Cassandra down
- do another rsync from /var/lib/cassandra/data to /mnt (should be faster, as sstables do not change; this minimizes downtime)
- adjust /etc/fstab if needed
- shut down the node
- swap the disks
- power on the node
- everything should be fine ;-)

Of course you will need a replication factor > 1 for this to work ;-)

Just my 2 cents,
Jan
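The two rsync passes in Jan's steps can be sketched as follows. The script only builds and prints the command lines, it does not run them. The paths come from the thread; the `-a` and `--delete` flags are assumptions not stated in the message (`--delete` is added on the second pass so that files removed by compaction since the first pass do not linger on the new disk).

```python
import shlex
from typing import List

SRC = "/var/lib/cassandra/data/"   # source path from the thread
DST = "/mnt/"                      # mount point from the thread; your layout may differ


def rsync_cmd(src: str, dst: str, final_pass: bool) -> List[str]:
    """Build the rsync command line for one copy pass."""
    cmd = ["rsync", "-a"]
    if final_pass:
        # Assumption: drop destination files that no longer exist on the source.
        cmd.append("--delete")
    return cmd + [src, dst]


def plan(src: str = SRC, dst: str = DST) -> List[str]:
    """Return the copy steps in order, as printable shell lines."""
    return [
        shlex.join(rsync_cmd(src, dst, final_pass=False)),  # online pass, Cassandra running
        "# stop Cassandra here (the command depends on your init system)",
        shlex.join(rsync_cmd(src, dst, final_pass=True)),   # short offline pass
    ]


if __name__ == "__main__":
    for step in plan():
        print(step)
```

Most of the data moves during the first pass while the node is still serving, so the offline window is limited to the second, incremental pass.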