Re: Replacing nodes disks

2014-12-23 Thread Or Sher
Thanks guys.
I think I'll start with the dead-node replacement procedure, at least
for the first node, and monitor the cluster overhead and timing.
If the overhead and elapsed time turn out to be too high,
I'll try to find some network storage to hold a backup instead.
Using the replacement procedure will also be good practice on
production-size data, and it will be less dangerous since these are DR nodes.

-- 
Or Sher


Re: Replacing nodes disks

2014-12-22 Thread Or Sher
Great. replace_address works great.
For some reason I thought it wouldn't work with the same IP.


-- 
Or Sher


Re: Replacing nodes disks

2014-12-22 Thread Eric Stevens
You should certainly be able to use Cassandra's built-in tooling. Just
be aware that restoring from a backup of the data will be a lot faster and
won't put any stress on the existing cluster. Repair and replace
operations aren't free for the other nodes, so an offline backup and restore
is the better option when it's available.


Re: Replacing nodes disks

2014-12-22 Thread Jan Kesten

Hi,

Even if recovering it like a dead node works, backup and restore (like
my approach with a USB docking station) will be much faster and put less
IO and CPU load on your cluster.

Keep that in mind :-)

Cheers,
Jan


Re: Replacing nodes disks

2014-12-21 Thread Or Sher
What I want to do is similar to replacing a dead node:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_node_t.html
but replacing it with a clean node that has the same IP and hostname.

-- 
Or Sher


Re: Replacing nodes disks

2014-12-21 Thread Or Sher
If I use the replace_address parameter with the same IP address, would
that do the job?

-- 
Or Sher


Re: Replacing nodes disks

2014-12-21 Thread Ryan Svihla
Cassandra is designed to rebuild a node from other nodes. Whether a node is
dead by your hand because you killed it, or by fate, is irrelevant; the process
is the same. A new node can have the same hostname and IP, or it can have
totally different ones.
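
A minimal sketch of how the replace_address option discussed in this thread is usually passed, assuming it is set as a JVM system property through conf/cassandra-env.sh. The IP address below is hypothetical, and replace_opt is only a helper name made up for this example. The option would normally be removed again once the node has finished joining.

```shell
#!/bin/sh
# Build the JVM option used when a node takes over for a dead one.
# The argument is the dead node's address; when only the disks were
# replaced, that is the node's own, unchanged IP.
replace_opt() {
    echo "-Dcassandra.replace_address=$1"
}

# Hypothetical address, for illustration only.
NODE_IP="${NODE_IP:-10.0.0.12}"

# Print the line to append to conf/cassandra-env.sh before the first
# start on the empty disks.
echo "JVM_OPTS=\"\$JVM_OPTS $(replace_opt "$NODE_IP")\""
```

The script only prints the line; it does not touch any configuration file.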

-- 

http://www.datastax.com/

Ryan Svihla

Solution Architect

https://twitter.com/foundev
http://www.linkedin.com/pub/ryan-svihla/12/621/727/

DataStax is the fastest, most scalable distributed database technology,
delivering Apache Cassandra to the world’s most innovative enterprises.
DataStax is built to be agile, always-on, and predictably scalable to any
size. With more than 500 customers in 45 countries, DataStax is the
database technology and transactional backbone of choice for the world's
most innovative companies such as Netflix, Adobe, Intuit, and eBay.


Re: Replacing nodes disks

2014-12-20 Thread Or Sher
Thanks guys.
I have to replace all data disks, so I don't have another large enough
local disk to move the data to.
If I have no choice, I will back up the data to some other node first,
but I'd like to avoid that.
I would really love to let Cassandra do its thing and rebuild itself.
Has anybody handled such a case that way (letting Cassandra rebuild its
data)?
Although there is no documented procedure for it, it should be possible,
right?

-- 
Or Sher


Re: Replacing nodes disks

2014-12-18 Thread Jens Rantil
Hi Or,

Don't you have another machine on the network that could temporarily
host your /var/lib/cassandra content? That way you would simply scp the
files to another machine and copy them back when done. You obviously want
to run a repair afterwards just in case, but this could save you some
time.
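
A rough sketch of that round trip, under several assumptions: the host name spare-host, the staging directory /srv/cassandra-staging (which must already exist on the spare machine) and the cassandra service user are all hypothetical. By default the script only prints the commands; it runs them when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Park the Cassandra files on another machine while the disks are swapped.
# SPARE and STAGE are hypothetical; DATA is the path from the message above.
SPARE="${SPARE:-spare-host}"
STAGE="${STAGE:-/srv/cassandra-staging}"
DATA="${DATA:-/var/lib/cassandra}"

# Print each command; run it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Run with Cassandra stopped, before pulling the old disks.
copy_out() {
    run scp -rp "$DATA" "$SPARE:$STAGE/"
}

# Run after the new disks are formatted and mounted.
copy_back() {
    run scp -rp "$SPARE:$STAGE/$(basename "$DATA")" "$(dirname "$DATA")/"
    run chown -R cassandra:cassandra "$DATA"   # assumed service user
}

# Run once the node is back up and has rejoined the ring.
post_restore() {
    run nodetool repair
}
```

Calling copy_out, copy_back and post_restore in that order, with the disk swap between the first two, follows the sequence described above.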

Just an idea,
Jens

-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se

Facebook: https://www.facebook.com/#!/tink.se
Linkedin: http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_phototrkInfo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary
Twitter: https://twitter.com/tink


Re: Replacing nodes disks

2014-12-18 Thread Kai Wang
Do you have to replace those disks? Can you simply add new disks to those
nodes and configure C* to use JBOD?
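
As a sketch of what a JBOD layout could look like, the helper below prints a data_file_directories fragment for cassandra.yaml with one entry per disk. The second path is a hypothetical mount point for the added disk, and data_dirs_yaml is a name made up for this example.

```shell
#!/bin/sh
# Print a data_file_directories fragment for cassandra.yaml,
# one list entry per directory given as an argument.
data_dirs_yaml() {
    echo "data_file_directories:"
    for dir in "$@"; do
        echo "    - $dir"
    done
}

# Existing data directory plus a hypothetical mount point for the new disk.
data_dirs_yaml /var/lib/cassandra/data /mnt/disk2/cassandra/data
```

The printed fragment would replace the existing data_file_directories entry; the node needs a restart to pick it up.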


Re: Replacing nodes disks

2014-12-18 Thread Jan Kesten

Hi Or,

I did something like this a while ago. If your machines have a free
disk slot, just put another disk there and use it as another
data_file_directory.


If not - as in my case:

- grab a USB dock for disks
- put the new disk in it, plug it in, format it, and mount it (e.g. at /mnt)
- do an online rsync from /var/lib/cassandra/data to /mnt
- after that, bring Cassandra down
- do another rsync from /var/lib/cassandra/data to /mnt (should be
faster, as sstables do not change; this minimizes downtime)
- adjust /etc/fstab if needed
- shut down the node
- swap disks
- power on the node
- everything should be fine ;-)

Of course you will need a replication factor > 1 for this to work ;-)
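
The two rsync passes in the list above could be sketched as follows. This is only an illustration under assumptions: the destination /mnt/data/ is a hypothetical subdirectory of the mount point, the service name cassandra is assumed, and the nodetool drain step and the --delete flag on the second pass are additions not mentioned in the list (--delete drops sstables that were compacted away between the two passes). By default the script prints the commands; it runs them only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Two-pass copy of the Cassandra data directory onto the new disk.
# SRC comes from the steps above; DST is a hypothetical subdirectory
# of the /mnt mount point, and the service name is an assumption.
SRC="${SRC:-/var/lib/cassandra/data/}"   # trailing slash: copy the contents
DST="${DST:-/mnt/data/}"

# Print each command; run it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

two_pass_copy() {
    run rsync -a "$SRC" "$DST"            # pass 1: node is still online
    run nodetool drain                    # flush memtables to sstables
    run service cassandra stop            # assumed init script name
    run rsync -a --delete "$SRC" "$DST"   # pass 2: only the differences
}

two_pass_copy
```

Only the second pass happens while the node is down, which is what keeps the downtime short.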

Just my 2 cents,
Jan

rsync the full contents there,

On 18.12.2014 at 16:17, Or Sher wrote:

Hi all,

We have a situation where some of our nodes have smaller disks, and we
would like to align all nodes by replacing the smaller disks with bigger
ones without replacing the nodes.
We don't have enough space to put the data on the / disk and copy it back
to the bigger disks, so we would like to rebuild the nodes' data from other
replicas.


What do you think the procedure should be?

I'm guessing it should be something like this, but I'm pretty sure it's
not enough:

1. Shut down the C* node and the server.
2. Replace the disks and create the same VG, LV, etc.
3. Start C* (normally?)
4. nodetool repair/rebuild?
*I think I might get some consistency issues for use cases relying on
Quorum reads and writes for strong consistency.

What do you say?

Another question (and I know it depends on many factors, but I'd
like to hear an experienced estimate): how long would it take to
rebuild a 250G data node?
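
Nobody in the thread gives a number, but a back-of-the-envelope lower bound is easy to compute once a sustained transfer rate is assumed. The sketch below uses a hypothetical 25 MB/s; the real rate depends on network, disks, streaming throttles and cluster load, and the estimate ignores compaction and any repair run afterwards.

```shell
#!/bin/sh
# Rough transfer-time estimate: size in GB, sustained rate in MB/s.
# Integer arithmetic, so the result is rounded down to whole minutes.
estimate_minutes() {
    size_gb=$1
    rate_mbs=$2
    echo $(( size_gb * 1024 / rate_mbs / 60 ))
}

# 250 GB at a hypothetical 25 MB/s sustained rate.
estimate_minutes 250 25
```

With these assumed numbers the result is 170 minutes, i.e. just under three hours of pure transfer time.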


Thanks in advance,
Or.

--
Or Sher