Re: Adding disk capacity to a running node

2016-10-18 Thread Vladimir Yudovin
 On Mon, 17 Oct 2016 15:59:41 -0400, Ben Bromhead <b...@instaclustr.com> wrote: 

For the times that AWS retires an instance, you get plenty of notice and it's 
generally pretty rare. We run over 1000 instances on AWS and see one forced 
retirement a month if that. We've never had an instance pulled from under our 
feet without warning.




Yes, in the case of a planned event. But in the case of a hardware failure it
can happen without notice, and it doesn't take a catastrophe affecting the whole
availability zone: the failure of a single blade is enough.





Best regards, Vladimir Yudovin, 

Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
Launch your cluster in minutes.










On Mon, 17 Oct 2016 at 12:43, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:




-- 

Ben Bromhead

CTO | Instaclustr

+1 650 284 9692

Managed Cassandra / Spark on AWS, Azure and Softlayer




Ephemeral is fine, you just need to have enough replicas (in enough AZs and 
enough regions) to tolerate instances being terminated.

 

 

 

From: Vladimir Yudovin <vla...@winguzone.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, October 17, 2016 at 11:48 AM
To: user <user@cassandra.apache.org>




Subject: Re: Adding disk capacity to a running node






 


It's extremely unreliable to use ephemeral (local) disks. Even if you don't 
stop the instance yourself, it can be restarted on a different server after a 
hardware failure or an AWS-initiated update, and all node data will be lost.







 




 


 


 On Mon, 17 Oct 2016 14:45:00 -0400, Seth Edwards <s...@pubnub.com> wrote: 



 


These are i2.2xlarge instances, so the disks are currently configured as 
dedicated ephemeral disks. 


 


On Mon, Oct 17, 2016 at 11:34 AM, Laing, Michael 
<michael.la...@nytimes.com> wrote:


 





You could just expand the size of your EBS volume and extend the file system. 
No data is lost - assuming you are running Linux.
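The expand-in-place route Michael describes might look roughly like this (the device name and filesystem are assumptions, not from the thread; the volume must first be enlarged on the AWS side):

```shell
# After enlarging the EBS volume itself, grow the partition and then the
# filesystem. /dev/xvdf and ext4 are illustrative; substitute your own.
sudo growpart /dev/xvdf 1        # expand partition 1 to fill the device
sudo resize2fs /dev/xvdf1        # grow an ext4 filesystem online
# (if the volume has no partition table, skip growpart and resize /dev/xvdf
#  directly; for XFS use: sudo xfs_growfs /mount/point)
df -h /var/lib/cassandra         # confirm the new capacity is visible
```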


 


 


On Monday, October 17, 2016, Seth Edwards s...@pubnub.com wrote:


We're running 2.0.16. We're migrating to a new data model, but we've had an 
unexpected increase in write traffic that has caused us some capacity issues 
when we encounter compactions. Our old data model is on STCS. We'd like to add 
another EBS volume (we're on AWS) to our JBOD config and hopefully avoid any 
situation where we run out of disk space during a large compaction. It appears 
that the behavior we are hoping for is actually undesirable and was removed in 
3.2. It still might be an option for us until we can finish the migration. 


 


I'm not familiar with LVM so it may be a bit risky to try at this point. 



 


On Mon, Oct 17, 2016 at 9:42 AM, Yabin Meng <yabinm...@gmail.com> wrote:


I assume you're talking about a Cassandra JBOD (just a bunch of disks) setup, 
because you mention adding it to the list of data directories. If this is the 
case, you may run into issues, depending on your C* version. Check this out: 
http://www.datastax.com/dev/blog/improving-jbod.


 


Or another approach is to use LVM to combine multiple devices under a single 
mount point. If you do so, all Cassandra sees is increased disk storage space, 
and there should be no problem.
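The LVM approach might look like the following sketch (device names, volume group name, and mount point are illustrative, not from the thread):

```shell
# Combine two devices into one logical volume backing a single mount point.
sudo pvcreate /dev/xvdf /dev/xvdg
sudo vgcreate cassandra_vg /dev/xvdf /dev/xvdg
sudo lvcreate -l 100%FREE -n data cassandra_vg
sudo mkfs.ext4 /dev/cassandra_vg/data
sudo mount /dev/cassandra_vg/data /var/lib/cassandra/data

# Later, to add another disk that Cassandra simply sees as more free space:
sudo pvcreate /dev/xvdh
sudo vgextend cassandra_vg /dev/xvdh
sudo lvextend -l +100%FREE /dev/cassandra_vg/data
sudo resize2fs /dev/cassandra_vg/data   # grow the filesystem online
```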


 


Hope this helps,


Re: Adding disk capacity to a running node

2016-10-17 Thread Jeff Jirsa
Note that EBS IOPS scale with disk capacity: gp2 volumes under 3334 GB have 
“burst” capacity that you can exhaust. Practically speaking, this may not 
matter to you, but if you choose a very small volume size (say, 500 GB), your 
baseline throughput may be much lower than you anticipate.
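As a back-of-the-envelope check of that point (assuming gp2's published baseline of 3 IOPS per GiB, floored at 100 and capped at 10,000 as of this thread, which is where the 3334 GB figure comes from):

```shell
# Estimate gp2 baseline IOPS: 3 IOPS per GiB, floor 100, cap 10000
# (3334 GB * 3 exceeds 10000, hence the threshold Jeff mentions).
gp2_baseline_iops() {
  local iops=$(( $1 * 3 ))
  [ "$iops" -lt 100 ] && iops=100
  [ "$iops" -gt 10000 ] && iops=10000
  echo "$iops"
}

gp2_baseline_iops 500    # 1500 baseline; burst credits mask this until exhausted
gp2_baseline_iops 3334   # 10000: baseline meets the cap, no burst headroom left
```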

 

 

 

Re: Adding disk capacity to a running node

2016-10-17 Thread Seth Edwards
Thanks for the detailed steps, Ben! This gives me another option in case of
emergency.



Re: Adding disk capacity to a running node

2016-10-17 Thread Ben Bromhead
Yup, you would need to copy the files across to the new volume from the dir
you want to give additional space to. Rough steps would look like:

   1. Create EBS volume (make it big... like 3TB)
   2. Attach to instance
   3. Mount/format EBS volume
   4. Stop C*
   5. Copy full/troublesome directory to the EBS volume
   6. Remove copied files (using rsync for the copy / remove step can be a
   good idea)
   7. bind mount EBS volume with the same path as the troublesome directory
   8. Start C* back up
   9. Let it finish compacting / streaming etc
   10. Stop C*
   11. remove bind mount
   12. copy files back to ephemeral
   13. start C* back up
   14. repeat on other nodes
   15. run repair

You can use this process if you somehow end up in a full-disk situation. If
you end up in a low-disk situation you'll have other issues (like corrupt /
half-written SSTable components), but it's better than nothing.

Also, to maintain your read throughput during this whole thing, double-check
the EBS volume's read_ahead_kb setting on the block device and reduce it to
something sane like 0 or 16.
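The steps above might be sketched as follows (device name, table path, and service commands are placeholders, not from the thread; the copy steps assume Cassandra is stopped):

```shell
# Steps 1-3: create/attach the EBS volume (console or API), format, mount.
sudo mkfs.ext4 /dev/xvdf                      # illustrative device name
sudo mkdir -p /mnt/newebs
sudo mount /dev/xvdf /mnt/newebs

# Steps 4-7: with Cassandra stopped, relocate the troublesome table dir.
sudo service cassandra stop
TABLE_DIR=/var/lib/cassandra/data/myks/mytable   # placeholder path
sudo rsync -a "$TABLE_DIR/" /mnt/newebs/      # copy SSTables onto EBS
sudo rm -rf "$TABLE_DIR"/*                    # remove the copied files
sudo mount --bind /mnt/newebs "$TABLE_DIR"    # same path, now backed by EBS

# Reduce readahead on the EBS block device to keep read latency sane.
echo 16 | sudo tee /sys/block/xvdf/queue/read_ahead_kb

sudo service cassandra start
# ...let compactions/streaming finish, then reverse: stop C*, umount the
# bind mount, copy files back to ephemeral, start C*, repeat per node, repair.
```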



Re: Adding disk capacity to a running node

2016-10-17 Thread Seth Edwards
@Ben

Interesting idea, is this also an option for situations where the disk is
completely full and Cassandra has stopped? (Not that I want to go there).

If this was the route taken, and we did

mount --bind   /mnt/path/to/large/sstable   /mnt/newebs

We would still need to do some manual copying of files, such as:

mv /mnt/path/to/large/sstable.sd /mnt/newebs ?

Thanks!


Re: Adding disk capacity to a running node

2016-10-17 Thread Mark Rose
I've had luck using the st1 EBS type, too, for situations where reads
are rare (the commit log still needs to be on its own high IOPS
volume; I like using ephemeral storage for that).
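One possible layout for that split (devices and paths are illustrative, not from the thread; the cassandra.yaml keys are the standard ones):

```shell
# st1 EBS for bulk data, ephemeral instance store for the commit log.
sudo mkfs.ext4 /dev/xvdg                      # st1 data volume (illustrative)
sudo mkfs.ext4 /dev/xvdb                      # ephemeral volume (illustrative)
sudo mkdir -p /var/lib/cassandra/data /var/lib/cassandra/commitlog
sudo mount /dev/xvdg /var/lib/cassandra/data
sudo mount /dev/xvdb /var/lib/cassandra/commitlog

# Then point cassandra.yaml at them:
#   data_file_directories:
#       - /var/lib/cassandra/data
#   commitlog_directory: /var/lib/cassandra/commitlog
```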

On Mon, Oct 17, 2016 at 3:03 PM, Branton Davis wrote:
> I doubt that's true anymore.  EBS volumes, while previously discouraged, are
> the most flexible way to go, and are very reliable.  You can attach, detach,
> and snapshot them too.  If you don't need provisioned IOPS, the GP2 SSDs are
> more cost-effective and allow you to balance IOPS with cost.
>
> On Mon, Oct 17, 2016 at 1:55 PM, Jonathan Haddad  wrote:
>>
>> Vladimir,
>>
>> *Most* people are running Cassandra are doing so using ephemeral disks.
>> Instances are not arbitrarily moved to different hosts.  Yes, instances can
>> be shut down, but that's why you distribute across AZs.
>>
>> On Mon, Oct 17, 2016 at 11:48 AM Vladimir Yudovin 
>> wrote:
>>>
>>> It's extremely unreliable to use ephemeral (local) disks. Even if you
>>> don't stop the instance yourself, it can be restarted on a different server
>>> after a hardware failure or an AWS-initiated update, and all node data will
>>> be lost.
>>>
>>> Best regards, Vladimir Yudovin,
>>> Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
>>> Launch your cluster in minutes.
>>>
>>>
>>>  On Mon, 17 Oct 2016 14:45:00 -0400, Seth Edwards wrote: 
>>>
>>> These are i2.2xlarge instances so the disks currently configured as
>>> ephemeral dedicated disks.
>>>
>>> On Mon, Oct 17, 2016 at 11:34 AM, Laing, Michael
>>>  wrote:
>>>
>>> You could just expand the size of your ebs volume and extend the file
>>> system. No data is lost - assuming you are running Linux.
>>>
>>>
>>> On Monday, October 17, 2016, Seth Edwards  wrote:
>>>
>>> We're running 2.0.16. We're migrating to a new data model but we've had
>>> an unexpected increase in write traffic that has caused us some capacity
>>> issues when we encounter compactions. Our old data model is on STCS. We'd
>>> like to add another ebs volume (we're on aws) to our JBOD config and
>>> hopefully avoid any situation where we run out of disk space during a large
>>> compaction. It appears that the behavior we are hoping to get is actually
>>> undesirable and removed in 3.2. It still might be an option for us until we
>>> can finish the migration.
>>>
>>> I'm not familiar with LVM so it may be a bit risky to try at this point.
>>>
>>> On Mon, Oct 17, 2016 at 9:42 AM, Yabin Meng  wrote:
>>>
>>> I assume you're talking about Cassandra JBOD (just a bunch of disk) setup
>>> because you do mention it as adding it to the list of data directories. If
>>> this is the case, you may run into issues, depending on your C* version.
>>> Check this out: http://www.datastax.com/dev/blog/improving-jbod.
>>>
>>> Or another approach is to use LVM to combine multiple devices under a
>>> single mount point. If you do so, all Cassandra sees is increased disk
>>> storage space, and there should be no problem.
>>>
>>> Hope this helps,
>>>
>>> Yabin
>>>
>>> On Mon, Oct 17, 2016 at 11:54 AM, Vladimir Yudovin 
>>> wrote:
>>>
>>>
>>> Yes, Cassandra should keep the percentage of disk usage equal across all
>>> disks. The compaction process and SSTable flushes will use the new disk to
>>> distribute both new and existing data.
>>>
>>> Best regards, Vladimir Yudovin,
>>> Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
>>> Launch your cluster in minutes.
>>>
>>>
>>>  On Mon, 17 Oct 2016 11:43:27 -0400, Seth Edwards wrote: 
>>>
>>> We have a few nodes that are running out of disk capacity at the moment,
>>> and instead of adding more nodes to the cluster, we would like to add
>>> another disk to the server and add it to the list of data directories. My
>>> question is: will Cassandra use the new disk for compactions on sstables
>>> that already exist in the primary directory?
>>>
>>>
>>>
>>> Thanks!
>>>
>>>
>>>
>


Re: Adding disk capacity to a running node

2016-10-17 Thread Ben Bromhead
Yup, as everyone has mentioned, ephemeral disks are fine if you run in multiple
AZs... which is pretty much mandatory for any production deployment in AWS
(and other cloud providers). i2.2xls are generally your best bet for high read
throughput applications on AWS.

Also, on AWS, ephemeral storage will generally survive a user-initiated
restart. For the times that AWS retires an instance, you get plenty of
notice and it's generally pretty rare. We run over 1000 instances on AWS
and see one forced retirement a month, if that. We've never had an instance
pulled from under our feet without warning.

To add another option for the original question, one thing you can do is to
attach a large EBS drive to the instance and bind mount it to the directory
for the table that has the very large SSTables. You will need to copy data
across to the EBS volume. Let everything compact and then copy everything
back and detach EBS. Latency may be higher than normal on the node you are
doing this on (especially if you are used to i2.2xl performance).

This is something we often have to do when we encounter pathological
compaction situations associated with bootstrapping, adding new DCs, STCS
with a dominant table, or people ignoring high disk usage warnings :)


Re: Adding disk capacity to a running node

2016-10-17 Thread Jeff Jirsa
Ephemeral is fine; you just need enough replicas (in enough AZs and 
enough regions) to tolerate instances being terminated.
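
As a concrete illustration (keyspace name and replication factor below are hypothetical): with the Ec2Snitch, an AWS region maps to a Cassandra datacenter and each AZ to a rack, so RF=3 spread over three AZs leaves a QUORUM (2 of 3) even if one instance or AZ goes away:

```shell
# Hypothetical sketch -- adjust keyspace name, datacenter name, and RF.
# Ec2Snitch maps AZs to racks, so replicas land in distinct AZs.
cqlsh -e "ALTER KEYSPACE my_ks WITH replication =
  {'class': 'NetworkTopologyStrategy', 'us-east': 3};"
```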

 

 

 


Re: Adding disk capacity to a running node

2016-10-17 Thread Jonathan Haddad
There are, of course, people using EBS successfully; I didn't say there
weren't, and that wasn't my point.  I was merely saying that the reasoning to
avoid ephemeral disks (that your instance is going to move between machines
and lose data) is nonsense, in that they work just fine and have been heavily
used in production Cassandra clusters for years.



Re: Adding disk capacity to a running node

2016-10-17 Thread Jonathan Haddad
If a node is restarted, it is not moved, no.  That's not how it works.

On Mon, Oct 17, 2016 at 12:01 PM Vladimir Yudovin 
wrote:

> But after such restart node should be joined to cluster again and restore
> data, right?
>


Re: Adding disk capacity to a running node

2016-10-17 Thread Branton Davis
I doubt that's true anymore.  EBS volumes, while previously discouraged,
are the most flexible way to go, and are very reliable.  You can attach,
detach, and snapshot them too.  If you don't need provisioned IOPS, the GP2
SSDs are more cost-effective and allow you to balance IOPS with cost.
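
For reference, attaching and snapshotting EBS volumes are one-liners with the AWS CLI; a sketch with placeholder IDs, AZ, and device names:

```shell
# Placeholders throughout -- substitute your own volume/instance IDs.
aws ec2 create-volume --volume-type gp2 --size 500 \
    --availability-zone us-east-1a
aws ec2 attach-volume --volume-id vol-0abc12345678 \
    --instance-id i-0def12345678 --device /dev/xvdf
aws ec2 create-snapshot --volume-id vol-0abc12345678 \
    --description "pre-compaction snapshot"
```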



Re: Adding disk capacity to a running node

2016-10-17 Thread Vladimir Yudovin
But after such a restart the node should be joined to the cluster again and 
restore its data, right?



Best regards, Vladimir Yudovin, 

Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
Launch your cluster in minutes.







Re: Adding disk capacity to a running node

2016-10-17 Thread Jonathan Haddad
Vladimir,

*Most* people running Cassandra are doing so using ephemeral disks.
Instances are not arbitrarily moved to different hosts.  Yes, instances can
be shut down, but that's why you distribute across AZs.



Re: Adding disk capacity to a running node

2016-10-17 Thread Vladimir Yudovin
It's extremely unreliable to use ephemeral (local) disks. Even if you don't 
stop the instance yourself, it can be restarted on a different server in case of 
a hardware failure or an AWS-initiated update. In that case all node data will be lost.



Best regards, Vladimir Yudovin, 

Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
Launch your cluster in minutes.








Re: Adding disk capacity to a running node

2016-10-17 Thread Seth Edwards
These are i2.2xlarge instances, so the disks are currently configured as
ephemeral dedicated disks.



Re: Adding disk capacity to a running node

2016-10-17 Thread Laing, Michael
You could just expand the size of your ebs volume and extend the file
system. No data is lost - assuming you are running Linux.
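
A sketch of that procedure (volume ID, device, and mount point are hypothetical; the exact resize command depends on your filesystem):

```shell
# Grow the EBS volume online, then the partition and the filesystem.
aws ec2 modify-volume --volume-id vol-0abc12345678 --size 1000
# Once the modification reaches the 'optimizing'/'completed' state:
sudo growpart /dev/xvdf 1       # only if the fs sits on a partition
sudo resize2fs /dev/xvdf1       # ext4; use xfs_growfs <mountpoint> for XFS
df -h /var/lib/cassandra/data   # verify the new capacity
```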



Re: Adding disk capacity to a running node

2016-10-17 Thread Seth Edwards
We're running 2.0.16. We're migrating to a new data model, but we've had an
unexpected increase in write traffic that has caused us some capacity
issues when we encounter compactions. Our old data model is on STCS. We'd
like to add another EBS volume (we're on AWS) to our JBOD config and
hopefully avoid any situation where we run out of disk space during a large
compaction. It appears that the behavior we are hoping for was considered
undesirable and removed in 3.2. It still might be an option for us until we
can finish the migration.

I'm not familiar with LVM so it may be a bit risky to try at this point.
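
For anyone following along, adding a volume to a JBOD setup amounts to mounting it and listing it in cassandra.yaml (paths below are illustrative), then restarting the node:

```yaml
# cassandra.yaml -- illustrative paths; Cassandra treats each entry
# as an independent JBOD data directory.
data_file_directories:
    - /var/lib/cassandra/data        # existing disk
    - /mnt/ebs1/cassandra/data       # newly mounted EBS volume
```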



Re: Adding disk capacity to a running node

2016-10-17 Thread Yabin Meng
I assume you're talking about Cassandra JBOD (just a bunch of disk) setup
because you do mention it as adding it to the list of data directories. If
this is the case, you may run into issues, depending on your C* version.
Check this out: http://www.datastax.com/dev/blog/improving-jbod.

Or another approach is to use LVM to manage multiple devices under a single
mount point. If you do so, all Cassandra sees is increased disk storage
space, so there should be no problem.
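The LVM route above can be sketched roughly as follows (the volume group and logical volume names, the device /dev/xvdf, and the ext4 filesystem are all illustrative assumptions for a pre-existing LVM-backed data mount):

```shell
# Grow an existing LVM-backed Cassandra data mount with a new disk.
sudo pvcreate /dev/xvdf                               # register the new disk with LVM
sudo vgextend vg_data /dev/xvdf                       # add it to the volume group
sudo lvextend -l +100%FREE /dev/vg_data/lv_cassandra  # grow the logical volume
sudo resize2fs /dev/vg_data/lv_cassandra              # grow the ext4 filesystem online
```

Cassandra itself needs no configuration change in this case; the single data directory simply has more space.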

Hope this helps,

Yabin

On Mon, Oct 17, 2016 at 11:54 AM, Vladimir Yudovin 
wrote:

> Yes, Cassandra should keep the percentage of disk usage equal across all
> disks. The compaction process and SSTable flushes will use the new disk to
> distribute both new and existing data.
>
> Best regards, Vladimir Yudovin,
>
>
> *Winguzone  - Hosted Cloud Cassandra on
> Azure and SoftLayer. Launch your cluster in minutes.*
>
>
>  On Mon, 17 Oct 2016 11:43:27 -0400, *Seth Edwards* wrote:
>
> We have a few nodes that are running out of disk capacity at the moment,
> and instead of adding more nodes to the cluster, we would like to add
> another disk to the server and add it to the list of data directories. My
> question is: will Cassandra use the new disk for compactions on SSTables
> that already exist in the primary directory?
>
>
>
> Thanks!
>
>
>


Re: Adding disk capacity to a running node

2016-10-17 Thread Vladimir Yudovin
Yes, Cassandra should keep the percentage of disk usage equal across all disks. 
The compaction process and SSTable flushes will use the new disk to distribute 
both new and existing data.
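A minimal sketch (not Cassandra's actual code) of the JBOD write-placement idea: new SSTables are directed toward the data directory with the most free space, so a freshly added empty disk absorbs writes until usage evens out across disks.

```python
def pick_data_directory(directories, free_bytes):
    """Return the directory with the most free bytes.

    free_bytes: a callable mapping a directory path to its available
    bytes (e.g. lambda d: shutil.disk_usage(d).free on a live system).
    Simplified stand-in for Cassandra's JBOD placement heuristic.
    """
    return max(directories, key=free_bytes)

# Example: a freshly added, nearly empty disk wins until the others catch up.
free = {"/data1": 40 * 2**30, "/data2": 900 * 2**30}
print(pick_data_directory(["/data1", "/data2"], free.get))  # /data2
```

Real Cassandra also weighs placement by token ranges and compaction state, so this only captures the balancing tendency, not the exact mechanism.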



Best regards, Vladimir Yudovin, 

Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
Launch your cluster in minutes.






 On Mon, 17 Oct 2016 11:43:27 -0400, Seth Edwards s...@pubnub.com 
wrote: 




We have a few nodes that are running out of disk capacity at the moment, and 
instead of adding more nodes to the cluster, we would like to add another disk 
to the server and add it to the list of data directories. My question is: will 
Cassandra use the new disk for compactions on SSTables that already exist in 
the primary directory? 







Thanks!