Re: Removing a disk from JBOD configuration

2017-07-31 Thread Ioannis Zafiropoulos
Excellent! Thank you Jeff.




Re: Removing a disk from JBOD configuration

2017-07-31 Thread Jeff Jirsa
3.10 has 6696 in it, so my understanding is you'll probably be fine just 
running repair


Yes, same risks if you swap drives - before 6696, you want to replace a whole 
node if any sstables are damaged or lost (if you do deletes, and if it hurts 
you if deleted data comes back to life).
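
Something along these lines on the node that lost the disk, once it's back up
(just a sketch - the keyspace name below is a placeholder, adjust to your
setup):

    # Full repair on this node across all keyspaces
    nodetool repair --full

    # Or repair one keyspace at a time (keyspace name is made up)
    nodetool repair --full my_keyspace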


-- 
Jeff Jirsa




Re: Removing a disk from JBOD configuration

2017-07-31 Thread Ioannis Zafiropoulos
I just want to add that we use vnodes=16, if that helps with my questions.
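
(For reference, by "vnodes=16" I mean num_tokens: 16 in cassandra.yaml; a quick
way to double-check it - the config path is whatever your install uses:)

    # num_tokens is the number of vnodes (token ranges) this node owns
    grep -n 'num_tokens' /etc/cassandra/cassandra.yaml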



Re: Removing a disk from JBOD configuration

2017-07-31 Thread Ioannis Zafiropoulos
Thank you Jeff for your answer,

I use RF=3 and our clients always connect with QUORUM, so I guess I will be
alright after a repair (?)
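
(My reasoning, just the usual replica arithmetic, nothing specific to our
cluster:)

    # QUORUM needs floor(RF/2) + 1 replicas to respond
    RF=3
    echo $(( RF / 2 + 1 ))   # prints 2, so QUORUM can still be served by the
                             # two intact replicas until the repair finishes
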
Follow-up questions:
- It seems that the risks you're describing would be the same as if I had
replaced the drive with a fresh new one and run repair, is that correct?
- Can I do the reverse procedure in the future, that is, add a new drive
using the same procedure I described?

Thanks,
John





Re: Removing a disk from JBOD configuration

2017-07-31 Thread Jeff Jirsa
It depends on what consistency level you use for reads/writes, and whether you 
do deletes

The real danger is that there may have been a tombstone on the drive that
failed, covering data on the disks that remain, where the delete happened
longer ago than gc_grace_seconds - if you simply yank the disk, that data will
come back to life (it's also possible some data temporarily reverts to a
previous state for some queries; the reversion can be fixed with nodetool
repair, but the resurrection can't be undone). If you don't do deletes, this is
not a problem. If there's no danger to you if data comes back to life, then
you're probably ok as well.
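
If it helps, you can check a table's gc_grace window like this (keyspace and
table names below are placeholders):

    # gc_grace_seconds is per table; deletes older than this are the risky ones
    cqlsh -e "SELECT gc_grace_seconds FROM system_schema.tables WHERE keyspace_name='my_ks' AND table_name='my_table';"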

CASSANDRA-6696 dramatically lowers this risk, if you're using a new enough
version of Cassandra.
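
A quick way to check which version a node is running:

    # Print the Cassandra release version of this node
    nodetool version    # e.g. ReleaseVersion: 3.10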



-- 
Jeff Jirsa





Removing a disk from JBOD configuration

2017-07-31 Thread Ioannis Zafiropoulos
Hi All,

I have a 7-node cluster (version 3.10), each node with 5 disks in JBOD.
A few hours ago I had a disk failure on one node. I am wondering if I can:

- stop Cassandra on that node
- remove the disk, physically and from cassandra.yaml
- start Cassandra on that node
- run repair
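
Roughly what I have in mind (just a sketch - mount points, service commands,
and file locations below are made up):

    sudo systemctl stop cassandra

    # In cassandra.yaml, delete the failed drive's entry under data_file_directories:
    #   data_file_directories:
    #     - /data/disk1
    #     - /data/disk2
    #     - /data/disk3
    #     - /data/disk4
    #     - /data/disk5    <- failed drive, remove this line
    sudo vi /etc/cassandra/cassandra.yaml

    sudo systemctl start cassandra
    nodetool repair --full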

I mean, is it necessary to replace a failed disk instead of just removing
it?
(assuming that the remaining disks have enough free space)

Thank you for your help,
John