Re: Recover lost node from backup or evict/re-add?

2019-06-14 Thread Oleksandr Shulgin
On Thu, Jun 13, 2019 at 3:41 PM Jeff Jirsa  wrote:

>
> Bootstrapping a new node does not require repairs at all.
>

Was my understanding as well.

> Replacing a node only requires repairs to guarantee consistency to avoid
> violating quorum because streaming for bootstrap only streams from one
> replica
>
> Think this way:
>
> Host 1, 2, 3 in a replica set
> You write value A to some key
> It lands on hosts 1 and 3. Host 2 was being restarted or something
> Host 2 comes back up
> Host 3 fails
>
> If you replace 3 with 3’ -
> 3’ may stream from host 1 and now you’ve got a quorum of replicas with A
> 3’ may stream from host 2, and now you’ve got a quorum of replicas without
> A. This is illegal.
>
> This is just a statistics game - do you have hosts missing writes? If so,
> are hints delivering them when those hosts come back? What’s the cost of
> violating consistency in that second scenario to you?
>
> If you’re running something where correctness really really really
> matters, you must repair first. If you’re actually running a truly eventual
> consistency use case and reading stale writes is fine, you probably won’t
> ever notice.
>

Alright, this makes it much more clear, thank you.

> In any case these docs are weird and wrong - joining nodes get writes in
> all versions of Cassandra for the past few years (at least 2.0+), so the
> docs really need to be fixed.
>

:(

--
Alex


Re: Recover lost node from backup or evict/re-add?

2019-06-13 Thread Jeff Jirsa


> On Jun 13, 2019, at 6:29 AM, Oleksandr Shulgin  
> wrote:
> 
>> On Thu, Jun 13, 2019 at 3:16 PM Jeff Jirsa  wrote:
> 
>> On Jun 13, 2019, at 2:52 AM, Oleksandr Shulgin 
>>  wrote:
>> On Wed, Jun 12, 2019 at 4:02 PM Jeff Jirsa  wrote:
>>>> To avoid violating consistency guarantees, you have to repair the replicas 
>>>> while the lost node is down
>>> 
>>> How do you suggest to trigger it?  Potentially replicas of the primary 
>>> range for the down node are all over the local DC, so I would go with 
>>> triggering a full cluster repair with Cassandra Reaper.  But isn't it going 
>>> to fail because of the down node?  
>> I’m not sure there’s an easy and obvious path here - this is something TLP 
>> may want to enhance reaper to help with. 
>> 
>> You have to specify the ranges with -st/-et, and you have to tell it to 
>> ignore the down host with -hosts. With vnodes you’re right that this may be 
>> lots and lots of ranges all over the ring.
>> 
>> There’s a patch proposed (maybe committed in 4.0) that makes this a nonissue 
>> by allowing bootstrap to stream one repaired set and all of the unrepaired 
>> replica data (which is probably very small if you’re running IR regularly), 
>> which accomplishes the same thing.
> 
> Ouch, it really hurts to learn this. :(
>>> It is also documented (I believe) that one should repair the node after it 
>>> finishes the "replace address" procedure.  So should one repair before and 
>>> after?
>> You do not need to repair after the bootstrap if you repair before. If the 
>> docs say that, they’re wrong. The joining host gets writes during bootstrap 
>> and consistency levels are altered during bootstrap to account for the 
>> joining host.
> 
> This is what I had in mind (what makes replacement different from actual 
> bootstrap of a new node):

Bootstrapping a new node does not require repairs at all.

Replacing a node only requires repairs to guarantee consistency to avoid 
violating quorum because streaming for bootstrap only streams from one replica

Think this way:

Host 1, 2, 3 in a replica set
You write value A to some key
It lands on hosts 1 and 3. Host 2 was being restarted or something
Host 2 comes back up
Host 3 fails

If you replace 3 with 3’ - 
3’ may stream from host 1 and now you’ve got a quorum of replicas with A
3’ may stream from host 2, and now you’ve got a quorum of replicas without A. 
This is illegal.
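
The two outcomes above can be checked with a few lines of Python. This is only a toy model of the three-host example, not anything Cassandra itself runs; the function name `replicas_with_value` and the dictionary layout are made up for illustration, and a replication factor of 3 (quorum of 2) is assumed.

```python
QUORUM = 2  # quorum for an assumed replication factor of 3


def replicas_with_value(stream_source: int) -> int:
    """Count replicas holding value A after host 3 is replaced by 3'."""
    # The write of A landed on hosts 1 and 3; host 2 was restarting and missed it.
    has_a = {1: True, 2: False, 3: True}
    # Host 3 fails. Its replacement 3' streams from exactly one surviving
    # replica, so it ends up with whatever that one source holds.
    has_a[3] = has_a[stream_source]
    return sum(has_a.values())


for source in (1, 2):
    count = replicas_with_value(source)
    verdict = "a quorum has A" if count >= QUORUM else "a quorum is missing A"
    print(f"3' streams from host {source}: {count} of 3 replicas have A, {verdict}")
```

Streaming from host 1 leaves two replicas with A; streaming from host 2 leaves only one, which is the case that repairing before the replacement rules out.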

This is just a statistics game - do you have hosts missing writes? If so, are 
hints delivering them when those hosts come back? What’s the cost of violating 
consistency in that second scenario to you? 

If you’re running something where correctness really really really matters, you 
must repair first. If you’re actually running a truly eventual consistency use 
case and reading stale writes is fine, you probably won’t ever notice.  

In any case these docs are weird and wrong - joining nodes get writes in all 
versions of Cassandra for the past few years (at least 2.0+), so the docs 
really need to be fixed.

> http://cassandra.apache.org/doc/latest/operating/topo_changes.html?highlight=replace%20address#replacing-a-dead-node
>  
> Note
> If any of the following cases apply, you MUST run repair to make the replaced 
> node consistent again, since it missed ongoing writes during/prior to 
> bootstrapping. The replacement timeframe refers to the period from when the 
> node initially dies to when a new node completes the replacement process.
> 
> The node is down for longer than max_hint_window_in_ms before being replaced.
> You are replacing using the same IP address as the dead node and replacement 
> takes longer than max_hint_window_in_ms.
> 
> I would imagine that any production size instance would take way longer to 
> replace than the default max hint window (which is 3 hours, AFAIK).  Didn't 
> remember the same IP restriction, but at least this I would also expect to be 
> the most common setup.
> 
> --
> Alex
> 


Re: Recover lost node from backup or evict/re-add?

2019-06-13 Thread Oleksandr Shulgin
On Thu, Jun 13, 2019 at 3:16 PM Jeff Jirsa  wrote:

> On Jun 13, 2019, at 2:52 AM, Oleksandr Shulgin <
> oleksandr.shul...@zalando.de> wrote:
> On Wed, Jun 12, 2019 at 4:02 PM Jeff Jirsa  wrote:
>
> To avoid violating consistency guarantees, you have to repair the replicas
>> while the lost node is down
>>
>
> How do you suggest to trigger it?  Potentially replicas of the primary
> range for the down node are all over the local DC, so I would go with
> triggering a full cluster repair with Cassandra Reaper.  But isn't it going
> to fail because of the down node?
>
> I’m not sure there’s an easy and obvious path here - this is something TLP
> may want to enhance reaper to help with.
>
> You have to specify the ranges with -st/-et, and you have to tell it to
> ignore the down host with -hosts. With vnodes you’re right that this may be
> lots and lots of ranges all over the ring.
>
> There’s a patch proposed (maybe committed in 4.0) that makes this a
> nonissue by allowing bootstrap to stream one repaired set and all of the
> unrepaired replica data (which is probably very small if you’re running IR
> regularly), which accomplishes the same thing.
>

Ouch, it really hurts to learn this. :(

> It is also documented (I believe) that one should repair the node after it
> finishes the "replace address" procedure.  So should one repair before and
> after?
>
> You do not need to repair after the bootstrap if you repair before. If the
> docs say that, they’re wrong. The joining host gets writes during bootstrap
> and consistency levels are altered during bootstrap to account for the
> joining host.
>

This is what I had in mind (what makes replacement different from actual
bootstrap of a new node):
http://cassandra.apache.org/doc/latest/operating/topo_changes.html?highlight=replace%20address#replacing-a-dead-node


Note

If any of the following cases apply, you MUST run repair to make the replaced
node consistent again, since it missed ongoing writes during/prior to
bootstrapping. The *replacement* timeframe refers to the period from when
the node initially dies to when a new node completes the replacement
process.


   1. The node is down for longer than max_hint_window_in_ms before being
      replaced.
   2. You are replacing using the same IP address as the dead node and
      replacement takes longer than max_hint_window_in_ms.


I would imagine that any production size instance would take way longer to
replace than the default max hint window (which is 3 hours, AFAIK).  Didn't
remember the same IP restriction, but at least this I would also expect to
be the most common setup.
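
To make the comparison with the hint window concrete, here is a minimal sketch in Python. It assumes the default of 3 hours mentioned above (10800000 ms); the function name `exceeds_hint_window` and the sample durations are invented for illustration, and the check only mirrors the condition in the quoted documentation note.

```python
from datetime import timedelta

# Assumed default for max_hint_window_in_ms: 3 hours, as mentioned above.
DEFAULT_MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000


def exceeds_hint_window(
    replacement_time: timedelta,
    max_hint_window_ms: int = DEFAULT_MAX_HINT_WINDOW_MS,
) -> bool:
    """True if the node was gone for longer than hints are kept for it."""
    return replacement_time.total_seconds() * 1000 > max_hint_window_ms


# Hypothetical durations from node death to completed replacement.
for hours in (1, 8):
    print(hours, "h:", exceeds_hint_window(timedelta(hours=hours)))
```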

--
Alex


Re: Recover lost node from backup or evict/re-add?

2019-06-13 Thread Jeff Jirsa


> On Jun 13, 2019, at 2:52 AM, Oleksandr Shulgin  
> wrote:
> 
>> On Wed, Jun 12, 2019 at 4:02 PM Jeff Jirsa  wrote:
> 
>> To avoid violating consistency guarantees, you have to repair the replicas 
>> while the lost node is down
> 
> How do you suggest to trigger it?  Potentially replicas of the primary range 
> for the down node are all over the local DC, so I would go with triggering a 
> full cluster repair with Cassandra Reaper.  But isn't it going to fail 
> because of the down node?  

I’m not sure there’s an easy and obvious path here - this is something TLP may 
want to enhance reaper to help with. 

You have to specify the ranges with -st/-et, and you have to tell it to ignore 
the down host with -hosts. With vnodes you’re right that this may be lots and 
lots of ranges all over the ring.
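
A rough sketch of how the per-range invocations could be generated, in Python. Everything named here is an assumption for illustration: the function `build_repair_commands`, the keyspace name, the token ranges, and the host addresses are invented, and the script only assembles `nodetool repair` command lines with the `-st`, `-et`, and `-hosts` flags mentioned above without running anything. The real ranges would have to be read from the ring (for example from `nodetool describering` output).

```python
import shlex
from typing import List, Sequence, Tuple


def build_repair_commands(
    keyspace: str,
    ranges: Sequence[Tuple[int, int]],
    live_hosts: Sequence[str],
) -> List[List[str]]:
    """Build one `nodetool repair` argument list per token range."""
    commands = []
    for start_token, end_token in ranges:
        cmd = ["nodetool", "repair", "-st", str(start_token), "-et", str(end_token)]
        # Name only the replicas that are still up, so the down host is skipped.
        for host in live_hosts:
            cmd += ["-hosts", host]
        cmd.append(keyspace)
        commands.append(cmd)
    return commands


# Hypothetical values: two ranges formerly owned by the dead node,
# and the two surviving replicas for those ranges.
example = build_repair_commands(
    "my_keyspace",
    [(0, 1000), (5000, 6000)],
    ["10.0.0.1", "10.0.0.2"],
)
for cmd in example:
    print(shlex.join(cmd))
```

With vnodes the list of ranges gets long, which is why generating the commands instead of typing them is attractive.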

There’s a patch proposed (maybe committed in 4.0) that makes this a nonissue by 
allowing bootstrap to stream one repaired set and all of the unrepaired replica 
data (which is probably very small if you’re running IR regularly), which 
accomplishes the same thing.

> 
> It is also documented (I believe) that one should repair the node after it 
> finishes the "replace address" procedure.  So should one repair before and 
> after?

You do not need to repair after the bootstrap if you repair before. If the docs 
say that, they’re wrong. The joining host gets writes during bootstrap and 
consistency levels are altered during bootstrap to account for the joining host.
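
As a small worked example of what "altered" can mean here, the sketch below assumes that a write must be acknowledged by the usual quorum plus one extra replica for each node that is still joining. That rule is my reading of the behaviour described above, not something spelled out in this thread, and the function names are made up.

```python
def quorum(replication_factor: int) -> int:
    """Usual quorum size: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def write_acks_required(replication_factor: int, pending_nodes: int) -> int:
    """Acks needed for a QUORUM write while nodes are joining.

    Assumption: each pending (joining) replica raises the requirement by one.
    """
    return quorum(replication_factor) + pending_nodes


print(write_acks_required(3, 0))  # no node joining
print(write_acks_required(3, 1))  # one replacement node bootstrapping
```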

Re: Recover lost node from backup or evict/re-add?

2019-06-13 Thread Oleksandr Shulgin
On Wed, Jun 12, 2019 at 4:02 PM Jeff Jirsa  wrote:

> To avoid violating consistency guarantees, you have to repair the replicas
> while the lost node is down
>

How do you suggest to trigger it?  Potentially replicas of the primary
range for the down node are all over the local DC, so I would go with
triggering a full cluster repair with Cassandra Reaper.  But isn't it going
to fail because of the down node?

It is also documented (I believe) that one should repair the node after it
finishes the "replace address" procedure.  So should one repair before and
after?

--
Alex


Re: Recover lost node from backup or evict/re-add?

2019-06-12 Thread Jon Haddad
100% agree with Sean.  I would only use Cassandra backups in a case where
you need to restore from full cluster loss.  Example: An entire DC burns
down, tornado, flooding.

Your routine node replacement after a failure should be
replace_address_first_boot.

To ensure this goes smoothly, run regular repairs.  We (The Last Pickle)
maintain this to make it easy: http://cassandra-reaper.io/

Jon


On Wed, Jun 12, 2019 at 11:17 AM Durity, Sean R 
wrote:

> I’m not sure it is correct to say, “you cannot.” However, that is a more
> complicated restore and more likely to lead to inconsistent data and take
> longer to do. You are basically trying to start from a backup point and
> roll everything forward and catch up to current.
>
>
>
> Replacing/re-streaming is the well-trodden path. You are getting the net
> result of all that has happened since the node failure. And the node is not
> returning data to the clients while the bootstrap is running. If you have a
> restored/repairing node, it will accept client (and coordinator)
> connections even though it isn’t (guaranteed) consistent, yet.
>
>
>
> As I understand it – a full cluster recovery from backup still requires
> repair across the cluster to ensure consistency. In my experience, most
> apps cannot wait for a full restore/repair. Availability matters more. They
> also don’t want to pay for even more disk to hold some level of backups.
>
>
>
> There are some companies that provide finer-grained backup and recovery
> options, though.
>
>
>
> Sean Durity
>
>
>
> *From:* Alan Gano 
> *Sent:* Wednesday, June 12, 2019 1:43 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] RE: Recover lost node from backup or evict/re-add?
>
>
>
>
>
> Is it correct to say that a lost node cannot be restored from backup?  You
> must either replace the node or evict/re-add (i.e., rebuild from other
> nodes).
>
>
>
> Also, that snapshot, incremental, commitlog backups are relegated to
> application keyspace recovery only?
>
>
>
>
>
> How about recovery of the entire cluster? (rolling it back).  Are
> snapshots exact enough, in time, to not have nodes that differ, in
> point-in-time, from the rest of the cluster?  Would those nodes be
> recoverable (nodetool repair?) … which brings me back to recovering a lost
> node from backup (restore last snapshot, and run nodetool repair?).
>
>
>
>
>
> Thanks,
>
>
>
> Alan Gano
>
>
>
>
>
> *From:* Jeff Jirsa [mailto:jji...@gmail.com ]
> *Sent:* Wednesday, June 12, 2019 10:14 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Recover lost node from backup or evict/re-add?
>
>
>
> A host can replace itself using the method I described
>
>
> On Jun 12, 2019, at 7:10 AM, Alan Gano  wrote:
>
> I guess I’m considering this scenario:
>
> - host and configuration have survived
> - /data is gone
> - /backups have survived
>
>
>
> I have tested recovering from this scenario with an evict/re-add, which
> worked fine.
>
>
>
> If I restore from backup, the node will be behind the cluster. Does it
> get caught up after I restore it and start it up?
>
>
>
> Alan
>
>
>
> *From:* Jeff Jirsa [mailto:jji...@gmail.com ]
> *Sent:* Wednesday, June 12, 2019 10:02 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Recover lost node from backup or evict/re-add?
>
>
>
> To avoid violating consistency guarantees, you have to repair the replicas
> while the lost node is down
>
>
>
> Once you do that it’s typically easiest to bootstrap a replacement
> (there’s a property named “replace address first boot” you can google or
> someone can link) that tells a new joining host to take over for a failed
> machine.
>
>
>
>
> On Jun 12, 2019, at 6:54 AM, Alan Gano  wrote:
>
>
>
> If I lose a node, does it make sense to even restore from
> snapshot/incrementals/commitlogs?
>
>
>
> Or is the best way to do an evict/re-add?
>
>
>
>
>
> Thanks,
>
>
>
> Alan.
>
>
>
> NOTICE: This communication is intended only for the person or entity to
> whom it is addressed and may contain confidential, proprietary, and/or
> privileged material. Unless you are the intended addressee, any review,
> reliance, dissemination, distribution, copying or use whatsoever of this
> communication is strictly prohibited. If you received this in error, please
> reply immediately and delete the material from all computers. Email sent
> through the Internet is not secure. Do not use email to send us
> confidential information such as credit card numbers, PIN numbers,
> passwords, Social Security Numbers, Account nu

RE: Recover lost node from backup or evict/re-add?

2019-06-12 Thread Durity, Sean R
I’m not sure it is correct to say, “you cannot.” However, that is a more 
complicated restore and more likely to lead to inconsistent data and take 
longer to do. You are basically trying to start from a backup point and roll 
everything forward and catch up to current.

Replacing/re-streaming is the well-trodden path. You are getting the net result 
of all that has happened since the node failure. And the node is not returning 
data to the clients while the bootstrap is running. If you have a 
restored/repairing node, it will accept client (and coordinator) connections 
even though it isn’t (guaranteed) consistent, yet.

As I understand it – a full cluster recovery from backup still requires repair 
across the cluster to ensure consistency. In my experience, most apps cannot 
wait for a full restore/repair. Availability matters more. They also don’t want 
to pay for even more disk to hold some level of backups.

There are some companies that provide finer-grained backup and recovery 
options, though.

Sean Durity

From: Alan Gano 
Sent: Wednesday, June 12, 2019 1:43 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] RE: Recover lost node from backup or evict/re-add?


Is it correct to say that a lost node cannot be restored from backup?  You must 
either replace the node or evict/re-add (i.e., rebuild from other nodes).

Also, that snapshot, incremental, commitlog backups are relegated to 
application keyspace recovery only?


How about recovery of the entire cluster? (rolling it back).  Are snapshots 
exact enough, in time, to not have nodes that differ, in point-in-time, from 
the rest of the cluster?  Would those nodes be recoverable (nodetool repair?) … 
which brings me back to recovering a lost node from backup (restore last 
snapshot, and run nodetool repair?).


Thanks,

Alan Gano


From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Wednesday, June 12, 2019 10:14 AM
To: user@cassandra.apache.org
Subject: Re: Recover lost node from backup or evict/re-add?

A host can replace itself using the method I described

On Jun 12, 2019, at 7:10 AM, Alan Gano <ag...@tsys.com> wrote:
I guess I’m considering this scenario:

  *   host and configuration have survived
  *   /data is gone
  *   /backups have survived

I have tested recovering from this scenario with an evict/re-add, which worked 
fine.

If I restore from backup, the node will be behind the cluster. Does it get 
caught up after I restore it and start it up?

Alan

From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Wednesday, June 12, 2019 10:02 AM
To: user@cassandra.apache.org
Subject: Re: Recover lost node from backup or evict/re-add?

To avoid violating consistency guarantees, you have to repair the replicas 
while the lost node is down

Once you do that it’s typically easiest to bootstrap a replacement (there’s a 
property named “replace address first boot” you can google or someone can link) 
that tells a new joining host to take over for a failed machine.


On Jun 12, 2019, at 6:54 AM, Alan Gano <ag...@tsys.com> wrote:

If I lose a node, does it make sense to even restore from 
snapshot/incrementals/commitlogs?

Or is the best way to do an evict/re-add?


Thanks,

Alan.


RE: Recover lost node from backup or evict/re-add?

2019-06-12 Thread Alan Gano

Is it correct to say that a lost node cannot be restored from backup?  You must 
either replace the node or evict/re-add (i.e., rebuild from other nodes).

Also, that snapshot, incremental, commitlog backups are relegated to 
application keyspace recovery only?


How about recovery of the entire cluster? (rolling it back).  Are snapshots 
exact enough, in time, to not have nodes that differ, in point-in-time, from 
the rest of the cluster?  Would those nodes be recoverable (nodetool repair?) … 
which brings me back to recovering a lost node from backup (restore last 
snapshot, and run nodetool repair?).


Thanks,

Alan Gano


From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Wednesday, June 12, 2019 10:14 AM
To: user@cassandra.apache.org
Subject: Re: Recover lost node from backup or evict/re-add?

A host can replace itself using the method I described

On Jun 12, 2019, at 7:10 AM, Alan Gano <ag...@tsys.com> wrote:
I guess I’m considering this scenario:

· host and configuration have survived

· /data is gone

· /backups have survived

I have tested recovering from this scenario with an evict/re-add, which worked 
fine.

If I restore from backup, the node will be behind the cluster. Does it get 
caught up after I restore it and start it up?

Alan

From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Wednesday, June 12, 2019 10:02 AM
To: user@cassandra.apache.org
Subject: Re: Recover lost node from backup or evict/re-add?

To avoid violating consistency guarantees, you have to repair the replicas 
while the lost node is down

Once you do that it’s typically easiest to bootstrap a replacement (there’s a 
property named “replace address first boot” you can google or someone can link) 
that tells a new joining host to take over for a failed machine.


On Jun 12, 2019, at 6:54 AM, Alan Gano <ag...@tsys.com> wrote:

If I lose a node, does it make sense to even restore from 
snapshot/incrementals/commitlogs?

Or is the best way to do an evict/re-add?


Thanks,

Alan.



Re: Recover lost node from backup or evict/re-add?

2019-06-12 Thread Jeff Jirsa
A host can replace itself using the method I described 

> On Jun 12, 2019, at 7:10 AM, Alan Gano  wrote:
> 
> I guess I’m considering this scenario:
> · host and configuration have survived
> · /data is gone
> · /backups have survived
>  
> I have tested recovering from this scenario with an evict/re-add, which 
> worked fine.
>  
> If I restore from backup, the node will be behind the cluster. Does it 
> get caught up after I restore it and start it up?
>  
> Alan
>  
> From: Jeff Jirsa [mailto:jji...@gmail.com] 
> Sent: Wednesday, June 12, 2019 10:02 AM
> To: user@cassandra.apache.org
> Subject: Re: Recover lost node from backup or evict/re-add?
>  
> To avoid violating consistency guarantees, you have to repair the replicas 
> while the lost node is down
>  
> Once you do that it’s typically easiest to bootstrap a replacement (there’s a 
> property named “replace address first boot” you can google or someone can 
> link) that tells a new joining host to take over for a failed machine.
>  
> 
> On Jun 12, 2019, at 6:54 AM, Alan Gano  wrote:
> 
>  
> If I lose a node, does it make sense to even restore from 
> snapshot/incrementals/commitlogs?
>  
> Or is the best way to do an evict/re-add?
>  
>  
> Thanks,
>  
> Alan.
>  


RE: Recover lost node from backup or evict/re-add?

2019-06-12 Thread Alan Gano
I guess I’m considering this scenario:

· host and configuration have survived

· /data is gone

· /backups have survived

I have tested recovering from this scenario with an evict/re-add, which worked 
fine.

If I restore from backup, the node will be behind the cluster. Does it get 
caught up after I restore it and start it up?

Alan

From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Wednesday, June 12, 2019 10:02 AM
To: user@cassandra.apache.org
Subject: Re: Recover lost node from backup or evict/re-add?

To avoid violating consistency guarantees, you have to repair the replicas 
while the lost node is down

Once you do that it’s typically easiest to bootstrap a replacement (there’s a 
property named “replace address first boot” you can google or someone can link) 
that tells a new joining host to take over for a failed machine.


On Jun 12, 2019, at 6:54 AM, Alan Gano <ag...@tsys.com> wrote:

If I lose a node, does it make sense to even restore from 
snapshot/incrementals/commitlogs?

Or is the best way to do an evict/re-add?


Thanks,

Alan.



Re: Recover lost node from backup or evict/re-add?

2019-06-12 Thread Jeff Jirsa
To avoid violating consistency guarantees, you have to repair the replicas 
while the lost node is down

Once you do that it’s typically easiest to bootstrap a replacement (there’s a 
property named “replace address first boot” you can google or someone can link) 
that tells a new joining host to take over for a failed machine.
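
For reference, the property is normally passed to the replacement node as a JVM system property on its first start. The Python helper below just renders that option string; the helper name `replace_jvm_option` and the address `10.0.0.3` are hypothetical, and where the option goes (for example `jvm.options` or `cassandra-env.sh`) depends on the installation.

```python
import ipaddress


def replace_jvm_option(dead_node_ip: str) -> str:
    """Render the JVM flag that tells a new node which dead node it replaces."""
    # Raises ValueError early if the address is malformed.
    ipaddress.ip_address(dead_node_ip)
    return f"-Dcassandra.replace_address_first_boot={dead_node_ip}"


# Hypothetical address of the failed machine.
print(replace_jvm_option("10.0.0.3"))
```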


> On Jun 12, 2019, at 6:54 AM, Alan Gano  wrote:
> 
>  
> If I lose a node, does it make sense to even restore from 
> snapshot/incrementals/commitlogs?
>  
> Or is the best way to do an evict/re-add?
>  
>  
> Thanks,
>  
> Alan.
>  


Recover lost node from backup or evict/re-add?

2019-06-12 Thread Alan Gano

If I lose a node, does it make sense to even restore from 
snapshot/incrementals/commitlogs?

Or is the best way to do an evict/re-add?


Thanks,

Alan.
