Re: Is it possible to recover a deleted-in-future record?

2017-03-08 Thread Eric Stevens
Those future-dated tombstones are going to keep causing problems on those
partitions.  If you're still writing to those partitions, you might be
losing data in the meantime, because newly written data will be occluded
by the existing tombstones.  It's going to be hard to get the tombstones
out of the way so that new writes can take effect there.  Manual cleanup
might be required here, such as sstablefilter or sstable2json -> clean up
the data -> json2sstable.  This could get really hairy.
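
To make the occlusion concrete, here is a minimal sketch of timestamp-based
last-write-wins reconciliation.  It is a simplified model for illustration,
not Cassandra's actual code: the Cell class, reconcile() and read() are
hypothetical names, and the timestamps are made-up microsecond values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    value: Optional[str]  # None marks a tombstone (hypothetical model)
    timestamp_us: int     # write timestamp, microseconds since the epoch


def reconcile(a: Cell, b: Cell) -> Cell:
    """Keep the cell with the higher timestamp; on a tie, prefer the tombstone."""
    if a.timestamp_us != b.timestamp_us:
        return a if a.timestamp_us > b.timestamp_us else b
    return a if a.value is None else b


def read(cells):
    """Return the visible value, or None if the winning cell is a tombstone."""
    winner = cells[0]
    for cell in cells[1:]:
        winner = reconcile(winner, cell)
    return winner.value


NOW_US = 1_488_960_000_000_000  # roughly 2017-03-08, in microseconds
FUTURE_US = NOW_US + 100 * 365 * 86_400 * 1_000_000  # about 100 years later

original = Cell("v1", NOW_US - 1_000_000)
bad_tombstone = Cell(None, FUTURE_US)
new_write = Cell("v2", NOW_US)

# The new write carries a current timestamp, so the future tombstone still wins.
visible = read([original, bad_tombstone, new_write])
# Without the tombstone, the new write would be visible as expected.
visible_without_tombstone = read([original, new_write])
```

In this model the new write stays hidden until something removes the
tombstone or the write is issued with an even later timestamp.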

Another option, depending on the kind of tombstones they are (e.g. cell
level): my deleting compactor[1] might be able to clean them up on the live
cluster via user defined compaction, if you wrote a convictor for this
purpose.  But that tool has a gap for cluster and/or partition level
tombstones, which it doesn't properly recognize yet (there's an open PR that
provides a partial implementation, but I'm not sure it would get you what
you need).  You can see my talk about that[2].

One careful caveat on this, though: the deleting compactor was written to
_avoid_ tombstones, and it hasn't been well tested against data that
contains tombstones.  So although time is critical for you here, to avoid
ongoing corruption of your data while those bad tombstones remain in the
way, I would still strongly encourage you to validate whether this could
satisfy your use case.

[1] https://github.com/protectwise/cassandra-util
[2] https://www.youtube.com/watch?v=BhGkSnBZgJA

On Wed, Mar 8, 2017 at 6:06 AM Arvydas Jonusonis <
arvydas.jonuso...@gmail.com> wrote:

> That's a good point - a snapshot is certainly in order ASAP, if not
> already done.
>
> One more thing I'd add about "data has to be consolidated from all the
> nodes" (from #2 below):
>
>    - EITHER run the sstable2json ops on each node
>    - OR, if size permits, copy the relevant sstables (containing the
>    desired keys, from the output of nodetool getsstables) locally or onto
>    a new single-node instance, start that instance and run the commands there
>
> If restoring the sstables from a snapshot, you'll need to do the latter
> anyway.
>
> Arvydas
>
> On Wed, Mar 8, 2017 at 1:55 PM, Anuj Wadehra 
> wrote:
>
> DISCLAIMER: This is only my personal opinion. Evaluate the situation
> carefully and if you find below suggestions useful, follow them at your own
> risk.
>
> If I have understood the problem correctly, malicious deletes would
> actually lead to deletion of data.  I am not sure how everything is normal
> after the deletes?
>
> If data is critical, you could:
>
> 1. Take a database snapshot immediately so that you don't lose information
> if delete entries in sstables are compacted together with original data.
>
> 2. Transfer snapshot to suitable place and run some utility such as
> sstable2json to get the keys impacted by the deletes and original data for
> keys. Data has to be consolidated from all the nodes.
>
> 3. Devise a strategy to restore deleted data.
>
> Thanks
> Anuj
>
>
>
> On Tue, Mar 7, 2017 at 8:44 AM, Michael Fong
>  wrote:
>
> Hi, all,
>
> We recently encountered an issue in production in which some records were
> mysteriously deleted with a timestamp 100+ years from now. Everything is
> normal as of now, but how the deletion happened and the accuracy of the
> system timestamp at that moment are unknown. We were wondering if there is
> a general way to recover the mysteriously deleted data when the timestamp
> metadata is screwed up.
>
> Thanks in advance,
>
> Regards,
>
> Michael Fong


Re: Is it possible to recover a deleted-in-future record?

2017-03-08 Thread Arvydas Jonusonis
That's a good point - a snapshot is certainly in order ASAP, if not already
done.

One more thing I'd add about "data has to be consolidated from all the
nodes" (from #2 below):

   - EITHER run the sstable2json ops on each node
   - OR, if size permits, copy the relevant sstables (containing the desired
   keys, from the output of nodetool getsstables) locally or onto a new
   single-node instance, start that instance and run the commands there

If restoring the sstables from a snapshot, you'll need to do the latter
anyway.

Arvydas

On Wed, Mar 8, 2017 at 1:55 PM, Anuj Wadehra  wrote:

> DISCLAIMER: This is only my personal opinion. Evaluate the situation
> carefully and if you find below suggestions useful, follow them at your own
> risk.
>
> If I have understood the problem correctly, malicious deletes would
> actually lead to deletion of data.  I am not sure how everything is normal
> after the deletes?
>
> If data is critical, you could:
>
> 1. Take a database snapshot immediately so that you don't lose information
> if delete entries in sstables are compacted together with original data.
>
> 2. Transfer snapshot to suitable place and run some utility such as
> sstable2json to get the keys impacted by the deletes and original data for
> keys. Data has to be consolidated from all the nodes.
>
> 3. Devise a strategy to restore deleted data.
>
> Thanks
> Anuj
>
>
>
> On Tue, Mar 7, 2017 at 8:44 AM, Michael Fong
>  wrote:
>
> Hi, all,
>
> We recently encountered an issue in production in which some records were
> mysteriously deleted with a timestamp 100+ years from now. Everything is
> normal as of now, but how the deletion happened and the accuracy of the
> system timestamp at that moment are unknown. We were wondering if there is
> a general way to recover the mysteriously deleted data when the timestamp
> metadata is screwed up.
>
> Thanks in advance,
>
> Regards,
>
> Michael Fong


Re: Is it possible to recover a deleted-in-future record?

2017-03-08 Thread Anuj Wadehra
DISCLAIMER: This is only my personal opinion. Evaluate the situation carefully
and if you find the suggestions below useful, follow them at your own risk.

If I have understood the problem correctly, malicious deletes would actually
lead to deletion of data.  I am not sure how everything is normal after the
deletes?

If data is critical, you could:

1. Take a database snapshot immediately so that you don't lose information if
delete entries in sstables are compacted together with original data.
2. Transfer the snapshot to a suitable place and run some utility such as
sstable2json to get the keys impacted by the deletes and the original data for
those keys. Data has to be consolidated from all the nodes.
3. Devise a strategy to restore the deleted data.
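
As a rough illustration of step 2, the sketch below scans already-parsed
sstable2json output for tombstones whose timestamp lies in the future.  The
JSON layout is an assumption (rows with "key", an optional
"metadata"/"deletionInfo"/"markedForDeleteAt" entry, and "cells" or "columns"
arrays of [name, value, timestamp, optional "d" flag]); it differs between
Cassandra versions, so check it against your own dump first.  The function
name and the sample data are hypothetical.

```python
import json


def find_future_tombstones(rows, now_us):
    """Return (key, column_or_None, timestamp) for tombstones dated after now_us."""
    hits = []
    for row in rows:
        key = row.get("key")
        # Assumed location of a partition-level deletion marker.
        info = row.get("metadata", {}).get("deletionInfo", {})
        marked = info.get("markedForDeleteAt")
        if marked is not None and marked > now_us:
            hits.append((key, None, marked))
        # Assumed cell shape: [name, value, timestamp, optional flag, ...],
        # where the flag "d" marks a deleted cell.
        for cell in row.get("cells", row.get("columns", [])):
            if len(cell) >= 4 and cell[3] == "d" and cell[2] > now_us:
                hits.append((key, cell[0], cell[2]))
    return hits


# Made-up sample in the assumed layout; timestamps are in microseconds.
SAMPLE = json.loads("""
[
  {"key": "k1",
   "cells": [["name", "alice", 1488960000000000],
             ["email", "58bfc2a0", 4644633600000000, "d"]]},
  {"key": "k2",
   "metadata": {"deletionInfo": {"markedForDeleteAt": 4644633600000000,
                                 "localDeletionTime": 1488960000}},
   "cells": []}
]
""")

NOW_US = 1488960000000000
hits = find_future_tombstones(SAMPLE, NOW_US)
```

The resulting list of keys could then feed whatever restore strategy is
chosen in step 3.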

Thanks
Anuj

 
 
On Tue, Mar 7, 2017 at 8:44 AM, Michael Fong wrote:

> Hi, all,
>
> We recently encountered an issue in production in which some records were
> mysteriously deleted with a timestamp 100+ years from now. Everything is
> normal as of now, but how the deletion happened and the accuracy of the
> system timestamp at that moment are unknown. We were wondering if there is
> a general way to recover the mysteriously deleted data when the timestamp
> metadata is screwed up.
>
> Thanks in advance,
>
> Regards,
>
> Michael Fong
   


Re: Is it possible to recover a deleted-in-future record?

2017-03-08 Thread Arvydas Jonusonis
Use nodetool getsstables to discover which sstables contain the data, and
then dump them with sstable2json -k <key> to explore the content of the
data/mutations for those keys.
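
A small sketch of how those two commands could be assembled from a script.
It only builds the argument lists and does not run anything.  The keyspace,
table, key and sstable path are hypothetical placeholders, the helper names
are mine, and the key encoding that sstable2json expects may depend on the
Cassandra version, so treat this as an outline.

```python
import shlex


def getsstables_cmd(keyspace, table, key):
    """Build the nodetool call that lists the sstables containing a key."""
    return ["nodetool", "getsstables", keyspace, table, key]


def sstable2json_cmd(sstable_path, keys):
    """Build the sstable2json call that dumps only the given keys."""
    cmd = ["sstable2json", sstable_path]
    for key in keys:
        cmd += ["-k", key]
    return cmd


def render(cmd):
    """Render an argument list as a shell-safe command line."""
    return " ".join(shlex.quote(part) for part in cmd)


# Hypothetical names, for illustration only.
lookup = getsstables_cmd("my_ks", "my_table", "user42")
dump = sstable2json_cmd("/path/to/my_ks-my_table-ka-1-Data.db", ["user42"])

print(render(lookup))
print(render(dump))
```

The printed lines can be reviewed and then run by hand on each node, or
against sstables copied out of a snapshot.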

Arvydas

On Tue, Mar 7, 2017 at 4:13 AM, Michael Fong <
michael.f...@ruckuswireless.com> wrote:

> Hi, all,
>
> We recently encountered an issue in production in which some records were
> mysteriously deleted with a timestamp 100+ years from now. Everything is
> normal as of now, but how the deletion happened and the accuracy of the
> system timestamp at that moment are unknown. We were wondering if there is
> a general way to recover the mysteriously deleted data when the timestamp
> metadata is screwed up.
>
> Thanks in advance,
>
> Regards,
>
> Michael Fong


Is it possible to recover a deleted-in-future record?

2017-03-06 Thread Michael Fong
Hi, all,


We recently encountered an issue in production in which some records were
mysteriously deleted with a timestamp 100+ years from now. Everything is
normal as of now, but how the deletion happened and the accuracy of the
system timestamp at that moment are unknown. We were wondering if there is
a general way to recover the mysteriously deleted data when the timestamp
metadata is screwed up.

Thanks in advance,

Regards,

Michael Fong