Re: implementing a 'sorted set' on top of cassandra

2017-01-13 Thread Benjamin Roth
Whether your proposed solution is crazy depends on your needs :)
It sounds like you can live with non-realtime data, so it is OK to cache
it. Why preproduce the results if you only need 5% of them? Why not use
redis as a cache with expiring sorted sets that are filled on demand from
cassandra partitions with counters?
That way redis has much less to do and can scale much better. And you are
not limited to keeping all data in RAM, as cache data is volatile and can be
evicted on demand.
Whether this is effective also depends on the size of your sets. Cassandra
won't be able to sort them by score for you, so you will have to load the
complete set into redis for caching and/or do the sorting in your app on
demand. This certainly won't work out well for sets with millions of entries.
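
A rough sketch of that cache-on-demand idea (the keyspace, table, key prefix
and TTL below are assumptions for illustration, not from the thread): counters
stay in Cassandra, and redis only holds a volatile, expiring copy of the few
sets that are actually read.

    # Assumed counter table:
    #   CREATE TABLE my_ks.scores (
    #       set_id text, member text, score counter,
    #       PRIMARY KEY (set_id, member));
    import redis
    from cassandra.cluster import Cluster

    r = redis.Redis(host='localhost', port=6379)
    session = Cluster(['127.0.0.1']).connect('my_ks')   # hypothetical keyspace

    def top_n(set_id, n=10, ttl=300):
        key = 'zset:' + set_id
        if not r.exists(key):
            # Cache miss: pull the whole counter partition from Cassandra and
            # let redis sort it. Only reasonable while the set fits in one read.
            rows = session.execute(
                "SELECT member, score FROM scores WHERE set_id = %s", (set_id,))
            mapping = {row.member: row.score for row in rows}
            if mapping:
                r.zadd(key, mapping)
                r.expire(key, ttl)   # volatile cache entry, evicted after ttl seconds
        return r.zrevrange(key, 0, n - 1, withscores=True)

    # Increments keep going straight to the Cassandra counters, e.g.:
    #   UPDATE scores SET score = score + 1 WHERE set_id = ? AND member = ?;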

2017-01-13 23:14 GMT+01:00 Mike Torra :

> We currently use redis to store sorted sets that we increment many, many
> times more than we read. For example, only about 5% of these sets are ever
> read. We are getting to the point where redis is becoming difficult to
> scale (currently at >20 nodes).
>
> We've started using cassandra for other things, and now we are
> experimenting to see if having a similar 'sorted set' data structure is
> feasible in cassandra. My approach so far is:
>
>1. Use a counter CF to store the values I want to sort by
>2. Periodically read in all key/values in the counter CF and sort in
>the client application (~every five minutes or so)
>3. Write back to a different CF with the ordered keys I care about
>
> Does this seem crazy? Is there a simpler way to do this in cassandra?
>



-- 
Benjamin Roth


Re: implementing a 'sorted set' on top of cassandra

2017-01-13 Thread Benjamin Roth
Not if you want to sort by score (a counter)

On 14.01.2017 at 08:33, "DuyHai Doan"  wrote:

> Clustering column can be seen as sorted set
>
> Table abstraction == Map<partition key, SortedMap<clustering columns, value>>
>
>
> On Sat, Jan 14, 2017 at 2:28 AM, Edward Capriolo 
> wrote:
>
>>
>>
>> On Fri, Jan 13, 2017 at 8:14 PM, Jonathan Haddad 
>> wrote:
>>
>>> I've thought about this for years and have never arrived on a
>>> particularly great implementation.  Your idea will be maybe OK if the sets
>>> are very small and if the values don't change very often.  But in a system
>>> where the values of the keys in the set change frequently (lots of
>>> tombstones) or the sets are large I think you're going to experience quite
>>> a bit of pain.
>>>
>>> On Fri, Jan 13, 2017 at 2:14 PM Mike Torra 
>>> wrote:
>>>
>>> We currently use redis to store sorted sets that we increment many, many
>>> times more than we read. For example, only about 5% of these sets are ever
>>> read. We are getting to the point where redis is becoming difficult to
>>> scale (currently at >20 nodes).
>>>
>>> We've started using cassandra for other things, and now we are
>>> experimenting to see if having a similar 'sorted set' data structure is
>>> feasible in cassandra. My approach so far is:
>>>
>>>1. Use a counter CF to store the values I want to sort by
>>>2. Periodically read in all key/values in the counter CF and sort in
>>>the client application (~every five minutes or so)
>>>3. Write back to a different CF with the ordered keys I care about
>>>
>>> Does this seem crazy? Is there a simpler way to do this in cassandra?
>>>
>>>
>> Redis is the other side of the coin.
>>
>> Fast:
>> https://groups.google.com/forum/#!topic/redis-db/4TAItKMyUEE
>>
>> http://stackoverflow.com/questions/6076342/is-there-a-practical-limit-to-the-number-of-elements-in-a-sorted-set-in-redis
>>
>> 320MB of memory for 2,000,000 email addresses is hard to scale. If you are
>> only maintaining a single list, great, but if you have millions of lists
>> this memory/cost profile is not ideal.
>>
>
>


Re: implementing a 'sorted set' on top of cassandra

2017-01-13 Thread DuyHai Doan
Clustering column can be seen as sorted set

Table abstraction == Map<partition key, SortedMap<clustering columns, value>>
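
To make the analogy concrete, here is a minimal sketch via the Python driver
(schema and names are assumptions for illustration). Rows within a partition
are kept ordered by the clustering columns, so reading the partition returns
members already sorted by score. Note that the score then has to be a regular
column you rewrite, not a counter, since counters cannot be part of the
primary key, and rewriting means a delete plus an insert.

    from cassandra.cluster import Cluster

    session = Cluster(['127.0.0.1']).connect('my_ks')   # hypothetical keyspace

    session.execute("""
        CREATE TABLE IF NOT EXISTS leaderboard (
            set_id text,
            score  bigint,
            member text,
            PRIMARY KEY ((set_id), score, member)
        ) WITH CLUSTERING ORDER BY (score DESC, member ASC)
    """)

    def set_score(set_id, member, old_score, new_score):
        # Changing a score means deleting the old row and inserting a new one,
        # which is where the tombstone concern raised in this thread comes in.
        if old_score is not None:
            session.execute(
                "DELETE FROM leaderboard WHERE set_id = %s AND score = %s AND member = %s",
                (set_id, old_score, member))
        session.execute(
            "INSERT INTO leaderboard (set_id, score, member) VALUES (%s, %s, %s)",
            (set_id, new_score, member))

    # Top 10, already ordered by the clustering columns:
    top10 = session.execute(
        "SELECT member, score FROM leaderboard WHERE set_id = %s LIMIT 10", ('game1',))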


On Sat, Jan 14, 2017 at 2:28 AM, Edward Capriolo 
wrote:

>
>
> On Fri, Jan 13, 2017 at 8:14 PM, Jonathan Haddad 
> wrote:
>
>> I've thought about this for years and have never arrived on a
>> particularly great implementation.  Your idea will be maybe OK if the sets
>> are very small and if the values don't change very often.  But in a system
>> where the values of the keys in the set change frequently (lots of
>> tombstones) or the sets are large I think you're going to experience quite
>> a bit of pain.
>>
>> On Fri, Jan 13, 2017 at 2:14 PM Mike Torra  wrote:
>>
>> We currently use redis to store sorted sets that we increment many, many
>> times more than we read. For example, only about 5% of these sets are ever
>> read. We are getting to the point where redis is becoming difficult to
>> scale (currently at >20 nodes).
>>
>> We've started using cassandra for other things, and now we are
>> experimenting to see if having a similar 'sorted set' data structure is
>> feasible in cassandra. My approach so far is:
>>
>>1. Use a counter CF to store the values I want to sort by
>>2. Periodically read in all key/values in the counter CF and sort in
>>the client application (~every five minutes or so)
>>3. Write back to a different CF with the ordered keys I care about
>>
>> Does this seem crazy? Is there a simpler way to do this in cassandra?
>>
>>
> Redis is the other side of the coin.
>
> Fast:
> https://groups.google.com/forum/#!topic/redis-db/4TAItKMyUEE
>
> http://stackoverflow.com/questions/6076342/is-there-a-practical-limit-to-the-number-of-elements-in-a-sorted-set-in-redis
>
> 320MB of memory for 2,000,000 email addresses is hard to scale. If you are
> only maintaining a single list, great, but if you have millions of lists
> this memory/cost profile is not ideal.
>


Re: implementing a 'sorted set' on top of cassandra

2017-01-13 Thread Edward Capriolo
On Fri, Jan 13, 2017 at 8:14 PM, Jonathan Haddad  wrote:

> I've thought about this for years and have never arrived on a particularly
> great implementation.  Your idea will be maybe OK if the sets are very
> small and if the values don't change very often.  But in a system where the
> values of the keys in the set change frequently (lots of tombstones) or the
> sets are large I think you're going to experience quite a bit of pain.
>
> On Fri, Jan 13, 2017 at 2:14 PM Mike Torra  wrote:
>
> We currently use redis to store sorted sets that we increment many, many
> times more than we read. For example, only about 5% of these sets are ever
> read. We are getting to the point where redis is becoming difficult to
> scale (currently at >20 nodes).
>
> We've started using cassandra for other things, and now we are
> experimenting to see if having a similar 'sorted set' data structure is
> feasible in cassandra. My approach so far is:
>
>1. Use a counter CF to store the values I want to sort by
>2. Periodically read in all key/values in the counter CF and sort in
>the client application (~every five minutes or so)
>3. Write back to a different CF with the ordered keys I care about
>
> Does this seem crazy? Is there a simpler way to do this in cassandra?
>
>
Redis is the other side of the coin.

Fast:
https://groups.google.com/forum/#!topic/redis-db/4TAItKMyUEE

http://stackoverflow.com/questions/6076342/is-there-a-practical-limit-to-the-number-of-elements-in-a-sorted-set-in-redis

320MB of memory for 2,000,000 email addresses is hard to scale. If you are
only maintaining a single list, great, but if you have millions of lists
this memory/cost profile is not ideal.


Re: implementing a 'sorted set' on top of cassandra

2017-01-13 Thread Jonathan Haddad
I've thought about this for years and have never arrived at a particularly
great implementation.  Your idea will maybe be OK if the sets are very
small and if the values don't change very often.  But in a system where the
values of the keys in the set change frequently (lots of tombstones) or the
sets are large, I think you're going to experience quite a bit of pain.

On Fri, Jan 13, 2017 at 2:14 PM Mike Torra  wrote:

We currently use redis to store sorted sets that we increment many, many
times more than we read. For example, only about 5% of these sets are ever
read. We are getting to the point where redis is becoming difficult to
scale (currently at >20 nodes).

We've started using cassandra for other things, and now we are
experimenting to see if having a similar 'sorted set' data structure is
feasible in cassandra. My approach so far is:

   1. Use a counter CF to store the values I want to sort by
   2. Periodically read in all key/values in the counter CF and sort in the
   client application (~every five minutes or so)
   3. Write back to a different CF with the ordered keys I care about

Does this seem crazy? Is there a simpler way to do this in cassandra?


Re: implementing a 'sorted set' on top of cassandra

2017-01-13 Thread Edward Capriolo
On Fri, Jan 13, 2017 at 5:14 PM, Mike Torra  wrote:

> We currently use redis to store sorted sets that we increment many, many
> times more than we read. For example, only about 5% of these sets are ever
> read. We are getting to the point where redis is becoming difficult to
> scale (currently at >20 nodes).
>
> We've started using cassandra for other things, and now we are
> experimenting to see if having a similar 'sorted set' data structure is
> feasible in cassandra. My approach so far is:
>
>1. Use a counter CF to store the values I want to sort by
>2. Periodically read in all key/values in the counter CF and sort in
>the client application (~every five minutes or so)
>3. Write back to a different CF with the ordered keys I care about
>
> Does this seem crazy? Is there a simpler way to do this in cassandra?
>

Have you considered using only the keys in Cassandra's map type?

I proposed an implementation that I wanted to experiment with for adding to a
set: https://issues.apache.org/jira/browse/CASSANDRA-6870 . Even though
redis and its feature set are wildly popular, there is not a great consensus
that Cassandra should do those things as manipulations of a single column.
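
One way to read that, purely as an illustrative sketch under assumptions (the
names and schema below are placeholders, not necessarily the intended design):
keep the members of a small set in a single collection column and sort
client-side when it is read. Collection values cannot be counters and a
non-frozen map is read back whole, so this only really suits small,
rarely-read sets.

    from cassandra.cluster import Cluster

    session = Cluster(['127.0.0.1']).connect('my_ks')   # hypothetical keyspace
    session.execute("""
        CREATE TABLE IF NOT EXISTS small_sets (
            set_id  text PRIMARY KEY,
            members map<text, bigint>
        )
    """)

    # Upsert one member's score (overwrites the value for an existing key).
    session.execute(
        "UPDATE small_sets SET members = members + {'alice': 42} WHERE set_id = %s",
        ('game1',))

    # The map comes back as a whole; ranking happens in the client.
    row = session.execute(
        "SELECT members FROM small_sets WHERE set_id = %s", ('game1',)).one()
    ranked = sorted((row.members or {}).items(), key=lambda kv: kv[1], reverse=True)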


implementing a 'sorted set' on top of cassandra

2017-01-13 Thread Mike Torra
We currently use redis to store sorted sets that we increment many, many times 
more than we read. For example, only about 5% of these sets are ever read. We 
are getting to the point where redis is becoming difficult to scale (currently 
at >20 nodes).

We've started using cassandra for other things, and now we are experimenting to 
see if having a similar 'sorted set' data structure is feasible in cassandra. 
My approach so far is:

  1.  Use a counter CF to store the values I want to sort by
  2.  Periodically read in all key/values in the counter CF and sort in the 
client application (~every five minutes or so)
  3.  Write back to a different CF with the ordered keys I care about

Does this seem crazy? Is there a simpler way to do this in cassandra?
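
For reference, a rough sketch of those three steps (the keyspace, table names
and top-N cutoff are just placeholders): a periodic job reads the whole
counter partition, sorts it client-side, and writes the ordering back.

    from cassandra.cluster import Cluster

    session = Cluster(['127.0.0.1']).connect('my_ks')   # hypothetical keyspace

    # 1. Counter CF holding the values to sort by:
    #      CREATE TABLE scores (set_id text, member text, score counter,
    #                           PRIMARY KEY (set_id, member));
    # 3. Target CF holding the precomputed ordering:
    #      CREATE TABLE ranked (set_id text, rank int, member text, score bigint,
    #                           PRIMARY KEY (set_id, rank));

    def rebuild_ranking(set_id, top_n=100):
        # 2. Read the whole counter partition and sort in the client.
        rows = session.execute(
            "SELECT member, score FROM scores WHERE set_id = %s", (set_id,))
        ordered = sorted(rows, key=lambda r: r.score, reverse=True)[:top_n]
        # 3. Write the ordered keys back, e.g. every five minutes per set_id.
        for rank, row in enumerate(ordered, start=1):
            session.execute(
                "INSERT INTO ranked (set_id, rank, member, score) VALUES (%s, %s, %s, %s)",
                (set_id, rank, row.member, row.score))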


Re: incremental repairs with -pr flag?

2017-01-13 Thread Bruno Lavoie
Another point: I've done another test on my 5-node cluster.

Created a keyspace with a replication factor of 5 and inserted some data into
it.

Ran a full repair on each node to make sstables appear on disk.

Then ran, multiple times on each node:
1 - nodetool repair
2 - nodetool repair -pr

Due to incremental repair, if I alternate between these 2 commands, no files
change on disk... It means anti-compaction is used.

But on the wall-clock timing, the first method takes an average of 15
seconds and the second one an average of 10 seconds.

But there's no mutation between these commands...

Bruno



On Fri, Jan 13, 2017 at 3:43 PM, Bruno Lavoie  wrote:

> Thanks for your reply,
>
> But I can't figure out why it's not recommended by DataStax to run
> primary-range with incremental repair...
>
> It's just doing less work on each repair call on the repaired node.
> At the end, when all the nodes are repaired using either method, all data
> is equally consistent? (excluding reads/writes occurring during repair)
>
> Does it harm something?
> Maybe I'm missing something... or want to understand too much.
>
> I've run a test on a node both ways, and the output is pretty
> similar (5 nodes, replication factor 2):
> «nodetool repair»  https://gist.github.com/blavoie/cc6bbad10ec1b0bf403b3486765d1e0b
> «nodetool repair -pr» https://gist.github.com/blavoie/f9c5aaf6bf75ad7d2e6b40213cbf6021
>
> On the same topic: when is it advised to do a full repair?
> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesWhen.html
>
> They recommend running incremental repairs daily, and full repairs weekly or
> monthly.
> And a full repair eliminates anti-compaction; is that bad?
> Does regularly running anti-compaction generate a lot of files?
>
> Thanks again
> Bruno Lavoie
>
> On Wed, Jan 11, 2017 at 3:23 PM, Paulo Motta 
> wrote:
>
>> The objective of non-incremental primary-range repair is to avoid redoing
>> work, but with incremental repair anticompaction will segregate repaired
>> data so no extra work is done on the next repair.
>>
>> You should run nodetool repair [ks] [table] on all nodes sequentially.
>> The more often you run it, the less time repair will take, so just choose
>> the periodicity that suits you best, provided it's below gc_grace_seconds.
>>
>>
>> 2017-01-10 13:40 GMT-02:00 Bruno Lavoie :
>>
>>>
>>>
>>> On 2016-10-24 13:39 (-0500), Alexander Dejanovski <
>>> a...@thelastpickle.com> wrote:
>>> > Hi Sean,
>>> >
>>> > In order to mitigate its impact, anticompaction is not fully executed
>>> > when incremental repair is run with -pr. What you'll observe is that
>>> > running repair on all nodes with -pr will leave sstables marked as
>>> > unrepaired on all of them.
>>> >
>>> > Then, if you think about it you realize it's no big deal, as -pr is
>>> > useless with incremental repair: data is repaired only once with
>>> > incremental repair, which is what -pr intended to fix on full repair,
>>> > by repairing all token ranges only once instead of as many times as
>>> > the replication factor.
>>> >
>>> > Cheers,
>>> >
>>> > On Mon, Oct 24, 2016 at 18:05, Sean Bridges  wrote:
>>> >
>>> > > Hey,
>>> > >
>>> > > In the datastax documentation on repair [1], it says,
>>> > >
>>> > > "The partitioner range option is recommended for routine maintenance.
>>> > > Do not use it to repair a downed node. Do not use with incremental
>>> > > repair (default for Cassandra 3.0 and later)."
>>> > >
>>> > > Why is it not recommended to use -pr with incremental repairs?
>>> > >
>>> > > Thanks,
>>> > >
>>> > > Sean
>>> > >
>>> > > [1]
>>> > > https://docs.datastax.com/en/cassandra/3.x/cassandra/operations/opsRepairNodesManualRepair.html

Re: incremental repairs with -pr flag?

2017-01-13 Thread Bruno Lavoie
Thanks for your reply,

But I can't figure out why it's not recommended by DataStax to run
primary-range with incremental repair...

It's just doing less work on each repair call on the repaired node.
At the end, when all the nodes are repaired using either method, all data
is equally consistent? (excluding reads/writes occurring during repair)

Does it harm something?
Maybe I'm missing something... or want to understand too much.

I've run a test on a node both ways, and the output is pretty similar
(5 nodes, replication factor 2):
«nodetool repair»
https://gist.github.com/blavoie/cc6bbad10ec1b0bf403b3486765d1e0b
«nodetool repair -pr»
https://gist.github.com/blavoie/f9c5aaf6bf75ad7d2e6b40213cbf6021

On the same topic: when is it advised to do a full repair?
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesWhen.html

They recommend running incremental repairs daily, and full repairs weekly or
monthly.
And a full repair eliminates anti-compaction; is that bad?
Does regularly running anti-compaction generate a lot of files?

Thanks again
Bruno Lavoie

On Wed, Jan 11, 2017 at 3:23 PM, Paulo Motta 
wrote:

> The objective of non-incremental primary-range repair is to avoid redoing
> work, but with incremental repair anticompaction will segregate repaired
> data so no extra work is done on the next repair.
>
> You should run nodetool repair [ks] [table] on all nodes sequentially. The
> more often you run it, the less time repair will take, so just choose the
> periodicity that suits you best, provided it's below gc_grace_seconds.
>
>
> 2017-01-10 13:40 GMT-02:00 Bruno Lavoie :
>
>>
>>
>> On 2016-10-24 13:39 (-0500), Alexander Dejanovski 
>> wrote:
>> > Hi Sean,
>> >
>> > In order to mitigate its impact, anticompaction is not fully executed
>> > when incremental repair is run with -pr. What you'll observe is that
>> > running repair on all nodes with -pr will leave sstables marked as
>> > unrepaired on all of them.
>> >
>> > Then, if you think about it you realize it's no big deal, as -pr is
>> > useless with incremental repair: data is repaired only once with
>> > incremental repair, which is what -pr intended to fix on full repair,
>> > by repairing all token ranges only once instead of as many times as
>> > the replication factor.
>> >
>> > Cheers,
>> >
>> > On Mon, Oct 24, 2016 at 18:05, Sean Bridges  wrote:
>> >
>> > > Hey,
>> > >
>> > > In the datastax documentation on repair [1], it says,
>> > >
>> > > "The partitioner range option is recommended for routine maintenance.
>> > > Do not use it to repair a downed node. Do not use with incremental
>> > > repair (default for Cassandra 3.0 and later)."
>> > >
>> > > Why is it not recommended to use -pr with incremental repairs?
>> > >
>> > > Thanks,
>> > >
>> > > Sean
>> > >
>> > > [1]
>> > > https://docs.datastax.com/en/cassandra/3.x/cassandra/operations/opsRepairNodesManualRepair.html
>> > -
>> > Alexander Dejanovski
>> > France
>> > @alexanderdeja
>> >
>> > Consultant
>> > Apache Cassandra Consulting
>> > http://www.thelastpickle.com
>> >
>>
>> Hello,
>>
>> I was looking for exactly the same detail about the Datastax documentation,
>> and I'm not sure I understand everything from your response. I looked in my
>> copy of Cassandra: The Definitive Guide and there is nothing about this
>> detail either.
>>
>> IIRC:
>> - with incremental repair, it's safe to simply run 'nodetool repair' on
>> each node, without any overhead or wasted resources (building merkle trees,
>> compaction, etc.)?
>> - I've read that we must manually run anti-entropy repair on each
>> node weekly or before gc_grace_seconds (default 10 days)? Or only on
>> returning dead 

Re: Metric to monitor partition size

2017-01-13 Thread Bryan Cheng
We're on 2.X so this information may not apply to your version, but you
should see:

1) A log statement upon compaction, like "Writing large partition",
including the primary partition key (see
https://issues.apache.org/jira/browse/CASSANDRA-9643). Configurable
threshold in cassandra.yaml

2) Problematic partition distributions in nodetool cfhistograms, although
without the primary partition key

3) Potentially large partitions in sstables themselves using sstable
parsing utilities. There's also a patch for sstablekeys here, but I've
never used it (https://issues.apache.org/jira/browse/CASSANDRA-8720)

While you _could_ monitor partitions and stop writing to that partition
key when the size reaches a certain threshold (roughly acquired through a
method like the above), I'm struggling to think of a case where you'd actually
want to do that; pushing partitions to some maximum size is generally not a
great idea. Ideally you'd want your partitions as small as you can manage
them without making your queries absolutely neurotic.
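
For option (1), here's a minimal sketch of scanning the system log for those
warnings; the log location and the exact message text after the marker are
assumptions and vary by version, so treat the regex as a starting point.

    import re

    LOG = '/var/log/cassandra/system.log'            # assumed default location
    pattern = re.compile(r'Writing large partition\s+(.*)')

    with open(LOG) as fh:
        for line in fh:
            m = pattern.search(line)
            if m:
                # The captured text typically names keyspace/table:partition-key
                # and the size; adjust parsing to your version's exact format.
                print(m.group(1).strip())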



On Thu, Jan 12, 2017 at 6:08 AM, Saumitra S 
wrote:

> Is there any metric or way to find out if any partition has grown beyond a
> certain size or certain row count?
>
> If a partition reaches a certain size or limit, I want to stop sending
> further write requests to it. Is it possible?
>
>
>


Re: Backups eating up disk space

2017-01-13 Thread Kunal Gangakhedkar
Great, thanks a lot to all for the help :)

I finally took the dive and went with Razi's suggestions.
In summary, this is what I did:

   - turn off incremental backups on each of the nodes in rolling fashion
   - remove the 'backups' directory from each keyspace on each node.

This ended up freeing up almost 350GB on each node - yay :)
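
For reference, here is a rough, report-first sketch of the cleanup part (the
data directory path is an assumption; adjust for your install). It only
prints sizes unless delete=True is passed, since removing the backups
hardlinks is irreversible, and incremental_backups should already be off.

    import os
    import shutil

    DATA_DIR = '/var/lib/cassandra/data'   # assumed default data directory

    def clean_backups(data_dir=DATA_DIR, delete=False):
        for root, dirs, _files in os.walk(data_dir):
            if 'backups' in dirs:
                path = os.path.join(root, 'backups')
                size = sum(os.path.getsize(os.path.join(d, f))
                           for d, _, files in os.walk(path) for f in files)
                print('%s  %.1f MB' % (path, size / 1e6))
                if delete:
                    shutil.rmtree(path)        # frees the incremental backup hardlinks
                dirs.remove('backups')         # do not descend into it afterwards

    clean_backups()               # dry run: report sizes only
    # clean_backups(delete=True)  # actually remove the backups directories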

Again, thanks a lot for the help, guys.

Kunal

On 12 January 2017 at 21:15, Khaja, Raziuddin (NIH/NLM/NCBI) [C] <
raziuddin.kh...@nih.gov> wrote:

> snapshots are slightly different than backups.
>
>
>
> In my explanation of the hardlinks created in the backups folder, notice
> that compacted sstables never end up in the backups folder.
>
>
> On the other hand, a snapshot is meant to represent the data at a
> particular moment in time. Thus, the snapshots directory contains hardlinks
> to all active sstables at the time the snapshot was taken, which would
> include: compacted sstables; and any sstables from memtable flush or
> streamed from other nodes that both exist in the table directory and the
> backups directory.
>
>
>
> So, that would be the difference between snapshots and backups.
>
>
>
> Best regards,
>
> -Razi
>
>
>
>
>
> *From: *Alain RODRIGUEZ 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Thursday, January 12, 2017 at 9:16 AM
>
> *To: *"user@cassandra.apache.org" 
> *Subject: *Re: Backups eating up disk space
>
>
>
> My 2 cents,
>
>
>
> As I mentioned earlier, we're not currently using snapshots - it's only
> the backups that are bothering me right now.
>
>
>
> I believe backups folder is just the new name for the previously called
> snapshots folder. But I can be completely wrong, I haven't played that much
> with snapshots in new versions yet.
>
>
>
> Anyway, some operations in Apache Cassandra can trigger a snapshot:
>
>
>
> - Repair (when not using parallel option but sequential repairs instead)
>
> - Truncating a table (by default)
>
> - Dropping a table (by default)
>
> - Maybe other I can't think of... ?
>
>
>
> If you want to clean space but still keep a backup you can run:
>
>
>
> "nodetool clearsnapshots"
>
> "nodetool snapshot "
>
>
>
> This way, and for a while, data won't be taking up space, as old files will
> be cleaned and new files will only be hardlinks, as detailed above. Then you
> might want to work on a proper backup policy, probably implying getting
> data out of the production servers (a lot of people use S3 or similar
> services). Or just do that from time to time, meaning you only keep a
> backup and disk space behaviour will be hard to predict.
>
>
>
> C*heers,
>
> ---
>
> Alain Rodriguez - @arodream - al...@thelastpickle.com
>
> France
>
>
>
> The Last Pickle - Apache Cassandra Consulting
>
> http://www.thelastpickle.com
>
>
>
> 2017-01-12 6:42 GMT+01:00 Prasenjit Sarkar :
>
> Hi Kunal,
>
>
>
> Razi's post does give a very lucid description of how cassandra manages
> the hard links inside the backup directory.
>
>
>
> Where it needs clarification is the following:
>
> --> incremental backups are a system-wide setting, and so it's an
> all-or-nothing approach
>
>
>
> --> as multiple people have stated, incremental backups do not create hard
> links to compacted sstables. however, this can bloat the size of your
> backups
>
>
>
> --> again as stated, it is a general industry practice to place backups in
> a different secondary storage location than the main production site. So
> best to move it to the secondary storage before applying rm on the backups
> folder
>
>
>
> In my experience with production clusters, managing the backups folder
> across multiple nodes can be painful if the objective is to ever recover
> data. With the usual disclaimers, better to rely on third party vendors to
> accomplish the needful rather than scripts/tablesnap.
>
>
>
> Regards
>
> Prasenjit
>
>
>
>
>
> On Wed, Jan 11, 2017 at 7:49 AM, Khaja, Raziuddin (NIH/NLM/NCBI) [C] <
> raziuddin.kh...@nih.gov> wrote:
>
> Hello Kunal,
>
>
>
> Caveat: I am not a super-expert on Cassandra, but it helps to explain to
> others, in order to eventually become an expert, so if my explanation is
> wrong, I would hope others would correct me. :)
>
>
>
> The active sstables/data files are all the files located in the
> directory for the table.
>
> You can safely remove all files under the backups/ directory and the
> directory itself.
>
> Removing any files that are current hard-links inside backups won’t cause
> any issues, and I will explain why.
>
>
>
> Have you looked at your Cassandra.yaml file and checked the setting for
> incremental_backups?  If it is set to true, and you don’t want to make new
> backups, you can set it to false, so that after you clean up, you will not
> have to clean up the backups again.
>
>
>
> Explanation:
>
> Lets look at the the definition of incremental backups again: “Cassandra
> creates a hard link to each SSTable 

Re: Check snapshot / sstable integrity

2017-01-13 Thread Jérôme Mainaud
Hello Alain,

Thank you for your answer.

Basically, having a tool to check all sstables in a folder using their
checksums would be nice. But in the end I can get the same result with some
shasum tool.
The goal is to verify the integrity of files copied back from an external
backup tool.

The question came up because their backup system corrupted some files in the
past, and they think with their current backup process in mind.
I will insist on snapshot-on-truncate, which already saved me once, and on
other checks being done by the backup tool if one is used.
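
For what it's worth, a minimal sketch of the shasum-style check I have in
mind (the paths and manifest format are assumptions): write a sha256 manifest
next to the snapshot when it is taken, ship it with the backup, and re-verify
after the files are copied back.

    import hashlib
    import os

    def sha256_of(path, bufsize=1 << 20):
        h = hashlib.sha256()
        with open(path, 'rb') as fh:
            for chunk in iter(lambda: fh.read(bufsize), b''):
                h.update(chunk)
        return h.hexdigest()

    def write_manifest(snapshot_dir, manifest='MANIFEST.sha256'):
        # Record a digest for every sstable component in the snapshot directory.
        with open(os.path.join(snapshot_dir, manifest), 'w') as out:
            for name in sorted(os.listdir(snapshot_dir)):
                path = os.path.join(snapshot_dir, name)
                if os.path.isfile(path) and name != manifest:
                    out.write('%s  %s\n' % (sha256_of(path), name))

    def verify_manifest(snapshot_dir, manifest='MANIFEST.sha256'):
        # Re-hash the restored files and compare against the recorded digests.
        ok = True
        with open(os.path.join(snapshot_dir, manifest)) as fh:
            for line in fh:
                digest, name = line.split(None, 1)
                path = os.path.join(snapshot_dir, name.strip())
                if not os.path.isfile(path) or sha256_of(path) != digest:
                    print('MISSING OR CORRUPT:', name.strip())
                    ok = False
        return ok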

Cheers,


-- 
Jérôme Mainaud
jer...@mainaud.com

2017-01-12 14:05 GMT+01:00 Alain RODRIGUEZ :

> Hi Jérôme,
>
> About this concern:
>
> But my Op restrains my arm and asks: "Are you sure that the snapshot is safe
>> and will be restored before truncating data we have?"
>
>
> Make sure to enable snapshot on truncate (cassandra.yaml) or do it
> manually. This way, if the restored dataset is worse than the current one
> (the one you plan to truncate), you can always roll back this truncate /
> restore action. This way you can tell your "Op" that this is perfectly safe
> anyway, no data would be lost, even in the worst case scenario (not
> considering the downtime that would be induced). Plus this snapshot is
> cheap (hard links) and does not need to be moved around or kept once you are
> sure the old backup fits your need.
>
> Truncate is definitely the way to go before restoring a backup. Parsing
> the data to delete it all is not really an option imho.
>
> Then, about the technical question "how to know that a snapshot is clean",
> it would be good to define "clean". You can make sure the backup is
> readable, consistent enough and corresponds to what you want by inserting
> all the sstables into a testing cluster and performing some reads there
> before doing it in production. You can use, for example, AWS EC2 machines
> with big EBS volumes attached or whatever, and use sstableloader to load data
> into it.
>
> If you are just worried about SSTable format validity, there is no tool I
> am aware of that checks sstables are well formatted, but it might exist or
> be doable. Another option might be to compute a checksum for each sstable
> before uploading it elsewhere and make sure it matches when downloaded back.
> Those are the first things that come to my mind.
>
> Hope that is helpful. Hopefully, someone else will be able to point you to
> an existing tool to do this work.
>
> Cheers,
> ---
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2017-01-12 11:33 GMT+01:00 Jérôme Mainaud :
>
>> Hello,
>>
>> Is there any tool to test the integrity of a snapshot?
>>
>> Suppose I have a snapshot based backup stored in an external low cost
>> storage system that I want to restore to a database after someone deleted
>> important data by mistake.
>>
>> Before restoring the files, I will truncate the table to remove the
>> problematic tombstones.
>>
>> But my Op restrains my arm and asks: "Are you sure that the snapshot is
>> safe and will be restored before truncating data we have?"
>>
>> Even if this scenario is theoretical, the question is a good one. How can I
>> verify that a snapshot is clean?
>>
>> Thank you,
>>
>> --
>> Jérôme Mainaud
>> jer...@mainaud.com
>>
>
>