Re: Log application Queries

2018-06-05 Thread Luigi Tagliamonte
I implemented an audit query handler that archives the queries in ES here:
https://github.com/ltagliamonte/cassandra-audit
PRs are welcome!

On Mon, May 28, 2018 at 1:55 AM, Horia Mocioi wrote:

> Hello,
>
> Another way to do it would be to create your own QueryHandler:
>
>- create a class that implements the QueryHandler interface and
>make Cassandra aware of it
>- in that class you can maintain a map of the prepared queries (add to
>it when the prepare method is called) and look up the current query
>when getPrepared is called, using its MD5Digest id
>- when processPrepared is called you can replace the ? placeholders in
>the query string with the values from QueryOptions.getValues().
>
>
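The bookkeeping described in the steps above can be sketched as follows. This is a minimal, self-contained illustration in plain Java, not Cassandra's actual API: `PreparedQueryTracker`, `onPrepare`, and `render` are made-up names standing in for the work a real handler would do inside its QueryHandler implementation (registered via the `cassandra.custom_query_handler_class` system property).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in for the query bookkeeping a custom QueryHandler
// could do; class and method names are hypothetical, not Cassandra's API.
public class PreparedQueryTracker {
    private final Map<String, String> queries = new HashMap<>();

    // Analogous to prepare(): remember the query string under its MD5 digest id.
    public String onPrepare(String query) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] digest = md5.digest(query.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b & 0xff));
        String id = hex.toString();
        queries.put(id, query);
        return id;
    }

    // Analogous to processPrepared(): fetch the query by id and splice the
    // bound values into the '?' placeholders so the full statement can be logged.
    public String render(String id, List<String> values) {
        String query = queries.get(id);
        StringBuilder out = new StringBuilder();
        int next = 0;
        for (char c : query.toCharArray()) {
            if (c == '?' && next < values.size()) out.append(values.get(next++));
            else out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        PreparedQueryTracker tracker = new PreparedQueryTracker();
        String id = tracker.onPrepare("SELECT * FROM ks.t WHERE id = ?");
        System.out.println(tracker.render(id, List.of("42")));
        // prints: SELECT * FROM ks.t WHERE id = 42
    }
}
```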
> On fre, 2018-05-25 at 11:24 -0400, Nitan Kainth wrote:
>
> Hi,
>
> I would like to log all C* queries hitting the cluster. Could someone please
> tell me how I can do it at the cluster level?
> Will nodetool setlogginglevel work? If so, please share an example with the
> library name.
>
> C* version 3.11
>
>


Re: nonexistent column families

2018-04-17 Thread Luigi Tagliamonte
Got it, thanks! I'll have to tackle this in another way.
Thank you.
Regards
L.

On Tue, Apr 17, 2018 at 1:58 PM, Jeff Jirsa <jji...@gmail.com> wrote:

> I imagine 3.0.16 has THIS bug, but it has far fewer other real bugs.
>
>
>
> On Tue, Apr 17, 2018 at 1:56 PM, Luigi Tagliamonte <
> luigi.tagliamont...@gmail.com> wrote:
>
>> Thank you Jeff,
>> my backup script works using the CF folders on disk :)
>> it parses all the keyspaces and for each CF performs: nodetool flush
>> ${keyspace} ${cf} and then nodetool snapshot ${keyspace} -cf ${cf}
>> Does 3.0.16 not have this "bug"?
>> Regards
>> L.
>>
>> On Tue, Apr 17, 2018 at 1:50 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>>
>>> It's probably not ideal, but also not really a bug. We need to create
>>> the table in the schema to see if it exists on disk so we know whether or
>>> not to migrate it, and when we learn it's empty, we remove it from the
>>> schema but we don't delete the directory. It's not great, but it's not going
>>> to cause you any problems.
>>>
>>> That said: 3.0.11 may cause you problems, you should strongly consider
>>> 3.0.16 instead.
>>>
>>> On Tue, Apr 17, 2018 at 1:47 PM, Luigi Tagliamonte <
>>> luigi.tagliamont...@gmail.com> wrote:
>>>
>>>> Hello everybody,
>>>> I'm having a problem with a brand new cassandra:3.0.11 node. The
>>>> following tables belonging to the system keyspace:
>>>>
>>>> - schema_aggregates
>>>> - schema_columnfamilies
>>>> - schema_columns
>>>> - schema_functions
>>>> - schema_keyspaces
>>>> - schema_triggers
>>>> - schema_usertypes
>>>>
>>>>
>>>> get initialised on disk:
>>>>
>>>> *root@ip-10-48-93-149:/var/lib/cassandra/data/system# pwd*
>>>> /var/lib/cassandra/data/system
>>>>
>>>> *root@ip-10-48-93-149:/var/lib/cassandra/data/system# ls -1*
>>>> IndexInfo-9f5c6374d48532299a0a5094af9ad1e3
>>>> available_ranges-c539fcabd65a31d18133d25605643ee3
>>>> batches-919a4bc57a333573b03e13fc3f68b465
>>>> batchlog-0290003c977e397cac3efdfdc01d626b
>>>> built_views-4b3c50a9ea873d7691016dbc9c38494a
>>>> compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca
>>>> hints-2666e20573ef38b390fefecf96e8f0c7
>>>> local-7ad54392bcdd35a684174e047860b377
>>>> paxos-b7b7f0c2fd0a34108c053ef614bb7c2d
>>>> peer_events-59dfeaea8db2334191ef109974d81484
>>>> peers-37f71aca7dc2383ba70672528af04d4f
>>>> range_xfers-55d764384e553f8b9f6e676d4af3976d
>>>> schema_aggregates-a5fc57fc9d6c3bfda3fc01ad54686fea
>>>> schema_columnfamilies-45f5b36024bc3f83a3631034ea4fa697
>>>> schema_columns-296e9c049bec3085827dc17d3df2122a
>>>> schema_functions-d1b675fe2b503ca48e49c0f81989dcad
>>>> schema_keyspaces-b0f2235744583cdb9631c43e59ce3676
>>>> schema_triggers-0359bc7171233ee19a4ab9dfb11fc125
>>>> schema_usertypes-3aa752254f82350b8d5c430fa221fa0a
>>>> size_estimates-618f817b005f3678b8a453f3930b8e86
>>>> sstable_activity-5a1ff267ace03f128563cfae6103c65e
>>>> views_builds_in_progress-b7f2c10878cd3c809cd5d609b2bd149c
>>>>
>>>>
>>>>
>>>> but if I describe the system keyspace, those CFs are not present.
>>>>
>>>> cassandra@cqlsh> DESCRIBE KEYSPACE system;
>>>>
>>>> CREATE KEYSPACE system WITH replication = {'class': 'LocalStrategy'}
>>>> AND durable_writes = true;
>>>>
>>>> CREATE TABLE system.available_ranges (
>>>> keyspace_name text PRIMARY KEY,
>>>> ranges set
>>>> ) WITH bloom_filter_fp_chance = 0.01
>>>> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>>>> AND comment = 'available keyspace/ranges during bootstrap/replace
>>>> that are ready to be served'
>>>> AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>>>> AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>>> AND crc_check_chance = 1.0
>>>> AND dclocal_read_repair_chance = 0.0
>>>> AND default_time_to_live = 0
>>>> AND gc_grace_seconds = 0
>>>> AND max_index_interval = 2048
>>

Re: nonexistent column families

2018-04-17 Thread Luigi Tagliamonte
Thank you Jeff,
my backup script works using the CF folders on disk :)
it parses all the keyspaces and for each CF performs: nodetool flush
${keyspace} ${cf} and then nodetool snapshot ${keyspace} -cf ${cf}
Does 3.0.16 not have this "bug"?
Regards
L.
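The flush-then-snapshot loop described above could be scripted roughly like this (a sketch in Java using ProcessBuilder; the keyspace and table names are placeholders, `nodetool` is assumed to be on the PATH, and by default it only prints the commands it would run):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of a per-CF backup loop: flush each column family, then snapshot it.
// Keyspace/table names are illustrative; a real script would discover them.
public class PerCfSnapshot {
    // nodetool flush <keyspace> <cf>
    static List<String> flushCmd(String keyspace, String cf) {
        return Arrays.asList("nodetool", "flush", keyspace, cf);
    }

    // nodetool snapshot -cf <cf> <keyspace>
    static List<String> snapshotCmd(String keyspace, String cf) {
        return Arrays.asList("nodetool", "snapshot", "-cf", cf, keyspace);
    }

    static void run(List<String> cmd) throws Exception {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        if (p.waitFor() != 0) throw new RuntimeException("command failed: " + cmd);
    }

    public static void main(String[] args) throws Exception {
        boolean execute = args.length > 0 && args[0].equals("--run");
        String keyspace = "my_keyspace"; // placeholder
        for (String cf : Arrays.asList("table_a", "table_b")) {
            if (execute) {
                run(flushCmd(keyspace, cf));
                run(snapshotCmd(keyspace, cf));
            } else {
                // Dry run: just show what would be executed.
                System.out.println(String.join(" ", flushCmd(keyspace, cf)));
                System.out.println(String.join(" ", snapshotCmd(keyspace, cf)));
            }
        }
    }
}
```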

On Tue, Apr 17, 2018 at 1:50 PM, Jeff Jirsa <jji...@gmail.com> wrote:

> It's probably not ideal, but also not really a bug. We need to create the
> table in the schema to see if it exists on disk so we know whether or not
> to migrate it, and when we learn it's empty, we remove it from the schema
> but we don't delete the directory. It's not great, but it's not going to
> cause you any problems.
>
> That said: 3.0.11 may cause you problems, you should strongly consider
> 3.0.16 instead.
>
> On Tue, Apr 17, 2018 at 1:47 PM, Luigi Tagliamonte <
> luigi.tagliamont...@gmail.com> wrote:
>
>> Hello everybody,
>> I'm having a problem with a brand new cassandra:3.0.11 node. The
>> following tables belonging to the system keyspace:
>>
>> - schema_aggregates
>> - schema_columnfamilies
>> - schema_columns
>> - schema_functions
>> - schema_keyspaces
>> - schema_triggers
>> - schema_usertypes
>>
>>
>> get initialised on disk:
>>
>> *root@ip-10-48-93-149:/var/lib/cassandra/data/system# pwd*
>> /var/lib/cassandra/data/system
>>
>> *root@ip-10-48-93-149:/var/lib/cassandra/data/system# ls -1*
>> IndexInfo-9f5c6374d48532299a0a5094af9ad1e3
>> available_ranges-c539fcabd65a31d18133d25605643ee3
>> batches-919a4bc57a333573b03e13fc3f68b465
>> batchlog-0290003c977e397cac3efdfdc01d626b
>> built_views-4b3c50a9ea873d7691016dbc9c38494a
>> compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca
>> hints-2666e20573ef38b390fefecf96e8f0c7
>> local-7ad54392bcdd35a684174e047860b377
>> paxos-b7b7f0c2fd0a34108c053ef614bb7c2d
>> peer_events-59dfeaea8db2334191ef109974d81484
>> peers-37f71aca7dc2383ba70672528af04d4f
>> range_xfers-55d764384e553f8b9f6e676d4af3976d
>> schema_aggregates-a5fc57fc9d6c3bfda3fc01ad54686fea
>> schema_columnfamilies-45f5b36024bc3f83a3631034ea4fa697
>> schema_columns-296e9c049bec3085827dc17d3df2122a
>> schema_functions-d1b675fe2b503ca48e49c0f81989dcad
>> schema_keyspaces-b0f2235744583cdb9631c43e59ce3676
>> schema_triggers-0359bc7171233ee19a4ab9dfb11fc125
>> schema_usertypes-3aa752254f82350b8d5c430fa221fa0a
>> size_estimates-618f817b005f3678b8a453f3930b8e86
>> sstable_activity-5a1ff267ace03f128563cfae6103c65e
>> views_builds_in_progress-b7f2c10878cd3c809cd5d609b2bd149c
>>
>>
>>
>> but if I describe the system keyspace, those CFs are not present.
>>
>> cassandra@cqlsh> DESCRIBE KEYSPACE system;
>>
>> CREATE KEYSPACE system WITH replication = {'class': 'LocalStrategy'}  AND
>> durable_writes = true;
>>
>> CREATE TABLE system.available_ranges (
>> keyspace_name text PRIMARY KEY,
>> ranges set
>> ) WITH bloom_filter_fp_chance = 0.01
>> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>> AND comment = 'available keyspace/ranges during bootstrap/replace
>> that are ready to be served'
>> AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>> AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>> AND crc_check_chance = 1.0
>> AND dclocal_read_repair_chance = 0.0
>> AND default_time_to_live = 0
>> AND gc_grace_seconds = 0
>> AND max_index_interval = 2048
>> AND memtable_flush_period_in_ms = 360
>> AND min_index_interval = 128
>> AND read_repair_chance = 0.0
>> AND speculative_retry = '99PERCENTILE';
>>
>> CREATE TABLE system.batches (
>> id timeuuid PRIMARY KEY,
>> mutations list,
>> version int
>> ) WITH bloom_filter_fp_chance = 0.01
>> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>> AND comment = 'batches awaiting replay'
>> AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '2'}
>> AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>> AND crc_check_chance = 1.0
>> AND dclocal_read_repair_chance = 0.0
>> AND default_time_to_live = 0
>> AND gc_grace_seconds = 0
>> AND max_index_interval = 2048
>> AND memtable_flush_period_in_ms = 360
>> AND min

nonexistent column families

2018-04-17 Thread Luigi Tagliamonte
Hello everybody,
I'm having a problem with a brand new cassandra:3.0.11 node. The following
tables belonging to the system keyspace:

- schema_aggregates
- schema_columnfamilies
- schema_columns
- schema_functions
- schema_keyspaces
- schema_triggers
- schema_usertypes


get initialised on disk:

*root@ip-10-48-93-149:/var/lib/cassandra/data/system# pwd*
/var/lib/cassandra/data/system

*root@ip-10-48-93-149:/var/lib/cassandra/data/system# ls -1*
IndexInfo-9f5c6374d48532299a0a5094af9ad1e3
available_ranges-c539fcabd65a31d18133d25605643ee3
batches-919a4bc57a333573b03e13fc3f68b465
batchlog-0290003c977e397cac3efdfdc01d626b
built_views-4b3c50a9ea873d7691016dbc9c38494a
compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca
hints-2666e20573ef38b390fefecf96e8f0c7
local-7ad54392bcdd35a684174e047860b377
paxos-b7b7f0c2fd0a34108c053ef614bb7c2d
peer_events-59dfeaea8db2334191ef109974d81484
peers-37f71aca7dc2383ba70672528af04d4f
range_xfers-55d764384e553f8b9f6e676d4af3976d
schema_aggregates-a5fc57fc9d6c3bfda3fc01ad54686fea
schema_columnfamilies-45f5b36024bc3f83a3631034ea4fa697
schema_columns-296e9c049bec3085827dc17d3df2122a
schema_functions-d1b675fe2b503ca48e49c0f81989dcad
schema_keyspaces-b0f2235744583cdb9631c43e59ce3676
schema_triggers-0359bc7171233ee19a4ab9dfb11fc125
schema_usertypes-3aa752254f82350b8d5c430fa221fa0a
size_estimates-618f817b005f3678b8a453f3930b8e86
sstable_activity-5a1ff267ace03f128563cfae6103c65e
views_builds_in_progress-b7f2c10878cd3c809cd5d609b2bd149c



but if I describe the system keyspace, those CFs are not present.

cassandra@cqlsh> DESCRIBE KEYSPACE system;

CREATE KEYSPACE system WITH replication = {'class': 'LocalStrategy'}  AND
durable_writes = true;

CREATE TABLE system.available_ranges (
keyspace_name text PRIMARY KEY,
ranges set
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = 'available keyspace/ranges during bootstrap/replace that
are ready to be served'
AND compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 360
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';

CREATE TABLE system.batches (
id timeuuid PRIMARY KEY,
mutations list,
version int
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = 'batches awaiting replay'
AND compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '2'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 360
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';

CREATE TABLE system."IndexInfo" (
table_name text,
index_name text,
PRIMARY KEY (table_name, index_name)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (index_name ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = 'built column indexes'
AND compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 360
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';

CREATE TABLE system.views_builds_in_progress (
keyspace_name text,
view_name text,
generation_number int,
last_token text,
PRIMARY KEY (keyspace_name, view_name)
) WITH CLUSTERING ORDER BY (view_name ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = 'views builds current progress'
AND compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND 

Re: [Question] Automated cluster cleanup

2017-04-27 Thread Luigi Tagliamonte
Hello Ben,
thank you for sharing the cassandra-reaper repository and for the
security advice.
Regards
L

On Thu, Apr 27, 2017 at 4:54 PM, Ben Bromhead <b...@instaclustr.com> wrote:

> Hi Luigi
>
> Under the hood, nodetool is actually just a command-line wrapper around
> certain JMX calls. If you are looking to automate some of the more
> commonplace nodetool actions, have a look at the nodetool source and it will
> show exactly which JMX calls (and parameters) are being made.
>
> One thing to keep in mind with JMX is that it does allow a remote user to do
> some scary things to Cassandra, and it has included remote code execution
> vulns. So ensure you lock down JMX thoroughly (password/username auth,
> certificate auth, fw rules, etc.).
>
> For the other most common management task, repairs, check out Cassandra
> Reaper: https://github.com/thelastpickle/cassandra-reaper.
>
> Ben
>
> On Thu, 27 Apr 2017 at 16:37 Luigi Tagliamonte <lu...@sysdig.com> wrote:
>
>> Hello Cassandra users,
>> my cluster is getting bigger and I was looking into automating some
>> tedious operations, like the node cleanup after adding a new node to the
>> cluster.
>>
>> I did a quick search and didn't find any good available option, so I
>> decided to look into the JMX interface (in StorageService I found the
>> method forceKeyspaceCleanup, which seems a good candidate) before going
>> hardcore with SSH+nodetool sessions.
>>
>> I was wondering if somebody here wants to share their experience with
>> this task, and what you think about the JMX approach versus the SSH one.
>>
>> Thank you.
>>
>> --
>> Luigi
>> ---
>> “The only way to get smarter is by playing a smarter opponent.”
>>
> --
> Ben Bromhead
> CTO | Instaclustr <https://www.instaclustr.com/>
> +1 650 284 9692
> Managed Cassandra / Spark on AWS, Azure and Softlayer
>



-- 
Luigi
---
“The only way to get smarter is by playing a smarter opponent.”


[Question] Automated cluster cleanup

2017-04-27 Thread Luigi Tagliamonte
Hello Cassandra users,
my cluster is getting bigger and I was looking into automating some tedious
operations, like the node cleanup after adding a new node to the cluster.

I did a quick search and didn't find any good available option, so I
decided to look into the JMX interface (in StorageService I found the
method forceKeyspaceCleanup, which seems a good candidate) before going
hardcore with SSH+nodetool sessions.

I was wondering if somebody here wants to share their experience with this
task, and what you think about the JMX approach versus the SSH one.

Thank you.
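For reference, invoking that StorageService operation over JMX needs only the JDK's javax.management classes. A sketch follows; the host, port 7199, and keyspace are placeholder assumptions, and the forceKeyspaceCleanup signature shown is the 2.x-era (String, String...) form, which changes across Cassandra versions, so check the StorageServiceMBean for the release you run:

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Sketch: call StorageService.forceKeyspaceCleanup over JMX instead of
// shelling out to nodetool. Host/port/keyspace are placeholders.
public class JmxCleanup {
    static JMXServiceURL serviceUrl(String host, int port) throws Exception {
        return new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            // Dry run: show the JMX URL nodetool itself would connect to.
            System.out.println(serviceUrl("127.0.0.1", 7199));
            return;
        }
        String host = args[0], keyspace = args[1];
        try (JMXConnector jmxc = JMXConnectorFactory.connect(serviceUrl(host, 7199))) {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            ObjectName ss = new ObjectName("org.apache.cassandra.db:type=StorageService");
            // An empty table array means "clean all column families in the keyspace".
            Object status = mbs.invoke(ss, "forceKeyspaceCleanup",
                    new Object[]{keyspace, new String[0]},
                    new String[]{"java.lang.String", "[Ljava.lang.String;"});
            System.out.println("forceKeyspaceCleanup returned " + status);
        }
    }
}
```

As Ben notes below in the thread, the nodetool source is the authoritative place to confirm the exact MBean operation and parameters for your version.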

-- 
Luigi
---
“The only way to get smarter is by playing a smarter opponent.”


restore cassandra snapshots on a smaller cluster

2016-05-17 Thread Luigi Tagliamonte
Hi everyone,
I'm wondering if it is possible to restore all the snapshots of a cluster
(10 nodes) into a smaller cluster (3 nodes)? If yes, how?

-- 
Luigi
---
“The only way to get smarter is by playing a smarter opponent.”


Cassandra Cleanup and disk space

2015-11-26 Thread Luigi Tagliamonte
Hi Everyone,
I'd like to understand what cleanup does on a running cluster when there is
no cluster topology change; I did a test and saw the cluster disk space
shrink by 200 GB.
I'm using Cassandra 2.1.9.
-- 
Luigi
---
“The only way to get smarter is by playing a smarter opponent.”


Re: Cassandra Cleanup and disk space

2015-11-26 Thread Luigi Tagliamonte
I did it 2 times and both times it freed a lot of space; I don't think
it's just a coincidence.
On Nov 26, 2015 10:56 AM, "Carlos Alonso" <i...@mrcalonso.com> wrote:

> May it be a SizeTieredCompaction of big SSTables just finished and freed
> some space?
>
> Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>
>
> On 26 November 2015 at 08:55, Luigi Tagliamonte <lu...@sysdig.com> wrote:
>
>> Hi Everyone,
>> I'd like to understand what cleanup does on a running cluster when there
>> is no cluster topology change; I did a test and saw the cluster disk
>> space shrink by 200 GB.
>> I'm using Cassandra 2.1.9.
>> --
>> Luigi
>> ---
>> “The only way to get smarter is by playing a smarter opponent.”
>>
>
>


Re: Re : Data restore to a new cluster

2015-10-29 Thread Luigi Tagliamonte
+1, also interested in official documentation about this.


On Mon, Oct 26, 2015 at 5:12 PM, sai krishnam raju potturi <
pskraj...@gmail.com> wrote:

> hi;
>we are working on a data backup and restore procedure to a new cluster.
> We are following the datastax documentation. It mentions a step
>
> "Restore the SSTable files snapshotted from the old cluster onto the new
> cluster using the same directories"
>
>
> http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_snapshot_restore_new_cluster.html
>
> Could not find a mention of "SCHEMA" creation. Could somebody shed
> some light on this? At what point do we create the "SCHEMA", if required.
>
>
> thanks
> Sai
>



-- 
Luigi
---
“The only way to get smarter is by playing a smarter opponent.”


Re: What is your backup strategy for Cassandra?

2015-09-24 Thread Luigi Tagliamonte
Since I'm running on AWS, we wrote a script that for each column family
performs a snapshot and syncs it to S3; at the end of the script I also
grab the node tokens and store them on S3.
In case of restore I will use this procedure.
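That pipeline could be sketched roughly as follows (Java; the bucket name, data directory, and the `aws s3 sync` / `nodetool info -T` invocations are assumptions about a typical setup, and the sketch only prints the commands rather than running them):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the described backup pipeline: sync snapshot data to S3 and
// capture this node's tokens. Bucket, paths and tools are assumptions.
public class SnapshotToS3 {
    // aws s3 sync <dataDir> s3://<bucket>/<node>/
    static List<String> syncCmd(String dataDir, String bucket, String node) {
        return Arrays.asList("aws", "s3", "sync", dataDir,
                "s3://" + bucket + "/" + node + "/");
    }

    // nodetool info -T prints the node's tokens, for storing alongside the data.
    static List<String> tokensCmd() {
        return Arrays.asList("nodetool", "info", "-T");
    }

    public static void main(String[] args) {
        // Dry run: print what would be executed for one node.
        System.out.println(String.join(" ", syncCmd(
                "/var/lib/cassandra/data", "my-backup-bucket", "node-1")));
        System.out.println(String.join(" ", tokensCmd()));
    }
}
```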

On Mon, Sep 21, 2015 at 9:23 PM, Sanjay Baronia <
sanjay.baro...@triliodata.com> wrote:

> John,
>
> Yes the Trilio solution is private and today, it is for Cassandra running
> in Vmware and OpenStack environment. AWS support is on the roadmap. Will
> reach out separately to give you a demo after the summit.
>
> Thanks,
>
> Sanjay
>
> _
>
>
>
> *Sanjay Baronia VP of Product & Solutions Management Trilio Data *(c)
> 508-335-2306
> sanjay.baro...@triliodata.com
>
>
> *Experience Trilio* *in action*, please *click here
> * to request a demo today!
>
>
> From: John Wong 
> Reply-To: Cassandra Maillist 
> Date: Friday, September 18, 2015 at 8:02 PM
> To: Cassandra Maillist 
> Subject: Re: What is your backup strategy for Cassandra?
>
>
>
> On Fri, Sep 18, 2015 at 3:02 PM, Sanjay Baronia <
> sanjay.baro...@triliodata.com> wrote:
>
>>
>> Will be at the Cassandra summit next week if any of you would like a demo.
>>
>>
>>
>
> Sanjay, is Trilio Data's work private? Unfortunately I will not attend the
> Summit, but maybe Trilio can also talk about this in, say, a Cassandra
> Planet blog post? I'd like to see a demo or get a little more technical. If
> open source would be cool.
>
> I didn't implement our solution, but the current solution is based on full
> snapshot copies to a remote server for storage using rsync (only transfers
> what is needed). On our remote server we have a complete backup of every
> hour, so if you cd into the data directory you can get every node's exact
> moment-in-time data like you are browsing on the actual nodes.
>
> We are an AWS shop so we can further optimize our cost by using EBS
> snapshots so the volume size can be reduced (currently we provisioned
> 4000 GB, which is too much). Anyway, we tried S3, and it is an okay
> solution. The bad things are performance and the ability to quickly go back
> in time. With EBS I can create a dozen volumes from the same snapshot,
> attach each to each of my nodes, and cp -r the files over.
>
> John
>
>>
>> From: Maciek Sakrejda 
>> Reply-To: Cassandra Maillist 
>> Date: Friday, September 18, 2015 at 2:09 PM
>> To: Cassandra Maillist 
>> Subject: Re: What is your backup strategy for Cassandra?
>>
>> On Thu, Sep 17, 2015 at 7:46 PM, Marc Tamsky  wrote:
>>
>>> This seems like an apt time to quote [1]:
>>>
>>> > Remember that you get 1 point for making a backup and 10,000 points
>>> for restoring one.
>>>
>>> Restoring from backups is my goal.
>>>
>>> The commonly recommended tools (tablesnap, cassandra_snapshotter) all
>>> seem to leave the restore operation as a pretty complicated exercise for
>>> the operator.
>>>
>>> Do any include a working way to restore, on a different host, all of
>>> node X's data from backups to the correct directories, such that the
>>> restored files are in the proper places and the node restart method [2]
>>> "just works"?
>>>
>>
>> As someone getting started with Cassandra, I'm very much interested in
>> this as well. It seems that for the most part, folks seem to rely on
>> replication and node replacement to recover from failures, and perhaps this
>> is a testament for how well this works, but as long as we're hauling out
>> aphorisms, "RAID is not a backup" seems to (partially) apply here too.
>>
>> I'd love to hear more about how the community does restores, too. This
>> isn't complaining about shoddy tooling: this is trying to understand--and
>> hopefully, in time, improve--the status quo re: disaster recovery. E.g.,
>> given that tableslurp operates on a single table at a time, do people
>> normally just restore single tables? Is that used when there's filesystem
>> or disk corruption? Bugs? Other issues? Looking forward to learning more.
>>
>> Thanks,
>> Maciek
>>
>
>


-- 
Luigi
---
“The only way to get smarter is by playing a smarter opponent.”