Re: Internal Handling of Map Updates

2016-06-01 Thread Eric Stevens
From that perspective, you could also use a frozen collection which takes
away the ability to append, but for which overwrites shouldn't generate a
tombstone.
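
A minimal CQL sketch of that idea (table and column names here are made up
for illustration, loosely based on the sstable2json dump further down this
thread):

    CREATE TABLE store_catalog_lines_frozen (
        id                       text PRIMARY KEY,
        last_modified            timestamp,
        -- frozen: the whole map is serialized into a single cell
        last_modified_by_source  frozen<map<text, text>>
    );

    -- The only way to change a frozen map is to overwrite it completely,
    -- which rewrites one cell and should not create a range tombstone:
    UPDATE store_catalog_lines_frozen
    SET last_modified_by_source = {'betty_store_catalog_lines': '0154d265c6b0'}
    WHERE id = '276-1-6MPQ0RI-276110031802001001';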

On Wed, Jun 1, 2016, 5:54 PM kurt Greaves  wrote:

> Is there anything stopping you from using JSON instead of a collection?
>
> On 27 May 2016 at 15:20, Eric Stevens  wrote:
>
>> If you aren't removing elements from the map, you should instead be able
>> to use an UPDATE statement and append the map. It will have the same effect
>> as overwriting it, because all the new keys will take precedence over the
>> existing keys. But it'll happen without generating a tombstone first.
>>
>> If you do have to remove elements from the collection during this
>> process, you are either facing tombstones or having to surgically figure
>> out which elements ought to be removed (which also involves tombstones,
>> though at least not range tombstones, so a bit cheaper).
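
In CQL terms the difference looks roughly like this (hypothetical table and
column names, loosely mirroring the sstable2json dump below):

    -- Appending entries only writes the affected map cells, no tombstone:
    UPDATE store_catalog_lines
    SET last_modified_by_source = last_modified_by_source
        + {'betty_store_catalog_lines': '0154d265c6b0'}
    WHERE id = '276-1-6MPQ0RI-276110031802001001';

    -- Overwriting the whole map (what a plain INSERT of the full row does)
    -- first deletes the old collection with a range tombstone:
    UPDATE store_catalog_lines
    SET last_modified_by_source = {'betty_store_catalog_lines': '0154d265c6b0'}
    WHERE id = '276-1-6MPQ0RI-276110031802001001';

    -- Removing individual keys creates cell tombstones only, which are
    -- cheaper than the range tombstone of a full overwrite:
    DELETE last_modified_by_source['betty_store_catalog_lines']
    FROM store_catalog_lines
    WHERE id = '276-1-6MPQ0RI-276110031802001001';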
>>
>> On Fri, May 27, 2016, 5:39 AM Matthias Niehoff <
>> matthias.nieh...@codecentric.de> wrote:
>>
>>> We are processing events in Spark and store the resulting entries
>>> (containing a map) in Cassandra. The results can be new (no entry for this
>>> key in Cassandra) or an Update (there is already an entry with this key in
>>> Cassandra). We use the spark-cassandra-connector to store the data in
>>> Cassandra.
>>>
>>> The connector will always do an insert of the data and will rely on the
>>> upsert capabilities of Cassandra. So every time an event is updated, the
>>> complete map is replaced, with all the problems of tombstones.
>>> It seems like we have to implement our own persist logic in which we check
>>> whether an element already exists and, if so, update the map manually. That
>>> would require a read before write, which would be nasty. Another option
>>> would be not to use a collection but (clustering) columns. Do you have
>>> another idea of how to do this?
>>>
>>> (the conclusion of this whole thing for me would be: use upsert, but do
>>> specific updates on collections, as an upsert might replace the whole
>>> collection and generate tombstones)
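
A rough sketch of the clustering-column alternative mentioned above (table and
column names are hypothetical): one row per map entry instead of one map cell,
so each upsert only touches the rows it writes and no collection tombstone is
involved.

    CREATE TABLE store_catalog_lines_by_source (
        id            text,
        source        text,
        last_modified timestamp,
        value         text,
        PRIMARY KEY (id, source)
    );

    -- Upserting a single "map entry" is now a plain row upsert:
    INSERT INTO store_catalog_lines_by_source (id, source, last_modified, value)
    VALUES ('276-1-6MPQ0RI-276110031802001001',
            'betty_store_catalog_lines',
            '2016-05-21 08:40+0000',
            '0154d265c6b0');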
>>>
>>> 2016-05-25 17:37 GMT+02:00 Tyler Hobbs :
>>>
 If you replace an entire collection, whether it's a map, set, or list,
 a range tombstone will be inserted followed by the new collection.  If you
 only update a single element, no tombstones are generated.

 On Wed, May 25, 2016 at 9:48 AM, Matthias Niehoff <
 matthias.nieh...@codecentric.de> wrote:

> Hi,
>
> we have a table with a map field. We do not delete anything in this
> table, but we do updates on the values, including the map field (most of the
> time a new value for an existing key, rarely adding new keys). We now
> encounter a huge amount of tombstones for this table.
>
> We used sstable2json to take a look into the sstables:
>
>
> {"key": "Betty_StoreCatalogLines:7",
>
>  "cells": [["276-1-6MPQ0RI-276110031802001001:","",1463820040628001],
>
>["276-1-6MPQ0RI-276110031802001001:last_modified","2016-05-21 
> 08:40Z",1463820040628001],
>
>
> ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:_","276-1-6MPQ0RI-276110031802001001:last_modified_by_source:!",1463040069753999,"t",1463040069],
>
>
> ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:_","276-1-6MPQ0RI-276110031802001001:last_modified_by_source:!",1463120708590002,"t",1463120708],
>
>
> ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:_","276-1-6MPQ0RI-276110031802001001:last_modified_by_source:!",1463145700735007,"t",1463145700],
>
>
> ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:_","276-1-6MPQ0RI-276110031802001001:last_modified_by_source:!",1463157430862000,"t",1463157430],
>
>
> ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:_","276-1-6MPQ0RI-276110031802001001:last_modified_by_source:!",1463164595291002,"t",1463164595],
>
> . . .
>
>   
> ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:_","276-1-6MPQ0RI-276110031802001001:last_modified_by_source:!",1463820040628000,"t",1463820040],
>
>
> ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:62657474795f73746f72655f636174616c6f675f6c696e6573","0154d265c6b0",1463820040628001],
>
>
> ["276-1-6MPQ0RI-276110031802001001:payload","{\"payload\":{\"Article 
> Id\":\"276110031802001001\",\"Row Id\":\"1-6MPQ0RI\",\"Article 
> #\":\"31802001001\",\"Quote Item Id\":\"1-6MPWPVC\",\"Country 
> Code\":\"276\"}}",1463820040628001]
>
>
>
> Looking at the SSTables, it seems like every update of a value in a map
> breaks down to a delete and insert in the corresponding SSTable (see all
> the tombstone flags "t" in the extract of sstable2json 

Re: Internal Handling of Map Updates

2016-06-01 Thread kurt Greaves
Is there anything stopping you from using JSON instead of a collection?
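
A minimal sketch of that suggestion (hypothetical table and column names):
store the whole structure as one text column that the application serializes
to JSON, so each upsert overwrites a single cell and no collection tombstone
is written; the trade-off is that individual entries can no longer be updated
server-side.

    CREATE TABLE store_catalog_lines_json (
        id            text PRIMARY KEY,
        last_modified timestamp,
        doc           text   -- JSON blob built by the application
    );

    UPDATE store_catalog_lines_json
    SET doc = '{"betty_store_catalog_lines": "0154d265c6b0"}',
        last_modified = '2016-05-21 08:40+0000'
    WHERE id = '276-1-6MPQ0RI-276110031802001001';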

On 27 May 2016 at 15:20, Eric Stevens  wrote:

> If you aren't removing elements from the map, you should instead be able
> to use an UPDATE statement and append the map. It will have the same effect
> as overwriting it, because all the new keys will take precedence over the
> existing keys. But it'll happen without generating a tombstone first.
>
> If you do have to remove elements from the collection during this process,
> you are either facing tombstones or having to surgically figure out which
> elements ought to be removed (which also involves tombstones, though at
> least not range tombstones, so a bit cheaper).
>
> On Fri, May 27, 2016, 5:39 AM Matthias Niehoff <
> matthias.nieh...@codecentric.de> wrote:
>
>> We are processing events in Spark and store the resulting entries
>> (containing a map) in Cassandra. The results can be new (no entry for this
>> key in Cassandra) or an Update (there is already an entry with this key in
>> Cassandra). We use the spark-cassandra-connector to store the data in
>> Cassandra.
>>
>> The connector will always do an insert of the data and will rely on the
>> upsert capabilities of Cassandra. So every time an event is updated, the
>> complete map is replaced, with all the problems of tombstones.
>> It seems like we have to implement our own persist logic in which we check
>> whether an element already exists and, if so, update the map manually. That
>> would require a read before write, which would be nasty. Another option
>> would be not to use a collection but (clustering) columns. Do you have
>> another idea of how to do this?
>>
>> (the conclusion of this whole thing for me would be: use upsert, but do
>> specific updates on collections, as an upsert might replace the whole
>> collection and generate tombstones)
>>
>> 2016-05-25 17:37 GMT+02:00 Tyler Hobbs :
>>
>>> If you replace an entire collection, whether it's a map, set, or list, a
>>> range tombstone will be inserted followed by the new collection.  If you
>>> only update a single element, no tombstones are generated.
>>>
>>> On Wed, May 25, 2016 at 9:48 AM, Matthias Niehoff <
>>> matthias.nieh...@codecentric.de> wrote:
>>>
 Hi,

 we have a table with a map field. We do not delete anything in this
 table, but we do updates on the values, including the map field (most of the
 time a new value for an existing key, rarely adding new keys). We now
 encounter a huge amount of tombstones for this table.

 We used sstable2json to take a look into the sstables:


 {"key": "Betty_StoreCatalogLines:7",

  "cells": [["276-1-6MPQ0RI-276110031802001001:","",1463820040628001],

["276-1-6MPQ0RI-276110031802001001:last_modified","2016-05-21 
 08:40Z",1463820040628001],


 ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:_","276-1-6MPQ0RI-276110031802001001:last_modified_by_source:!",1463040069753999,"t",1463040069],


 ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:_","276-1-6MPQ0RI-276110031802001001:last_modified_by_source:!",1463120708590002,"t",1463120708],


 ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:_","276-1-6MPQ0RI-276110031802001001:last_modified_by_source:!",1463145700735007,"t",1463145700],


 ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:_","276-1-6MPQ0RI-276110031802001001:last_modified_by_source:!",1463157430862000,"t",1463157430],


 ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:_","276-1-6MPQ0RI-276110031802001001:last_modified_by_source:!",1463164595291002,"t",1463164595],

 . . .

   
 ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:_","276-1-6MPQ0RI-276110031802001001:last_modified_by_source:!",1463820040628000,"t",1463820040],


 ["276-1-6MPQ0RI-276110031802001001:last_modified_by_source:62657474795f73746f72655f636174616c6f675f6c696e6573","0154d265c6b0",1463820040628001],


 ["276-1-6MPQ0RI-276110031802001001:payload","{\"payload\":{\"Article 
 Id\":\"276110031802001001\",\"Row Id\":\"1-6MPQ0RI\",\"Article 
 #\":\"31802001001\",\"Quote Item Id\":\"1-6MPWPVC\",\"Country 
 Code\":\"276\"}}",1463820040628001]



 Looking at the SSTables, it seems like every update of a value in a map
 breaks down to a delete and insert in the corresponding SSTable (see all
 the tombstone flags "t" in the extract of sstable2json above).

 We are using Cassandra 2.2.5.

 Can you confirm this behavior?

 Thanks!
 --
 Matthias Niehoff | IT-Consultant | Agile Software Factory  | Consulting
 codecentric AG | Zeppelinstr 2 | 76185 Karlsruhe | Deutschland
 tel: +49 (0) 721.9595-681 | fax: +49 (0) 721.9595-666 | mobil: +49 (0)

Library/utility announcements?

2016-06-01 Thread James Carman
Some user lists allow it. Does the Cassandra community mind folks
announcing their super cool Cassandra libraries on this list? Is there a
page for us to list them?


Token Ring Question

2016-06-01 Thread Anubhav Kale
Hello,

I recently learnt that regardless of the number of data centers, there is really 
only one token ring across all nodes. (I was under the impression that there is 
one per DC, which is how DataStax OpsCenter shows it.)

Suppose we have 4 v-nodes and 2 DCs (2 nodes in each DC), and a keyspace is 
set to replicate in only one DC - say DC1.
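
As an illustration of that setup (the keyspace name is hypothetical), the
keyspace definition would look something like:

    CREATE KEYSPACE dc1_only
    WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 2};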

Now, if the token for a particular write falls in the "primary range" of a node 
living in DC2, does the code check for such conditions and instead put it on 
some node in DC1? What is the true meaning of "primary" token range in such 
scenarios?

Is this roughly how things work, or am I missing something?

Thanks!


Re: (Full) compaction does not delete (all) old files

2016-06-01 Thread Dongfeng Lu
Alain,

Thanks for responding to my question. 

1 & 2: I think it is a bug, but as you said, maybe no one will dig it. I just 
hope it has been fixed in the later versions.
3: Restarting the node does NOT remove those files. I stopped and restarted C* 
many times and it did nothing. 
4: Thanks for the links. I will probably try DTCS in the near future.

A: Automatic compaction by C* does not work in a timely manner for me. I set 
the TTL to 8 days, and hoped that I would only have data files with timestamps 
within the last 2 weeks. However, I often saw files created 2 months ago, 50 GB 
in size.

In the final step of the upgrade, I am supposed to run upgradesstables, which is 
like a compaction. I know compaction takes a long time to run. In order to 
reduce the amount of time during the actual upgrade, I ran a manual compaction 
to cut down the size - by 80% in my case.

B: We tested the procedure with 2.1.11 in our DEV environment quite some 
time ago. Due to priority changes, we only started applying it to production 
lately. By rule, I would have to re-test if I switched to 2.1.14, and I don't 
see much benefit in doing that.

C: Yes, I noticed the statement "When upgrading to Cassandra 2.1 all nodes must 
be on at least Cassandra 2.0.7 to support rolling start." Since we are at 
2.0.6, we have to migrate twice, from 2.0.6 to 2.0.17, then to 2.1.11.

Finally, my solution was to manually delete those old files. I actually deleted 
them while C* was running and did not see any errors/warnings in system.log. My 
guess is that those files are not in C*'s metadata, so C* does not know of their 
existence.

Thanks,
Dongfeng 

On Wednesday, June 1, 2016 6:36 AM, Alain RODRIGUEZ  
wrote:
 

 Hi,
About your main concern:
1. True those files should have been removed. Yet Cassandra 2.0 is no longer 
supported, even more such an old version (2.0.6), so I think no one is going to 
dig this issue. To fix it, upgrade will probably be enough.

I don't usually run manual compaction, and relied completely on Cassandra to 
automatically do it. A couple of days ago in preparation for an upgrade to 
Cassandra 2.1.11, I ran a manual, complete compaction

2. As you might know, sstables are immutable, meaning compacting, merging row 
shards, has to be done somewhere else, not in place. Those -tmp- files are the 
result of compactions ongoing basically. It is perfectly normal. Yet '-tmp-' 
files are supposed to be removed once compaction is done.

3. Restarting the node will most probably solve your issue. To be sure to 
indeed free disk space, make sure you have no snapshot of those old sstables.
4. The advantage of DTCS is that data is not mixed per age. Meaning Cassandra 
can drop a full expired sstable, without compacting. It sounds like a good fit. 
Yet this compaction strategy is the most recent one and some things are still 
being fixed. I still think it is safe to use it. Make sure you read first: 
https://labs.spotify.com/2014/12/18/date-tiered-compaction/ And/Or 
http://www.datastax.com/dev/blog/datetieredcompactionstrategy

You also might want to have a look at https://github.com/jeffjirsa/twcs.
Some other off-topic, but maybe useful questions / info
A - Why do you need a manual compaction before upgrading? I really can't see 
any reason for it.
B - Why upgrade to Cassandra 2.1.11 when 2.1.14 is available and brings some 
more bug fixes (compared to 2.1.11)?
C - It is recommended to move to 2.0.last before going to 2.1.X. You might run 
into some issues. Either make sure to test it works or go incrementally 2.0.6 
--> 2.0.17 --> 2.1.14. I would probably do both. Test it and go incrementally. 
I would not go with 2.0.6 --> 2.1.14 without testing it first anyway.

Hope it is all clear and that a restart will solve your issue.

C*heers,
---
Alain Rodriguez - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com
2016-05-17 0:06 GMT+01:00 Dongfeng Lu :

Forgive me if that has been answered somewhere, but I could not find a concise 
or clear answer.

I am using Cassandra 2.0.6 on a 3 node cluster. I don't usually run manual 
compaction, and relied completely on Cassandra to automatically do it. A couple 
of days ago in preparation for an upgrade to Cassandra 2.1.11, I ran a manual, 
complete compaction. The compaction ran for many hours, but it did complete 
successfully, and the "load" in "nodetool status" dropped 80%. However, I did 
not see a big drop in disk usage, even after waiting for a couple of days. 
There are still many old data files left on the disk. For instance, here is a 
list of data files for one table.

-bash-4.1$ ls -ltr *-Data.db
-rw-r--r--  1 cassandra cassandra 36441245112 Jan 19 05:42 
keyspace-event_index-jb-620839-Data.db
-rw-r--r--  1 cassandra cassandra 48117578123 Jan 25 05:17 
keyspace-event_index-jb-649329-Data.db
-rw-r--r--  1 cassandra cassandra  8731574747 Jan 27 18:30 

Timeout while waiting for workers when flushing pool

2016-06-01 Thread Zhang, Charles
We have a 4-node, two-DC test cluster. All of the nodes have DataStax Enterprise 
installed and running. One DC is the Cassandra DC, and the other is the Solr 
DC. We first used sstableloader to stream 1 billion rows into the cluster. 
After that was done, we created a Solr core using resource auto-generation. It 
started indexing fine for a while, and then one of the Solr nodes went down. 
The system log shows this:

ERROR [NonPeriodicTasks:1] 2016-05-31 17:47:36,560  CassandraDaemon.java:229 - 
Exception in thread Thread[NonPeriodicTasks:1,5,main]
java.lang.RuntimeException: Timeout while waiting for workers when flushing 
pool zootopia.ltb Index; current timeout is 30 millis, consider increasing 
it, or reducing load on the node.
Failure to flush may cause excessive growth of Cassandra commit log.

        at com.datastax.bdp.search.solr.core.CassandraCoreContainer.doShutdown(CassandraCoreContainer.java:1081) ~[dse-search-4.8.7.jar:4.8.7]
        at com.datastax.bdp.search.solr.core.CassandraCoreContainer.access$100(CassandraCoreContainer.java:99) ~[dse-search-4.8.7.jar:4.8.7]
        at com.datastax.bdp.search.solr.core.CassandraCoreContainer$1.run(CassandraCoreContainer.java:626) ~[dse-search-4.8.7.jar:4.8.7]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_91]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_91]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_91]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Caused by: java.util.concurrent.TimeoutException: Timeout while waiting for workers when flushing pool zootopia.ltb Index; current timeout is 30 millis, consider increasing it, or reducing load on the node.
Failure to flush may cause excessive growth of Cassandra commit log.
        at com.datastax.bdp.concurrent.WorkPool.doFlushError(WorkPool.java:598) ~[dse-core-4.8.7.jar:4.8.7]
        at com.datastax.bdp.concurrent.WorkPool.doTwoPhaseFlush(WorkPool.java:559) ~[dse-core-4.8.7.jar:4.8.7]
        at com.datastax.bdp.concurrent.WorkPool.doFlush(WorkPool.java:523) ~[dse-core-4.8.7.jar:4.8.7]
        at com.datastax.bdp.concurrent.WorkPool.shutdown(WorkPool.java:461) ~[dse-core-4.8.7.jar:4.8.7]
        at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex.shutdownIndexUpdates(AbstractSolrSecondaryIndex.java:534) ~[dse-search-4.8.7.jar:4.8.7]
        at com.datastax.bdp.search.solr.core.CassandraCoreContainer.doShutdown(CassandraCoreContainer.java:1076) ~[dse-search-4.8.7.jar:4.8.7]
        ... 9 common frames omitted


When I restarted the Solr nodes, the number of docs showed a very small value, 
much smaller than the last number we saw before the node went down...

I saw this was a known issue in DataStax Enterprise 4.8 that was resolved in 
4.8.1, but we are running 4.8.7...

Any ideas on whether it's still the same deadlock issue or something else?

Thanks,
Charles


Re: [Marketing Mail] [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

2016-06-01 Thread Paul Dunkler
Hi Reynald,

> If I understand correctly, you are making a tar file with all the folders 
> named "snapshots" (i.e. the folder under which all the snapshots are created. 
> So you have one snapshots folder per table).

No, that's not the case. We are doing a nightly snapshot of the whole database 
(named with a date).

> If this is the case, when you are executing "nodetool repair", Cassandra will 
> create a snapshot at the beginning of the repair, creating a new directory 
> under each snapshots directories. If this happens while you are creating your 
> tar file, you will get the error you saw.

Sure, I am aware of that. No, we are only tarring the snapshot we just created 
a few seconds before.

> If you are not yet doing it, I advise you to use the -t option of the 
> "nodetool snapshot" command to specify a specific name to your snapshot.
> Then you should copy only the directories named 
> snapshots/ in your tar file.

We're doing exactly that.

> Can you confirm that you are creating your tar file from the snapshots 
> directories directly (resulting in taking all the currently generated 
> snapshots)?

As I wrote above, we just do a snapshot and ONLY tar this exact snapshot.

Short (high level) overview about our backup script:

1. check if repair is running - if yes, exit
2. Dump the current db-schema
3. nodetool snapshot -t $(date)
4. Wait 60 seconds
5. Create a tar of all snapshot folders with the date we just created
6. Copy that away to a remote server

> 
> Kind regards
> 
> Reynald
> 
> On 01/06/2016 13:27, Paul Dunkler wrote:
>>> I guess this might come from the incremental repairs...
>>> The repair time is stored in the sstable (RepairedAt timestamp metadata).
>> 
>> By the way: We are not using incremental repairs at all. So can't be the 
>> case here.
>> 
>> It really seems like there is something that can still change data in 
>> snapshot directories. I feel like it's something to do with flushing / 
>> compaction. But no clue what... :(
>> 
>>> 
>>> Cheers,
>>> Reynald
>>> 
>>> On 31/05/2016 11:03, Paul Dunkler wrote:
 Hi there,
 
 I am sometimes running into very strange errors while backing up snapshots 
 from a Cassandra cluster.
 
 Cassandra version:
 2.1.11
 
 What i basically do:
 1. nodetool snapshot
 2. tar all snapshot folders into one file
 3. transfer them to another server
 
 What happens is that tar just sometimes gives the error message "file 
 changed as we read it" while it's adding a .db-file from the folder of the 
 previously created snapshot.
 If I understand everything correctly, this SHOULD never happen. Snapshots 
 should be totally immutable, right?
 
 Am I maybe hitting a bug, or is there some rare case with running repair 
 operations or whatsoever which can change snapshotted data?
 I already searched through the Cassandra JIRA but couldn't find a bug which 
 looks related to this behaviour.
 
 Would love to get some help on this.
 
 —
 Paul Dunkler
>>> 
>> 
>> —
>> Paul Dunkler
> 

—
Paul Dunkler

** * * UPLEX - Nils Goroll Systemoptimierung

Scheffelstraße 32
22301 Hamburg

tel +49 40 288 057 31
mob +49 151 252 228 42
fax +49 40 429 497 53

xmpp://pauldunk...@jabber.ccc .de

http://uplex.de/ 




Re: Evict Tombstones with STCS

2016-06-01 Thread Alain RODRIGUEZ
Hi, I think you got it, this is probably the way to go:


> And if it so, forceUserdefinedcompaction or setting 
> unchecked_tombstone_compactions
> to true wont help either as tombstones are less than 20% and not much disk
> would be recovered.


But if you have less than 20% tombstones in there I am not sure that
compacting those tables will help anyhow. It's probably time to add
capacity (nodes, as you can't add disks).

If you still want to give it a try, set unchecked_tombstone_compaction
to true and the tombstone ratio to 10% and see how it goes.

Another option, to have better control over disk size and be able to use
up to 70% - 80% of the disk, would be LCS, if it fits
with your use case.
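
A hedged sketch of both suggestions (keyspace and table names are hypothetical;
note the per-table option is spelled unchecked_tombstone_compaction):

    -- Stay on STCS, but allow single-sstable tombstone compactions and
    -- lower the tombstone ratio that triggers them from 20% to 10%:
    ALTER TABLE my_ks.my_table
    WITH compaction = {'class': 'SizeTieredCompactionStrategy',
                       'unchecked_tombstone_compaction': 'true',
                       'tombstone_threshold': '0.1'};

    -- Or switch to LCS for tighter control over disk usage:
    ALTER TABLE my_ks.my_table
    WITH compaction = {'class': 'LeveledCompactionStrategy',
                       'sstable_size_in_mb': '160'};

Keep in mind that tombstones are only actually droppable once they are past
gc_grace_seconds and do not shadow data in other sstables.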

C*heers,

---
Alain Rodriguez - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-05-28 20:24 GMT+01:00 Anuj Wadehra :

> Hi,
>
> We are using C* 2.0.x. What options are available if disk space is too
> full to do compaction on huge sstables formed by STCS (created long
> ago but not getting compacted due to min_compaction_threshold being 4)?
>
> We suspect that huge space will be released when the 2 largest sstables get
> compacted together such that tombstone eviction is possible. But there is
> not enough space for compacting them together, assuming that compaction
> would need at least free disk = size of sstable 1 + size of sstable 2?
>
> I read the STCS code, and if no sstables are available for compaction, it
> should pick individual sstables for compaction. But somehow, huge sstables
> are not participating in individual compactions... Is it due to the default 20%
> tombstone threshold? And if so, forceUserDefinedCompaction or setting
> unchecked_tombstone_compaction to true won't help either, as tombstones are
> less than 20% and not much disk would be recovered.
>
> It is not possible to add additional disks too.
>
> We see a huge difference in disk utilization across nodes. Maybe some
> nodes were able to get away with tombstones while others didn't manage to
> evict tombstones.
>
>
> Would be good to know more alternatives from community.
>
>
> Thanks
> Anuj
>
>
>
>
>
>
>
> Sent from Yahoo Mail on Android
> 
>


Re: [Marketing Mail] Re: [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

2016-06-01 Thread Reynald Bourtembourg

Hi Paul,

If I understand correctly, you are making a tar file with all the 
folders named "snapshots" (i.e. the folder under which all the snapshots 
are created, so you have one snapshots folder per table).
If this is the case, when you are executing "nodetool repair", Cassandra 
will create a snapshot at the beginning of the repair, creating a new 
directory under each snapshots directory. If this happens while you 
are creating your tar file, you will get the error you saw.


If you are not yet doing it, I advise you to use the -t option of the 
"nodetool snapshot" command to specify a specific name to your snapshot.
Then you should copy only the directories named 
snapshots/ in your tar file.


Can you confirm that you are creating your tar file from the snapshots 
directories directly (resulting in taking all the currently generated 
snapshots)?


Kind regards

Reynald

On 01/06/2016 13:27, Paul Dunkler wrote:

I guess this might come from the incremental repairs...
The repair time is stored in the sstable (RepairedAt timestamp metadata).


By the way: We are not using incremental repairs at all. So can't be 
the case here.


It really seems like there is something that can still change data in 
snapshot directories. I feel like it's something to do with flushing / 
compaction. But no clue what... :(




Cheers,
Reynald

On 31/05/2016 11:03, Paul Dunkler wrote:

Hi there,

I am sometimes running into very strange errors while backing up 
snapshots from a Cassandra cluster.


Cassandra version:
2.1.11

What i basically do:
1. nodetool snapshot
2. tar all snapshot folders into one file
3. transfer them to another server

What happens is that tar just sometimes gives the error message "file 
changed as we read it" while it's adding a .db-file from the folder 
of the previously created snapshot.
If I understand everything correctly, this SHOULD never happen. 
Snapshots should be totally immutable, right?


Am I maybe hitting a bug, or is there some rare case with running 
repair operations or whatsoever which can change snapshotted data?
I already searched through the Cassandra JIRA but couldn't find a bug 
which looks related to this behaviour.


Would love to get some help on this.

—
Paul Dunkler




—
Paul Dunkler




Re: (Full) compaction does not delete (all) old files

2016-06-01 Thread Alain RODRIGUEZ
Hi,

About your main concern:

1. True, those files should have been removed. Yet Cassandra 2.0 is no
longer supported, especially such an old version (2.0.6), so I think no one
is going to dig into this issue. To fix it, upgrading will probably be enough.

I don't usually run manual compaction, and relied completely on Cassandra
> to automatically do it. A couple of days ago in preparation for an upgrade
> to Cassandra 2.1.11, I ran a manual, complete compaction


2. As you might know, sstables are immutable, meaning compacting (merging
row shards) has to be done somewhere else, not in place. Those -tmp- files
are basically the result of ongoing compactions. It is perfectly normal.
Yet '-tmp-' files are supposed to be removed once compaction is done.

3. Restarting the node will most probably solve your issue. To be sure to
indeed free disk space, make sure you have no snapshot of those old
sstables.

4. The advantage of DTCS is that data is not mixed across ages, meaning
Cassandra can drop a fully expired sstable without compacting. It sounds
like a good fit. Yet this compaction strategy is the most recent one and
some things are still being fixed. I still think it is safe to use it. Make
sure you read these first:
https://labs.spotify.com/2014/12/18/date-tiered-compaction/ And/Or
http://www.datastax.com/dev/blog/datetieredcompactionstrategy

You also might want to have a look at https://github.com/jeffjirsa/twcs.
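
As a rough, hedged sketch of what that switch could look like (keyspace/table
names and option values are hypothetical; tune them against the links above):

    ALTER TABLE my_ks.event_index
    WITH compaction = {'class': 'DateTieredCompactionStrategy',
                       'base_time_seconds': '3600',
                       'max_sstable_age_days': '10'}
    AND default_time_to_live = 691200;  -- 8 days, matching the TTL mentioned earlier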

Some other off-topic, but maybe useful questions / info

A - Why do you need a manual compaction before upgrading? I really can't
see any reason for it.
B - Why upgrade to Cassandra 2.1.11 when 2.1.14 is available and brings
some more bug fixes (compared to 2.1.11)?
C - It is recommended to move to 2.0.last before going to 2.1.X. You might
run into some issues. Either make sure to test it works or go incrementally
2.0.6 --> 2.0.17 --> 2.1.14. I would probably do both. Test it and go
incrementally. I would not go with 2.0.6 --> 2.1.14 without testing it
first anyway.

Hope it is all clear and that a restart will solve your issue.

C*heers,

---
Alain Rodriguez - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-05-17 0:06 GMT+01:00 Dongfeng Lu :

> Forgive me if that has been answered somewhere, but I could not find a
> concise or clear answer.
>
> I am using Cassandra 2.0.6 on a 3 node cluster. I don't usually run manual
> compaction, and relied completely on Cassandra to automatically do it. A
> couple of days ago in preparation for an upgrade to Cassandra 2.1.11, I ran
> a manual, complete compaction. The compaction ran for many hours, but it
> did complete successfully, and the "load" in "nodetool status" dropped 80%.
> However, I did not see a big drop in disk usage, even after waiting for a
> couple of days. There are still many old data files left on the disk. For
> instance, here is a list of data files for one table.
>
> -bash-4.1$ ls -ltr *-Data.db
> -rw-r--r--  1 cassandra cassandra 36441245112 Jan 19 05:42
> keyspace-event_index-jb-620839-Data.db
> -rw-r--r--  1 cassandra cassandra 48117578123 Jan 25 05:17
> keyspace-event_index-jb-649329-Data.db
> -rw-r--r--  1 cassandra cassandra  8731574747 Jan 27 18:30
> keyspace-event_index-jb-662597-Data.db
> -rw-r--r--  1 cassandra cassandra   835204478 Feb  2 07:20
> keyspace-event_index-jb-670851-Data.db
> -rw-r--r--  1 cassandra cassandra 39496133 Feb  2 15:29
> keyspace-event_index-tmp-jb-672828-Data.db
> ... about 110 files listed here, removed for clarity ...
>
> -rw-r--r--  1 cassandra cassandra   149344563 May  9 20:53
> keyspace-event_index-tmp-jb-827472-Data.db
> -rw-r--r-- 11 cassandra cassandra 20149715779 May 15 04:18
> keyspace-event_index-jb-829601-Data.db
> -rw-r--r-- 11 cassandra cassandra  7153875910 May 15 11:15
> keyspace-event_index-jb-830446-Data.db
> -rw-r--r-- 11 cassandra cassandra  3051908121 May 16 03:08
> keyspace-event_index-jb-831112-Data.db
> -rw-r--r-- 11 cassandra cassandra  6109582092 May 16 06:11
> keyspace-event_index-jb-831709-Data.db
> -rw-r--r-- 11 cassandra cassandra  2922532233 May 16 07:14
> keyspace-event_index-jb-831873-Data.db
> -rw-r--r-- 11 cassandra cassandra  1766025989 May 16 08:31
> keyspace-event_index-jb-832111-Data.db
> -rw-r--r--  8 cassandra cassandra  2922259593 May 16 11:39
> keyspace-event_index-jb-832693-Data.db
> -rw-r--r--  8 cassandra cassandra  1224495235 May 16 11:50
> keyspace-event_index-jb-832764-Data.db
> -rw-r--r--  7 cassandra cassandra  2051385733 May 16 12:57
> keyspace-event_index-jb-832975-Data.db
> -rw-r--r--  6 cassandra cassandra   853824939 May 16 13:12
> keyspace-event_index-jb-833100-Data.db
> -rw-r--r--  5 cassandra cassandra   763243638 May 16 14:58
> keyspace-event_index-jb-833203-Data.db
> -rw-r--r--  3 cassandra cassandra 99076639 May 16 16:29
> keyspace-event_index-jb-833222-Data.db
> -rw-r--r--  2 cassandra cassandra   254935385 May 16 17:21
> 

Re: [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

2016-06-01 Thread Paul Dunkler
> I guess this might come from the incremental repairs...
> The repair time is stored in the sstable (RepairedAt timestamp metadata).

By the way: We are not using incremental repairs at all. So can't be the case 
here.

It really seems like there is something that can still change data in snapshot 
directories. I feel like it's something to do with flushing / compaction. But 
no clue what... :(

> 
> Cheers,
> Reynald
> 
> On 31/05/2016 11:03, Paul Dunkler wrote:
>> Hi there,
>> 
>> I am sometimes running into very strange errors while backing up snapshots 
>> from a Cassandra cluster.
>> 
>> Cassandra version:
>> 2.1.11
>> 
>> What i basically do:
>> 1. nodetool snapshot
>> 2. tar all snapshot folders into one file
>> 3. transfer them to another server
>> 
>> What happens is that tar just sometimes gives the error message "file changed 
>> as we read it" while it's adding a .db-file from the folder of the previously 
>> created snapshot.
>> If I understand everything correctly, this SHOULD never happen. Snapshots 
>> should be totally immutable, right?
>> 
>> Am I maybe hitting a bug, or is there some rare case with running repair 
>> operations or whatsoever which can change snapshotted data?
>> I already searched through the Cassandra JIRA but couldn't find a bug which 
>> looks related to this behaviour.
>> 
>> Would love to get some help on this.
>> 
>> —
>> Paul Dunkler
> 

—
Paul Dunkler




Re: Restoring Incremental Backups without using sstableloader

2016-06-01 Thread Alain RODRIGUEZ
Hi,

Well, you can do it by copying / pasting all the sstables as written in the
link you gave, as long as your token range distribution did not change
since you took the snapshots and you have a way to be sure which node
each sstable belongs to. Make sure that snapshots taken on node X indeed go
back to node X.

If you do not have information on where the sstables come from, or if you
added / removed nodes, then using sstableloader is probably a good
idea. If you really don't like sstableloader (not sure why), you can copy
all the sstables to all the nodes, then run nodetool refresh + nodetool cleanup.
But in most cases all the data won't fit on one node, plus you might have
identical sstable names you'll have to handle.

Hope that helps,

C*heers,

---
Alain Rodriguez - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-05-17 11:14 GMT+01:00 Ravi Teja A V :

> Hi everyone
>
> I am currently working with Cassandra 3.5. I would like to know if it is
> possible to restore backups without using sstableloader. I have been
> referring to the following pages in the datastax documentation:
>
> https://docs.datastax.com/en/cassandra/3.x/cassandra/operations/opsBackupSnapshotRestore.html
> Thank you.
>
> Yours sincerely
> RAVI TEJA A V
>


Re: [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

2016-06-01 Thread Paul Dunkler
Hi Mike,

> Hi Paul, what is the value of the snapshot_before_compaction property in your 
> cassandra.yaml?

snapshot_before_compaction: false

> Say if another snapshot is being taken (because compaction kicked in and 
> snapshot_before_compaction property is set to TRUE) and at this moment you're 
> tarring the snapshot folders..

Okay, totally understand. But this feature is currently disabled on our side.
> Maybe you can take a look at the records in system.compaction_history:
> 
> select * from system.compaction_history;
> 
I did so and found a compaction starting exactly at 01:30 (roughly the time 
the snapshot starts).
The name of the compacted table matches the .db file tar is complaining about.

What happened here is that we are saving incremental backups every 10 minutes. 
In that process we do a manual nodetool flush.
This nodetool flush seems to trigger compactions. I just checked the Cassandra 
log, and compactions always take place every 10 minutes.

We are using the SizeTieredCompactionStrategy for all tables.
Is it true that - using that strategy - compaction is triggered exactly at the 
point where a flush (automatic or manual) is done?

It would probably be a better idea to not do manual flushes when saving the 
incremental_backups (because then compactions won't happen at the same time as 
the snapshot), right?
> Regards,
> 
> Mike Yeap
> 
> 
> 
> 
> On Tue, May 31, 2016 at 5:21 PM, Paul Dunkler wrote:
> And - as an addition:
> 
> Shouldn't it be documented that even snapshot files can change?
> 
>> I guess this might come from the incremental repairs...
>>> The repair time is stored in the sstable (RepairedAt timestamp metadata).
>> 
>> ok, that sounds interesting.
>> Could that also happen to incremental backup files as well? I had another 
>> case where incremental backup files were totally deleted automagically.
>> 
>> And - what is the suggested way to solve that problem? Should I keep 
>> re-trying to tar the snapshot until nothing changes in between 
>> anymore?
>> Or is there a way to "pause" the incremental repairs?
>> 
>> 
>>> Cheers,
>>> Reynald
>>> 
>>> On 31/05/2016 11:03, Paul Dunkler wrote:
 Hi there,
 
 I am sometimes running into very strange errors while backing up snapshots 
 from a Cassandra cluster.
 
 Cassandra version:
 2.1.11
 
 What i basically do:
 1. nodetool snapshot
 2. tar all snapshot folders into one file
 3. transfer them to another server
 
 What happens is that tar just sometimes gives the error message "file 
 changed as we read it" while it's adding a .db-file from the folder of the 
 previously created snapshot.
 If I understand everything correctly, this SHOULD never happen. Snapshots 
 should be totally immutable, right?
 
 Am I maybe hitting a bug, or is there some rare case with running repair 
 operations or whatsoever which can change snapshotted data?
 I already searched through the Cassandra JIRA but couldn't find a bug which 
 looks related to this behaviour.
 
 Would love to get some help on this.
 
 —
 Paul Dunkler
>>> 
>> 
>> —
>> Paul Dunkler
> 
> —
> Paul Dunkler

—
Paul Dunkler

