Re: Periodic Anti-Entropy repair

2015-05-24 Thread Anuj Wadehra
You should use nodetool repair -pr on every node to make sure that each range 
is repaired only once.
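(For illustration, a minimal sketch of that -pr approach; the host names, the serial one-node-at-a-time loop, and running it from a single admin box are assumptions to adapt to your cluster:)

#!/bin/bash
# Run a primary-range-only repair on each node, one node at a time.
# HOSTS is a placeholder list; substitute your own node addresses.
HOSTS="cass-node1 cass-node2 cass-node3"
for host in $HOSTS; do
    echo "Repairing primary ranges on $host..."
    nodetool -h "$host" repair -pr || { echo "repair failed on $host" >&2; exit 1; }
done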



Thanks

Anuj Wadehra

Sent from Yahoo Mail on Android

From: Brice Argenson bargen...@gmail.com
Date: Sat, 23 May, 2015 at 12:31 am
Subject: Periodic Anti-Entropy repair

Hi everyone,


We are currently migrating from DSE to Apache Cassandra and we would like to 
put in place an automatic and periodic nodetool repair execution to replace the 
one executed by OpsCenter.


I wanted to create a script / service that would run something like that:


token_rings = `nodetool ring | awk '{print $8}'`
for(int i = 0; i < token_rings.length; i += 2) {
   `nodetool repair -st token_rings[i] -et token_rings[i+1]`
}


That script / service would run every week (our GCGrace is 10 days) and would 
repair all the ranges of the ring one by one.
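(As a sketch of that weekly schedule, assuming the script is saved somewhere of your choosing; the script path, log file and the Sunday 02:00 slot below are placeholders:)

# crontab entry: run the subrange repair script every Sunday at 02:00
0 2 * * 0  /opt/cassandra-tools/subrange_repair.sh >> /var/log/cassandra/subrange_repair.log 2>&1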


I also looked a bit on Google and I found that script: 
https://github.com/BrianGallew/cassandra_range_repair

It seems to do something equivalent, but it also seems to run the repair node by 
node instead of over the complete ring.

From my understanding, that would mean that the script has to be run on every 
node of the cluster and that each token range would be repaired as many times as 
the number of replicas containing it.



Is there something I misunderstand? 

Which approach is better? 

How do you handle your Periodic Anti-Entropy Repairs?



Thanks a lot!



Leveled Compaction Strategy with a really intensive delete workload

2015-05-24 Thread Stefano Ortolani
Hi all,

I have a question re leveled compaction strategy that has been bugging me
quite a lot lately. Based on what I understood, a compaction takes place
when the SSTable gets to a specific size (10 times the size of its previous
generation). My question is about an edge case where, due to a really
intensive delete workload, the SSTable is promoted to the next level (say
L1) and its size, because of the many evicted tombstones, falls back to 1/10
of its size (hence to a size comparable to that of the previous generation, L0).

What happens in this case? If the next major compaction is set to happen
when the SSTable is promoted to L2, well, that might take too long and too
many tombstones could then appear in the meanwhile (and queries might
subsequently fail). Wouldn't it be more correct to flag the SSTable's
generation to its previous value (namely, not changing it even if a major
compaction took place)?
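(For reference, a hedged cqlsh sketch of the kind of LCS definition being discussed; the keyspace/table names and the 160 MB target SSTable size are illustrative placeholders, not recommendations:)

# Each LCS level holds roughly 10x the data of the previous one, built from
# fixed-size SSTables of sstable_size_in_mb.
cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {
  'class': 'LeveledCompactionStrategy',
  'sstable_size_in_mb': '160'
};"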

Regards,
Stefano Ortolani


Re: Multiple cassandra instances per physical node

2015-05-24 Thread Jonathan Haddad
What impact would vnodes have on strong consistency?  I think the problem
you're describing exists with or without them.

On Sat, May 23, 2015 at 2:30 PM Nate McCall n...@thelastpickle.com wrote:


 So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra
 nodes (each with 5 data disks, 1 commit log disk) and either give each its
 own container & IP or change the listen ports. Will this work? What are the
 risks? Will/should Cassandra support this better in the future?


 Don't use vnodes if any operations need strong consistency (reading or
 writing at quorum). Otherwise, at RF=3, if you lose a single node you will
 only have one replica left for some portion of the ring.



 --
 -
 Nate McCall
 Austin, TX
 @zznate

 Co-Founder & Sr. Technical Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com



RE: Periodic Anti-Entropy repair

2015-05-24 Thread Tiwari, Tarun
We have encountered issues with very long running nodetool repair when we ran it 
node by node on a really large dataset. In some cases it even kept running for a 
week.
IMO the strategy you are choosing of repairing by -st and -et is a good one and 
does the same work in small increments, the logs of which can be analyzed easily.

In addition, my suggestion would be to use the -h option to connect to the node 
from outside, and to take care of the fact that nodetool ring will also return 
negative (-ve) token values in the 'for' loop. You can go from -2^63 to the first 
ring value, then from (there+1) to the next token value. Better not to use i += 2, 
because token values are not necessarily even numbers.
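(A hedged bash sketch of that suggestion; the host, keyspace and the Murmur3Partitioner minimum token are assumptions, and the wrap-around range after the last token is left out of this sketch:)

#!/bin/bash
# Subrange repair: walk consecutive tokens reported by `nodetool ring`.
HOST="cass-node1"                  # placeholder node to connect to with -h
KEYSPACE="my_keyspace"             # placeholder keyspace
MIN_TOKEN="-9223372036854775808"   # -2^63, Murmur3Partitioner minimum token

# Numerically sorted token list (field 8 of `nodetool ring`; header lines filtered out).
tokens=$(nodetool -h "$HOST" ring | awk '{print $8}' | grep -E '^-?[0-9]+$' | sort -n)

start="$MIN_TOKEN"
for end in $tokens; do
    [ "$start" = "$end" ] && continue
    echo "Repairing ($start, $end] on $HOST"
    nodetool -h "$HOST" repair -st "$start" -et "$end" "$KEYSPACE"
    start="$end"
done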

Regards,
Tarun

From: Anuj Wadehra [mailto:anujw_2...@yahoo.co.in]
Sent: Sunday, May 24, 2015 6:31 AM
To: user@cassandra.apache.org
Subject: Re: Periodic Anti-Entropy repair

You should use nodetool repair -pr on every node to make sure that each range 
is repaired only once.


Thanks
Anuj Wadehra

Sent from Yahoo Mail on Android


From: Brice Argenson bargen...@gmail.com
Date: Sat, 23 May, 2015 at 12:31 am
Subject: Periodic Anti-Entropy repair
Hi everyone,

We are currently migrating from DSE to Apache Cassandra and we would like to 
put in place an automatic and periodic nodetool repair execution to replace the 
one executed by OpsCenter.

I wanted to create a script / service that would run something like that:

token_rings = `nodetool ring | awk '{print $8}'`
for(int i = 0; i < token_rings.length; i += 2) {
   `nodetool repair -st token_rings[i] -et token_rings[i+1]`
}

That script / service would run every week (our GCGrace is 10 days) and would 
repair all the ranges of the ring one by one.

I also looked a bit on Google and I found that script: 
https://github.com/BrianGallew/cassandra_range_repair
It seems to do something equivalent, but it also seems to run the repair node by 
node instead of over the complete ring.
From my understanding, that would mean that the script has to be run on every 
node of the cluster and that each token range would be repaired as many times as 
the number of replicas containing it.


Is there something I misunderstand?
Which approach is better?
How do you handle your Periodic Anti-Entropy Repairs?


Thanks a lot!





Fwd: INFO LOGS NOT written to System.log (Intermittently)

2015-05-24 Thread Parth Setya
-- Forwarded message --
From: Parth Setya setya.pa...@gmail.com
Date: Fri, May 22, 2015 at 5:14 PM
Subject: INFO LOGS NOT written to System.log (Intermittently)
To: user@cassandra.apache.org


Hi

I have a *3 node* cluster.
Logging Level: *INFO*

We observed that there was nothing written to the system.log file (on all
three nodes) for a substantial duration of time (~24 minutes):

INFO [CompactionExecutor:52531] 2015-05-20 05:16:38,187 CompactionController.java (line 198) Compacting large row system/hints:dfa0cbd6-1aad-463c-bad4-8d0ef5d5b6d5 (99465841 bytes) incrementally
INFO [CompactionExecutor:52531] 2015-05-20 05:16:38,944 CompactionTask.java (line 275) Compacted 2 sstables to [].  32,782,133 bytes to 0 (~0% of original) in 764ms = 0.00MB/s.  3 total partitions merged to 0.  Partition merge counts were {1:1, 2:1, }
INFO [OptionalTasks:1] 2015-05-20 05:40:27,719 ColumnFamilyStore.java (line 740) Enqueuing flush of Memtable-compaction_history@927327767(5375/52143 serialized/live bytes, 137 ops)
INFO [FlushWriter:2627] 2015-05-20 05:40:27,721 Memtable.java (line 333) Writing Memtable-compaction_history@927327767(5375/52143 serialized/live bytes, 137 ops)

Can someone highlight any scenario that would lead to this?

Apart from this, there were the following known issues present:

1. Some mutations (async) were being dropped due to lack of tuning.
2. The data directory contained about 1000 SSTables (most having a size of 1.1 KB).

Data present in the db at the time: ~50 million rows already present, and
30 million rows being loaded into the db.

Best

Parth


Re: Leveled Compaction Strategy with a really intensive delete workload

2015-05-24 Thread Manoj Khangaonkar
Hi,

For a delete-intensive workload (translates to write-intensive), is there
any reason to use leveled compaction? The recommendation seems to be that
leveled compaction is suited for read-intensive workloads.

Depending on your use case, you might be better off with the date-tiered or
size-tiered strategy.
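(If you do switch, a minimal cqlsh sketch; the keyspace/table names are placeholders, and the choice itself depends on your read/write pattern:)

# Hedged example: move a delete-heavy table to size-tiered compaction.
cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {'class': 'SizeTieredCompactionStrategy'};"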

regards

On Sun, May 24, 2015 at 10:50 AM, Stefano Ortolani ostef...@gmail.com
wrote:

 Hi all,

 I have a question re leveled compaction strategy that has been bugging me
 quite a lot lately. Based on what I understood, a compaction takes place
 when the SSTable gets to a specific size (10 times the size of its previous
 generation). My question is about an edge case where, due to a really
 intensive delete workload, the SSTable is promoted to the next level (say
 L1) and its size, because of the many evicted tombstones, falls back to 1/10
 of its size (hence to a size comparable to that of the previous generation, L0).

 What happens in this case? If the next major compaction is set to happen
 when the SSTable is promoted to L2, well, that might take too long and too
 many tombstones could then appear in the meanwhile (and queries might
 subsequently fail). Wouldn't it be more correct to flag the SSTable's
 generation to its previous value (namely, not changing it even if a major
 compaction took place)?

 Regards,
 Stefano Ortolani




-- 
http://khangaonkar.blogspot.com/