Re: What is the merit of incremental backup

2016-07-15 Thread Rajath Subramanyam
Hi Satoshi,

If incremental backup is set to true, Cassandra copies each SSTable to the
backups folder as soon as it is flushed to disk. These backed-up SSTables
therefore never get the chance to go through compaction. Does that explain
the longer time?
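
For reference, a minimal sketch of the setting involved (the backups path
assumes the default data directory layout, so adjust for your install):

    # cassandra.yaml
    incremental_backups: true

    # After each memtable flush, Cassandra hard-links the new SSTable into:
    #   <data_dir>/<keyspace>/<table>/backups/

Because the links are created at flush time, the backed-up SSTables are the
small pre-compaction files rather than the larger compacted ones.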

- Rajath


Rajath Subramanyam


On Fri, Jul 15, 2016 at 12:20 AM, Satoshi Hikida  wrote:

> Hi Prasenjit
>
> Thank you for your reply.
>
> However, I doubt that incremental backup can reduce RTO. I think the
> drawback of incremental backup is a longer repair time compared to
> restoring without it.
>
> I say this because I've compared the repair times of the two cases below:
>
> (a) snapshot(10GB, full repaired) + incremental backup(1GB)
> (b) snapshot(10GB, full repaired)
>
> Each case uses a 3-node cluster with a replication factor of 3 and a total
> data size of 12GB per node, and we assume one node fails and is then
> restored. The result showed that case (b) is faster than case (a): the
> repair of the token ranges covered by the incremental backup was very
> slow, whereas simply streaming the replicated data from the existing nodes
> to the node being restored is faster than repair.
>
> So far, I think the pros and cons of incremental backup are as follows:
>
> - Pros (you have already agreed on these)
>   - It allows storing backups offsite without transferring entire snapshots
>   - With incremental backups and snapshots, you can achieve a more recent
>     RPO (Recovery Point Objective)
> - Cons
>   - It takes a much longer repair time than restoring without incremental
>     backup (a longer RTO)
>
>
> Is my understanding correct? I would appreciate any advice or ideas if I
> have misunderstood something.
>
>
> Regards,
> Satoshi
>
>
> On Fri, Jul 15, 2016 at 1:46 AM, Prasenjit Sarkar <
> prasenjit.sar...@datos.io> wrote:
>
>> Hi Satoshi
>>
>> You are correct that incremental backups offer you the opportunity to
>> reduce the amount of data you need to transfer offsite. On the recovery
>> path, you need to piece together the full backup and subsequent incremental
>> backups.
>>
>> However, where incremental backups help is with respect to the RTO due to
>> the data reduction effect you mentioned. The RPO can be reduced only if you
>> take more frequent incremental backups than full backups.
>>
>> Hope this helps,
>> Prasenjit
>>
>> On Wed, Jul 13, 2016 at 11:54 PM, Satoshi Hikida 
>> wrote:
>>
>>> Hi,
>>>
>>> I want to know the actual advantage of using incremental backup.
>>>
>>> I've read through the DataStax documentation, and it says the merits of
>>> using incremental backup are as follows:
>>>
>>> - It allows storing backups offsite without transferring entire snapshots
>>> - With incremental backups and snapshots, you can achieve a more recent
>>> RPO (Recovery Point Objective)
>>>
>>> Is my understanding correct? I would appreciate it if someone could give
>>> me some advice or correct me.
>>>
>>> References:
>>> - DataStax, "Enabling incremental backups",
>>> http://docs.datastax.com/en/cassandra/2.2/cassandra/operations/opsBackupIncremental.html
>>>
>>> Regards,
>>> Satoshi
>>>
>>
>>
>


Re: Ring connection timeouts with 2.2.6

2016-07-15 Thread Mike Heffner
Just to follow up on this post with a couple more data points:

1)

We upgraded to 2.2.7 and did not see any change in behavior.

2)

However, what *has* fixed this issue for us is disabling message coalescing
by setting:

otc_coalescing_strategy: DISABLED

We were using the default setting before (TIMEHORIZON, I believe).
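
For anyone who wants to try the same change, these are the relevant knobs in
cassandra.yaml on 2.2.x as I understand them; the window value shown is the
shipped default, and it is ignored once the strategy is DISABLED:

    # cassandra.yaml
    otc_coalescing_strategy: DISABLED
    # otc_coalescing_window_us: 200   # microseconds; only used while coalescing is enabled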

We still see periodic timeouts on the ring (once every few hours), but they
are brief and don't impact latency. With message coalescing turned on, these
timeouts would persist consistently after an initial spike. My guess is that
something in the coalescing logic is disturbed by the initial timeout spike,
which leads to dropping all, or a high percentage of, subsequent traffic.
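
If you want to watch for the same pattern, this is roughly how we sample the
timeout meter over JMX (a sketch using jmxterm as the client; the jar name,
port, and bean name match our 2.2 setup, so verify them against a "beans"
listing on your own nodes):

    echo "get -b org.apache.cassandra.metrics:type=Connection,name=TotalTimeouts OneMinuteRate" \
      | java -jar jmxterm.jar -l localhost:7199 -n

A sustained non-zero OneMinuteRate, rather than a brief spike, is what marked
the bad episodes for us.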

We are planning to continue production use with message coalescing disabled
for now, and may run tests in our staging environments to identify exactly
where the coalescing breaks down.

Mike

On Tue, Jul 5, 2016 at 12:14 PM, Mike Heffner  wrote:

> Jeff,
>
> Thanks, yeah we updated to the 2.16.4 driver version from source. I don't
> believe we've hit the bugs mentioned in earlier driver versions.
>
> Mike
>
> On Mon, Jul 4, 2016 at 11:16 PM, Jeff Jirsa 
> wrote:
>
>> The AWS Ubuntu 14.04 AMI ships with a buggy enhanced networking driver –
>> depending on your instance type / hypervisor choice, you may want to
>> ensure you're not seeing that bug.
>>
>>
>>
>> *From: *Mike Heffner 
>> *Reply-To: *"user@cassandra.apache.org" 
>> *Date: *Friday, July 1, 2016 at 1:10 PM
>> *To: *"user@cassandra.apache.org" 
>> *Cc: *Peter Norton 
>> *Subject: *Re: Ring connection timeouts with 2.2.6
>>
>>
>>
>> Jens,
>>
>>
>>
>> We haven't noticed any particularly large GC operations or even
>> persistently high GC times.
>>
>>
>>
>> Mike
>>
>>
>>
>> On Thu, Jun 30, 2016 at 3:20 AM, Jens Rantil  wrote:
>>
>> Hi,
>>
>> Could it be garbage collection occurring on nodes that are more heavily
>> loaded?
>>
>> Cheers,
>> Jens
>>
>>
>>
>> On Sun, Jun 26, 2016 at 05:22, Mike Heffner  wrote:
>>
>> One thing to add: if we do a rolling restart of the ring, the timeouts
>> disappear entirely for several hours and performance returns to normal.
>> It's as if something is leaking over time, but we haven't seen any
>> noticeable change in heap usage.
>>
>>
>>
>> On Thu, Jun 23, 2016 at 10:38 AM, Mike Heffner  wrote:
>>
>> Hi,
>>
>>
>>
>> We have a 12-node 2.2.6 ring running in AWS, single DC with RF=3, that is
>> sitting at <25% CPU, doing mostly writes, and not showing any particularly
>> long GC times/pauses. By all observed metrics the ring is healthy and
>> performing well.
>>
>>
>>
>> However, we are noticing a fairly consistent number of connection
>> timeouts coming from the messaging service between various pairs of nodes
>> in the ring. The "Connection.TotalTimeouts" meter metric shows hundreds of
>> thousands of timeouts per minute, usually between two pairs of nodes.
>> These episodes last for several hours at a time, then may stop or move to
>> other pairs of nodes in the ring. The metric
>> "Connection.SmallMessageDroppedTasks." will also grow for one pair of
>> the nodes in the TotalTimeouts metric.
>>
>>
>>
>> Looking at the debug log typically shows a large number of messages like
>> the following on one of the nodes:
>>
>>
>>
>> StorageProxy.java:1033 - Skipped writing hint for /172.26.33.177 (ttl 0)
>>
>> We have cross-node timeouts enabled, but NTP is running on all nodes and
>> no node appears to have clock drift.
>>
>>
>>
>> The network appears to be fine between nodes, with iperf tests showing
>> that we have a lot of headroom.
>>
>>
>>
>> Any thoughts on what to look for? Can we increase thread count/pool sizes
>> for the messaging service?
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Mike
>>
>>
>>
>> --
>>
>>
>>   Mike Heffner 
>>
>>   Librato, Inc.
>>
>>
>>
>>
>>
>>
>>
>> --
>>
>>
>>   Mike Heffner 
>>
>>   Librato, Inc.
>>
>>
>>
>> --
>>
>> Jens Rantil
>> Backend Developer @ Tink
>>
>> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
>> For urgent matters you can reach me at +46-708-84 18 32.
>>
>>
>>
>>
>>
>> --
>>
>>
>>   Mike Heffner 
>>
>>   Librato, Inc.
>>
>>
>>
>
>
>
> --
>
>   Mike Heffner 
>   Librato, Inc.
>
>


-- 

  Mike Heffner 
  Librato, Inc.


Fwd: Cassandra discussion channel on Slack!

2016-07-15 Thread denish patel
Hello,

I have started a Slack channel to discuss Cassandra.

The purpose of this channel is to use the existing Slack platform to connect
with like-minded Cassandra people. You can sign up at
https://cassandra-slack.herokuapp.com/

Looking forward to talking with you on Slack!

-- 
Thanks and Regards,
Denish Patel


Re: What is the merit of incremental backup

2016-07-15 Thread Satoshi Hikida
Hi Prasenjit

Thank you for your reply.

However, I doubt that incremental backup can reduce RTO. I think the
drawback of incremental backup is a longer repair time compared to
restoring without it.

I say this because I've compared the repair times of the two cases below:

(a) snapshot(10GB, full repaired) + incremental backup(1GB)
(b) snapshot(10GB, full repaired)

Each case uses a 3-node cluster with a replication factor of 3 and a total
data size of 12GB per node, and we assume one node fails and is then
restored. The result showed that case (b) is faster than case (a): the
repair of the token ranges covered by the incremental backup was very
slow, whereas simply streaming the replicated data from the existing nodes
to the node being restored is faster than repair.
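
For context, the restore flow being compared is roughly the sketch below
(paths, keyspace, and table names are placeholders rather than our exact
commands, and the copy steps run while the node is down):

    # On the node being restored, with Cassandra stopped:
    # 1. Copy the snapshot SSTables back into the table's data directory
    cp /backup/snapshots/<snapshot_name>/* /var/lib/cassandra/data/<ks>/<table>/

    # 2. Case (a) only: layer the incremental backup SSTables on top
    cp /backup/backups/* /var/lib/cassandra/data/<ks>/<table>/

    # 3. Start Cassandra, then repair
    nodetool repair <ks>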

So far, I think the pros and cons of incremental backup are as follows:

- Pros (you have already agreed on these)
  - It allows storing backups offsite without transferring entire snapshots
  - With incremental backups and snapshots, you can achieve a more recent
    RPO (Recovery Point Objective)
- Cons
  - It takes a much longer repair time than restoring without incremental
    backup (a longer RTO)


Is my understanding correct? I would appreciate any advice or ideas if I
have misunderstood something.


Regards,
Satoshi


On Fri, Jul 15, 2016 at 1:46 AM, Prasenjit Sarkar  wrote:

> Hi Satoshi
>
> You are correct that incremental backups offer you the opportunity to
> reduce the amount of data you need to transfer offsite. On the recovery
> path, you need to piece together the full backup and subsequent incremental
> backups.
>
> However, where incremental backups help is with respect to the RTO due to
> the data reduction effect you mentioned. The RPO can be reduced only if you
> take more frequent incremental backups than full backups.
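> (For example, with daily snapshots alone the worst-case RPO is about a
> day, while daily snapshots plus hourly incremental backups shipped offsite
> bring the worst-case RPO down to about an hour.)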
>
> Hope this helps,
> Prasenjit
>
> On Wed, Jul 13, 2016 at 11:54 PM, Satoshi Hikida 
> wrote:
>
>> Hi,
>>
>> I want to know the actual advantage of using incremental backup.
>>
>> I've read through the DataStax documentation, and it says the merits of
>> using incremental backup are as follows:
>>
>> - It allows storing backups offsite without transferring entire snapshots
>> - With incremental backups and snapshots, you can achieve a more recent
>> RPO (Recovery Point Objective)
>>
>> Is my understanding correct? I would appreciate it if someone could give
>> me some advice or correct me.
>>
>> References:
>> - DataStax, "Enabling incremental backups",
>> http://docs.datastax.com/en/cassandra/2.2/cassandra/operations/opsBackupIncremental.html
>>
>> Regards,
>> Satoshi
>>
>
>