[jira] [Commented] (CASSANDRA-10907) Nodetool snapshot should provide an option to skip flushing

2016-01-20 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109209#comment-15109209
 ] 

Paulo Motta commented on CASSANDRA-10907:
-

LGTM, thanks!

> Nodetool snapshot should provide an option to skip flushing
> ---
>
> Key: CASSANDRA-10907
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10907
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
> Environment: PROD
>Reporter: Anubhav Kale
>Priority: Minor
>  Labels: lhf
> Attachments: 0001-Skip-Flush-for-snapshots.patch, 
> 0001-Skip-Flush-option-for-Snapshot.patch, 
> 0001-Skip-Flush-option-for-Snapshot.patch, 0001-flush.patch
>
>
> For some practical scenarios, it doesn't matter whether the data is flushed to
> disk before taking a snapshot. In those cases, skipping the flush saves time
> and makes the snapshot process quicker.
> As such, it would be a good idea to provide this option to the snapshot command.
> The wiring from nodetool to MBean to VerbHandler should be easy.
> I can provide a patch if this makes sense.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10907) Nodetool snapshot should provide an option to skip flushing

2016-01-19 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107351#comment-15107351
 ] 

Paulo Motta commented on CASSANDRA-10907:
-

Looking better. A few more nits:
* Rename {{skipflush}} option to {{skipFlush}} (camelCase)
* remove skipFlush from takeMultipleTableSnapshot javadoc
* add @Deprecated annotation to old methods (in addition to @deprecated javadoc)
* in javadoc {{@link #takeSnapshot..}} replace {{Map<String, String>}} with
{{Map}} (generics are not supported in javadoc links)
* Add options to message: {{Requested creating snapshot(s) for 
\[keyspace1.standard1,keyspace1.counter1\] with snapshot name \[1453233210025\] 
and options \{skipFlush=false\}.}}
* Fix broken test 
{{org.apache.cassandra.service.StorageServiceServerTest.testTableSnapshot}}
* Improve nodetool option description from {{Skip blocking flush of the 
memtable}} to {{Do not flush memtables before snapshotting (snapshot will not 
contain unflushed data)}}
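
As a rough illustration of the message format suggested in the list above, the following sketch builds the same text from a tag, an options map and the entity list. {{SnapshotMessageSketch}} and {{describe}} are hypothetical names, not Cassandra code, and the {{Map<String, String>}} option type is an assumption.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of Cassandra.
class SnapshotMessageSketch {
    // Formats the "Requested creating snapshot(s)" message, including the options map.
    static String describe(String tag, Map<String, String> options, String... entities) {
        // String.join keeps the entities comma-separated without spaces, as in the example message.
        // A java.util map prints as {key=value, ...}, which yields the {skipFlush=false} part.
        return String.format(
            "Requested creating snapshot(s) for [%s] with snapshot name [%s] and options %s.",
            String.join(",", entities), tag, options);
    }
}
```

Called with the tag {{1453233210025}}, a map holding {{skipFlush=false}} and the two entities from the example, this should reproduce the message quoted in the list.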

bq. I did add a Boolean to detect if KS / CF was passed to the proposed 
signature to make things easy. 

I still find it a bit redundant, since it's possible to replace the keyspaces
boolean with {{entities\[0\].contains(".")}}, and in the future we can simplify
the snapshot command to accept an arbitrary list of mixed keyspaces and/or
tables, so I'd prefer not to have this boolean.
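
A minimal sketch of the idea above: whether an entity is a keyspace or a table can be derived from the string itself, so no extra boolean is needed and mixed lists can be handled one entity at a time. {{SnapshotEntitySketch}} is a hypothetical helper, not Cassandra code, and it assumes keyspace names contain no dots.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Cassandra.
class SnapshotEntitySketch {
    final String keyspace;
    final String table; // null means "snapshot the whole keyspace"

    SnapshotEntitySketch(String keyspace, String table) {
        this.keyspace = keyspace;
        this.table = table;
    }

    boolean isTable() {
        return table != null;
    }

    // Splits "ks" or "ks.cf" on the first dot (assumption: keyspace names have no dots).
    static SnapshotEntitySketch parse(String entity) {
        int dot = entity.indexOf('.');
        if (dot < 0) {
            return new SnapshotEntitySketch(entity, null);
        }
        return new SnapshotEntitySketch(entity.substring(0, dot), entity.substring(dot + 1));
    }

    // Parses a mixed list of keyspaces and tables, one entity at a time.
    static List<SnapshotEntitySketch> parseAll(String... entities) {
        List<SnapshotEntitySketch> parsed = new ArrayList<>();
        for (String entity : entities) {
            parsed.add(parse(entity));
        }
        return parsed;
    }
}
```

With this shape, {{parseAll("keyspace1", "keyspace1.standard1")}} yields one keyspace-level entry and one table-level entry.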

||trunk||
|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-10907]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-10907-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-10907-dtest/lastCompletedBuild/testReport/]|




[jira] [Commented] (CASSANDRA-10907) Nodetool snapshot should provide an option to skip flushing

2016-01-08 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15090012#comment-15090012
 ] 

Paulo Motta commented on CASSANDRA-10907:
-

Please click "Submit Patch" when you have a new version ready.



[jira] [Commented] (CASSANDRA-10907) Nodetool snapshot should provide an option to skip flushing

2016-01-08 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15090010#comment-15090010
 ] 

Paulo Motta commented on CASSANDRA-10907:
-

Overall looks good but we cannot change the methods from 
{{StorageServiceMBean}} as this is a public interface and might be used by 
other systems.

I propose you add a new method {{takeSnapshot(String tag, Map<String, String>
options, String... entities)}}, where the {{options}} map may only contain the
{{skipFlush}} option for the time being, but may be extended in the future with 
more options. The {{entities}} array will contain strings in the format 
ks\[.cf\], meaning take a snapshot of keyspaces and/or specific cfs. In this 
way, we don't need to create a new method in the future if we add a new option. 

You should also add a {{@Deprecated}} annotation to the previous methods and 
javadocs, similar to the {{forceRepairAsync}} deprecation notices. It would be 
nice to unify the implementation of {{takeMultipleTableSnapshot}},
{{takeTableSnapshot}}, {{takeSnapshot}} to use the new method
{{takeSnapshot(String tag, Map<String, String> options, String... entities)}}.
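
The proposal above could look roughly like the sketch below. This is an illustration under assumptions, not the actual Cassandra code: {{SnapshotServiceSketch}} is a hypothetical stand-in for the {{StorageServiceMBean}} implementation, the {{Map<String, String>}} type for {{options}} is assumed, and the {{log}} list only records what would have been snapshotted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the snapshot methods of StorageServiceMBean.
class SnapshotServiceSketch {
    static final String SKIP_FLUSH = "skipFlush";

    // Records requests instead of taking real snapshots (illustration only).
    final List<String> log = new ArrayList<>();

    /** New entry point: each entity is "ks" or "ks.cf"; skipFlush defaults to false. */
    public void takeSnapshot(String tag, Map<String, String> options, String... entities) {
        boolean skipFlush = Boolean.parseBoolean(options.getOrDefault(SKIP_FLUSH, "false"));
        for (String entity : entities) {
            log.add(tag + ":" + entity + ":skipFlush=" + skipFlush);
        }
    }

    /** @deprecated use {@link #takeSnapshot(String, Map, String...)} instead. */
    @Deprecated
    public void takeSnapshot(String tag, String... keyspaceNames) {
        // Old keyspace-level method delegates to the new one with no options.
        takeSnapshot(tag, Collections.<String, String>emptyMap(), keyspaceNames);
    }

    /** @deprecated use {@link #takeSnapshot(String, Map, String...)} instead. */
    @Deprecated
    public void takeTableSnapshot(String keyspace, String table, String tag) {
        // Old table-level method delegates by building a "ks.cf" entity.
        takeSnapshot(tag, Collections.<String, String>emptyMap(), keyspace + "." + table);
    }
}
```

Because the old methods only delegate, existing JMX clients keep working, and a future option becomes a new map key rather than a new method signature.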



[jira] [Commented] (CASSANDRA-10907) Nodetool snapshot should provide an option to skip flushing

2016-01-05 Thread Anubhav Kale (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15083489#comment-15083489
 ] 

Anubhav Kale commented on CASSANDRA-10907:
--

Any updates here?



[jira] [Commented] (CASSANDRA-10907) Nodetool snapshot should provide an option to skip flushing

2015-12-23 Thread Anubhav Kale (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070095#comment-15070095
 ] 

Anubhav Kale commented on CASSANDRA-10907:
--

For point-in-time backups, it's always somewhat unpredictable what data is
backed up, especially with replication on. The concern here is the unnecessary
time and resources spent in a blocking flush when it's not really required.

I have provided a patch. It's possible to provide overrides at other places; I
took a stab at providing those on KS and CF and did the wiring. If you prefer
some other approach, let me know.



[jira] [Commented] (CASSANDRA-10907) Nodetool snapshot should provide an option to skip flushing

2015-12-22 Thread Nick Bailey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15068501#comment-15068501
 ] 

Nick Bailey commented on CASSANDRA-10907:
-

Fair enough on incremental backups. The only other thing I'd say is that if 
blocking on flushing is that big of an impact you might be close to IO capacity 
anyway. That said, I won't advocate for closing this ticket.



[jira] [Commented] (CASSANDRA-10907) Nodetool snapshot should provide an option to skip flushing

2015-12-21 Thread Nick Bailey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066826#comment-15066826
 ] 

Nick Bailey commented on CASSANDRA-10907:
-

My only objection is that the behavior of what information is actually backed
up is basically undefined. It's possibly useful in some very specific use
cases, but it also introduces potential traps when used incorrectly.

It sounds to me like you should be using incremental backups. When that is
enabled, a hardlink is created every time a memtable is flushed or an sstable
is streamed. You can then just watch that directory and ship the sstables off
node on demand as they are created.



[jira] [Commented] (CASSANDRA-10907) Nodetool snapshot should provide an option to skip flushing

2015-12-21 Thread Anubhav Kale (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066956#comment-15066956
 ] 

Anubhav Kale commented on CASSANDRA-10907:
--

I agree that what is backed up will be undefined. In my opinion, the trap is
very clear here, so I don't think it can be misused. IMHO, the other nodetool
commands have such traps as well, so this is no different (e.g. why does scrub
have an option to not snapshot?).

That said, if you feel strongly against this, I understand and we can kill this
(I can always make a local patch).

BTW, I can't use incremental backups, because I do not want to ship SSTable
files that would have been removed as part of compaction. When compaction kicks
in and deletes some files, it won't remove them from backups (which makes sense,
else it wouldn't be incremental). So, at the time of recovery, we are moving too
many files back, thus increasing the downtime of apps. If I am not understanding
something correctly here, please let me know!



[jira] [Commented] (CASSANDRA-10907) Nodetool snapshot should provide an option to skip flushing

2015-12-21 Thread Nick Bailey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066715#comment-15066715
 ] 

Nick Bailey commented on CASSANDRA-10907:
-

Just wondering in which scenarios skipping flushing makes sense. It seems like
any such scenario would be covered by the incremental backup option, which
hardlinks every sstable as it's flushed.



[jira] [Commented] (CASSANDRA-10907) Nodetool snapshot should provide an option to skip flushing

2015-12-21 Thread Anubhav Kale (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066788#comment-15066788
 ] 

Anubhav Kale commented on CASSANDRA-10907:
--

We plan to move backups outside the nodes. So, when a snapshot is taken, it
would be ideal for it to be fast (thus not flush) so that it can be moved out
as quickly as possible. We have enough replication that we can tolerate losing
the data that was still in the unflushed memtable.

Do you feel strongly against it?
