[
https://issues.apache.org/jira/browse/CASSANDRA-11721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Lerer updated CASSANDRA-11721:
---------------------------------------
Description:
Right now, truncate always creates a snapshot. That is the right
thing to do most of the time. 'auto_snapshot' exists as an option to disable
that, but it is server-wide and requires a restart to change. There are data
models, however, that require rotating through a handful of tables and
periodically truncating them. Currently you either have to operate with no
safety net (some actually do this) or manually clear those snapshots out
periodically. Both are less than optimal.
In HDFS, a delete generally moves the data to the trash. If you
don't want that safety net, you can do something like 'hdfs dfs -rm -r
-skipTrash /jeremy/stuff' in one command.
It would be nice to have something in the TRUNCATE DDL to skip the snapshot on
a per-operation basis. Perhaps 'TRUNCATE solarsystem.earth NO SNAPSHOT'.
This might also be useful in those situations where you're just playing with
data and you don't want something to take a snapshot in a development system.
If that's the case, this would also be useful for the DROP operation, but that
convenience is not the main reason for this option.
+Additional information for newcomers:+
This ticket is a bit more complex than normal LHF tickets but is still reasonably
easy.
The idea is to support disabling snapshots when performing a truncate, as follows:
{code}TRUNCATE x WITH OPTIONS = { 'snapshot' : false }{code}
In order to implement that feature several changes are required:
* A new class {{TruncateAttributes}} inheriting from {{PropertyDefinitions}}
must be created in a similar way to {{KeyspaceAttributes}} or {{TableAttributes}}.
* An instance of this class should be passed to the {{TruncateStatement}}
constructor and stored as a field.
* The ANTLR parser logic should be changed to retrieve the options and pass
them to the constructor (see {{createKeyspaceStatement}} for an example).
* The {{TruncateStatement}} will then need to be modified to take the new
option into account. Locally, it will need to call
{{ColumnFamilyStore#truncateBlockingWithoutSnapshot}} instead of
{{ColumnFamilyStore#truncateBlocking}} if no snapshot should be taken. For
non-local calls, it will need to pass a new parameter to
{{StorageProxy#truncateBlocking}}. That parameter will then need to be passed
to the other nodes through the {{TruncateRequest}}.
* As a new field needs to be added to {{TruncateRequest}}, this field will need
to be serialized and deserialized, and a new {{MessagingService.Version}} will
need to be created and set as the current version. The new version should be 50
(and yes, it means that the next release will be a major one: 5.0).
* In {{TruncateVerbHandler}} the new field should be used to determine if
{{ColumnFamilyStore#truncateBlockingWithoutSnapshot}} or
{{ColumnFamilyStore#truncateBlocking}} should be called.
* An in-jvm test should be added in
{{test/distributed/org/apache/cassandra/distributed/test}} to test that
truncate does not generate snapshots when the new option is specified.
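As a rough illustration of the serialization and handler steps above, here is
a minimal, self-contained sketch. It is not the actual Cassandra code: the
class {{TruncateRequestSketch}}, its fields, the helper methods, and the value
of {{PREVIOUS_VERSION}} are all hypothetical stand-ins, and plain {{java.io}}
streams replace the real messaging classes. It also assumes that a request
exchanged at an older messaging version falls back to taking a snapshot, which
is a design choice the ticket does not specify.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for TruncateRequest; not the real Cassandra class.
final class TruncateRequestSketch
{
    // Placeholder value for "any version before the new one" (assumption).
    static final int PREVIOUS_VERSION = 40;
    // The ticket asks for the new messaging version to be 50.
    static final int NEW_VERSION = 50;

    final String keyspace;
    final String table;
    final boolean snapshot; // false means "skip the snapshot"

    TruncateRequestSketch(String keyspace, String table, boolean snapshot)
    {
        this.keyspace = keyspace;
        this.table = table;
        this.snapshot = snapshot;
    }

    // Writes the new field only when the receiving node can understand it.
    void serialize(DataOutputStream out, int version) throws IOException
    {
        out.writeUTF(keyspace);
        out.writeUTF(table);
        if (version >= NEW_VERSION)
            out.writeBoolean(snapshot);
    }

    static TruncateRequestSketch deserialize(DataInputStream in, int version) throws IOException
    {
        String keyspace = in.readUTF();
        String table = in.readUTF();
        // Older peers never send the flag, so assume the safe default (snapshot).
        boolean snapshot = version < NEW_VERSION || in.readBoolean();
        return new TruncateRequestSketch(keyspace, table, snapshot);
    }

    // Serializes and deserializes a request at the given messaging version.
    static TruncateRequestSketch roundTrip(TruncateRequestSketch request, int version) throws IOException
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes))
        {
            request.serialize(out, version);
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())))
        {
            return deserialize(in, version);
        }
    }

    // Mirrors the choice the verb handler has to make, returned as a method name.
    static String methodToCall(TruncateRequestSketch request)
    {
        return request.snapshot ? "truncateBlocking" : "truncateBlockingWithoutSnapshot";
    }

    public static void main(String[] args) throws IOException
    {
        TruncateRequestSketch request = new TruncateRequestSketch("solarsystem", "earth", false);
        System.out.println(methodToCall(roundTrip(request, NEW_VERSION)));
        System.out.println(methodToCall(roundTrip(request, PREVIOUS_VERSION)));
    }
}
```

In this sketch the flag survives a round trip at version 50, while at the
older version it is silently dropped and the receiver takes a snapshot. A real
implementation may prefer a different policy for mixed-version clusters.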
Do not hesitate to ping the mentor for more information.
was:
Right now with truncate, it will always create a snapshot. That is the right
thing to do most of the time. 'auto_snapshot' exists as an option to disable
that but it is server wide and requires a restart to change. There are data
models, however, that require rotating through a handful of tables and
periodically truncating them. Currently you either have to operate with no
safety net (some actually do this) or manually clear those snapshots out
periodically. Both are less than optimal.
In HDFS, you generally delete something where it goes to the trash. If you
don't want that safety net, you can do something like 'rm -rf -skiptrash
/jeremy/stuff' in one command.
It would be nice to have something in the truncate ddl to skip the snapshot on
a per operation basis. Perhaps 'TRUNCATE solarsystem.earth NO SNAPSHOT'.
This might also be useful in those situations where you're just playing with
data and you don't want something to take a snapshot in a development system.
If that's the case, this would also be useful for the DROP operation, but that
convenience is not the main reason for this option.
+Additional information for newcomers:+
The idea is to support disabling snapshots when performing a Truncate as follows:
> Have a per operation truncate ddl "no snapshot" option
> ------------------------------------------------------
>
> Key: CASSANDRA-11721
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11721
> Project: Cassandra
> Issue Type: Improvement
> Components: Legacy/CQL
> Reporter: Jeremy Hanna
> Priority: Low
>