[
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17525440#comment-17525440
]
Tibor Repasi edited comment on CASSANDRA-17568 at 4/21/22 6:37 AM:
-------------------------------------------------------------------
Hi, thanks for the feedback.
{quote}
1) return all data directories of a particular table(s).
2) return all data directories which are eligible to be deleted as the
respective keyspace / table (or both) does not exist anymore in Cassandra.
You implemented 1) but I miss 2).
{quote}
That's right. However, identifying the directories of deleted tables and
keyspaces is not trivial, since Cassandra doesn't keep track of them. Deleting
directories which aren't data paths of any existing table would assume that
nothing else could have created them, which makes this approach particularly
dangerous.
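To illustrate why this can only ever produce candidates, here is a minimal, report-only sketch in Python. It is not part of the patch. The function name `unreferenced_table_dirs` and the assumed layout `<data_root>/<keyspace>/<table-dir>` are assumptions made for illustration, and the list of live paths is expected to come from elsewhere (for example, from the output of the proposed subcommand). It deletes nothing, because a directory missing from the live set may still have been created by something other than Cassandra.

```python
from pathlib import Path


def unreferenced_table_dirs(data_root, live_paths):
    """Return table directories under data_root that are not in live_paths.

    Report-only: nothing is deleted. Assumes the hypothetical layout
    <data_root>/<keyspace>/<table-dir>.
    """
    # Resolve both sides so symlinks and relative paths compare equal.
    live = {Path(p).resolve() for p in live_paths}
    candidates = []
    for keyspace_dir in sorted(Path(data_root).iterdir()):
        if not keyspace_dir.is_dir():
            continue
        for table_dir in sorted(keyspace_dir.iterdir()):
            if table_dir.is_dir() and table_dir.resolve() not in live:
                candidates.append(table_dir)
    return candidates
```

The result is only a list for an operator to review; whether any entry is safe to remove still has to be decided by a human.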
The main goal is presumably to clean up directories that Cassandra created and
is no longer using. I like CASSANDRA-16843 and I really love CASSANDRA-16451,
which both improve control and handling of snapshots. I could imagine Cassandra
keeping track of directories belonging to dropped tables and keyspaces and
cleaning them up automatically under specific circumstances after some time.
Maybe data directories could have a bounded TTL? But I think that's a
completely different discussion track. This ticket and my patch are about
making things visible to support operators.
BTW, I am aware of the deadline for the version cut.
was (Author: rtib):
Hi, thanks for the feedback.
{quote}
1) return all data directories of a particular table(s).
2) return all data directories which are eligible to be deleted as the
respective keyspace / table (or both) does not exist anymore in Cassandra.
You implemented 1) but I miss 2).
{quote}
That's right. However, identifying the directories of deleted tables and
keyspaces is not trivial, since Cassandra doesn't keep track of them. Deleting
directories which aren't data paths of any existing table would assume that
nothing else could have created them, which makes this approach particularly
dangerous.
The main goal is presumably to clean up directories that Cassandra created and
is no longer using. I like CASSANDRA-16843 and I really love CASSANDRA-16451,
which both improve control and handling of snapshots. I could imagine Cassandra
keeping track of directories belonging to dropped tables and keyspaces and
cleaning them up automatically after some time. Maybe data directories could
have a bounded TTL? But I think that's a completely different discussion
track. This ticket and my patch are about making things visible to support
operators.
BTW, I am aware of the deadline for the version cut.
> Tool to list data directories
> -----------------------------
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
> Issue Type: New Feature
> Components: Tool/nodetool
> Reporter: Tibor Repasi
> Assignee: Tibor Repasi
> Priority: Normal
> Fix For: 4.x
>
>
> When a table is created, dropped and re-created with the same name, the
> directories of the dropped table remain within the data paths. Operators may
> find it hard to determine which directories belong to existing tables and
> which may be subject to removal. Although the information is available in CQL
> as well as in MBeans via JMX, convenient access to it is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
> Table : test
> Paths :
>
> /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> ----------------
> {code}
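As a side note on how the CQL information maps to the directory shown in the example output: the directory name is the table name followed by the table id with the dashes removed. A minimal sketch in Python, assuming the id is read from the `id` column of `system_schema.tables`; the helper name `table_dir_name` is made up for illustration and is not part of the patch.

```python
import uuid


def table_dir_name(table_name, table_id):
    """Build the on-disk directory name '<table>-<id as 32 hex digits>'.

    table_id may be a uuid.UUID or its canonical string form.
    """
    return f"{table_name}-{uuid.UUID(str(table_id)).hex}"
```

For the example above, `table_dir_name("test", "02f5b8d0-c0e3-11ec-b327-ff24df5ab301")` yields `test-02f5b8d0c0e311ecb327ff24df5ab301`, which matches the last path component in the listing.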