[ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17525284#comment-17525284
 ] 

Stefan Miklosovic commented on CASSANDRA-17568:
-----------------------------------------------

Hi [~rtib], thanks for the patch & idea. I think I was the one who added that 
"getDataPaths" method to be able to retrieve this information and I am glad it 
is useful and you are building on top of that.

I briefly looked into the code. If we ever merge something like this, it 
would be nice to have it more robust and prepared for other scenarios. The tool 
should be able to:

1) return all data directories of one or more particular tables.
2) return all data directories which are eligible for deletion because the 
respective keyspace / table (or both) no longer exists in Cassandra.

You implemented 1) but 2) is missing. As I understand it, you currently get the 
list from 1) and then go over all the dirs and "diff" them to see which ones you 
can remove. Why not do it in such a way that you get the list of tables to 
remove directly?
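
To make the "diff" step concrete, here is a rough sketch in Python. It is not 
taken from the patch: the function name stale_table_dirs is hypothetical, and it 
assumes the <data dir>/<keyspace>/<table>-<id> layout seen in the example 
output of the issue description. It only lists candidates and deletes nothing.

```python
from pathlib import Path
from typing import Iterable, List, Set


def stale_table_dirs(data_roots: Iterable[str],
                     live_paths: Iterable[str]) -> List[Path]:
    """List table directories on disk that are not among the live paths.

    data_roots: the configured data directories (assumed input).
    live_paths: directories of existing tables, e.g. collected from the
                output of the proposed datapaths subcommand (assumed input).
    """
    live: Set[Path] = {Path(p).resolve() for p in live_paths}
    stale: List[Path] = []
    for root in data_roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue
        # Assumed layout: <root>/<keyspace>/<table>-<table id>
        for keyspace_dir in sorted(root_path.iterdir()):
            if not keyspace_dir.is_dir():
                continue
            for table_dir in sorted(keyspace_dir.iterdir()):
                if table_dir.is_dir() and table_dir.resolve() not in live:
                    stale.append(table_dir)
    return stale
```

Doing this outside of Cassandra means the operator has to keep both lists in 
sync by hand, which is why I would prefer the tool to return the second list 
itself.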

I think 2) is somewhat related to what [~paulo] was trying to do with his 
refactoring of data dir parsing. I will link the JIRA if I find it. That 
refactoring was also motivated by the fact that right now you cannot list 
snapshots of dropped tables, because Cassandra does not "see" the tables 
anymore once they are dropped. Hence I think we first need to move Paulo's work 
forward and, once that is done, we would expose the information about which 
tables are not meant to be there anymore, which would be your list.

> Tool to list data directories
> -----------------------------
>
>                 Key: CASSANDRA-17568
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Tool/nodetool
>            Reporter: Tibor Repasi
>            Assignee: Tibor Repasi
>            Priority: Normal
>             Fix For: 4.x
>
>
> When a table is created, dropped and re-created with the same name, 
> directories remain within data paths. Operators may find it challenging to 
> work out which directories belong to existing tables and which may be subject 
> to removal. Although the information is available in CQL as well as in MBeans 
> via JMX, convenient access to this information is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all 
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>       Table : test
>       Paths :
>               /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> ----------------
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
