[ https://issues.apache.org/jira/browse/CASSANDRA-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817414#comment-17817414 ]
Sam Tunnicliffe commented on CASSANDRA-19394:
---------------------------------------------
Part of the benefit of dumping only to a binary format is precisely that it is
opaque and has a very limited set of uses. For now these include reloading a
binary dump into a new or existing cluster (e.g. for DR, debugging or cloning
purposes), or writing low level custom code to explore and modify the metadata.
Like Marcus said, this is really intended as an escape hatch for when (if)
things go catastrophically wrong and I agree with him that we should not change
this yet.
{quote}consume a lot of disk space if dumps are done frequently and they are
big.
{quote}
Dump files are currently pretty tiny, even for clusters with many members and
a large schema.
{quote}An adversary might just dump cluster metadata until no disk space is
left.
{quote}
Nodetool / JMX should be properly secured to prevent this. An adversary could
simply run {{nodetool assassinate}} if they had access.
> Rethink dumping of cluster metadata via CMSOperationsMBean
> ----------------------------------------------------------
>
> Key: CASSANDRA-19394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19394
> Project: Cassandra
> Issue Type: Improvement
> Components: Tool/nodetool, Transactional Cluster Metadata
> Reporter: Stefan Miklosovic
> Priority: Normal
>
> I think there are two problems in the implementation of dumping
> ClusterMetadata in CMSOperationsMBean
> 1) A dump is saved to a file, and the dumpClusterMetadata methods return just
> the name of the file where that dump is. However, a nodetool / JMX call to
> the MBean (or any other place this method is invoked from; we would like to
> offer a nodetool command which returns the dump) is meant to be usable from
> anywhere, remotely. So what happens when we execute nodetool or call these
> methods on a machine different from the one the node runs on? E.g. admins
> may just have a jumpbox to the cluster they manage and do not necessarily
> have access to the nodes themselves, so they would not be able to read the
> file.
> 2) It creates a temp file which is not deleted, so /tmp will fill up with
> these dumps until the node is turned off, which might be a long time, and
> this can consume a lot of disk space if dumps are done frequently and they
> are big. An adversary might just dump cluster metadata until no disk space
> is left.
> What I propose is that we return the whole dump as a string, not just the
> name of the file where we save it. We can also format the output on the
> client, or we can tell the server what format we want the dump returned in.
> If there is a concern about the size of the data to be returned, we might
> optionally allow dumps to be returned compressed, by simple zipping on the
> server and unzipping on the client, where the "zipper" is the standard
> java.util.zip, so it basically doesn't matter which JVM runs on the client
> and the server.
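
The compression idea at the end of the description could look roughly like
the sketch below. It is only an illustration: DumpCompression and its
compress/decompress methods are hypothetical names, not part of Cassandra or
of CMSOperationsMBean, and the sketch assumes the dump is already available
as a byte array on the server. It uses GZIP from java.util.zip and needs
Java 9 or later for InputStream.readAllBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper (not Cassandra code): round-trips a dump through GZIP.
final class DumpCompression {

    // Server side (assumed): compress the serialized dump before returning it.
    static byte[] compress(byte[] dump) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so read the buffer afterwards.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(dump);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Client side (assumed): restore the original bytes from the compressed payload.
    static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gzip.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder payload standing in for a real metadata dump.
        byte[] dump = "example cluster metadata dump".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(dump));
        System.out.println("round trip ok: " + Arrays.equals(dump, restored));
    }
}
```

Because both sides rely only on java.util.zip, the client and the server
would not need any shared dependency beyond the JDK. Whether returning the
dump over JMX is desirable at all is the open question in this ticket; the
sketch does not settle it.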
--
This message was sent by Atlassian Jira
(v8.20.10#820010)