I'm not sure what the concern is here. Is it a malicious user exploiting this, or human error with unintended consequences?

For a malicious user to exploit this, the attacker needs to be able to write to the config file. On Linux, the config file is by default owned by the root user and has -rw-r--r-- permissions, which means the attacker must either gain root access to the system or be able to write arbitrary files on the filesystem. With either of these capabilities, they can already do almost anything they want (e.g. modify a SUID executable file). They wouldn't even need to exploit this to run a script or a dangerous command. So this sounds like a non-issue to me, at least on Linux-based OSes.

For human error, if the operator puts "rm -rf" in it, the software should assume the operator actually wants to do that. I personally don't like software attempting to outsmart humans, which often ends up interfering with legitimate use cases. The best thing software can do is log it, so there's some traceability if and when things go wrong.
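
To make the logging point concrete, here is a minimal sketch of "log it, then run it". This is not how CommitLogArchiver is written; the class name LoggedCommand and its methods are hypothetical, and it uses java.util.logging only to stay self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical helper, not part of Cassandra: records the operator-supplied
// command before running it, so there is a trace if something goes wrong.
final class LoggedCommand {
    private static final Logger LOG = Logger.getLogger(LoggedCommand.class.getName());

    private LoggedCommand() {}

    // Builds the log line for a command given as an argument vector.
    static String describe(List<String> argv) {
        return "Executing configured command: " + String.join(" ", argv);
    }

    // Logs the command, runs it unmodified, logs the exit status and returns it.
    static int run(List<String> argv) {
        LOG.info(describe(argv));
        try {
            Process process = new ProcessBuilder(argv).inheritIO().start();
            int exit = process.waitFor();
            LOG.info("Command exited with status " + exit);
            return exit;
        } catch (IOException e) {
            LOG.severe("Command could not be started: " + e.getMessage());
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for command", e);
        }
    }
}
```

The command itself is never rewritten or rejected here; the only addition is the audit trail.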

So, IMO, there's nothing wrong with the implementation in Cassandra.


On 30/08/2024 17:13, guo Maxwell wrote:
Commitlog has the ability to archive log files, see CommitLogArchiver.java <https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java>. We can archive and restore the commitlog by configuring archive_command and restore_command in commitlog_archiving.properties <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28>. The archive_command and restore_command can be any Linux/Unix shell command. However, I found that the shell command can actually be filled with any script, even "rm -rf". I tested this situation, and the command succeeded: my test file was deleted.

Personally, I think this is dangerous behavior, because without system-level restrictions users are allowed to do anything in these shell commands. So here I want to discuss with you whether it is necessary to impose any restrictions on use, or whether we need a new way of archiving/restoring the commitlog. Of course, before that, I would also like to ask: how many people are using commitlog archive and restore? It seems that the commitlog archive code has not been updated for a long time.

I have two ideas.
One is to restrict the command content based on the existing usage, such as strictly allowing only the current cp/mv/ln %path to %name form, with no other strings permitted in the command. The other comes from my rough investigation of archiving in MySQL and PostgreSQL: they do not give users much freedom (I am talking about letting users define their own archiving command) and archive directly to a designated location. For us, I feel that we can follow C*'s incremental backup of SSTables and add a hardlink of the commitlog in a specified location. This may change the original configuration method, though, such as setting the node's archive and restore locations through nodetool and deprecating the commitlog_archiving.properties <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> configuration.
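
As a rough illustration of the first idea only, a strict check could look like the sketch below. Nothing here exists in Cassandra: the class name ArchiveCommandCheck and the exact pattern are assumptions, and the pattern is deliberately narrow (a plain cp/mv/ln, the %path placeholder, and an absolute destination directory ending in %name).

```java
import java.util.regex.Pattern;

// Hypothetical validator, not part of Cassandra: accepts only commands of the
// form "<cp|mv|ln> %path <absolute dir>/%name" and rejects everything else.
final class ArchiveCommandCheck {
    // Optional /bin/ or /usr/bin/ prefix, one of cp/mv/ln, the literal %path,
    // then an absolute directory made of simple path segments, then /%name.
    private static final Pattern ALLOWED = Pattern.compile(
        "(?:/usr/bin/|/bin/)?(?:cp|mv|ln) %path (?:/[A-Za-z0-9._-]+)+/%name");

    private ArchiveCommandCheck() {}

    // Returns true only if the whole command matches the allowed form.
    static boolean isAllowed(String command) {
        return command != null && ALLOWED.matcher(command.trim()).matches();
    }
}
```

With this pattern, "/bin/ln %path /backup/%name" is accepted, while "rm -rf /tmp/test" and "/bin/ln %path /backup/%name; rm -rf /tmp/test" are rejected because the whole string must match. It would also reject legitimate commands that use options or wrapper scripts, which is the trade-off of this approach.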

I am just putting forward some views here, and I look forward to your feedback. 😀
