Re: 【DISCUSS】The configuration of Commitlog archiving

Bowen Song via dev Fri, 30 Aug 2024 19:25:44 -0700

I'm not sure what the concern is here. Is it a malicious user exploiting this? Or human error with unintended consequences?

For a malicious user: to exploit this, an attacker needs to be able to write to the config file. On Linux, the config file is by default owned by the root user and has -rw-r--r-- permissions, which means the attacker must either gain root access to the system or have the ability to write arbitrary files on the filesystem. With either of these capabilities, they can already do almost anything they want (e.g. modify a SUID executable file). They wouldn't even need to exploit this to run a script or a dangerous command. So this sounds like a non-issue to me, at least on Linux-based OSes.
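
As a rough illustration of the permission argument above, here is a minimal Python sketch that checks whether anyone other than the file's owner can write to a config file. The function name is hypothetical and this is not part of Cassandra; it only inspects the mode bits and does not verify that the owner is root.

```python
import os
import stat


def config_is_owner_writable_only(path: str) -> bool:
    """Return True if neither group nor others can write to `path`."""
    mode = os.stat(path).st_mode
    # With -rw-r--r-- (0644), both of these write bits are clear.
    return not (mode & (stat.S_IWGRP | stat.S_IWOTH))
```

With the default -rw-r--r-- mode this returns True, so editing archive_command would require being the owner or root.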

For human error: if the operator puts "rm -rf" in it, the software should assume that the operator actually wants to do that. I personally don't like software attempting to outsmart humans, which often ends up interfering with legitimate use cases. The best thing software can do is log it, so there's some traceability if and when things go wrong.
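
To make the "log it, don't second-guess it" approach concrete, here is a minimal Python sketch. The function and logger names are hypothetical and this is not how CommitLogArchiver is written; it simply expands the %path and %name placeholders, records the exact command, and runs it unmodified.

```python
import logging
import shlex
import subprocess

logger = logging.getLogger("commitlog.archiver")  # hypothetical logger name


def run_archive_command(template: str, path: str, name: str) -> int:
    """Expand placeholders, log the exact command, then run it as given."""
    # Split first, then substitute, so a path containing spaces stays one argument.
    argv = [
        token.replace("%path", path).replace("%name", name)
        for token in shlex.split(template)
    ]
    logger.info("Executing archive command: %s", argv)
    completed = subprocess.run(argv, check=False)
    logger.info("Archive command exited with code %d", completed.returncode)
    return completed.returncode
```

The command is never filtered or rewritten; the log lines are the only addition, which gives the traceability without interfering with legitimate use cases.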


So, IMO, there's nothing wrong with the implementation in Cassandra.


On 30/08/2024 17:13, guo Maxwell wrote:

Commitlog has the ability to archive log files; see CommitLogArchiver.java <https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java>. We can archive and restore the commitlog by configuring archive_command and restore_command in commitlog_archiving.properties <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28>. The archive_command and restore_command can be any Linux/Unix shell command. However, I found that the shell command can actually be filled with any script, even "rm -rf". I have tested this situation, and it succeeded, with my test file being deleted.
Personally, I think this is dangerous behavior, because if there are no system-level restrictions, users are allowed to do anything in these shell commands. So I want to discuss whether it is necessary to impose any restrictions on their use, or whether we need a new way of archiving/restoring the commitlog. Of course, before that, I would also like to ask: how many people are using commitlog archive and restore? It seems that the commitlog archive code has not been updated for a long time.
I have two ideas.
One is to restrict the command content based on the existing usage, such as strictly allowing only the current cp/mv/ln %path to %name form. Other redundant strings in the command would not be allowed.
The other: as far as I have roughly investigated, the archiving in MySQL and PG does not give users much freedom (I am talking about letting users define their own archiving command) and archives directly to a designated location. For us, I feel that we can follow C* incremental backup of SSTables and add a hardlink to the commitlog in the specified location. But this may change the original configuration method, such as setting the archive location and restore location of the node through nodetool and deprecating the commitlog_archiving.properties <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> configuration.
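
The first idea could look roughly like the following Python sketch. It is only an illustration under assumed rules (exactly three tokens, a cp/mv/ln binary, %path as the source, a target ending in %name); the function name and the exact rules are hypothetical, not an existing Cassandra check.

```python
import os
import shlex

ALLOWED_BINARIES = {"cp", "mv", "ln"}  # assumed whitelist
SHELL_METACHARACTERS = set(";|&$`<>\n")


def is_allowed_archive_command(command: str) -> bool:
    """Accept only commands of the form '<cp|mv|ln> %path <dir>/%name'."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return False
    if len(tokens) != 3:
        return False
    binary, source, target = tokens
    if os.path.basename(binary) not in ALLOWED_BINARIES:
        return False
    if source != "%path" or not target.endswith("%name"):
        return False
    # Reject anything in the target that a shell could interpret.
    return not (SHELL_METACHARACTERS & set(target))
```

Under these rules "/bin/ln %path /backup/%name" is accepted, while "rm -rf /data" or a command with extra options or trailing strings is rejected. The trade-off is that custom archiving scripts would no longer be allowed.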
I am just putting forward some views here, and I look forward to your feedback. 😀
