+1 to Scott’s comments. Once you expose those YAML config params outside of a single node which many of us do, this becomes an RCE attack vector. Something more structured as Scott proposes, similar to snapshots, would be preferred. Would recommend a CEP.
Jordan On Fri, Aug 30, 2024 at 20:58 C. Scott Andreas <csco...@icloud.com> wrote: > I appreciate this report and would love to work toward the direction it > recommends. > > I’m also familiar with past concerns raised by others with our FQL > configuration parameters that allow passing shell commands for FQL segment > archival. > > We bias toward ensuring an MBean exists for dynamic modification of yaml > parameters. When we couple dynamic configuration updates and arbitrary > shell command execution, we introduce vectors for arbitrary code execution, > data exfiltration, and data compromise that have a lower bar to achieve > than local file write. > > I agree that we should work toward removing operator-provided shell > commands in yaml. > > For concerns like archival, these seem like areas that Cassandra could > easily accomplish itself without shelling out to gzip/zstd/lz4-compress a > file. Introducing a new config structure that declares an archival format, > accompanying implementations for compression/decompression, and deprecation > of the prior approach sounds both reasonable and desirable to me. > > – Scott > > — > Mobile > > On Aug 30, 2024, at 10:25 PM, Bowen Song via dev <dev@cassandra.apache.org> > wrote: > > > > I'm not sure what is the concern here. Is it a malicious user exploiting > this? Or human error with unintended consequences? > > For malicious user, in order to exploit this, an attacker needs to be able > to write to the config file. The config file on Linux by default is owned > by the root user and has the -rw-r--r-- permission, that means the attacker > must either gain root access to the system or has the ability to write > arbitrary file on the filesystem. With either of these permission, they can > already do almost anything they want (e.g. modify a SUID executable file). > They wouldn't even need to exploit this to run a script or dangerous > command. So this sounds like a non-issue to me, at least on Linux-based > OSes. > > For human error, if the operator puts "rm -rf" in it, the software should > treat it as the operator actually wants to do that. I personally don't like > software attempting to outsmart human, which often ends up interfering with > legitimate use cases. The best thing a software can do is log it, so > there's some traceability if and when things go wrong. > > So, IMO, there's nothing wrong with the implementation in Cassandra. > > > On 30/08/2024 17:13, guo Maxwell wrote: > > Commitlog has the ability of archive log file, see > CommitLogArchiver.java > <https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java>, > we can achieve the purpose of archive and restore commitlog by > configuring archive_command and restore_command in > commitlog_archiving.properties > <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> > .The archive_command and restore_command can be some linux/unix shell > command. However, I found that the shell command can actually be filled > with any script, even if "*rm -rf"* .I have tested this situation and it > finally succeeded with my test file being deleted. > > Personally, I think it is a dangerous behavior, because if there are > no system-level restrictions and users are allowed to do anything in these > shell commands. So here I want to discuss with you whether it is > necessary to impose any restrictions on use, or do we need a new way of > archiving/restoring commitlog? > > Of course, before that, I would also like to ask, how many people are > using archive and restore of commitlog? It seems that the commitlog archive > code has not been updated for a long time. > > I have two ideas. > One is to make some restrictions on the command context based on the > existing usage methods, such as strictly only allowing the current cp/mv/ln > %path to %name.Other redundant strings in the command are not allowed. > Another one , As I roughly investigated the archive of mysql and pg. They > do not give users too much space (I am talking about letting users define > their own archiving command ), and archive directly to a designated > location. For us, I feel that we can refer to c * Incremental backup of > sstable, add a hardlink to the commitlog to the specified location, but > this place may modify the original configuration method, such as setting > the archive location and restoring location of the node through nodetool > and deprecate the commitlog_archiving.properties > <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> > configuration. > > I am just putting forward some views here, and looking forward to your > feedback. 😀 > >