I thought about this a bit over the last few days, and there's actually quite a few problems present that would need to be addressed.
*Insecure JMX* First off - if someone has access to JMX, the entire system is already compromised. A bad actor can mess with the cluster topology, truncate tables, and do a ton of other disruptive stuff. But if we're going to go down this path I think we should apply your logic consistently to avoid creating a "solution" that has the same "problem" as we do now. I use quotes because I'm not entirely convinced the root cause of the problem is enabling some shell access, but I'll entertain it for the sake of the discussion. *Dynamic Configuration and Shell Scripts* Let's pretend that somehow an open JMX isn't already a *massive* security flaw by itself. Once an attacker has control of a system, the next phase of the attack relies on them dynamically changing the configuration to point to a different shell script, or to execute arbitrary shell scripts. I agree with the general idea that we don't want this - so in my mind the necessary solution here is to disable the ability to change the commit log archiving behavior at runtime. The idea that commit log archiving (and many other config settings) would be dynamically configurable is a massive security flaw that should be disallowed. If you want to take this a step further and claim there's a flaw with shell scripts in general, I'll even entertain that for a minute, but we need to examine if the proposed solution of moving code to Java actually solves the problem. *Dynamic Configuration and Java Code* Let's say we've removed the ability to use shell scripts, and we've gotten people to rewrite their shell code with java code, but we've left the dynamic configuration in. Going back to my original email, I mentioned copying commit logs off the node and into an object store. If someone is able to change the parameter dynamically at runtime, they could just as easily point to a public S3 bucket and commit logs would be archived there which is just as bad as the shell version. So if we are to convert this functionality to Java, we should also be making best practice recommendations on what users should and should not do. *Apply All Operational Best Practices* There's been a variety of examples of how a user can further compromise a machine once they have JMX, working in tandem with shell scripts, but I hope at this point you can see that the issue is fundamentally more complex than simply disallowing shell scripts. The issue is present in the Java examples as well, and is strongly tied to the issue of dynamic config. If we're to design this the "right" way, I think we'd want these properties: * Commit log archiving should only have the ability to compress and move files to a staging location * Once the files are moved to the staging location, the file should be moved somewhere else by a script NOT run as the C* user. * The commit log archive configuration should not be dynamically updatable, nor should any config affecting directories Moving the scell configuration to Java code is a half measure that's only solving a tiny problem in a massive chain of events and security holes. Jon On Tue, Sep 3, 2024 at 4:15 AM Štefan Miklošovič <smikloso...@apache.org> wrote: > Scott is right that this is also coming from us having a MBean method > which allows commands to be changed in runtime. The solution to that was > that we can prevent it from changing dynamically by having a configuration > property, which is actually by default set to false so FQL archiving is > ever possible only in case an operator explicitly enables that. > > However, even if commands were not modifiable in runtime via JMX and even > an operator has a chance to enable command execution explicitly, that still > does not make it 100% secure because an attacker does not need to change / > modify cassandra.yaml where the script to execute is configure, just the > content of such a script which is executed. > > So, introducing a similar property as it was done for FQL would in this > context mean that it would be used for disabling commitlog archiving / > restoring altogether while for FQL it would still do its thing, it would > just not archive it. Whole commitlog archiving / restoring is now based on > some commands to be executed so disabling commands being executed > practically means we disabled this whole feature as such. > > We could indeed make it flat out impossible to execute anything but these > scripts might contain some custom logic, like uploading to various cloud > storages (AWS, Azure, GCP or something completely custom), people have > their own "storage solutions" like remove the old logs when new come in > etc. so by disabling this altogether we would make it impossible and users > would need to accommodate that which would break their existing solutions. > > What I find confusing is that commitlog_archiving.properties is used both > for restoration AS WELL AS for archiving. If we're ever going to change how > this works, I think that it should be somehow logically split into > archiving and restoring parts. > > So, we might introduce a property in cassandra.yaml to disable > commitlog_archiving.properties altogether and we might deprecate > commitlog_archiving.properties way of doing this (still keep it there for > legacy reasons), add a new cassandra.yaml configuration section for that > and there make the archiving and the restoration pluggable. By default we > would provide "cp $from $to" implemented by Cassandra itself without any > process invocation. Then we might eventually drop > commitlog_archiving.properties but if the maintenance of that is cheap I > would just keep it, we would just flip the switch so a new way of doing > that would be preferable and the old way of doing it (via properties) would > need to be explicitly enabled. > > On Tue, Sep 3, 2024 at 11:55 AM guo Maxwell <cclive1...@gmail.com> wrote: > >> Thank you very much for everyone's replies, they are all very valuable >> feedback to me. >> >> I don't really understand what benefit adding restrictions would serve. >>> Would it be hard coded in C* itself, or configurable? If it's >>> configurable, then are we just making users enter their commands twice? >>> This is meant to be used by an operator, so who's actually protected by an >>> allow-list? >>> >> >> I agree with you too, so I may prefer to idea 2 with implement commitlog >> archiving in c* (not archiving by user defined shell), and deprecate the >> commitlog_archiving.properties >> <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> >> configuration >> through which we can set the properties of commitlog archiving. This view >> may be similar to that of Scott. >> >> If I want to use rclone or aws-cli to archive my commit logs that's my >>> prerogative. >>> >> >> Yes, it may be very flexible if we set aws-cli in shell. But as I know >> cassandra-medusa can also do this , and for me letting other tools to do >> this work may be better , for example we can upload more than one log (if >> log size is not big ) in a rpc to improve write throughput. >> >> I think we can divide this big task into several subtasks: >> >> - Add this feature that Stefan mentioned before for commitlog archive >> CASSANDRA-18550 >> <https://issues.apache.org/jira/browse/CASSANDRA-18550> in 5.x and >> may the original commitlog_archiving.properties deprecate. >> - Add the feature of archiving for cassandra (commitlog/query log/or >> event sstable) in the long run such as 6.0. >> >> I can prepare a cep if necessary. Looking forward to your feedback. >> >> >> We can divide this task into several subtasks and complete them step by >> step >> >> >> >> Jordan West <jorda...@gmail.com> 于2024年9月3日周二 00:55写道: >> >>> +1 to Scott’s comments. Once you expose those YAML config params outside >>> of a single node which many of us do, this becomes an RCE attack vector. >>> Something more structured as Scott proposes, similar to snapshots, would be >>> preferred. Would recommend a CEP. >>> >>> Jordan >>> >>> On Fri, Aug 30, 2024 at 20:58 C. Scott Andreas <csco...@icloud.com> >>> wrote: >>> >>>> I appreciate this report and would love to work toward the direction >>>> it recommends. >>>> >>>> I’m also familiar with past concerns raised by others with our FQL >>>> configuration parameters that allow passing shell commands for FQL segment >>>> archival. >>>> >>>> We bias toward ensuring an MBean exists for dynamic modification of >>>> yaml parameters. When we couple dynamic configuration updates and arbitrary >>>> shell command execution, we introduce vectors for arbitrary code execution, >>>> data exfiltration, and data compromise that have a lower bar to achieve >>>> than local file write. >>>> >>>> I agree that we should work toward removing operator-provided shell >>>> commands in yaml. >>>> >>>> For concerns like archival, these seem like areas that Cassandra could >>>> easily accomplish itself without shelling out to gzip/zstd/lz4-compress a >>>> file. Introducing a new config structure that declares an archival format, >>>> accompanying implementations for compression/decompression, and deprecation >>>> of the prior approach sounds both reasonable and desirable to me. >>>> >>>> – Scott >>>> >>>> — >>>> Mobile >>>> >>>> On Aug 30, 2024, at 10:25 PM, Bowen Song via dev < >>>> dev@cassandra.apache.org> wrote: >>>> >>>> >>>> >>>> I'm not sure what is the concern here. Is it a malicious user >>>> exploiting this? Or human error with unintended consequences? >>>> >>>> For malicious user, in order to exploit this, an attacker needs to be >>>> able to write to the config file. The config file on Linux by default is >>>> owned by the root user and has the -rw-r--r-- permission, that means the >>>> attacker must either gain root access to the system or has the ability to >>>> write arbitrary file on the filesystem. With either of these permission, >>>> they can already do almost anything they want (e.g. modify a SUID >>>> executable file). They wouldn't even need to exploit this to run a script >>>> or dangerous command. So this sounds like a non-issue to me, at least on >>>> Linux-based OSes. >>>> >>>> For human error, if the operator puts "rm -rf" in it, the software >>>> should treat it as the operator actually wants to do that. I personally >>>> don't like software attempting to outsmart human, which often ends up >>>> interfering with legitimate use cases. The best thing a software can do is >>>> log it, so there's some traceability if and when things go wrong. >>>> >>>> So, IMO, there's nothing wrong with the implementation in Cassandra. >>>> >>>> >>>> On 30/08/2024 17:13, guo Maxwell wrote: >>>> >>>> Commitlog has the ability of archive log file, see >>>> CommitLogArchiver.java >>>> <https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java>, >>>> we can achieve the purpose of archive and restore commitlog by >>>> configuring archive_command and restore_command in >>>> commitlog_archiving.properties >>>> <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> >>>> .The archive_command and restore_command can be some linux/unix shell >>>> command. However, I found that the shell command can actually be >>>> filled with any script, even if "*rm -rf"* .I have tested this >>>> situation and it finally succeeded with my test file being deleted. >>>> >>>> Personally, I think it is a dangerous behavior, because if there >>>> are no system-level restrictions and users are allowed to do anything in >>>> these shell commands. So here I want to discuss with you whether it is >>>> necessary to impose any restrictions on use, or do we need a new way of >>>> archiving/restoring commitlog? >>>> >>>> Of course, before that, I would also like to ask, how many people are >>>> using archive and restore of commitlog? It seems that the commitlog archive >>>> code has not been updated for a long time. >>>> >>>> I have two ideas. >>>> One is to make some restrictions on the command context based on the >>>> existing usage methods, such as strictly only allowing the current cp/mv/ln >>>> %path to %name.Other redundant strings in the command are not allowed. >>>> Another one , As I roughly investigated the archive of mysql and pg. >>>> They do not give users too much space (I am talking about letting users >>>> define their own archiving command ), and archive directly to a designated >>>> location. For us, I feel that we can refer to c * Incremental backup of >>>> sstable, add a hardlink to the commitlog to the specified location, but >>>> this place may modify the original configuration method, such as setting >>>> the archive location and restoring location of the node through nodetool >>>> and deprecate the commitlog_archiving.properties >>>> <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> >>>> configuration. >>>> >>>> I am just putting forward some views here, and looking forward to your >>>> feedback. 😀 >>>> >>>>