I categorize this as Not a Problem. So far the two main justifications have been operator error and a fairly convoluted series of steps to an already compromised database. I don't view either of them as a reason to inconvenience users.
If someone wants to avoid the shell command, what's wrong with CDC?

Jon

On Tue, Sep 24, 2024 at 9:21 AM guo Maxwell <cclive1...@gmail.com> wrote:

> Hello, are there any new updates? 🤔
>
> guo Maxwell <cclive1...@gmail.com> wrote on Wed, Sep 18, 2024 at 4:06 PM:
>
>> Do you have any new updates on this DISCUSS?
>>
>> - The reason this pattern is popular is it allows extension of functionality ahead of the database. Some people copy to a NAS/SAN. Some people copy to S3. Some people copy to their own object storage that isn’t S3 compatible. “Compress and move” is super limiting, because “move” varies remarkably between environments.
>>
>> Yes, it is indeed very flexible to do it this way, but would it be more appropriate to decouple archiving files to heterogeneous storage and leave that to other systems to handle? Then we would only do compression and copying (file linking, like sstable incremental backup).
>>
>> Štefan Miklošovič <smikloso...@apache.org> wrote on Thu, Sep 5, 2024 at 04:18:
>>
>>> On Wed, Sep 4, 2024 at 8:34 PM Jon Haddad <j...@jonhaddad.com> wrote:
>>>
>>>> I thought about this a bit over the last few days, and there are actually quite a few problems present that would need to be addressed.
>>>>
>>>> *Insecure JMX*
>>>>
>>>> First off - if someone has access to JMX, the entire system is already compromised. A bad actor can mess with the cluster topology, truncate tables, and do a ton of other disruptive stuff. But if we're going to go down this path I think we should apply your logic consistently to avoid creating a "solution" that has the same "problem" as we do now. I use quotes because I'm not entirely convinced the root cause of the problem is enabling some shell access, but I'll entertain it for the sake of the discussion.
>>>>
>>>> *Dynamic Configuration and Shell Scripts*
>>>>
>>>> Let's pretend that somehow an open JMX isn't already a *massive* security flaw by itself.
>>>> Once an attacker has control of a system, the next phase of the attack relies on them dynamically changing the configuration to point to a different shell script, or to execute arbitrary shell scripts.
>>>>
>>>> I agree with the general idea that we don't want this - so in my mind the necessary solution here is to disable the ability to change the commit log archiving behavior at runtime.
>>>>
>>>> The idea that commit log archiving (and many other config settings) would be dynamically configurable is a massive security flaw that should be disallowed. If you want to take this a step further and claim there's a flaw with shell scripts in general, I'll even entertain that for a minute, but we need to examine if the proposed solution of moving code to Java actually solves the problem.
>>>>
>>>> *Dynamic Configuration and Java Code*
>>>>
>>>> Let's say we've removed the ability to use shell scripts, and we've gotten people to rewrite their shell code with Java code, but we've left the dynamic configuration in. Going back to my original email, I mentioned copying commit logs off the node and into an object store. If someone is able to change the parameter dynamically at runtime, they could just as easily point to a public S3 bucket and commit logs would be archived there, which is just as bad as the shell version. So if we are to convert this functionality to Java, we should also be making best practice recommendations on what users should and should not do.
>>>
>>> I think what you meant here is that if we allowed people to provide a pluggable way of copying stuff over, and they coded it up, put that JAR on the class path, (re)started Cassandra, etc., then someone might reconfigure this custom solution at runtime? Yeah, we do not want this. We can make it pluggable, but not reconfigurable.
>>> With it pluggable but not reconfigurable, to replace it with something else an attacker would basically need to restart Cassandra with a rogue JAR on the class path. If they can do that, I think the system is beyond any salvation and completely compromised anyway.
>>>
>>>> *Apply All Operational Best Practices*
>>>>
>>>> There's been a variety of examples of how a user can further compromise a machine once they have JMX, working in tandem with shell scripts, but I hope at this point you can see that the issue is fundamentally more complex than simply disallowing shell scripts. The issue is present in the Java examples as well, and is strongly tied to the issue of dynamic config. If we're to design this the "right" way, I think we'd want these properties:
>>>>
>>>> * Commit log archiving should only have the ability to compress and move files to a staging location
>>>> * Once the files are moved to the staging location, the file should be moved somewhere else by a script NOT run as the C* user.
>>>> * The commit log archive configuration should not be dynamically updatable, nor should any config affecting directories
>>>
>>> This would essentially copy the logic we have for snapshots, as Jordan mentioned. I do not mind having it like that. It is a good question why exactly we need it to be reconfigurable. Why is it like that? Do people not want to restart a whole cluster of 100 nodes when the destination of the archived commit logs changes? How often does this happen, that we need to expose ourselves to the problems related to it?
>>>
>>>> Moving the shell configuration to Java code is a half measure that's only solving a tiny problem in a massive chain of events and security holes.
>>>>
>>>> Jon
>>>>
>>>> On Tue, Sep 3, 2024 at 4:15 AM Štefan Miklošovič <smikloso...@apache.org> wrote:
>>>>
>>>>> Scott is right that this also comes from us having an MBean method which allows commands to be changed at runtime. The solution to that was to prevent it from changing dynamically by having a configuration property, which is actually set to false by default, so FQL archiving is only ever possible if an operator explicitly enables it.
>>>>>
>>>>> However, even if commands were not modifiable at runtime via JMX, and even if an operator has to enable command execution explicitly, that still does not make it 100% secure, because an attacker does not need to modify cassandra.yaml, where the script to execute is configured, just the content of the script that is executed.
>>>>>
>>>>> So, introducing a property similar to the one for FQL would in this context mean that it would disable commitlog archiving / restoring altogether, while for FQL it would still do its thing, it would just not archive it. The whole of commitlog archiving / restoring is currently based on executing commands, so disabling command execution practically means disabling the whole feature.
>>>>>
>>>>> We could indeed make it flat out impossible to execute anything, but these scripts might contain custom logic, like uploading to various cloud storages (AWS, Azure, GCP or something completely custom), and people have their own "storage solutions", like removing the old logs when new ones come in, etc. By disabling this altogether we would make that impossible, and users would need to accommodate it, which would break their existing solutions.
>>>>>
>>>>> What I find confusing is that commitlog_archiving.properties is used both for restoration AS WELL AS for archiving.
>>>>> If we're ever going to change how this works, I think that it should be somehow logically split into archiving and restoring parts.
>>>>>
>>>>> So, we might introduce a property in cassandra.yaml to disable commitlog_archiving.properties altogether, deprecate the commitlog_archiving.properties way of doing this (still keeping it there for legacy reasons), add a new cassandra.yaml configuration section for it, and make the archiving and the restoration pluggable there. By default we would provide "cp $from $to" implemented by Cassandra itself without any process invocation. Then we might eventually drop commitlog_archiving.properties, but if maintaining it is cheap I would just keep it; we would just flip the switch so that the new way is preferred and the old way (via properties) needs to be explicitly enabled.
>>>>>
>>>>> On Tue, Sep 3, 2024 at 11:55 AM guo Maxwell <cclive1...@gmail.com> wrote:
>>>>>
>>>>>> Thank you very much for everyone's replies, they are all very valuable feedback to me.
>>>>>>
>>>>>>> I don't really understand what benefit adding restrictions would serve. Would it be hard coded in C* itself, or configurable? If it's configurable, then are we just making users enter their commands twice? This is meant to be used by an operator, so who's actually protected by an allow-list?
>>>>>>
>>>>>> I agree with you too, so I prefer idea 2: implement commitlog archiving in C* itself (not archiving via a user-defined shell command), and deprecate the commitlog_archiving.properties <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> configuration through which we can set the properties of commitlog archiving. This view may be similar to Scott's.
>>>>>>
>>>>>>> If I want to use rclone or aws-cli to archive my commit logs that's my prerogative.
>>>>>>
>>>>>> Yes, it may be very flexible if we use aws-cli in the shell command. But as far as I know cassandra-medusa can also do this, and to me letting other tools do this work may be better; for example, we can upload more than one log (if the log size is not big) in one RPC to improve write throughput.
>>>>>>
>>>>>> I think we can divide this big task into several subtasks:
>>>>>>
>>>>>> - Add the feature that Stefan mentioned before for commitlog archiving, CASSANDRA-18550 <https://issues.apache.org/jira/browse/CASSANDRA-18550>, in 5.x, and possibly deprecate the original commitlog_archiving.properties.
>>>>>> - Add archiving support to Cassandra itself (commitlog / query log / or even sstable) in the long run, such as in 6.0.
>>>>>>
>>>>>> I can prepare a CEP if necessary. Looking forward to your feedback.
>>>>>>
>>>>>> We can divide this task into several subtasks and complete them step by step.
>>>>>>
>>>>>> Jordan West <jorda...@gmail.com> wrote on Tue, Sep 3, 2024 at 00:55:
>>>>>>
>>>>>>> +1 to Scott’s comments. Once you expose those YAML config params outside of a single node, which many of us do, this becomes an RCE attack vector. Something more structured as Scott proposes, similar to snapshots, would be preferred. Would recommend a CEP.
>>>>>>>
>>>>>>> Jordan
>>>>>>>
>>>>>>> On Fri, Aug 30, 2024 at 20:58 C. Scott Andreas <csco...@icloud.com> wrote:
>>>>>>>
>>>>>>>> I appreciate this report and would love to work toward the direction it recommends.
>>>>>>>>
>>>>>>>> I’m also familiar with past concerns raised by others with our FQL configuration parameters that allow passing shell commands for FQL segment archival.
>>>>>>>>
>>>>>>>> We bias toward ensuring an MBean exists for dynamic modification of yaml parameters.
>>>>>>>> When we couple dynamic configuration updates and arbitrary shell command execution, we introduce vectors for arbitrary code execution, data exfiltration, and data compromise that have a lower bar to achieve than local file write.
>>>>>>>>
>>>>>>>> I agree that we should work toward removing operator-provided shell commands in yaml.
>>>>>>>>
>>>>>>>> For concerns like archival, these seem like areas that Cassandra could easily accomplish itself without shelling out to gzip/zstd/lz4-compress a file. Introducing a new config structure that declares an archival format, accompanying implementations for compression/decompression, and deprecation of the prior approach sounds both reasonable and desirable to me.
>>>>>>>>
>>>>>>>> – Scott
>>>>>>>>
>>>>>>>> --
>>>>>>>> Mobile
>>>>>>>>
>>>>>>>> On Aug 30, 2024, at 10:25 PM, Bowen Song via dev <dev@cassandra.apache.org> wrote:
>>>>>>>>
>>>>>>>> I'm not sure what the concern is here. Is it a malicious user exploiting this? Or human error with unintended consequences?
>>>>>>>>
>>>>>>>> For a malicious user: in order to exploit this, an attacker needs to be able to write to the config file. The config file on Linux is by default owned by the root user and has the -rw-r--r-- permission, which means the attacker must either gain root access to the system or have the ability to write arbitrary files on the filesystem. With either of these permissions, they can already do almost anything they want (e.g. modify a SUID executable file). They wouldn't even need to exploit this to run a script or dangerous command. So this sounds like a non-issue to me, at least on Linux-based OSes.
>>>>>>>>
>>>>>>>> For human error, if the operator puts "rm -rf" in it, the software should treat that as what the operator actually wants to do. I personally don't like software attempting to outsmart humans, which often ends up interfering with legitimate use cases. The best thing software can do is log it, so there's some traceability if and when things go wrong.
>>>>>>>>
>>>>>>>> So, IMO, there's nothing wrong with the implementation in Cassandra.
>>>>>>>>
>>>>>>>> On 30/08/2024 17:13, guo Maxwell wrote:
>>>>>>>>
>>>>>>>> Commitlog has the ability to archive log files, see CommitLogArchiver.java <https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java>. We can archive and restore the commitlog by configuring archive_command and restore_command in commitlog_archiving.properties <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28>. The archive_command and restore_command can be any Linux/Unix shell command. However, I found that the shell command can actually be filled with any script, even "*rm -rf*". I have tested this situation and it succeeded, with my test file being deleted.
>>>>>>>>
>>>>>>>> Personally, I think this is dangerous behavior, because if there are no system-level restrictions, users are allowed to do anything in these shell commands. So here I want to discuss with you whether it is necessary to impose any restrictions on use, or whether we need a new way of archiving/restoring the commitlog.
>>>>>>>>
>>>>>>>> Of course, before that, I would also like to ask, how many people are using archive and restore of commitlog? It seems that the commitlog archive code has not been updated for a long time.
>>>>>>>>
>>>>>>>> I have two ideas.
>>>>>>>>
>>>>>>>> One is to restrict the command content based on the existing usage, such as strictly only allowing the current cp/mv/ln %path to %name. Other redundant strings in the command would not be allowed.
>>>>>>>>
>>>>>>>> The other: as far as I have roughly investigated, the archiving in MySQL and pg does not give users much room (I am talking about letting users define their own archiving command) and archives directly to a designated location. For us, I feel that we can follow C*'s incremental backup of sstables and add a hardlink of the commitlog to the specified location, but this may change the original configuration method, such as setting the archive location and restore location of the node through nodetool and deprecating the commitlog_archiving.properties <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> configuration.
>>>>>>>>
>>>>>>>> I am just putting forward some views here, and looking forward to your feedback. 😀
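
For reference, here is a minimal sketch of the second idea in the original message: hard-linking a commit log segment into a designated archive directory instead of running a configured shell command. It is Python rather than Cassandra's Java, and every name in it (archive_segment, the directory layout, the segment file name) is hypothetical, not Cassandra's actual API.

```python
import os
import tempfile
from pathlib import Path


def archive_segment(segment: Path, archive_dir: Path) -> Path:
    """Hard-link a commit log segment into archive_dir and return the link path.

    No shell is invoked, so there is no command string to tamper with.
    Hard links require both paths to be on the same filesystem.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / segment.name
    if target.exists():
        # Refuse to overwrite a segment that was already archived.
        raise FileExistsError(f"{target} is already archived")
    os.link(segment, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        segment = root / "commitlog" / "CommitLog-7-1.log"  # made-up file name
        segment.parent.mkdir()
        segment.write_bytes(b"mutation bytes")

        linked = archive_segment(segment, root / "archive")
        # Both names refer to the same file, so no data was copied.
        print(os.path.samefile(segment, linked))
```

This only covers the archive side. Restore, cleanup of old segments, and moving files off the node (for example by a process not run as the C* user, as suggested earlier in the thread) are left open here, as they are in the discussion.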