Thanks, Jon. But should we at least add the ability to dynamically turn off archiving, which Stefan mentioned before?
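
To make the question concrete, here is a minimal sketch of one way such a switch could look. It is only an illustration: the class name ArchiveSwitch and its methods are hypothetical and do not exist in Cassandra. The assumption is that the flag is read from configuration at startup and can be turned off at runtime but never turned back on without a restart, so it cannot be used to point archiving at something else.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not Cassandra code: a one-way switch for archive commands.
final class ArchiveSwitch {
    // Initial value is assumed to come from configuration read at startup.
    private final AtomicBoolean enabled;

    ArchiveSwitch(boolean enabledAtStartup) {
        this.enabled = new AtomicBoolean(enabledAtStartup);
    }

    // Turning archiving off at runtime is allowed; there is deliberately no enable().
    void disable() {
        enabled.set(false);
    }

    boolean isEnabled() {
        return enabled.get();
    }

    // Returns the configured command only while archiving is enabled and a command is set.
    Optional<String> commandFor(String archiveCommand) {
        if (!enabled.get() || archiveCommand == null || archiveCommand.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(archiveCommand);
    }
}
```

With this shape, a caller would ask commandFor(...) before every archive attempt and skip archiving when the result is empty.
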
Jon Haddad <j...@rustyrazorblade.com> wrote on Wed, Sep 25, 2024 at 01:16:

> I categorize this as Not a Problem.
>
> So far the two main justifications have been operator error and a fairly convoluted series of steps to an already compromised database. I don't view either of them as a reason to inconvenience users.
>
> If someone wants to avoid the shell command, what's wrong with CDC?
>
> Jon
>
> On Tue, Sep 24, 2024 at 9:21 AM guo Maxwell <cclive1...@gmail.com> wrote:
>
>> Hello, are there any new updates? 🤔
>>
>> guo Maxwell <cclive1...@gmail.com> wrote on Wed, Sep 18, 2024 at 4:06 PM:
>>
>>> Do you have any new updates on this DISCUSS thread?
>>>
>>> - The reason this pattern is popular is it allows extension of functionality ahead of the database. Some people copy to a NAS/SAN. Some people copy to S3. Some people copy to their own object storage that isn’t s3 compatible. “Compress and move” is super limiting, because “move” varies remarkably between environments.
>>>
>>> Yes, it is indeed very flexible to use it this way, but would it be more appropriate to decouple archiving to heterogeneous storage and leave that to other systems to handle specifically? We would then only do compression and copying (file linking, like sstable incremental backup).
>>>
>>> Štefan Miklošovič <smikloso...@apache.org> wrote on Thu, Sep 5, 2024 at 04:18:
>>>
>>>> On Wed, Sep 4, 2024 at 8:34 PM Jon Haddad <j...@jonhaddad.com> wrote:
>>>>
>>>>> I thought about this a bit over the last few days, and there's actually quite a few problems present that would need to be addressed.
>>>>>
>>>>> *Insecure JMX*
>>>>>
>>>>> First off - if someone has access to JMX, the entire system is already compromised. A bad actor can mess with the cluster topology, truncate tables, and do a ton of other disruptive stuff. But if we're going to go down this path I think we should apply your logic consistently to avoid creating a "solution" that has the same "problem" as we do now.
>>>>> I use quotes because I'm not entirely convinced the root cause of the problem is enabling some shell access, but I'll entertain it for the sake of the discussion.
>>>>>
>>>>> *Dynamic Configuration and Shell Scripts*
>>>>>
>>>>> Let's pretend that somehow an open JMX isn't already a *massive* security flaw by itself. Once an attacker has control of a system, the next phase of the attack relies on them dynamically changing the configuration to point to a different shell script, or to execute arbitrary shell scripts.
>>>>>
>>>>> I agree with the general idea that we don't want this - so in my mind the necessary solution here is to disable the ability to change the commit log archiving behavior at runtime.
>>>>>
>>>>> The idea that commit log archiving (and many other config settings) would be dynamically configurable is a massive security flaw that should be disallowed. If you want to take this a step further and claim there's a flaw with shell scripts in general, I'll even entertain that for a minute, but we need to examine if the proposed solution of moving code to Java actually solves the problem.
>>>>>
>>>>> *Dynamic Configuration and Java Code*
>>>>>
>>>>> Let's say we've removed the ability to use shell scripts, and we've gotten people to rewrite their shell code with Java code, but we've left the dynamic configuration in. Going back to my original email, I mentioned copying commit logs off the node and into an object store. If someone is able to change the parameter dynamically at runtime, they could just as easily point to a public S3 bucket and commit logs would be archived there, which is just as bad as the shell version. So if we are to convert this functionality to Java, we should also be making best practice recommendations on what users should and should not do.
>>>>>
>>>> I think what you meant here is that if we allowed people to provide a pluggable way how stuff is copied over and they would code it up, put that JAR on the class path, Cassandra (re)started etc, then someone might reconfigure this custom solution at runtime? Yeah, we do not want this. We can make it pluggable, but not reconfigurable. To have it pluggable and not reconfigurable, then to replace it with something else, an attacker would basically need to restart Cassandra with a rogue JAR on the class path. In order to do that, I think that at this point it would be beyond any salvation and the system is completely compromised anyway.
>>>>
>>>>> *Apply All Operational Best Practices*
>>>>>
>>>>> There's been a variety of examples of how a user can further compromise a machine once they have JMX, working in tandem with shell scripts, but I hope at this point you can see that the issue is fundamentally more complex than simply disallowing shell scripts. The issue is present in the Java examples as well, and is strongly tied to the issue of dynamic config. If we're to design this the "right" way, I think we'd want these properties:
>>>>>
>>>>> * Commit log archiving should only have the ability to compress and move files to a staging location
>>>>> * Once the files are moved to the staging location, the file should be moved somewhere else by a script NOT run as the C* user.
>>>>> * The commit log archive configuration should not be dynamically updatable, nor should any config affecting directories
>>>>>
>>>> This would essentially copy the logic we have for snapshots as Jordan mentioned. I do not mind having it like that. It is a good question what exactly we need it to be reconfigurable for. Why is it like that?
>>>> People do not want to restart a whole cluster consisting of 100 nodes when the destination of the archived commit logs changes? How often does this happen that we need to expose ourselves to the problems related to that?
>>>>
>>>>> Moving the shell configuration to Java code is a half measure that's only solving a tiny problem in a massive chain of events and security holes.
>>>>>
>>>>> Jon
>>>>>
>>>>> On Tue, Sep 3, 2024 at 4:15 AM Štefan Miklošovič <smikloso...@apache.org> wrote:
>>>>>
>>>>>> Scott is right that this is also coming from us having an MBean method which allows commands to be changed at runtime. The solution to that was that we can prevent it from changing dynamically by having a configuration property, which is actually set to false by default, so FQL archiving is only ever possible if an operator explicitly enables it.
>>>>>>
>>>>>> However, even if commands were not modifiable at runtime via JMX, and even if an operator has to enable command execution explicitly, that still does not make it 100% secure, because an attacker does not need to modify cassandra.yaml, where the script to execute is configured, just the content of the script which is executed.
>>>>>>
>>>>>> So, introducing a similar property as it was done for FQL would in this context mean that it would be used for disabling commitlog archiving / restoring altogether, while for FQL it would still do its thing, it would just not archive it. Whole commitlog archiving / restoring is now based on some commands to be executed, so disabling commands being executed practically means we disabled this whole feature as such.
>>>>>>
>>>>>> We could indeed make it flat out impossible to execute anything, but these scripts might contain some custom logic, like uploading to various cloud storages (AWS, Azure, GCP or something completely custom), people have their own "storage solutions" like removing the old logs when new ones come in etc., so by disabling this altogether we would make it impossible and users would need to accommodate that, which would break their existing solutions.
>>>>>>
>>>>>> What I find confusing is that commitlog_archiving.properties is used both for restoration AS WELL AS for archiving. If we're ever going to change how this works, I think that it should be somehow logically split into archiving and restoring parts.
>>>>>>
>>>>>> So, we might introduce a property in cassandra.yaml to disable commitlog_archiving.properties altogether and we might deprecate the commitlog_archiving.properties way of doing this (still keep it there for legacy reasons), add a new cassandra.yaml configuration section for that and there make the archiving and the restoration pluggable. By default we would provide "cp $from $to" implemented by Cassandra itself without any process invocation. Then we might eventually drop commitlog_archiving.properties, but if the maintenance of that is cheap I would just keep it; we would just flip the switch so a new way of doing that would be preferable and the old way of doing it (via properties) would need to be explicitly enabled.
>>>>>>
>>>>>> On Tue, Sep 3, 2024 at 11:55 AM guo Maxwell <cclive1...@gmail.com> wrote:
>>>>>>
>>>>>>> Thank you very much for everyone's replies, they are all very valuable feedback to me.
>>>>>>>
>>>>>>>> I don't really understand what benefit adding restrictions would serve. Would it be hard coded in C* itself, or configurable?
>>>>>>>> If it's configurable, then are we just making users enter their commands twice? This is meant to be used by an operator, so who's actually protected by an allow-list?
>>>>>>>>
>>>>>>> I agree with you too, so I lean toward idea 2: implement commitlog archiving in C* itself (not archiving by a user-defined shell command), and deprecate the commitlog_archiving.properties <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> configuration through which we can set the properties of commitlog archiving. This view may be similar to that of Scott.
>>>>>>>
>>>>>>>> If I want to use rclone or aws-cli to archive my commit logs that's my prerogative.
>>>>>>>>
>>>>>>> Yes, it may be very flexible if we set aws-cli in the shell command. But as far as I know cassandra-medusa can also do this, and for me, letting other tools do this work may be better; for example, we can upload more than one log (if the log size is not big) in one RPC to improve write throughput.
>>>>>>>
>>>>>>> I think we can divide this big task into several subtasks:
>>>>>>>
>>>>>>> - Add the feature that Stefan mentioned before (CASSANDRA-18550 <https://issues.apache.org/jira/browse/CASSANDRA-18550>) for commitlog archiving in 5.x, and possibly deprecate the original commitlog_archiving.properties.
>>>>>>> - Add the feature of archiving for Cassandra (commitlog / query log / or even sstable) in the long run, such as 6.0.
>>>>>>>
>>>>>>> I can prepare a CEP if necessary. Looking forward to your feedback.
>>>>>>>
>>>>>>> We can divide this task into several subtasks and complete them step by step.
>>>>>>>
>>>>>>> Jordan West <jorda...@gmail.com> wrote on Tue, Sep 3, 2024 at 00:55:
>>>>>>>
>>>>>>>> +1 to Scott’s comments.
>>>>>>>> Once you expose those YAML config params outside of a single node which many of us do, this becomes an RCE attack vector. Something more structured as Scott proposes, similar to snapshots, would be preferred. Would recommend a CEP.
>>>>>>>>
>>>>>>>> Jordan
>>>>>>>>
>>>>>>>> On Fri, Aug 30, 2024 at 20:58 C. Scott Andreas <csco...@icloud.com> wrote:
>>>>>>>>
>>>>>>>>> I appreciate this report and would love to work toward the direction it recommends.
>>>>>>>>>
>>>>>>>>> I’m also familiar with past concerns raised by others with our FQL configuration parameters that allow passing shell commands for FQL segment archival.
>>>>>>>>>
>>>>>>>>> We bias toward ensuring an MBean exists for dynamic modification of yaml parameters. When we couple dynamic configuration updates and arbitrary shell command execution, we introduce vectors for arbitrary code execution, data exfiltration, and data compromise that have a lower bar to achieve than local file write.
>>>>>>>>>
>>>>>>>>> I agree that we should work toward removing operator-provided shell commands in yaml.
>>>>>>>>>
>>>>>>>>> For concerns like archival, these seem like areas that Cassandra could easily accomplish itself without shelling out to gzip/zstd/lz4-compress a file. Introducing a new config structure that declares an archival format, accompanying implementations for compression/decompression, and deprecation of the prior approach sounds both reasonable and desirable to me.
>>>>>>>>>
>>>>>>>>> – Scott
>>>>>>>>>
>>>>>>>>> On Aug 30, 2024, at 10:25 PM, Bowen Song via dev <dev@cassandra.apache.org> wrote:
>>>>>>>>>
>>>>>>>>> I'm not sure what is the concern here. Is it a malicious user exploiting this?
>>>>>>>>> Or human error with unintended consequences?
>>>>>>>>>
>>>>>>>>> For malicious user, in order to exploit this, an attacker needs to be able to write to the config file. The config file on Linux by default is owned by the root user and has the -rw-r--r-- permission, that means the attacker must either gain root access to the system or has the ability to write arbitrary file on the filesystem. With either of these permission, they can already do almost anything they want (e.g. modify a SUID executable file). They wouldn't even need to exploit this to run a script or dangerous command. So this sounds like a non-issue to me, at least on Linux-based OSes.
>>>>>>>>>
>>>>>>>>> For human error, if the operator puts "rm -rf" in it, the software should treat it as the operator actually wants to do that. I personally don't like software attempting to outsmart human, which often ends up interfering with legitimate use cases. The best thing a software can do is log it, so there's some traceability if and when things go wrong.
>>>>>>>>>
>>>>>>>>> So, IMO, there's nothing wrong with the implementation in Cassandra.
>>>>>>>>>
>>>>>>>>> On 30/08/2024 17:13, guo Maxwell wrote:
>>>>>>>>>
>>>>>>>>> Commitlog has the ability to archive log files, see CommitLogArchiver.java <https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java>. We can archive and restore the commitlog by configuring archive_command and restore_command in commitlog_archiving.properties <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28>. The archive_command and restore_command can be any Linux/Unix shell command. However, I found that the shell command can actually be filled with any script, even "*rm -rf*". I have tested this situation and it succeeded, with my test file being deleted.
>>>>>>>>>
>>>>>>>>> Personally, I think this is dangerous behavior, because, if there are no system-level restrictions, users are allowed to do anything in these shell commands. So here I want to discuss with you whether it is necessary to impose any restrictions on use, or whether we need a new way of archiving/restoring the commitlog.
>>>>>>>>>
>>>>>>>>> Of course, before that, I would also like to ask, how many people are using archive and restore of commitlog? It seems that the commitlog archive code has not been updated for a long time.
>>>>>>>>>
>>>>>>>>> I have two ideas.
>>>>>>>>> One is to restrict the command content based on the existing usage, such as strictly allowing only the current cp/mv/ln %path to %name. Other redundant strings in the command are not allowed.
>>>>>>>>> The other: I roughly investigated archiving in MySQL and PostgreSQL.
>>>>>>>>> They do not give users too much room (I am talking about letting users define their own archiving command), and archive directly to a designated location. For us, I feel that we can follow C*'s incremental backup of sstables and add a hard link of the commitlog in a specified location. This may change the original configuration method, such as setting the archive location and restore location of the node through nodetool, and deprecate the commitlog_archiving.properties <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> configuration.
>>>>>>>>>
>>>>>>>>> I am just putting forward some views here, and looking forward to your feedback. 😀
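
For the hard-link idea in the original message, a rough sketch of archiving without any process invocation might look like the following. The class HardLinkArchiver and its layout are hypothetical, not Cassandra code. It assumes the archive directory is fixed at construction time (no setter, so nothing to change at runtime) and relies only on Files.createLink and Files.copy from the JDK.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.FileSystemException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not Cassandra code: archive a commit log segment by hard link.
final class HardLinkArchiver {
    // Fixed at construction; deliberately no setter, so it cannot be re-pointed at runtime.
    private final Path archiveDir;

    HardLinkArchiver(Path archiveDir) throws IOException {
        this.archiveDir = Files.createDirectories(archiveDir).toRealPath();
    }

    // Links (or, if linking is not possible, copies) the segment into the archive directory.
    Path archive(Path segment) throws IOException {
        Path name = segment.getFileName();
        if (name == null) {
            throw new IllegalArgumentException("not a file: " + segment);
        }
        Path target = archiveDir.resolve(name.toString()).normalize();
        // Reject names such as ".." that would resolve outside the archive directory.
        if (!archiveDir.equals(target.getParent())) {
            throw new IllegalArgumentException("refusing to archive outside " + archiveDir);
        }
        try {
            return Files.createLink(target, segment);
        } catch (FileAlreadyExistsException e) {
            throw e; // never overwrite an already archived segment
        } catch (UnsupportedOperationException | FileSystemException e) {
            // Hard links can fail, for example across filesystems; fall back to a plain copy.
            return Files.copy(segment, target);
        }
    }
}
```

Moving the archived files from this staging directory to S3, a NAS or anything else would then be left to an external tool, as discussed earlier in the thread.
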