Hello,are there any new updates?🤔 guo Maxwell <cclive1...@gmail.com>于2024年9月18日 周三下午4:06写道:
> Do you have any new updates on this DISCUSS ? > > - The reason this pattern is popular is it allows extension of > functionality ahead of the database. Some people copy to a NAS/SAN. Some > people copy to S3. Some people copy to their own object storage that isn’t > s3 compatible. “Compress and move” is super limiting, because “move” varies > remarkably between environments. > > Yes, it is indeed very flexible to use this way, but would it be more > appropriate to decouple the file archiving to heterogeneous storage and > leave it to other systems to handle it specifically? And we only do > compression and copying (file linking like sstable incremental backup)? > > > Štefan Miklošovič <smikloso...@apache.org> 于2024年9月5日周四 04:18写道: > >> >> On Wed, Sep 4, 2024 at 8:34 PM Jon Haddad <j...@jonhaddad.com> wrote: >> >>> I thought about this a bit over the last few days, and there's actually >>> quite a few problems present that would need to be addressed. >>> >>> *Insecure JMX* >>> >>> First off - if someone has access to JMX, the entire system is already >>> compromised. A bad actor can mess with the cluster topology, truncate >>> tables, and do a ton of other disruptive stuff. But if we're going to go >>> down this path I think we should apply your logic consistently to avoid >>> creating a "solution" that has the same "problem" as we do now. I use >>> quotes because I'm not entirely convinced the root cause of the problem is >>> enabling some shell access, but I'll entertain it for the sake of the >>> discussion. >>> >>> *Dynamic Configuration and Shell Scripts* >>> >>> Let's pretend that somehow an open JMX isn't already a *massive* >>> security flaw by itself. Once an attacker has control of a system, the >>> next phase of the attack relies on them dynamically changing the >>> configuration to point to a different shell script, or to execute arbitrary >>> shell scripts. >>> >> I agree with the general idea that we don't want this - so in my mind the >>> necessary solution here is to disable the ability to change the commit log >>> archiving behavior at runtime. >>> >>> The idea that commit log archiving (and many other config settings) >>> would be dynamically configurable is a massive security flaw that should be >>> disallowed. If you want to take this a step further and claim there's a >>> flaw with shell scripts in general, I'll even entertain that for a minute, >>> but we need to examine if the proposed solution of moving code to Java >>> actually solves the problem. >>> >>> *Dynamic Configuration and Java Code* >>> >>> Let's say we've removed the ability to use shell scripts, and we've >>> gotten people to rewrite their shell code with java code, but we've left >>> the dynamic configuration in. Going back to my original email, I mentioned >>> copying commit logs off the node and into an object store. If someone is >>> able to change the parameter dynamically at runtime, they could just as >>> easily point to a public S3 bucket and commit logs would be archived there >>> which is just as bad as the shell version. So if we are to convert this >>> functionality to Java, we should also be making best practice >>> recommendations on what users should and should not do. >>> >> >> I think what you meant here is that if we allowed people to provide a >> pluggable way how stuff is copied over and they would code it up, put that >> JAR on the class path, Cassandra (re)started etc, then someone might >> reconfigure this custom solution in runtime? Yeah, we do not want this. We >> can make it pluggable, but not reconfigurable. To have it pluggable and not >> reconfigurable, then to replace it with something else, an attacker would >> basically need to restart Cassandra with a rogue JAR on the class path. In >> order to do that, I think that at this point it would be beyond any >> salvation and the system is completely compromised anyway. >> >> >>> >>> >>> *Apply All Operational Best Practices* >>> >>> There's been a variety of examples of how a user can further compromise >>> a machine once they have JMX, working in tandem with shell scripts, but I >>> hope at this point you can see that the issue is fundamentally more complex >>> than simply disallowing shell scripts. The issue is present in the Java >>> examples as well, and is strongly tied to the issue of dynamic config. If >>> we're to design this the "right" way, I think we'd want these properties: >>> >>> * Commit log archiving should only have the ability to compress and move >>> files to a staging location >>> * Once the files are moved to the staging location, the file should be >>> moved somewhere else by a script NOT run as the C* user. >>> >> * The commit log archive configuration should not be dynamically >>> updatable, nor should any config affecting directories >>> >> >> This would essentially copy the logic we have for snapshots as Jordan >> mentioned. I do not mind having it like that. It is a good question for >> what exactly we need to have it reconfigurable. Why is it like that? People >> do not want to restart a whole cluster consisting of 100 nodes when the >> destination of the archived commit logs changed? How often is this >> happening that we need to expose ourselves to the problems related to that? >> >> >>> >>> Moving the scell configuration to Java code is a half measure that's >>> only solving a tiny problem in a massive chain of events and security >>> holes. >>> >>> Jon >>> >>> >>> >>> On Tue, Sep 3, 2024 at 4:15 AM Štefan Miklošovič <smikloso...@apache.org> >>> wrote: >>> >>>> Scott is right that this is also coming from us having a MBean method >>>> which allows commands to be changed in runtime. The solution to that was >>>> that we can prevent it from changing dynamically by having a configuration >>>> property, which is actually by default set to false so FQL archiving is >>>> ever possible only in case an operator explicitly enables that. >>>> >>>> However, even if commands were not modifiable in runtime via JMX and >>>> even an operator has a chance to enable command execution explicitly, that >>>> still does not make it 100% secure because an attacker does not need to >>>> change / modify cassandra.yaml where the script to execute is configure, >>>> just the content of such a script which is executed. >>>> >>>> So, introducing a similar property as it was done for FQL would in this >>>> context mean that it would be used for disabling commitlog archiving / >>>> restoring altogether while for FQL it would still do its thing, it would >>>> just not archive it. Whole commitlog archiving / restoring is now based on >>>> some commands to be executed so disabling commands being executed >>>> practically means we disabled this whole feature as such. >>>> >>>> We could indeed make it flat out impossible to execute anything but >>>> these scripts might contain some custom logic, like uploading to various >>>> cloud storages (AWS, Azure, GCP or something completely custom), people >>>> have their own "storage solutions" like remove the old logs when new come >>>> in etc. so by disabling this altogether we would make it impossible and >>>> users would need to accommodate that which would break their existing >>>> solutions. >>>> >>>> What I find confusing is that commitlog_archiving.properties is used >>>> both for restoration AS WELL AS for archiving. If we're ever going to >>>> change how this works, I think that it should be somehow logically split >>>> into archiving and restoring parts. >>>> >>>> So, we might introduce a property in cassandra.yaml to disable >>>> commitlog_archiving.properties altogether and we might deprecate >>>> commitlog_archiving.properties way of doing this (still keep it there for >>>> legacy reasons), add a new cassandra.yaml configuration section for that >>>> and there make the archiving and the restoration pluggable. By default we >>>> would provide "cp $from $to" implemented by Cassandra itself without any >>>> process invocation. Then we might eventually drop >>>> commitlog_archiving.properties but if the maintenance of that is cheap I >>>> would just keep it, we would just flip the switch so a new way of doing >>>> that would be preferable and the old way of doing it (via properties) would >>>> need to be explicitly enabled. >>>> >>>> On Tue, Sep 3, 2024 at 11:55 AM guo Maxwell <cclive1...@gmail.com> >>>> wrote: >>>> >>>>> Thank you very much for everyone's replies, they are all very valuable >>>>> feedback to me. >>>>> >>>>> I don't really understand what benefit adding restrictions would >>>>>> serve. Would it be hard coded in C* itself, or configurable? If it's >>>>>> configurable, then are we just making users enter their commands twice? >>>>>> This is meant to be used by an operator, so who's actually protected by >>>>>> an >>>>>> allow-list? >>>>>> >>>>> >>>>> I agree with you too, so I may prefer to idea 2 with implement >>>>> commitlog archiving in c* (not archiving by user defined shell), >>>>> and deprecate the commitlog_archiving.properties >>>>> <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> >>>>> configuration >>>>> through which we can set the properties of commitlog archiving. This >>>>> view >>>>> may be similar to that of Scott. >>>>> >>>>> If I want to use rclone or aws-cli to archive my commit logs that's my >>>>>> prerogative. >>>>>> >>>>> >>>>> Yes, it may be very flexible if we set aws-cli in shell. But as I know >>>>> cassandra-medusa can also do this , and for me letting other tools to do >>>>> this work may be better , for example we can upload more than one log (if >>>>> log size is not big ) in a rpc to improve write throughput. >>>>> >>>>> I think we can divide this big task into several subtasks: >>>>> >>>>> - Add this feature that Stefan mentioned before for commitlog >>>>> archive CASSANDRA-18550 >>>>> <https://issues.apache.org/jira/browse/CASSANDRA-18550> in 5.x >>>>> and may the original commitlog_archiving.properties deprecate. >>>>> - Add the feature of archiving for cassandra (commitlog/query >>>>> log/or event sstable) in the long run such as 6.0. >>>>> >>>>> I can prepare a cep if necessary. Looking forward to your feedback. >>>>> >>>>> >>>>> We can divide this task into several subtasks and complete them step >>>>> by step >>>>> >>>>> >>>>> >>>>> Jordan West <jorda...@gmail.com> 于2024年9月3日周二 00:55写道: >>>>> >>>>>> +1 to Scott’s comments. Once you expose those YAML config params >>>>>> outside of a single node which many of us do, this becomes an RCE attack >>>>>> vector. Something more structured as Scott proposes, similar to >>>>>> snapshots, >>>>>> would be preferred. Would recommend a CEP. >>>>>> >>>>>> Jordan >>>>>> >>>>>> On Fri, Aug 30, 2024 at 20:58 C. Scott Andreas <csco...@icloud.com> >>>>>> wrote: >>>>>> >>>>>>> I appreciate this report and would love to work toward the >>>>>>> direction it recommends. >>>>>>> >>>>>>> I’m also familiar with past concerns raised by others with our FQL >>>>>>> configuration parameters that allow passing shell commands for FQL >>>>>>> segment >>>>>>> archival. >>>>>>> >>>>>>> We bias toward ensuring an MBean exists for dynamic modification of >>>>>>> yaml parameters. When we couple dynamic configuration updates and >>>>>>> arbitrary >>>>>>> shell command execution, we introduce vectors for arbitrary code >>>>>>> execution, >>>>>>> data exfiltration, and data compromise that have a lower bar to achieve >>>>>>> than local file write. >>>>>>> >>>>>>> I agree that we should work toward removing operator-provided shell >>>>>>> commands in yaml. >>>>>>> >>>>>>> For concerns like archival, these seem like areas that Cassandra >>>>>>> could easily accomplish itself without shelling out to >>>>>>> gzip/zstd/lz4-compress a file. Introducing a new config structure that >>>>>>> declares an archival format, accompanying implementations for >>>>>>> compression/decompression, and deprecation of the prior approach sounds >>>>>>> both reasonable and desirable to me. >>>>>>> >>>>>>> – Scott >>>>>>> >>>>>>> — >>>>>>> Mobile >>>>>>> >>>>>>> On Aug 30, 2024, at 10:25 PM, Bowen Song via dev < >>>>>>> dev@cassandra.apache.org> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> I'm not sure what is the concern here. Is it a malicious user >>>>>>> exploiting this? Or human error with unintended consequences? >>>>>>> >>>>>>> For malicious user, in order to exploit this, an attacker needs to >>>>>>> be able to write to the config file. The config file on Linux by >>>>>>> default is >>>>>>> owned by the root user and has the -rw-r--r-- permission, that means the >>>>>>> attacker must either gain root access to the system or has the ability >>>>>>> to >>>>>>> write arbitrary file on the filesystem. With either of these permission, >>>>>>> they can already do almost anything they want (e.g. modify a SUID >>>>>>> executable file). They wouldn't even need to exploit this to run a >>>>>>> script >>>>>>> or dangerous command. So this sounds like a non-issue to me, at least on >>>>>>> Linux-based OSes. >>>>>>> >>>>>>> For human error, if the operator puts "rm -rf" in it, the software >>>>>>> should treat it as the operator actually wants to do that. I personally >>>>>>> don't like software attempting to outsmart human, which often ends up >>>>>>> interfering with legitimate use cases. The best thing a software can do >>>>>>> is >>>>>>> log it, so there's some traceability if and when things go wrong. >>>>>>> >>>>>>> So, IMO, there's nothing wrong with the implementation in Cassandra. >>>>>>> >>>>>>> >>>>>>> On 30/08/2024 17:13, guo Maxwell wrote: >>>>>>> >>>>>>> Commitlog has the ability of archive log file, see >>>>>>> CommitLogArchiver.java >>>>>>> <https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java>, >>>>>>> we can achieve the purpose of archive and restore commitlog by >>>>>>> configuring archive_command and restore_command in >>>>>>> commitlog_archiving.properties >>>>>>> <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> >>>>>>> .The archive_command and restore_command can be some linux/unix >>>>>>> shell command. However, I found that the shell command can >>>>>>> actually be filled with any script, even if "*rm -rf"* .I have >>>>>>> tested this situation and it finally succeeded with my test file being >>>>>>> deleted. >>>>>>> >>>>>>> Personally, I think it is a dangerous behavior, because if there >>>>>>> are no system-level restrictions and users are allowed to do anything in >>>>>>> these shell commands. So here I want to discuss with you whether it >>>>>>> is necessary to impose any restrictions on use, or do we need a new way >>>>>>> of >>>>>>> archiving/restoring commitlog? >>>>>>> >>>>>>> Of course, before that, I would also like to ask, how many people >>>>>>> are using archive and restore of commitlog? It seems that the commitlog >>>>>>> archive code has not been updated for a long time. >>>>>>> >>>>>>> I have two ideas. >>>>>>> One is to make some restrictions on the command context based on the >>>>>>> existing usage methods, such as strictly only allowing the current >>>>>>> cp/mv/ln >>>>>>> %path to %name.Other redundant strings in the command are not allowed. >>>>>>> Another one , As I roughly investigated the archive of mysql and pg. >>>>>>> They do not give users too much space (I am talking about letting users >>>>>>> define their own archiving command ), and archive directly to a >>>>>>> designated >>>>>>> location. For us, I feel that we can refer to c * Incremental backup of >>>>>>> sstable, add a hardlink to the commitlog to the specified location, but >>>>>>> this place may modify the original configuration method, such as setting >>>>>>> the archive location and restoring location of the node through nodetool >>>>>>> and deprecate the commitlog_archiving.properties >>>>>>> <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28> >>>>>>> configuration. >>>>>>> >>>>>>> I am just putting forward some views here, and looking forward to >>>>>>> your feedback. 😀 >>>>>>> >>>>>>>